DBRank 2012 : Sixth International Workshop on Ranking in Databases
Call For Papers
In recent years, there has been a great deal of interest in developing effective techniques for ad-hoc search and retrieval in relational databases, XML databases, text and multimedia databases, scientific information systems, biological databases, and so on. In particular, a large number of emerging applications require exploratory querying on such general-purpose or domain-specific databases; examples include users wishing to search bibliographic databases or catalogs of products such as homes, cars, cameras, restaurants, photographs, etc.
Current database query languages such as SQL and XQuery follow the Boolean retrieval model, i.e., tuples or elements that exactly satisfy the selection conditions laid out in the query are returned – no more and no less. While extremely useful for the expert user, this retrieval model is inadequate for ad-hoc retrieval by exploratory users who cannot articulate the perfect query for their needs – either their queries are very specific, resulting in no (or too few) answers, or are very broad, resulting in too many answers.
To address the limitations of the Boolean retrieval model in these emerging ad-hoc search and retrieval applications, top-k queries and the ranking of query results are gaining increasing importance. In fact, in many of these applications, ranking is an integral part of the semantics, e.g., keyword search and similarity search in multimedia as well as document databases. The increasing importance of ranking derives directly from the explosion in the volume of data handled by current applications. The user would be overwhelmed by too many unranked results. Furthermore, the sheer amount of data makes it almost impossible to process queries with the traditional compute-then-sort approach. Hence, ranking is a valuable tool for soliciting user preferences and for data exploration. Ranking poses several challenges for almost all data-centric systems.
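As a rough illustration of the contrast drawn above, the following Python sketch compares the compute-then-sort approach with a bounded-heap top-k selection. The catalog rows, the scoring function, and the function names are hypothetical and chosen only for this example; the sketch assumes that a higher score is better.

```python
import heapq


def top_k_full_sort(rows, score, k):
    """Compute-then-sort: score every row, sort all of them, keep the first k."""
    return sorted(rows, key=score, reverse=True)[:k]


def top_k_heap(rows, score, k):
    """Top-k selection: scan the rows once, tracking only the k best seen so far."""
    return heapq.nlargest(k, rows, key=score)


# Hypothetical product catalog: (name, user rating).
catalog = [("camera A", 3.0), ("camera B", 4.5), ("camera C", 1.0), ("camera D", 4.9)]


def rating(row):
    return row[1]


best_two = top_k_heap(catalog, rating, 2)
```

Both functions return the same answers. The heap variant still scans every row, but it avoids ordering the full result set; the top-k algorithms within the scope of the workshop go further and try to avoid scoring all rows in the first place.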
Relational Data: In relational databases, a large body of recent work supports ranking as a first-class construct through rank-aware algebra, ranking operators, and new optimization frameworks that integrate ranking into plan enumeration and costing. There has been exciting recent work on automatic learning of appropriate ranking functions for database applications (e.g., based on adaptations of IR ranking functions to leverage dependency information in structured data), on designing expressive languages for modeling user preferences, and on adapting keyword querying paradigms to relational databases, as well as new developments in top-k algorithms for relational, document, and multimedia databases.
XML and Text Data: Ranking query results in semistructured and XML databases has also been attracting attention and has led to a W3C XML Full-Text proposal. The key challenge is to integrate the content and the structure in the ranking process. Performance challenges also arise, especially if XML documents are viewed as graphs (through ID-IDREF edges). Although ranking on text databases has been studied for decades, the challenges are ever increasing due to the growing corpus of text documents, the availability of ontological systems, and novel user feedback mechanisms (Web reviews, social networks and blogs). Information extraction systems increasingly provide structure into traditionally unstructured sources of textual data, further blurring the lines between structured database systems and unstructured information retrieval mechanisms.
Multimedia and Domain-Specific Data: Ranking multimedia objects is still an unsolved problem, due to its high complexity. Furthermore, it has become clear that a “one solution fits all” approach is suboptimal. For instance, the ranking techniques need to be adapted for different domains like bibliographic, biological, clinical, and scientific data.
Multidimensional Data Analysis: Recently, there has been an increasing interest in the analysis of multidimensional data, with respect to ranking tools. Skyline queries identify a subset of the data, which contains the top answers to all potential ranking queries with monotone aggregate functions. By identifying the dominance relationships between objects with respect to ranking functions at different subspaces, one can generate interesting partitionings and partial orders for the data. Materialization techniques that accelerate the processing of top-k queries can also be developed, with the help of such rank-based analysis.
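The dominance relationship behind skyline queries can be sketched in a few lines of Python. This is a naive quadratic sketch for illustration only, not an efficient skyline algorithm; it assumes that larger values are better in every dimension, and the sample points, weights, and function names are hypothetical.

```python
def dominates(a, b):
    """a dominates b if a is at least as good everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points):
    """Return the points that no other point dominates (naive pairwise comparison)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def top_1(points, weights):
    """Best point under a weighted sum, one example of a monotone aggregate function."""
    return max(points, key=lambda p: sum(w * x for w, x in zip(weights, p)))


# Hypothetical two-dimensional objects.
points = [(1, 9), (5, 5), (9, 1), (4, 4), (2, 2)]
sky = skyline(points)
```

Here `sky` contains the three non-dominated points, and the top answer under any weighted sum with strictly positive weights, for instance `top_1(points, (0.9, 0.1))`, is always one of them.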
Social Data Analysis: The explosive growth of online social networks and web communities, which enable users to engage in content creation, sharing, and online collaboration, has fueled an increasing interest in the analysis of social data. Searching in social networks and online communities may have many different interpretations: searching for the most expert users, the most authoritative user-contributed content, the closest contacts, and so forth. Consequently, ranking query results may require exploring new ranking functions that take into account qualities such as trust, proximity, and authority. These new ranking functions, along with the inherent complexity of social data, create new computational challenges and call for efficient algorithms for computing the top results.
We strongly believe that this workshop will provide more insight into supporting ranking in various applications and will be an interesting addition to VLDB 2012. The workshop will be a great venue for the many research groups working on ranking worldwide, offering a unique opportunity to share their experience in supporting ranking in various data-centric applications, from relational to semi-structured, unstructured, and domain-specific data, and at different levels, from query formulation and preference modeling to query processing and optimization frameworks. In the following, we give a tentative list of topics to be covered by the workshop:
- Top-k search and ranking over different types of data, including:
* relational data
* XML data
* textual data
* the Web
* graph data
- Top-k applications, such as:
* keyword search
* similarity search
* social search
* nearest-neighbor search
- Ranking tools for:
* data exploration
* multidimensional data analysis
* streams and continuous monitoring systems
* distributed and peer-to-peer databases
- Rank-aware query processing and optimization, including:
* New fundamental developments in top-k algorithms
* Cost-models for top-k algorithms and operators
- Ranking and Preferences:
* User preference specification and query languages
* Personalized ranking functions
* Learning user preferences and ranking functions