The INEX Linked Data Track enters its second iteration this year and will be held in conjunction with CLEF 2013. For this year, we have created a brand-new document collection that consists of more than 12 million Wikipedia articles. All articles have been transformed into XML format and enriched with links to Linked Open Data sources such as YAGO2 and DBpedia 3.8. Additionally, all articles contain highly structured RDF property sections for their corresponding Wikipedia entities, in order to facilitate structured retrieval techniques over this kind of semi-structured document format. We provide a benchmark of more than 100 queries ("topics"), which are formulated in three different formats in order to encourage approaches pursuing natural-language question answering, ad-hoc keyword search, or SPARQL-style retrieval (with full-text filter conditions).
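To make the three topic formats concrete, the following minimal Python sketch models one topic that carries a natural-language formulation, a keyword formulation, and a SPARQL-style formulation with a full-text filter. It is only an illustration under stated assumptions: the class and field names, the topic id, the wording of the example topic, and the filter name FTContains are all hypothetical and are not taken from the official topic files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """One benchmark topic in three formulations (all field names are hypothetical)."""
    topic_id: str
    question: str   # natural-language formulation for question answering
    keywords: str   # ad-hoc keyword formulation
    sparql: str     # SPARQL-style formulation with a full-text filter condition

    def formulation(self, approach: str) -> str:
        """Return the formulation that matches a retrieval approach."""
        by_approach = {
            "qa": self.question,
            "keyword": self.keywords,
            "sparql": self.sparql,
        }
        if approach not in by_approach:
            raise ValueError(f"unknown approach: {approach!r}")
        return by_approach[approach]


# Illustrative topic only: the id, the wording, and the FTContains
# filter name are assumptions, not the track's actual topic syntax.
EXAMPLE_TOPIC = Topic(
    topic_id="example-1",
    question="Which physicist developed the theory of relativity?",
    keywords="physicist theory of relativity",
    sparql=(
        "SELECT ?p WHERE {\n"
        "  ?p <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
        "<http://dbpedia.org/ontology/Scientist> .\n"
        '  FILTER FTContains(?p, "theory of relativity") .\n'
        "}"
    ),
)

if __name__ == "__main__":
    # Print the formulation each kind of system would consume.
    for approach in ("qa", "keyword", "sparql"):
        print(approach, "->", EXAMPLE_TOPIC.formulation(approach))
```

In this sketch, a question-answering system would read the "qa" formulation, a keyword search engine the "keyword" formulation, and a structured retrieval engine the "sparql" formulation, whose triple pattern constrains the entity type while the full-text filter constrains the article text.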
Continuing the long tradition of INEX, the goal of the Linked Data Track is to investigate information retrieval (IR) techniques over a combination of textual and highly structured data, where RDF properties carry additional key information about semantic relations among entities that cannot be captured by keywords alone. We intend to investigate whether and how structural information can be exploited to improve ad-hoc retrieval performance, and how it can be used in combination with structured queries to help users navigate or explore large sets of results, or to address Jeopardy-style natural-language queries, which are also translated into a SPARQL-based query format.
The Linked Data Track thus aims to close the gap between Question Answering, IR-style keyword search, and Semantic-Web-style reasoning techniques. Our goal is to bring together different communities and to foster research at the intersection of Information Retrieval, Databases, and the Semantic Web.
See https://inex.mmci.uni-saarland.de/tracks/lod/ for more details about run submissions and the organization of INEX 2013.