MEPDaW 2016: CFP for the 2nd Workshop on Managing the Evolution and Preservation of the Data Web, MEPDaW 2016 @ ESWC 2016
Call For Papers
There is a vast and rapidly increasing quantity of scientific, corporate, government and crowd-sourced data published on the emerging Data Web. Open Data are expected to act as a catalyst for the way structured information is exploited at large scale. This offers great potential for building innovative products and services that create new value from already collected data. It is expected to foster active citizenship (e.g., around the topics of journalism, greenhouse gas emissions, food supply-chains and smart mobility) and world-wide research according to the “fourth paradigm of science”. The most noteworthy advantage of the Data Web is that facts, rather than documents, are recorded. These facts become the basis for discovering new knowledge that is not contained in any individual source, and for solving problems that were not originally anticipated. In particular, Open Data published according to the Linked Data Paradigm are essentially transforming the Web into a vibrant information ecosystem.
Published datasets are openly available on the Web. A traditional view of digitally preserving them by “pickling them and locking them away” for future use, like groceries, would conflict with their evolution. There are a number of approaches and frameworks, such as the LOD2 stack, that manage the full life-cycle of the Data Web. More specifically, these techniques are expected to tackle major issues such as the synchronisation problem (how can we monitor changes), the curation problem (how can data imperfections be repaired), the appraisal problem (how can we assess the quality of a dataset), the citation problem (how can we cite a particular version of a linked dataset), the archiving problem (how can we retrieve the most recent or a particular version of a dataset), and the sustainability problem (how can we spread preservation efforts while ensuring long-term access).
Preserving linked open datasets poses a number of challenges, mainly related to the nature of the LOD principles and the RDF data model. In LOD, datasets representing real-world entities are structured; thus, when managing and representing facts we need to take into consideration any constraints that may hold. Since resources might be interlinked, effective citation measures must be in place to enable, for example, the ranking of datasets according to their measured quality. Another challenge is to determine the consequences that changes to one LOD dataset may have on other datasets linked to it. Furthermore, the distributed nature of LOD datasets makes archiving particularly difficult.
== TOPICS ==
- Change Discovery
* Change detection and computation in data and/or vocabularies
* Change traceability
* Change notifications (e.g., PubSubHubbub, DSNotify, SPARQL Push)
* Visualisation of evolution patterns for datasets and vocabularies
* Prediction of changes
- Formal models and theory
* Formal representation of changes and evolution
* Change/Dynamicity characteristics tailored to graph data
* Query language for archives
* Freshness guarantee for query results
* Freshness guarantee in databases
- Data Archiving and preservation
* Scalable versioning and archiving systems/frameworks
* Query processing/engines for archives
* Efficient representation of archives (compression)
* Benchmarking archives and versioning strategies
Ideally the proposed solutions should be applicable at web scale.
== SUBMISSION GUIDELINES ==
Papers should be formatted according to the Springer LNCS format. For submissions that are not in the LNCS PDF format, 400 words count as one page. All papers should be submitted to https://easychair.org/conferences/?conf=mepdaw2016.
We envision four types of submissions in order to cover the entire spectrum from mature research papers to novel ideas/datasets and industry technical talks:
A) Research Papers (max 15 pages), presenting novel scientific research addressing the topics of the workshop.
B) Position Papers and System and Dataset descriptions (max 5 pages), encouraging papers describing significant work in progress, late breaking results or ideas of the domain, as well as functional systems or datasets relevant to the community.
C) Industry & Use Case Presentations (max 5 pages), in which industry experts can present and discuss practical solutions, use case prototypes, best practices, etc., in any stage of implementation.
D) Open RDF archiving challenge (max 5 pages), intended to encourage developers, data publishers, and technology/tool creators to apply Semantic Web techniques to create, integrate, analyze or use an archive of linked open datasets. Thus, we expect submissions showcasing developments that demonstrate one (or all) of:
- useful functionality over RDF archives
- a potential commercial application of RDF archives
- tools to support/manage RDF archives at Web scale
(*) A list of recommended datasets for the challenge is available at the workshop homepage: http://eis.iai.uni-bonn.de/Event/mepdaw2016.html#challenge
All accepted papers will be published in the CEUR workshop proceedings series.
== ORGANIZING COMMITTEE ==
- Jeremy Debattista (Enterprise Information Systems, University of Bonn, Germany / Organized Knowledge, Fraunhofer IAIS, Germany)
- Jürgen Umbrich (Vienna University of Economics and Business)
- Javier D. Fernández (Vienna University of Economics and Business)