WOSP 2016 : 5th International Workshop on Mining Scientific Publications


When Jun 22, 2016 - Jun 23, 2016
Where Newark, NJ, USA
Submission Deadline Apr 27, 2016
Notification Due May 15, 2016
Final Version Due Jun 15, 2016
Categories    scholarly publications   digital libraries   repositories   text mining

Call For Papers

Workshop page:
Conference page:


Digital libraries that store scientific publications are becoming increasingly central to the research process. They are used not only for traditional tasks, such as finding and storing research outputs, but also as a source for discovering new research trends or evaluating research excellence. With the current growth of scientific publications deposited in digital libraries, it is no longer sufficient to provide only access to content. To aid research, it is especially important to leverage text and data mining technologies to improve how research is done.

This workshop aims to bring together people from different backgrounds who:
(a) are interested in analysing and mining databases of scientific publications
(b) develop systems that enable such analysis and mining of scientific databases (especially those who run databases of publications)
(c) develop novel technologies that improve the way research is being done.


The topics of the workshop will be organised around the following themes:

1. The whole ecosystem of infrastructures, including repositories, aggregators, text- and data-mining facilities, impact monitoring tools, datasets, services and APIs that enable analysis of large volumes of scientific publications.
2. Semantic enrichment of scientific publications by means of text and data mining, crowdsourcing or other methods.
3. Analysis of large databases of scientific publications to identify research trends, high impact, cross-fertilisation between disciplines, research excellence etc.

Topics of interest relevant to theme 1 include but are not limited to:

- Infrastructures, including repositories, aggregators, text- and data-mining facilities, impact monitoring tools, datasets, services and APIs for accessing scientific publications and/or research data.

Topics of interest relevant to theme 2 include but are not limited to:

- Novel information extraction and text-mining approaches to semantic enrichment of publications.
- Automatic categorization and clustering of scientific publications.
- New methods and models for connecting and interlinking scientific publications.
- Models for semantically representing and annotating publications.
- Semantically enriching/annotating publications by crowdsourcing.

Topics of interest relevant to theme 3 include but are not limited to:

- New methods, models and innovative approaches for measuring impact of publications.
- New methods for measuring performance of researchers.
- Evaluating impact of research groups.
- Methods for identifying research trends and cross-fertilization between research disciplines.
- Application and case studies of mining from scientific databases and publications.
- Improving the infrastructure of repositories to support the development and integration of new impact and performance metrics.


We would like to invite the workshop participants to make use of the CORE publications dataset, which contains a large volume of research publications from a wide variety of research areas. The dataset contains not only full texts, but also an enriched version of the publications' metadata. This dataset provides a framework for developing and testing methods and tools addressing the workshop topics. Use of this dataset is not mandatory, but it is encouraged.

The dataset is available through the CORE portal:


The workshop on Mining Scientific Publications aims to bring together researchers, digital library developers and practitioners from government and industry to address the current challenges in the domain of mining scientific publications.


We invite submissions related to the workshop's topics. Long papers should not exceed 8 pages and short papers should not exceed 4 pages in the ACM style. Furthermore, we welcome demo presentations of systems or methods. A demonstration submission should consist of a description, at most two pages long, of the system, method or tool to be demonstrated. All submissions will be uploaded to EasyChair for peer review.

Papers should be submitted using the EasyChair system provided here:

Successful submissions will be published in D-Lib Magazine.


April 24 -- New submission deadline
May 15 -- Notification of acceptance
June 15 -- Camera ready
June 22-23 -- Workshop

These dates are indicative only at this stage and may change.


Jie Tang, Tsinghua University
second keynote TBA


Petr Knoth, Knowledge Media Institute, The Open University, UK
Drahomira Herrmannova, Knowledge Media Institute, The Open University, UK
Lucas Anastasiou, Knowledge Media Institute, The Open University, UK
Nancy Pontika, Knowledge Media Institute, The Open University, UK


Pável Calado, Instituto Superior Técnico, Universidade de Lisboa, Portugal
Bradford Demarest, Indiana University Bloomington, USA
Iryna Gurevych, Darmstadt University of Technology, Germany
Antoine Isaac, Europeana & VU University Amsterdam, Netherlands
Roman Kern, Graz University of Technology, Austria
Martin Klein, Los Alamos National Laboratory, USA
Paolo Manghi, ISTI-CNR, Italy
Bruno Martins, Instituto Superior Técnico, Universidade de Lisboa, Portugal
Franco Maria Nardini, ISTI-CNR, Italy
Francesco Osborne, KMi, The Open University, UK
Eloy Rodrigues, Universidade do Minho, Portugal
Angelo Antonio Salatino, KMi, The Open University, UK
Pavel Smrz, Brno University of Technology, Czech Republic
Wojtek Sylwestrzak, ICM University of Warsaw, Poland
Vetle Torvik, University of Illinois at Urbana-Champaign, USA
Saeed Ul Hassan, Information Technology University, Pakistan

More details available on the workshop website:

