WOSP 2017 : 6th International Workshop on Mining Scientific Publications


When Jun 19, 2017 - Jun 19, 2017
Where Toronto, Ontario, Canada
Submission Deadline May 1, 2017
Notification Due May 18, 2017
Final Version Due Jun 12, 2017
Categories    information retrieval   machine learning   text mining   scholarly data

Call For Papers


Digital libraries that store scientific publications are becoming increasingly central to the research process. They are used not only for traditional tasks, such as finding and storing research outputs, but also as a source for discovering new research trends or evaluating research excellence. With the current growth of scientific publications deposited in digital libraries, it is no longer sufficient to provide access to content alone. To aid research, it is especially important to leverage the potential of text and data mining technologies to improve how research is done.

This workshop aims to bring together people from different backgrounds who:
(a) are interested in analysing and mining databases of scientific publications
(b) develop systems that enable such analysis and mining of scientific databases (especially those who run databases of publications)
(c) develop novel technologies that improve the way research is being done.


The topics of the workshop will be organised around the following themes:

1. The whole ecosystem of infrastructures including repositories, aggregators, text- and data-mining facilities, impact monitoring tools, datasets, services and APIs that enable analysis of large volumes of scientific publications and surrounding issues, such as interoperability and data sharing.
2. Semantic enrichment of scientific publications by means of text and data mining, crowdsourcing or other methods.
3. Analysis of large databases of scientific publications to identify research trends, high impact, cross-fertilisation between disciplines, research excellence etc.

Topics of interest relevant to theme 1 include but are not limited to:

- Infrastructures including repositories, aggregators, text- and data-mining facilities, impact monitoring tools, datasets, services and APIs for accessing scientific publications and/or research data.
- Interoperability issues in research TDM workflows
- Issues around integration of cutting-edge tools in production systems

Topics of interest relevant to theme 2 include but are not limited to:

- Information extraction and text-mining applied to scholarly data.
- Automatic categorization and clustering of scholarly data
- Approaches to information retrieval of academic publications
- Academic recommender systems
- Models for semantically representing and annotating publications (ontologies, interoperability issues, etc.)
- Literature-based discovery
- (Reproducible) text and data mining workflows for scientific publications
- Scholarly knowledge graphs

Topics of interest relevant to theme 3 include but are not limited to:

- Measuring impact of publications (bibliometrics, webometrics, altmetrics, semantometrics)
- Higher-level impact metrics to assess performance of researchers, departments, universities, etc.
- Analysing research collaboration networks
- Methods for identifying research trends and cross-fertilization between research disciplines.
- Application and case studies of mining from scientific databases and publications.


We invite workshop participants to make use of the CORE publications dataset, which contains a large volume of research publications from a wide variety of research areas. The dataset contains not only full texts, but also an enriched version of the publications' metadata. This dataset provides a framework for developing and testing methods and tools addressing the workshop topics. Use of this dataset is not mandatory, but it is encouraged.

The dataset is available through the CORE portal:


The workshop on Mining Scientific Publications aims to bring together researchers, digital library developers and practitioners from government and industry to address the current challenges in the domain of mining scientific publications.


We invite submissions related to the workshop's topics. Long papers should not exceed 8 pages and short papers should not exceed 4 pages in the ACM style. We also welcome demo presentations of systems or methods. A demonstration submission should consist of a description of at most two pages of the system, method or tool to be demonstrated. All submissions will be uploaded to EasyChair for peer review.

Papers should be submitted using the EasyChair system provided here:

Successful submissions will be published as a special issue in the D-Lib journal. See previous proceedings at:


Monday May 1 -- Submission deadline
Thursday May 18 -- Notification of acceptance
Monday June 12 -- Camera ready
Monday June 19 -- Workshop

These dates are indicative only at this stage and may change.


Waleed Ammar, Allen Institute for Artificial Intelligence
Jevin D. West, University of Washington
Topics TBD


Petr Knoth, Knowledge Media Institute, The Open University, UK
Dasha Herrmannova, Knowledge Media Institute, The Open University, UK
David Pride, Knowledge Media Institute, The Open University, UK
Anita Khadka, Knowledge Media Institute, The Open University, UK

9. PROGRAMME COMMITTEE (tentative - based on 2016)

Pável Calado, Instituto Superior Técnico, Universidade de Lisboa, Portugal
Bradford Demarest, Indiana University Bloomington, USA
Iryna Gurevych, Darmstadt University of Technology, Germany
Antoine Isaac, Europeana & VU University Amsterdam, Netherlands
Roman Kern, Graz University of Technology, Austria
Martin Klein, Los Alamos National Laboratory, USA
Paolo Manghi, ISTI-CNR, Italy
Bruno Martins, Instituto Superior Técnico, Universidade de Lisboa, Portugal
Franco Maria Nardini, ISTI-CNR, Italy
Francesco Osborne, KMi, The Open University, UK
Eloy Rodrigues, Universidade do Minho, Portugal
Angelo Antonio Salatino, KMi, The Open University, UK
Pavel Smrz, Brno University of Technology, Czech Republic
Wojtek Sylwestrzak, ICM University of Warsaw, Poland
Vetle Torvik, University of Illinois at Urbana-Champaign, USA
Saeed Ul Hassan, Information Technology University, Pakistan
Ziqi Zhang, University of Sheffield, UK

More details available on the workshop website:

