birndl 2017 : Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries
Call For Papers
**Submission deadline has been extended until 4 June, 2017**
You are invited to participate in the 2nd Joint Workshop on Bibliometric-enhanced IR and NLP for Digital Libraries (BIRNDL), to be held as part of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017) in Tokyo, Japan on 11th August 2017.
We are happy to announce that the past BIR and NLPIR4DL organizers are proposing this workshop at SIGIR together. In conjunction with the BIRNDL workshop, we will hold the 3rd CL-SciSumm Shared Task in Scientific Document Summarization.
Reports from the shared task systems will be featured as part of a session at the workshop.
Aim of the Workshop
The BIRNDL workshop is the first step to foster a reflection on interdisciplinarity and the benefits that the disciplines of bibliometrics, IR and NLP can derive from it in a digital libraries context. The workshop is intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, text mining and recommendation techniques that can advance the state of the art in scholarly document understanding, analysis, and retrieval at scale. Researchers are in need of assistive technologies to track developments in an area, identify the approaches used to solve a research problem over time and summarize research trends. Digital libraries require semantic search, question-answering and automated recommendation and reviewing systems to manage and retrieve answers from scholarly databases. Full document text analysis can help to design semantic search, translation and summarization systems; citation and social network analyses can help digital libraries to visualize scientific trends, bibliometrics, and the relationships and influences of works and authors. All these approaches can be supplemented with the metadata supplied by digital libraries, including usage data such as download counts.
We invite papers and presentations that incorporate insights from IR, bibliometrics and NLP to develop new techniques to address the open problems in Big Science, such as evidence-based searching, measurement of research quality, relevance and impact, the emergence and decline of research problems, identification of scholarly relationships and influences, and applied problems such as language translation, question-answering and summarization. Finding relevant scholarly literature is a key point of the workshop and sets the agenda for tools and approaches to be discussed and evaluated at BIRNDL. At the workshop, we would also like to address the need for established, standardized baselines, evaluation metrics and test collections.
See the proceedings of the first BIRNDL workshop at JCDL 2016 and a recent report in SIGIR Forum.
This workshop will be relevant to scholars in computer and information science, specialized in IR, bibliometrics and NLP. The Shared Task is expected to be of interest to a broad community including those working in CL and NLP, especially in the sub-disciplines of text summarization, discourse structure in scholarly discourse, paraphrase, textual entailment and text simplification. The workshop will also be of importance for all stakeholders in the publication pipeline: implementers, publishers and policymakers. Formal citation metrics are increasingly a factor in decision-making by universities and funding bodies worldwide, making the need for research in applying these metrics more pressing. Today's publishers continue to provide new ways to support their consumers in disseminating published works and retrieving the right ones for their audience. Even when only considering the scholarly sites within Computer Science, we find that the field is well represented: ACM Portal, IEEE Xplore, Google Scholar, PSU's CiteSeerX, MSR's Academic Search, Elsevier’s Mendeley, Tsinghua's ArnetMiner, Trier's DBLP, Hiroshima's PRESRI. With this workshop, we hope to bring a number of these contributors together.
We invite stimulating as well as unpublished submissions on topics including - but not limited to - full-text analysis, multimedia and multilingual analysis and alignment as well as the application of citation-based NLP or information retrieval and information seeking techniques in digital libraries. Specific examples of fields of interest include (but are not limited to):
- Infrastructure for scientific mining and IR
- Semantic and Network-based indexing, navigation, searching and browsing in structured data
- Discourse structure identification and argument mining from scientific papers
- Summarisation and question-answering for scholarly DLs
- Bibliometrics, citation analysis and network analysis for IR
- Task-based user modelling, interaction, and personalisation
- Recommendation for scholarly papers, reviewers, citations and publication venues
- Measurement and evaluation of quality and impact
- Metadata and controlled vocabularies for resource description and discovery; Automatic metadata discovery, such as language identification
- Disambiguation issues in scholarly DLs using NLP or IR techniques; Data cleaning and data quality
For the paper sessions, we especially invite descriptions of running projects and ongoing work as well as contributions from industry. Papers that investigate multiple themes directly are especially welcome.
The CL-SciSumm Shared Task
The 3rd Computational Linguistics (CL) Scientific Summarization Shared Task is sponsored by Microsoft Research Asia and will be conducted as a part of this workshop. This is the first medium-scale shared task on scientific document summarization in the computational linguistics domain. The current shared task will be on automatic paper summarization in the Computational Linguistics (CL) domain. The output summaries will be of two types: faceted summaries of the traditional self-summary (the abstract) and the community summary (the collection of citation sentences, or ‘citances’). We also propose to group the citances by the facets of the text that they refer to.
This task follows up on the successful CLScisumm-2016 task @ JCDL 2016, Newark, NJ, USA and a Pilot Task conducted as a part of the BiomedSumm Track at the Text Analysis Conference 2014 (TAC 2014). In this task, a training corpus of ten topics from CL research papers was released. Participants were invited to enter their systems in a task-based evaluation.
The CLSciSumm17 corpus is expected to be of interest to a broad community including those working in computational linguistics and natural language processing, text summarization, discourse structure in scholarly discourse, paraphrase, textual entailment and text simplification.
Submissions: **June 4, 2017** (changed May 23, 2017)
Notification: June 23, 2017
Camera Ready Contributions: TBD
Workshop: 11th August 2017 in Tokyo, Japan
Check the CL-SciSumm 2017 Shared Task homepage for details on dates with respect to the shared task. The dates are coordinated.
All deadlines for the BIRNDL workshop are calculated as 11:59pm Baker Island Time (BIT: UTC/GMT-12).
All submissions must be written in English, following the Springer LNCS author guidelines (max. 6 pages for short and 12 pages for full papers; exclusive of unlimited pages for references) and should be submitted as PDF files to EasyChair. All submissions will be reviewed by at least two independent reviewers. Please be aware that at least one author per paper needs to register for the workshop and attend the workshop to present the work. In case of a no-show, the paper (even if accepted) will be deleted from the proceedings and from the program. Submissions and reviewing will be managed by the EasyChair conference management system.
Workshop proceedings will be deposited online in the CEUR workshop proceedings publication service (ISSN 1613-0073) and on the ACL anthology (Anthology prefix W17-33xx). This way, the proceedings will be permanently available and citable (digital persistent identifiers and long-term preservation).