NLP-OSS 2018 : Workshop for Natural Language Processing Open Source Software

When	Jul 19, 2018 - Jul 20, 2018
Where	Melbourne, Australia
Submission Deadline	Mar 25, 2018
Notification Due	Apr 29, 2018
Final Version Due	May 13, 2018

Categories NLP

Call For Papers

Workshop for Natural Language Processing Open Source Software (NLP-OSS)

With great scientific breakthroughs come solid engineering and open
communities. The Natural Language Processing (NLP) community has benefited
greatly from the open culture of sharing knowledge, data, and software. The
primary objective of this workshop is to further the sharing of insights on
the engineering and community aspects of creating, developing, and
maintaining NLP open source software (OSS), which we seldom talk about in
scientific publications. Our secondary goal is to promote synergies between
different open source projects and encourage cross-software collaborations
and comparisons.

We use Natural Language Processing OSS as an umbrella term that covers not
only traditional syntactic, semantic, phonetic, and pragmatic applications
but also task-specific applications (e.g., machine translation, information
retrieval, question-answering systems), low-level string processing that
contains valid linguistic information (e.g., Unicode creation for new
languages, language-based character set definitions), and machine
learning/artificial intelligence frameworks with functionalities focusing on
text applications.

There are many workshops focusing on open language resource/annotation
creation and curation (e.g., BUCC, GWN, LAW, LOD, WAC). Moreover, we have the
flagship LREC conference dedicated to linguistic resources. However, the
engineering aspects of NLP OSS are overlooked and under-discussed within the
community. There are open source conferences and venues (such as FOSDEM,
OSCON, Open Source Summit) where discussions range from operating system
kernels to air traffic control hardware, but the representation of
NLP-related presentations is limited. In the Machine Learning (ML) field, the
Journal of Machine Learning Research - Machine Learning Open Source Software
(JMLR-MLOSS) is a forum for discussions and dissemination of ML OSS
topics. We envision that the Workshop for NLP-OSS becomes a similar avenue
for NLP OSS discussions.

To the best of our knowledge, this is the first workshop proposal in recent
years that focuses more on the building aspect of NLP and less on scientific
novelty or state-of-the-art development. A decade ago, there was the SETQA-NLP
(Software Engineering, Testing, and Quality Assurance for Natural Language
Processing) workshop that raised awareness of the need for good software
engineering practices in NLP. In the earlier days of NLP, linguistic
software was often monolithic and the learning curve to install, use, and
extend the tools was steep and frustrating. More often than not, NLP OSS
developers/users interact in siloed communities within the ecologies of
their respective projects. In addition to engineering aspects of NLP
software, the open source movement has brought a community aspect that we
often overlook in building impactful NLP technologies.

An example of valuable OSS knowledge comes from spaCy developer Montani
(https://ines.io/blog/spacy-commercial-open-source-nlp), who shared her
thoughts on the challenges of maintaining commercial NLP OSS, such as
handling open issues on the issue tracker, model release and packaging
strategy, and monetizing NLP OSS for sustainability.

Rehurek (https://rare-technologies.com/mummy-effect-bridging-gap-between-academia-industry/)
shared another example of insightful discussion on bridging the gap between
academia and industry through creating open source and student incubation
programs. Rehurek discussed the need to look beyond the publish-or-perish
culture to avoid the brittle "mummy effect" in SOTA research
code/techniques.

We hope that the NLP-OSS workshop becomes the intellectual forum to collate
various open source knowledge beyond the scientific contribution, announce
new software/features, and promote an open source culture and best practices
that go beyond the conferences.

Sponsors
-------------------
Sponsorship helps keep NLP-OSS sustainable for the widest possible audience.
The NLP-OSS workshop is organized by volunteers from both academia and
industry. Sponsorship goes toward covering the cost of invited speakers.

If you or your company or institution are interested in sponsoring NLP-OSS,
please send us an email at nlposs.workshop@gmail.com.

Call for papers
---------------------------
We invite full papers (8 pages) or short papers (4 pages) on topics related
to NLP-OSS broadly categorized into (i) software development, (ii)
scientific contribution and (iii) NLP-OSS case studies.

Software Development
- Designing and developing NLP-OSS
- Licensing issues in NLP-OSS
- Backwards compatibility and stale code in NLP-OSS
- Growing an NLP-OSS community
- Maintaining and motivating an NLP-OSS community
- Best practices for NLP-OSS documentation and testing
- Contribution to NLP-OSS without coding
- Incentivizing OSS contributions in NLP
- Commercialization and Intellectual Property of NLP-OSS
- Defining and managing NLP-OSS project scope
- Issues in API design for NLP
- NLP-OSS software interoperability
- Analysis of the NLP-OSS community

Scientific Contribution
- Surveying OSS for specific NLP task(s)
- Demonstration and tutorial of NLP-OSS
- New NLP-OSS introductions
- Small but useful NLP-OSS
- NLP components in ML OSS
- Citations and references for NLP-OSS
- OSS vs experiment replicability
- Gaps between existing NLP-OSS
- Task-generic vs task-specific software

Case studies
- Case studies of how a specific bug is fixed or feature is added
- Writing wrappers for other NLP-OSS
- Writing open-source APIs for open data
- Teaching NLP with OSS
- NLP-OSS in the industry

Submission information
-------------------------------------------
Authors are invited to submit either a
- Full paper up to 8 pages of content
- Short paper up to 4 pages of content

All papers are allowed an unlimited but sensible number of pages for
references. Final camera-ready versions will be allowed an additional page of
content to address reviewers' comments.

Submissions should be formatted according to the ACL 2018 templates, which
are available for:
- LaTeX
- MS Word

We strongly recommend preparing your manuscript using LaTeX.

Submissions should be uploaded to the Softconf conference management system
at https://www.softconf.com/acl2018/NLPOSS.

Organizers
------------------------
Lucy Park, NAVER Corp.
Masato Hagiwara, Duolingo Inc.
Dmitrijs Milajevs, NIST and Queen Mary University of London
Liling Tan, Rakuten Institute of Technology

Invited speakers
-------------------------------
Christopher Manning, Stanford University
Matthew Honnibal and Ines Montani, Explosion AI
Joel Nothman, University of Sydney

Important dates
------------------------------
The NLP-OSS workshop will be co-located with the ACL 2018 conference.

Paper Submission: 25th March (Sun) 23:59 American Samoa Time
Notification of Acceptance: 29th April (Sun)
Camera-Ready Version: 13th May (Sun)
Workshop: 19th/20th July (Thu/Fri)

Program committee
-------------------------------------
Martin Andrews, Red Cat Labs
Francis Bond, Nanyang Technological University
Jason Baldridge, Google
Steven Bethard, University of Arizona
Fred Blain, University of Sheffield
James Bradbury, Salesforce Research
Denny Britz, Prediction Machines
Marine Carpuat, University of Maryland
Kyunghyun Cho, New York University
Grzegorz Chrupala, Tilburg University
Hal Daume III, University of Maryland
Jon Dehdari, Think Big Analytics
Christian Federmann, Microsoft Research
Mary Ellen Foster, University of Glasgow
Michael Wayne Goodman, University of Washington
Arwen Twinkle Griffioen, Zendesk Inc.
Joel Grus, Allen Institute for Artificial Intelligence
Chris Hokamp, Aylien Inc.
Matthew Honnibal, Explosion AI
Sung Kim, Hong Kong University of Science and Technology
Philipp Koehn, Johns Hopkins University
Taku Kudo, Google
Christopher Manning, Stanford University
Diana Maynard, University of Sheffield
Tomas Mikolov, Facebook AI Research (FAIR)
Ines Montani, Explosion AI
Andreas Muller, Columbia University
Graham Neubig, Carnegie Mellon University
Vlad Niculae, Cornell CIS
Joel Nothman, University of Sydney
Matt Post, Johns Hopkins University
David Przybilla, Idio
Amandalynne Paullada, University of Washington
Delip Rao, Joostware AI Research Corp
Radim Rehurek, RaRe Technologies
Elijah Rippeth, MITRE Corporation
Abigail See, Stanford University
Carolina Scarton, University of Sheffield
Rico Sennrich, University of Edinburgh
Dan Simonson, Georgetown University
Vered Shwartz, Bar-Ilan University
Ian Soboroff, NIST
Pontus Stenetorp, University College London
Rachael Tatman, Kaggle
Tommaso Teofili, Adobe
Emiel van Miltenburg, Vrije Universiteit Amsterdam
Maarten van Gompel, Radboud University
Gael Varoquaux, INRIA
KhengHui Yeo, Institute for Infocomm Research
Marcos Zampieri, University of Wolverhampton