LifeLong@ASRU 2019 : Life-Long Learning for Spoken Language Systems Workshop
Call For Papers
Machine learning for speech and language understanding tasks often relies heavily on large annotated data sets to train models. However, data collection and manual annotation are time-consuming, expensive processes that often require a variety of bootstrapping methods to produce models that are "good enough". This slows down the development of new features and products.
The literature on bootstrapping ML systems often overlooks the constraints of real-world applications related to:
- annotation processes (examples are often annotated in batches instead of one by one);
- privacy (transfer learning from one language to another often requires moving data from one continent to another, which violates privacy policies);
- training times and resources;
- continual learning (introducing new classes but also merging or removing old ones).
The ability to efficiently move real-world systems to new domains and languages, or to adapt them to changing conditions over time, also often requires a complex mixture of techniques, including active learning, transfer learning, continuous online learning, semi-supervised learning, and data augmentation, because the models used by existing systems rarely generalize well to new circumstances. For example, current machine reading comprehension models do very well at answering general, factoid-style questions but perform poorly on new specialized domains such as legal documents, operational manuals, and financial policies. Thus, domain transfer (especially from limited annotated data or using only unsupervised techniques) is needed to make the technology work in new scenarios. To address such issues, efforts for real-world applications need improved methods for targeting new use cases, features, or classes. The approach also needs to scale from small, limited data sets at the beginning of a system’s life cycle to larger data sets with millions of annotated examples and/or billions of unannotated examples as deployed systems expand to larger user bases and use cases.
In this workshop, we aim to cover challenges in a lifelong process where new users or functionalities are added, and existing functionalities are modified. We believe the challenge is prevalent in research from both academia and industry.
# Topics of Interest
- Semi-supervised learning
- Active learning
- Unsupervised learning
- Incremental learning
- Domain adaptation
- Data generation/augmentation
- Few-shot learning
- Zero-shot learning
# Submission Guidelines
Please submit your paper using EasyChair.
Format: Submissions must be in PDF format, anonymized for review, written in English, and must follow the ASRU 2019 formatting requirements, available here. We advise you to use the LaTeX template files provided by ASRU 2019.
Length: Submissions may contain up to eight pages of content. There is no limit on the number of pages for references. There is no extra space for appendices. There is no explicit short paper track, but you should feel free to submit your paper regardless of its length. Reviewers will be instructed not to penalize papers for being too short.
Dual Submission: Authors can make submissions that are also under review at other venues, provided this does not violate the policies of those venues. We do NOT require submissions to follow an anonymity period.
Presentation Format: We anticipate most papers will be presented as posters, with only a few selected for oral presentation.