LLMMC 2011 : Symposium on Learning Language Models from Multilingual Corpora


When Apr 6, 2011 - Apr 6, 2011
Where York, UK
Submission Deadline Jan 19, 2011
Notification Due Feb 14, 2011
Final Version Due Feb 28, 2011
Categories    NLP

Call For Papers


Symposium on Learning Language Models from Multilingual Corpora (LLMMC)

(Part of the AISB 2011 Convention, 4-7 April 2011.)


International organizations such as the UN and the EU, news agencies, and companies operating internationally are producing large volumes of text in different languages. As a result, large, publicly available parallel corpora, aligned at the paragraph or sentence level, have been created for many language pairs, e.g., French-English, Chinese-English or Arabic-English. The multilingual nature of the EU has given rise to many documents available in all or many of its official languages, which have been assembled in multilingual parallel corpora such as Europarl (11 languages, 34-55M words for each) and JRC-Acquis (22 languages, 11-22M words for each).

These parallel corpora have been used, both monolingually and multilingually, for a variety of NLP tasks, including but not limited to machine translation, cross-lingual information retrieval, word sense disambiguation, semantic relation extraction, named entity recognition, POS tagging, and syntactic parsing. With the advent of the Internet, there has also been an explosion in the availability of semi-parallel multilingual online resources, such as Wikipedia, that have been used for similar tasks and have great potential for future exploration and research.

In this symposium, we are interested in explicit models, usable and verifiable by humans, which could be used either for translation or for modelling individual languages. Examples include morphology, where the available translations can help identify word forms of the same lexical entry in a given language, and lexical semantics, where parallel corpora can help extract instances of relations, such as synonymy and hypernymy, which are essential for building thesauri and ontologies. The results may be compared against existing approaches for the acquisition of language models (morphology, syntax, semantics) from monolingual corpora, or combined with these in order to use the best of both approaches.

The main purpose of the symposium will be to gather and disseminate the best ideas in this new area. Thus, we welcome discussions of previously published ideas alongside original contributions. The submission format is limited to a 4-page extended abstract, which may be a position paper, or one outlining an initial idea, work in progress or completed research. The camera-ready copy of the article to be published in the symposium proceedings can be the same extended abstract of up to 4 pages or a full-length paper of up to 8 pages in the AISB convention format. All accepted papers will be allocated 25 minutes for presentation and questions. At least one of the authors will need to be registered for the event before a paper can appear in the proceedings. A considerable part of this one-day symposium will be dedicated to discussions to encourage the formation of new collaborations and consortia.

The symposium will take place alongside 9 other symposia on various aspects of AI-related research, drawing strongly on Computer Science, Psychology and Philosophy, among other disciplines. A number of plenary speakers of international fame (Alan Baddeley, Katie Slocombe, Mark Steedman, Stephen Wolfram) will add to the excitement of this interdisciplinary international event, which will take place in the historic city of York, one of the oldest and most visited cities in England.

Duration: a one-day symposium.

Important dates:

Submissions: January 19, 2011

Notification: February 14, 2011

Submission of camera-ready versions: February 28, 2011

Symposium: April 6, 2011


Dimitar Kazakov, The University of York, UK (kazakov AT cs DOT york DOT ac DOT uk)

Preslav Nakov, National University of Singapore, Singapore (preslav DOT nakov AT gmail DOT com)

Ahmad R. Shahid, The University of York, UK (ahmad AT cs DOT york DOT ac DOT uk)

Program Committee:

Graeme Blackwood, University of Cambridge, UK

Phil Blunsom, University of Oxford, UK

Francis Bond, Nanyang Technological University, Singapore

Yee-Seng Chan, University of Illinois at Urbana-Champaign, USA

Daniel Dahlmeier, National University of Singapore, Singapore

Marc Dymetman, Xerox Research Centre Europe, France

Andreas Eisele, Directorate-General for Translation, Luxembourg

Michel Galley, Stanford University, USA

Kuzman Ganchev, University of Pennsylvania, USA

Corina R Girju, University of Illinois at Urbana-Champaign, USA

Philipp Koehn, University of Edinburgh, UK

Krista Lagus, Aalto University School of Science and Technology, Finland

Wei Lu, National University of Singapore, Singapore

Elena Paskaleva, Bulgarian Academy of Sciences, Bulgaria

Katerina Pastra, Institute for Language and Speech Processing, Greece

Khalil Sima'an, University of Amsterdam, The Netherlands

Ralf Steinberger, Joint Research Centre, Italy

Joerg Tiedemann, Uppsala University, Sweden

Marco Turchi, Joint Research Centre, Italy

Jaakko Väyrynen, Aalto University School of Science and Technology, Finland

