
CREDISLAS 2012 : Workshop on CREATING CROSS-LANGUAGE RESOURCES FOR DISCONNECTED LANGUAGES AND STYLES


Link: http://www-lium.univ-lemans.fr/credislas2012
 
When May 27, 2012 - May 27, 2012
Where Istanbul, Turkey
Submission Deadline Feb 26, 2012
Notification Due Mar 16, 2012
Final Version Due Mar 30, 2012
Categories    NLP
 

Call For Papers

Workshop on

CREATING CROSS-LANGUAGE RESOURCES FOR DISCONNECTED LANGUAGES AND STYLES

Co-located with LREC 2012 (http://www.lrec-conf.org/lrec2012/)
Istanbul, Turkey
May 27, 2012 (afternoon session)

Deadline for paper submissions: February 26, 2012

http://www-lium.univ-lemans.fr/credislas2012



This half-day workshop aims to develop strategies and share
experiences on creating resources that reduce the linguistic gap
between language pairs for which cross-language resources are
scarce. Although this situation has most commonly been addressed
for minority languages, which have scarce resources of their own,
it is also an important issue in other situations, such as:
majority languages that, because of their cultural, historical
and/or geographical disconnection, do not have a significant amount
of cross-language resources between them (Chinese and Spanish being
a good example in this category); or single languages in which new
communication trends and styles lack cross-language resources
linking them to the main formal language (as with chat-speak
communications and formal languages).

Current computational and data storage capabilities have favoured the
proliferation of data-driven and statistical approaches in natural
language processing and computational linguistics. Empirical evidence
has demonstrated in a large number of cases and applications how the
availability of appropriate datasets can boost the performance of
processing methods and analysis techniques. In this scenario, the
availability of data has come to play a fundamental role. On the
other hand, both the diversity of languages and the emergence of new
communication media and stylistic trends are responsible for the
scarcity of resources in the case of some specific tasks and
applications. In this sense, this workshop attempts to focus its
attention on those specific applications or cases for which data
scarcity poses a restrictive problem for data-driven approaches. This
includes the following three specific situations:

Minority Languages, for which scarcity of resources is a consequence
of the minority nature of the language itself. In this case, attention
is focused on the development of both monolingual and cross-lingual
resources. Some examples in this category include: Basque, Pashto and
Haitian Creole, just to mention a few.

Disconnected Languages, for which a large amount of monolingual
resources are available, but due to cultural, historical and/or
geographical reasons cross-language resources are actually
scarce. Some examples in this category include language pairs such as
Chinese and Spanish, Russian and Portuguese, and Arabic and Japanese,
just to mention a few.

New Language Styles, which represent different communication forms or
emerging stylistic trends for which the resources available for the
language are practically useless. This case includes the typical
examples of tweets and chat-speak communications, as well as other
informal forms of communication, in many languages.

The main topics of interest for this workshop include, but are not
limited to, the following ones:
* Construction and collection of monolingual resources
* Construction and collection of cross-language resources
* Annotation guidelines and evaluation
* Automatic extraction of linguistic resources
* Automatic annotation of linguistic resources
* Use of crowdsourcing for generating and annotating resources
* Use of pivot languages for bridging unconnected languages
* Methods to adapt existing resources to new domains and styles
* Generation of resources for informal communication styles
* Evaluation of monolingual resources: tasks and protocols
* Evaluation of cross-language resources: tasks and protocols


Submission instructions
-----------------------------------------
Authors are invited to submit papers on original and previously
unpublished work. Formatting should be according to LREC 2012
specifications using LaTeX or MS-Word style files (available soon at
the conference website, see http://www.lrec-conf.org/lrec2012/).

Submission is electronic in PDF format using the START submission
system at

https://www.softconf.com/lrec2012/CREDISLAS2012/

Double submission policy: Parallel submission to other meetings or
publications is possible, but the workshop contact person (see below)
must be notified immediately.

Authors of accepted papers will be invited to present their research
at the workshop. The workshop papers will be part of the LREC
proceedings and published on the web site of LREC 2012 before the
conference.


Important dates
----------------------------
February 26, 2012: Paper submissions due
March 16, 2012: Notification of acceptance
March 30, 2012: Camera-ready papers due
May 27, 2012: Workshop in Istanbul (afternoon session)


Organizers
---------------------
Contact person: Patrik Lambert
(e-mail: patrik.lambert@lium.univ-lemans.fr )

Patrik Lambert (University of Le Mans),
Marta R. Costa-jussà (Barcelona Media Innovation Center),
Rafael E. Banchs (Institute for Infocomm Research)


Programme committee
--------------------------------------
Marianna Apidianaki, LIMSI-CNRS, Orsay, France
Jordi Atserias, Yahoo! Research, Barcelona, Spain
Victoria Arranz, ELDA, Paris, France
Gareth Jones, Dublin City University, Ireland
Min-Yen Kan, National University of Singapore
Philipp Koehn, University of Edinburgh, UK
Udo Kruschwitz, University of Essex, UK
Yanjun Ma, Baidu Inc. Beijing, China
Sara Morrissey, Dublin City University, Ireland
Maja Popovic, DFKI, Berlin, Germany
Paolo Rosso, Universidad de Valencia, Spain
Marta Recasens, Stanford University, USA
Wade Shen, Massachusetts Institute of Technology, Cambridge, USA
Haifeng Wang, Baidu Inc. Beijing, China
