
DANLP 2010 : ACL 2010 Workshop on Domain Adaptation for Natural Language Processing


When Jul 15, 2010 - Jul 15, 2010
Where Uppsala, Sweden
Submission Deadline Apr 5, 2010
Notification Due May 6, 2010
Final Version Due May 16, 2010
Categories    NLP

Call For Papers


ACL 2010 Workshop on Domain Adaptation
for Natural Language Processing (DANLP 2010)

July 15, 2010, Uppsala, Sweden


Most modern Natural Language Processing (NLP) systems are subject to
the well-known problem of lack of portability to new domains/genres:
there is a substantial drop in their performance when tested on data
from a new domain, i.e., their test data is drawn from a related but
different distribution from their training data. This problem is
inherent in the assumption of independent and identically distributed
(i.i.d.) variables for machine learning systems, but has started to
get attention only in recent years. The need for domain adaptation
arises in almost all NLP tasks: part-of-speech tagging, semantic role
labeling, statistical parsing and statistical machine translation, to
name but a few.

Studies on supervised domain adaptation (where there are limited
amounts of annotated resources in the new domain) have shown that
baselines comprising very simple models (e.g., models based only on
source-domain data, only target-domain data, or the union of the two)
achieve relatively high performance and are "surprisingly difficult to
beat" (Daumé III, 2007). Thus, one conclusion from that line of work
is that as long as there is a reasonable (often even small) amount of
labeled target data, it is often more fruitful to just use that.

In contrast, semi-supervised adaptation (i.e., no annotated resources
in the new domain) is a much more realistic situation but is clearly
also considerably more difficult. Current studies on semi-supervised
approaches show very mixed results. For example, Structural
Correspondence Learning (Blitzer et al., 2006) was applied
successfully to classification tasks, while only modest gains could be
obtained for structured output tasks like parsing. Many questions thus
remain open.

The goal of this workshop is to provide a meeting point for research
that approaches the problem of adaptation from the varied perspectives
of machine learning and a variety of NLP tasks such as parsing,
machine translation, word sense disambiguation, etc. We believe there
is much to gain by treating domain adaptation as a general learning
strategy that utilizes prior knowledge of a specific or a general
domain in learning about a new domain; here the notion of a 'domain'
could be as varied as child language versus adult language, or the
source-side re-ordering of words to target-side word order in a
statistical machine translation system.

Sharing insights, methodologies and successes across tasks will thus
contribute towards a better understanding of this problem. For
instance, self-training the Charniak parser alone was not effective
for adaptation (it has been common wisdom that self-training is
generally not effective), but self-training with a reranker was
surprisingly highly effective (McClosky et al., 2006). Is this an
insight into adaptation that can be used elsewhere? We believe that
the key to future success will be to exploit large collections of
unlabeled data in addition to labeled data, not only because unlabeled
data is easier to obtain, but also because existing labeled resources
are often not even close to the envisioned target application domain.
Directly related is the question of how to measure closeness (or
differences) among domains.

Workshop Topics

We especially encourage submissions on semi-supervised approaches to
domain adaptation with a deep analysis of models, data and results,
although we do not exclude papers on supervised adaptation. In
particular, we welcome submissions that address any of the following
topics or other relevant issues:

* Algorithms for semi-supervised DA
* Active learning for DA
* Integration of expert/prior knowledge about new domains
* DA in specific applications (e.g., Parsing, MT, IE, QA, IR, WSD)
* Automatic domain identification and model adjustment
* Porting algorithms developed for one type of problem structure to another (e.g.
from binary classification to structured-prediction problems)
* Analysis and negative results: in-depth analysis of results, i.e. which model
parts/parameters are responsible for successful adaptation; what can we learn
from negative results (impact of negative experimental results on learning strategies/
* A complementary perspective: (Better) generalization of ML models, i.e. to
make NLP models more broad-coverage and domain-independent, rather than
* Learning from multiple domains


Papers should be submitted via the ACL submission system:

All submissions are limited to 6 pages (including references) and
should be formatted using the ACL 2010 style file that can be found at:

As the reviewing will be blind, papers must not include the authors'
names and affiliations. Submissions should be in English and should
not have been published previously. If essentially identical papers
are submitted to other conferences or workshops as well, this fact
must be indicated at submission time.

The submission deadline is 23:59 CET on April 5, 2010.

Important Dates

April 5, 2010: Submission deadline
May 6, 2010: Notification of acceptance
May 16, 2010: Camera-ready papers due
July 15, 2010: Workshop

Invited speaker

John Blitzer, University of California, United States


Organizers

Hal Daumé III, University of Utah, USA
Tejaswini Deoskar, University of Amsterdam, The Netherlands
David McClosky, Stanford University, USA
Barbara Plank, University of Groningen, The Netherlands
Jörg Tiedemann, Uppsala University, Sweden

Program Committee

Eneko Agirre, University of the Basque Country, Spain
John Blitzer, University of California, United States
Walter Daelemans, University of Antwerp, Belgium
Mark Dredze, Johns Hopkins University, United States
Kevin Duh, NTT Communication Science Laboratories, Japan (formerly University of Washington, Seattle)
Philipp Koehn, University of Edinburgh, United Kingdom
Jing Jiang, Singapore Management University, Singapore
Oier Lopez de Lacalle, University of the Basque Country, Spain
Robert Malouf, San Diego State University, United States
Ray Mooney, University of Texas, United States
Hwee Tou Ng, National University of Singapore, Singapore
Khalil Sima'an, University of Amsterdam, The Netherlands
Michel Simard, National Research Council of Canada, Canada
Jun'ichi Tsujii, University of Tokyo, Japan
Antal van den Bosch, Tilburg University, The Netherlands
Josef van Genabith, Dublin City University, Ireland
Yi Zhang, German Research Centre for Artificial Intelligence (DFKI GmbH) and Saarland University, Germany


This workshop is kindly supported by the Stevin project PaCo-MT (Parse
and Corpus-based Machine Translation).


