FSMNLP 2009 : Finite-State Methods and Natural Language Processing


Conference Series : Finite-State Methods and Natural Language Processing
When Jul 21, 2009 - Jul 24, 2009
Where Pretoria, South Africa
Submission Deadline Apr 13, 2009
Notification Due May 12, 2009
Final Version Due May 10, 2009
Categories    finite-state methods   natural language processing   finite automata   computational linguistics

Call For Papers


Finite-State Methods and Natural Language Processing FSMNLP 2009
Second Call for Papers

** Submission deadline for full papers extended to 13 April **

Eighth International Workshop
University of Pretoria, South Africa

21-24 July 2009

As in 2008, FSMNLP is merged with the FASTAR (Finite Automata
Systems Theoretical and Applied Research) workshop.


The International Workshop Series of Finite State Methods and Natural
Language Processing (FSMNLP) is a forum for researchers and
practitioners working on
- NLP applications or language/language technology research resources,
- theoretical and implementation aspects, or
- their combinations
having *obvious relevance or an explicitly discussed relation* to
Finite-State Methods in NLP.

In the past, seven FSMNLP workshops have been organised (Budapest
1996, Ankara 1998, Helsinki 2001, Budapest 2003, Helsinki 2005,
Potsdam 2007, Ispra 2008).

We invite submissions related to all obvious or traditional FSMNLP
topics, see e.g. FSMNLP 2008. The updated list of topics includes
- all the obvious or traditional topics
- plus some new topics such as:
* common interfaces, portability, and shared methods for testing/
benchmarking/evaluation of finite-state tools
* coping with large alphabets during finite-state compilation and
in real-world applications
* fixed parameter tractability and narrowness in streamed NLP
* conventional/parallel algorithms using/manipulating
conventional/stochastic finite-state automata/paths
* applications of rational kernels to active/statistical machine
learning of finite-state models.


In recognition of its location on the African continent, this year's
FSMNLP has Finite-State Methods for Under-Resourced Languages as a
special theme. The theme is relevant to finite-state methods

- applied to practical tasks such as language survey, elicitation,
data collection, computer-aided annotation, morphological
description, modelling and normalization,
- considering demanding conditions such as linguistic complexity
and diversity, scarce resources, research infrastructures, real-
time grammar updates,
- in language processing fields such as comparative linguistics,
field linguistics, applied linguistics, language teaching, and
computer-aided translation.

The special theme does not restrict the scope but attempts to draw the
attention of contributors to the challenges of computational
linguistics in Africa. We hope that the theme teases out promising and
useful applications of Finite-State Methods in this context.



To catalyze discussion and participation, three special sessions or
subworkshops will be organized, containing presentations of regular
papers and extended abstracts presenting ongoing research, tutorials,
competitions etc. relating to the following topic areas:

1. "Finite-State Methods for African and other Under-Resourced/
Low-Density Languages"

Under-resourced/low-density languages, including many African
languages, often require documentation and language development, in
particular including methods for field linguistics, localized
information technology and basic language resources. However, the
conditions for applying large scale statistical methods do not hold in
general, while existing knowledge-based methods may not directly
generalise to low-density languages. The purpose of this subworkshop
is to take a fresh look at new resources, methods, collaboration and
innovative approaches for these (clusters of) languages. Submissions
may be concerned with the following aspects of the subworkshop topic:

- project proposals and joint projects
- basic language resources
- innovative approaches to language clusters
- field linguistics and language documentation
- iterative language description
- resource-efficient machine learning.

The local organizers are Laurette Pretorius and Sonja Bosch. The
subworkshop is tentatively accompanied by two tutorials given by Kemal
Oflazer and Colin de la Higuera (see TUTORIALS AND INVITED TALKS).

2. "Practical Aspects and Experience of FS Methods and Systems"

The last decade has seen a significant increase in the number of
toolkits and implementations of finite-state systems for NLP. Work on
such implementations has highlighted a wide variety of implementation
issues. Unfortunately, much of this knowledge remains a trade secret or
is only embodied in implementations themselves, and is not presented
academically at conferences. This special session focuses on such
issues and knowledge and covers the following topics (as they relate
to NLP and real-life implementation issues):

- user interfaces, visualisation, tracing and debugging
- specification formalisms and languages (e.g. grammars, regular
relations, etc.)
- application programmer interfaces, interchange formats
- performance, profiling and tuning techniques
- classification, comparison and evaluation
- data-structures and representations
- compression of alphabets, lexicons and rules

3. "Tree Automata and Transducers"

In recent years, applications of formal tree language theory in
natural language processing have been on the rise, as witnessed by
papers at conferences and in journals on formal language theory,
finite automata, natural language processing, and computational
linguistics. FSMNLP 2009 will therefore have a special session/
subworkshop on tree automata and tree transducers in natural
language processing.

This subworkshop includes (but is not limited to) the following topics
as long as they relate to natural language processing:

- unweighted and weighted tree languages,
- unweighted and weighted tree transformations,
- the formalisms to represent and model them (including a.o. tree
grammars, tree automata, tree expressions, tree transducers),
- expressiveness of such models and representations,
- relations to synchronous grammars,
- learning of such models and representations,
- algorithms for pattern matching, accepting, parsing of tree
languages, and
- large-scale applications (including those in statistical machine

For each of the three subworkshops, a subcommittee of the PC supported
by further experts from the field is responsible for the program.
Please refer to the FSMNLP website for the lists of subcommittee
members and further experts.


We expect to have tutorials on the following tentative topics:

- "Developing Computational Morphology for Low- and Middle-Density
Languages" by Kemal Oflazer (Sabanci University, Turkey)
- "Machine Learning with Automata" by Colin de la Higuera
(Jean Monnet University, Saint-Etienne, France)
- "OpenFST" by Johan Schalkwyk (Google, USA)

The INVITED SPEAKERS will be announced soon.


During FSMNLP 2009, we hope to announce a small competition / shared
task related to machine learning of morphology.


FSMNLP will tentatively have a slot for a SIGFSM business meeting.
SIGFSM is currently in the final stages of formal approval as a
Special Interest Group in the Association for Computational
Linguistics (ACL).



We invite submissions of full papers, i.e. scientific
contributions presenting new theoretical or experimental
results. Papers should present original, unpublished research results
and should not be submitted elsewhere simultaneously.

We also invite submission of extended abstracts containing or
describing systems descriptions/demos, progress reports/ongoing work,
joint projects/project proposals, small focused contributions,
negative results, and opinion pieces, related to any of the
subworkshop themes or the broader FSMNLP themes.

We particularly invite demo submissions, which should consist of an
extended abstract of the technical content with authors (!), full
contact information, references, acknowledgements, plus a "script
outline" of the presentation and a detailed description of hardware,
software and internet requirements.

Note that the early acceptance notification date for full papers may
help to keep travel costs for international participants reasonably
low. If you come from far away and have only an extended abstract, the
abstract can be submitted earlier as if it were a full paper.

The information about the author(s) should be omitted in the submitted
papers since the review process will be double-blind, except for demo
submissions. Submissions are electronic and in PDF format via a
web-based submission server.

Authors are encouraged to use Springer LNCS style (Proceedings and
Other Multiauthor Volumes) for LaTeX in producing the PDF
document. For graph visualization, Vaucanson-G LaTeX style,
Graphviz/dot and XFig are recommended. If you use a non-roman script
or Microsoft Word, it is advisable to warn the organizers as early as
possible. The page limit is 12 pages for full papers and 8 pages for
extended abstracts.


The on-site proceedings will be on CD.

The actual proceedings with revised regular papers will be published
after the conference in a volume of Lecture Notes in Artificial
Intelligence as a part of the LNCS Series by Springer-Verlag.

High-quality extended abstracts may be invited to be included in the
LNCS postproceedings, while other extended abstracts may be
published as arranged by subworkshop organizers.

In addition, a special journal issue on Finite-State Methods and
Models in NLP is being planned. Extended versions of the papers and
abstracts may be submitted to this special issue (the publication
involves a second review/selection cycle).


Full paper submissions due: 13 April 2009 **CHANGED**
Notification of acceptance for full papers: 12 May 2009

Extended abstract submissions due: 17 May 2009
Notification of acceptance for extended abstracts: 14 June 2009

Deadline for inclusion in preproceedings: 28 June 2009



Bruce Watson (University of Pretoria, South Africa)


Andras Kornai (Budapest Institute of Technology, Hungary and
MetaCarta, Cambridge, USA) (PC chair)
Jacques Sakarovitch (Ecole nationale supérieure des
Télécommunications, Paris, France) (PC chair)
Anssi Yli-Jyrä (University of Helsinki) (PC chair)

Cyril Allauzen (Google Research, New York, USA)
Sonja Bosch (University of South Africa, South Africa)
Francisco Casacuberta (Instituto Tecnologico De Informática,
Valencia, Spain)
Damir Cavar (University of Zadar, Croatia)
Jean-Marc Champarnaud (Université de Rouen, France)
Loek Cleophas (University of Pretoria, South Africa)
Maxime Crochemore (King's College London, UK)
Jan Daciuk (Gdańsk University of Technology, Poland)
Frank Drewes (Umea University, Sweden)
Dafydd Gibbon (University of Bielefeld, Germany)
John Goldsmith (University of Chicago, USA)
Karin Haenelt (Fraunhofer Gesellschaft and University of
Heidelberg, Germany)
Thomas Hanneforth (University of Potsdam, Germany)
Colin de la Higuera (Jean Monnet University,
Saint-Etienne, France)
Johanna Högberg (Umea University, Sweden)
Arvi Hurskainen (University of Helsinki, Finland)
Lauri Karttunen (Palo Alto Research Center and
Stanford University, USA)
André Kempe (Yahoo Search Technologies, Paris, France)
Kevin Knight (University of Southern California, USA)
Derrick Kourie (University of Pretoria, South Africa)
Marcus Kracht (University of California, Los Angeles, USA)
Hans-Ulrich Krieger (DFKI GmbH, Saarbrücken, Germany)
Eric Laporte (Université de Marne-la-Vallée, France)
Andreas Maletti (Universitat Rovira i Virgili, Spain)
Michael Maxwell (University of Maryland, USA)
Stoyan Mihov (Bulgarian Academy of Sciences, Sofia, Bulgaria)
Kemal Oflazer (Sabanci University, Turkey)
Jakub Piskorski (Polish Academy of Sciences, Warsaw, Poland)
Laurette Pretorius (University of South Africa, South Africa)
Michael Riley (Google Research, New York, USA)
Strahil Ristov (Ruder Boskovic Institute, Zagreb, Croatia)
James Rogers (Earlham College, USA)
Max Silberztein (Université de Franche-Comté, France)
Bruce Watson (University of Pretoria, South Africa)
Sheng Yu (University of Western Ontario, Canada)
Menno van Zaanen (Tilburg University, the Netherlands)
Lynette van Zijl (Stellenbosch University, South Africa)


Loek Cleophas (University of Pretoria) (OC chair)
Derrick Kourie (University of Pretoria)
Jakub Piskorski (Polish Academy of Sciences, Warsaw, Poland)
Bruce Watson (University of Pretoria)
Anssi Yli-Jyrä (University of Helsinki)

