FSMNLP 2012 : Finite-State Methods and Natural Language Processing


Conference Series: Finite-State Methods and Natural Language Processing
When: Jul 23, 2012 - Jul 25, 2012
Where: Donostia - San Sebastián, Spain
Submission Deadline: May 9, 2012
Notification Due: May 31, 2012
Final Version Due: Jun 18, 2012
Categories: NLP

Call For Papers

Finite-State Methods and Natural Language Processing - FSMNLP 2012
Tenth International Workshop
University of the Basque Country, Donostia - San Sebastián
July 23-25, 2012


The International Workshop Series of Finite-State Methods and Natural
Language Processing (FSMNLP) is the premier forum of the ACL Special
Interest Group on Finite-State Methods (SIGFSM). It serves researchers and
practitioners working on

* natural language processing (NLP) applications or language resources
* theoretical and implementation aspects, or
* their combinations

that have obvious relevance or an explicitly discussed relation to
Finite-State Methods in NLP. This year, the FSMNLP workshop is part of the
Alan Turing Year, held on the occasion of the centenary celebration of
Turing's life and work.


FSMNLP invites papers related to themes including but not limited to:

* NLP applications and linguistic aspects of finite-state methods
* Finite-state models of language
* Practices for building morphological models for the world's languages
using finite-state technology
* Machine learning of finite-state models of natural language
* Finite-state manipulation software and tools with relevance to NLP


Finite-State Methods and Natural Language Processing - FSMNLP 2012
Tenth International Workshop
University of the Basque Country (Miramar Palace), Donostia - San Sebastián
July 23-25, 2012

Early registration, June 20:



July 23

15.00-18.00: TUTORIALS
Spelling and grammar correction with FSTs (Tommi Pirinen/University of Helsinki, Iñaki Alegria/University of the Basque Country, Mans Hulden/Ikerbasque (Basque Science Foundation))
Coffee Break
Probabilistic parsing with weighted FSTs (Miikka Silfverberg/University of Helsinki)
Grapheme-to-phoneme training and conversion with WFSTs (Josef Novak/The University of Tokyo)

July 24

9.00: Opening

9.30-10.30: Invited Speaker
Kimmo Koskenniemi

10.30-11.00: Coffee Break

11.00-12.00: Long Papers I
Effect of Language and Error Models on Efficiency of Finite-State Spell-Checking and Correction (Tommi Pirinen and Sam Hardwick)
Practical Finite State Optimality Theory (Dale Gerdemann and Mans Hulden)
12.00-13.00: Short Papers I
Handling Unknown Words in Arabic FST Morphology (Khaled Shaalan and Mohammed Attia)
Urdu – Roman Transliteration via Finite State Transducers (Tina Bögel)
Integrating Aspectually Relevant Properties of Verbs into a Morphological Analyzer for English (Katina Bontcheva)
Finite-State Technology In A Verse-Making Tool (Manex Agirrezabal, Iñaki Alegria, Bertol Arrieta and Mans Hulden)

13.00-14.30: Lunch

14.30-15.30: Short Papers II
DAGGER: A Toolkit for Automata on Directed Acyclic Graphs (Daniel Quernheim and Kevin Knight)
WFST-Based Grapheme-To-Phoneme Conversion: Open Source Tools for Alignment, Model-Building and Decoding (Josef Novak, Nobuaki Minematsu and Keikichi Hirose)
Kleene, A Free and Open-Source Language for Finite-State Programming (Kenneth R Beesley)
Implementation of Replace Rules using Preference Operator (Senka Drobac, Miikka Silfverberg and Anssi Yli-Jyrä)

15.30-16.30: Short Papers III
First Approaches On Spanish Medical Record Classification Using Diagnostic Term To Class Transduction (Alicia Pérez, Maite Oronoz, Arantza Casillas, Arantza Díaz de Ilarraza and Koldo Gojenola)
Developing an Open-Source FST Grammar for Verb Chain Transfer in a Spanish-Basque MT System (Aingeru Mayor, Mans Hulden and Gorka Labaka)
Conversion of Procedural Morphologies To Finite-State Morphologies: A Case Study Of Arabic (Mans Hulden and Younes Samih)
A Methodology for Obtaining Concept Graphs from Word Graphs (Marcos Calvo, Jon Ander Gómez, Lluís-F Hurtado and Emilio Sanchis)

16.30-18.00: Posters and coffee

20.30: Dinner (in the Old Town)

July 25

9.30-11.00: Long Papers II
A Finite-State Temporal Ontology and Event-Intervals (Tim Fernando)
A Finite-State Approach to Phrase-Based Statistical Machine Translation (Jorge González)
Finite-State Acoustic and Translation Model Composition in Statistical Speech Translation: Empirical Assessment (Alicia Pérez, M. Inés Torres and Francisco Casacuberta)

11.00-11.30: Coffee Break

11.30-12.30: Long Papers III
Refining the Design of a Contracting Finite-State Dependency Parser (Anssi Yli-Jyrä, Jussi Piitulainen and Atro Voutilainen)
Lattice-Based Minimum Error Rate Training using Weighted Finite-State Transducers with Tropical Polynomial Weights (Aurelien Waite, Graeme Blackwood and William Byrne)

12.30-13.30: Business Meeting


FSMNLP 2012 will have a special focus on the following topics:

* Practical implementations of linguistic descriptions with finite-state
technology, including grammars, machine learning tools, language-specific
challenges to finite-state NLP
* Software tools and utilities for finite-state NLP
* Finite-state models of linguistic theories
* Applications of finite-state-based NLP in closely related fields such
as comparative linguistics, text processing, field linguistics, applied
linguistics, language teaching, and computer-aided translation.

The special theme does not restrict the scope of papers; rather, its purpose
is to attract, alongside other topics, a variety of submissions relating to
any practical dimension of finite-state NLP. Succinct short-paper submissions
(see format below) focused on a specific practical aspect or solution of
finite-state NLP are especially welcome. These could relate to, for example,
a grammar, a linguistic phenomenon, a linguistic modeling problem, a machine
learning problem, or a software tool that implements or uses finite-state
technology. We hope that the theme will allow us to organize suitable poster,
short presentation, and demo sessions on particular practical problems.

For example, a large number of languages now have
morphological/phonological models based on finite-state technology. While
most such implementations follow standard design patterns, many grammars
also contain elegant non-trivial solutions to some language-specific
modeling problem (vowel harmony, reduplication, long-distance agreement,
opaque phonological alternations, free variation, morphosyntactic
restrictions, etc.). For this year's FSMNLP, we encourage submissions of
short papers that focus on documenting such solutions, at the same time
providing context by summarizing the development and implementation of the
overall grammar.


Nine FSMNLP workshops have been organized in the past: Blois (2011),
Pretoria (2009), Ispra (2008), Potsdam (2007), Helsinki (2005), Budapest
(2003), Helsinki (2001), Ankara (1998), Budapest (1996).


FSMNLP 2012 will take place shortly after CIAA. The 17th International
Conference on Implementation and Application of Automata (CIAA) will take
place in Porto, Portugal on July 17-20, 2012 (800 km from Donostia-San
Sebastián).

The consecutive dates and the European locations make it easy to attend
both conferences.


Submission Deadline: May 9, 2012
Notification: May 31, 2012
Camera-ready Version: June 18, 2012

Workshop Dates:
July 23: tutorials & mini-workshops.
July 24-25: main session, poster and demo sessions.


Papers should present original, unpublished research and implementation
results. Simultaneous submission to other venues with published proceedings
is prohibited. FSMNLP accepts two kinds of submissions:

* full papers (8 pages + references) reporting completed, significant research,
* short papers (4 pages + references) reporting ongoing work and partial
results, implementations, grammars, practical tools, interactive software demos, etc.

Short papers are expected to be presented as system demos in a demo session,
posters and/or short presentations, while long papers are presented in a
longer presentation session. Both paper types are published in the ACL

All submissions must be made electronically in PDF format via a web-based
submission server. Authors are strongly encouraged to use ACL (2012) style
(available for LaTeX and Word) in producing the PDF document. These templates are
available at:

Information about the author(s) and other identifying information such as
obvious self-references (e.g., "We showed in [12] ...") and financial or
personal acknowledgements should be omitted in the submitted papers whenever

Papers will be submitted electronically in PDF using the EasyChair
system. A paper can contain a clearly marked appendix and data files to
support its claims. This material will not be published. While reviewers
are urged to consult this extra material for better comprehension, it is at
their discretion whether they do so. Such extra material should also be
anonymized to the extent feasible.

Use the following link for submission:

The papers and abstracts will be published in FSMNLP 2012 proceedings and
archived in the ACL Anthology. The publication of selected, revised
versions of accepted papers in a special journal issue is planned.


Invited Speaker

Kimmo Koskenniemi (University of Helsinki)


We have planned a series of short tutorials (30-60 min. each) addressing
some specific application or topic in a concise manner. So far, the
following tutorials are planned:

* Spelling and grammar correction with FSTs (Iñaki Alegria, Mans Hulden)
* Probabilistic parsing with weighted FSTs (Miikka Silfverberg, University
of Helsinki)
* Machine learning of automata and transducers


Iñaki Alegria (University of the Basque Country)
Kenneth R. Beesley (SAP Business Objects, USA)
Francisco Casacuberta (Instituto Tecnológico de Informática, Spain)
Jan Daciuk (Gdansk University of Technology, Poland)
Frank Drewes (Umea University, Sweden)
Dale Gerdemann (University of Tübingen, Germany)
Mike Hammond (University of Arizona, USA)
Thomas Hanneforth (University of Potsdam, Germany)
Colin de la Higuera (University of Nantes, France)
Jan Holub (Czech Technical University in Prague, Czech Republic)
Mans Hulden (Ikerbasque, Basque Country)
André Kempe (CADEGE Technologies & Consulting, France)
Andras Kornai (Eötvös Loránd University, Hungary)
Andreas Maletti (University of Stuttgart, Germany)
Mark-Jan Nederhof (University of St Andrews, Scotland)
Kemal Oflazer (Carnegie Mellon University, Qatar)
Maite Oronoz (University of the Basque Country)
Laurette Pretorius (University of South Africa, South Africa)
Strahil Ristov (Ruder Boskovic Institute, Croatia)
Frederique Segond (ObjectDirect, France)
Max Silberztein (Université de Franche-Comté, France)
Richard Sproat (Oregon Health and Science University, USA)
Trond Trosterud (University of Tromso, Norway)
Shuly Wintner (University of Haifa, Israel)
Anssi Yli-Jyrä (University of Helsinki, Finland)
Menno van Zaanen (Tilburg University, Netherlands)
Lynette van Zijl (Stellenbosch University, South Africa)


Iñaki Alegria (University of the Basque Country)
Koldo Gojenola (University of the Basque Country)
Izaskun Etxeberria (University of the Basque Country)
Nerea Ezeiza (University of the Basque Country)
Mans Hulden (Ikerbasque)
Amaia Lorenzo (University of the Basque Country)
Esther Miranda (University of the Basque Country)
Maite Oronoz (University of the Basque Country)

Further information about the conference is available at the FSMNLP
Conference website:

