
IE-IR-LSL 2009 : Information Retrieval and Information Extraction for less resourced languages


When Sep 7, 2009 - Sep 7, 2009
Where San Sebastian, Spain
Submission Deadline Jun 8, 2009
Notification Due Jul 1, 2009
Final Version Due Jul 15, 2009
Categories    NLP

Call For Papers

Information Retrieval and Information Extraction
for less resourced languages.

SEPLN 2009 pre-conference workshop
University of the Basque Country
Donostia-San Sebastián. Monday 7th September 2009

Organised by the SALTMIL Special Interest Group of ISCA
SEPLN 2009:
Paper submission:
Deadline for submission: 8 June 2009

Papers are invited for the above half-day workshop, in the format
outlined below. Most submitted papers will be presented in poster form,
though some authors may be invited to present in lecture format.


The phenomenal growth of the Internet has led to a situation where, by
some estimates, more than one billion words of text are currently
available. This is far more text than any given person can possibly
process. Hence there is a need for automatic tools to access and process
this mass of textual information. Emerging techniques of this kind
include Information Retrieval (IR), Information Extraction (IE), and
Question Answering (QA).

However, there is a growing concern among researchers about the
situation of languages other than English. Although not all Internet
text is in English, it is clear that non-English languages do not have
the same degree of representation on the Internet. By a simple count of
Wikipedia articles, English is the only language with more
than 20 percent of the available articles. There then follows a group of
17 languages with between one and ten percent of the articles. The
remaining 245 languages each have less than one percent of the articles.
Even these low-profile languages are relatively privileged, as the
total number of languages in the world is estimated to be 6800.

Clearly there is a danger that the gap between high-profile and
low-profile languages on the Internet will continue to increase, unless
tools are developed for the low-profile languages to access textual
information. Hence there is a pressing need to develop basic language
technology software for less-resourced languages as well. In particular,
the priority is to adapt the scope of recently-developed IE, IR and QA
systems so that they can be used also for these languages. In doing so,
several questions will naturally arise, such as:

- What problems emerge when faced with languages having different
linguistic features from the major languages?
- Which techniques should be promoted in order to get the maximum
yield from sparse training data?
- What standards will enable researchers to share tools and techniques
across several different languages?
- Which tools are easily re-useable across several unrelated languages?

It is hoped that presentations will focus on real-world examples, rather
than purely theoretical discussions of the questions. Researchers are
encouraged to share examples of best practice -- and also examples where
tools have not worked as well as expected. Also of interest will be
cases where the particular features of a less-resourced language raise a
challenge to currently accepted linguistic models that were based on
features of major languages.


Given the context of IR, IE and QA, topics for discussion may include,
but are not limited to:

- Information retrieval;
- Text and web mining;
- Information extraction;
- Text summarization;
- Term recognition;
- Text categorization and clustering;
- Question answering;
- Re-use of existing IR, IE and QA data;
- Interoperability between tools and data;
- General speech and language resources for minority languages, with
particular emphasis on resources for IR, IE and QA.


8 June 2009 Deadline for submission
1 July 2009 Notification
15 July 2009 Final version
7 September 2009 Workshop


Kepa Sarasola, University of the Basque Country
Mikel Forcada, Universitat d'Alacant, Spain
Iñaki Alegria, University of the Basque Country
Xabier Arregi, University of the Basque Country
Arantza Casillas, University of the Basque Country
Briony Williams, Language Technologies Unit, Bangor University, Wales, UK


Iñaki Alegria, University of the Basque Country
Atelach Alemu Argaw, Stockholm University, Sweden
Xabier Arregi, University of the Basque Country
Jordi Atserias, Barcelona Media (Yahoo! Research Barcelona)
Shannon Bischoff, Universidad de Puerto Rico, Puerto Rico
Arantza Casillas, University of the Basque Country
Mikel Forcada, Universitat d'Alacant, Spain
Xavier Gomez Guinovart, University of Vigo
Lori Levin, Carnegie-Mellon University, USA
Climent Nadeu, Universitat Politècnica de Catalunya
Jon Patrick, University of Sydney, Australia
Juan Antonio Pérez-Ortiz, Universitat d'Alacant, Spain
Bojan Petek, University of Ljubljana, Slovenia
Kepa Sarasola, University of the Basque Country
Oliver Streiter, National University of Kaohsiung, Taiwan
Vasudeva Varma, IIIT, Hyderabad, India
Briony Williams, Bangor University, Wales, UK


We expect short papers of at most 3500 words (about 4-6 pages) describing
research addressing one of the above topics, to be submitted as PDF
documents by uploading to the following URL:
The final papers should not have more than 6 pages, adhering to the
stylesheet that will be adopted for the SEPLN Proceedings (to be
announced later on the Conference web site).
