
GermEval 2014 : GermEval 2014 Named Entity Recognition Shared Task for German


When Oct 6, 2014 - Oct 10, 2014
Where Hildesheim, Germany
Submission Deadline Aug 15, 2014
Notification Due Sep 1, 2014
Final Version Due Sep 15, 2014
Categories    shared task   named entity recognition   german

Call For Papers

GermEval 2014 Named Entity Recognition Shared Task for German

Co-located with KONVENS 2014, October 8-10, Hildesheim, Germany

Named Entity Recognition (NER) has been shown to be useful for a wide range of NLP tasks, from Information Extraction to Speech Processing.
For Semantic Web applications like entity linking, NER is a crucial preprocessing step.
Even though German is a relatively well-resourced language, NER for German has been challenging, both because capitalization is a less useful feature than in other languages, and because existing training data sets are encumbered by license problems. Therefore, no publicly available NER taggers for German exist that are free of usage restrictions and perform at high levels of accuracy.

The GermEval 2014 NER Shared Task makes available CC-licensed German data with NER annotation, with the goal of significantly advancing the state of the art in German NER and pushing the field of NER towards nested representations of named entities.

We invite all researchers and industry professionals to participate in the challenge and to demonstrate their ability to build a Named Entity Recognition system for German. The systems will be evaluated on a manually created test set. Training data and development data will be provided. There are no restrictions on the type of NER system submitted, and no restrictions on the use of external data, background corpora, lexical resources, etc.

GermEval 2014 NER is associated with the KONVENS 2014 conference and will take place as a KONVENS workshop in Hildesheim in October 2014.

Task Setup
The GermEval 2014 NER Shared Task builds on a new dataset with German Named Entity annotation [1] with the following properties:

- The data was sampled from German Wikipedia and News Corpora as a collection of citations.
- The dataset covers over 31,000 sentences corresponding to over 590,000 tokens.
- The NER annotation uses the NoSta-D guidelines, which extend the Tübingen Treebank guidelines, using four main NER categories with sub-structure, and annotating embeddings among NEs such as [ORG FC Kickers [LOC Darmstadt]].
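
The bracketed example in the last item can be read as two overlapping spans over the same tokens. The following is a minimal sketch of that reading, not part of the task's tooling: the function name parse_bracketed and the (label, start, end) span representation are illustrative assumptions.

```python
from typing import List, Tuple

# (label, start_token, end_token_exclusive); this representation is an assumption
Span = Tuple[str, int, int]


def parse_bracketed(text: str) -> Tuple[List[str], List[Span]]:
    """Parse bracket notation such as '[ORG FC Kickers [LOC Darmstadt]]'.

    Returns the plain tokens and one span per bracket pair, so that an
    embedded entity yields a span contained in the span of its parent.
    """
    tokens: List[str] = []
    spans: List[Span] = []
    stack: List[Tuple[str, int]] = []  # currently open entities: (label, start)

    # Pad brackets with spaces so that a plain split() isolates them.
    for piece in text.replace("[", " [").replace("]", " ] ").split():
        if piece.startswith("["):
            stack.append((piece[1:], len(tokens)))
        elif piece == "]":
            if not stack:
                raise ValueError("closing bracket without an opening bracket")
            label, start = stack.pop()
            spans.append((label, start, len(tokens)))
        else:
            tokens.append(piece)

    if stack:
        raise ValueError("opening bracket without a closing bracket")
    return tokens, spans


tokens, spans = parse_bracketed("[ORG FC Kickers [LOC Darmstadt]]")
print(tokens)
print(spans)
```

Here the inner LOC span covers only the last token, while the outer ORG span covers all three tokens.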

Data and Guidelines are available for download at

We split the dataset [1] into training, development and test sets and provide the datasets in a tab-separated (TSV) format.
- Training Set
- Development Set
- Test Set (Available August 1, 2014 in unannotated form, from September 1, 2014 in annotated form)
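
As a rough illustration of how such TSV files might be consumed, here is a minimal reader sketch. The column layout is an assumption that this call does not specify: each token line is taken to hold four tab-separated fields (token index, token, outer-level tag, inner-level tag), lines starting with "#" are treated as comments, and blank lines are treated as sentence separators. The tag values in the sample are likewise assumed. Consult the readme shipped with the data for the authoritative format.

```python
import io
from typing import Iterator, List, TextIO, Tuple

# (token, outer_tag, inner_tag); the two-level layout is an assumption
Row = Tuple[str, str, str]


def read_sentences(handle: TextIO) -> Iterator[List[Row]]:
    """Yield one list of rows per sentence from an assumed 4-column TSV."""
    sentence: List[Row] = []
    for line_no, raw in enumerate(handle, start=1):
        line = raw.rstrip("\r\n")
        if line.startswith("#"):      # assumed comment / source line
            continue
        if not line.strip():          # blank line closes the current sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(
                f"line {line_no}: expected 4 tab-separated fields, got {len(fields)}"
            )
        _index, token, outer, inner = fields
        sentence.append((token, outer, inner))
    if sentence:                      # file may end without a trailing blank line
        yield sentence


# Hypothetical sample data, written for this sketch only.
SAMPLE = (
    "#\tdemo\n"
    "1\tFC\tB-ORG\tO\n"
    "2\tKickers\tI-ORG\tO\n"
    "3\tDarmstadt\tI-ORG\tB-LOC\n"
    "\n"
    "1\tHallo\tO\tO\n"
)

for sent in read_sentences(io.StringIO(SAMPLE)):
    print(sent)
```

Raising on a wrong field count means the same loop can serve as a simple format check before a file is submitted.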

Further, we provide an evaluation script (adapted from the CoNLL competitions) that assesses a given TSV file against a gold standard, and a verifier that tests whether a given file is a valid TSV file:

- evaluation script (Available in the course of March, 2014)
- verifier script (Available in the course of March, 2014)
- readme: how to operate these scripts
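
For orientation, CoNLL-style evaluation is usually understood as exact-match scoring over entity spans. The sketch below illustrates that idea for a single tag column and is not the official evaluation script: the BIO tag scheme, the function names, and the handling of stray I- tags are assumptions, and the official script may treat the nested annotation levels differently.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[str, int, int]  # (label, start_token, end_token_exclusive)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert a BIO tag sequence (assumed scheme) into a set of spans."""
    spans: Set[Span] = set()
    label: Optional[str] = None
    start = 0
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((label, start, i))
            label = None
        # A stray I- tag with no open entity is treated as a start (assumption).
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            label, start = tag[2:], i
    return spans


def precision_recall_f1(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match span scores: a prediction counts only if label and both
    boundaries agree with a gold span."""
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


gold = bio_to_spans(["B-ORG", "I-ORG", "I-ORG", "O", "O", "B-LOC"])
pred = bio_to_spans(["B-ORG", "I-ORG", "I-ORG", "O", "O", "B-ORG"])
print(precision_recall_f1(gold, pred))
```

In this example the prediction gets the three-token ORG span right but mislabels the final token, so one of two predicted spans and one of two gold spans match.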

There is just one track. Participants may use arbitrary knowledge sources to model the data and may submit up to three runs.

Submissions consist of a TSV file providing predictions for the test data and a paper of up to 4 pages (including references) describing the chosen approach and analyzing the performance. Papers should follow the KONVENS 2014 style files. The papers will be published online. We expect authors to present summaries of their systems at the KONVENS workshop.

Important Dates

March 1, 2014: Call for Participation, including training and development data and evaluation framework
August 1-15, 2014: Availability of test data and submission of model results
August 15, 2014: Deadline for Shared Task description submissions
September 1, 2014: Notification of acceptance and Shared Task results
September 15, 2014: Deadline for camera-ready papers
October 8-10, 2014: KONVENS 2014

Chris Biemann
Language Technology, Technische Universität Darmstadt

Sebastian Padó
IMS, Stuttgart University

[1] D. Benikova, C. Biemann, M. Reznicek. NoSta-D Named Entity Annotation for German: Guidelines and Dataset. To be presented at LREC 2014, Reykjavik.
