
MLSP 2024 : Multilingual Lexical Simplification Pipeline (MLSP) Shared Task @ 19th Workshop on Innovative Use of NLP for Building Educational Applications


Link: https://sites.google.com/view/mlsp-sharedtask-2024/home
 
When Jun 21, 2024 - Jun 21, 2024
Where Mexico City
Submission Deadline Mar 25, 2024
Categories    natural language processing   artificial intelligence
 

Call For Papers

The organisers are pleased to announce a new shared task, inviting participants to contribute novel systems for a Multilingual Lexical Simplification Pipeline. This task comprises lexical complexity prediction and lexical simplification, uniting these two core simplification tasks into a single pipeline. We invite participants to develop new lexical simplification systems for these two tasks in a variety of high- and low-resource languages (listed below).

This shared task will be hosted at the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024), which will be co-located with the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024) in Mexico City, June 21-22, 2024.

Lexical complexity prediction was previously explored in the LCP 2021 shared task, hosted as part of SemEval 2021 (Shardlow et al. 2021). Participants were presented with a target word in a sentence and asked to judge its difficulty in context on a continuous scale from 0 (easy to understand) to 1 (hard to understand).

Lexical simplification has also recently been explored in the TSAR 2022 shared task (Saggion et al. 2022), hosted as part of the Workshop on Text Simplification, Accessibility, and Readability at EMNLP 2022. In that task, systems must provide easier-to-understand alternatives for a given identified complex word in its context.

The lexical simplification pipeline unites these two tasks. Given a sentence with a marked token, the system must first predict the complexity of that token and then provide potential simpler alternatives for it, or none if the token is judged not to require simplification. By co-developing systems to jointly perform these tasks, participants will create a working lexical simplification pipeline that can be applied in settings such as education to improve the readability of texts for learners.
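As a rough sketch of this two-stage structure (a hypothetical illustration only: the function names and the 0.5 threshold below are placeholders, not part of the task definition), a participant system might be organised as follows:

```python
# Minimal sketch of a lexical simplification pipeline; predict_complexity,
# generate_substitutions, and the threshold are illustrative assumptions only.
from typing import List, Optional


def predict_complexity(token: str, context: str) -> float:
    """Placeholder: return a complexity score in [0, 1] for `token` in `context`."""
    raise NotImplementedError


def generate_substitutions(token: str, context: str) -> List[str]:
    """Placeholder: return candidate simpler alternatives, ranked best first."""
    raise NotImplementedError


def simplify(token: str, context: str, threshold: float = 0.5) -> Optional[List[str]]:
    """Stage 1: score the token's complexity. Stage 2: suggest substitutes,
    or return None if the token is judged simple enough to leave unchanged."""
    if predict_complexity(token, context) <= threshold:
        return None
    return generate_substitutions(token, context)
```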

**Languages**

We will provide evaluation data for the following languages:

- English (en)
- French (fr)
- Brazilian Portuguese (pt-br)
- Bengali (bn)
- Sinhala (si)
- Filipino (fil)
- Japanese (jp)
- Italian (it)

We also hope to announce at least three further languages for participation.

Participants are free to submit to one or multiple languages. We strongly encourage submissions from multilingual systems that are capable of handling the languages that we have released and further languages beyond the scope of the task. We will provide a separate ranking for multilingual systems that participate in all languages.

**Dataset Format**

There is now a glut of available resources for simplification tasks such as lexical complexity prediction and lexical simplification. As such, for each language we will provide an unlabelled **test set only**, comprising 570 instances. Labelled trial data will also be released, comprising 30 instances per language, for the purpose of calibrating systems for the evaluation phase. **We will not release new training data for this task.** Participants are encouraged to make use of the many existing resources for lexical complexity prediction and lexical simplification to train their systems. A list of available resources will be hosted on the shared task website.

Each data instance in the trial data will comprise the following fields: *language, token, begin, end, context, complexity, substitutions*. These are described below:

- Language: The language code for this instance
- Token: The identified (whole-word) token to be evaluated
- Begin: The begin-offset of the token in the context
- End: The end-offset of the token in the context
- Context: The context in which the token appeared; typically, but not necessarily, bounded by the enclosing sentence
- Complexity: A complexity score bounded in the range 0-1, derived from asking 10 annotators to judge the token in its context on a scale of 1 (easy) to 5 (difficult)
- Substitutions: A list of no more than 10 substitutions, ranked by how frequently they were suggested by the annotators

Each data instance in the test data will comprise the following fields: *language, token, begin, end, context*. Participant systems will provide the ‘complexity’ and ‘substitutions’ fields in the same format as the trial data.
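Assuming the instances are distributed as tab-separated values with the fields in the order listed above and the substitutions occupying the remaining columns (the exact file layout is not specified in this call), a minimal reader might look like this:

```python
# Minimal reader for trial-data instances, assuming one tab-separated line per
# instance with the fields in the order given above (the layout is an assumption).
import csv
from dataclasses import dataclass
from typing import List


@dataclass
class TrialInstance:
    language: str
    token: str
    begin: int
    end: int
    context: str
    complexity: float         # gold complexity score in [0, 1]
    substitutions: List[str]  # gold substitutes, most frequently suggested first


def read_trial_file(path: str) -> List[TrialInstance]:
    """Read trial instances from a tab-separated file (one instance per line)."""
    instances = []
    with open(path, encoding="utf-8", newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            language, token, begin, end, context, complexity, *subs = row
            instances.append(TrialInstance(language, token, int(begin), int(end),
                                           context, float(complexity), subs))
    return instances
```

The mapping from the 1-5 annotator ratings to the 0-1 complexity score is not spelled out in this call; one plausible aggregation, used for the CompLex data in Shardlow et al. (2021), maps each rating r to (r - 1)/4 and averages across annotators.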

**Evaluation**

For Lexical Complexity Prediction, we will evaluate using:

**Root Mean Squared Error**, calculated between the system outputs for lexical complexity and the values returned by the annotators. See Shardlow et al. (2021) for details.
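As a reference point, the metric amounts to the following (a minimal sketch; the official evaluation script is authoritative):

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], gold: Sequence[float]) -> float:
    """Root mean squared error between predicted and gold complexity scores."""
    assert len(predicted) == len(gold) and len(gold) > 0
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(predicted, gold)) / len(gold))
```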

For Lexical Simplification, we will use two metrics defined in Saggion et al. (2022):

**MAP@k** compares the ranked list of system-generated substitutes against the set of gold-standard substitutes, taking into account the position of the relevant substitutes among the first k generated candidates.

**Accuracy@k@top1**, which is the percentage of instances where at least one of the k top ranked substitutes matches the most frequently suggested synonym in the gold data.
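A per-instance sketch of both metrics is given below, assuming binary relevance against the gold substitutes; normalisation details may differ slightly from the official scoring script, which should be treated as authoritative.

```python
from typing import List, Set


def average_precision_at_k(candidates: List[str], gold: Set[str], k: int) -> float:
    """Average precision of the top-k ranked candidates against the gold
    substitutes (binary relevance); one common normalisation is min(k, |gold|)."""
    hits, score = 0, 0.0
    for rank, cand in enumerate(candidates[:k], start=1):
        if cand in gold:
            hits += 1
            score += hits / rank  # precision at this rank, counted at relevant ranks only
    denom = min(k, len(gold))
    return score / denom if denom else 0.0


def hit_at_k_top1(candidates: List[str], top_gold: str, k: int) -> bool:
    """True if any of the top-k candidates matches the most frequently
    suggested gold substitute for this instance."""
    return top_gold in candidates[:k]
```

Corpus-level MAP@k is then the mean of the per-instance average precision, and Accuracy@k@top1 is the fraction of instances for which the top-1 check above succeeds.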

We will also provide **Human End-to-End Evaluation** of:

- **Simplicity**
- **Fluency**
- **Meaning Preservation**

Human evaluation will take place for the top 5 ranking systems according to the automated metrics. Availability of human evaluation will depend on the recruitment of evaluators from the task participants.

**Participant Registration**

Interested parties can register prior to the Trial Data Release via our [participant registration Google Form](https://sites.google.com/d/151nOTm4Lwla2MXolnTgNSk6hNQoCaruX/p/1BWd0x4Q2v8nBJZSslymvPUd2vkzpwCWO/edit).

Further information will be released through [the MLSP shared task website](https://sites.google.com/view/mlsp-sharedtask-2024/home).

**Timeline**

| Fri Feb 16, 2024 | Trial Data Release |
| --- | --- |
| Fri Mar 15, 2024 | Test Data Release |
| Mon Mar 25, 2024 | Final Submissions |
| Fri Apr 12, 2024 | System Papers Due |
| Fri Jun 21, 2024 | BEA Workshop |

**Organisers**

| Matthew Shardlow | Manchester Metropolitan University |
| --- | --- |
| Marcos Zampieri | George Mason University |
| Kai North | George Mason University |
| Fernando Alva-Manchego | Cardiff University |
| Thomas François | UCLouvain |
| Remi Cardon | UCLouvain |
| Nishat Raihan | George Mason University |
| Tharindu Ranasinghe | Aston University |
| Joseph Imperial | University of Bath |
| Riza Batista-Navarro | University of Manchester |
| Adam Nohejl | NAIST |
| Yusuke Ide | NAIST |
| Akio Hayakawa | Universitat Pompeu Fabra |
| Laura Occhipinti | University of Bologna |
| Horacio Saggion | Universitat Pompeu Fabra |

**References**

Saggion, H., Štajner, S., Ferrés, D., Sheang, K.C., Shardlow, M., North, K. and Zampieri, M., 2022, December. Findings of the TSAR-2022 Shared Task on Multilingual Lexical Simplification. In *Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022)* (pp. 271-283).

Shardlow, M., Evans, R., Paetzold, G. and Zampieri, M., 2021, August. SemEval-2021 Task 1: Lexical Complexity Prediction. In *Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)* (pp. 1-16).
