
PaDAWan 2024 : 1st Portuguese Data Augmentation Workshop


Link: https://sites.google.com/view/padawan-2024/
 
When Nov 17, 2024 - Nov 21, 2024
Where Belém, Pará, Brazil
Submission Deadline Sep 10, 2024
Notification Due Oct 5, 2024
Final Version Due Oct 13, 2024
Categories    NLP   computational linguistics   artificial intelligence
 

Call For Papers

PaDAWan 2024: 1st Portuguese Data Augmentation Workshop (PaDAWan)

Belém, Pará, Brazil

Co-located with STIL 2024

November 17th to 21st, 2024

1st Call for Papers

https://sites.google.com/view/padawan-2024/

********************************************************

The Portuguese Data Augmentation Workshop (PaDAWan) aims to gather the community working on data augmentation for Portuguese, particularly with Large Language Models (LLMs).

With the advancement of LLMs, many traditional Natural Language Processing (NLP) tasks are being revisited. A long-standing challenge is gathering high-quality data for training and evaluating models on specific tasks, which has often been the main bottleneck in developing machine learning models. Data augmentation has become a crucial technique for improving the performance of these models across various tasks, especially when reliable data are limited. With LLMs in particular, it has become feasible to apply sophisticated text data augmentation techniques effectively.

The use of LLMs is still very restricted due to several factors, such as cost, privacy concerns, and latency. Given this scenario, using LLMs to generate synthetic data to train classical models for specific tasks is a viable approach. Moreover, while many industry efforts already work with synthetic data, scientific discussions of methods and evaluation are not always aligned with market needs.

This workshop aims to examine the use of LLMs for data augmentation, exploring possible methods, evaluation techniques, and associated ethical considerations. The goal is to bring together industry professionals and academics for an in-depth discussion of the topic.

We invite researchers to submit papers that discuss challenges and advances in Portuguese data generation, including but not limited to the following topics:

Data creation and data labeling
Data reformation and anonymization
Data contamination and noise
Co-annotation
Augmented data evaluation and controlled data augmentation
Ethics in generated data and unbiased data generation
Practical applications or case studies of data augmentation techniques
Challenges in Portuguese Synthetic/Augmented Data

*Submissions*

We invite both unpublished work, to be published in a special section of the STIL Proceedings, and lightning talk proposals highlighting already published work.

Submission deadline: September 10, 2024

Notification for authors: October 5, 2024

Camera-ready versions due: October 13, 2024

For more information, please visit: https://sites.google.com/view/padawan-2024/

For any questions, please write to padawan.workshop@gmail.com
