
REAL-Info 2024 : 1st Workshop on Reliable Evaluation of LLMs for Factual Information


Link: https://sites.google.com/view/real-info-2024/overview
 
When: Jun 3, 2024
Where: Buffalo, NY, USA
Submission Deadline: Mar 24, 2024
Notification Due: Apr 14, 2024
Final Version Due: May 5, 2024
Categories: computational social science, artificial intelligence, natural language processing
 

Call For Papers

LLMs can potentially influence various information tasks of millions of users, ranging from personal content creation to education, financial advice, and mental health support. However, there is also a growing concern about LLMs' ability to identify and generate factual information. Currently, there is no standardized approach for evaluating the factuality of LLMs. This half-day workshop will enable a broad and diverse conversation regarding assessing the factuality of content generated by LLMs and their overall performance in factuality-related tasks, e.g., fact-checking, misinformation detection, rumor detection, and stance detection. The objective is to encourage the development of new evaluation approaches, metrics, and benchmarks that can better gauge LLMs' performance in terms of factuality. Additionally, the workshop will explore human-centric approaches for mitigating and correcting inaccuracies to enhance LLMs' factuality for critical applications. This workshop will facilitate interactions among academic and industry researchers in computational social sciences, natural language processing, human-computer interaction, data science, and social computing.

LLMs have achieved state-of-the-art performance in several textual inference tasks and are gaining popularity. There is a significant focus on their integration with web and online applications, including web search, thus allowing them to reach millions of users. LLMs can influence various information tasks in our everyday lives, ranging from personal content creation to education, financial advice, and mental health support (Augenstein, 2023). However, with their vast linguistic capabilities and opaque nature, LLMs can inadvertently generate or amplify false information. There is growing concern about the factuality of LLM-generated content and its potential adverse impact on our information ecosystem (Chen, 2023; Peskoff, 2023).

Thus, the need for reliable methods to assess the factuality of information is more critical than ever. This is where the synergy of AI, Natural Language Processing (NLP), and Human-Computer Interaction (HCI) becomes essential. AI and NLP techniques can be employed to analyze and identify the factuality of information through various tasks (Augenstein, 2023), such as fact-checking, stance detection, claim verification, and misinformation detection. These techniques can sift through the vast amounts of data to spot inconsistencies, biases, or inaccuracies that could indicate misinformation. Still, these approaches often use language models themselves, and epistemological questions arise when one LLM is fact-checked using another (or itself). Meanwhile, HCI plays a vital role in designing interactions and tools that enable humans to effectively oversee, interpret, and correct the outputs of LLMs. This human-in-the-loop approach ensures a critical evaluation and context-sensitive understanding of the factuality of information, which pure algorithmic methods might overlook. The combination of NLP's analytical capabilities and HCI's focus on human-centric design is instrumental in creating a digital ecosystem where LLMs can be utilized safely and responsibly, minimizing the risks of false information while maximizing their potential for user-centric applications.

The goals of the 1st ICWSM workshop on Reliable Evaluation of LLMs for Factual Information (REAL-Info) are to facilitate discussion within the community around new LLM evaluation approaches, metrics, and benchmarks for factuality assessment tasks, and to shed light on the scope, biases, and blindspots of LLMs. It will spark interdisciplinary conversations among academic and industry researchers in computational social sciences (CSS), natural language processing (NLP), human-computer interaction (HCI), data science, and social computing. The workshop will solicit research and position papers with novel ideas, including but not limited to:

- New methods and metrics for evaluating LLMs' factuality across diverse social contexts, e.g., source and domain of data, language, temporal generalization of information, or hallucination in generated/summarized content.

- Human-centered design approaches to aid LLMs in detecting and mitigating false information, e.g., human experts in the loop, and variation in prompting.

- New LLM-powered tools, methods, and applications for improving factuality assessment in social computing and computational social science.

- Biases and blindspots of LLMs in factuality assessment, including approaches for error analysis and model diagnostics.

- Limitations of existing benchmarks for tasks relevant to factuality assessment, e.g., claim verification, fact-checking, stance detection, and misinformation detection.

- Improved datasets and evaluation quality, e.g., avoiding selection bias and addressing subjective judgments and biases in crowd-sourced annotation.

- Comparative evaluation and implications of open source and commercial LLMs for tasks relevant to factuality assessment.

- How do the reliability and factuality of LLMs impact users (e.g., journalists, software engineers, artists) and communities?
