Welcome to SemEval-2025 Task-3 — Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes
Welcome to the official shared task website for Mu-SHROOM, a SemEval-2025 shared task!
Mu-SHROOM stands for “Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes”. Mu-SHROOM will invite participants to detect hallucination spans in the outputs of instruction-tuned LLMs in a multilingual context. This shared task builds upon our previous iteration, SHROOM, with a few key changes:
- We’re looking at multiple languages: Arabic (Modern Standard), Chinese (Mandarin), English, Finnish, French, German, Hindi, Italian, Spanish, and Swedish;
- We’re now focusing on LLM outputs;
- Participants will have to predict where the hallucination occurs.
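To make "predicting where the hallucination occurs" concrete, here is a minimal sketch of one way span predictions could be represented and compared. It is an illustration only: this page does not specify the data format or the metric, so the `(start, end)` character-offset representation, the function names, and the use of intersection-over-union are all assumptions, not the official scoring program.

```python
from typing import List, Set, Tuple

# Hypothetical representation: a span is (start, end) in character offsets,
# with `end` exclusive. The official data format may differ.
Span = Tuple[int, int]


def spans_to_chars(spans: List[Span]) -> Set[int]:
    """Expand (start, end) spans into the set of character indices they cover."""
    return {i for start, end in spans for i in range(start, end)}


def char_iou(predicted: List[Span], gold: List[Span]) -> float:
    """Intersection-over-union of the characters covered by two span lists."""
    pred_chars = spans_to_chars(predicted)
    gold_chars = spans_to_chars(gold)
    if not pred_chars and not gold_chars:
        return 1.0  # both sides agree that nothing is hallucinated
    return len(pred_chars & gold_chars) / len(pred_chars | gold_chars)


# Made-up example: an LLM output with a wrong year.
output = "The Eiffel Tower was completed in 1789 in Paris."
gold = [(34, 38)]       # covers "1789"
predicted = [(31, 38)]  # covers "in 1789", slightly too wide

print(output[34:38])                # the annotated span text
print(char_iou(predicted, gold))    # overlap score between 0 and 1
```

In this sketch, a prediction that is slightly too wide still earns partial credit, which is the main reason to score at the character level rather than requiring exact span matches.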
This website is under construction. More information will be available soon.
Participant info
Register ahead of time on our submission website.
Want to be kept in the loop? Join our Google group mailing list or the shared task Slack! We also have a Twitter account.
Data
Below are links to access the data already released, as well as provisional expected release dates for future splits. Do note that release dates are subject to change.
Dataset split | Access |
---|---|
Sample set | download (v1) |
Validation set | download (v1) |
Unlabeled train set | To be published (ETA Sept 12th) |
Unlabeled test set | To be published (ETA Jan 10th) |
Labeled test set | To be published (ETA Feb 1st) |
Participants can also download the scoring program here, both for reference and for use while developing their systems.
Important dates
This information is subject to change.
- Sample data available: 15 July 2024
- Validation data ready: 2 September 2024
- Evaluation start: 10 January 2025
- Evaluation end: 31 January 2025
- Paper submission due: 28 February 2025 (TBC)
- Notification to authors: 31 March 2025 (TBC)
- Camera ready due: 21 April 2025 (TBC)
- SemEval workshop: Summer 2025 (co-located with a major NLP conference)
Organizers of the shared task
- Raúl Vázquez, University of Helsinki, Finland
- Timothee Mickus, University of Helsinki, Finland
- Elaine Zosa, SILO AI, Finland
- Teemu Vahtola, University of Helsinki, Finland
- Jörg Tiedemann, University of Helsinki, Finland
- Aman Sinha, Université de Lorraine, France
- Vincent Segonne, Université Bretagne Sud, France
- Fernando Sánchez-Vega, CIMAT A. C., Mexico
- Alessandro Raganato, University of Milano-Bicocca, Italy
- Jussi Karlgren, SILO AI, Finland
- Shaoxiong Ji, University of Helsinki, Finland
- Liane Guillou, University of Edinburgh, UK
- Joseph Attieh, University of Helsinki, Finland
- Marianna Apidianaki, University of Pennsylvania, USA
Looking for something else?
The website for the previous iteration of the shared task is available here.
The logo is available here (download); we encourage participants to use it where relevant (esp. in your posters)!