Welcome to the *SHROOM Shared-Task Series on Hallucinations and Related Observable Overgeneration Mistakes
The SHROOM shared task series brings together researchers and practitioners interested in detecting hallucinations (that is, fluent yet semantically incorrect or unsupported outputs) in natural language generation (NLG) systems. Since 2024, we've been pushing the boundaries of automatic hallucination detection, with each edition introducing new challenges and innovations.
This website serves as a central hub to explore the current and past editions of the shared task, including SHROOM (2024), Mu-SHROOM (2025), and the upcoming ν-SHROOM (2026).
SHROOM 2024
SHROOM, the original Shared-task on Hallucinations and Related Observable Overgeneration Mistakes, kicked off the initiative at SemEval-2024. Participants were asked to identify hallucinated content in NLG outputs across several generation tasks (e.g., machine translation, paraphrasing, definition modeling), both with and without access to the model that generated the outputs.
Go to the SHROOM 2024 website
Mu-SHROOM 2025
Mu-SHROOM is the second edition of the shared task, held at SemEval-2025. This multilingual extension of SHROOM expands the scope to 14 languages and shifts the focus to instruction-tuned large language models (LLMs). This time, the task targets hallucination spans at the character level, asking participants to predict where hallucinations occur and how likely they are.
Mu-SHROOM brings together multilingual evaluation, character-level scoring, and a diverse set of public-weight LLMs, making it a challenging and rich task for researchers in hallucination detection and robust NLG.
SHROOM-CAP Shared Task
SHROOM-CAP is the third edition of the shared task, held at CHOMPS-2025. This cross-lingual extension of SHROOM expands the scope to high- and low-resource languages, with a special focus on Indic languages. This time, the task targets hallucinations in the scientific domain, asking participants to predict whether a text contains scientific hallucinations and how likely they are.
Explore the SHROOM-CAP Shared Task
ν-SHROOM 2026 (coming soon! TBC)
Get ready for the next iteration of the SHROOM series: ν-SHROOM (pronounced "nu-shroom" or "vi-shroom", interchangeably), coming in 2026! Building on insights from SHROOM and Mu-SHROOM, ν-SHROOM will introduce new dimensions to the hallucination detection landscape.
Stay tuned: more details will be announced later in 2025.
Join the SHROOM Community
Whether you're interested in joining the next round, learning from past editions, or just staying informed about hallucination detection in NLG, we'd love to have you in the community.
- Join the conversation on Slack
- Check out the past editions' Google groups
Want to dive straight in?
Visit one of the task pages above and start exploring data, baselines, and results.
Reach out if you have further questions, collaboration ideas or simply want to say hi:
- Timothee Mickus, University of Helsinki, Finland
- Raúl Vázquez, University of Helsinki, Finland