Hi, nice to meet you!
I am Botao Yu (余博涛), a PhD student at The Ohio State University, advised by Prof. Huan Sun. Previously, I earned my Master’s degree at Nanjing University, advised by Prof. Wei Hu (胡伟).
My research interests include LLMs, language agents, AI for Science (esp. Chemistry), NLP, AI music, and deep learning.
🌟 Featured Projects


- ChemToolAgent
- LlaSMol
🔥 News
- 2025.06: Check out our new preprint AutoSDT, an automated pipeline for generating high-quality scientific coding tasks.
- 2025.06: Check out 🛠️ChemMCP, our newly released, MCP-compatible chemistry toolkit for LLMs and AI assistants. Let’s build it together!
- 2025.05: Check out our new preprint Topic Association Analysis, where we investigate, from the perspective of topic association bias, why LLMs misclassify benign comments as toxic.
- 2025.05: Our paper MMMU-Pro is accepted to ACL 2025.
- 2025.03: ChemAgent has been renamed ChemToolAgent. Check out the new version, with more experimental results, on arXiv.
- 2025.01: Our paper ChemAgent is accepted to NAACL 2025 Findings.
- 2025.01: Our paper ScienceAgentBench is accepted to ICLR 2025.
- 2024.11: Please check out our new preprint ChemAgent, which presents an enhanced chemistry agent and evaluates its performance on various chemistry problems.
- 2024.10: Please check out our new preprint ScienceAgentBench, a benchmark to assess language models in scientific tasks.
- 2024.09: Check out our new preprint MMMU-Pro, an enhanced version of MMMU featuring full-vision evaluation.
- 2024.07: Our paper LlaSMol is accepted to COLM 2024 🎉!
- 2024.05: Our paper MMMU was selected as an Oral presentation (0.8%) and nominated for best paper (24 in total) at CVPR 2024 🎊!
- 2024.02: Please check out our preprint LlaSMol, where we propose an awesome instruction tuning dataset for chemistry tasks and a series of chemistry LLMs.
- 2023.08: Arrived in Columbus. My PhD journey officially starts 😋!
- 2023.05: Please check out our preprint MuseCoco, a text-to-music generation system.
- 2022.09: Our paper Museformer is accepted to NeurIPS 2022 🎉!
📝 Publications
-
[Preprint] AutoSDT: Scaling Data-Driven Discovery Tasks Toward Open Co-Scientists
We introduce AutoSDT, an automated pipeline for generating high-quality coding tasks from real-world data-driven scientific workflows, addressing the data scarcity challenge in building AI co-scientists. Using AutoSDT, we create AutoSDT-5K, the largest open dataset of its kind, enabling significant performance gains in scientific discovery benchmarks.
-
[Preprint] Probing Association Biases in LLM Moderation Over-Sensitivity
This paper investigates why large language models often misclassify benign comments as toxic, revealing that topic-level biases, rather than just offensive keywords, play a significant role. Using a novel Topic Association Analysis inspired by cognitive psychology, we uncover how LLMs' implicit associations influence moderation decisions.
-
[NAACL 2025 Findings] ChemToolAgent: The Impact of Tools on Language Agents for Chemistry Problem Solving
We propose a tool-augmented language agent for chemistry named ChemToolAgent, and evaluate it on both specialized chemistry tasks and general chemistry questions. The results show that tools do not always help and may cause more reasoning errors. Previous title: Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving.
-
[ICLR 2025] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
The study introduces a benchmark for evaluating language models in scientific discovery, using 102 tasks from peer-reviewed publications and expert validation. It reveals current limitations in code generation, highlighting the need for rigorous task assessments.
-
[ACL 2025] MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark
An enhanced version of MMMU featuring full-vision evaluation for multi-discipline multimodal understanding.
-
[COLM 2024] LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset
We propose a carefully curated chemistry task dataset for instruction tuning and a series of LLMs that significantly outperform GPT-4 and Claude-3-Opus on various chemistry tasks.
-
[CVPR 2024 Oral] MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
This paper proposes a massive multi-discipline multimodal understanding and reasoning benchmark for expert AGI.
-
[Preprint 2023] MuseCoco: Generating Symbolic Music from Text
A two-stage text-to-music generation system for creating symbolic music from textual descriptions.
-
[Preprint 2023] EmoGen: Eliminating Subjective Bias in Emotional Music Generation
A method for generating emotional music while reducing subjective bias in the process.
-
[NeurIPS 2022] Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation
We propose a fine- and coarse-grained attention mechanism for modeling the structures of music.
-
[ISMIR 2022] MeloForm: Generating Melody with Musical Form Based on Expert Systems and Neural Networks
A system for generating melodies with musical form using a combination of expert systems and neural networks.
-
[EMNLP 2021] Knowing False Negatives: An Adversarial Training Method for Distantly Supervised Relation Extraction
An adversarial training method to improve distantly supervised relation extraction by addressing false negatives.
-
[APWeb-WAIM 2020] Joint Reasoning of Events, Participants and Locations for Plot Relation Recognition
A method for recognizing plot relations by jointly reasoning about events, participants, and locations in narratives.
📖 Education
-
PhD student in Computer Science and Engineering @ The Ohio State University
2023.08 - Now, Columbus, Ohio, USA
-
Master’s student in Computer Science @ Nanjing University (南京大学)
2019.09 - 2023.06, Nanjing, Jiangsu, China
-
Undergraduate student in Software Engineering @ Dalian University of Technology (大连理工大学)
2015.09 - 2019.06, Dalian, Liaoning, China
-
High school student @ The High School Attached To Hunan Normal University (湖南师大附中)
2012.09 - 2015.06, Changsha, Hunan, China
💻 Internship
-
Research intern @ Microsoft Research Asia (微软亚洲研究院)
2021.04 - 2022.03, Beijing, China
Last updated: June 8, 2025