Yinya Huang
Computer Science - Artificial Intelligence
SeePhys: Does Seeing Help Thinking? -- Benchmarking Vision-Based Physics Reasoning
We present SeePhys, a large-scale multimodal benchmark for LLM reasoning grounded in physics questions ranging from middle school to …
Kun Xiang, Heng Li, Terry Jingchen Zhang, Yinya Huang, Zirong Liu, Peixin Qu, Jixi He, Jiaqi Chen, Yu-Jie Yuan, Jianhua Han, Hang Xu, Hanhui Li, Mrinmaya Sachan, Xiaodan Liang
LEXam: Benchmarking Legal Reasoning on 340 Law Exams
Long-form legal reasoning remains a key challenge for large language models (LLMs) in spite of recent advances in test-time scaling. We …
Yu Fan, Jingwei Ni, Jakob Merane, Etienne Salimbeni, Yang Tian, Yoan Hermstrüwer, Yinya Huang, Mubashara Akhtar, Florian Geering, Oliver Dreyer, Daniel Brunner, Markus Leippold, Mrinmaya Sachan, Alexander Stremitzer, Christoph Engel, Elliott Ash, Joel Niklaus
ORMind: A Cognitive-Inspired End-to-End Reasoning Framework for Operations Research
Operations research (OR) is widely deployed to solve critical decision-making problems with complex objectives and constraints, …
Zhiyuan Wang, Bokui Chen, Yinya Huang, Qingxing Cao, Ming He, Jianping Fan, Xiaodan Liang
TreeRPO: Tree Relative Policy Optimization
Large Language Models (LLMs) have shown remarkable reasoning capabilities through Reinforcement Learning with Verifiable Rewards (RLVR) …
Zhicheng Yang, Zhijiang Guo, Yinya Huang, Xiaodan Liang, Yiwei Wang, Jing Tang