Yinya Huang
Publications
SeePhys: Does Seeing Help Thinking? -- Benchmarking Vision-Based Physics Reasoning
We present SEEPHYS, a large-scale multimodal benchmark for LLM reasoning grounded in physics questions ranging from middle school to …
Kun Xiang, Heng Li, Terry Jingchen Zhang, Yinya Huang, Zirong Liu, Peixin Qu, Jixi He, Jiaqi Chen, Yu-Jie Yuan, Jianhua Han, Hang Xu, Hanhui Li, Mrinmaya Sachan, Xiaodan Liang
LEXam: Benchmarking Legal Reasoning on 340 Law Exams
Long-form legal reasoning remains a key challenge for large language models (LLMs) in spite of recent advances in test-time scaling. We …
Yu Fan, Jingwei Ni, Jakob Merane, Etienne Salimbeni, Yang Tian, Yoan Hermstrüwer, Yinya Huang, Mubashara Akhtar, Florian Geering, Oliver Dreyer, Daniel Brunner, Markus Leippold, Mrinmaya Sachan, Alexander Stremitzer, Christoph Engel, Elliott Ash, Joel Niklaus
TreeRPO: Tree Relative Policy Optimization
Large Language Models (LLMs) have shown remarkable reasoning capabilities through Reinforcement Learning with Verifiable Rewards (RLVR) …
Zhicheng Yang, Zhijiang Guo, Yinya Huang, Xiaodan Liang, Yiwei Wang, Jing Tang
ORMind: A Cognitive-Inspired End-to-End Reasoning Framework for Operations Research
Operations research (OR) is widely deployed to solve critical decision-making problems with complex objectives and constraints, …
Zhiyuan Wang, Bokui Chen, Yinya Huang, Qingxing Cao, Ming He, Jianping Fan, Xiaodan Liang
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling
Zhicheng Yang, Yiwei Wang, Yinya Huang, Zhijiang Guo, Wei Shi, Xiongwei Han, Liang Feng, Linqi Song, Xiaodan Liang, Jing Tang
FormalAlign: Automated Alignment Evaluation for Autoformalization
Jianqiao Lu, Yingjia Wan, Yinya Huang, Jing Xiong, Zhengying Liu, Zhijiang Guo
FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving
Xiaohan Lin, Qingxing Cao, Yinya Huang, Haiming Wang, Jianqiao Lu, Zhengying Liu, Linqi Song, Xiaodan Liang
Proving Theorems Recursively
Haiming Wang, Huajian Xin, Zhengying Liu, Wenda Li, Yinya Huang, Jianqiao Lu, Zhicheng Yang, Jing Tang, Jian Yin, Zhenguo Li, Xiaodan Liang
Process-Driven Autoformalization in Lean 4
Jianqiao Lu, Zhengying Liu, Yingjia Wan, Yinya Huang, Haiming Wang, Zhicheng Yang, Jing Tang, Zhijiang Guo
AUTOCV: Empowering Reasoning with Automated Process Labeling via Confidence Variation
Jianqiao Lu, Zhiyang Dou, Hongru Wang, Zeyu Cao, Jianbo Dai, Yingjia Wan, Yinya Huang, Zhijiang Guo
CLOMO: Counterfactual Logical Modification with Large Language Models
Yinya Huang, Ruixin Hong, Hongming Zhang, Wei Shao, Zhicheng Yang, Dong Yu, Changshui Zhang, Xiaodan Liang, Linqi Song
ATG: Benchmarking Automated Theorem Generation for Generative Language Models
Xiaohan Lin, Qingxing Cao, Yinya Huang, Zhicheng Yang, Zhengying Liu, Zhenguo Li, Xiaodan Liang
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data
Yinya Huang, Xiaohan Lin, Zhengying Liu, Qingxing Cao, Huajian Xin, Haiming Wang, Zhenguo Li, Linqi Song, Xiaodan Liang
LEGO-Prover: Neural Theorem Proving with Growing Libraries
Haiming Wang, Huajian Xin, Chuanyang Zheng, Lin Li, Zhengying Liu, Qingxing Cao, Yinya Huang, Jing Xiong, Han Shi, Enze Xie, Jian Yin, Zhenguo Li, Heng Liao, Xiaodan Liang
AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations
Zhicheng Yang, Yinya Huang, Jing Xiong, Liang Feng, Xiaodan Liang, Yiwei Wang, Jing Tang
RecRanker: Instruction Tuning Large Language Model as Ranker for Top-k Recommendation
Sichun Luo, Bowei He, Haohan Zhao, Yinya Huang, Aojun Zhou, Zongpeng Li, Yuanzhang Xiao, Mingjie Zhan, Linqi Song
Integrating Large Language Models into Recommendation via Mutual Augmentation and Adaptive Aggregation
Sichun Luo, Yuxuan Yao, Bowei He, Yinya Huang, Aojun Zhou, Xinyi Zhang, Yuanzhang Xiao, Mingjie Zhan, Linqi Song
TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models
Jing Xiong, Jianhao Shen, Ye Yuan, Haiming Wang, Yichun Yin, Zhengying Liu, Lin Li, Zhijiang Guo, Qingxing Cao, Yinya Huang, Chuanyang Zheng, Xiaodan Liang, Ming Zhang, Qun Liu
Discourse-Aware Graph Networks for Textual Logical Reasoning
Yinya Huang, Lemao Liu, Kun Xu, Meng Fang, Liang Lin, Xiaodan Liang
MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure
Yinya Huang, Hongming Zhang, Ruixin Hong, Xiaodan Liang, Changshui Zhang, Dong Yu
REM-Net: Recursive Erasure Memory Network for Commonsense Evidence Refinement
Yinya Huang, Meng Fang, Xunlin Zhan, Qingxing Cao, Xiaodan Liang
DAGN: Discourse-Aware Graph Network for Logical Reasoning
Yinya Huang, Meng Fang, Yu Cao, Liwei Wang, Xiaodan Liang
PathReasoner: Explainable reasoning paths for commonsense question answering
Xunlin Zhan, Yinya Huang, Xiao Dong, Qingxing Cao, Xiaodan Liang