🔥 for papers with >50 citations or repositories with >200 stars.
📖 for papers accepted by reputed conferences/journals.
- OpenAI o3-mini (31 Jan, 2025)
- OpenAI o1 (12 Sept, 2024)
-
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models (16 Jan 2025)
Fengli Xu, Qianyue Hao, Zefang Zong, Jingwei Wang, Yunke Zhang, Jingyi Wang, Xiaochong Lan, Jiahui Gong, Tianjian Ouyang, Fanjin Meng, Chenyang Shao, Yuwei Yan, Qinglong Yang, Yiwen Song, Sijian Ren, Xinyuan Hu, Yu Li, Jie Feng, Chen Gao, Yong Li
-
Yiqi Wang, Wentao Chen, Xiaotian Han, Xudong Lin, Haiteng Zhao, Yongfei Liu, Bohan Zhai, Jianbo Yuan, Quanzeng You, Hongxia Yang
-
Reasoning with Large Language Models, a Survey (16 Jul 2024)
Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki van Stein, Thomas Back
-
📖 LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models (1 Apr 2024, COLM 2024)
Yadong Zhang, Shaoguang Mao, Tao Ge, Xun Wang, Adrian de Wynter, Yan Xia, Wenshan Wu, Ting Song, Man Lan, Furu Wei
-
🔥📖 Towards Reasoning in Large Language Models: A Survey (20 Dec 2022, ACL 2023 Findings)
Jie Huang, Kevin Chen-Chuan Chang
-
EMMA Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark [Code] (9 Jan 2025)
Yunzhuo Hao, Jiawei Gu, Huichen Will Wang, Linjie Li, Zhengyuan Yang, Lijuan Wang, Yu Cheng
-
Polymath Polymath: A Challenging Multi-modal Mathematical Reasoning Benchmark [Code] (6 Oct 2024)
Himanshu Gupta, Shreyas Verma, Ujjwala Anantheswaran, Kevin Scaria, Mihir Parmar, Swaroop Mishra, Chitta Baral
-
📖 MLLM-CompBench MLLM-CompBench: A Comparative Reasoning Benchmark for Multimodal LLMs [Code] (23 Jul 2024, NeurIPS 2024)
Jihyung Kil, Zheda Mai, Justin Lee, Zihe Wang, Kerrie Cheng, Lemeng Wang, Ye Liu, Arpita Chowdhury, Wei-Lun Chao
-
LogicVista LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts [Code] (6 Jul 2024)
Yijia Xiao, Edward Sun, Tianyu Liu, Wei Wang
-
🔥📖 Visual CoT Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning [Code] (25 Mar 2024, NeurIPS 2024 Spotlight)
Hao Shao, Shengju Qian, Han Xiao, Guanglu Song, Zhuofan Zong, Letian Wang, Yu Liu, Hongsheng Li
-
🔥 Mementos Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences [Code] (19 Jan 2024)
Xiyao Wang, Yuhang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, Furong Huang
-
Virgo Virgo: A Preliminary Exploration on Reproducing o1-like MLLM [Code] (3 Jan 2025)
Yifan Du, Zikang Liu, Yifan Li, Wayne Xin Zhao, Yuqi Huo, Bingning Wang, Weipeng Chen, Zheng Liu, Zhongyuan Wang, Ji-Rong Wen
-
🔥 Mulberry Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search [Code] (24 Dec 2024)
Huanjin Yao, Jiaxing Huang, Wenhao Wu, Jingyi Zhang, Yibo Wang, Shunyu Liu, Yingjie Wang, Yuxin Song, Haocheng Feng, Li Shen, Dacheng Tao
-
AtomThink AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning [Code] (18 Nov 2024)
Kun Xiang, Zhili Liu, Zihao Jiang, Yunshuang Nie, Runhui Huang, Haoxiang Fan, Hanhui Li, Weiran Huang, Yihan Zeng, Jianhua Han, Lanqing Hong, Hang Xu, Xiaodan Liang
-
🔥 LLaVA-CoT LLaVA-CoT: Let Vision Language Models Reason Step-by-Step [Code] (15 Nov 2024)
Guowei Xu, Peng Jin, Hao Li, Yibing Song, Lichao Sun, Li Yuan
-
📖 IRED Learning Iterative Reasoning through Energy Diffusion [Code] [Website] (17 Jun 2024, ICML 2024)
Yilun Du, Jiayuan Mao, Joshua B. Tenenbaum
-
📖 GeoReasoner GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model [Code] (3 Jun 2024, ICML 2024)
Ling Li, Yu Ye, Bingchuan Jiang, Wei Zeng
-
🔥📖 VoT Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition [Code] [Website] (7 May 2024, ICML 2024 Oral)
Hao Fei, Shengqiong Wu, Wei Ji, Hanwang Zhang, Meishan Zhang, Mong-Li Lee, Wynne Hsu
-
Cantor Cantor: Inspiring Multimodal Chain-of-Thought of MLLM [Code] (24 Apr 2024)
Timin Gao, Peixian Chen, Mengdan Zhang, Chaoyou Fu, Yunhang Shen, Yan Zhang, Shengchuan Zhang, Xiawu Zheng, Xing Sun, Liujuan Cao, Rongrong Ji
-
📖 Momentor Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning [Code] (18 Feb 2024, ICML 2024)
Long Qian, Juncheng Li, Yu Wu, Yaobo Ye, Hao Fei, Tat-Seng Chua, Yueting Zhuang, Siliang Tang
-
📖 ContPhy ContPhy: Continuum Physical Concept Learning and Reasoning from Videos [Code] [Website] (9 Feb 2024, ICML 2024)
Zhicheng Zheng, Xin Yan, Zhenfang Chen, Jingzhou Wang, Qin Zhi Eddie Lim, Joshua B. Tenenbaum, Chuang Gan
-
📖 ConTextual ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models [Code] [Website] (24 Jan 2024, ICML 2024)
Rohan Wadhawan, Hritik Bansal, Kai-Wei Chang, Nanyun Peng
-
🔥📖 MM-CoT Multimodal Chain-of-Thought Reasoning in Language Models [Code] (2 Feb 2023, TMLR)
Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis, Alex Smola
Issues and pull requests that add related papers to this curated list are warmly welcome! Feel free to submit any relevant additions to help expand and improve this collection.