Course information for CS598-Topics in LLM Agents(25'Spring) under the direction of Prof. Jiaxuan You ( jiaxuan@illinois.edu ).

ulab-uiuc/CS598-Topics-in-LLM-Agents

Topics in LLM Agents(25'Spring)

Course Console

Lectures: Room 1304, Siebel Center for Computer Science, Tuesday/Thursday 03:30 PM - 04:45 PM.

| Member (NetID) | Role | Office Hours |
| --- | --- | --- |
| Jiaxuan You (jiaxuan) | Instructor | Thursday 05:00-06:00 PM, Room 2122 Siebel Center for Computer Science |
| Jinwei Yao (jinweiy) | TA | Tuesday 01:00-02:00 PM, Zoom (link visible on Canvas) |

Canvas: for homework/report submission.

GitHub: most course information is here, including the schedule and paper lists.

Slack: ALL communication regarding this course must be via Slack (the link is visible on Canvas; join with your UIUC email address). This includes questions, discussions, announcements, as well as private messages.

Piazza: Deprecated for this course; replaced by Slack.

OpenReview: for the simulation of review and response as part of the course projects.

Note: Please use Slack to submit your questions. Please DON'T email the TA or Professor You, unless the matter is private.

Course Description

Learning Objectives: This course offers an in-depth exploration of the fascinating field of LLM agents. Designed as a seminar-style course, it guides students through the fundamental methods that power LLM agents and examines their practical applications in real-world contexts. At the end of this course, you will be able to:

  • Have a broad overview of state-of-the-art LLM agent papers;
  • Be familiar with the research lifecycle, including paper submission, paper review, and rebuttal;
  • Critique and evaluate the design details of LLM agent papers.

Structure: The course is structured around reading cutting-edge research papers, student-led presentations, interactive discussions, and collaborative semester-long projects. We begin with an introduction to the core concepts of LLM agents, then delve into the latest research on building agents, covering topics including:

  • Agent ability
    • Reasoning
    • Memory
    • Planning
    • Multimodal understanding
  • Agent evaluation
  • Agent framework
    • Tool use
    • Retrieval-augmented generation
    • Multi-agent systems
  • Agent application
    • Auto-research
    • Coding agents
    • Social agents
    • Gaming agents
  • Challenges from agents to AGI
    • Data
    • Safety
    • Alignment
    • Human-agent interaction

Tentative Schedule and Reading List

Note: (1) This is an evolving list; (2) For each topic, there will be 2-3 "required" papers that the presenters should include in their in-class presentation.

Date Readings Pilot-Presenter Copilot-Reviewers Notes
Course Introduction
Jan 21 (Required) Section 1 of How far are we from AGI?
(Required) How to Write a Paper
(Required) Language Agents: Foundations, Prospects, and Risks
How to Give a Bad Talk
Jiaxuan
Overview of LLM Agents
Jan 23 [AI Agent Overview I]
(Required) Section 2-3 of How far are we from AGI?
Jiaxuan
Jan 28 [AI Agent Overview II]
(Required) Section 4-5 of How far are we from AGI?
Jiaxuan
Jan 30 [AI Agent Overview III]
(Required) Section 6-7 of How far are we from AGI?
Jiaxuan
Feb 4 No Lecture / Work on Project Proposal
Feb 6 No Lecture / Work on Project Proposal
Agent Ability
Feb 11 [Reasoning]
(Required) Tree of Thoughts: Deliberate Problem Solving with Large Language Models
(Required) ReAct: Synergizing Reasoning and Acting in Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models
Pilot-Presenter: Kartik Ramesh, Allen Thomas, Shraddhaa Mohan | Copilot-Reviewers: Rashi Tyagi, Ziyang Zheng, Haoran Wu
Feb 13 [Memory]
(Required) HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
(Required) Cognitive Architectures for Language Agents
Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together
Global workspace theory of consciousness: Toward a cognitive neuroscience of human experience?
Pilot-Presenter: Tianyi Huang, Yuyang Wang, Boyang Sun | Copilot-Reviewers: Peixuan Han, Zirui Cheng, Xiaocheng Yang
Feb 18 [Planning]
(Required) LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
(Required) Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Feb 20 [Multi-modal Understanding]
(Required) Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
(Required) VisualWebArena: Evaluating Multimodal Agents on Realistic Visually Grounded Web Tasks
GroundingGPT: Language Enhanced Multi-modal Grounding Model
Agent Evaluation
Feb 25 [via benchmarks/LLMs/VLMs]
(Required) Autonomous Evaluation and Refinement of Digital Agents
(Required) Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
AI Agents That Matter
Agent Framework
Feb 27 [Tool Use]
(Required) ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
(Required) Gorilla: Large Language Model Connected with Massive APIs
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
What Are Tools Anyway? A Survey from the Language Model Perspective
March 4 [Retrieval-Augmented Generation]
(Required) Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity
(Required) Corrective Retrieval-Augmented Generation
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
March 6 No Lecture / Work on Mid-term Presentation
March 11 [Mid-term Presentation]
March 13 [Mid-term Presentation]
March 25 [Multi-agent Systems]
(Required) AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
(Required) CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Agent Application
March 27 [Auto-research]
(Required) ResearchTown: Simulator of Human Research Community
(Required) Can Large Language Models Provide Useful Feedback on Research Papers? A Large-scale Empirical Analysis
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
April 1 [Coding Agents]
(Required) OpenHands: An Open Platform for AI Software Developers as Generalist Agents
(Required) If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents
A Survey on Large Language Models for Code Generation
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
April 3 [Social Agents]
(Required) Generative Agents: Interactive Simulacra of Human Behavior
(Required) SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents
April 8 [Gaming Agents]
(Required) Voyager: An Open-Ended Embodied Agent with Large Language Models
(Required) MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
A Survey on Large Language Model-Based Game Agents
A Generalist Agent
Challenges from Agents to AGI
April 10 [Data]
(Required) BAGEL: Bootstrapping Agents by Guiding Exploration with Language
(Required) SOAR: Autonomous Improvement of Instruction Following Skills via Foundation Models
Latent Action Pretraining from Videos
April 15 [Safety]
(Required) Universal and Transferable Adversarial Attacks on Aligned Language Models
(Required) DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
Extracting Training Data from Large Language Models
The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies
April 17 [Alignment]
(Required) Training Language Models to Follow Instructions with Human Feedback
(Required) Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Position: A Roadmap to Pluralistic Alignment
Aligning AI with Shared Human Values
April 22 No Lecture / Work on Final Presentation. Note: There is no slot for [Human-Agent Interaction], but you can read the following papers on your own if you are interested.
Why Johnny Can’t Prompt: How Non-AI Experts Try (and Fail) to Design LLM Prompts
AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts
Evaluating Human-Language Model Interaction
April 24 No Lecture / Work on Final Presentation
April 29 (Tentative) No Lecture / Work on Final Presentation
May 1 Final Presentation
May 6 Final Presentation

Tentative Grading

Groups: All activities of this course, except your own participation :), will be performed in groups of 3 students. You can use Slack to find your teammates online. Form a group of 3 members and declare your group's membership and paper preferences (with your UIUC email address) by Jan 30. After this date, we will form groups from the remaining students.

| Component | Weight | Breakdown |
| --- | --- | --- |
| Pre-class Idea/Question Proposal | 10% | |
| In-class Discussion | 25% | 15% In-class Pilot Presentation; 10% In-class Co-pilot Summary |
| Projects | 65% | 5% Proposal Report; 5% Midterm Presentation; 30% Final Survey Report; 10% Review and Response; 15% Final Presentation |
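
As a quick sanity check on the breakdown above, the following minimal sketch confirms that the sub-weights add up to each component's weight and that the components total 100%. It also computes an overall score under the assumption that each item is scored on a 0-100 scale and combined as a simple weighted average. That combination rule, and all names in the code (`WEIGHTS`, `component_totals`, `weighted_score`), are illustrative assumptions rather than stated course policy.

```python
# Weights in percent, copied from the grading breakdown.
# The dictionary layout and function names are hypothetical.
WEIGHTS = {
    "Pre-class Idea/Question Proposal": {
        "Pre-class Idea/Question Proposal": 10,
    },
    "In-class Discussion": {
        "In-class Pilot Presentation": 15,
        "In-class Co-pilot Summary": 10,
    },
    "Projects": {
        "Proposal Report": 5,
        "Midterm Presentation": 5,
        "Final Survey Report": 30,
        "Review and Response": 10,
        "Final Presentation": 15,
    },
}


def component_totals(weights):
    """Sum the sub-item weights of each top-level component."""
    return {name: sum(items.values()) for name, items in weights.items()}


def weighted_score(scores, weights=WEIGHTS):
    """Assumed rule: weighted average of per-item scores on a 0-100 scale."""
    total = 0.0
    for items in weights.values():
        for item, weight in items.items():
            total += weight * scores[item] / 100
    return total


totals = component_totals(WEIGHTS)
assert totals["In-class Discussion"] == 25
assert totals["Projects"] == 65
assert sum(totals.values()) == 100
```

For example, calling `weighted_score` with a dictionary that maps every item name to the same score returns that score, since the weights sum to 100.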

Policies

Pre-class: Pre-class Idea/Question Proposal

Each lecture will include one or two required readings that all students are expected to read. Additionally, there will be optional related readings that only the presenter(s) are required to familiarize themselves with. These optional readings are not mandatory for the rest of the class.

Before each lecture (counted starting from Jan 28), all students must submit here (with your UIUC email address) one insightful question/idea for each of the presented papers. Up to five absences are allowed.

In-class: Presentation & Discussion

In each class after the Overview of LLM Agents lectures taught by Prof. You, students are expected to lead the presentation and discussion.

This discussion will involve two distinct roles played by different student groups, simulating an interactive and dynamic scholarly exchange. Each group will be assigned each of the following two roles once:

  1. The Pilot Presenter:

    • Group Assignment: Prepare slides for papers marked as "Required" and deliver a presentation on a specific topic.
    • Responsibility: Present the assigned topic and address audience questions during the presentation.
  2. The Co-pilot Reviewers:

    • Group Assignment: Write a summary of the paper and take on the role of reviewers for one assigned slot.
    • Responsibility: Critically evaluate the paper by posing challenging questions, identifying weaknesses, and suggesting areas for improvement. Your role is to provide constructive feedback and engage in a simulated peer review discussion.

Rest of the Class: feel free to actively ask questions and engage in the dialogue.

Guidelines for the Pilot Presenter

The course will follow a seminar format, with one group presenting during each class session. Each group will be responsible for presenting at least one lecture throughout the semester. Presentations should be no longer than 50 minutes, excluding interruptions. However, presenters should anticipate and be prepared to address questions and interruptions during their talk.

During your presentation, you are expected to:

  • Provide a concise background to introduce and motivate the problem (e.g., referencing prior talks for simplicity).
  • Explain the main idea, approach, and/or key insight from the required reading (use examples whenever appropriate).
  • Cover technical details to help the audience grasp the key points without needing to closely read the material (provide a quick overview of evaluations).
  • Highlight differences between the required reading and related works, including any additional readings.
  • Discuss strengths and weaknesses of the required reading and suggest potential directions for future research.

Submission of slides:

  • Deadlines: Slides for the presentation must be submitted to the instructor team via Canvas (in *.pptx format) at least 24 hours before the scheduled class.
  • Format: We recommend (not mandatory) this template.

Guidelines for the Co-pilot Reviewers

Each group will be assigned roughly one paper summary, due within 2 days after the class presentation.

Each summary should address the following questions in 2-3 pages with sufficient detail:

  1. What is the problem being addressed, and why is it important?
  2. What is the state of related works in this topic?
  3. What solution is proposed, and what is the key insight guiding the solution?
  4. What are the drawbacks or limitations of the proposed solution?
  5. What potential directions could be explored in future research?

Submission of Paper Summary:

  • Deadline: Summaries must be uploaded to Canvas (by one member in the group) within 2 days after the presentation of the corresponding paper. Late submissions will not be counted.
  • Format: We provide this template for your reference. We suggest using Google Docs to enable in-line comments and suggestions.

Best Practices:

  • Allocate enough time to read and understand the assigned paper.
  • Discuss the paper as a group to share perspectives and insights.
  • Write the summary carefully, ensuring clarity and completeness.
  • Incorporate key observations from the class discussion in your final submission.

Project: A Survey on LLM Agents

After forming a group of 3 members, you are expected to start your term-long project ASAP.

To simulate the whole process of academic research, you are expected to write a proposal, submit a paper, review other groups' submissions, and respond to the reviews of your own.

For the proposal:

  • Topic: Select a topic related to LLM agents that is neither too broad (e.g., "LLM agents") nor too narrow (e.g., "Minecraft gaming agents"). You may choose the topic you present in class, but this is not mandatory.
  • Format: Templates adapted from TMLR, about 2 pages.
  • Deadlines: Feb 10, 2025.

We will also have a midterm presentation to check your progress:

  • Scheduled Slots: March 11/13, 2025.
  • Time limit: To be decided.

For survey paper/report submission (draft):

  • Description: This is not the final version, but it should be complete and ready for submission. It is important for the review and response that follow.
  • Where to submit: OpenReview and Canvas. Details to be updated later.
  • Format: Templates adapted from TMLR.
  • Page Limitation: To be decided.
  • Deadlines: April 16, 2025.

For review and response:

  • Description: You are expected to review the surveys of other groups and respond to the reviews of your own paper submission. The review and response periods will each last 1 week.
  • Where to review and respond: OpenReview and Canvas. Details to be updated later.
  • Guidelines: To be updated.
  • Deadlines: Review: April 23; Response: April 30.

For final presentation:

  • Description: You are expected to present your survey paper.
  • Scheduled Slots: May 1/6, 2025.
  • Time limit: To be decided.

For final version of survey paper:

  • Description: You are expected to submit the final version of your survey report.
  • Format: Templates adapted from TMLR.
  • Where to submit: Canvas.
  • Page Limitation: To be decided.
  • Deadlines: May 8, 2025.

Summary of Deadlines

The deadlines for pre-class, in-class, and project activities are summarized here for your convenience.

| Task | Who | Due Date/Time | Notes |
| --- | --- | --- | --- |
| Team Building | Everyone | Jan 30 | Can use Slack to find teammates. |
| Team information submission for Topic Assignment | Every group | Jan 30 | Declare your group's membership and paper preferences (with your UIUC email address) here. |
| Project Proposal | Every group | Feb 10 | Submit to Canvas. |
| Pre-class Proposal | Everyone | Before the class (counted starting from Jan 28, up to 5 absences) | Submit an insightful question/idea here (UIUC email address only) for each required paper. |
| In-class Discussion | Presenter | 24 hours before the class | Submit slides to Canvas. |
| In-class Discussion | "Reviewers" | Within 2 days after the class | Submit paper summaries to Canvas. |
| Midterm Presentation | Every group | March 11/13 | |
| Survey Report Submission (Draft) | Every group | April 16 | Submit to both Canvas and OpenReview. |
| Review and Response | Every group | April 23 for review, April 30 for response | April 16-23 for review, April 24-30 for response. |
| Final Presentation | Every group | May 1/6 | |
| Report Camera-ready Revision (Final) | Every group | May 8 | Submit to Canvas. |

Acknowledgements

The structure of this course is heavily inspired by other seminar-style courses, particularly UIUC CS598-GenAI System. We thank Prof. Fan Lai for generously sharing his great course. For course topics and paper lists, we mainly refer to UC Berkeley CS294/194-196 Large Language Model Agents and the EMNLP 2024 Tutorial: Language Agents: Foundations, Prospects, and Risks. Thanks to Haofei Yu, Zirui Chen, Kunlun Zhu and other Ulab members for suggestions.
