Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent a53895e, commit e6017b3
Showing 4 changed files with 204 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
<!DOCTYPE html> | ||
<html lang="en"> | ||
<head> | ||
<meta charset="UTF-8"> | ||
<title>Problem Solving in Language Model Networks</title> | ||
<link rel="stylesheet" type="text/css" href="styles.css"> | ||
</head> | ||
<body> | ||
<div class="container"> | ||
<h1 style="margin-top: 50px;"><span class="highlighted-title">Problem Solving in Language Model Networks</span></h1> | ||
<div class="author-info"> | ||
<p class="author">Ciaran Regan<sup>1</sup>, Alexandre Gournail<sup>2</sup>, Mizuki Oka<sup>1</sup></p> | ||
<p class="affiliation"> | ||
<sup>1</sup>Grad. School of Science and Technology, University of Tsukuba, Tsukuba, Ibaraki, Japan <sup>2</sup>Ensimag, Grenoble INP, Grenoble, France | ||
</p> | ||
</div> | ||
<div class="images-container"> | ||
<img src="./figs/tsukuka.png" alt="University of Tsukuba logo" class="image"> | ||
<img src="./figs/ensimag.png" alt="Ensimag logo" class="image"> | ||
</div> | ||
|
||
<h3 class="abstract-heading">Abstract</h3> | ||
<p>We investigate multi-agent approaches to enhance the reasoning and question-answering capabilities of Large Language Models (LLMs). Our study extends the concept of multi-agent debate to more complex network structures, specifically scale-free networks. We measure the question-answering performance, the strength of the consensus formed, and the impact of bias within the network. Results indicate that correctly biased hub nodes significantly improve overall system performance, suggesting that strategically placing knowledgeable agents can boost collective intelligence.</p> | ||
|
||
<h3 class="abstract-heading">Introduction</h3> | ||
<p>Large Language Models (LLMs) have shown remarkable abilities in various tasks, yet they still struggle with hallucinations and incorrect answers. To address these issues, approaches inspired by human problem-solving have been introduced. Techniques like ReAct and Reflexion enable LLMs to engage in iterative reasoning and self-reflection, improving their decision-making. However, these methods primarily use a single agent. Our work explores multi-agent systems on scale-free networks, aiming to understand how agents influence each other and how network topology affects performance. To do so, we extend multi-agent debate to these complex networks and analyze the resulting dynamics and effectiveness.</p> | ||
|
||
<h3 class="abstract-heading">Methods</h3> | ||
<p>We represent LLM agents as nodes in a network, with edges indicating communication channels. In multi-agent debate, agents first solve problems individually, then reconsider their answers based on their neighbors' responses and their own previous answers. This process repeats for several rounds, after which a majority vote determines the collective answer. We introduce bias by providing certain agents with correct or incorrect answers and study the influence of these biased agents based on their network position: highly connected hub nodes or sparsely connected peripheral ("edge") nodes. We then analyze the impact of these biases on overall performance and on the consensus within the network.</p> | ||
|
||
<h3 class="abstract-heading">Experimental Setup</h3> | ||
<p>We conducted experiments using three scale-free networks, each with 25 GPT-3.5-Turbo powered agents. These agents engaged in four rounds of debate to answer 100 high-school mathematics questions from the MMLU dataset. The experiment was repeated three times to account for run-to-run variability. To study the effect of bias, we introduced correct and incorrect answers into either the hub nodes or the peripheral (edge) nodes and compared the performance with unbiased networks. The goal was to observe how biased nodes influenced the spread of information and the overall accuracy of the system.</p> | ||
|
||
<h3 class="abstract-heading">Results</h3> | ||
<p>The introduction of bias into hub nodes had a significant impact on performance. Correctly biased hubs increased the system's accuracy from 64% to 86%, while incorrectly biased hubs reduced it to 42%. This shows that agents are strongly influenced by their neighbors' responses. Networks with biased edge nodes showed little change in performance, indicating that influence is more significant when the biased nodes are centrally located. Our analysis revealed that agents tend to form a consensus when the system answers correctly, but responses are split when the system is incorrect. The presence of bias reduced consensus in incorrect answers, increasing the variability in responses.</p> | ||
|
||
<h3 class="abstract-heading">Conclusion</h3> | ||
<p>Our study demonstrates that the strategic placement of knowledgeable agents in central network positions can enhance the overall performance of multi-agent systems. This finding suggests that future multi-agent systems should leverage network topology to optimize collective intelligence. By placing larger, more capable models at network hubs and smaller models at the periphery, it is possible to improve performance without a significant increase in computational cost. Future research should explore different network structures and larger systems to generalize these findings further.</p> | ||
|
||
<h3 class="abstract-heading">Discussion and Limitations</h3> | ||
<p>This study has important implications for designing future multi-agent systems. However, it is limited by the number of agents, questions, and rounds used due to computational constraints. Future work should explore a broader range of network structures, including random and small-world networks, and increase the number of agents to better understand the dynamics and performance of these systems. Despite these limitations, our findings provide valuable insights into how bias and network topology influence collective problem-solving and consensus formation in multi-agent systems.</p> | ||
</div> | ||
</body> | ||
</html> |
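Note on the Methods and Experimental Setup paragraphs in the page above: the debate procedure they describe can be illustrated with a small simulation. This is only a sketch under loud assumptions and not the authors' implementation. No LLM is called: each agent is a stub that answers correctly with probability `p_correct`, and the "reconsider" step is replaced by adopting the most common answer among the agent and its neighbours. The function names, the attachment parameter `m`, the choice of three hubs, and the agreement fraction used as a consensus measure are all hypothetical.

```python
import random
from collections import Counter


def scale_free_graph(n, m, rng):
    """Grow a graph by preferential attachment (Barabasi-Albert style).

    Returns an adjacency dict {node: set(neighbours)}.
    """
    adj = {i: set() for i in range(n)}
    # Start from a small fully connected core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
    for new in range(m + 1, n):
        # Listing each node once per edge makes a uniform draw degree-proportional.
        pool = [node for node in range(new) for _ in adj[node]]
        targets = set()
        while len(targets) != m:
            targets.add(rng.choice(pool))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
    return adj


def debate(adj, truth, options, rounds, p_correct, biased, rng):
    """Run a stubbed debate; `biased` maps node -> answer it always gives."""
    def solo_answer():
        # ASSUMPTION: stand-in for an LLM call, right with probability p_correct.
        if rng.random() < p_correct:
            return truth
        return rng.choice([o for o in options if o != truth])

    answers = {}
    for v in sorted(adj):
        answers[v] = biased[v] if v in biased else solo_answer()
    for _ in range(rounds):
        updated = {}
        for v in sorted(adj):
            if v in biased:
                updated[v] = biased[v]
                continue
            # ASSUMPTION: the agent adopts the most common answer among its
            # own previous answer and its neighbours' answers.
            seen = Counter([answers[v]] + [answers[u] for u in sorted(adj[v])])
            top = max(seen.values())
            best = sorted(a for a, c in seen.items() if c == top)
            # On a tie that includes the previous answer, keep it.
            updated[v] = answers[v] if answers[v] in best else best[0]
        answers = updated
    return answers


def majority_vote(answers):
    """Return (winning answer, fraction of agents that gave it)."""
    counts = Counter(answers.values())
    top = max(counts.values())
    winner = sorted(a for a, c in counts.items() if c == top)[0]
    return winner, top / len(answers)


if __name__ == "__main__":
    rng = random.Random(0)
    graph = scale_free_graph(n=25, m=2, rng=rng)
    # Treat the three highest-degree nodes as hubs (hypothetical choice).
    hubs = sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:3]
    answers = debate(graph, truth="A", options="ABCD", rounds=4, p_correct=0.5,
                     biased={h: "A" for h in hubs}, rng=rng)
    print(majority_vote(answers))
```

Swapping the `biased` mapping between the highest-degree and lowest-degree nodes, or between the correct and an incorrect answer, gives a rough feel for the hub versus peripheral comparison. The numbers such a toy model produces say nothing about the accuracies reported in the Results paragraph.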
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,160 @@ | ||
body { | ||
font-family: 'Arial', sans-serif; | ||
} | ||
.container { | ||
width: 80%; | ||
margin: auto; | ||
} | ||
.video-container { | ||
position: relative; | ||
padding-bottom: 56.25%; /* 16:9 aspect ratio */ | ||
padding-top: 25px; | ||
height: 0; | ||
} | ||
.video-caption { | ||
text-align: center; | ||
font-size: 0.9em; | ||
color: #666; | ||
margin-top: 5px; | ||
} | ||
.video-container iframe { | ||
position: absolute; | ||
top: 0; | ||
left: 0; | ||
width: 100%; | ||
height: 100%; | ||
} | ||
.small-heading { | ||
font-size: 1.4em; | ||
font-weight: bold; | ||
color: #2e2e2e; | ||
margin-top: 30px; | ||
text-align: left; | ||
} | ||
.author-info { | ||
text-align: center; | ||
margin-top: 5px; | ||
} | ||
.author { | ||
font-size: 1.2em; | ||
color: #363636; | ||
} | ||
.affiliation, .corresponding-author { | ||
font-size: 0.9em; | ||
} | ||
sup { | ||
font-size: 0.75em; | ||
} | ||
.images-container { | ||
text-align: center; /* Center the images */ | ||
margin-top: 20px; /* Leave space below the content above */ | ||
display: flex; | ||
justify-content: center; | ||
align-items: center; | ||
} | ||
.image { | ||
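width: 150px; /* Set the image width as appropriate */ | ||
height: auto; /* Auto-adjust the height to preserve the aspect ratio */ | ||
margin: 0 30px; /* Add spacing between the images */ | ||
display: inline-block; /* Display the image as an inline-block element */ | ||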
} | ||
.results-section { | ||
font-size: 1.0em; /* Adjust size as needed */ | ||
text-align: left; | ||
margin-top: 40px; /* Space above the results section */ | ||
margin-bottom: 10px; /* Space below the results section */ | ||
} | ||
.figure img { | ||
max-width: 100%; | ||
height: auto; | ||
} | ||
.caption { | ||
text-align: center; | ||
font-size: 0.9em; | ||
color: #666; | ||
} | ||
.highlighted-title { | ||
font-size: 1.2em; /* Adjust size as needed */ | ||
font-weight: bold; | ||
} | ||
|
||
h1 { | ||
text-align: center; | ||
font-weight: normal; /* Ensure the rest of the title is not bold */ | ||
} | ||
.abstract-heading { | ||
font-size: 1.0em; /* Adjust size as needed */ | ||
text-align: left; | ||
margin-top: 40px; /* Space above the Abstract heading */ | ||
margin-bottom: 10px; /* Space below the Abstract heading */ | ||
} | ||
.method-section{ | ||
font-size: 1.0em; /* Adjust size as needed */ | ||
text-align: left; | ||
margin-top: 40px; /* Space above the method section */ | ||
margin-bottom: 10px; /* Space below the method section */ | ||
} | ||
.collapsible { | ||
background-color: #f9f9f9; | ||
color: #444; | ||
cursor: pointer; | ||
padding: 18px; | ||
width: 100%; | ||
border: none; | ||
text-align: left; | ||
outline: none; | ||
font-size: 15px; | ||
transition: 0.4s; | ||
} | ||
|
||
.active, .collapsible:hover { | ||
background-color: #555; | ||
color: white; | ||
} | ||
.tcolorbox { | ||
border: 1px solid rgb(27, 27, 27); | ||
background-color: #cdcdcd; | ||
padding: 10px; | ||
margin: 10px 0; | ||
border-radius: 10px; /* Adjust this value as needed */ | ||
} | ||
.collapsible::after { | ||
content: '\002B'; /* Unicode character for "+" */ | ||
font-size: 13px; | ||
color: #777; | ||
float: right; | ||
margin-left: 5px; | ||
} | ||
.active::after { | ||
content: "\2212"; /* Unicode minus sign (U+2212) */ | ||
} | ||
.content { | ||
padding: 0 18px; | ||
display: none; | ||
overflow: hidden; | ||
background-color: #f1f1f1; | ||
transition: max-height 0.2s ease-out; | ||
} | ||
.figure { | ||
display: flex; | ||
justify-content: center; | ||
margin-bottom: 20px; | ||
margin-top: 20px; | ||
} | ||
.subfigure { | ||
margin: 0 10px; | ||
text-align: center; | ||
} | ||
.subfigure img { | ||
width: 100%; | ||
height: auto; | ||
} | ||
@media (max-width: 768px) { | ||
.figure { | ||
flex-direction: column; | ||
align-items: center; | ||
} | ||
.subfigure { | ||
width: 80%; | ||
} | ||
} |
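Note on the `.video-container` rule in the stylesheet above: percentage padding in CSS is resolved against the width of the containing block, so `padding-bottom: 56.25%` reserves a box whose height is 9/16 of its width. The arithmetic can be checked with a short sketch; `padding_percent` is a hypothetical helper name, not part of this commit.

```python
def padding_percent(width_units, height_units):
    """Percentage padding that reserves a box with the given aspect ratio."""
    # Height as a percentage of width, e.g. 9 / 16 * 100 for a 16:9 video.
    return height_units / width_units * 100


if __name__ == "__main__":
    print(padding_percent(16, 9))
```

The same helper gives the value to use for other ratios, for example a 4:3 embed.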