From 5b5debedaf8b165ed5ecc30b210423e33f466870 Mon Sep 17 00:00:00 2001
From: pooruss
Date: Sun, 17 Mar 2024 16:39:43 +0800
Subject: [PATCH] Update stabletoolbench in readme

---
 README.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/README.md b/README.md
index c546c82..68b30ef 100644
--- a/README.md
+++ b/README.md
@@ -36,6 +36,9 @@
 *Read this in [中文](README_ZH.md).*
 
 ## What's New
+- **[2024/3/17]** Welcome to **[StableToolBench](https://github.com/zhichengg/StableToolBench)**:
+A **stable and reliable** local toolbench server based on API response simulation. Dive deeper into the tech behind StableToolBench with [paper here](https://arxiv.org/pdf/2403.07714.pdf) and explore more on the [project homepage](https://zhichengg.github.io/stb.github.io/). Codes are available [here](https://github.com/zhichengg/StableToolBench).
+
 - **[2023/9/29]** A new version ToolEval which is more stable and covers more models including GPT4! Please refer to [**ToolEval**](https://github.com/OpenBMB/ToolBench/tree/master/toolbench/tooleval) for more details. Besides, [**ToolLLaMA-2-7b-v2**](https://huggingface.co/ToolBench/ToolLLaMA-2-7b-v2) is released with stronger tool-use capabilities. Please use the ToolLLaMA-2-7b-v2 model to reproduce our latest experimental results with the new version ToolEval.
 
 - **[2023/8/30]** Data updation, with more than **120,000** solution path annotations and **intact reasoning thoughts**! Please find `data.zip` on [Google Drive](https://drive.google.com/drive/folders/1yBUQ732mPu-KclJnuQELEhtKakdXFc3J).
@@ -683,3 +686,14 @@ Feel free to cite us if you like ToolBench.
   primaryClass={cs.CL}
 }
 ```
+
+```bibtex
+@misc{guo2024stabletoolbench,
+  title={StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models},
+  author={Guo, Zhicheng and Cheng, Sijie and Wang, Hao and Liang, Shihao and Qin, Yujia and Li, Peng and Liu, Zhiyuan and Sun, Maosong and Liu, Yang},
+  year={2024},
+  eprint={2403.07714},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL}
+}
+```