(:construction: Migrating to GitHub App) PR Arena ⚔️

PR Arena is a coding assistant designed to evaluate and improve the OpenHands GitHub Backlog Resolver through paired pull request (PR) generations. It enables developers to compare contributions from different LLMs such as GPT-4o, Llama, and more.

This project is inspired by Copilot Arena, an open source AI coding assistant that provides paired autocomplete completions from different LLMs.

Follow the instructions below to set up the Arena setting for the OpenHands resolver.

Using the GitHub Actions Workflow

This repository includes a GitHub Actions workflow that can automatically attempt to generate a pair of pull requests for individual issues labeled with 'pr-arena'. Follow the steps to use this workflow in your own repository:

Prepare a GitHub personal access token. You can:
1. Contact us and we will set up a token for the openhands-agent account (if you want to make it clear which commits came from the agent).
2. Choose your own GitHub user that will make the commits to the repo, and create a personal access token with read/write scope for "contents", "issues", "pull requests", and "workflows" on the desired repos.
Create an API key for the LLMs you will use in the Arena setting. We usually use a single API key that can access the LLM Router.
Copy the .github/workflows/openhands-resolver.yml file to your repository's .github/workflows/ directory.
Enable read/write workflow permissions for the repository: go to Settings -> Actions -> General -> Workflow permissions, select "Read and write permissions", and check "Allow GitHub Actions to create and approve pull requests".
Set up the following GitHub secrets in your repository, or across your entire org if you want to set them only once and use the resolver in multiple repositories:
- PAT_USERNAME: The GitHub username that you used to create the personal access token.
- PAT_TOKEN: The personal access token for GitHub.
- LLM_MODELS: A comma-separated list of the LLM models to use (e.g. litellm_proxy/neulab/claude-3-5-sonnet-20240620, litellm_proxy/neulab/gpt-4o-2024-05-13, litellm_proxy/neulab/gpt-4o-2024-08-06, litellm_proxy/neulab/gpt-4o-mini-2024-07-18, litellm_proxy/neulab/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo, litellm_proxy/neulab/Qwen/Qwen2-72B-Instruct, litellm_proxy/neulab/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo, litellm_proxy/neulab/NousResearch/Hermes-3-Llama-3.1-405B-Turbo, litellm_proxy/neulab/gemini/gemini-1.5-flash, litellm_proxy/neulab/gemini/gemini-1.5-pro, litellm_proxy/neulab/o1-preview, litellm_proxy/neulab/o1-mini, litellm_proxy/neulab/meta-llama/Meta-Llama-3.1-405B-Instruct, litellm_proxy/neulab/meta-llama/Meta-Llama-3.1-70B-Instruct, litellm_proxy/neulab/meta-llama/Meta-Llama-3.1-8B-Instruct, litellm_proxy/neulab/meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo, litellm_proxy/neulab/meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo, litellm_proxy/neulab/deepseek-chat)
- LLM_API_KEY: Your API key to access the LLM Router for the LLM service
- LLM_BASE_URL: The base URL for the LLM API (e.g. https://llm-proxy.app.all-hands.dev)
- FIREBASE_CONFIG: (Only for the prototype) An environment variable containing the Firebase configuration details (e.g., API key, project ID, etc.).
To trigger the workflow, add the 'pr-arena' label to any issue you want the AI to attempt to resolve in an Arena setting.
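
Before adding the label, it can help to confirm that none of the required settings is empty. The following is a minimal sketch, not part of the repository: it assumes the secrets listed above are exposed to a script as environment variables with the same names, and the helper `missing_secrets` is a hypothetical name introduced here for illustration. FIREBASE_CONFIG is left out because it is described as prototype-only.

```python
import os

# Names taken from the secrets list above; FIREBASE_CONFIG is omitted on purpose.
REQUIRED_SECRETS = [
    "PAT_USERNAME",
    "PAT_TOKEN",
    "LLM_MODELS",
    "LLM_API_KEY",
    "LLM_BASE_URL",
]


def missing_secrets(env):
    """Return the required setting names that are absent or blank in `env`.

    `env` is any mapping of names to string values, for example os.environ.
    This helper is hypothetical and only illustrates the check.
    """
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]


def report(env=None):
    """Print one line per missing setting and return True if all are present."""
    env = os.environ if env is None else env
    missing = missing_secrets(env)
    for name in missing:
        print(f"missing or empty: {name}")
    return not missing
```

Calling `report()` in a local shell or a workflow step prints the names that still need to be configured and returns `True` only when all five are set.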

The workflow will:

Randomly select two LLMs from the given LLM_MODELS and attempt to resolve the issue with the OpenHands resolver, once per selected model.
Create and display two git patches, one for each attempt. (Wait until the GitHub Action comments on the issue with the webpage URL for your arena!)
When the user selects one of them, automatically create a pull request based on the selected model.
Comment on the issue with the results.
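
The random selection in the first step can be sketched as follows. This is an illustration under stated assumptions, not the workflow's actual code: the function names `parse_models` and `pick_pair` are hypothetical, and removing duplicate entries before sampling is an assumption made here so that the two attempts always use different models.

```python
import random


def parse_models(raw):
    """Split a comma-separated LLM_MODELS value into a list of model names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def pick_pair(raw, rng=None):
    """Pick two distinct models at random from a comma-separated string.

    `rng` may be a random.Random instance for reproducible choices;
    by default the module-level generator is used.
    """
    # Assumption: drop duplicates while keeping the original order.
    models = list(dict.fromkeys(parse_models(raw)))
    if len(models) < 2:
        raise ValueError("LLM_MODELS must list at least two distinct models")
    chooser = rng if rng is not None else random
    # random.sample draws without replacement, so the two picks differ.
    first, second = chooser.sample(models, 2)
    return first, second
```

For example, `pick_pair("litellm_proxy/neulab/o1-mini, litellm_proxy/neulab/deepseek-chat")` returns those two models in a random order, and a value with fewer than two distinct models raises `ValueError`.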

Troubleshooting

This project is an extension of the OpenHands GitHub Backlog Resolver. If you have any issues, please open an issue on this GitHub repo; we're happy to help! Alternatively, you can email us or join the OpenHands Slack workspace and ask there.
