
Add Cloud RMs #173

Open
natolambert opened this issue Sep 6, 2024 · 4 comments
Labels
New Model Add a new model to the leaderboard/codebase

Comments

@natolambert
Collaborator

See @zankner's repo https://github.com/zankner/CLoud — RMs that think out loud!

@natolambert natolambert added the New Model Add a new model to the leaderboard/codebase label Sep 6, 2024
@scottsuk0306

Hi @natolambert, I recently trained a generative RM (prometheus-eval/prometheus-RM-Llama-8B-v1.0) based on the CLoud codebase, and I think inference using Hugging Face transformers could be integrated easily into the existing reward-bench code. Can I try working on this?
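(For context: CLoud-style generative RMs emit a natural-language critique followed by a numeric score, so integration mostly needs a way to pull that score out of generated text. A minimal sketch of such a parser — the `Score:` output format here is an illustrative assumption, not necessarily CLoud's or Prometheus's exact convention:)

```python
import re

def extract_score(generation: str, default: float = 0.0) -> float:
    """Pull the final numeric score from a generative RM's output.

    Assumes the model ends its critique with something like
    '... Score: 4'. The exact format varies by model, so this
    pattern is a guess to be adapted per model, not a fixed API.
    """
    matches = re.findall(r"[Ss]core:?\s*([0-9]+(?:\.[0-9]+)?)", generation)
    # Take the last match so numbers inside the critique text don't win.
    return float(matches[-1]) if matches else default
```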

@natolambert
Collaborator Author

natolambert commented Dec 2, 2024

@scottsuk0306 yes please! I've been curious for a bit and we're building new datasets now

FYI -- my issue was the tokenization timing was specific and would require a bunch of handling or a refactor. Lmk if you figure out something clever.

@zankner

zankner commented Dec 2, 2024

Hi, sorry — I realized I never followed up on this. @scottsuk0306 @natolambert we support HF inference in the CLoud repo (hf-inference), which I can also add. However, HF generate is extremely slow. More recently we added vLLM support, but I'm not sure whether that would be easy to support in your repo; let me know if you have any thoughts.

@natolambert
Collaborator Author

We could try running CLoud RM with the "generative" pipeline, which is a bit different.
Otherwise, I don't think we need vLLM unless we can figure out how to wrap it in the same abstraction — which actually may be possible, because we just do some general model loading.

Either way, curious where we end up.

3 participants