Add Cloud RMs #173
Comments
Hi @natolambert, I recently trained a generative RM (prometheus-eval/prometheus-RM-Llama-8B-v1.0) based on the CLoud codebase, and I think inference with Hugging Face transformers can be easily integrated into the existing reward-bench code. Can I try working on this?
@scottsuk0306 yes please! I've been curious about this for a bit, and we're building new datasets now, FYI. My issue was that the tokenization timing was specific and would require a bunch of handling or a refactor. Lmk if you figure out something clever.
Hi, sorry, I realized I never followed up on this. @scottsuk0306 @natolambert we support HF inference in the CLoud repo, which I can also add (hf-inference). However, HF generate is extremely slow. More recently we added vLLM support, but I'm not sure whether that would be easy to support in your repo. Let me know if you have any thoughts.
We could try running CLoud RMs with the "generative" pipeline, which is a bit different. Either way, curious where we end up.
See @zankner's repo https://github.com/zankner/CLoud, RMs that think out loud!
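For anyone skimming, here is a rough sketch of what "thinking out loud" could look like at scoring time, assuming a two-stage flow: generate a critique of the response, then produce a scalar reward conditioned on that critique. Every name below (`score_with_critique`, `prefers_chosen`, the stub backends) is hypothetical and is not the CLoud or reward-bench API. The stubs stand in for real model calls (HF generate or vLLM) so the snippet runs without any weights.

```python
from typing import Callable, Tuple

# Hypothetical backend signatures; a real integration would wrap model calls here.
CritiqueFn = Callable[[str, str], str]        # (prompt, response) -> critique text
RewardFn = Callable[[str, str, str], float]   # (prompt, response, critique) -> scalar


def score_with_critique(
    prompt: str, response: str, critique_fn: CritiqueFn, reward_fn: RewardFn
) -> Tuple[str, float]:
    """Stage 1: generate a critique. Stage 2: score conditioned on it."""
    critique = critique_fn(prompt, response)
    reward = reward_fn(prompt, response, critique)
    return critique, reward


def prefers_chosen(
    prompt: str, chosen: str, rejected: str,
    critique_fn: CritiqueFn, reward_fn: RewardFn,
) -> bool:
    """Pairwise check: is the chosen response scored strictly higher?"""
    _, chosen_reward = score_with_critique(prompt, chosen, critique_fn, reward_fn)
    _, rejected_reward = score_with_critique(prompt, rejected, critique_fn, reward_fn)
    return chosen_reward > rejected_reward


# Stub backends (assumptions for illustration only, not real models).
def stub_critique(prompt: str, response: str) -> str:
    return f"The response has {len(response.split())} words."


def stub_reward(prompt: str, response: str, critique: str) -> float:
    # Toy heuristic: longer responses score higher.
    return float(len(response.split()))
```

Keeping the two stages behind plain callables means the slow part (critique generation) could be swapped between HF generate and vLLM without touching the pairwise comparison logic.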