Non tango pipeline #16

Merged
AkshitaB merged 9 commits into main on Dec 5, 2023

OyvindTafjord
Contributor

@AkshitaB This is an attempt at merging my old workflow from the run_lm_eval.py script in catwalk with the new steps and functionality in this repo. Feel free to leave it in a branch for now if it's not suitable to be merged.

From my viewpoint, the main advantages of running it this way are:

  • No reliance on a tango workspace for retrieving raw predictions; results link to the associated beaker experiment instead
  • Less setup, since there's no need for a public tango workspace for communication between steps
  • Easier debugging if something goes wrong, and no issues with tango caching the wrong output
  • No need for a bunch of jsonnet code to tinker with the evaluation configs

Running this in beaker requires a somewhat clunky gantry command, plus uploading the config files (to either beaker or nfs) if you use them instead of direct parameters (which are also supported). But that's easy enough once it's in your workflow. My gantry example uses my OLMoEvalLatest beaker image, which includes the OLMo code and which I try to keep up to date.

Since this is an internal repo, GITHUB_TOKEN is needed to run gantry. Also, GDRIVE_SERVICE_ACCOUNT_JSON is needed to upload to Google Sheets. (I tweaked the Google Sheets upload a bit, behind a simple_pipeline argument, to add the beaker ID, remove the tango stuff, and add an all_metrics column; see the example in my "OLMo-evals-testing" sheet.)
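As a rough illustration of the environment requirements above, here is a minimal preflight sketch that reports which of the two variables are unset before a run is launched. Only the variable names GITHUB_TOKEN and GDRIVE_SERVICE_ACCOUNT_JSON come from this PR; the helper `missing_env_vars` is hypothetical and is not part of this repo or of gantry.

```python
import os

# Variable names are taken from the PR description. How the pipeline actually
# reads them is not shown here, so this check is only an illustrative assumption.
REQUIRED_ENV_VARS = ("GITHUB_TOKEN", "GDRIVE_SERVICE_ACCOUNT_JSON")


def missing_env_vars(required=REQUIRED_ENV_VARS, environ=None):
    """Return the names in `required` that are unset or empty in `environ`.

    `environ` defaults to the current process environment; passing a plain
    dict makes the check easy to exercise without touching real secrets.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

One way to use it would be to call `missing_env_vars()` at the start of a launch script and stop with a clear message if the returned list is non-empty. If the Google Sheets upload is skipped, passing `required=("GITHUB_TOKEN",)` would restrict the check to what gantry needs.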

I'm totally open to suggestions for how to refactor this to accomplish the main advantages above without being quite as hacky. :)

Collaborator

AkshitaB commented Dec 4, 2023

@OyvindTafjord Thank you for the PR! I've made some fixes, mainly formatting and lint, and added a couple of test cases. This should help with future updates to the code.

AkshitaB merged commit b56a989 into main Dec 5, 2023
9 checks passed

2 participants