Potential Overflow in TrainingState.env_steps on GPU due to jnp.int32 Default #578

vincentzhang · 2025-02-13T05:58:07Z

In the Brax PPO training implementation, the TrainingState.env_steps field (see train.py#L60 and #L613) is a jnp.ndarray that defaults to jnp.int32 on GPU. If a training run exceeds 2³¹ – 1 (2,147,483,647) steps, the counter can overflow, potentially leading to unexpected behavior in long-running training sessions.

The overflow problem has been reported in the Mujoco Playground repository (#48). To mitigate this, one option might be to initialize env_steps with a 64‑bit integer (e.g. jnp.array(0, dtype=jnp.int64)).
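For illustration, a minimal sketch of the wraparound described above. It does not use Brax itself; the names env_steps and step_increment are stand-ins chosen for this example, and the only assumption is that the counter is held as a jnp.int32 array.

```python
import jax.numpy as jnp
import numpy as np

# Largest value a signed 32-bit integer can hold: 2**31 - 1.
INT32_MAX = np.iinfo(np.int32).max

# Stand-in for a step counter that has reached the int32 limit.
env_steps = jnp.array(INT32_MAX, dtype=jnp.int32)
step_increment = jnp.array(1, dtype=jnp.int32)

# int32 addition wraps around instead of raising an error,
# so the counter silently becomes negative.
wrapped = env_steps + step_increment
print(int(wrapped))
```

Because no error is raised, any later check that compares the counter against a large num_timesteps can misbehave without an obvious cause.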

erikfrey · 2025-02-14T22:53:40Z

Unfortunately this is a pretty fundamental JAX limitation - you can't mix 32 and 64 bit precision. So if you want 64 bit step count, training is going to slow down a lot because everything else will become 64 bit too.

What we could do, if some intrepid soul would like to try it, is make the step count a custom big int. You will need to store two int32s, and when you want the step count you'd need to do:

step_count = (num1 << 32) + num2

Then you'd have to add some logic to increment num1 and num2 appropriately.
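A rough sketch of how such a two-word counter might look, not a tested Brax patch. The helper names (init_counter, add_steps, to_python_int) are hypothetical. It also assumes two deviations from the suggestion above: the words are stored as uint32 rather than int32, because unsigned wraparound makes the carry easy to detect, and the two words are combined on the host as a Python int, since a shift by 32 does not fit in a 32-bit value.

```python
import jax
import jax.numpy as jnp
import numpy as np


def init_counter():
    """Returns (hi, lo) words of a 64-bit counter, both starting at zero."""
    return jnp.zeros((), dtype=jnp.uint32), jnp.zeros((), dtype=jnp.uint32)


@jax.jit
def add_steps(hi, lo, inc):
    """Adds a uint32 increment to the counter and propagates the carry."""
    inc = jnp.asarray(inc, dtype=jnp.uint32)
    new_lo = lo + inc  # wraps modulo 2**32 on overflow
    # If the low word wrapped, the result is smaller than the old value.
    carry = (new_lo < lo).astype(jnp.uint32)
    return hi + carry, new_lo


def to_python_int(hi, lo):
    """Combines the two words on the host, where ints are unbounded."""
    return (int(hi) << 32) + int(lo)


# Usage: three increments of 2**31 steps exceed the int32 range.
hi, lo = init_counter()
for _ in range(3):
    hi, lo = add_steps(hi, lo, np.uint32(2**31))
total = to_python_int(hi, lo)
print(total)
```

Each increment has to fit in a uint32 for the carry check to be valid, which should hold for a per-update step count. Everything inside the jitted function stays 32-bit, so this sketch does not require enabling 64-bit mode.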

vincentzhang · 2025-02-17T07:39:18Z

Thanks a lot for the suggestion. I'll look into it sometime next week.

vincentzhang mentioned this issue Feb 13, 2025

The step counter overflows when using a very large num_timesteps google-deepmind/mujoco_playground#48

Open

vincentzhang mentioned this issue Feb 20, 2025

assert total_steps >= num_timesteps google-deepmind/mujoco_playground#52

Open
