wandb support, callback func for PipelineStage, and cache handling #382

Merged
merged 26 commits into from
Jul 22, 2024

KuoHaoZeng
Collaborator

  1. Track the gradient norm before clipping.
  2. Add sampler_select to remove the KV cache entries corresponding to finished processes during evaluation.
  3. Support uploading checkpoints to wandb.
  4. Support downloading checkpoints from wandb at evaluation and when resuming training.
  5. Add a callback function to support changing engine attributes in different PipelineStages, for example changing the optimizer.
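
For item 1, a minimal sketch of what tracking the gradient norm before clipping can look like. This is pure Python for illustration and is not the code in this PR; `clip_grad_norm`, `grads`, and the `grad_norm_before_clip` metric key are hypothetical names. In PyTorch, `torch.nn.utils.clip_grad_norm_` already returns the total norm computed before scaling, so that return value can be logged directly.

```python
import math
from typing import List, Tuple


def clip_grad_norm(
    grads: List[List[float]], max_norm: float
) -> Tuple[List[List[float]], float]:
    """Scale gradients so their global L2 norm is at most max_norm.

    Returns the clipped gradients and the norm measured BEFORE clipping.
    (Hypothetical helper, for illustration only.)
    """
    # Global L2 norm over all gradient entries, treated as one flat vector.
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # Never scale up; the small epsilon guards against division by zero.
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, total_norm


grads = [[3.0, 0.0], [0.0, 4.0]]
clipped, norm_before = clip_grad_norm(grads, max_norm=1.0)

# Log the pre-clip value; the post-clip norm is capped at max_norm.
metrics = {"grad_norm_before_clip": norm_before}
```

Logging the pre-clip norm shows how often and how strongly clipping is active, which the post-clip value hides because it is capped at `max_norm`.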

@@ -210,8 +210,7 @@


 class TrackingCallback(Protocol):
-    def __call__(self, type: TrackingInfoType, info: Dict[str, Any], n: int):
-        ...
+    def __call__(self, type: TrackingInfoType, info: Dict[str, Any], n: int): ...

Check notice: Code scanning / CodeQL
Statement has no effect (Note): This statement has no effect.
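
For context on the notice above: in a `typing.Protocol`, the `...` body is a deliberate placeholder, since the class only describes a call signature. A minimal sketch follows; the `TrackingInfoType` enum here is a hypothetical stand-in for the real type, and `record` is an illustrative callback.

```python
from enum import Enum
from typing import Any, Dict, List, Protocol, Tuple


class TrackingInfoType(Enum):
    # Hypothetical stand-in; the real enum lives in the project.
    LOSS = "loss"


class TrackingCallback(Protocol):
    # The ellipsis is an intentional placeholder body: the Protocol only
    # declares the signature that conforming callables must accept.
    def __call__(self, type: TrackingInfoType, info: Dict[str, Any], n: int): ...


records: List[Tuple[TrackingInfoType, Dict[str, Any], int]] = []


def record(type: TrackingInfoType, info: Dict[str, Any], n: int) -> None:
    """Illustrative callback that stores every tracking event it receives."""
    records.append((type, info, n))


# A plain function with a matching signature satisfies the Protocol.
callback: TrackingCallback = record
callback(TrackingInfoType.LOSS, {"value": 0.5}, n=4)
```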
@@ -255,14 +263,18 @@

 # We can't display all 72 channels in an RGB image so instead we randomly assign
 # each object a color and then just allow them to overlap each other
-colored_semantic_map = SemanticMapBuilder.randomly_color_semantic_map(
-    semantic_map
+colored_semantic_map = (

Check notice: Code scanning / CodeQL
Unused local variable (Note, test): Variable colored_semantic_map is not used.
 )

 # Here's the full semantic map with nothing masked out because the agent
 # hasn't seen it yet
-colored_semantic_map_no_fog = SemanticMapBuilder.randomly_color_semantic_map(
-    map_sensors[-1].semantic_map_builder.ground_truth_semantic_map
+colored_semantic_map_no_fog = (

Check notice: Code scanning / CodeQL
Unused local variable (Note, test): Variable colored_semantic_map_no_fog is not used.
@@ -575,7 +590,7 @@
 )
 allo_h, allo_w = allocentric_map_height_width

-max_view_range = math.sqrt((ego_w / 2.0) ** 2 + ego_h ** 2)
+max_view_range = math.sqrt((ego_w / 2.0) ** 2 + ego_h**2)

Check warning: Code scanning / CodeQL
Pythagorean calculation with sub-optimal numerics (Warning): Pythagorean calculation with sub-optimal numerics.
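
The warning above usually points at `math.hypot`, which computes the same Euclidean length while avoiding intermediate overflow or underflow from squaring. A sketch of the equivalent call, where the `max_view_range` function wrapper and the sample values are illustrative only:

```python
import math


def max_view_range(ego_w: float, ego_h: float) -> float:
    """Length of the diagonal from the agent to a far corner of the egocentric view."""
    # Same quantity as math.sqrt((ego_w / 2.0) ** 2 + ego_h ** 2),
    # but computed without explicitly squaring the inputs.
    return math.hypot(ego_w / 2.0, ego_h)


# Illustrative values: half-width 3.0 and height 4.0 give a 3-4-5 triangle.
naive = math.sqrt((6.0 / 2.0) ** 2 + 4.0**2)
robust = max_view_range(6.0, 4.0)
```

For map sizes of ordinary magnitude both forms agree; the difference only matters for extremely large or small inputs.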
    return ckpts_paths
else:
    assert len(ckpt_steps) == 1
    ckpt_fn = "{}-step-{}:latest".format(run_token, steps)

Check failure: Code scanning / CodeQL
Potentially uninitialized local variable (Error): Local variable 'steps' may be used before it is initialized.
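
One way to resolve this kind of error is to bind the variable explicitly on the branch that uses it. The sketch below assumes, and this is only an assumption, that `steps` is meant to be the single entry of `ckpt_steps`; the helper name `latest_artifact_name` and the sample run token are hypothetical.

```python
from typing import List


def latest_artifact_name(run_token: str, ckpt_steps: List[int]) -> str:
    """Build the artifact name for the single requested checkpoint step.

    Hypothetical helper: it assumes the step to format is the only
    element of ckpt_steps, which the original snippet does not confirm.
    """
    assert len(ckpt_steps) == 1
    # Bind `steps` on this branch so it can never be read uninitialized.
    steps = ckpt_steps[0]
    return "{}-step-{}:latest".format(run_token, steps)


print(latest_artifact_name("my-run", [1000]))  # my-run-step-1000:latest
```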
jordis-ai2 previously approved these changes Jul 15, 2024
Collaborator

@jordis-ai2 jordis-ai2 left a comment

LGTM. Good catch :)

@KuoHaoZeng
Collaborator Author

A few updates:

  1. Fix pytest.yaml by downgrading torchvision to torchvision>=0.7.0,<=0.16.2 in setup.py.
  2. Add a black lint check to the GitHub Actions.
  3. Make checkpoint saving on every host an option.

Contributor

@Lucaweihs Lucaweihs left a comment

LGTM

@Lucaweihs Lucaweihs merged commit 0f8944a into main Jul 22, 2024
7 checks passed
4 participants