Fixed comm parser issue
Summary:
This diff fixes the following two comm parser issues:
1. process_group:init changed the backend_id key to uid
2. record_param_comms changed its number of inputs from 8 to 10
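
The two fixes can be sketched in isolation as follows. This is a minimal illustration based only on the changed lines in this commit, not the parser's actual structure: the helper names `get_backend_id` and `input_shift` are hypothetical, and the sample dictionaries are made up for the example.

```python
def get_backend_id(pg: dict):
    """Return the process-group identifier from a trace entry.

    Hypothetical helper: prefers the "uid" key and falls back to the
    older "backend_id" key when "uid" is absent.
    """
    return pg["uid"] if "uid" in pg else pg["backend_id"]


def input_shift(num_inputs: int) -> int:
    """Return the index shift for a record_param_comms node.

    Hypothetical helper: nodes with 8 or 10 inputs carry an input tensor
    (shift 0); any other length, such as the 7 inputs of wait/barrier
    ops, shifts the index by one.
    """
    return 0 if num_inputs in (8, 10) else 1


# Made-up entries covering both key names.
old_style = {"backend_id": 1, "ranks": [0, 1], "pg_name": "0"}
new_style = {"uid": 2, "ranks": [0, 1], "pg_name": "1"}
backend_ids = [get_backend_id(pg) for pg in (old_style, new_style)]
shifts = [input_shift(n) for n in (7, 8, 10)]
```

Checking for `"uid"` first keeps traces that still use `backend_id` parseable, which is the same fallback order the diff uses.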

Differential Revision: D56091619
shengfukevin authored and facebook-github-bot committed Apr 16, 2024
1 parent 0a07342 commit 9d91451
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions train/comms/pt/commsTraceParser.py
@@ -233,7 +233,7 @@ def _parseExecutionTrace(
break

for pg in pgObj:
-backendId = pg["backend_id"]
+backendId = pg["uid"] if "uid" in pg else pg["backend_id"]
ranks = pg["ranks"]
if isinstance(ranks, list):
pgId = int(pg["pg_name"])
@@ -256,7 +256,7 @@ def _parseExecutionTrace(
for node in in_trace.nodes.values():
if node.name == "record_param_comms":
shift = (
-0 if len(node.inputs) == 8 else 1
+0 if len(node.inputs) == 8 or len(node.inputs) == 10 else 1
) # wait/barrier ops do not have an input tensor (len=7), shift index one over
newComm = commsArgs()
newComm.id = node.id