In Megatron, I found the following check that ties tp_comm_overlap to sequence_parallel:
if args.tp_comm_overlap:
    assert args.sequence_parallel == True, 'Tensor parallel communication/GEMM overlap can happen only when sequence parallelism is enabled'
But why?
That is because we currently support overlapping only AllGather and ReduceScatter with GEMM. Those are the collectives used when sequence parallelism is enabled; in the other cases tensor parallelism uses AllReduce instead, which the overlap does not cover.
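To make the distinction concrete, here is a minimal sketch in plain Python that simulates the three collectives on per-rank lists. Nothing here is Megatron code: the function names (`all_reduce`, `reduce_scatter`, `all_gather`, `tp_collectives`) are hypothetical helpers written for illustration, and no real communication takes place. It only shows that an AllReduce produces the same result as a ReduceScatter followed by an AllGather, which is the pair sequence parallelism uses around the tensor-parallel GEMMs.

```python
# Simulated collectives over a list of per-rank buffers (illustration only,
# not Megatron's or NCCL's API; all names below are hypothetical).

def all_reduce(per_rank):
    """Every rank ends up with the element-wise sum of all ranks' buffers."""
    total = [sum(vals) for vals in zip(*per_rank)]
    return [list(total) for _ in per_rank]

def reduce_scatter(per_rank):
    """Rank r ends up with only the r-th chunk of the element-wise sum."""
    world = len(per_rank)
    total = [sum(vals) for vals in zip(*per_rank)]
    chunk = len(total) // world  # assumes the length divides evenly
    return [total[r * chunk:(r + 1) * chunk] for r in range(world)]

def all_gather(per_rank):
    """Every rank ends up with the concatenation of all ranks' chunks."""
    full = [x for chunk in per_rank for x in chunk]
    return [list(full) for _ in per_rank]

def tp_collectives(sequence_parallel):
    """Collectives used around the tensor-parallel GEMMs in each mode."""
    if sequence_parallel:
        return ("all_gather", "reduce_scatter")
    return ("all_reduce",)

# Partial GEMM outputs on two tensor-parallel ranks.
partials = [[1, 2, 3, 4], [10, 20, 30, 40]]

# Without sequence parallelism: a single AllReduce.
without_sp = all_reduce(partials)

# With sequence parallelism: ReduceScatter after one GEMM, AllGather before the next.
with_sp = all_gather(reduce_scatter(partials))

print(without_sp == with_sp)
```

Since the overlap is only implemented for the AllGather and ReduceScatter pair, the assert rejects configurations where tensor parallelism would fall back to AllReduce.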