Support Replicate Parallel Operator on CPUs for Realm backend#1640

Open
seemamirch wants to merge 2 commits into flexflow:master from seemamirch:sm/realm-parallel-operators-replicate
@seemamirch seemamirch commented Apr 9, 2026

Description of changes:

Add support for the replicate op in distributed training and the Realm backend.

  • Add perform_pass_expansion_for_replicate for fwd/bwd pass expansion
  • Add perform_shard_expansion_for_replicate and _bwd for shard expansion
  • Add build_replicate_invocation in make_dynamic_open_dataflow_graph
  • Add is_replicate_attrs helper and guard replicate in copy_insertion
  • Add ReplicateAttrs to TrainingOperationAttrs
  • Add SumReductionFloat/Double for backward replicate reduce operation
  • Add issue_replicate_bwd in spawn_dynamic_node_invocation
  • Fix per_device_op_state init race condition with direct write
  • Fix .value() calls on optional per_device_op_state across unary/binary op implementations
  • Update issue_copy to support optional reduction op
  • Fix Relu to allow discard_copy_degree > 1
  • Add test case for the Replicate op
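
For readers unfamiliar with the operator, the forward/backward behaviour described in the list above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the code in this PR: `replicate_forward` and `replicate_backward` are hypothetical names, tensors are modelled as flat `std::vector<float>`, and the sum in the backward pass stands in for what the description calls the SumReductionFloat/Double reduce operation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of this PR): the forward pass of replicate
// hands an identical copy of the input to each of `replicate_degree` replicas.
std::vector<std::vector<float>> replicate_forward(std::vector<float> const &input,
                                                  std::size_t replicate_degree) {
  return std::vector<std::vector<float>>(replicate_degree, input);
}

// Hypothetical helper (not part of this PR): the backward pass sum-reduces the
// per-replica gradients elementwise into a single gradient for the input.
std::vector<float> replicate_backward(
    std::vector<std::vector<float>> const &replica_grads) {
  assert(!replica_grads.empty());
  std::vector<float> input_grad(replica_grads.front().size(), 0.0f);
  for (auto const &grad : replica_grads) {
    assert(grad.size() == input_grad.size());
    for (std::size_t i = 0; i < grad.size(); ++i) {
      input_grad[i] += grad[i]; // sum reduction across replicas
    }
  }
  return input_grad;
}
```

The backward pass needs a reduction rather than a plain copy because every replica contributes its own gradient for the same input, which is presumably why issue_copy gains an optional reduction op here.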

Seema Mirchandaney added 2 commits April 9, 2026 15:49
- Add perform_pass_expansion_for_replicate for fwd/bwd pass expansion
- Add perform_shard_expansion_for_replicate and _bwd for shard expansion
- Add build_replicate_invocation in make_dynamic_open_dataflow_graph
- Add is_replicate_attrs helper and guard replicate in copy_insertion
- Add ReplicateAttrs to TrainingOperationAttrs
- Add SumReductionFloat/Double for backward replicate reduce operation
- Add issue_replicate_bwd in spawn_dynamic_node_invocation
- Fix per_device_op_state init race condition with direct write
- Fix .value() calls on optional per_device_op_state across op impls
- Update issue_copy to support optional reduction op
- Add testcase for replicate op
@seemamirch
Author

@lockshaw @elliottslaughter - Please review

@seemamirch seemamirch force-pushed the sm/realm-parallel-operators-replicate branch from b18c75d to 3405621 Compare April 9, 2026 23:13