Feature Request
Add Python bindings and a `cuda.core` abstraction for CUDA Multicast Objects
(the `cuMulticastCreate` / `cuMulticastAddDevice` / `cuMulticastBindMem` /
`cuMulticastBindAddr` / `cuMulticastUnbind` family of driver APIs).
Motivation
Multicast Objects enable a single virtual address to multicast memory accesses
across multiple devices, which is foundational for efficient multi-GPU
collectives and SHARP-style reductions on NVLink-connected systems (Hopper
and later). They are already exposed via the low-level `cuda.bindings` driver
API, but there is no high-level `cuda.core` object wrapping them.
This is one of the last major CUDA driver features without a `cuda.core`
counterpart. Internal teams building multi-GPU libraries (RAPIDS, Triton,
cuBLASMp, etc.) currently have to drop to raw driver calls.
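For context, a rough sketch of what that raw driver-level flow looks like today via `cuda.bindings`. This is illustrative only: it assumes CUDA 12.x bindings and multicast-capable hardware (e.g. NVLink-connected Hopper GPUs), and elides error checking:

```python
# Sketch of the current low-level multicast setup via cuda.bindings
# (illustrative; field and enum names per the CUDA driver API docs).
from cuda.bindings import driver

def create_multicast(device_ids, size):
    prop = driver.CUmulticastObjectProp()
    prop.numDevices = len(device_ids)
    prop.size = size
    prop.handleTypes = (
        driver.CUmemAllocationHandleType.CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR)

    # Round the requested size up to the recommended granularity.
    err, gran = driver.cuMulticastGetGranularity(
        prop,
        driver.CUmulticastGranularity_flags.CU_MULTICAST_GRANULARITY_RECOMMENDED)
    prop.size = ((size + gran - 1) // gran) * gran

    # Create the multicast object and register each participating device.
    err, mc_handle = driver.cuMulticastCreate(prop)
    for dev in device_ids:
        (err,) = driver.cuMulticastAddDevice(mc_handle, dev)

    # Caller still has to cuMulticastBindMem per device and map the handle
    # into each device's VA space via the virtual memory management APIs.
    return mc_handle
```

Every consumer of this feature currently re-implements some variant of this boilerplate, which is the gap a `cuda.core` wrapper would close.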
Proposed Scope
- A `MulticastObject` (or similarly named) class with:
  - Constructor / factory taking granularity + participating-device list
  - `add_device(device)` / `bind_memory(buffer)` / `bind_address(...)` /
    `unbind(...)` methods
  - Context-manager lifecycle (auto-release on exit)
- Integration with existing `Device`, `Buffer`, and `VirtualMemoryResource`
- Query helpers for multicast granularity (`cuMulticastGetGranularity`)
- Example(s) under `cuda_core/examples/` demonstrating a simple 2-GPU
  multicast allreduce-style pattern
- API reference entry in `cuda_core/docs/source/api.rst`
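To make the scope concrete, here is a hypothetical usage sketch of the proposed class. All names below (`MulticastObject`, `create`, `bind_memory`, `unbind`, `mc.size`) are placeholders for discussion, not an existing `cuda.core` API, and the allocation calls are likewise only indicative:

```python
# Hypothetical usage of the proposed cuda.core wrapper (all names are
# placeholders; this API does not exist yet).
from cuda.core.experimental import Device

dev0, dev1 = Device(0), Device(1)

# Proposed factory: takes a size and the participating devices, and rounds
# the size up to the multicast granularity internally.
with MulticastObject.create(size=1 << 20, devices=[dev0, dev1]) as mc:
    # Bind per-device physical backing into the multicast object.
    buf0 = dev0.allocate(mc.size)        # placeholder allocation call
    mc.bind_memory(buf0, offset=0)       # proposed bind_memory(buffer)
    ...                                  # map mc into each device's VA space,
                                         # launch kernels against it
    mc.unbind(dev0)                      # proposed unbind(...)
# Context exit releases remaining bindings and the multicast handle.
```

The context-manager lifecycle mirrors how other `cuda.core` resources are scoped, so teardown ordering (unbind before handle release) is handled for the user.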
Related
- Tracking gaps identified in the cuda.core feature audit (Nov 2025) — other
  untracked gaps include `host_launch` and `Library` (`cuLibrary`) APIs; those
  can be filed separately.
- Relevant driver APIs: `cuMulticastCreate`, `cuMulticastAddDevice`,
  `cuMulticastBindMem`, `cuMulticastBindAddr`, `cuMulticastUnbind`,
  `cuMulticastGetGranularity`.