diff --git a/README.md b/README.md
index 6da895bbb9b..0a986bc10b0 100644
--- a/README.md
+++ b/README.md
@@ -5,8 +5,8 @@ CUDA Python is the home for accessing NVIDIA’s CUDA platform from Python. It c
* [cuda.core](https://nvidia.github.io/cuda-python/cuda-core/latest): Pythonic access to CUDA Runtime and other core functionality
* [cuda.bindings](https://nvidia.github.io/cuda-python/cuda-bindings/latest): Low-level Python bindings to CUDA C APIs
* [cuda.pathfinder](https://nvidia.github.io/cuda-python/cuda-pathfinder/latest): Utilities for locating CUDA components installed in the user's Python environment
-* [cuda.coop](https://nvidia.github.io/cccl/python/coop): A Python module providing CCCL's reusable block-wide and warp-wide *device* primitives for use within Numba CUDA kernels
-* [cuda.compute](https://nvidia.github.io/cccl/python/compute): A Python module for easy access to CCCL's highly efficient and customizable parallel algorithms, like `sort`, `scan`, `reduce`, `transform`, etc. that are callable on the *host*
+* [cuda.coop](https://nvidia.github.io/cccl/unstable/python/coop.html): A Python module providing CCCL's reusable block-wide and warp-wide *device* primitives for use within Numba CUDA kernels
+* [cuda.compute](https://nvidia.github.io/cccl/unstable/python/compute/index.html): A Python module for easy access to CCCL's highly efficient and customizable parallel algorithms, like `sort`, `scan`, `reduce`, `transform`, etc. that are callable on the *host*
* [numba.cuda](https://nvidia.github.io/numba-cuda/): A Python DSL that exposes CUDA **SIMT** programming model and compiles a restricted subset of Python code into CUDA kernels and device functions
* [cuda.tile](https://docs.nvidia.com/cuda/cutile-python/): A new Python DSL that exposes CUDA **Tile** programming model and allows users to write NumPy-like code in CUDA kernels
* [nvmath-python](https://docs.nvidia.com/cuda/nvmath-python/latest): Pythonic access to NVIDIA CPU & GPU Math Libraries, with [*host*](https://docs.nvidia.com/cuda/nvmath-python/latest/overview.html#host-apis), [*device*](https://docs.nvidia.com/cuda/nvmath-python/latest/overview.html#device-apis), and [*distributed*](https://docs.nvidia.com/cuda/nvmath-python/latest/distributed-apis/index.html) APIs. It also provides low-level Python bindings to host C APIs ([nvmath.bindings](https://docs.nvidia.com/cuda/nvmath-python/latest/bindings/index.html)).
@@ -44,4 +44,6 @@ The list of available interfaces is:
* NVRTC
* nvJitLink
* NVVM
+* nvFatbin
* cuFile
+* NVML
diff --git a/cuda_core/docs/nv-versions.json b/cuda_core/docs/nv-versions.json
index d55ec26f53f..0d0aa6276d9 100644
--- a/cuda_core/docs/nv-versions.json
+++ b/cuda_core/docs/nv-versions.json
@@ -3,6 +3,10 @@
"version": "latest",
"url": "https://nvidia.github.io/cuda-python/cuda-core/latest/"
},
+ {
+ "version": "1.0.0",
+ "url": "https://nvidia.github.io/cuda-python/cuda-core/1.0.0/"
+ },
{
"version": "0.7.0",
"url": "https://nvidia.github.io/cuda-python/cuda-core/0.7.0/"
diff --git a/cuda_core/docs/source/api.rst b/cuda_core/docs/source/api.rst
index 41ff5f179ed..74e0ad392e7 100644
--- a/cuda_core/docs/source/api.rst
+++ b/cuda_core/docs/source/api.rst
@@ -6,11 +6,10 @@
``cuda.core`` API Reference
===========================
-This is the main API reference for ``cuda.core``. The package has not yet
-reached version 1.0.0, and APIs may change between minor versions, possibly
-without deprecation warnings. Once version 1.0.0 is released, APIs will
-be considered stable and will follow semantic versioning with appropriate
-deprecation periods for breaking changes.
+This is the main API reference for ``cuda.core``. As of version 1.0.0, all
+APIs are considered stable and follow `Semantic Versioning <https://semver.org/>`_
+with appropriate deprecation periods for breaking changes. See the
+:doc:`support policy <support>` for details.
Devices and execution
@@ -242,46 +241,6 @@ execution.
checkpoint.Process
-CUDA system information and NVIDIA Management Library (NVML)
-------------------------------------------------------------
-
-.. note::
- ``cuda.core.system`` support requires ``cuda_bindings`` 12.9.6 or later, or 13.2.0 or later.
-
-Basic functions
-```````````````
-
-.. autosummary::
- :toctree: generated/
-
- system.get_driver_version
- system.get_driver_version_full
- system.get_driver_branch
- system.get_num_devices
- system.get_nvml_version
- system.get_process_name
- system.get_topology_common_ancestor
- system.get_p2p_status
-
-Events
-``````
-
-.. autosummary::
- :toctree: generated/
-
- system.register_events
-
-Types
-`````
-
-.. autosummary::
- :toctree: generated/
-
- :template: autosummary/cyclass.rst
-
- system.Device
- system.NvlinkInfo
-
Utility functions
-----------------
diff --git a/cuda_core/docs/source/api_nvml.rst b/cuda_core/docs/source/api_nvml.rst
new file mode 100644
index 00000000000..9e9ad3d5640
--- /dev/null
+++ b/cuda_core/docs/source/api_nvml.rst
@@ -0,0 +1,44 @@
+.. SPDX-FileCopyrightText: Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+.. SPDX-License-Identifier: Apache-2.0
+
+.. module:: cuda.core.system
+
+CUDA system information and NVIDIA Management Library (NVML)
+============================================================
+
+.. note::
+ ``cuda.core.system`` support requires ``cuda_bindings`` 12.9.6 or later, or 13.2.0 or later.
+
+Basic functions
+---------------
+
+.. autosummary::
+ :toctree: generated/
+
+ get_driver_version
+ get_driver_version_full
+ get_driver_branch
+ get_num_devices
+ get_nvml_version
+ get_process_name
+ get_topology_common_ancestor
+ get_p2p_status
+
+Events
+------
+
+.. autosummary::
+ :toctree: generated/
+
+ register_events
+
+Types
+-----
+
+.. autosummary::
+ :toctree: generated/
+
+ :template: autosummary/cyclass.rst
+
+ Device
+ NvlinkInfo
diff --git a/cuda_core/docs/source/index.rst b/cuda_core/docs/source/index.rst
index 3bf962d7251..9a266e20949 100644
--- a/cuda_core/docs/source/index.rst
+++ b/cuda_core/docs/source/index.rst
@@ -15,12 +15,14 @@ Welcome to the documentation for ``cuda.core``.
install
interoperability
api
+ api_nvml
environment_variables
contribute
.. toctree::
:maxdepth: 1
+ support
conduct
license
diff --git a/cuda_core/docs/source/install.rst b/cuda_core/docs/source/install.rst
index 90e2a1b5b17..05f813f9d3f 100644
--- a/cuda_core/docs/source/install.rst
+++ b/cuda_core/docs/source/install.rst
@@ -32,7 +32,7 @@ dependencies are as follows:
Free-threading Build Support
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-As of cuda-core 0.4.0, **experimental** packages for the `free-threaded interpreter`_ are shipped.
+As of cuda-core 1.0.0, **experimental** packages for the `free-threaded interpreter`_ are shipped.
1. Support for these builds is best effort, due to heavy use of `built-in
modules that are known to be thread-unsafe`_, such as ``ctypes``.
diff --git a/cuda_core/docs/source/release/1.0.0-notes.rst b/cuda_core/docs/source/release/1.0.0-notes.rst
index a0f1de1a121..99d3dca518e 100644
--- a/cuda_core/docs/source/release/1.0.0-notes.rst
+++ b/cuda_core/docs/source/release/1.0.0-notes.rst
@@ -20,11 +20,74 @@ New features
including string process state queries, lock/checkpoint/restore/unlock
operations, and GPU UUID remapping support for restore.
(`#1343 `__)
+- Added green context support (CUDA 12.4+). New types :class:`Context`,
+ :class:`ContextOptions`, :class:`SMResource`, :class:`SMResourceOptions`,
+ :class:`WorkqueueResource`, and :class:`WorkqueueResourceOptions` enable GPU
+ SM and workqueue resource partitioning. Create green contexts via
+ :meth:`Device.create_context`, then use :meth:`Context.create_stream` and
+ :attr:`Context.resources` to work within the partitioned resources.
+ (`#1976 `__)
+- Changes to the :mod:`cuda.core.system` module for NVIDIA Management Library (NVML)
+ access:
+
+ - :attr:`system.Device.mig` for querying and setting MIG mode, enumerating
+ MIG device instances, and navigating parent/child relationships.
+ (`#1916 `__)
+ - :attr:`system.Device.compute_running_processes` for querying running compute
+ processes on a device, returning :class:`~system.ProcessInfo` objects with
+ PID, GPU memory usage, and MIG instance IDs.
+ (`#1917 `__)
+ - :meth:`system.Device.get_nvlink` for querying NVLink version and state per
+ link, and :attr:`system.Device.utilization` returning current GPU and memory
+ utilization rates.
+ (`#1918 `__)
+
+- Re-wrapped NVML enums as human-readable ``StrEnum`` subclasses instead of raw
+ integer re-exports from ``cuda.bindings.nvml``. These are available in
+ ``cuda.core.system.typing``.
+ (`#2014 `__)
+- Enums are now available in places where a small number of string values are
+ accepted or returned. You may continue to use the string values, or use
+ enumerations for better linting and type-checking.
+ (`#2016 `__)
+ The new enums are:
+
+ - :class:`cuda.core.typing.CompilerBackendType`
+ - :class:`cuda.core.typing.GraphConditionalType`
+ - :class:`cuda.core.typing.GraphMemoryType`
+ - :class:`cuda.core.typing.ManagedMemoryLocationType`
+ - :class:`cuda.core.typing.ObjectCodeFormatType`
+ - :class:`cuda.core.typing.PCHStatusType`
+ - :class:`cuda.core.typing.SourceCodeType`
+ - :class:`cuda.core.typing.VirtualMemoryAccessType`
+ - :class:`cuda.core.typing.VirtualMemoryAllocationType`
+ - :class:`cuda.core.typing.VirtualMemoryGranularityType`
+ - :class:`cuda.core.typing.VirtualMemoryHandleType`
+ - :class:`cuda.core.typing.VirtualMemoryLocationType`
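The string/enum duality described above can be sketched with a small, self-contained Python example. The enum name, members, and values below are hypothetical placeholders, not the actual ``cuda.core.typing`` definitions; the point is only the pattern that makes both calling conventions interchangeable:

```python
from enum import Enum

# Minimal sketch of a str-backed enum: each member *is* a str, so APIs
# that previously accepted plain strings keep working unchanged.
# (Names and values here are illustrative, not cuda.core's.)
class CodeFormat(str, Enum):
    PTX = "ptx"
    CUBIN = "cubin"

def load(code: bytes, fmt: "CodeFormat | str") -> str:
    # Either a plain string or an enum member is accepted; CodeFormat(fmt)
    # normalizes both because members compare equal to their string values.
    return f"loaded as {CodeFormat(fmt).value}"

assert CodeFormat.PTX == "ptx"  # enum member == its string value
assert load(b"...", "cubin") == load(b"...", CodeFormat.CUBIN)
```

Passing the enum member gives linters and type checkers something to verify, while existing string-based call sites need no changes.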
Breaking changes
----------------
+- :class:`~utils.StridedMemoryView` now provides a fast path for ``torch.Tensor``
+ objects via PyTorch's AOT Inductor (AOTI) stable C ABI. When a ``torch.Tensor``
+ is passed to any ``from_*`` classmethod (``from_dlpack``,
+ ``from_cuda_array_interface``, ``from_array_interface``, or
+ ``from_any_interface``), tensor metadata is read directly from the underlying
+ C struct, bypassing the DLPack and CUDA Array Interface protocol overhead.
+ This yields ~7–20x faster ``StridedMemoryView`` construction for PyTorch
+ tensors (depending on whether stream ordering is required). Proper CUDA stream
+ ordering is established between PyTorch's current stream and the consumer
+ stream, matching the DLPack synchronization contract.
+ Requires PyTorch >= 2.3.
+
+ This is a *behavioral* breaking change: because the AOTI tensor bridge reads
+ raw metadata without re-enacting PyTorch's export guardrails, tensors that
+ PyTorch would reject at the DLPack boundary (notably ``requires_grad``,
+ conjugated, non-strided/sparse, and wrong-current-device CUDA tensors) are
+ now accepted. This is intentional — ``StridedMemoryView`` is designed for
+ low-level interop where those checks are not needed.
+ (`#749 `__)
- Renamed :class:`~graph.GraphDef` to :class:`~graph.GraphDefinition` for
consistency with the rest of the API, which spells words out (e.g.
``TensorMapDescriptor``, not ``TensorMapDesc``).
@@ -125,6 +188,63 @@ Breaking changes
- :obj:`cuda.core.typing.DevicePointerT` -> :obj:`cuda.core.typing.DevicePointerType`
- :obj:`cuda.core.typing.IsStreamT` -> :obj:`cuda.core.typing.IsStreamType`
+- Renamed and converted multiple :class:`~system.Device` properties and methods
+ for naming consistency
+ (`#1946 `__):
+
+ On :class:`~system.Device`:
+
+ - ``is_c2c_mode_enabled`` -> ``is_c2c_enabled``
+ - ``persistence_mode_enabled`` -> ``is_persistence_mode_enabled``
+ - ``clock(clock_type)`` -> ``get_clock(clock_type)``
+ - ``get_auto_boosted_clocks_enabled()`` -> ``is_auto_boosted_clocks_enabled``
+ (method -> property)
+ - ``get_current_clock_event_reasons()`` -> ``current_clock_event_reasons``
+ (method -> property)
+ - ``get_supported_clock_event_reasons()`` -> ``supported_clock_event_reasons``
+ (method -> property)
+ - ``display_mode`` -> ``is_display_connected``
+ - ``display_active`` -> ``is_display_active``
+ - ``fan(fan=0)`` -> ``get_fan(fan=0)``
+ - ``get_supported_pstates()`` -> ``supported_pstates``
+ (method -> property)
+
+ On ``PciInfo``:
+
+ - ``get_max_pcie_link_generation()`` -> ``link_generation`` (method -> property)
+ - ``get_gpu_max_pcie_link_generation()`` -> ``max_link_generation``
+ (method -> property)
+ - ``get_max_pcie_link_width()`` -> ``max_link_width`` (method -> property)
+ - ``get_current_pcie_link_generation()`` -> ``current_link_generation``
+ (method -> property)
+ - ``get_current_pcie_link_width()`` -> ``current_link_width``
+ (method -> property)
+ - ``get_pcie_throughput(counter)`` -> ``get_throughput(counter)``
+ - ``get_pcie_replay_counter()`` -> ``replay_counter`` (method -> property)
+
+ On ``Temperature``:
+
+ - ``sensor(sensor=...)`` -> ``get_sensor(sensor=...)``
+ - ``threshold(threshold_type)`` -> ``get_threshold(threshold_type)``
+ - ``thermal_settings(sensor_index)`` -> ``get_thermal_settings(sensor_index)``
+
+ On ``FanInfo``:
+
+ - ``set_default_fan_speed()`` -> ``set_default_speed()``
+
+- Removed 18 helper/data-container classes from ``cuda.core.system.__all__``:
+ ``BAR1MemoryInfo``, ``ClockInfo``, ``ClockOffsets``, ``CoolerInfo``,
+ ``DeviceAttributes``, ``DeviceEvents``, ``EventData``, ``FanInfo``,
+ ``FieldValue``, ``FieldValues``, ``GpuDynamicPstatesInfo``,
+ ``GpuDynamicPstatesUtilization``, ``InforomInfo``, ``PciInfo``,
+ ``RepairStatus``, ``Temperature``, ``ThermalSensor``, ``ThermalSettings``.
+ These classes are still returned by :class:`~system.Device` properties and
+ methods but should not be directly instantiated by users.
+ (`#1942 `__)
+- :attr:`system.Device.uuid` now returns the full NVML UUID with prefix
+ (e.g. ``GPU-...``). Use :attr:`system.Device.uuid_without_prefix` for
+ the previous behavior.
+ (`#1916 `__)
- :func:`args_viewable_as_strided_memory` and :class:`StridedMemoryView` are no
longer at the top-level in :mod:`cuda.core`. They are available publicly from the
:mod:`cuda.core.utils` module.
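The fast-path dispatch described in the ``StridedMemoryView`` entry earlier in this section can be sketched in a few lines. Everything here is illustrative, not the cuda.core implementation: a constructor probes for one known concrete type and reads its metadata directly, falling back to the generic exchange-protocol path otherwise.

```python
# Stand-in for torch.Tensor; the real fast path reads metadata from a
# stable C struct instead of Python attributes.
class FakeTensor:
    shape, strides = (2, 3), (3, 1)

def view_from_any(obj):
    if isinstance(obj, FakeTensor):      # fast path: direct metadata read,
        return ("fast_path", obj.shape, obj.strides)  # no protocol round-trip
    if hasattr(obj, "__dlpack__"):       # generic DLPack-protocol path
        return ("dlpack", None, None)
    raise TypeError("unsupported object")

assert view_from_any(FakeTensor())[0] == "fast_path"
```

The trade-off noted above falls out of this shape: the fast path skips whatever validation the generic protocol's producer would have performed.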
@@ -133,33 +253,29 @@ Breaking changes
Fixes and enhancements
-----------------------
-- :class:`~utils.StridedMemoryView` now provides a fast path for ``torch.Tensor``
- objects via PyTorch's AOT Inductor (AOTI) stable C ABI. When a ``torch.Tensor``
- is passed to any ``from_*`` classmethod (``from_dlpack``,
- ``from_cuda_array_interface``, ``from_array_interface``, or
- ``from_any_interface``), tensor metadata is read directly from the underlying
- C struct, bypassing the DLPack and CUDA Array Interface protocol overhead.
- This yields ~7-20x faster ``StridedMemoryView`` construction for PyTorch
- tensors (depending on whether stream ordering is required). Proper CUDA stream ordering is established between PyTorch's current
- stream and the consumer stream, matching the DLPack synchronization contract.
- Requires PyTorch >= 2.3.
- (`#749 `__)
-
-- Enums are not available in places where a small number of string values are
- accepted or returned. You may continue to use the string values, or use
- enumerations for better linting and type-checking.
- (`#2016 `__)
- The new enums are:
-
- - :class:`cuda.core.typing.CompilerBackendType`
- - :class:`cuda.core.typing.GraphConditionalType`
- - :class:`cuda.core.typing.GraphMemoryType`
- - :class:`cuda.core.typing.ManagedMemoryLocationType`
- - :class:`cuda.core.typing.ObjectCodeFormatType`
- - :class:`cuda.core.typing.PCHStatusType`
- - :class:`cuda.core.typing.SourceCodeType`
- - :class:`cuda.core.typing.VirtualMemoryAccessType`
- - :class:`cuda.core.typing.VirtualMemoryAllocationType`
- - :class:`cuda.core.typing.VirtualMemoryGranularityType`
- - :class:`cuda.core.typing.VirtualMemoryHandleType`
- - :class:`cuda.core.typing.VirtualMemoryLocationType`
+- Fixed :attr:`Buffer.is_managed` returning ``False`` for pool-allocated managed
+ memory (:class:`ManagedMemoryResource`), which caused DLPack interop to
+ misclassify managed buffers as ``kDLCUDAHost``. The fix queries both the
+ driver pointer attribute and the memory resource.
+ (`#1924 `__)
+- :attr:`system.Device.arch` now returns ``UNKNOWN`` instead of raising
+ ``ValueError`` when NVML reports an architecture not yet in the enum.
+ (`#1937 `__)
+- :meth:`system.Device.get_field_values` and
+ :meth:`system.Device.clear_field_values` with an empty list no longer raise
+ ``InvalidArgumentError``.
+ (`#1982 `__)
+- :class:`Linker` error and info log retrieval now properly checks return codes
+ from nvJitLink, raising exceptions on failure instead of silently ignoring
+ errors.
+ (`#1993 `__)
+- Fixed a potential crash when NVML event set creation failed on Windows, due to
+ ``__dealloc__`` freeing an uninitialized handle.
+ (`#1992 `__)
+- CUDA Runtime error messages are now more reliable, especially on Windows
+ where the runtime DLL name table could disagree with the installed bindings.
+ (`#2003 `__)
+- Linux release wheels are now stripped of debug symbols, significantly reducing
+ package size. Debug builds are now supported via
+ ``--config-settings=debug=true``.
+ (`#1890 `__)
diff --git a/cuda_core/docs/source/support.rst b/cuda_core/docs/source/support.rst
new file mode 100644
index 00000000000..38d91368586
--- /dev/null
+++ b/cuda_core/docs/source/support.rst
@@ -0,0 +1,79 @@
+.. SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+.. SPDX-License-Identifier: Apache-2.0
+
+.. _cuda-core-support:
+
+``cuda.core`` Support Policy
+============================
+
+Versioning Scheme
+-----------------
+
+``cuda.core`` follows `Semantic Versioning (SemVer) <https://semver.org/>`_ with the version
+format ``major.minor.patch``:
+
+- **Major**: Bumped when a new CUDA major release is out and support for the oldest CUDA major
+ version is dropped. Breaking API changes only happen at major-version boundaries.
+- **Minor**: Bumped when new, backward-compatible features are added, or when a new Python feature
+ release is out and the oldest supported Python version reaches EOL.
+- **Patch**: Bumped for bug fixes and backward-compatible maintenance updates.
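The contract in the bullets above can be sketched as a tiny compatibility check. The version strings are hypothetical examples, not real releases:

```python
# Sketch of the major.minor.patch contract: breaking API changes
# require a major-version bump; minor and patch stay compatible.
def parse(version: str) -> tuple[int, int, int]:
    major, minor, patch = (int(p) for p in version.split("."))
    return major, minor, patch

def breaking_change_allowed(old: str, new: str) -> bool:
    # Under SemVer, only a major bump may break existing APIs.
    return parse(new)[0] > parse(old)[0]

assert not breaking_change_allowed("1.0.0", "1.2.3")  # minor/patch: compatible
assert breaking_change_allowed("1.2.3", "2.0.0")      # major bump: may break
```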
+
+Unlike ``cuda.bindings``, the ``cuda.core`` version is *not* aligned with the CUDA Toolkit version.
+Consult the table below or the :doc:`release notes <release>` to determine which CUDA versions are
+supported by a given ``cuda.core`` release.
+
+CUDA Version Support
+--------------------
+
+``cuda.core`` is actively maintained to support the two (2) most recent CUDA major versions. For
+example, ``cuda.core`` 1.x supports CUDA 12 and 13. Any fix in the latest release would be
+backported as needed.
+
+When a new CUDA major version is released and support for the oldest major version is dropped,
+``cuda.core`` will release a new major version (e.g., 1.x → 2.0.0).
+
+.. list-table:: CUDA Version Support Matrix
+ :header-rows: 1
+
+ * - ``cuda.core`` version
+ - Supported CUDA versions
+ * - 1.x
+ - 12, 13
+
+As with any CUDA library, certain features may impose additional requirements on
+the minimum ``cuda-bindings`` or CUDA driver version. Refer to the individual
+module documentation for details.
+
+Python Version Support
+----------------------
+
+``cuda.core`` supports all Python versions following the `CPython EOL schedule
+<https://devguide.python.org/versions/>`_. As of this writing, Python 3.10 – 3.14 are supported.
+
+When a new Python feature version is released and the oldest supported version reaches EOL,
+``cuda.core`` will bump its minor version accordingly.
+
+Free-threading Build Support
+----------------------------
+
+As of ``cuda.core`` 1.0.0, wheels for the `free-threaded interpreter
+<https://docs.python.org/3/howto/free-threading-python.html>`_ are shipped to PyPI. This support
+is currently *experimental*.
+
+1. For now, you are responsible for making sure that calls into the underlying CUDA libraries
+ are thread-safe. This is subject to change.
+
+Release Cadence
+---------------
+
+- ``cuda.core`` follows its own release cadence, independent of CUDA Toolkit releases, as long as
+ SemVer guarantees are maintained.
+- We currently aim for bimonthly releases, though this is subject to change.
+- Major version releases are aligned to CUDA major version releases.
+- New features may be delivered in minor releases at any time; they are not gated by the CUDA
+  Toolkit release schedule.
+
+----
+
+The NVIDIA CUDA Python team reserves the right to amend the above support policy. Any major changes,
+however, will be announced to users in advance.
diff --git a/cuda_python/DESCRIPTION.rst b/cuda_python/DESCRIPTION.rst
index 6120a568023..90bf5c127a4 100644
--- a/cuda_python/DESCRIPTION.rst
+++ b/cuda_python/DESCRIPTION.rst
@@ -10,8 +10,8 @@ CUDA Python is the home for accessing NVIDIA's CUDA platform from Python. It con
* `cuda.core <https://nvidia.github.io/cuda-python/cuda-core/latest>`_: Pythonic access to CUDA Runtime and other core functionality
* `cuda.bindings <https://nvidia.github.io/cuda-python/cuda-bindings/latest>`_: Low-level Python bindings to CUDA C APIs
* `cuda.pathfinder <https://nvidia.github.io/cuda-python/cuda-pathfinder/latest>`_: Utilities for locating CUDA components installed in the user's Python environment
-* `cuda.coop <https://nvidia.github.io/cccl/python/coop>`_: A Python module providing CCCL's reusable block-wide and warp-wide *device* primitives for use within Numba CUDA kernels
-* `cuda.compute <https://nvidia.github.io/cccl/python/compute>`_: A Python module for easy access to CCCL's highly efficient and customizable parallel algorithms, like ``sort``, ``scan``, ``reduce``, ``transform``, etc. that are callable on the *host*
+* `cuda.coop <https://nvidia.github.io/cccl/unstable/python/coop.html>`_: A Python module providing CCCL's reusable block-wide and warp-wide *device* primitives for use within Numba CUDA kernels
+* `cuda.compute <https://nvidia.github.io/cccl/unstable/python/compute/index.html>`_: A Python module for easy access to CCCL's highly efficient and customizable parallel algorithms, like ``sort``, ``scan``, ``reduce``, ``transform``, etc. that are callable on the *host*
* `numba.cuda <https://nvidia.github.io/numba-cuda/>`_: A Python DSL that exposes CUDA **SIMT** programming model and compiles a restricted subset of Python code into CUDA kernels and device functions
* `cuda.tile <https://docs.nvidia.com/cuda/cutile-python/>`_: A new Python DSL that exposes CUDA **Tile** programming model and allows users to write NumPy-like code in CUDA kernels
* `nvmath-python <https://docs.nvidia.com/cuda/nvmath-python/latest>`_: Pythonic access to NVIDIA CPU & GPU Math Libraries, with `host <https://docs.nvidia.com/cuda/nvmath-python/latest/overview.html#host-apis>`_, `device <https://docs.nvidia.com/cuda/nvmath-python/latest/overview.html#device-apis>`_, and `distributed <https://docs.nvidia.com/cuda/nvmath-python/latest/distributed-apis/index.html>`_ APIs. It also provides low-level Python bindings to host C APIs (`nvmath.bindings <https://docs.nvidia.com/cuda/nvmath-python/latest/bindings/index.html>`_).
@@ -52,4 +52,6 @@ The list of available interfaces is:
* NVRTC
* nvJitLink
* NVVM
+* nvFatbin
* cuFile
+* NVML
diff --git a/cuda_python/docs/source/index.rst b/cuda_python/docs/source/index.rst
index 7aad94ef9c4..458a7a03229 100644
--- a/cuda_python/docs/source/index.rst
+++ b/cuda_python/docs/source/index.rst
@@ -20,8 +20,8 @@ multiple components:
- `CUPTI Python`_: Python APIs for creation of profiling tools that target CUDA Python applications via the CUDA Profiling Tools Interface (CUPTI)
- `Accelerated Computing Hub`_: Open-source learning materials related to GPU computing. You will find user guides, tutorials, and other works freely available for all learners interested in GPU computing.
-.. _cuda.coop: https://nvidia.github.io/cccl/python/coop
-.. _cuda.compute: https://nvidia.github.io/cccl/python/compute
+.. _cuda.coop: https://nvidia.github.io/cccl/unstable/python/coop.html
+.. _cuda.compute: https://nvidia.github.io/cccl/unstable/python/compute/index.html
.. _numba.cuda: https://nvidia.github.io/numba-cuda/
.. _cuda.tile: https://docs.nvidia.com/cuda/cutile-python/
.. _nvmath-python: https://docs.nvidia.com/cuda/nvmath-python/latest
@@ -50,8 +50,8 @@ be available, please refer to the `cuda.bindings`_ documentation for installatio
cuda.core
cuda.bindings
cuda.pathfinder
- cuda.coop
- cuda.compute
+ cuda.coop
+ cuda.compute
numba.cuda
cuda.tile
nvmath-python