From eac7298b0facf1b68c8f2615b9504e254ae58aa3 Mon Sep 17 00:00:00 2001
From: Karl Gyllstrom
Date: Mon, 9 Mar 2026 11:14:18 -0700
Subject: [PATCH] Fix [[nodiscard]] build errors and BUCK deps across comms, gloo, caffe2 (#494)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Summary:
Pull Request resolved: https://github.com/pytorch/gloo/pull/494

X-link: https://github.com/meta-pytorch/torchcomms/pull/960

X-link: https://github.com/pytorch/pytorch/pull/176671

ROCm 7.0+ HIP headers annotate API functions (hipStreamDestroy, hipMemcpyAsync, hipStreamSynchronize, hipSetDevice, hipGetDevice, hipFree, hipHostUnregister, hipDeviceEnablePeerAccess, cuGetErrorString) with [[nodiscard]]. Combined with -Werror, this causes build failures wherever return values are discarded.

Originally discovered building with ROCm 7.2 headers, but confirmed to also affect ROCm 7.0 builds (reported independently by yvliu and hqguo). The [[nodiscard]] attribute is present in both ROCm 7.0 and 7.2 HIP headers — the fix is the same for both versions.

Changes:
- Add (void) casts to suppress [[nodiscard]] warnings across comms/ (tcp_devmem, ctran, rcclx), gloo/, and caffe2/ (nativert) — 12 C++ files
- Fix BUCK dependency issues in comms/tcp_devmem/nccl (replace devmgr-client with common:common) and comms/tcp_devmem/unpack (explicit glog dep path) that surface when building these targets under ROCm constraints

The (void) casts are no-ops on CUDA and older ROCm — safe to land regardless of ROCm version.

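Illustration (not part of the change itself): a minimal, self-contained sketch of the failure mode and the fix described above. Every name in it (FakeError, fakeEnablePeerAccess, fakeGetLastError, enablePeerAccessBestEffort) is a hypothetical stand-in, not a HIP or CUDA API. Treating "already enabled" as acceptable is an assumption made for the sketch, not something this patch states.

```cpp
#include <cassert>

// Hypothetical stand-ins for a HIP/CUDA-style runtime; not the real API.
enum class FakeError { Success, PeerAccessAlreadyEnabled };

namespace {
FakeError g_lastError = FakeError::Success;  // sticky "last error" slot
bool g_peerEnabled = false;                  // whether peer access is already on
}  // namespace

// Mimics an API function annotated with [[nodiscard]].
[[nodiscard]] FakeError fakeEnablePeerAccess(int /*peerDevice*/) {
  g_lastError = g_peerEnabled ? FakeError::PeerAccessAlreadyEnabled
                              : FakeError::Success;
  g_peerEnabled = true;
  return g_lastError;
}

// Returns the last recorded error and resets it to Success.
[[nodiscard]] FakeError fakeGetLastError() {
  FakeError err = g_lastError;
  g_lastError = FakeError::Success;
  return err;
}

// Returns true if peer access is usable after the call.
bool enablePeerAccessBestEffort(int peerDevice) {
  // Writing this call without the (void) cast discards a [[nodiscard]]
  // result, which compilers warn about and -Werror turns into a build error.
  (void)fakeEnablePeerAccess(peerDevice);

  // The status is read (and cleared) through the last-error query instead.
  FakeError err = fakeGetLastError();
  return err == FakeError::Success ||
         err == FakeError::PeerAccessAlreadyEnabled;
}
```

The cast only tells the compiler the discard is intentional; it generates no code, which is why the same pattern is harmless where the functions are not annotated.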
Reviewed By: bbeckca

Differential Revision: D93759269
---
 gloo/cuda_collectives_native.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gloo/cuda_collectives_native.h b/gloo/cuda_collectives_native.h
index e6c45c401..dbc62cb3f 100644
--- a/gloo/cuda_collectives_native.h
+++ b/gloo/cuda_collectives_native.h
@@ -83,7 +83,7 @@ class CudaLocalNativeReduce : public LocalOp {
 
         // Enable peer access for devA to memory on devB
         CUDA_CHECK(cudaSetDevice(devA));
-        cudaDeviceEnablePeerAccess(devB, 0);
+        (void)cudaDeviceEnablePeerAccess(devB, 0);
 
         // Use cudaGetLastError so that any error is cleared.
         auto err = cudaGetLastError();
@@ -196,7 +196,7 @@ class CudaLocalNativeBroadcast : public LocalOp {
 
         // Enable peer access for devA to memory on devB
         CUDA_CHECK(cudaSetDevice(devA));
-        cudaDeviceEnablePeerAccess(devB, 0);
+        (void)cudaDeviceEnablePeerAccess(devB, 0);
 
         // Use cudaGetLastError so that any error is cleared.
         auto err = cudaGetLastError();