From 4be47143fddb000392a0183d8e619a38f0bf6dbe Mon Sep 17 00:00:00 2001
From: Leo Fang <leof@nvidia.com>
Date: Thu, 7 May 2026 16:14:06 +0000
Subject: [PATCH] Fix sccache summary to count CUDA sub-tool hits/misses

The rapidsai/sccache fork tracks CUDA compilation sub-phases under
separate language keys in the JSON stats: "CUDA" (nvcc driver),
"CUDA (Device code)" (cudafe++), "PTX" (cicc), and "CUBIN" (ptxas).

Of these, the summary script counted only the "CUDA" key, which
represents just the top-level nvcc pass and typically shows 0 cache
hits.  The actual cache hits land in the sub-tool categories, so the
step summary reported a 0% hit rate even when sccache's own stats
showed ~34%.

Include all CUDA-related language keys so the reported rate matches
sccache's own "Cache hits rate" output.
---
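Reviewer note (not part of the commit message): a minimal sketch of the
effect described above.  The helper name hit_rate and the numbers in
sample_stats are hypothetical, chosen only to reproduce the 0% versus
~34% gap.  The sketch also assumes that "cache_misses" uses the same
"counts" layout as "cache_hits", which this diff does not show.

```python
# Language keys counted after this patch (taken from the diff).
COUNTED_LANGUAGES = {"C/C++", "CUDA", "CUDA (Device code)", "PTX", "CUBIN"}


def hit_rate(stats, languages):
    """Return the cache hit rate in percent over the given language keys."""

    def total(field):
        # Assumption: both fields expose a per-language "counts" mapping.
        counts = stats.get(field, {}).get("counts", {})
        return sum(v for k, v in counts.items() if k in languages)

    hits = total("cache_hits")
    misses = total("cache_misses")
    denominator = hits + misses
    # Guard against an empty stats file.
    return 100.0 * hits / denominator if denominator else 0.0


# Hypothetical stats: every hit lands in a sub-tool key, none under "CUDA".
sample_stats = {
    "cache_hits": {
        "counts": {"CUDA (Device code)": 10, "PTX": 12, "CUBIN": 12},
    },
    "cache_misses": {
        "counts": {"CUDA": 34, "CUDA (Device code)": 11, "PTX": 11, "CUBIN": 10},
    },
}

old_rate = hit_rate(sample_stats, {"C/C++", "CUDA"})  # pre-patch key set
new_rate = hit_rate(sample_stats, COUNTED_LANGUAGES)  # post-patch key set
print(f"old: {old_rate:.1f}%  new: {new_rate:.1f}%")
```

With the pre-patch key set the sub-tool hits are never summed, so the
rate stays at zero no matter how many of them there are.
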
 .github/actions/sccache-summary/action.yml | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/.github/actions/sccache-summary/action.yml b/.github/actions/sccache-summary/action.yml
index 5881f6a3ca5..28734116f62 100644
--- a/.github/actions/sccache-summary/action.yml
+++ b/.github/actions/sccache-summary/action.yml
@@ -6,8 +6,6 @@ name: sccache summary
 description: Parse sccache stats JSON and write a summary table to GITHUB_STEP_SUMMARY
 
 # Inspired by NVIDIA/cccl's prepare-execution-summary.py (PR #3621).
-# Only counts C/C++ and CUDA language hits (excludes PTX/CUBIN which are
-# not included in sccache's compile_requests counter).
 
 inputs:
   json-file:
@@ -47,10 +45,11 @@ runs:
         with open(json_file) as f:
             stats = json.load(f)["stats"]
 
-        # compile_requests includes non-compilation calls (linker, etc).
-        # Use cache_hits + cache_misses as the denominator to match sccache's
-        # own "Cache hits rate" which only counts actual compilation requests.
-        counted_languages = {"C/C++", "CUDA"}
+        # compile_requests only counts top-level nvcc invocations, but each
+        # invocation spawns sub-tool compilations (cudafe++, cicc, ptxas) that
+        # sccache tracks under separate language keys.  Count all of them so
+        # the reported rate matches sccache's own "Cache hits rate".
+        counted_languages = {"C/C++", "CUDA", "CUDA (Device code)", "PTX", "CUBIN"}
         hits = sum(
             v for k, v in stats.get("cache_hits", {}).get("counts", {}).items()
             if k in counted_languages