From 4be47143fddb000392a0183d8e619a38f0bf6dbe Mon Sep 17 00:00:00 2001
From: Leo Fang
Date: Thu, 7 May 2026 16:14:06 +0000
Subject: [PATCH] Fix sccache summary to count CUDA sub-tool hits/misses

The rapidsai/sccache fork tracks CUDA compilation sub-phases under
separate language keys in the JSON stats: "CUDA" (nvcc driver),
"CUDA (Device code)" (cudafe++), "PTX" (cicc), and "CUBIN" (ptxas).

The summary script only counted the "CUDA" key, which represents just
the top-level nvcc pass and typically shows 0 cache hits. All the
actual cache hits land in the sub-tool categories. This caused the
step summary to report 0% hit rate even when sccache's own stats
showed ~34%.

Include all CUDA-related language keys so the reported rate matches
sccache's own "Cache hits rate" output.
---
 .github/actions/sccache-summary/action.yml | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/.github/actions/sccache-summary/action.yml b/.github/actions/sccache-summary/action.yml
index 5881f6a3ca5..28734116f62 100644
--- a/.github/actions/sccache-summary/action.yml
+++ b/.github/actions/sccache-summary/action.yml
@@ -6,8 +6,6 @@ name: sccache summary
 description: Parse sccache stats JSON and write a summary table to GITHUB_STEP_SUMMARY
 
 # Inspired by NVIDIA/cccl's prepare-execution-summary.py (PR #3621).
-# Only counts C/C++ and CUDA language hits (excludes PTX/CUBIN which are
-# not included in sccache's compile_requests counter).
 
 inputs:
   json-file:
@@ -47,10 +45,11 @@ runs:
         with open(json_file) as f:
             stats = json.load(f)["stats"]
 
-        # compile_requests includes non-compilation calls (linker, etc).
-        # Use cache_hits + cache_misses as the denominator to match sccache's
-        # own "Cache hits rate" which only counts actual compilation requests.
-        counted_languages = {"C/C++", "CUDA"}
+        # compile_requests only counts top-level nvcc invocations, but each
+        # invocation spawns sub-tool compilations (cudafe++, cicc, ptxas) that
+        # sccache tracks under separate language keys. Count all of them so
+        # the reported rate matches sccache's own "Cache hits rate".
+        counted_languages = {"C/C++", "CUDA", "CUDA (Device code)", "PTX", "CUBIN"}
         hits = sum(
             v for k, v in stats.get("cache_hits", {}).get("counts", {}).items()
             if k in counted_languages
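
Reviewer note: the effect of widening `counted_languages` can be sketched in isolation. The snippet below is a minimal illustration, not the action's actual script. It assumes that `cache_misses` has the same `{"counts": {...}}` shape as `cache_hits` (only the `cache_hits` lookup is visible in this diff). The helper names `count_for` and `hit_rate` are hypothetical, and the sample numbers are made up to mirror the 0% vs ~34% discrepancy described in the commit message.

```python
import json

# Language keys from the patch; the old set was {"C/C++", "CUDA"}.
OLD_LANGUAGES = {"C/C++", "CUDA"}
NEW_LANGUAGES = {"C/C++", "CUDA", "CUDA (Device code)", "PTX", "CUBIN"}


def count_for(stats, key, languages):
    """Sum the per-language counts under stats[key]["counts"] (hypothetical helper)."""
    counts = stats.get(key, {}).get("counts", {})
    return sum(v for k, v in counts.items() if k in languages)


def hit_rate(stats, languages):
    """Hit rate in percent, using hits + misses as the denominator."""
    hits = count_for(stats, "cache_hits", languages)
    misses = count_for(stats, "cache_misses", languages)
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


# Made-up stats for illustration only; the "cache_misses" layout is assumed.
sample = json.loads("""
{
  "stats": {
    "cache_hits": {
      "counts": {"CUDA": 0, "CUDA (Device code)": 10, "PTX": 12, "CUBIN": 12}
    },
    "cache_misses": {
      "counts": {"C/C++": 6, "CUDA": 20, "CUDA (Device code)": 10,
                 "PTX": 15, "CUBIN": 15}
    }
  }
}
""")["stats"]

old_rate = hit_rate(sample, OLD_LANGUAGES)  # only the top-level keys
new_rate = hit_rate(sample, NEW_LANGUAGES)  # includes the sub-tool keys
print(f"old: {old_rate:.1f}%  new: {new_rate:.1f}%")
```

With these invented counts, the old key set sees no hits at all because every hit is recorded under a sub-tool key, which is the failure mode the commit message describes.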