
Commit 4a85b8b

dev: Add LLVM source-based code coverage support
Add support for LLVM source-based code coverage instrumentation, enabling
developers and CI to generate coverage reports for the Core Lightning codebase.

Build System:
- Add coverage-clang-collect and coverage-clang-report Makefile targets
- Fix missing endif for PYTEST_TESTS conditional
- Respect CARGO_TARGET_DIR environment variable for Rust builds

Coverage Scripts (contrib/coverage/):
- collect-coverage.sh: Merges .profraw files with validation and batching
- generate-coverage-report.sh: Generates HTML reports using llvm-cov

CI Workflow:
- coverage.yaml: Daily workflow for building, testing, and publishing reports

Usage:
    ./configure --enable-coverage CC=clang
    make -j$(nproc)
    CLN_COVERAGE_DIR=/tmp/cln-coverage make pytest
    make coverage-clang

Changelog-Changed: Added LLVM source-based code coverage support with CI integration
1 parent 96adac4 commit 4a85b8b

7 files changed: 273 additions & 8 deletions

.github/workflows/coverage.yaml

Lines changed: 55 additions & 0 deletions
```yaml
---
name: CLN static dev resources
on:
  schedule:
    - cron: "37 23 * * *"
  workflow_dispatch:

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
  contents: read
  pages: write
  id-token: write

# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  gen-n-upload:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
          fetchLocal: ["master"]

      - name: Setup Pages
        uses: actions/configure-pages@v4

      - name: Rebase locally
        run: git rebase master 20240102-coverage

      - name: Generate coverage report
        run: |
          make distclean coverage-clean
          ./configure --enable-coverage --disable-valgrind CC=clang
          make -j $(nproc)
          PYTEST_PAR=$(nproc) make pytest || true
          make coverage

      - name: Collect static webpage
        run: |
          mkdir -p site
          mv coverage/html site/coverage

      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: 'site'

      - name: Upload to `gh-pages`
        id: deployment
        uses: actions/deploy-pages@v4
```
.gitignore

Lines changed: 0 additions & 1 deletion
```diff
@@ -23,7 +23,6 @@ gen_*.c
 gen_*.h
 wire/gen_*_csv
 cli/lightning-cli
-coverage
 # Coverage profiling data files
 *.profraw
 *.profdata
```
Makefile

Lines changed: 4 additions & 6 deletions
```diff
@@ -67,10 +67,6 @@ else
 DEV_CFLAGS=
 endif

-ifeq ($(COVERAGE),1)
-COVFLAGS = --coverage
-endif
-
 ifeq ($(CLANG_COVERAGE),1)
 COVFLAGS+=-fprofile-instr-generate -fcoverage-mapping
 endif
@@ -378,10 +374,12 @@ endif
 RUST_PROFILE ?= debug

 # Cargo places cross compiled packages in a different directory, using the target triple
+# Respect CARGO_TARGET_DIR if set in the environment
+CARGO_BASE_DIR = $(or $(CARGO_TARGET_DIR),target)
 ifeq ($(TARGET),)
-RUST_TARGET_DIR = target/$(RUST_PROFILE)
+RUST_TARGET_DIR = $(CARGO_BASE_DIR)/$(RUST_PROFILE)
 else
-RUST_TARGET_DIR = target/$(TARGET)/$(RUST_PROFILE)
+RUST_TARGET_DIR = $(CARGO_BASE_DIR)/$(TARGET)/$(RUST_PROFILE)
 endif

 ifneq ($(RUST_PROFILE),debug)
```
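The `$(or $(CARGO_TARGET_DIR),target)` idiom falls back to Cargo's default `target` directory when the environment variable is unset. As a rough sketch (not part of the commit), the same `RUST_TARGET_DIR` computation can be expressed with shell default expansion:

```shell
#!/bin/bash
# Sketch of the Makefile logic: respect CARGO_TARGET_DIR if set,
# otherwise fall back to Cargo's default "target" directory, then
# append the target triple (if cross compiling) and the profile.
compute_rust_target_dir() {
    local profile="${RUST_PROFILE:-debug}"
    local base="${CARGO_TARGET_DIR:-target}"
    if [ -z "${TARGET:-}" ]; then
        echo "$base/$profile"
    else
        echo "$base/$TARGET/$profile"
    fi
}

compute_rust_target_dir  # e.g. "target/debug" when nothing is overridden
```

This mirrors why the diff introduces `CARGO_BASE_DIR` once and reuses it in both branches of the `ifeq`.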
contrib/coverage/collect-coverage.sh

Lines changed: 116 additions & 0 deletions

```bash
#!/bin/bash -eu
# Merge all .profraw files into a single .profdata file
# Usage: ./collect-coverage.sh [COVERAGE_DIR] [OUTPUT_FILE]

COVERAGE_DIR="${1:-${CLN_COVERAGE_DIR:-/tmp/cln-coverage}}"
OUTPUT="${2:-coverage/merged.profdata}"

echo "Collecting coverage from: $COVERAGE_DIR"

# Find all profraw files
mapfile -t PROFRAW_FILES < <(find "$COVERAGE_DIR" -name "*.profraw" 2>/dev/null || true)

if [ ${#PROFRAW_FILES[@]} -eq 0 ]; then
    echo "ERROR: No .profraw files found in $COVERAGE_DIR"
    exit 1
fi

echo "Found ${#PROFRAW_FILES[@]} profile files"

# Validate each profraw file and filter out corrupt/incomplete ones
# Define validation function for parallel execution
validate_file() {
    local profraw="$1"

    # Check if file is empty
    if [ ! -s "$profraw" ]; then
        return 1 # Empty
    fi

    # Check if file is suspiciously small (likely incomplete write)
    # Valid profraw files are typically > 1KB
    filesize=$(stat -c%s "$profraw" 2>/dev/null || stat -f%z "$profraw" 2>/dev/null)
    if [ "$filesize" -lt 1024 ]; then
        return 2 # Too small
    fi

    # Try to validate the file by checking if llvm-profdata can read it
    if llvm-profdata show "$profraw" >/dev/null 2>&1; then
        echo "$profraw" # Valid - output to stdout
        return 0
    else
        return 3 # Corrupt
    fi
}

# Export function for parallel execution
export -f validate_file

TOTAL=${#PROFRAW_FILES[@]}
NPROC=$(nproc 2>/dev/null || echo 4)
echo "Validating ${TOTAL} files in parallel (using ${NPROC} cores)..."

# Run validation in parallel and collect valid files
mapfile -t VALID_FILES < <(
    printf '%s\n' "${PROFRAW_FILES[@]}" | \
        xargs -P "$NPROC" -I {} bash -c 'validate_file "$@"' _ {}
)

# Calculate error counts
CORRUPT_COUNT=$((TOTAL - ${#VALID_FILES[@]}))

if [ ${#VALID_FILES[@]} -eq 0 ]; then
    echo "ERROR: No valid .profraw files found (all $CORRUPT_COUNT files were corrupt/incomplete)"
    exit 1
fi

echo "Valid files: ${#VALID_FILES[@]}"
if [ $CORRUPT_COUNT -gt 0 ]; then
    echo "Filtered out: $CORRUPT_COUNT files (empty/small/corrupt)"
fi
mkdir -p "$(dirname "$OUTPUT")"

# Merge with -sparse flag for efficiency
# Use batched merging to avoid "Argument list too long" errors
BATCH_SIZE=500
TOTAL_FILES=${#VALID_FILES[@]}

if [ "$TOTAL_FILES" -le "$BATCH_SIZE" ]; then
    # Small enough to merge in one go
    echo "Merging ${TOTAL_FILES} files..."
    llvm-profdata merge -sparse "${VALID_FILES[@]}" -o "$OUTPUT"
else
    # Need to merge in batches
    echo "Merging ${TOTAL_FILES} files in batches of ${BATCH_SIZE}..."

    # Create temp directory for intermediate files
    TEMP_DIR=$(mktemp -d "${TMPDIR:-/tmp}/profdata-merge.XXXXXX")
    trap 'rm -rf "$TEMP_DIR"' EXIT

    BATCH_NUM=0
    INTERMEDIATE_FILES=()

    # Merge files in batches
    for ((i=0; i<TOTAL_FILES; i+=BATCH_SIZE)); do
        BATCH_NUM=$((BATCH_NUM + 1))
        END=$((i + BATCH_SIZE))
        if [ "$END" -gt "$TOTAL_FILES" ]; then
            END=$TOTAL_FILES
        fi

        BATCH_FILES=("${VALID_FILES[@]:$i:$BATCH_SIZE}")
        INTERMEDIATE="$TEMP_DIR/batch-$BATCH_NUM.profdata"

        echo "  Batch $BATCH_NUM: merging files $((i+1))-$END..."
        llvm-profdata merge -sparse "${BATCH_FILES[@]}" -o "$INTERMEDIATE"
        INTERMEDIATE_FILES+=("$INTERMEDIATE")
    done

    # Merge all intermediate files into final output
    echo "Merging ${#INTERMEDIATE_FILES[@]} intermediate files into final output..."
    llvm-profdata merge -sparse "${INTERMEDIATE_FILES[@]}" -o "$OUTPUT"

    # Cleanup handled by trap
fi

echo "✓ Merged profile: $OUTPUT"
```
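The batch arithmetic in the merge loop is independent of `llvm-profdata`, so it can be exercised on its own. This simplified sketch (not the script itself; the batch size and file names are illustrative) partitions a file list with the same bash array-slicing pattern the script uses before each intermediate merge:

```shell
#!/bin/bash
# Simplified sketch of the batching above: split a list of files
# into chunks of BATCH_SIZE via "${files[@]:offset:length}", one
# echoed line per batch (where the script would run a merge).
partition_batches() {
    local batch_size="$1"; shift
    local files=("$@")
    local total=${#files[@]}
    local i
    for ((i = 0; i < total; i += batch_size)); do
        echo "${files[@]:i:batch_size}"
    done
}

# Five files in batches of two -> three batches
partition_batches 2 a.profraw b.profraw c.profraw d.profraw e.profraw
```

The slice expansion never exceeds `batch_size` words, which is what keeps each real `llvm-profdata merge` invocation under the kernel's argument-length limit.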
contrib/coverage/generate-coverage-report.sh

Lines changed: 58 additions & 0 deletions

```bash
#!/bin/bash -eu
# Generate HTML and text coverage reports from merged profile data
# Usage: ./generate-coverage-report.sh [PROFDATA_FILE] [OUTPUT_DIR]

PROFDATA="${1:-coverage/merged.profdata}"
OUTPUT_DIR="${2:-coverage/html}"

if [ ! -f "$PROFDATA" ]; then
    echo "ERROR: Profile not found: $PROFDATA"
    echo "Run collect-coverage.sh first to create the merged profile"
    exit 1
fi

# Get all binaries from Makefile (includes plugins, tools, test binaries)
echo "Discovering instrumented binaries from Makefile..."
mapfile -t BINARIES < <(make -qp 2>/dev/null | awk '/^ALL_PROGRAMS :=/ {$1=$2=""; print}' | tr ' ' '\n' | grep -v '^$')
mapfile -t TEST_BINARIES < <(make -qp 2>/dev/null | awk '/^ALL_TEST_PROGRAMS :=/ {$1=$2=""; print}' | tr ' ' '\n' | grep -v '^$')

# Combine all binaries
ALL_BINARIES=("${BINARIES[@]}" "${TEST_BINARIES[@]}")

# Build llvm-cov arguments
ARGS=()
for bin in "${ALL_BINARIES[@]}"; do
    if [ -f "$bin" ]; then
        if [ ${#ARGS[@]} -eq 0 ]; then
            ARGS+=("$bin") # First binary is primary
        else
            ARGS+=("-object=$bin") # Others use -object=
        fi
    fi
done

if [ ${#ARGS[@]} -eq 0 ]; then
    echo "ERROR: No instrumented binaries found"
    echo "Make sure you've built with --enable-coverage"
    exit 1
fi

echo "Generating coverage report for ${#ARGS[@]} binaries..."

# Generate HTML report
llvm-cov show "${ARGS[@]}" \
    -instr-profile="$PROFDATA" \
    -format=html \
    -output-dir="$OUTPUT_DIR" \
    -show-line-counts-or-regions \
    -show-instantiations=false

echo "✓ HTML report: $OUTPUT_DIR/index.html"

# Generate text summary
mkdir -p coverage
llvm-cov report "${ARGS[@]}" \
    -instr-profile="$PROFDATA" \
    | tee coverage/summary.txt

echo "✓ Summary: coverage/summary.txt"
```
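`llvm-cov` takes the first binary as a positional argument and each additional binary via `-object=`, which is why the loop above special-cases the first entry. The helper below isolates just that argument-building step (the existence check on each binary is omitted here, and the paths are illustrative):

```shell
#!/bin/bash
# Sketch of the argument convention used above: the first binary is
# positional, every further binary is passed as -object=<path>, so
# one llvm-cov invocation can cover a multi-process system.
build_llvm_cov_args() {
    local args=()
    local bin
    for bin in "$@"; do
        if [ ${#args[@]} -eq 0 ]; then
            args+=("$bin")
        else
            args+=("-object=$bin")
        fi
    done
    printf '%s\n' "${args[@]}"
}

build_llvm_cov_args lightningd/lightningd cli/lightning-cli
```

Passing every instrumented binary in one invocation lets `llvm-cov` attribute counts from shared source files to a single per-file report instead of one report per binary.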

doc/developers-guide/coverage.md

Lines changed: 30 additions & 0 deletions
```markdown
# Test Coverage

> Coverage isn't everything, but it can tell you where you missed a thing.

We use LLVM's [Source-Based Code Coverage][sbcc] support to instrument
the code at compile time. This instrumentation then emits coverage
files (`profraw`), which can be aggregated via `llvm-profdata`
into a single `profdata` file, and from there a variety of tools can
be used to inspect coverage.

The most common use is to generate an HTML report for all binaries
under test. CLN, being a multi-process system, has a number of
binaries sharing some source code. To simplify the aggregation of
data and the generation of the report, split per source file, we use
the `prepare-code-coverage-artifact.py` ([`pcca.py`][pcca]) script
from the LLVM project.

## Conventions

`tests/fixtures.py` sets the `LLVM_PROFILE_FILE` environment
variable, indicating that the `profraw` files ought to be stored in
`coverage/raw`. Processing then uses [`pcca.py`][pcca] to aggregate
the raw files into a data file, and to generate a per-source-file
coverage report.

This report is then published [here][report].

[sbcc]: https://clang.llvm.org/docs/SourceBasedCodeCoverage.html
[pcca]: https://github.com/ElementsProject/lightning/tree/master/contrib/prepare-code-coverage-artifact.py
[report]: https://cdecker.github.io/lightning/coverage
```
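The `%p` in the `LLVM_PROFILE_FILE` pattern is a specifier that the instrumented process's profiling runtime replaces with its own PID, so concurrently running binaries write distinct `profraw` files instead of clobbering each other. A small sketch emulating that substitution (the real expansion happens inside the instrumented process; the PID here is made up):

```shell
#!/bin/bash
# Emulate the runtime's %p -> PID substitution to show the file
# layout produced under coverage/raw. The instrumented binary does
# this itself when it flushes its profile on exit.
expand_profile_pattern() {
    local pattern="$1" pid="$2"
    echo "${pattern//%p/$pid}"
}

expand_profile_pattern "coverage/raw/test_pay.%p.profraw" 12345
```

With one file per test name and per process, the collection step can simply glob `coverage/raw/*.profraw` and merge everything.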

tests/fixtures.py

Lines changed: 10 additions & 1 deletion
```diff
@@ -10,7 +10,16 @@


 @pytest.fixture
-def node_cls():
+def node_cls(test_name: str):  # noqa: F811
+    # We always set the LLVM coverage destination, just in case
+    # `lightningd` was compiled with the correct instrumentation
+    # flags. This creates a `coverage` directory in the repository
+    # and puts all the files in it.
+    repo_root = Path(__file__).parent.parent
+    os.environ['LLVM_PROFILE_FILE'] = str(
+        repo_root / "coverage" / "raw" / f"{test_name}.%p.profraw"
+    )
+
     return LightningNode
```
