Purpose:
- record what has been measured
- separate ordinary missing test coverage from branches that really need special infrastructure
- avoid overstating “coverage complete” when the numbers do not support it
Verified on 2026-03-28 (Linux):
- C:
  - script: `tests/run-coverage-c.sh`
  - result: 94.1%
  - threshold: 90%
  - current file totals:
    - `netipc_protocol.c`: 98.7% (380/385)
    - `netipc_uds.c`: 92.9% (434/467)
    - `netipc_service.c`: 92.1% (734/797)
    - `netipc_shm.c`: 95.1% (346/364)
  - latest ordinary C additions covered:
    - malformed client `HELLO_ACK` handling:
      - short packet
      - wrong kind
      - unexpected transport status
      - truncated payload
    - malformed client `HELLO` payload on server accept
    - malformed UDS response receive paths:
      - short packet
      - too-short batch directory
      - short continuation packet
      - response `item_count` over limit
      - bad continuation header
      - missing continuation packet after a valid first chunk
    - malformed SHM attach metadata and stale-scan paths:
      - bad on-disk `header_len`
      - invalid aligned region metadata / overlap
      - declared region larger than the file
      - direct `nipc_shm_owner_alive(NULL)` false path
      - `cleanup_stale()` ignoring non-matching directory entries
    - direct UDS guard paths:
      - chunked send with `packet_size == NIPC_HEADER_LEN` rejecting a zero chunk payload budget
      - explicit `NULL` service-name validation tests through the public API
  - plus the earlier Linux C service coverage gains:
    - client init default-buffer sizing and truncation
    - empty increment-batch fast-path
    - tiny request-buffer overflow guards for batch, string-reverse, and SHM increment send
    - negotiated SHM obstruction covering both server create rejection and client attach failure
    - typed server missing-handler and success-dispatch paths
    - server worker-count floor and long run_dir truncation
    - raw SHM malformed response handling on the client side:
      - forged oversize response length
      - short response
      - bad decoded header
      - wrong kind / code / message_id
    - raw SHM malformed request handling on the server side:
      - forged oversize request length
      - short request
      - bad header
    - typed server unknown-method dispatch and typed-init error propagation
    - SHM batch client error propagation:
      - negotiated-capacity send overflow
      - malformed raw batch response propagation
      - response `item_count` mismatch
  - note:
    - the SHM slice briefly exposed a parallel-only `ctest` collision between `test_uds_rust` and `test_shm_rust`
    - the fix was CMake-side isolation with a shared `RESOURCE_LOCK`, not a Rust SHM code change
    - some remaining Linux C lines still report as uncovered even though direct public tests already exercise the corresponding bad-param / bad-kind paths
    - treat those as candidates for gcov line-mapping noise or architecture-only guards before adding more duplicate tests
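The CMake-side isolation described in the note can be sketched roughly as follows; the two test names come from this document, while the call site and the lock string are illustrative assumptions, not the project's actual CMakeLists:

```cmake
# Serialize the two Rust tests that touch the shared SHM namespace so a
# parallel `ctest -j N` run cannot make them collide. RESOURCE_LOCK makes
# ctest treat tests that share the same lock string as mutually exclusive,
# without forcing the rest of the suite to run serially.
set_tests_properties(test_uds_rust test_shm_rust
    PROPERTIES RESOURCE_LOCK "netipc_shm_namespace")
```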
- Go:
  - script: `tests/run-coverage-go.sh`
  - result: 95.8%
  - threshold: 90%
- Rust:
  - script: `tests/run-coverage-rust.sh`
  - result: 98.57%
  - threshold: 90%
  - note:
    - Linux now uses `cargo-llvm-cov`, matching the Windows Rust workflow
    - the Linux report excludes Windows-tagged Rust files:
      - `src/service/cgroups_windows_tests.rs`
      - `src/transport/windows.rs`
      - `src/transport/win_shm.rs`
    - Unix Rust service tests now live in `src/service/cgroups_unix_tests.rs`
    - Unix Rust transport tests now live in `src/transport/posix_tests.rs` and `src/transport/shm_tests.rs`
    - implication:
      - adding new Unix Rust service and transport tests no longer inflates the denominators of the corresponding runtime files
    - current Linux file totals from the verified `llvm-cov` run:
      - `service/cgroups.rs`: 98.28% (802/816)
      - `transport/posix.rs`: 97.50% (663/680)
      - `transport/shm.rs`: 96.20% (583/606)
    - implication:
      - the remaining Linux Rust total is now dominated by helper / fault-injection territory, not by Windows-tagged files or inline test code polluting the Linux baseline
    - the small protocol files still keep inline tests on purpose for now, because externalizing them reduced the Linux total to 98.49% without enough runtime-signal gain
    - one concrete layering fact is now proven:
      - on POSIX baseline, a bad response `message_id` is rejected by L1 before the L2 typed wrappers can map it to `BadLayout`
      - malformed batch directories on POSIX UDS are rejected by L1 before the Rust managed-service loop can map them to `INTERNAL_ERROR`
      - the honest ordinary coverage path for that branch is Linux SHM
    - latest ordinary Unix Rust service slice covered:
      - managed-server recovery after malformed short UDS request
      - managed-server recovery after malformed UDS header
      - managed-server recovery after peer-close during UDS response send
      - managed-server recovery after malformed short SHM request
      - managed-server recovery after malformed SHM header
      - `poll_fd()` readable and deterministic EINTR paths
      - SHM stale cleanup / recovery for:
        - missing run dir
        - unrelated and non-UTF8 entries
        - zero-generation stale files
    - latest narrow ordinary Rust transport slice additionally covered:
      - direct `UdsListener::accept()` failure on a closed listener fd
      - `ShmContext::owner_alive()` with cached generation 0 skipping generation mismatch checks
      - `ShmContext::receive()` waking successfully under a finite timeout budget
    - remaining Linux Rust misses are now dominated by:
      - fixed-size encode guards
      - raw `socket` / `listen` / `ftruncate` / `mmap` / `fstat` failures
      - a few send-break / teardown timing edges
      - one likely unreachable guard in `dispatch_cgroups_snapshot()` where `builder.finish()` would have to return 0
Latest Linux Go notes from the current ordinary POSIX slice:
- `service/cgroups/client.go`: 95.9%
- `transport/posix/shm_linux.go`: 91.9%
- `transport/posix/uds.go`: 95.6%
- the latest rerun also exposed two real Unix Go harness bugs, not library regressions:
  - the worker-capacity test used a readiness helper that briefly consumed the only worker slot
  - the non-request termination test relied on one-shot raw connect / refresh assumptions instead of retry-style readiness
- the latest direct SHM guard slice raised:
  - `ShmSend()` to 96.6%
  - `ShmReceive()` to 96.2%
- the latest Linux Go service slice raised:
  - `Run()` to 94.7%
  - `handleSession()` to 92.9%
  - the Linux Go total to 95.8%
Verified on 2026-03-28 (Windows):
- C:
  - latest clean win11 coverage measurement: 93.2%
  - per-file:
    - `netipc_service_win.c`: 90.2%
    - `netipc_named_pipe.c`: 95.4%
    - `netipc_win_shm.c`: 97.2%
  - status:
    - passes the Linux-matching per-file and total 90% gates
    - the script now runs three bounded direct executables before the generic `ctest` loop:
      - `test_win_service_guards.exe`
      - `test_win_service_guards_extra.exe`
      - `test_win_service_extra.exe`
    - latest validated direct-guard results:
      - `test_win_service_guards.exe`: 198 passed, 0 failed
      - `test_win_service_guards_extra.exe`: 93 passed, 0 failed
      - `test_win_service_extra.exe`: 165 passed, 0 failed
    - the remaining Windows C subset then runs one-by-one under `ctest --timeout 60`
    - the full `bash tests/run-coverage-c-windows.sh 90` flow now completes cleanly on the validated win11 path
    - the latest deterministic Windows C service fault-injection pass now also covers:
      - NULL config default handling
      - minimum response-buffer growth
      - client-side response/send buffer allocation failures
      - client/server SHM context allocation failures
      - server-side SHM create failure and recovery
      - cache allocation failure and recovery
      - typed hybrid malformed SHM replies and status-code propagation on the real public path
    - the latest Windows transport follow-up also covers:
      - chunked receive error paths in `netipc_named_pipe.c`
      - oversized-response `MSG_TOO_LARGE` coverage in `netipc_win_shm.c`
      - client receive-buffer reuse on a second chunked round-trip
- Go:
  - script: `tests/run-coverage-go-windows.sh 90`
  - result: 95.4%
  - selected key files:
    - `service/cgroups/cache_windows.go`: 100.0%
    - `service/cgroups/client_windows.go`: 100.0%
    - `transport/windows/pipe.go`: 92.1%
    - `transport/windows/shm.go`: 94.2%
  - status:
    - reported above the Linux-matching 90% target
    - the script exits cleanly in noninteractive ssh
    - first-class Windows Go service/cache tests now also run under `ctest`
    - the latest public typed-wrapper tests now cover:
      - `Cache.Ready()`
      - `Cache.Status()`
      - `Client.Status()`
      - `NewServerWithWorkers()`
    - the dead private `Handler.snapshotMaxItems()` helper was removed, so the package denominator no longer hides a fake untested method
    - the latest transport edge tests, raw WinSHM L2 tests, and the listener shutdown fix materially raised both the Windows transport package and the Windows-only service branches that named pipes cannot reach
    - malformed raw WinSHM request tests now also cover the real SHM server-side teardown / reconnect path
    - the latest create / attach edge tests materially raised the remaining ordinary Windows Go transport file
    - the remaining ordinary Windows Go gaps are now concentrated in low-level transport branches such as:
      - `peekNamedPipeAvailable`
      - `WaitReadable()`
      - `SetPayloadLimits()`
      - short write / zero-byte read / next-pipe-creation failures
- Rust:
  - script: `tests/run-coverage-rust-windows.sh 90`
  - result: 92.08% line coverage after excluding Rust bin / benchmark noise from the report
  - key files:
    - `service/cgroups.rs`: 92.74% line coverage
    - `transport/windows.rs`: 94.74% line coverage
    - `transport/win_shm.rs`: 95.76% line coverage
  - status:
    - validated workflow with the same total 90% line threshold as Linux Rust coverage
    - plus per-file 90% line gates for:
      - `service\cgroups.rs`
      - `transport\windows.rs`
      - `transport\win_shm.rs`
    - the old ignored retry/shutdown test is now part of the normal Windows Rust suite
    - native Phase 1 Win32 fault injection now covers forced failure and recovery for:
      - `CreateFileMappingW`
      - `OpenFileMappingW`
      - `MapViewOfFile`
      - `CreateEventW`
      - `OpenEventW`
These categories genuinely need special infrastructure beyond ordinary tests.
Examples:
- `src/libnetdata/netipc/src/protocol/netipc_protocol.c:146`
- `src/libnetdata/netipc/src/protocol/netipc_protocol.c:203`
- `src/libnetdata/netipc/src/protocol/netipc_protocol.c:223`
- `src/libnetdata/netipc/src/protocol/netipc_protocol.c:248`
- `src/libnetdata/netipc/src/protocol/netipc_protocol.c:453`
- `src/libnetdata/netipc/src/protocol/netipc_protocol.c:489`
These require absurd sizes such as `item_count > SIZE_MAX / N`, or wire values
that are not produced by normal encoders. They are reasonable candidates for
fault-injection or synthetic corruption harnesses.
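Guards of this kind typically have the following shape; this is a generic sketch (the function name and signature are illustrative, not the actual `netipc_protocol.c` code):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Generic sketch of a multiplication-overflow guard of the kind gcov
 * reports as uncovered: the rejecting branch only fires when item_count
 * is so large that item_count * item_size would wrap size_t. Normal
 * encoders never emit such values, so hitting the branch requires a
 * synthetically corrupted wire value. */
static bool batch_size_ok(size_t item_count, size_t item_size, size_t *total_out) {
    if (item_size != 0 && item_count > SIZE_MAX / item_size)
        return false;                 /* would overflow: reject the batch */
    *total_out = item_count * item_size;
    return true;
}
```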
Examples:
- `malloc` / `calloc` / `realloc` failure cleanup in C service and transport code
- low-memory allocation paths in Windows C `netipc_service_win.c`
- low-memory branches in Go and Rust transport scratch growth
These require deterministic allocation-failure injection to cover reliably.
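A common shape for that injection is an allocation seam the tests can arm; this is a generic sketch with hypothetical names (`xmalloc`, `xmalloc_fail_after`), not this project's API:

```c
#include <stdlib.h>

/* Hypothetical allocation seam: production code calls xmalloc() instead of
 * malloc(), and a test arms a countdown so the Nth allocation fails. This
 * makes malloc-failure cleanup paths coverable deterministically, without
 * actually exhausting memory. */
static int fail_after = -1;        /* -1 = never inject a failure */

void xmalloc_fail_after(int n) { fail_after = n; }

void *xmalloc(size_t size) {
    if (fail_after == 0)
        return NULL;               /* injected failure */
    if (fail_after > 0)
        fail_after--;              /* count down toward the injected failure */
    return malloc(size);
}
```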
Examples:
- socket creation failures
- `mmap` / `CreateFileMapping` / `MapViewOfFile` failures
- `pthread_create` / Win32 handle creation failures
- named-pipe creation / handshake API failures
These are not “ordinary missing tests”. They require fault injection, resource exhaustion, or environment simulation.
Windows now also has a first-class verifier entrypoint for the core C runtime executables:
tests/run-verifier-windows.sh
That burns down ordinary handle / heap misuse. The remaining excluded OS-failure branches are the ones that still need deterministic fault injection or low-resource simulation.
Examples:
- futex timeout and EINTR races
- TOCTOU stale-cleanup paths
- mid-send or mid-receive disconnect timing
- Windows accept / shutdown timing edges
These need race orchestration or deterministic hooks.
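For the EINTR class specifically, the branch under test is usually a retry loop around a blocking wait; a deterministic hook only needs to deliver a signal to the waiter. A generic POSIX sketch (not the project's `poll_fd()` code):

```c
#include <errno.h>
#include <poll.h>
#include <stdbool.h>

/* Illustrative EINTR-retry wait: block until fd is readable, retrying when
 * poll() is interrupted by a signal. The `errno == EINTR` branch is the one
 * that needs race orchestration (or a deterministic signal hook) to cover. */
static bool wait_readable(int fd, int timeout_ms) {
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    for (;;) {
        int rc = poll(&pfd, 1, timeout_ms);
        if (rc > 0)
            return (pfd.revents & POLLIN) != 0;   /* readable */
        if (rc == 0)
            return false;                         /* timed out */
        if (errno != EINTR)
            return false;                         /* real error */
        /* EINTR: interrupted by a signal, retry the wait */
    }
}
```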
Facts:
- `src/crates/netipc/src/transport/windows.rs`
- `src/crates/netipc/src/transport/win_shm.rs`
- `src/crates/netipc/src/service/cgroups.rs`
These modules are excluded from Linux builds and cannot be measured by the current Linux coverage path. The tooling / environment gap is now solved on Windows. The remaining Windows Rust caveat is no longer an ignored restart / shutdown test; it is the broader OS-failure / low-resource branch set that now requires deterministic fault injection rather than more ordinary tests.
The following are still ordinary missing coverage and should not be treated as hard exclusions yet.
Current evidence:
- `netipc_service_win.c` is now 90.2%
- `netipc_named_pipe.c` is 95.4%
- `netipc_win_shm.c` is 97.2%
Brutal truth:
- the current Windows C gate is green
- this does not mean Windows C is coverage-complete
- the remaining uncovered branches are still a mix of ordinary missing tests and branches that would need fault injection
Current evidence:
- Windows Go total is now 95.4%
- `service/cgroups/cache_windows.go` is now 100.0%
- `service/cgroups/client_windows.go` is now 100.0%
- `transport/windows/pipe.go` is now 92.1%
- `transport/windows/shm.go` is now 94.2%
- some malformed named-pipe response cases are filtered by the Windows session layer before they can reach L2 validation branches
- direct raw WinSHM tests now cover the equivalent Windows-only L2 branches
- the public typed Windows Go wrappers are now fully covered
Brutal truth:
- Windows Go is no longer the red gate for the Linux-matching 90% target
- but it is still not honest to call it coverage-complete
- the remaining ordinary Windows Go work is no longer in the public typed wrappers
- the next honest review target is whether any of the tiny remaining low-level `pipe.go` branches are still worth ordinary testing, plus a final check for any still-reachable `transport/windows/shm.go` residual gap
Current evidence:
- Windows Rust now has a validated threshold-enforced workflow
- `service/cgroups.rs` is now 92.74% line coverage
- `transport/windows.rs` is 94.74% line coverage
- `transport/win_shm.rs` is 95.76% line coverage
- the Windows retry/shutdown test now runs in the normal suite
Brutal truth:
- Windows Rust is no longer a tooling gap
- it is no longer blocked on weak service or transport runtime files sitting below the 90% line floor
- it is now threshold-enforced at total 90% plus per-file 90% for the critical Windows runtime files
- `transport/win_shm.rs` is no longer the weak Windows Rust runtime file
- the remaining Windows Rust work is now farther into true low-resource / OS-failure territory beyond the current WinSHM syscall hooks
Even on POSIX, not every remaining uncovered line should be assumed unreachable. Some are likely still coverable with better malformed-input or disconnect tests.
Concrete evidence from the latest Linux Go SHM slice:
- typed SHM recovery cases such as batch-handler failure and batch-response overflow are ordinary and are now covered
- direct SHM transport guard and timeout paths are ordinary too:
- invalid service-name entry guards
- `ShmSend()` bad-parameter guards
- `ShmReceive()` bad-parameter and timeout paths
- `ShmCleanupStale()` missing-directory / unrelated-file branches
- result: `transport/posix/shm_linux.go` is now 91.9%
- raw malformed SHM requests on POSIX are not currently ordinary unit-test targets:
- malformed short request
- malformed header request
- unexpected message kind
- reason:
  - they currently block in `ShmReceive(..., 30000)` inside `service/cgroups/client.go`, so ordinary tests spend the full SHM receive timeout instead of getting a cheap reconnect signal
- implication:
- these paths should stay out of the ordinary-coverage bucket unless the POSIX SHM timeout behavior becomes directly controllable in tests
Concrete evidence from the latest Linux Go service-loop slice:
- ordinary POSIX server-loop cases are still available and are now covered:
- worker-capacity rejection
- idle peer disconnect
- non-request termination
- truncated raw request recovery
- result:
  - `service/cgroups/client.go` is now 94.3%
  - `Run()` moved to 86.8%
  - `handleSession()` moved to 90.6%
- one real test-harness issue was exposed under coverage slowdown:
  - the Unix Go server helpers were still using blind sleeps before clients raced `Refresh()`
  - this is now fixed by waiting for a successful POSIX handshake, not just for the socket path to exist
- current honest remaining question in the POSIX service loop:
  - the `session.Send(...)` failure branch after peer close did not reproduce as an ordinary Unix-domain packet case in this slice
  - treat it as unproven ordinary work until a deterministic reproduction exists
Concrete evidence from the latest Linux Go SHM transport slice:
- ordinary filesystem-obstruction cases were still available and are now covered:
- `checkShmStale()` unreadable stale file
- `checkShmStale()` non-empty directory path (open succeeds, `Mmap` fails)
- `ShmServerCreate()` retry-create failure when stale recovery cannot remove the obstructing path
- one existing SHM transport test-harness race was also real and is now fixed:
  - the direct SHM roundtrip tests were using fixed service names plus blind 50ms sleeps before `ShmClientAttach()`
  - under coverage slowdown this produced both:
    - attach-before-create failures (`ErrShmOpen`)
    - and later server-side futex-wait timeouts
  - the fix was:
    - unique SHM service names per test
    - attach-ready waiting instead of blind sleeps
- result:
  - `transport/posix/shm_linux.go` is now 91.4%
  - `ShmServerCreate()` moved to 79.2%
  - `checkShmStale()` moved to 92.6%
- implication:
  - the remaining `shm_linux.go` gaps are now even more concentrated in true OS failure paths such as `Ftruncate`, `Mmap`, `Dup`, `f.Stat`, and atomic-load bounds failures after a successful `Mmap`
Concrete evidence from the latest Linux Go UDS transport slice:
- the remaining ordinary UDS batch-directory and first-request state paths were still available and are now covered:
  - client `Send()` with `inflightIDs == nil`
  - non-chunked batch-directory underflow validation
  - chunked batch-directory validation after successful reassembly
  - `detectPacketSize()` fallback / success helper behavior
- result:
  - `transport/posix/uds.go` is now 95.6%
  - `Send()` is now 100.0%
  - `Receive()` moved to 97.8%
  - `detectPacketSize()` is now 100.0%
  - `Listen()` moved to 81.0%
  - `connectAndHandshake()` moved to 93.2%
  - `serverHandshake()` moved to 95.3%
- implication:
  - the remaining weak `uds.go` lines are now much more concentrated in raw syscall-failure and short-write territory:
    - `Listen()` raw socket / bind / listen failures
    - `rawSendMsg()` short writes
    - handshake send / receive syscall failures
- Linux / POSIX:
- the scripts are working
- the current lowered thresholds pass
- coverage improved meaningfully, especially C
- Windows:
- coverage measurement now exists and is validated
- Windows C now passes the Linux-matching 90% gate
- Windows Go is above the Linux-matching 90% target, the script reliability issue is fixed, and Windows Go service/cache tests are now part of `ctest`
- Windows Go transport coverage is now materially stronger too
- Windows Rust coverage now has a real threshold-enforced entrypoint
- more ordinary test work is required before any “coverage parity” claim is honest
- 100% overall coverage is not currently achieved.
- Some branches truly need special infrastructure.
- A meaningful part of the remaining Windows coverage gap is still plain missing test work, not a technical impossibility.