profile-compaction: CollapseConfig CRD + projection overlay + user-ma…#808
profile-compaction: CollapseConfig CRD + projection overlay + user-ma…#808entlein wants to merge 9 commits into
Conversation
|
Warning Review limit reached
More reviews will be available in 56 minutes and 15 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (46)
📝 WalkthroughWalkthroughThis PR adds a Cosign-based signature framework, tamper detection for legacy user overlays in ChangesSignature Package: Interfaces, Adapters, and Cosign Implementation
Container Profile Cache: Tamper Detection, Overlay Identity, and Projection
Rules Watcher: Signature Verification for Cluster-Synced Rules
Rule Binding Cache: Non-Blocking Notification Dispatch
Main Wiring, Exporters, and Dependency Updates
Component Tests and Resource Fixtures
Sequence Diagram(s)sequenceDiagram
participant ContainerProfileCache
participant profiles.ApplicationProfileAdapter
participant signature.VerifyObject
participant tamper_alert
participant exporters.Exporter
ContainerProfileCache->>profiles.ApplicationProfileAdapter: wrap label-referenced ApplicationProfile
ContainerProfileCache->>signature.VerifyObject: verifyUserApplicationProfile(profile, wlid)
alt ErrSignatureMismatch
signature.VerifyObject-->>tamper_alert: return ErrSignatureMismatch
tamper_alert->>tamper_alert: check tamperEmitted dedup map
tamper_alert->>exporters.Exporter: emitTamperAlert R1016 (once per tamper key)
tamper_alert-->>ContainerProfileCache: return false (drop overlay if strict mode)
else ErrObjectNotSigned or no signature
signature.VerifyObject-->>ContainerProfileCache: return true (overlay accepted)
else verification success
signature.VerifyObject-->>tamper_alert: clear dedup entry
tamper_alert-->>ContainerProfileCache: return true
end
ContainerProfileCache->>stampOverlayIdentity: stamp ap=ns/name@rv into SyncChecksum
sequenceDiagram
participant CosignAdapter
participant SignableObject
participant CosignVerifier
CosignAdapter->>SignableObject: GetContent()
CosignAdapter->>CosignAdapter: GetContentHash(content)
CosignAdapter->>CosignAdapter: SignData(data)
CosignAdapter->>SignableObject: SetAnnotations(mergedAnnotations)
CosignVerifier->>SignableObject: GetAnnotations()
CosignVerifier->>CosignAdapter: DecodeSignatureFromAnnotations()
CosignVerifier->>CosignAdapter: GetContentHash()
CosignVerifier->>CosignVerifier: VerifyData(hash, sig, allowUntrusted)
alt verification failure
CosignVerifier-->>Caller: ErrSignatureMismatch
else success
CosignVerifier-->>Caller: nil
end
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/objectcache/containerprofilecache/containerprofilecache.go`:
- Around line 412-414: The verification calls to
c.verifyUserApplicationProfile(userAP, sharedData.Wlid) currently ignore its
boolean result; update both call sites (the one using userAP at lines ~412 and
the similar one around ~428-430) to check the returned bool and, when it is
false and the cache is running in strict verification mode (e.g., the instance
flag controlling strict verification such as c.strictVerify or equivalent),
avoid projecting/merging the overlay (skip the merge/projectOverlay path), log
the verification failure with context (Wlid and which overlay), and return or
surface an error instead of continuing; if not in strict mode, continue but log
a warning. Ensure you reference and use c.verifyUserApplicationProfile(...) and
the strict-mode flag when implementing the conditional.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: acff329b-9e89-4538-a272-7877e0ad1e70
📒 Files selected for processing (12)
pkg/objectcache/containerprofilecache/containerprofilecache.gopkg/objectcache/containerprofilecache/projection.gopkg/objectcache/containerprofilecache/projection_apply.gopkg/objectcache/containerprofilecache/tamper_alert.gopkg/objectcache/containerprofilecache/tamper_alert_test.gopkg/objectcache/containerprofilecache/test32_projection_test.gopkg/objectcache/projection_types.gopkg/objectcache/shared_container_data.gopkg/objectcache/v1/mock.gopkg/rulebindingmanager/cache/cache.gopkg/rulebindingmanager/cache/cache_test.gopkg/rulemanager/cel/libraries/cache/function_cache.go
…verlay CodeRabbit upstream PR kubescape#808 / containerprofilecache.go:414 (Major). The verifyUserApplicationProfile and verifyUserNetworkNeighborhood methods already return a boolean reflecting verification outcome — true when the overlay is unsigned OR when verification succeeded OR in permissive mode (EnableSignatureVerification=false); false only in strict mode on actual tamper. The two call sites in projection-load were discarding that return, so tampered overlays in strict mode silently merged anyway. The R1016 alert was emitted but the protection was advisory only. Now: when verify returns false (strict mode + tamper detected) the overlay is nilled out before the merge step so the cache never projects a known-tampered profile. Permissive mode is unchanged — verify always returns true, the overlay still merges, R1016 still fires. New tests: - TestVerifyAP_StrictMode_ReturnsFalseOnTamper — sign + tamper an ApplicationProfile, construct a cache with EnableSignatureVerification=true, and assert verifyUserApplicationProfile returns false (caller drops overlay). - TestVerifyNN_StrictMode_ReturnsFalseOnTamper — symmetric pin for the NetworkNeighborhood path. The existing legacy-permissive tamper test (TestVerifyAP_TamperedProfile_PopulatesDedupMap) continues to pass unchanged — that path still returns true with the R1016 emitted.
|
@coderabbitai review Collective re-review request after applying your atomic-split advisory + rabbit-feedback fixes across the full SBOB upstream PR set. Each PR is now scoped to a single tier per your dependency map: Tier 0 (opens):
Tier 2 (execs):
Tier 3 (network):
Tier 4 (tamper/sign):
Umbrellas:
Please confirm atomicity per your earlier advisory and flag any remaining cross-PR coupling we missed. |
|
PR
|
| File | Issue | Belongs to |
|---|---|---|
projection_types.go |
Defines ExecsByPath map[string][]string on ProjectedContainerProfile |
Tier 2 · #812 ("ExecsByPath composite-key surface") |
projection_apply.go |
Consumes ExecsByPath, containsDynamicSegment, isNetworkIPWildcard, isNetworkDNSWildcard |
Tier 2 (#812) for exec surface; Tier 3 (#806/#324) for network wildcards |
v1/mock.go |
Updated to populate ExecsByPath + wildcard network patterns |
Tier 2 + Tier 3 |
rulebindingmanager/cache/cache.go + test |
Non-blocking fanout refactor — purely infrastructure, no tamper/sign semantics | Pre-Tier 4 (Tier 0 or standalone) |
Specific recommendations
-
ExecsByPathtype definition should either live in#812(Tier 2) with#808taking a compile dependency on it, or you should explicitly note that#808is the defining PR and#812merely consumes it. Right now the tier map implies the reverse, which is confusing. -
isNetworkIPWildcard/isNetworkDNSWildcardinprojection_apply.goduplicate classifier logic that Tier 3 (#806,#324) is supposed to own. Either import from the Tier 3 package or note these are intentionally local shims with a// TODO: consolidate with pkg/objectcache/networkmatchcomment. -
Non-blocking fanout (
rulebindingmanager/cache) is fully orthogonal to tamper/sign. If this was included because it was required to prevent deadlocks when emitting tamper alerts under load, that's a valid reason — but it should be called out explicitly in the PR description as a required infrastructure prerequisite, not just bundled silently.
Bottom line: The tamper/sign core is clean. The main atomicity leaks are the ExecsByPath surface definition crossing from Tier 4 into what the map says is Tier 2, and the fanout refactor having no declared relationship to the Tier 4 scope. If you can clarify the ExecsByPath ownership direction and add a sentence to the PR description explaining why the fanout refactor is here, the remaining coupling is defensible.
✅ Actions performed
Review triggered.
Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.
…e#807, kubescape#808) (#44) * fix(containerprofilecache): honor verify result before merging user overlay CodeRabbit upstream PR kubescape#808 / containerprofilecache.go:414 (Major). The verifyUserApplicationProfile and verifyUserNetworkNeighborhood methods already return a boolean reflecting verification outcome — true when the overlay is unsigned OR when verification succeeded OR in permissive mode (EnableSignatureVerification=false); false only in strict mode on actual tamper. The two call sites in projection-load were discarding that return, so tampered overlays in strict mode silently merged anyway. The R1016 alert was emitted but the protection was advisory only. Now: when verify returns false (strict mode + tamper detected) the overlay is nilled out before the merge step so the cache never projects a known-tampered profile. Permissive mode is unchanged — verify always returns true, the overlay still merges, R1016 still fires. New tests: - TestVerifyAP_StrictMode_ReturnsFalseOnTamper — sign + tamper an ApplicationProfile, construct a cache with EnableSignatureVerification=true, and assert verifyUserApplicationProfile returns false (caller drops overlay). - TestVerifyNN_StrictMode_ReturnsFalseOnTamper — symmetric pin for the NetworkNeighborhood path. The existing legacy-permissive tamper test (TestVerifyAP_TamperedProfile_PopulatesDedupMap) continues to pass unchanged — that path still returns true with the R1016 emitted. * docs(applicationprofile): rule-author contracts on opens-with-suffix/prefix + exec-args Two doc-only fixes for CodeRabbit cross-PR advisory: #7 (NA kubescape#807): ap.was_path_opened_with_suffix / _prefix — explicitly document the false-negative gap when the projection is in pass-through mode (cp.Opens.All=true). Wildcard Patterns are skipped from the suffix/prefix scan because their token-bearing text doesn't safely answer suffix questions. Rule authors who need wildcard-aware coverage should either declare an Opens-projection slice (so SuffixHits/PrefixHits become authoritative for the literals they care about) or use ap.was_path_opened (which runs CompareDynamic over Patterns). #8 (NA kubescape#807): wasExecutedWithArgs — document the three states of ExecsByPath: 1. Path absent from Execs.Values → exec not allowed, fall through. 2. Path in Values, ABSENT from ExecsByPath → legacy back-compat "no argv constraint", match. 3. Path in Values, PRESENT with empty arg list [] → explicit "ran with no args" constraint, NOT a wildcard. The distinction is load-bearing for profile authors: an entry of {Path: ..., Args: []} is a constraint, not a free pass. No behavioural change. Tests pass unchanged. * deps: pin stereoscope v0.1.9 + runtime-spec v1.2.1 (compat with kubescape/syft fork) Storage rc1 bumped to syft v1.42.4 (CVE-2026-33481), which transitively requires stereoscope v0.1.22 + runtime-spec v1.3.0. Those versions use the new moby/moby/client submodule API, which is incompatible with inspektor-gadget's moby/moby umbrella requirement on the node-agent side (ambiguous-import wall — see issue #45). Node-agent stays on kubescape/syft v1.32.0-ks.2 via the existing replace, but transitive resolution from storage's go.mod pulls the newer stereoscope into the build, breaking the build with: undefined: client.New undefined: client.PingOptions Adds two replace directives to force the older transitive chain that matches kubescape/syft v1.32.0-ks.2's expectations: github.com/anchore/stereoscope => v0.1.9-0.20250826202322-... github.com/opencontainers/runtime-spec => v1.2.1 This is the minimum set needed for node-agent to build cleanly against storage rc1 (with syft v1.42.4) while still using kubescape/syft on its own side. Storage's CVE fix remains in effect at the storage binary; node-agent's syft surface is unchanged. Verified locally: go build ./... ok go test ./pkg/objectcache/... ./pkg/rulemanager/... -count=1 → 30+ packages ok --------- Co-authored-by: Entlein <eineintlein@gmail.com>
matthyx
left a comment
There was a problem hiding this comment.
I found three merge-blocking build issues and left inline notes on each failing spot.
| "github.com/kubescape/node-agent/pkg/exporters" | ||
| "github.com/kubescape/node-agent/pkg/rulemanager/types" | ||
| "github.com/kubescape/node-agent/pkg/signature" | ||
| "github.com/kubescape/node-agent/pkg/signature/profiles" |
There was a problem hiding this comment.
go test ./... fails on this new file because github.com/kubescape/node-agent/pkg/signature and .../pkg/signature/profiles do not exist in the checked-in module (pkg/**/signature/** is empty here). As written, this breaks package setup for every importer of containerprofilecache, so the PR will not build until these imports point at a real package or the code is wired through an existing dependency.
| } | ||
| // v0.0.2 IPAddresses[] — list form supporting CIDRs and '*' sentinel. | ||
| // Same semantics as the deprecated singular IPAddress, just plural. | ||
| addrs = append(addrs, n.IPAddresses...) |
There was a problem hiding this comment.
The current github.com/kubescape/storage/pkg/apis/softwarecomposition/v1beta1.NetworkNeighbor type only exposes IPAddress (singular) — go doc ...NetworkNeighbor has no IPAddresses field. That makes this append, and the matching ingress append below, a compile error against the current dependency, so this needs to stay on the singular field unless the storage API is updated first.
| pcp.EgressAddresses.All = true | ||
| pcp.EgressAddresses.Values = make(map[string]struct{}) | ||
| if !specInstalled || spec.EgressAddresses.InUse { | ||
| addrs := make([]string, 0, len(n.IPAddresses)+1) |
There was a problem hiding this comment.
Same blocker in the mock path: NetworkNeighbor does not have an IPAddresses field, so len(n.IPAddresses) and the append here fail to compile (same issue repeats in the analogous ingress block below). go test ./pkg/objectcache/v1 is currently red because of this.
Three matthyx blockers (2026-05-27): (1) tamper_alert.go:28 — imports pkg/signature and pkg/signature/profiles which ship in node-agent#809 (not yet merged). Adds minimum-surface stubs (pkg/signature/stub.go, pkg/signature/profiles/stub.go) so the PR compiles standalone. With IsSigned returning false, the tamper path short-circuits and never invokes Verify — the no-op stub is behavior- safe: signed-profile detection is dormant until kubescape#809 lands and replaces these files with the real implementation. (2) projection_apply.go:265 — NetworkNeighbor.IPAddresses missing. (3) mock.go:202 — same IPAddresses issue. Both (2) and (3) resolved by replacing kubescape/storage with k8sstormcenter/storage's upstream-pr/sbob-network tip 46f37d32 (sibling of kubescape/storage#324). Removed when kubescape#324 merges and a release ships IPAddresses.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/rulebindingmanager/cache/cache.go (2)
188-189:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftDon't multiplex refresh pulses onto the pod-delta notifier.
rulebindingmanager.RuleBindingNotify{}is a real zero-value event here (Action == Addedwith an empty Pod).pkg/containerwatcher/v2/container_watcher_collection.go:144-173treats every notification as a concrete pod update, so this pulse will mark rule bindings initialized and add"/"toruleManagedPods. Ifpkg/rulemanager/rule_manager.go:112-133needs a coalesced refresh signal, give it a separate notifier path or extend the payload with an explicit refresh action.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/rulebindingmanager/cache/cache.go` around lines 188 - 189, The refresh pulse is being sent as an empty rulebindingmanager.RuleBindingNotify via dispatchNonBlocking(notifiers, ...), which the container watcher treats as a real Pod Added event and corrupts ruleManagedPods; stop multiplexing refreshes onto the pod-delta notifier by adding a dedicated refresh path or explicit action field: either add a new notifier channel specifically for coalesced refreshes and call dispatchNonBlocking(refreshNotifiers, nil, "refresh pulse"), or extend rulebindingmanager.RuleBindingNotify with an explicit Action enum/value (e.g., Refresh) and emit RuleBindingNotify{Action: Refresh} from the caller and update the consumers in container_watcher_collection and rule_manager.go to ignore Refresh as a pod update and handle it as a coalesced refresh signal.
201-219:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftDrop-on-full is unsafe for the multi-pod add/modify/delete paths.
This helper is fine for idempotent pulses, but the handlers pass slices of per-pod notifications. Once a subscriber channel fills, later pod events are silently lost, and
pkg/containerwatcher/v2/container_watcher_collection.go:144-173does not resync from cache — it updatesruleManagedPodsfrom each message. A single large rule-binding change can therefore leave a slow subscriber permanently missing pods. Consider coalescing those handlers to one resync event or moving delivery behind a per-subscriber queue/worker that preserves at-least-once delivery.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/rulebindingmanager/cache/cache.go` around lines 201 - 219, The current dispatchNonBlocking drops per-pod notifications which breaks callers that expect at-least-once delivery (the add/modify/delete handlers that send slices of per-pod RuleBindingNotify and consumers that update ruleManagedPods). Fix by changing delivery semantics: either (A) coalesce the handlers so they emit one consolidated resync-style RuleBindingNotify instead of many per-pod messages (update the add/modify/delete handlers to produce a single resync message), or (B) implement a per-subscriber worker/queue: create a goroutine per notifier that owns a buffered channel and serializes delivery (modify dispatchNonBlocking to enqueue into each notifier's dedicated queue/channel and have the worker perform the actual send with retries/blocking to guarantee at-least-once), and update indexOfNotifier usage accordingly; pick one approach and adjust all callers of dispatchNonBlocking and the add/modify/delete handlers to match.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/rulebindingmanager/cache/cache.go`:
- Line 140: The AddHandler and ModifyHandler implementations hold c.mutex and
call addRuleBinding/modifiedRuleBinding while those helpers use
context.Background() for Namespaces().List and Pods(...).List which prevents
cancellation; change the signatures of addRuleBinding(ctx context.Context, ...)
and modifiedRuleBinding(ctx context.Context, ...) and pass the incoming watch
context from AddHandler/ModifyHandler into those calls, then replace
context.Background() uses in both helpers with the passed ctx for both
Namespaces().List and Pods(...).List; finally update cache_test.go to call the
new function signatures accordingly.
---
Outside diff comments:
In `@pkg/rulebindingmanager/cache/cache.go`:
- Around line 188-189: The refresh pulse is being sent as an empty
rulebindingmanager.RuleBindingNotify via dispatchNonBlocking(notifiers, ...),
which the container watcher treats as a real Pod Added event and corrupts
ruleManagedPods; stop multiplexing refreshes onto the pod-delta notifier by
adding a dedicated refresh path or explicit action field: either add a new
notifier channel specifically for coalesced refreshes and call
dispatchNonBlocking(refreshNotifiers, nil, "refresh pulse"), or extend
rulebindingmanager.RuleBindingNotify with an explicit Action enum/value (e.g.,
Refresh) and emit RuleBindingNotify{Action: Refresh} from the caller and update
the consumers in container_watcher_collection and rule_manager.go to ignore
Refresh as a pod update and handle it as a coalesced refresh signal.
- Around line 201-219: The current dispatchNonBlocking drops per-pod
notifications which breaks callers that expect at-least-once delivery (the
add/modify/delete handlers that send slices of per-pod RuleBindingNotify and
consumers that update ruleManagedPods). Fix by changing delivery semantics:
either (A) coalesce the handlers so they emit one consolidated resync-style
RuleBindingNotify instead of many per-pod messages (update the add/modify/delete
handlers to produce a single resync message), or (B) implement a per-subscriber
worker/queue: create a goroutine per notifier that owns a buffered channel and
serializes delivery (modify dispatchNonBlocking to enqueue into each notifier's
dedicated queue/channel and have the worker perform the actual send with
retries/blocking to guarantee at-least-once), and update indexOfNotifier usage
accordingly; pick one approach and adjust all callers of dispatchNonBlocking and
the add/modify/delete handlers to match.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1b640f7e-c5dd-4b9f-a2b1-436aa9cc7564
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (6)
go.modpkg/objectcache/containerprofilecache/containerprofilecache.gopkg/rulebindingmanager/cache/cache.gopkg/rulebindingmanager/cache/cache_test.gopkg/signature/profiles/stub.gopkg/signature/stub.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/objectcache/containerprofilecache/containerprofilecache.go
| c.mutex.Unlock() | ||
| return | ||
| } | ||
| rbs = c.addRuleBinding(ruleBinding) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show repo file
ls -la
# Print the relevant portions of the file around the referenced lines
FILE="pkg/rulebindingmanager/cache/cache.go"
echo "=== File stats ==="
wc -l "$FILE"
echo
echo "=== Snippet around line 120-180 ==="
nl -ba "$FILE" | sed -n '110,190p'
echo
echo "=== Snippet around line 220-310 ==="
nl -ba "$FILE" | sed -n '220,310p'
echo
echo "=== Snippet around line 320-380 ==="
nl -ba "$FILE" | sed -n '320,380p'
echo
echo "=== Find all context.Background() uses in this file ==="
rg -n "context\.Background\(\)" "$FILE" || true
echo
echo "=== Find definitions/usages of addRuleBinding / modifiedRuleBinding ==="
rg -n "func \(c \*RBCache\) (addRuleBinding|modifiedRuleBinding)\b" "$FILE" || true
rg -n "\.addRuleBinding\(" "$FILE" || true
rg -n "\.modifiedRuleBinding\(" "$FILE" || true
echo
echo "=== Show where List() calls happen in addRuleBinding/modifiedRuleBinding ==="
# Print whole functions if not too large
perl -ne 'print if $.>=0' "$FILE" >/dev/null
# Use ast-grep? Fallback to sed by locating function start lines
ADD_START=$(nl -ba "$FILE" | rg -n "func \(c \*RBCache\) addRuleBinding" | head -n1 | cut -d: -f1 || true)
MOD_START=$(nl -ba "$FILE" | rg -n "func \(c \*RBCache\) modifiedRuleBinding" | head -n1 | cut -d: -f1 || true)
if [ -n "${ADD_START:-}" ]; then
echo "=== addRuleBinding from $ADD_START (print until next func) ==="
nl -ba "$FILE" | sed -n "${ADD_START},$((ADD_START+160))p"
fi
if [ -n "${MOD_START:-}" ]; then
echo "=== modifiedRuleBinding from $MOD_START (print until next func) ==="
nl -ba "$FILE" | sed -n "${MOD_START},$((MOD_START+160))p"
fiRepository: kubescape/node-agent
Length of output: 2017
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="pkg/rulebindingmanager/cache/cache.go"
echo "=== File stats ==="
wc -l "$FILE"
echo
echo "=== Snippet around line 110-190 ==="
awk 'NR>=110 && NR<=190 {printf "%d\t%s\n", NR, $0}' "$FILE"
echo
echo "=== Snippet around line 220-310 ==="
awk 'NR>=220 && NR<=310 {printf "%d\t%s\n", NR, $0}' "$FILE"
echo
echo "=== Snippet around line 320-380 ==="
awk 'NR>=320 && NR<=380 {printf "%d\t%s\n", NR, $0}' "$FILE"
echo
echo "=== Find all context.Background() uses in this file ==="
rg -n "context\.Background\(\)" "$FILE" || true
echo
echo "=== Find definitions/usages of addRuleBinding / modifiedRuleBinding ==="
rg -n "func \(c \*RBCache\) (addRuleBinding|modifiedRuleBinding)\b" "$FILE" || true
rg -n "\.addRuleBinding\(" "$FILE" || true
rg -n "\.modifiedRuleBinding\(" "$FILE" || true
echo
echo "=== Function bodies (trimmed) ==="
ADD_START=$(rg -n "func \(c \*RBCache\) addRuleBinding\b" "$FILE" | head -n1 | cut -d: -f1 || true)
MOD_START=$(rg -n "func \(c \*RBCache\) modifiedRuleBinding\b" "$FILE" | head -n1 | cut -d: -f1 || true)
if [ -n "${ADD_START:-}" ]; then
echo "--- addRuleBinding starting at $ADD_START ---"
awk -v s="$ADD_START" 'NR>=s && NR<=s+220 {printf "%d\t%s\n", NR, $0}' "$FILE"
fi
if [ -n "${MOD_START:-}" ]; then
echo "--- modifiedRuleBinding starting at $MOD_START ---"
awk -v s="$MOD_START" 'NR>=s && NR<=s+220 {printf "%d\t%s\n", NR, $0}' "$FILE"
fiRepository: kubescape/node-agent
Length of output: 24559
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "\.addRuleBinding\(" -S . || true
rg -n "\.modifiedRuleBinding\(" -S . || true
rg -n "func \(c \*RBCache\) addRuleBinding\(" pkg/rulebindingmanager/cache/cache.go
rg -n "func \(c \*RBCache\) modifiedRuleBinding\(" pkg/rulebindingmanager/cache/cache.goRepository: kubescape/node-agent
Length of output: 739
Restore watch-context cancellation for RB list calls.
pkg/rulebindingmanager/cache/cache.go keeps c.mutex held in AddHandler/ModifyHandler while it calls addRuleBinding/modifiedRuleBinding, and addRuleBinding currently uses context.Background() for Namespaces().List and Pods(...).List (lines 140, 159, 237-281, 343-345). This prevents list requests from being canceled when the watch context is canceled and can stall cache operations during/after shutdown or restart. Thread ctx through addRuleBinding(ctx, ...) / modifiedRuleBinding(ctx, ...), pass it from AddHandler/ModifyHandler, and use it for both List calls; update cache_test.go call sites for the signature change.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/rulebindingmanager/cache/cache.go` at line 140, The AddHandler and
ModifyHandler implementations hold c.mutex and call
addRuleBinding/modifiedRuleBinding while those helpers use context.Background()
for Namespaces().List and Pods(...).List which prevents cancellation; change the
signatures of addRuleBinding(ctx context.Context, ...) and
modifiedRuleBinding(ctx context.Context, ...) and pass the incoming watch
context from AddHandler/ModifyHandler into those calls, then replace
context.Background() uses in both helpers with the passed ctx for both
Namespaces().List and Pods(...).List; finally update cache_test.go to call the
new function signatures accordingly.
| helpers.Error(err)) | ||
| // Honour strict-mode: refuse to load on any verification failure, | ||
| // but do NOT touch the dedup map or emit R1016. | ||
| return !c.cfg.EnableSignatureVerification |
There was a problem hiding this comment.
c.cfg.EnableSignatureVerification still does not exist on config.Config, so this file has four hard compiler errors at the strict-mode return sites (for example here, and again at lines 108/136/146). Please either add/plumb that config field in this PR or remove the strict-mode branches so the tamper path matches currently available config.
| // signature-payload accessors. | ||
| type Signable interface{} | ||
|
|
||
| func IsSigned(_ Signable) bool { return false } |
There was a problem hiding this comment.
These stubs still leave the new tamper-detection path effectively disabled: IsSigned always returns false, VerifyObjectAllowUntrusted always returns nil, and the adapters in pkg/signature/profiles/stub.go return nil. That means the runtime path never verifies anything and the new tamper tests are not exercising the intended behavior. This needs the real signature implementation (or the feature/tests need to stay out of this PR).
There was a problem hiding this comment.
Yes, those are just to make the PR "atomic" they need to be deleted after merge. I havent yet found a way to completely seperate the PRs 808 and 809
matthyx
left a comment
There was a problem hiding this comment.
I re-checked this PR and I’m still seeing three merge blockers:
pkg/objectcache/containerprofilecache/tamper_alert.goreferencesc.cfg.EnableSignatureVerification, butpkg/config/config.gostill has no such field. LSP flags four compiler errors at the strict-mode returns.cmd/main.gocreates both the runtime alert exporter and the container profile cache, but never callscpc.SetTamperAlertExporter(exporter). As written,emitTamperAlert()is a no-op in production, so the new R1016 path never actually emits alerts end-to-end.pkg/signature/stub.go/pkg/signature/profiles/stub.gostill short-circuit the feature (IsSigned == false,VerifyObjectAllowUntrusted == nil, adapters returnnil), so the new tamper-detection behavior is effectively dormant and the added tamper tests are not validating real verification behavior.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
matthyx
left a comment
There was a problem hiding this comment.
Checked again.
The earlier import/IPAddresses blockers are addressed now: the PR has local pkg/signature stubs, and the storage replace brings NetworkNeighbor.IPAddresses.
I’m still seeing three blockers before merge:
pkg/objectcache/containerprofilecache/tamper_alert.gostill readsc.cfg.EnableSignatureVerificationat the strict-mode return sites, butpkg/config/config.gostill has no such field onconfig.Config.cmd/main.gocreates both the runtime alert exporter and the container profile cache, but never callscpc.SetTamperAlertExporter(exporter), so the new R1016 tamper path still never emits alerts end-to-end.pkg/signature/stub.goandpkg/signature/profiles/stub.gostill short-circuit the feature (IsSigned == false,VerifyObjectAllowUntrusted == nil, adapters returnnil), so the tamper-detection path remains dormant and the added tamper tests are not validating a real verification flow.
|
Component tests are green end-to-end for the tamper-alert wiring in Root cause that |
|
@coderabbitai review |
✅ Action performedReview finished.
|
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 17
🧹 Nitpick comments (1)
tests/component_test.go (1)
2624-2644: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winUpdate the stale duplicated enforcement notes.
These bullets still say no R-number fires on signature verification failure, but this PR adds
R1016coverage inTest_31. The block is also duplicated.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/component_test.go` around lines 2624 - 2644, Update the duplicated enforcement note block so it matches the current behavior: remove the stale claim that no R-number rule fires on signature verification failure, since `Test_31` now covers `R1016` for that path. Keep the description in `Test_30_TamperedSignedProfiles` accurate to the remaining behavior, and delete the repeated copy of the same bullets so the comment block appears only once.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/sign-object/main.go`:
- Around line 400-443: The detectObjectType function only uses top-level
apiVersion and kind, so empty-TypeMeta profiles hit the unable to auto-detect
object type fallback even though loadApplicationProfile, loadSeccompProfile, and
loadNetworkNeighborhood can handle missing metadata. Update detectObjectType to
add a deterministic fallback path for the auto case when kind/apiVersion are
empty, using another discriminator from the decoded object shape or known
profile-specific fields instead of trying all loaders blindly. Keep the existing
explicit objectType handling unchanged and make sure the fallback resolves the
same empty-TypeMeta scenario covered by the cluster flow.
In `@pkg/rulemanager/ruleswatcher/watcher.go`:
- Around line 99-101: The skipped verification metric is counting rejected Rules
CRs instead of individual rules, which underreports failures when one invalid
resource contains multiple enabled rules. Update the verification path in
watcher.go around verifyRules so skippedVerificationCount is incremented per
skipped rule, matching the per-rule behavior used by enabledRules and
skippedByVersion. Use the same rule iteration/selection logic in the
RulesWatcher flow to count each enabled rule from a rejected Rules object,
including the similar code path referenced by the other affected block.
In `@pkg/signature/cosign_adapter_test.go`:
- Around line 128-137: The “Certificate is CA” negative test is using raw DER
from x509.CreateCertificate directly in Signature.Certificate, so the decode
step can fail before the CA rejection logic is exercised. Update the subtest in
cosign_adapter_test.go to PEM-encode the generated CA certificate before
assigning it to Signature.Certificate, using the existing certificate handling
path in Signature.Verify/parse logic so the intended “CA certificate rejected”
branch is actually covered.
In `@pkg/signature/cosign_adapter.go`:
- Around line 210-220: The strict verification path is rejecting signatures
created by this adapter because `signKeyless` returns no `RekorBundle` and
`signWithKey` is still treated like a Fulcio-issued cert before the local-issuer
branch runs. Update the verification logic in `VerifySignature`,
`verifyCertificates`, and the related keyless/local certificate handling so
adapter-produced signatures are accepted in strict mode by properly
short-circuiting or exempting local/self-signed issuer flows, and make sure the
round-trip path validates the `Signature` output from `signKeyless` and
`signWithKey` without requiring Rekor or Fulcio roots for those cases.
- Around line 126-185: The keyless signing flow in signKeyless always uses
context.Background(), so token acquisition and Fulcio/interactive auth can hang
indefinitely in non-interactive runs. Update signKeyless to accept and use a
caller-provided context, propagate it to c.tokenProvider, providers.Provide, and
oauthflow.OIDConnect, and wrap the signing path with a bounded timeout or
cancellation handling. Make the interactive fallback opt-in or gated so
tests/agents do not drop into browser-based OAuth by default.
In `@pkg/signature/profiles/adapter_test.go`:
- Around line 276-335: TestAdapterUniqueness currently only verifies both
adapters can be signed, not that signatures are unique per adapter/content.
Update the test to assert uniqueness by comparing the signatures returned from
GetObjectSignature for NewApplicationProfileAdapter and
NewSeccompProfileAdapter, or by signing two distinct adapter payloads and
confirming the resulting signature values differ. Keep the existing
SignObjectDisableKeyless setup, but add an explicit assertion in
TestAdapterUniqueness that proves the signature is bound to each adapter’s
content rather than just being locally signed.
In `@pkg/signature/profiles/applicationprofile_adapter.go`:
- Around line 40-57: GetContent() is mutating the wrapped ApplicationProfile
spec while normalizing PolicyByRuleId, which should not happen in the
hash/sign/verify path. Update ApplicationProfileAdapter.GetContent to build the
normalized payload from a copy of a.profile.Spec, and apply the nil-to-empty-map
normalization only on that copy for Containers, InitContainers, and
EphemeralContainers so the live spec remains unchanged.
- Around line 17-21: GetAnnotations currently mutates the wrapped profile by
initializing Annotations on read, which breaks read-only access patterns used by
VerifyObject and GetObjectSignature. Update
ApplicationProfileAdapter.GetAnnotations to return the existing map without
allocating or assigning, and move any nil-map initialization to the write path
or mutating methods; apply the same read-only behavior to the other
storage-backed adapter GetAnnotations implementations.
In `@pkg/signature/profiles/rules_adapter_test.go`:
- Around line 4-5: Update the rules adapter test to assert the returned error
matches the exported sentinel signature.ErrSignatureMismatch rather than only
checking the error text. In the test case around the verifier call in
rules_adapter_test.go (including the related assertions near the later lines
noted in the comment), keep any string checks only if needed for context, but
make the primary expectation use the exported tamper error so the contract stays
stable even if wrapping or wording changes.
In `@pkg/signature/sign_test.go`:
- Around line 204-208: The keyless signing setup in the sign test is swallowing
failures from SignObjectKeyless, which delays the real error until
GetObjectSignature. Update the setup path in the sign_test subtest so the result
of SignObjectKeyless is checked and any error is surfaced immediately with a
test failure, using the existing tt.setupSign / tt.profile flow to locate the
change.
In `@pkg/signature/signer.go`:
- Around line 18-19: The CosignSigner.Sign method currently dereferences
s.adapter without checking the receiver, so it can panic when the signer is nil
or not fully initialized. Add a nil/invalid receiver guard at the start of Sign,
matching the error-returning pattern used by CosignVerifier.Verify*, and return
a regular error instead of calling adapter.SignData when the signer is unusable.
In `@pkg/signature/verify_test.go`:
- Around line 216-231: The captureLogOutput helper in verify_test.go leaves the
global logger writer and pipe descriptors un-restored if fn fails or panics, so
move the cleanup into deferred teardown around logger.L().SetWriter and os.Pipe
handling. In captureLogOutput, save oldWriter, then defer restoring
logger.L().SetWriter(oldWriter) and closing w/r after the pipe is created, while
still copying the buffer after fn returns, so later tests never inherit the
temporary writer.
In `@pkg/signature/verify.go`:
- Around line 38-40: Treat malformed signature annotations as
ErrSignatureMismatch instead of returning a plain wrapped decode error. In
verify.go, update the error handling around
adapter.DecodeSignatureFromAnnotations in the verification path (and the same
decode handling in the other affected block) so that any decode failure after
AnnotationSignature is present is converted to ErrSignatureMismatch, preserving
the underlying decode details only as context if needed. Keep the fix localized
to the verification flow by using the existing verify logic and
ErrSignatureMismatch symbol so callers using errors.Is can route corrupted
signatures through the tamper path.
In `@tests/component_test.go`:
- Around line 2129-2130: The alert helper logic is masking Alertmanager fetch
failures as empty results, which can make negative assertions pass incorrectly.
Update the alert retrieval path in the helpers that use testutils.GetAlerts,
especially waitAlerts and countR1016, so the final GetAlerts call must succeed
instead of ignoring its error or returning 0/empty alerts on failure. Keep the
“no alerts” case only for a successful empty response, and surface any retrieval
error directly so tests fail when Alertmanager is unavailable.
- Around line 2798-2857: The test currently proves only that wget still triggers
R0001, which does not confirm the tampered exec entry was honored by the loaded
profile. In the same flow in the tamper-verification block, after
`storageClient.ApplicationProfiles(...).Get`,
`profiles.NewApplicationProfileAdapter`, and the `require.Eventually` cache-load
check, add an assertion using `wl.ExecIntoPod` and `testutils.GetAlerts` that
`nslookup` does not emit `R0001`, so the test verifies the node-agent actually
loaded the tampered `ApplicationProfile` rather than the original signed
baseline.
- Around line 2591-2593: The R0001 check is too broad and can pass on an
unrelated alert from the earlier allowed exec. Tighten the assertion in the test
that iterates over alerts so it only matches the anomalous nslookup command’s
alert, using the existing alert fields in that loop and the surrounding
Eventually/final count logic to verify the specific command rather than any
R0001 in the namespace. Apply the same scoping fix in the later assertion block
that has the same issue.
- Around line 2149-2164: The no-alert subtest can pass even when the allowed
traffic never happened because `wl.ExecIntoPod` failures for `nslookup` and
`curl` are only logged. In this test block, make the `fusioncore.ai` checks in
`Test...`/the relevant subtest fail fast if either allowed operation returns an
error, so `waitAlerts`, `countByRule`, and the `R0005`/`R0011` assertions only
run after successful DNS and HTTP traffic generation.
---
Nitpick comments:
In `@tests/component_test.go`:
- Around line 2624-2644: Update the duplicated enforcement note block so it
matches the current behavior: remove the stale claim that no R-number rule fires
on signature verification failure, since `Test_31` now covers `R1016` for that
path. Keep the description in `Test_30_TamperedSignedProfiles` accurate to the
remaining behavior, and delete the repeated copy of the same bullets so the
comment block appears only once.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 89702d91-ffab-479e-a7cf-ff04223b892a
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (35)
cmd/main.gocmd/sign-object/Dockerfilecmd/sign-object/main.gogo.modpkg/config/config.gopkg/containerprofilemanager/v1/lifecycle.gopkg/exporters/alert_manager.gopkg/objectcache/containerprofilecache/tamper_alert_test.gopkg/rulemanager/cel/libraries/applicationprofile/ap.gopkg/rulemanager/ruleswatcher/watcher.gopkg/rulemanager/ruleswatcher/watcher_signature_test.gopkg/signature/annotations.gopkg/signature/cluster_flow_test.gopkg/signature/cluster_scenario_test.gopkg/signature/cosign_adapter.gopkg/signature/cosign_adapter_test.gopkg/signature/interface.gopkg/signature/profiles/adapter_test.gopkg/signature/profiles/applicationprofile_adapter.gopkg/signature/profiles/empty_typemeta_test.gopkg/signature/profiles/networkneighborhood_adapter.gopkg/signature/profiles/networkneighborhood_adapter_test.gopkg/signature/profiles/rules_adapter.gopkg/signature/profiles/rules_adapter_test.gopkg/signature/profiles/seccompprofile_adapter.gopkg/signature/sign.gopkg/signature/sign_test.gopkg/signature/signer.gopkg/signature/verifier.gopkg/signature/verify.gopkg/signature/verify_test.gotests/component_test.gotests/resources/aplint_test.gotests/resources/curl-signed-deployment.yamltests/resources/known-application-profile.yaml
✅ Files skipped from review due to trivial changes (5)
- cmd/sign-object/Dockerfile
- pkg/signature/annotations.go
- pkg/containerprofilemanager/v1/lifecycle.go
- pkg/rulemanager/cel/libraries/applicationprofile/ap.go
- pkg/signature/profiles/networkneighborhood_adapter_test.go
| func detectObjectType(objectType string, data []byte) (signature.SignableObject, error) { | ||
| var decoded map[string]interface{} | ||
| if err := k8syaml.Unmarshal(data, &decoded); err != nil { | ||
| return nil, fmt.Errorf("failed to unmarshal YAML: %w", err) | ||
| } | ||
|
|
||
| kind, _ := decoded["kind"].(string) | ||
| apiVersion, _ := decoded["apiVersion"].(string) | ||
|
|
||
| if verbose { | ||
| fmt.Printf("Detected API: %s, Kind: %s\n", apiVersion, kind) | ||
| } | ||
|
|
||
| if objectType != "auto" { | ||
| switch strings.ToLower(objectType) { | ||
| case "applicationprofile", "application-profile", "ap": | ||
| return loadApplicationProfile(data) | ||
| case "seccompprofile", "seccomp-profile", "sp": | ||
| return loadSeccompProfile(data) | ||
| case "networkneighborhood", "network-neighborhood", "nn": | ||
| return loadNetworkNeighborhood(data) | ||
| case "rules", "rule", "r": | ||
| return loadRules(data) | ||
| default: | ||
| return nil, fmt.Errorf("unknown object type: %s", objectType) | ||
| } | ||
| } | ||
|
|
||
| if strings.Contains(strings.ToLower(apiVersion), "softwarecomposition") { | ||
| switch strings.ToLower(kind) { | ||
| case "applicationprofile", "application-profile": | ||
| return loadApplicationProfile(data) | ||
| case "seccompprofile", "seccomp-profile": | ||
| return loadSeccompProfile(data) | ||
| case "networkneighborhood", "network-neighborhood": | ||
| return loadNetworkNeighborhood(data) | ||
| } | ||
| } | ||
|
|
||
| if strings.Contains(strings.ToLower(apiVersion), "kubescape.io") && strings.ToLower(kind) == "rules" { | ||
| return loadRules(data) | ||
| } | ||
|
|
||
| return nil, fmt.Errorf("unable to auto-detect object type") |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift
--type auto breaks on the empty-TypeMeta profiles this PR explicitly supports.
detectObjectType only dispatches from top-level apiVersion/kind, so profiles loaded from the cluster with both fields empty fall through to unable to auto-detect object type. That makes the default CLI path fail for the same production scenario covered in pkg/signature/cluster_scenario_test.go, even though the adapters themselves can normalize missing TypeMeta.
Please add a real fallback discriminator here for empty-TypeMeta objects rather than relying solely on apiVersion/kind. Avoid a naive “try all unmarshals” loop, since these structs will happily decode unknown YAML into zero values and can false-positive.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cmd/sign-object/main.go` around lines 400 - 443, The detectObjectType
function only uses top-level apiVersion and kind, so empty-TypeMeta profiles hit
the unable to auto-detect object type fallback even though
loadApplicationProfile, loadSeccompProfile, and loadNetworkNeighborhood can
handle missing metadata. Update detectObjectType to add a deterministic fallback
path for the auto case when kind/apiVersion are empty, using another
discriminator from the decoded object shape or known profile-specific fields
instead of trying all loaders blindly. Keep the existing explicit objectType
handling unchanged and make sure the fallback resolves the same empty-TypeMeta
scenario covered by the cluster flow.
| t.Run("Certificate is CA", func(t *testing.T) { | ||
| // Create a CA certificate | ||
| template := x509.Certificate{ | ||
| IsCA: true, | ||
| } | ||
| certDER, _ := x509.CreateCertificate(rand.Reader, &template, &template, &privKey.PublicKey, privKey) | ||
| sig := &Signature{ | ||
| Signature: []byte("sig"), | ||
| Certificate: certDER, | ||
| } |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
PEM-encode the CA cert in this negative test.
x509.CreateCertificate returns DER, but this subtest passes it straight into Signature.Certificate. That means Line 138 can fail on certificate decoding instead of the intended “CA certificate rejected” path, so this branch is not actually covered.
Proposed fix
t.Run("Certificate is CA", func(t *testing.T) {
// Create a CA certificate
template := x509.Certificate{
IsCA: true,
}
certDER, _ := x509.CreateCertificate(rand.Reader, &template, &template, &privKey.PublicKey, privKey)
sig := &Signature{
Signature: []byte("sig"),
- Certificate: certDER,
+ Certificate: pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: certDER}),
}
err := adapter.VerifyData(data, sig, false)
if err == nil {
t.Error("Expected error for CA certificate, got nil")
}
})📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| t.Run("Certificate is CA", func(t *testing.T) { | |
| // Create a CA certificate | |
| template := x509.Certificate{ | |
| IsCA: true, | |
| } | |
| certDER, _ := x509.CreateCertificate(rand.Reader, &template, &template, &privKey.PublicKey, privKey) | |
| sig := &Signature{ | |
| Signature: []byte("sig"), | |
| Certificate: certDER, | |
| } | |
| t.Run("Certificate is CA", func(t *testing.T) { | |
| // Create a CA certificate | |
| template := x509.Certificate{ | |
| IsCA: true, | |
| } | |
| certDER, _ := x509.CreateCertificate(rand.Reader, &template, &template, &privKey.PublicKey, privKey) | |
| sig := &Signature{ | |
| Signature: []byte("sig"), | |
| Certificate: pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: certDER}), | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/signature/cosign_adapter_test.go` around lines 128 - 137, The
“Certificate is CA” negative test is using raw DER from x509.CreateCertificate
directly in Signature.Certificate, so the decode step can fail before the CA
rejection logic is exercised. Update the subtest in cosign_adapter_test.go to
PEM-encode the generated CA certificate before assigning it to
Signature.Certificate, using the existing certificate handling path in
Signature.Verify/parse logic so the intended “CA certificate rejected” branch is
actually covered.
| // 5. Upload to Rekor (Placeholder for real upload) | ||
| // rekorClient, _ := rekor.GetByProxy(rekorURL) | ||
| // entry, _ := cosign.TLogUpload(ctx, rekorClient, sig, certBytes, data) | ||
|
|
||
| return &Signature{ | ||
| Signature: sig, | ||
| Certificate: certBytes, | ||
| Issuer: issuer, | ||
| Identity: identity, | ||
| Timestamp: time.Now().Unix(), | ||
| }, nil |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift
Make strict verification round-trip adapter-produced signatures.
signKeyless emits no RekorBundle, but strict verification rejects non-local/non-interactive issuers without one. signWithKey emits a self-signed local cert, but strict verification still verifies it against Fulcio roots before reaching the local issuer logic. As written, strict mode rejects the main signatures this adapter creates.
Also applies to: 233-244, 383-400, 440-449
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/signature/cosign_adapter.go` around lines 210 - 220, The strict
verification path is rejecting signatures created by this adapter because
`signKeyless` returns no `RekorBundle` and `signWithKey` is still treated like a
Fulcio-issued cert before the local-issuer branch runs. Update the verification
logic in `VerifySignature`, `verifyCertificates`, and the related keyless/local
certificate handling so adapter-produced signatures are accepted in strict mode
by properly short-circuiting or exempting local/self-signed issuer flows, and
make sure the round-trip path validates the `Signature` output from
`signKeyless` and `signWithKey` without requiring Rekor or Fulcio roots for
those cases.
| sig, err := adapter.DecodeSignatureFromAnnotations(annotations) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to decode signature from annotations: %w", err) |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
Treat malformed signature annotations as ErrSignatureMismatch.
Once AnnotationSignature is present, a decode failure is no longer “unsigned” — it is a broken or tampered signed payload. Returning a plain error here means downstream callers that rely on errors.Is(err, ErrSignatureMismatch) will classify corrupted signature metadata as operational and skip the tamper path/R1016 emission.
Suggested fix
sig, err := adapter.DecodeSignatureFromAnnotations(annotations)
if err != nil {
- return fmt.Errorf("failed to decode signature from annotations: %w", err)
+ return fmt.Errorf("%w: failed to decode signature from annotations: %w", ErrSignatureMismatch, err)
}Also applies to: 67-71
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/signature/verify.go` around lines 38 - 40, Treat malformed signature
annotations as ErrSignatureMismatch instead of returning a plain wrapped decode
error. In verify.go, update the error handling around
adapter.DecodeSignatureFromAnnotations in the verification path (and the same
decode handling in the other affected block) so that any decode failure after
AnnotationSignature is present is converted to ErrSignatureMismatch, preserving
the underlying decode details only as context if needed. Keep the fix localized
to the verification flow by using the existing verify logic and
ErrSignatureMismatch symbol so callers using errors.Is can route corrupted
signatures through the tamper path.
| alerts, _ = testutils.GetAlerts(ns) | ||
| return alerts |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Don’t convert alert retrieval failures into “no alerts”.
Both helpers can make negative assertions pass when Alertmanager is unavailable: waitAlerts drops the final GetAlerts error, and countR1016 returns 0 on error. Require the final fetch to succeed so “no alert” is meaningful.
Proposed fix
- alerts, _ = testutils.GetAlerts(ns)
+ alerts, err = testutils.GetAlerts(ns)
+ require.NoError(t, err, "get alerts from ns %s", ns)
return alerts alerts, err := testutils.GetAlerts(nsName)
- if err != nil {
- t.Logf("GetAlerts error: %v", err)
- return 0
- }
+ require.NoError(t, err, "get alerts from ns %s", nsName)Also applies to: 3047-3050
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/component_test.go` around lines 2129 - 2130, The alert helper logic is
masking Alertmanager fetch failures as empty results, which can make negative
assertions pass incorrectly. Update the alert retrieval path in the helpers that
use testutils.GetAlerts, especially waitAlerts and countR1016, so the final
GetAlerts call must succeed instead of ignoring its error or returning 0/empty
alerts on failure. Keep the “no alerts” case only for a successful empty
response, and surface any retrieval error directly so tests fail when
Alertmanager is unavailable.
| // DNS lookup via nslookup (domain in NN). | ||
| stdout, stderr, err := wl.ExecIntoPod([]string{"nslookup", "fusioncore.ai"}, "curl") | ||
| t.Logf("nslookup fusioncore.ai → err=%v stdout=%q stderr=%q", err, stdout, stderr) | ||
|
|
||
| // HTTP via curl (domain + IP in NN). | ||
| stdout, stderr, err = wl.ExecIntoPod([]string{"curl", "-sm5", "http://fusioncore.ai"}, "curl") | ||
| t.Logf("curl fusioncore.ai → err=%v stdout=%q stderr=%q", err, stdout, stderr) | ||
|
|
||
| alerts := waitAlerts(t, wl.Namespace) | ||
| t.Logf("=== %d alerts ===", len(alerts)) | ||
| logAlerts(t, alerts) | ||
|
|
||
| assert.Equal(t, 0, countByRule(alerts, "R0005"), | ||
| "fusioncore.ai is in NN — should NOT fire R0005") | ||
| assert.Equal(t, 0, countByRule(alerts, "R0011"), | ||
| "fusioncore.ai IP is in NN — should NOT fire R0011") |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Fail the no-alert case when allowed traffic was never generated.
This subtest only logs nslookup/curl errors, so DNS or egress failures can still pass as “no R0005/R0011”. Require the allowed operations to succeed before asserting no alerts.
Proposed fix
stdout, stderr, err := wl.ExecIntoPod([]string{"nslookup", "fusioncore.ai"}, "curl")
t.Logf("nslookup fusioncore.ai → err=%v stdout=%q stderr=%q", err, stdout, stderr)
+ require.NoError(t, err, "allowed DNS lookup must succeed before asserting no R0005")
// HTTP via curl (domain + IP in NN).
stdout, stderr, err = wl.ExecIntoPod([]string{"curl", "-sm5", "http://fusioncore.ai"}, "curl")
t.Logf("curl fusioncore.ai → err=%v stdout=%q stderr=%q", err, stdout, stderr)
+ require.NoError(t, err, "allowed HTTP egress must succeed before asserting no R0011")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // DNS lookup via nslookup (domain in NN). | |
| stdout, stderr, err := wl.ExecIntoPod([]string{"nslookup", "fusioncore.ai"}, "curl") | |
| t.Logf("nslookup fusioncore.ai → err=%v stdout=%q stderr=%q", err, stdout, stderr) | |
| // HTTP via curl (domain + IP in NN). | |
| stdout, stderr, err = wl.ExecIntoPod([]string{"curl", "-sm5", "http://fusioncore.ai"}, "curl") | |
| t.Logf("curl fusioncore.ai → err=%v stdout=%q stderr=%q", err, stdout, stderr) | |
| alerts := waitAlerts(t, wl.Namespace) | |
| t.Logf("=== %d alerts ===", len(alerts)) | |
| logAlerts(t, alerts) | |
| assert.Equal(t, 0, countByRule(alerts, "R0005"), | |
| "fusioncore.ai is in NN — should NOT fire R0005") | |
| assert.Equal(t, 0, countByRule(alerts, "R0011"), | |
| "fusioncore.ai IP is in NN — should NOT fire R0011") | |
| // DNS lookup via nslookup (domain in NN). | |
| stdout, stderr, err := wl.ExecIntoPod([]string{"nslookup", "fusioncore.ai"}, "curl") | |
| t.Logf("nslookup fusioncore.ai → err=%v stdout=%q stderr=%q", err, stdout, stderr) | |
| require.NoError(t, err, "allowed DNS lookup must succeed before asserting no R0005") | |
| // HTTP via curl (domain + IP in NN). | |
| stdout, stderr, err = wl.ExecIntoPod([]string{"curl", "-sm5", "http://fusioncore.ai"}, "curl") | |
| t.Logf("curl fusioncore.ai → err=%v stdout=%q stderr=%q", err, stdout, stderr) | |
| require.NoError(t, err, "allowed HTTP egress must succeed before asserting no R0011") | |
| alerts := waitAlerts(t, wl.Namespace) | |
| t.Logf("=== %d alerts ===", len(alerts)) | |
| logAlerts(t, alerts) | |
| assert.Equal(t, 0, countByRule(alerts, "R0005"), | |
| "fusioncore.ai is in NN — should NOT fire R0005") | |
| assert.Equal(t, 0, countByRule(alerts, "R0011"), | |
| "fusioncore.ai IP is in NN — should NOT fire R0011") |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/component_test.go` around lines 2149 - 2164, The no-alert subtest can
pass even when the allowed traffic never happened because `wl.ExecIntoPod`
failures for `nslookup` and `curl` are only logged. In this test block, make the
`fusioncore.ai` checks in `Test...`/the relevant subtest fail fast if either
allowed operation returns an error, so `waitAlerts`, `countByRule`, and the
`R0005`/`R0011` assertions only run after successful DNS and HTTP traffic
generation.
| for _, a := range alerts { | ||
| if a.Labels["rule_id"] == "R0001" { | ||
| return true |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Filter the R0001 assertion to the anomalous command.
The test intends to prove nslookup is anomalous, but any R0001 in the namespace satisfies the Eventually and final count. A false positive from the earlier allowed curl exec would still pass.
Proposed fix
- if a.Labels["rule_id"] == "R0001" {
+ if a.Labels["rule_id"] == "R0001" && a.Labels["comm"] == "nslookup" {
return true
} for _, a := range alerts {
- if a.Labels["rule_id"] == "R0001" {
+ if a.Labels["rule_id"] == "R0001" && a.Labels["comm"] == "nslookup" {
r0001Count++
}
}Also applies to: 2611-2617
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/component_test.go` around lines 2591 - 2593, The R0001 check is too
broad and can pass on an unrelated alert from the earlier allowed exec. Tighten
the assertion in the test that iterates over alerts so it only matches the
anomalous nslookup command’s alert, using the existing alert fields in that loop
and the surrounding Eventually/final count logic to verify the specific command
rather than any R0001 in the namespace. Apply the same scoping fix in the later
assertion block that has the same issue.
| // Tamper: attacker adds nslookup to the whitelist. | ||
| ap.Spec.Containers[0].Execs = append(ap.Spec.Containers[0].Execs, | ||
| v1beta1.ExecCalls{Path: "/usr/bin/nslookup"}) | ||
|
|
||
| // Signature is now invalid. | ||
| tamperedAdapter := profiles.NewApplicationProfileAdapter(ap) | ||
| require.Error(t, signature.VerifyObjectAllowUntrusted(tamperedAdapter), | ||
| "tampered AP must fail verification") | ||
|
|
||
| // Push tampered AP to storage (signature annotations are stale). | ||
| _, err := storageClient.ApplicationProfiles(ns.Name).Create( | ||
| context.Background(), ap, metav1.CreateOptions{}) | ||
| require.NoError(t, err, "push tampered AP to storage") | ||
|
|
||
| // Verify stored AP has stale signature. | ||
| require.Eventually(t, func() bool { | ||
| stored, getErr := storageClient.ApplicationProfiles(ns.Name).Get( | ||
| context.Background(), "signed-ap", v1.GetOptions{}) | ||
| if getErr != nil { | ||
| return false | ||
| } | ||
| storedAdapter := profiles.NewApplicationProfileAdapter(stored) | ||
| // Signature annotation exists but verification should fail. | ||
| if !signature.IsSigned(storedAdapter) { | ||
| return false | ||
| } | ||
| return signature.VerifyObjectAllowUntrusted(storedAdapter) != nil | ||
| }, 30*time.Second, 1*time.Second, "stored AP must have stale signature that fails verification") | ||
| t.Log("Stored AP has invalid signature (tamper detected at crypto layer)") | ||
|
|
||
| // Deploy pod referencing the tampered profile. | ||
| wl, err := testutils.NewTestWorkload(ns.Name, | ||
| path.Join(utils.CurrentDir(), "resources/curl-signed-deployment.yaml")) | ||
| require.NoError(t, err) | ||
| require.NoError(t, wl.WaitForReady(80)) | ||
|
|
||
| // Drive the unexpected exec inside Eventually so cache-load latency | ||
| // is absorbed by retries instead of a blind sleep. Same pattern as | ||
| // Test_29 (signed AP, anomalous exec) — without it, the first exec | ||
| // can land before the CP cache projects the user-defined AP, the | ||
| // rule manager evaluates against an empty baseline, and R0001 never | ||
| // fires within the polling window. | ||
| // | ||
| // wget is NOT in the AP (even after the attacker added nslookup), so | ||
| // once the cache loads, every wget exec produces an R0001 alert. | ||
| var alerts []testutils.Alert | ||
| require.Eventually(t, func() bool { | ||
| wl.ExecIntoPod([]string{"wget", "-qO-", "--timeout=2", "http://ebpf.io"}, "curl") | ||
| alerts, err = testutils.GetAlerts(ns.Name) | ||
| if err != nil { | ||
| return false | ||
| } | ||
| for _, a := range alerts { | ||
| if a.Labels["rule_id"] == "R0001" && a.Labels["comm"] == "wget" { | ||
| return true | ||
| } | ||
| } | ||
| return false | ||
| }, 120*time.Second, 10*time.Second, | ||
| "wget not in tampered AP must fire R0001 — proves tampered profile was loaded (enforcement off)") |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Assert the tampered exec is actually honored.
The test appends nslookup to the tampered AP, but only proves wget still alerts. If node-agent loaded the original signed baseline or dropped the tampered exec, wget would still fire and this test would pass. After cache load is proven, also assert nslookup does not produce R0001.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/component_test.go` around lines 2798 - 2857, The test currently proves
only that wget still triggers R0001, which does not confirm the tampered exec
entry was honored by the loaded profile. In the same flow in the
tamper-verification block, after `storageClient.ApplicationProfiles(...).Get`,
`profiles.NewApplicationProfileAdapter`, and the `require.Eventually` cache-load
check, add an assertion using `wl.ExecIntoPod` and `testutils.GetAlerts` that
`nslookup` does not emit `R0001`, so the test verifies the node-agent actually
loaded the tampered `ApplicationProfile` rather than the original signed
baseline.
…naged lifecycle Signed-off-by: entlein <einentlein@gmail.com>
…c1 final state Signed-off-by: entlein <einentlein@gmail.com>
Three matthyx blockers (2026-05-27): (1) tamper_alert.go:28 — imports pkg/signature and pkg/signature/profiles which ship in node-agent#809 (not yet merged). Adds minimum-surface stubs (pkg/signature/stub.go, pkg/signature/profiles/stub.go) so the PR compiles standalone. With IsSigned returning false, the tamper path short-circuits and never invokes Verify — the no-op stub is behavior- safe: signed-profile detection is dormant until kubescape#809 lands and replaces these files with the real implementation. (2) projection_apply.go:265 — NetworkNeighbor.IPAddresses missing. (3) mock.go:202 — same IPAddresses issue. Both (2) and (3) resolved by replacing kubescape/storage with k8sstormcenter/storage's upstream-pr/sbob-network tip 46f37d32 (sibling of kubescape/storage#324). Removed when kubescape#324 merges and a release ships IPAddresses. Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
The tamper-detection path in containerprofilecache detected signature mismatches and logged them, but SetTamperAlertExporter was never called in cmd/main.go, so c.tamperAlertExporter stayed nil and emitTamperAlert returned early — R1016 "Signed profile tampered" was silently dropped. Wire the already-constructed alert exporter into the cache after it is built. Detection runs independently of EnableSignatureVerification (which only gates enforcement), so with this wiring a tampered user-defined profile both loads and fires R1016. Validated: component tests Test_29/Test_30/Test_31 green. Signed-off-by: entlein <einentlein@gmail.com>
…linter
Verify rule-CRD signatures in the rules-watcher sync path: when
EnableSignatureVerification is set, each Rules resource is verified via the
signature/profiles RulesAdapter before its rules are admitted; unsigned or
failing resources are skipped and counted (skippedByVerification). When the
flag is off (default) behaviour is unchanged. NewRulesWatcher takes the config
so the watcher can read the flag; the cmd wiring passes it. Unit tests pin the
gate: disabled (nil cfg or flag off) is a no-op; enabled rejects an unsigned
resource with ErrObjectNotSigned. This reuses the signature verifier already
present for profile tamper detection, extending the same trust model to rule
resources.
Also add an ApplicationProfile fixture linter and its ground-truth fixture.
LintApplicationProfile / LintApplicationProfileYAML return []Violation so the
logic is reusable outside the testing package. The directory scan lints every
ApplicationProfile fixture and skips non-AP kinds; it skips a fixture only when
its strict parse fails AND it carries an unsubstituted {placeholder}, so a
runtime-rendered template can't masquerade as a clean AP and a genuine parse
failure is still surfaced.
Minor: include userDefinedNetwork in the profile-skip debug log, and document
the was_path_opened_with_suffix/prefix rule-author contract (no behaviour
change).
Signed-off-by: entlein <einentlein@gmail.com>
… read only Signed-off-by: entlein <einentlein@gmail.com>
6e0d0c3 to
44dec25
Compare
here things got confusing during the rebase, this is the sister PR to storage and need the signature PR (from matthyx)
Summary by CodeRabbit
New Features
sign-objectCLI for signing, verifying, and inspecting signed objects.Bug Fixes
Documentation / Chores
exepath) and updated signature verification configuration/flags.