9 changes: 8 additions & 1 deletion .github/workflows/security-fast.yml
@@ -87,11 +87,18 @@ jobs:

      - name: Run pip-audit
        run: |
          # Dev-only transitive CVEs — no runtime impact, fixes require py3.10+ (we support 3.9)
          # GHSA-5239-wwwm-4pmq: pygments ReDoS in AdlLexer (dev-only, no fix available)
          # GHSA-58qw-9mgm-455v: pip tar/zip confusion (pip itself, no fix available)
          # GHSA-jp4c-xjxw-mgf9: pip self-update import ordering (fix requires py3.10+)
          # GHSA-qccp-gfcp-xxvc: urllib3 cross-origin header leak (fix 2.7.0 requires py3.10+)
          # GHSA-mf9v-mfxr-j63j: urllib3 decompression bomb (fix 2.7.0 requires py3.10+)
          uv run pip-audit --desc --format json --output pip-audit-report.json \
            --ignore-vuln GHSA-5239-wwwm-4pmq \
            --ignore-vuln GHSA-58qw-9mgm-455v \
            --ignore-vuln GHSA-jp4c-xjxw-mgf9 \
            --ignore-vuln GHSA-qccp-gfcp-xxvc \
            --ignore-vuln GHSA-mf9v-mfxr-j63j

      - name: Upload report
        if: always()
8 changes: 8 additions & 0 deletions .hooks/check-no-internal-docs.sh
@@ -0,0 +1,8 @@
#!/bin/sh
# Block internal development artifacts from being committed to this public repo.
# Matched files belong in tooling/, strategy/, or MCP memory — not here.
echo "BLOCKED - Internal development files must not be committed to this public repo"
echo "Files:"
for f in "$@"; do echo " $f"; done
echo "Move to tooling/, strategy/, or MCP memory instead."
exit 1
19 changes: 19 additions & 0 deletions .pre-commit-config.yaml
@@ -39,6 +39,25 @@ repos:
        files: \.rs$
        pass_filenames: false

  # Block internal development artifacts from public repo
  - repo: local
    hooks:
      - id: check-no-internal-docs
        name: Block internal docs from public repo
        entry: .hooks/check-no-internal-docs.sh
        language: script
        files: |
          (?x)^(
            docs/superpowers/|
            \.spec-workflow/specs/|
            strategy/|
            tooling/sessions/|
            sessions/tasks/|
            CALIBER_LEARNINGS\.md$|
            \.caliber/
          )
        pass_filenames: true

  # GitHub Actions workflow linting
  - repo: https://github.com/rhysd/actionlint
    rev: 914e7df21a07ef503a81201c76d2b11c789d3fca # v1.7.12 # pragma: allowlist secret
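
To see which paths the `files:` pattern of `check-no-internal-docs` selects, the verbose regex can be exercised directly. This is a minimal sketch, assuming pre-commit applies the pattern with Python's `re.search` against repo-relative paths; the `is_blocked` helper and the sample paths are hypothetical and exist only for illustration.

```python
import re

# Pattern copied from the `files:` entry of the check-no-internal-docs hook.
BLOCKED = re.compile(
    r"""(?x)^(
        docs/superpowers/|
        \.spec-workflow/specs/|
        strategy/|
        tooling/sessions/|
        sessions/tasks/|
        CALIBER_LEARNINGS\.md$|
        \.caliber/
    )"""
)


def is_blocked(path: str) -> bool:
    """Hypothetical helper: True if the pattern selects this repo-relative path."""
    return BLOCKED.search(path) is not None


# Hypothetical paths, chosen only to exercise the pattern.
samples = [
    "docs/superpowers/plan.md",
    "CALIBER_LEARNINGS.md",
    ".caliber/state.json",
    "src/strategy/core.py",
    "docs/CALIBER_LEARNINGS.md",
    "README.md",
]
for path in samples:
    print(f"{path}: {'blocked' if is_blocked(path) else 'allowed'}")
```

Because the group is anchored with `^`, only top-level occurrences match: a nested `src/strategy/` directory or a `CALIBER_LEARNINGS.md` outside the repo root would not be passed to the hook.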
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default.

63 changes: 58 additions & 5 deletions src/cachekit/decorators/wrapper.py
@@ -484,6 +484,16 @@ def create_cache_wrapper(
# Generated lazily on first use or regenerated after cache_clear()
function_identifier = f"{func.__module__}.{func.__qualname__}"

# Detect whether the wrapped function accepts parameters.
# Used to distinguish "invalidate the zero-arg entry" from "invalidate ALL entries".
_func_has_params = bool(inspect.signature(func).parameters)

# Track all cache keys written by this function (for no-args invalidation).
# When invalidate_cache() is called with no args on a parameterized function,
# we need to clear ALL entries — but key normalization (hashing of long keys)
# makes prefix matching unreliable. Tracking actual keys is simple and correct.
_cached_keys: set[str] = set()
Comment on lines +491 to +495
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Synchronize _cached_keys before mutating or iterating it.

The wrapper adds/discards keys on the request path and iterates the same set during invalidation without any lock. Concurrent calls can raise RuntimeError: Set changed size during iteration or miss keys that are added while the clear is in progress. Guard add/discard/snapshot/clear with a shared lock.

Also applies to: 603-603, 811-811, 945-945, 1048-1048, 1175-1175, 1255-1255, 1313-1324, 1331-1331, 1352-1363, 1370-1370

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cachekit/decorators/wrapper.py` around lines 491 - 495, The _cached_keys
set is accessed concurrently; create a dedicated lock (e.g., _cached_keys_lock =
threading.RLock()) next to the _cached_keys declaration and use it to guard all
mutations and snapshots: wrap add/discard calls that update _cached_keys inside
with _cached_keys_lock:, take a snapshot of the set under the lock (e.g., keys =
set(_cached_keys)) before iterating or clearing, and perform clear() under the
lock as well; apply the same pattern for all places that touch _cached_keys and
for invalidate_cache() so iteration never races with concurrent mutations.
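
The locking pattern suggested in this comment could look roughly like the sketch below. `TrackedKeys` and `writer` are hypothetical names, not part of cachekit; the point is only that add, discard, and snapshot all go through one shared `threading.RLock`, and that invalidation iterates over a private copy.

```python
import threading


class TrackedKeys:
    """Hypothetical helper: a set of cache keys guarded by one re-entrant lock."""

    def __init__(self) -> None:
        self._keys = set()
        self._lock = threading.RLock()

    def add(self, key: str) -> None:
        with self._lock:
            self._keys.add(key)

    def discard(self, key: str) -> None:
        with self._lock:
            self._keys.discard(key)

    def snapshot(self) -> set:
        # Copy under the lock so callers iterate over a private set.
        with self._lock:
            return set(self._keys)


tracked = TrackedKeys()


def writer(prefix: str) -> None:
    """Simulates the request path adding keys after each cache store."""
    for i in range(1000):
        tracked.add(f"{prefix}:{i}")


threads = [threading.Thread(target=writer, args=(name,)) for name in ("a", "b")]
for t in threads:
    t.start()

# Invalidation path: iterate over a snapshot while writers may still be adding.
for key in tracked.snapshot():
    tracked.discard(key)

for t in threads:
    t.join()

# Keys added after the first snapshot survive it; a second pass picks them up.
for key in tracked.snapshot():
    tracked.discard(key)

remaining = tracked.snapshot()
print(len(remaining))
```

The lock makes each individual operation safe, but it does not make "clear everything" atomic with respect to concurrent stores: keys written after the snapshot is taken remain tracked until the next invalidation.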

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Process-local key tracking can't clear shared L2 state.

_cached_keys only records keys observed by this wrapper instance. After a restart, or when another worker populates the same backend entries, no-arg invalidation will leave those L2 keys behind, so invalidate_cache() / cache_clear() become partial for shared backends. This needs a backend-scoped registry or backend-supported function-level delete, not an in-memory set.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cachekit/decorators/wrapper.py` around lines 491 - 495, The wrapper
currently uses an in-memory set _cached_keys to track keys, which can't clear
shared L2 state across processes; replace this process-local registry with a
backend-scoped registry (or use backend-supported function-level delete).
Concretely: stop using the per-wrapper in-memory _cached_keys; when the wrapper writes a
cache entry, also add the normalized key to a backend-managed list/set
namespaced by the function identifier (e.g., call
backend.add_function_key(function_id, key)); when
invalidate_cache()/cache_clear() is called with no args, fetch the stored keys
from the backend (backend.get_function_keys(function_id)), delete those keys via
the cache backend, and then remove/expire the registry
(backend.delete_function_keys(function_id)); ensure writes to the registry occur
on every cache miss/store, guard against race conditions, and give registry
entries a time-to-live so stale keys expire. Use the existing wrapper function identifier (function name +
module or generated id used by the decorator) and the methods
invalidate_cache/cache_clear to locate where to call the backend registry APIs.
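
A backend-scoped registry along the lines described here might be sketched as follows. Everything in this block is an assumption for illustration: `FakeSharedBackend` is an in-memory stand-in for a shared L2 store, and `add_function_key`, `get_function_keys`, and `delete_function_keys` are the hypothetical method names used in the comment above, not an existing cachekit backend API.

```python
from collections import defaultdict


class FakeSharedBackend:
    """In-memory stand-in for a shared L2 store (hypothetical, not cachekit's API)."""

    def __init__(self) -> None:
        self.data = {}
        self._function_keys = defaultdict(set)

    def set(self, key: str, value: bytes) -> None:
        self.data[key] = value

    def delete(self, key: str) -> None:
        self.data.pop(key, None)

    def add_function_key(self, function_id: str, key: str) -> None:
        self._function_keys[function_id].add(key)

    def get_function_keys(self, function_id: str) -> set:
        # Return a copy so callers can iterate while the registry changes.
        return set(self._function_keys.get(function_id, ()))

    def delete_function_keys(self, function_id: str) -> None:
        self._function_keys.pop(function_id, None)


def store(backend: FakeSharedBackend, function_id: str, key: str, value: bytes) -> None:
    """Write an entry and record its key in the backend-side registry."""
    backend.set(key, value)
    backend.add_function_key(function_id, key)


def invalidate_all(backend: FakeSharedBackend, function_id: str) -> int:
    """Delete every registered key for one function; return how many were removed."""
    keys = backend.get_function_keys(function_id)
    for key in keys:
        backend.delete(key)
    backend.delete_function_keys(function_id)
    return len(keys)


shared = FakeSharedBackend()
# "Worker A" populates entries for two different functions.
store(shared, "app.get_user", "k1", b"alice")
store(shared, "app.get_user", "k2", b"bob")
store(shared, "app.get_order", "k3", b"order-7")
# "Worker B" never saw those writes, yet can clear them through the registry.
removed = invalidate_all(shared, "app.get_user")
print(removed, sorted(shared.data))
```

Because the registry lives next to the data, a restarted process or a second worker can still find and delete the keys. A real backend would also need the registry update and the entry write to be consistent under concurrency, which this sketch does not attempt.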


# Create stats tracker (session ID will be lazy-initialized on first use)
# Pass l1_enabled for rate limit classification header
_stats = _FunctionStats(function_identifier=function_identifier, l1_enabled=l1_enabled)
@@ -590,6 +600,7 @@ def sync_wrapper(*args: Any, **kwargs: Any) -> Any: # noqa: PLR0912
)
if _l1_cache and cache_key and serialized_bytes:
_l1_cache.put(cache_key, serialized_bytes, redis_ttl=ttl)
_cached_keys.add(cache_key)
except Exception as e:
# Serialization/storage failed but function succeeded - log and return result
logger().debug(f"L1-only mode: serialization/storage failed for {cache_key}: {e}")
@@ -797,6 +808,7 @@ def sync_wrapper(*args: Any, **kwargs: Any) -> Any: # noqa: PLR0912
# Also store in L1 cache for fast subsequent access (using serialized bytes)
if _l1_cache and cache_key and serialized_bytes:
_l1_cache.put(cache_key, serialized_bytes, redis_ttl=ttl)
_cached_keys.add(cache_key)

# Record successful cache set
set_duration_ms = (time.time() - start_time) * 1000
@@ -930,6 +942,7 @@ async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
)
if _l1_cache and cache_key and serialized_bytes:
_l1_cache.put(cache_key, serialized_bytes, redis_ttl=ttl)
_cached_keys.add(cache_key)
except Exception as e:
# Serialization/storage failed but function succeeded - log and return result
logger().debug(f"L1-only mode: serialization/storage failed for {cache_key}: {e}")
@@ -1032,6 +1045,7 @@ async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
# cached_data is already serialized bytes from Redis
cached_bytes = cached_data.encode("utf-8") if isinstance(cached_data, str) else cached_data
_l1_cache.put(cache_key, cached_bytes, redis_ttl=ttl)
_cached_keys.add(cache_key)

# Handle TTL refresh if configured and threshold met
if refresh_ttl_on_get and ttl and hasattr(_backend, "get_ttl") and hasattr(_backend, "refresh_ttl"):
@@ -1096,6 +1110,7 @@ async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
cached_data.encode("utf-8") if isinstance(cached_data, str) else cached_data
)
_l1_cache.put(cache_key, cached_bytes, redis_ttl=ttl)
_cached_keys.add(cache_key)

return result
except Exception as e:
@@ -1121,6 +1136,7 @@ async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
cached_data.encode("utf-8") if isinstance(cached_data, str) else cached_data
)
_l1_cache.put(cache_key, cached_bytes, redis_ttl=ttl)
_cached_keys.add(cache_key)

return result
except Exception:
@@ -1156,6 +1172,7 @@ async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
serialized_data.encode("utf-8") if isinstance(serialized_data, str) else serialized_data
)
_l1_cache.put(cache_key, serialized_bytes, redis_ttl=ttl)
_cached_keys.add(cache_key)

# Record successful cache set
set_duration_ms = (time.perf_counter() - start_time) * 1000
@@ -1235,6 +1252,7 @@ async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
serialized_data.encode("utf-8") if isinstance(serialized_data, str) else serialized_data
)
_l1_cache.put(cache_key, serialized_bytes, redis_ttl=ttl)
_cached_keys.add(cache_key)

# Record successful cache set
set_duration_ms = (time.perf_counter() - start_time) * 1000
@@ -1289,14 +1307,32 @@ def invalidate_cache(*args: Any, **kwargs: Any) -> None:
# If backend creation fails, can't invalidate L2
_logger.debug("Failed to get backend for invalidation: %s", e)

# Clear both L2 (backend) and L1 cache
# Fix #59: When called with no args on a parameterized function,
# invalidate ALL cached entries for this function.
# Without this, it generates a key for zero-arg call (never cached) → no-op.
if not args and not kwargs and _func_has_params:
    # Snapshot prevents RuntimeError if another thread adds during iteration
    keys_snapshot = set(_cached_keys)
    for key in keys_snapshot:
        if _l1_cache:
            _l1_cache.invalidate(key)
        if _backend and not _l1_only_mode:
            invalidator.set_backend(_backend)
            try:
                _backend.delete(key)
            except Exception as e:
                _logger.debug("Failed to delete L2 key %s: %s", key, e)
                continue # keep key tracked for retry
        _cached_keys.discard(key)
    return
Comment thread
coderabbitai[bot] marked this conversation as resolved.

# Single-key invalidation (specific args provided, or zero-param function)
cache_key = operation_handler.get_cache_key(func, args, kwargs, namespace, integrity_checking)

# Clear L1 cache first
if _l1_cache and cache_key:
_l1_cache.invalidate(cache_key)
_cached_keys.discard(cache_key)

# Clear L2 cache via invalidator (skip in L1-only mode)
if _backend and not _l1_only_mode:
invalidator.set_backend(_backend)
invalidator.invalidate_cache(func, args, kwargs, namespace)
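
The behaviour introduced by this hunk (no-argument invalidation on a parameterized function clears every tracked entry, while a call with arguments clears one key) can be illustrated with a toy decorator. This is a sketch under stated assumptions, not cachekit's implementation: `tiny_cache`, `make_key`, and the single dict standing in for both L1 and L2 are hypothetical. Only the `inspect.signature` check and the tracked-key set mirror the diff.

```python
import functools
import inspect


def tiny_cache(func):
    """Toy decorator: one dict stands in for both cache layers (hypothetical)."""
    entries = {}
    cached_keys = set()
    # Same check as the diff: does the wrapped function accept any parameters?
    has_params = bool(inspect.signature(func).parameters)

    def make_key(args, kwargs):
        return repr((args, sorted(kwargs.items())))

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = make_key(args, kwargs)
        if key not in entries:
            entries[key] = func(*args, **kwargs)
            cached_keys.add(key)
        return entries[key]

    def invalidate_cache(*args, **kwargs):
        if not args and not kwargs and has_params:
            # No-arg call on a parameterized function: clear every tracked entry.
            for key in set(cached_keys):
                entries.pop(key, None)
                cached_keys.discard(key)
            return
        # Specific args, or a zero-parameter function: clear a single key.
        key = make_key(args, kwargs)
        entries.pop(key, None)
        cached_keys.discard(key)

    wrapper.invalidate_cache = invalidate_cache
    return wrapper


calls = []


@tiny_cache
def square(x):
    calls.append(x)
    return x * x


square(2)
square(3)
square(2)                    # served from the cache, no new call
square.invalidate_cache(2)   # single-key invalidation
square(2)                    # recomputed
square.invalidate_cache()    # clears every entry for square
square(3)                    # recomputed
print(calls)                 # [2, 3, 2, 3]
```

Without the `has_params` branch, `square.invalidate_cache()` would build the key for a zero-argument call, which is never cached, and silently do nothing.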
@@ -1314,12 +1350,29 @@ async def ainvalidate_cache(*args: Any, **kwargs: Any) -> None:
# If backend creation fails, can't invalidate L2
_logger.debug("Failed to get backend for async invalidation: %s", e)

# Clear both L2 (backend) and L1 cache
# Fix #59: When called with no args on a parameterized function,
# invalidate ALL cached entries for this function.
if not args and not kwargs and _func_has_params:
    keys_snapshot = set(_cached_keys)
    for key in keys_snapshot:
        if _l1_cache:
            _l1_cache.invalidate(key)
        if _backend and not _l1_only_mode:
            invalidator.set_backend(_backend)
            try:
                _backend.delete(key)
            except Exception as e:
                _logger.debug("Failed to delete L2 key %s: %s", key, e)
                continue
        _cached_keys.discard(key)
    return

# Single-key invalidation (specific args provided, or zero-param function)
cache_key = operation_handler.get_cache_key(func, args, kwargs, namespace, integrity_checking)

# Clear L1 cache first
if _l1_cache and cache_key:
_l1_cache.invalidate(cache_key)
_cached_keys.discard(cache_key)

# Clear L2 cache via invalidator (skip in L1-only mode)
if _backend and not _l1_only_mode: