18 changes: 9 additions & 9 deletions README.md
@@ -153,15 +153,15 @@ def get_user_profile(user_id: int):
    return db.fetch_user(user_id)
```

| Feature | `@cache.minimal` | `@cache.production` | `@cache.secure` | `@cache.io()` |
|:--------|:----------------:|:-------------------:|:---------------:|:-------------:|
| Circuit Breaker | - | ✅ | ✅ | ✅ |
| Adaptive Timeouts | - | ✅ | ✅ | ✅ |
| Monitoring | - | ✅ Full | ✅ Full | ✅ Full |
| Integrity Checking | - | ✅ Enabled | ✅ Enforced | ✅ Enabled |
| Encryption | - | - | ✅ Required | - |
| Backend | Redis | Redis | Redis | CachekitIO SaaS |
| **Use Case** | High throughput | Production reliability | Compliance/security | Managed cloud |
| Feature | `@cache.minimal` | `@cache.production` | `@cache.secure` | `@cache.io()` | `@cache.local()` |
|:--------|:----------------:|:-------------------:|:---------------:|:-------------:|:----------------:|
| Circuit Breaker | - | ✅ | ✅ | ✅ | - |
| Adaptive Timeouts | - | ✅ | ✅ | ✅ | - |
| Monitoring | - | ✅ Full | ✅ Full | ✅ Full | ✅ Basic |
| Integrity Checking | - | ✅ Enabled | ✅ Enforced | ✅ Enabled | - |
| Encryption | - | - | ✅ Required | - | - |
| Backend | Redis | Redis | Redis | CachekitIO SaaS | In-process |
| **Use Case** | High throughput | Production reliability | Compliance/security | Managed cloud | Opaque objects |

<details>
<summary><strong>Additional Presets</strong></summary>
310 changes: 310 additions & 0 deletions docs/features/reference-caching.md
@@ -0,0 +1,310 @@
**[Home](../README.md)** › **Features** › **Reference Caching**

# Reference Caching with @cache.local()

**Available since v0.6.0**

## TL;DR

Reference caching (`@cache.local()`) caches opaque, non-serializable objects (things that can't be converted to bytes) and returns the same instance on every cache hit. It suits SDK connections, ML models, database connections, and language runtime objects that must maintain identity.

```python notest
from cachekit import cache
from langfuse import Langfuse

@cache.local()
def get_langfuse_client():
    """Returns same instance for identical args."""
    return Langfuse(project="my-project")

client1 = get_langfuse_client()
client2 = get_langfuse_client()
assert client1 is client2  # Same object by reference
```

---

## When to Use

Use `@cache.local()` when:

| Scenario | Example | Why |
|----------|---------|-----|
| **SDK connections** | Langfuse, httpx.Client, gRPC stubs | Maintain session state, reuse TCP connection |
| **ML models** | Loaded transformers, embedders | In-memory weight matrices, can't be serialized |
| **Database connections** | SQLAlchemy session, asyncpg connection | Stateful, must be re-created per process |
| **Language runtime objects** | Java objects via jpype, R objects via rpy2 | Opaque to Python, cross-language marshalling complex |

**NOT for**: Serializable data (dicts, dataframes, JSON). Use `@cache` for those.

---

## Quick Start

```python notest
import sqlalchemy
from cachekit import cache

# Cache a connection pool (same instance reused)
@cache.local()
def get_database_client(host: str):
    return sqlalchemy.create_engine(f"postgresql://{host}/mydb")

# First call: creates client
db1 = get_database_client("localhost")

# Second call with same args: returns cached instance
db2 = get_database_client("localhost")

assert db1 is db2  # Guaranteed same object
```

Configure TTL and size:

```python notest
import transformers
from cachekit import cache

@cache.local(ttl=600, max_entries=128)
def get_model(model_name: str):
    return transformers.AutoModel.from_pretrained(model_name)
```

---

## Mutation Warning ⚠️

**Critical**: Cached objects are returned by reference. Mutations affect all callers.

```python
from cachekit import cache

@cache.local()
def get_config_dict():
    return {"timeout": 30, "retries": 3}

config1 = get_config_dict()
config1["timeout"] = 999  # Mutation!

config2 = get_config_dict()
assert config2["timeout"] == 999  # 999, not 30: same object!
```

**Fix**: Copy if you need to mutate:

```python
import copy
from cachekit import cache

@cache.local()
def get_safe_config():
    return {"timeout": 30, "retries": 3}

config = copy.copy(get_safe_config())  # Shallow copy
config["timeout"] = 999  # Safe: original unchanged
assert get_safe_config()["timeout"] == 30
```
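
A shallow copy only protects the top level of the cached object. The sketch below uses the standard library alone to show where `copy.copy` stops helping and `copy.deepcopy` is needed. The plain dict `cached` is a stand-in for whatever a `@cache.local()` function returned, and its keys are illustrative.

```python
import copy

# Stand-in for an object returned by a @cache.local() function (illustrative data).
cached = {"timeout": 30, "retry": {"max": 3}}

shallow = copy.copy(cached)
shallow["timeout"] = 999       # Top-level change: the cached object is unaffected
shallow["retry"]["max"] = 10   # Nested dict is shared: the cached object changes too

deep = copy.deepcopy(cached)
deep["retry"]["max"] = 50      # Fully independent copy

assert cached["timeout"] == 30
assert cached["retry"]["max"] == 10  # Leaked through the shallow copy
assert deep["retry"]["max"] == 50
```

Deep copies suit plain data. Stateful objects such as connections or loaded models generally can't be copied in a meaningful way, which is the reason they are cached by reference in the first place.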

---

## Identity Semantics

**Same args = same object**:

```python
from cachekit import cache

@cache.local()
def get_client(api_key: str):
    return {"key": api_key, "session": object()}

client_a = get_client("key123")
client_b = get_client("key123")
assert client_a is client_b  # True (identity check)

client_c = get_client("key456")
assert client_a is not client_c  # Different args, different object
```

**Feature for connection reuse**:
- Multiple callers within same process reuse same socket, session state, credentials
- Reduces memory overhead, avoids re-authentication

**Footgun for mutable data**:
- Mutations visible to all callers
- Use `@cache` (which serializes) if you need isolation
- Or manually copy on retrieval

---

## Object Lifecycle

Cached objects are held strongly until eviction:

```text
Function call with args A
[Check cache]
├─ Hit: Return cached object (same reference, no copy)
└─ Miss: Call function, store result
[Store in LRU cache (max_entries)]
[LRU eviction when full]
└─ Oldest/least-used entry removed
[Object eligible for garbage collection if no other refs]
```

**Strong reference guarantee**: Cached object won't be garbage-collected until:
1. Evicted from cache (LRU eviction or TTL expiry), OR
2. Cache cleared via `cache_clear()`, OR
3. Function invalidated via `invalidate_cache()`
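
The lifecycle above can be illustrated with a toy store. This is a minimal sketch of a TTL + LRU cache that holds strong references, not cachekit's actual implementation. The class `ReferenceCache`, its `get_or_create` method, and the injectable `clock` are all hypothetical names chosen for illustration.

```python
import time
from collections import OrderedDict


class ReferenceCache:
    """Toy TTL + LRU store that keeps strong references to its values."""

    def __init__(self, ttl=300.0, max_entries=256, clock=time.monotonic):
        self.ttl = ttl
        self.max_entries = max_entries
        self._clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get_or_create(self, key, factory):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self._entries.move_to_end(key)  # Mark as most recently used
            return entry[1]                 # Hit: same object, no copy
        value = factory()                   # Miss or expired: build a new object
        self._entries[key] = (now + self.ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # Drop the least recently used entry
        return value


store = ReferenceCache(ttl=60, max_entries=2)
a = store.get_or_create("a", object)
assert store.get_or_create("a", object) is a  # Hit returns the same reference
store.get_or_create("b", object)
store.get_or_create("c", object)              # Exceeds max_entries, evicts "a"
assert store.get_or_create("a", object) is not a
```

Once an entry is dropped from `_entries`, the store no longer references the object, so it becomes collectable as soon as no caller holds it.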

---

## Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `ttl` | int | 300 | Time-to-live in seconds. Entries are evicted once the TTL expires. |
| `max_entries` | int | 256 | Maximum number of cached entries. The least recently used entry is evicted when the cache is full. |
| `namespace` | str | None | Optional key namespace for multi-tenant isolation. |
| `key` | callable | None | Custom key function. Default: `(args, kwargs)` hash. |
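
To make the `key` parameter more concrete, here is a sketch of a key function that produces a deterministic key even for unhashable arguments. The exact signature cachekit passes to `key` is not stated here, so this assumes it receives the same positional and keyword arguments as the decorated function. `stable_key` is a hypothetical helper, not part of cachekit.

```python
import hashlib
import json


def stable_key(*args, **kwargs) -> str:
    """Hypothetical key function: hash a canonical JSON form of the arguments."""
    # sort_keys makes dict ordering irrelevant; default=repr covers non-JSON values.
    payload = json.dumps([args, kwargs], sort_keys=True, default=repr)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Dicts and lists are unhashable, but still map to a deterministic key.
k1 = stable_key({"host": "localhost", "port": 5432}, tags=["a", "b"])
k2 = stable_key({"port": 5432, "host": "localhost"}, tags=["a", "b"])
assert k1 == k2  # Dict ordering does not matter
assert k1 != stable_key({"host": "other", "port": 5432}, tags=["a", "b"])
```

Falling back to `repr` is only stable for objects whose `repr` does not embed a memory address, so arguments of arbitrary classes may need a dedicated key.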

---

## Wrapper API

All cachekit decorators expose the same cache management interface:

```python
from cachekit import cache

@cache.local(ttl=300)
def get_client(api_key: str):
    return {"key": api_key}

get_client("key1") # miss
get_client("key1") # hit

info = get_client.cache_info()
assert info.hits == 1
assert info.misses == 1
assert info.maxsize == 256
assert info.currsize == 1

get_client.invalidate_cache("key1") # Remove specific entry
assert get_client.cache_info().currsize == 0

get_client("key2")
get_client.cache_clear() # Remove all entries
assert get_client.cache_info().currsize == 0

raw = get_client.__wrapped__("key3") # Bypass cache
assert get_client.cache_info().currsize == 0 # Not cached
```

Async variant:

```python notest
@cache.local()
async def get_async_client(api_key: str):
    return await create_client(api_key)

await get_async_client.ainvalidate_cache("my-api-key")
```

---

## Comparison

| Feature | `@cache.local()` | `functools.lru_cache` | `cachetools.TTLCache` |
|---------|:----------------:|:---------------------:|:---------------------:|
| **In-process only** | ✅ | ✅ | ✅ |
| **Distributed** | ❌ | ❌ | ❌ |
| **TTL support** | ✅ | ❌ | ✅ |
| **Unhashable args** (dicts, lists) | ✅ | ❌ | ❌ |
| **Async functions** | ✅ | ❌ | ✅ |
| **Per-key invalidation** | ✅ | ❌ | ❌ |
| **Thread-safe** | ✅ | ✅ | ❌ (needs lock) |
| **Hit/miss statistics** | ✅ | ✅ | ❌ (size only) |

**Why @cache.local() wins**:
- Accepts unhashable args (no need to convert to strings)
- TTL + LRU (best of both worlds)
- Async-native, not a decorator shim
- Per-key invalidation without clearing entire cache
- Thread-safe by default
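
For context on the unhashable-args row, the following standard-library sketch shows how `functools.lru_cache` rejects a dict argument and the usual workaround of converting it to a hashable form first. The function `connect` and its `options` parameter are illustrative only.

```python
import functools


@functools.lru_cache(maxsize=None)
def connect(options):
    return object()


try:
    connect({"host": "localhost"})  # dict argument is unhashable
    rejected = False
except TypeError:
    rejected = True

assert rejected

# Common workaround: convert to a hashable form before calling.
handle = connect(tuple(sorted({"host": "localhost"}.items())))
assert connect((("host", "localhost"),)) is handle
```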

---

## Future: Lifecycle Callbacks (v0.7)

**Planned** (not yet available):

```python notest
def on_evict(key, value):
    """Called when cache entry is evicted."""
    value.cleanup()  # Close connection, free memory, etc.

@cache.local(on_evict=on_evict)
def get_connection():
    return Database.connect()
```

For now, manually call cleanup when needed:

```python notest
get_connection.cache_clear()  # Evicts all entries; call each object's cleanup yourself beforehand if it needs one
```

---

## Examples

**ML model caching**:
```python notest
import torch
from cachekit import cache

@cache.local(ttl=3600, max_entries=4)
def load_model(model_name: str):
    """Load once, reuse across requests."""
    return torch.hub.load("pytorch/vision", model_name, pretrained=True)

# First inference: loads model (slow)
embeddings1 = load_model("resnet50")(image1)

# Second inference: reuses model (fast)
embeddings2 = load_model("resnet50")(image2)
```

**Database session reuse**:
```python notest
import sqlalchemy
from sqlalchemy.orm import Session
from cachekit import cache

@cache.local()
def get_session(db_url: str) -> Session:
    """Reuse SQLAlchemy session per connection string."""
    engine = sqlalchemy.create_engine(db_url)
    return Session(engine)

# Multiple queries reuse same session
result1 = get_session("postgresql://...").query(User).all()
result2 = get_session("postgresql://...").query(Product).all()
```

**Async HTTP client**:
```python notest
import httpx
from cachekit import cache

@cache.local()
async def get_http_client(api_key: str):
    """Reuse HTTP client for connection pooling."""
    return httpx.AsyncClient(
        headers={"Authorization": f"Bearer {api_key}"},
        http2=True,
    )

# Both requests use same connection pool
response1 = await (await get_http_client("key123")).get("https://api.example.com/users")
response2 = await (await get_http_client("key123")).get("https://api.example.com/products")
```
19 changes: 18 additions & 1 deletion src/cachekit/decorators/intent.py
@@ -1,6 +1,7 @@
"""Intent-based cache decorator interface.

Provides the @cache decorator with intent-based variants (@cache.minimal, @cache.production, @cache.secure, @cache.dev, @cache.test).
Provides the @cache decorator with intent-based variants (@cache.minimal, @cache.production,
@cache.secure, @cache.dev, @cache.test, @cache.local).
"""

from __future__ import annotations
@@ -103,6 +104,21 @@ def cache(
"""

    def decorator(f: F) -> F:
        # LOCAL INTENT: short-circuit before any DecoratorConfig resolution.
        # Must be first: the backend pop and l1_enabled mapping below would
        # silently consume kwargs that create_local_wrapper must reject.
        if _intent == "local":
            if config is not None:
                raise TypeError(
                    "@cache.local() does not accept config=. DecoratorConfig configures "
                    "backends, serialization, and encryption, none of which apply to "
                    "in-process reference caching. Pass parameters directly: "
                    "@cache.local(ttl=300, max_entries=256)"
                )
            from .local_wrapper import create_local_wrapper

            return create_local_wrapper(f, **manual_overrides)  # type: ignore[return-value]

        # Resolve backend at decorator application time
        # Track if backend=None was explicitly passed (L1-only mode)
        # This is a sentinel problem: we need to distinguish between:
@@ -193,4 +209,5 @@ def decorator(f: F) -> F:
cache.dev = functools.partial(cache, _intent="dev") # type: ignore[attr-defined]
cache.test = functools.partial(cache, _intent="test") # type: ignore[attr-defined]
cache.io = functools.partial(cache, _intent="io") # type: ignore[attr-defined] # SaaS backend
cache.local = functools.partial(cache, _intent="local") # type: ignore[attr-defined]
# Note: L1-only mode requires explicit backend=None parameter (no preset decorator)
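
The dispatch pattern in this diff, a `functools.partial` preset plus an early short-circuit that rejects `config=`, can be sketched in isolation. This is a simplified toy, not the code in `intent.py`: the `func` parameter, the `local_options` attribute, and the shortened error message are assumptions made so that the example runs on its own.

```python
import functools


def cache(func=None, *, _intent=None, config=None, **overrides):
    """Toy dispatcher; not cachekit's actual implementation."""

    def decorator(f):
        if _intent == "local":
            # Reject config= before any other option handling can consume kwargs.
            if config is not None:
                raise TypeError("@cache.local() does not accept config=")
            f.local_options = overrides  # Stand-in for create_local_wrapper(f, **overrides)
            return f
        return f  # Other intents omitted in this sketch

    return decorator if func is None else decorator(func)


# Attach the preset as an attribute, mirroring the pattern in the diff.
cache.local = functools.partial(cache, _intent="local")


@cache.local(ttl=300)
def get_client():
    return object()


assert get_client.local_options == {"ttl": 300}

try:
    cache.local(config=object())(get_client)
    raised = False
except TypeError:
    raised = True

assert raised
```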