Release 0.20.0

@tercel tercel released this 05 May 07:10
· 28 commits to main since this release

Added

  • Pluggable OverridesStore interface (sync alignment, CRITICAL #1) — New apcore.sys_modules.overrides module exposes the OverridesStore Protocol with default InMemoryOverridesStore and FileOverridesStore (atomic YAML write via tempfile + os.replace) implementations, mirroring TypeScript's apcore-typescript/src/sys-modules/overrides.ts (OverridesStore / InMemoryOverridesStore / FileOverridesStore). register_sys_modules(..., overrides_store=...) accepts any OverridesStore; loaded overrides are applied to Config (and the live ToggleState for toggle.* keys) at startup, and UpdateConfigModule / ToggleFeatureModule persist back through the store on every successful mutation. The legacy sys_modules.control.overrides_path config key is retained as a backwards-compat shim that auto-constructs a FileOverridesStore. The previously-private _load_overrides helper now delegates to FileOverridesStore for symmetry. New top-level exports: OverridesStore, InMemoryOverridesStore, FileOverridesStore.

  • Public SubscriberFactory API (Issue #36): apcore.events.register_subscriber_factory(type_name, factory) and apcore.events.create_subscriber_from_config(config) (also re-exported from apcore) bring Python to parity with TypeScript's createSubscriberFromConfig / registerSubscriberFactory and Rust's create_subscriber / register_factory. Built-in factory types (webhook, a2a, file, stdout, filter) are auto-registered on import. The previously-private _create_subscriber helper remains for back-compat.

  • Pipeline StepMiddleware (Issue #33 §2.2) — Formal middleware mechanism for pipeline steps. New StepMiddleware Protocol exposes optional before_step(step_name, state), after_step(step_name, state, result), and on_step_error(step_name, state, error) hooks; both sync and async implementations are supported (return values detected via inspect.isawaitable(), mirroring the Issue #42 async on_error fix). ExecutionStrategy gains a step_middlewares: list[StepMiddleware] field plus an add_step_middleware() registration method. before_step and on_step_error run in registration order; after_step runs in reverse (onion semantics). A non-None return from on_step_error is treated as a recovery StepResult and execution continues normally; returning None lets the original exception propagate. Exported from apcore as StepMiddleware.

  • apcore.observability.batch_span_processor module (Issue #43 §2) — Dedicated module hosting the canonical BatchSpanProcessor and SimpleSpanProcessor implementations, mirroring the layout of the TypeScript (src/observability/batch-span-processor.ts) and Rust (src/observability/processor.rs) SDKs. Adds a synchronous force_flush(timeout_ms=30000) -> bool method that drains the queue while the processor remains alive (returns True once empty, False on deadline). shutdown() is now idempotent. Queue-full enqueues now log a (rate-limited) WARNING so dropped spans surface in operator logs rather than only via the spans_dropped counter. The classes remain re-exported from apcore.observability.tracing and apcore.observability for backward compatibility.

  • Generic pluggable StorageBackend (PROTOCOL_SPEC Issue #43 §1) — New apcore.observability.storage module defines a namespaced save / get / list / delete Protocol and the default InMemoryStorageBackend implementation, mirroring the shape of TaskStore. ErrorHistory, MetricsCollector, and UsageCollector now accept an optional storage: StorageBackend | None = None constructor argument; when supplied, error entries are mirrored to the backend's "errors" namespace keyed by fingerprint, enabling cross-process persistence. External backends (Redis, SQL, S3) remain user-supplied. Exports: StorageBackend, InMemoryStorageBackend from apcore.observability and the top-level apcore package.

  • TaskStore.list_expired(before_timestamp) (cross-language alignment D-10) — New method on the TaskStore Protocol returning terminal-state (COMPLETED / FAILED / CANCELLED) tasks whose completed_at precedes before_timestamp. Implemented on InMemoryTaskStore. Drives TTL-based reaper logic; non-terminal tasks are never returned. The method is REQUIRED on the Protocol — custom stores written before this release must add an implementation.

  • Registry.discover_multi_class(file_path, extensions_root="extensions") (cross-language alignment D-15) — New instance method on Registry wrapping the existing free function apcore.registry.multi_class.discover_multi_class. The registry's configured pre_approval_hook is forwarded to the underlying scanner so signature-verification and audit policies apply uniformly. The free function remains importable for existing callers; new code SHOULD prefer the method.

  • Granular reload via path_filter input in ReloadModule (#45.4) — Registry.discover(path_filter=...) accepts a glob string or list of patterns and only walks matching files; previously-registered modules outside the filter remain untouched. Patterns are matched (via pathlib.PurePath.match) against both the absolute file path and its path relative to each configured extension root.

  • Error fingerprinting in ErrorHistory — dedup by (error_code, top-frame hash, sanitized message template) (#43 §4). New compute_error_fingerprint(error, module_id) folds the deepest stack-frame file:lineno:func (basename only, for cross-machine stability) into the SHA-256 digest in addition to the existing code/module/normalized-message inputs. Long hex runs (≥ 8 chars) are now collapsed to <HEX> alongside the existing UUID/timestamp/integer placeholders. Legacy 3-arg compute_fingerprint retained.

  • Configurable redaction via obs.redaction.regex_patterns and obs.redaction.sensitive_keys Config keys (#43 §5). New obs namespace ships with sensible defaults (password, secret, token, api_key, authorization, cookie, _secret_*, …); operators can override via apcore.yaml. RedactionConfig.from_config(config) / RedactionConfig.default() build the runtime config; _secret_ prefix matching becomes a default entry rather than a hard-coded rule. Field-name match is case-insensitive substring with -/_/space normalization (so "X-API-Key" matches "api_key"); value-regex match is case-insensitive. apcore.utils.redaction.redact_sensitive accepts new keyword overrides (sensitive_keys, regex_patterns, replacement).
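
The atomic write described for FileOverridesStore (tempfile + os.replace) can be sketched as follows. This is a minimal illustration, not apcore's implementation: the class names, the load/save method names, and the use of JSON instead of YAML (to stay within the standard library) are all assumptions.

```python
from __future__ import annotations

import json
import os
import tempfile
from typing import Any, Protocol


class OverridesStoreSketch(Protocol):
    """Hypothetical stand-in for the OverridesStore Protocol; method names are assumed."""

    def load(self) -> dict[str, Any]: ...

    def save(self, overrides: dict[str, Any]) -> None: ...


class JsonFileOverridesStore:
    """Illustrative file-backed store (JSON rather than YAML, to avoid a dependency)."""

    def __init__(self, path: str) -> None:
        self._path = path

    def load(self) -> dict[str, Any]:
        if not os.path.exists(self._path):
            return {}
        with open(self._path, encoding="utf-8") as fh:
            return json.load(fh)

    def save(self, overrides: dict[str, Any]) -> None:
        directory = os.path.dirname(os.path.abspath(self._path))
        # Create the temp file next to the target so os.replace stays on one filesystem.
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as fh:
                json.dump(overrides, fh, indent=2, sort_keys=True)
                fh.flush()
                os.fsync(fh.fileno())
            # Readers see either the old file or the new one, never a partial write.
            os.replace(tmp_path, self._path)
        except BaseException:
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)
            raise
```

Writing to a sibling temp file and renaming it over the target means a crash mid-write leaves the previous overrides file intact.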
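
The StepMiddleware hook ordering (before_step and on_step_error in registration order, after_step in reverse) can be illustrated with a small stand-alone runner. The run_step function below is hypothetical and does not use apcore's pipeline engine. It also assumes that the first non-None recovery wins and that after_step hooks still run on a recovered result; the release notes do not spell out either detail.

```python
import asyncio
import inspect
from typing import Any


async def _maybe_await(value: Any) -> Any:
    """Await the value only if it is awaitable, so sync and async hooks both work."""
    if inspect.isawaitable(value):
        return await value
    return value


async def run_step(step_name, step_fn, state, middlewares):
    """Hypothetical runner showing the documented hook ordering."""
    for mw in middlewares:  # registration order
        hook = getattr(mw, "before_step", None)
        if hook is not None:
            await _maybe_await(hook(step_name, state))
    try:
        result = await _maybe_await(step_fn(state))
    except Exception as error:
        for mw in middlewares:  # registration order
            hook = getattr(mw, "on_step_error", None)
            if hook is None:
                continue
            recovery = await _maybe_await(hook(step_name, state, error))
            if recovery is not None:
                result = recovery  # treated as the step's result (assumed: first one wins)
                break
        else:
            raise  # no middleware recovered, so the original exception propagates
    for mw in reversed(middlewares):  # reverse order (onion semantics)
        hook = getattr(mw, "after_step", None)
        if hook is not None:
            await _maybe_await(hook(step_name, state, result))
    return result
```

Because every hook is looked up with getattr, a middleware only needs to define the hooks it cares about, which matches the "optional hooks" wording above.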
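
The path_filter matching rule (each pattern is tried against the absolute path and against the path relative to each extension root) might look roughly like this. The function name and arguments are hypothetical, and PurePosixPath is used so the sketch behaves the same on every platform.

```python
from pathlib import PurePosixPath


def matches_path_filter(file_path, extension_roots, path_filter):
    """Return True if any pattern matches the absolute or a root-relative path.

    Hypothetical helper; apcore's own discovery code may resolve roots differently.
    """
    patterns = [path_filter] if isinstance(path_filter, str) else list(path_filter)
    absolute = PurePosixPath(file_path)
    candidates = [absolute]
    for root in extension_roots:
        try:
            candidates.append(absolute.relative_to(root))
        except ValueError:
            pass  # the file does not live under this root
    return any(path.match(pattern) for path in candidates for pattern in patterns)
```

Note that PurePath.match anchors relative patterns at the right-hand end of the path, so a pattern such as billing/*.py matches regardless of where the extension root is mounted.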
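
The field-name rule for redaction (case-insensitive substring match after normalizing -, _ and spaces) is easier to see in code. The helpers below are a hypothetical sketch rather than apcore.utils.redaction itself. In particular, replacing the whole value when a regex matches, and the default replacement string, are assumptions.

```python
from __future__ import annotations

import re
from typing import Any, Iterable


def _normalize_key(name: str) -> str:
    """Collapse runs of '-', '_' and whitespace to '_' and lower-case the name."""
    return re.sub(r"[-_\s]+", "_", name).lower()


def is_sensitive_key(field_name: str, sensitive_keys: Iterable[str]) -> bool:
    normalized = _normalize_key(field_name)
    return any(_normalize_key(key) in normalized for key in sensitive_keys)


def redact_sketch(
    data: Any,
    sensitive_keys: Iterable[str],
    regex_patterns: Iterable[str] = (),
    replacement: str = "***REDACTED***",  # assumed placeholder, not apcore's default
) -> Any:
    keys = list(sensitive_keys)
    compiled = [re.compile(pattern, re.IGNORECASE) for pattern in regex_patterns]

    def _walk(value: Any) -> Any:
        if isinstance(value, dict):
            return {
                k: replacement if is_sensitive_key(str(k), keys) else _walk(v)
                for k, v in value.items()
            }
        if isinstance(value, list):
            return [_walk(item) for item in value]
        if isinstance(value, str) and any(p.search(value) for p in compiled):
            return replacement  # assumption: the whole value is replaced on a match
        return value

    return _walk(data)
```

With this normalization, "X-API-Key" becomes "x_api_key", which contains "api_key" and is therefore redacted, as in the example given in the redaction entry.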

Changed

  • Event names normalized to apcore.<subsystem>.<event> form (#36) — Four legacy event types (module_registered, module_unregistered, error_threshold_exceeded, latency_threshold_exceeded) now also emit canonical aliases apcore.registry.module_registered, apcore.registry.module_unregistered, apcore.health.error_threshold_exceeded, apcore.health.latency_threshold_exceeded. Both forms are emitted during the deprecation window so existing subscribers keep working; the legacy emission carries deprecated: true in data. Glob subscribers using apcore.registry.* and apcore.health.* now match correctly. Deprecation: legacy bare names will be removed in v0.22.0.
  • Contextual auditing for system control modules (Issue #45.2) — Audit events emitted by system.control.update_config (apcore.config.updated), system.control.toggle_feature (apcore.module.toggled), and system.control.reload_module (apcore.module.reloaded) now include the requester's caller_id from context.caller_id (defaults to the @external sentinel when unset) and a redacted identity dict (id, type, roles) when context.identity is present.
  • Pipeline configuration is fail-fast (Issue #33 §1.2): build_strategy_from_config now raises ConfigurationError (new typed error, code PIPELINE_CONFIGURATION_ERROR) instead of logging a warning when YAML refers to a step that does not exist (in remove, configure, or insert.before/insert.after), assigns an unknown field via configure, or omits both after and before anchors on an inserted step. Misconfigurations now surface at start-up rather than producing inscrutable runtime failures.
  • Pipeline strategy dependency validation is fail-fast (Issue #33 §2.1): ExecutionStrategy.__init__ and insert_after/insert_before now raise PipelineDependencyError (new typed error, code PIPELINE_DEPENDENCY_ERROR) when a step's requires keys are not provided by any preceding step's provides. The error names the offending step and the missing keys. A new validate_dependencies: bool = True keyword on ExecutionStrategy.__init__ lets internal callers (e.g. Executor.stream's post-stream sub-strategy) opt out when assembling derived strategies from an already-validated parent. Both new errors are exported from apcore.
  • Cross-language alignment (sync A-001) — Renamed CircuitOpenError (code CIRCUIT_OPEN) to canonical CircuitBreakerOpenError (code CIRCUIT_BREAKER_OPEN) to match TypeScript and Rust SDKs and the protocol spec. The legacy CircuitOpenError class is retained as a deprecated subclass alias of CircuitBreakerOpenError so existing except CircuitOpenError: blocks raising the legacy class continue to work; the legacy class will be removed in a future major release. The wire error code emitted by CircuitBreakerMiddleware is now CIRCUIT_BREAKER_OPEN for both classes. New ErrorCodes.CIRCUIT_BREAKER_OPEN constant added; ErrorCodes.CIRCUIT_OPEN retained as a deprecated alias. CircuitBreakerOpenError is exported from the top-level apcore package.
  • TaskStore.put → save (cross-language alignment D-10) — Renamed the canonical write method on the TaskStore Protocol. InMemoryTaskStore.put is retained as a deprecated shim that delegates to save and emits a DeprecationWarning; it will be removed in a future minor release. Internal AsyncTaskManager calls now route through a _save helper that prefers save and falls back to put for legacy custom stores.
  • TaskStatus.RETRYING removed (cross-language alignment D-12) — During retry backoff, the task status is now TaskStatus.PENDING to match the TypeScript and Rust SDKs and the protocol spec. TaskStatus.RETRYING remains accessible for one minor release as a deprecated attribute that resolves to TaskStatus.PENDING and emits a DeprecationWarning on access. The "retrying" enum value is no longer present in TaskStatus.__members__.
  • TaskInfo.attempt_number → retry_count (cross-language alignment D-13) — Renamed the dataclass field. attempt_number is retained as a deprecated property (with both getter and setter) that reads/writes retry_count and emits a DeprecationWarning. It will be removed in a future minor release.
  • ErrorHistory eviction is min-heap-based (PROTOCOL_SPEC Issue #43 §3) — Confirmed the in-place O(log N) min-heap eviction keyed on last_occurred with lazy deletion of stale entries from dedup-driven timestamp refreshes. Replaces the prior O(excess × M) linear scan for the global-oldest entry; per-insert eviction cost is bounded regardless of the number of tracked modules.
  • AsyncTaskManager.start_reaper aligned with TS / Rust D-11 surface — accepts ttl_seconds (seconds) and sweep_interval_ms (milliseconds) keyword arguments and returns a new ReaperHandle (with stop() / is_running()). The legacy interval_seconds / max_age_seconds arguments still work but emit DeprecationWarning; passing both legacy and new aliases for the same value raises TypeError. ReaperHandle is exported from apcore.async_task.
  • AsyncTaskManager.start_reaper default sweep_interval_ms aligned to 300_000 (sync alignment, WARNING #5) — Default sweep cadence changed from 3_600_000 ms (1 hour) to 300_000 ms (5 minutes), matching TypeScript and Rust. Callers that relied on the 1-hour default must now pass sweep_interval_ms=3_600_000 explicitly.
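
The requires/provides check behind the fail-fast dependency validation can be sketched in a few lines. StepSketch, PipelineDependencyErrorSketch and validate_dependencies below are hypothetical stand-ins; the real ExecutionStrategy and PipelineDependencyError may carry more state.

```python
from __future__ import annotations

from dataclasses import dataclass


class PipelineDependencyErrorSketch(Exception):
    """Hypothetical stand-in for PipelineDependencyError."""


@dataclass(frozen=True)
class StepSketch:
    name: str
    requires: frozenset[str] = frozenset()
    provides: frozenset[str] = frozenset()


def validate_dependencies(steps: list[StepSketch]) -> None:
    """Raise if any step requires a key that no preceding step provides."""
    available: set[str] = set()
    for step in steps:
        missing = step.requires - available
        if missing:
            # Name both the offending step and the missing keys, as the entry describes.
            raise PipelineDependencyErrorSketch(
                f"step {step.name!r} requires {sorted(missing)}, "
                "which no preceding step provides"
            )
        available |= step.provides
```

Running this once at construction time (and again after each insert) is what moves the failure from the first execution to start-up.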
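
Several renames above keep the old name alive as a deprecated alias. For the attempt_number to retry_count rename, the pattern can be sketched like this; TaskInfoSketch is a hypothetical reduced dataclass, not the real TaskInfo, and the warning text is invented for illustration.

```python
import warnings
from dataclasses import dataclass


@dataclass
class TaskInfoSketch:
    """Hypothetical cut-down TaskInfo with only the fields needed for the example."""

    task_id: str
    retry_count: int = 0

    @property
    def attempt_number(self) -> int:
        # Deprecated alias: reads through to retry_count.
        warnings.warn(
            "attempt_number is deprecated; use retry_count",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.retry_count

    @attempt_number.setter
    def attempt_number(self, value: int) -> None:
        # Deprecated alias: writes through to retry_count.
        warnings.warn(
            "attempt_number is deprecated; use retry_count",
            DeprecationWarning,
            stacklevel=2,
        )
        self.retry_count = value
```

Because the property has no annotation, the dataclass machinery does not treat it as a field, so it never appears in the generated __init__ or repr.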

Fixed

  • Async on_error middleware now detects awaitable return values via inspect.isawaitable(...) rather than inspect.iscoroutinefunction(mw.on_error) (#42). The previous gate missed functools.partial wrappers and decorator-wrapped async handlers (no __wrapped__), causing the recovery coroutine to be silently dropped — isinstance(recovery, dict) then evaluated against an un-awaited coroutine and the chain aborted. The same fix applies to execute_before and execute_after. Truly synchronous handlers continue to run through asyncio.to_thread so blocking calls (time.sleep in RetryMiddleware) do not stall the event loop.
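
A short stand-alone reproduction of the problem this fix addresses: a decorator that wraps an async handler in a plain function hides it from inspect.iscoroutinefunction, while inspecting the return value still works. The audited decorator and call_handler helper are hypothetical, and the sketch leaves out the asyncio.to_thread path used for truly synchronous handlers.

```python
import asyncio
import inspect


def audited(fn):
    """Plain decorator without functools.wraps; the wrapper itself is a sync function."""

    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)  # returns a coroutine object when fn is async

    return wrapper


@audited
async def on_error(error):
    return {"recovered": True, "reason": str(error)}


async def call_handler(handler, error):
    """Call the handler and await the result only if it turns out to be awaitable."""
    result = handler(error)
    if inspect.isawaitable(result):  # inspect the return value, not the function
        result = await result
    return result
```

Here inspect.iscoroutinefunction(on_error) is False because the decorated name refers to the sync wrapper, so a gate based on it would leave the coroutine un-awaited.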

Added — PROTOCOL_SPEC hardening (Issues #32–#45)

  • AsyncTaskManager Evolution (PROTOCOL_SPEC Issue #34) — Pluggable TaskStore protocol with InMemoryTaskStore default; custom backends (Redis, SQL) can be injected at construction time. Per-task retry configuration via new RetryPolicy dataclass (max_retries, retry_delay_ms, backoff_multiplier, max_retry_delay_ms) and BackoffStrategy enum; tasks move to TaskStatus.RETRYING between attempts and FAILED after exhaustion. AsyncTaskManager.start_reaper(interval_seconds, max_age_seconds) / stop_reaper() — opt-in background task for automatic TTL-based deletion of terminal-state (COMPLETED, FAILED, CANCELLED) tasks. Exports: TaskStore, InMemoryTaskStore, RetryPolicy, BackoffStrategy.
  • Observability Hardening (PROTOCOL_SPEC Issue #43) — Pluggable ObservabilityStore protocol with InMemoryObservabilityStore default (apcore.observability.store). BatchSpanProcessor for non-blocking OTEL span export with configurable queue and drop-on-full spans_dropped counter (now exported from apcore.observability.tracing). O(log N) ErrorHistory eviction via min-heap keyed on last_occurred plus O(1) fingerprint index replacing prior O(M) ring-buffer scan. compute_fingerprint() — SHA-256 content-addressable error deduplication with UUID/timestamp normalization (exported from apcore.observability.error_history). RedactionConfig in ContextLogger for glob field_patterns and regex value_patterns applied at log time. PrometheusExporter HTTP server serving /metrics (Prometheus text format), /healthz (liveness), and /readyz (readiness) endpoints (apcore.observability.prometheus_exporter). MetricsCollector.export_prometheus() emits apcore_module_calls_total, apcore_module_errors_total, apcore_module_duration_seconds. UsageCollector.export_prometheus() emits apcore_usage_calls_total, apcore_usage_error_rate, apcore_usage_p50/p95/p99_latency_ms.
  • System Modules Hardening (PROTOCOL_SPEC Issue #45) — overrides_path parameter for register_sys_modules() loads a YAML/JSON override file after base config on startup (via AuditStore / OverridesStore pattern). Structured audit trail: AuditEntry, AuditStore protocol, and InMemoryAuditStore default record all state-modifying control-module calls with timestamp, action, actor_id, actor_type, trace_id, and before/after change dict (apcore.sys_modules.audit). fail_on_error: bool = False on register_sys_modules() — when True raises SysModuleRegistrationError; when False (default) logs ERROR and continues. path_filter glob on system.control.reload_module for bulk reload in dependency topological order; mutually exclusive with module_id (raises ModuleReloadConflictError on conflict). New error classes exported from apcore: SysModuleRegistrationError (code SYS_MODULE_REGISTRATION_FAILED), ModuleReloadConflictError (code MODULE_RELOAD_CONFLICT).
  • Schema System Hardening (PROTOCOL_SPEC Issue #44) — New apcore.schema.hardening module: content_hash(schema) returns the SHA-256 of canonical JSON for content-addressable schema deduplication; validate_schema_dict(schema, data) uses Draft202012Validator to exhaustively evaluate all anyOf/oneOf branches (no short-circuit), resolve recursive $ref, enforce numerical/string constraints, and emit SHOULD-level warnings (not hard errors) on unrecognized format values. Conformance fixtures added: schema_hardening_union.json, schema_hardening_recursive.json, schema_hardening_constraints.json, schema_hardening_formats.json, schema_hardening_cache.json.
  • Multi-Class Module Discovery (PROTOCOL_SPEC §2.1.1) — New apcore.registry.multi_class module: @multi_class decorator opts a class into multi-class per-file scanning; class_name_to_segment() derives a snake_case ID segment from a class name (CamelCase → snake_case); discover_multi_class() scans a file and produces IDs of the form base_id.class_segment. Single-class files with one decorated class receive the bare base_id (backward-compatible). ModuleIdConflictError (code MODULE_ID_CONFLICT) raised when two classes in the same file produce identical snake_case segments.
  • Middleware Architecture Hardening (PROTOCOL_SPEC Issue #42) — CircuitBreakerMiddleware tracks per-module consecutive failures in a rolling window; transitions through CLOSED → OPEN → HALF_OPEN state machine. When OPEN, before() raises CircuitOpenError (code CIRCUIT_OPEN) to short-circuit execution entirely. On state changes emits apcore.circuit.opened / apcore.circuit.closed events. CircuitState enum and CircuitBreakerMiddleware are exported from apcore.
  • Event Management Hardening (PROTOCOL_SPEC Issue #36) — CircuitBreakerWrapper in apcore.events.circuit_breaker wraps any EventSubscriber with independent circuit-breaker protection (CLOSED/OPEN/HALF_OPEN state machine, configurable open_threshold, recovery_window_ms); emits apcore.subscriber.circuit_opened / apcore.subscriber.circuit_closed events via the parent EventEmitter.
  • Conformance test suite expansion: tests/conformance/test_pipeline_hardening.py (5 cases: fail-fast, continue-on-ignored-error, replace semantic, run_until termination, O(1) lookup), tests/conformance/test_schema_hardening.py (35 cases across union, recursive, constraints, formats, cache fixtures), tests/conformance/test_system_modules_hardening.py (10 cases: overrides persistence, audit entry extraction, Prometheus metrics, path_filter bulk reload, conflict error, fail_on_error behaviour).
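
The CLOSED → OPEN → HALF_OPEN state machine shared by CircuitBreakerMiddleware and CircuitBreakerWrapper can be sketched as below. This is a simplified, hypothetical model: it counts consecutive failures for a single target, omits the rolling window and event emission, and borrows only the open_threshold and recovery_window_ms names from the entries above. The method names and the injectable clock are assumptions.

```python
from __future__ import annotations

import time
from enum import Enum
from typing import Callable


class CircuitStateSketch(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitBreakerSketch:
    """Hypothetical single-target breaker; not apcore's implementation."""

    def __init__(
        self,
        open_threshold: int = 3,
        recovery_window_ms: int = 30_000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.open_threshold = open_threshold
        self.recovery_window_ms = recovery_window_ms
        self._clock = clock
        self._failures = 0
        self._opened_at = 0.0
        self.state = CircuitStateSketch.CLOSED

    def allow_request(self) -> bool:
        """Return False while OPEN; move to HALF_OPEN once the recovery window elapses."""
        if self.state is CircuitStateSketch.OPEN:
            elapsed_ms = (self._clock() - self._opened_at) * 1000
            if elapsed_ms < self.recovery_window_ms:
                return False
            self.state = CircuitStateSketch.HALF_OPEN  # let a probe call through
        return True

    def record_success(self) -> None:
        self._failures = 0
        self.state = CircuitStateSketch.CLOSED

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe in HALF_OPEN re-opens at once; otherwise wait for the threshold.
        if (
            self.state is CircuitStateSketch.HALF_OPEN
            or self._failures >= self.open_threshold
        ):
            self.state = CircuitStateSketch.OPEN
            self._opened_at = self._clock()
```

Passing the clock in as a parameter keeps the recovery window testable without sleeping.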
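
For the schema content_hash entry, "SHA-256 of canonical JSON" can be approximated with the standard library as shown here. The function name is hypothetical, and the canonicalization (sorted keys, compact separators, UTF-8) is an assumption; apcore's exact canonical form may differ.

```python
import hashlib
import json


def content_hash_sketch(schema: dict) -> str:
    """Hash a schema so that key order and whitespace do not affect the digest."""
    canonical = json.dumps(
        schema,
        sort_keys=True,          # stable key order
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two schemas that differ only in key order produce the same digest, which is what makes the hash usable as a deduplication key.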