feat(openfeature): emit server-side EVP flagevaluation #3984
leoromanovsky wants to merge 22 commits into
…ith PREP-01 libdatadog
- Enable 'flagevaluation-evp' feature on datadog-ffe dep (FfeFlagEvaluationBatch type now compiled)
- Fix components-rs/bytes.rs: update 4x VecMap::remove() -> remove_slow() for libdatadog compat post-commit 74284cac7 (VecMap API renamed); this unblocks compilation against the PREP-01 libdatadog ref
…patch
- Two-tier aggregation in components-rs/ffe.rs: full→degraded→drop-counted with caps GLOBAL_CAP=131072/PER_FLAG_CAP=10000/DEGRADED_CAP=32768
- Killswitch DD_FLAGGING_EVALUATION_COUNTS_ENABLED (default: on) via evp_enabled() in Rust and isEvpEnabled() in EvaluationMetricRecorder.php
- ddog_ffe_flush_flag_evaluation_batch() Rust C-export dispatches SidecarAction::FfeFlagEvaluationBatch via sidecar_blocking::enqueue_actions
- ddtrace_ffe_flush_flag_evaluation_batch() C wrapper in tracer/ffe.c mirrors existing exposure/metric flush pattern with sidecar globals
- RSHUTDOWN call added in tracer/ddtrace.c after existing flush calls
- 11 Rust unit tests covering both tiers, overflow, drain, killswitch
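A minimal sketch of the full→degraded→drop-counted fallback described in this commit, for readers who want the shape of the logic. Only the three cap values come from the commit message; the key layouts, the `Aggregator` type, and the exact order in which the caps are checked are assumptions made for illustration and may not match components-rs/ffe.rs.

```rust
use std::collections::HashMap;

// Cap values quoted in the commit message.
pub const GLOBAL_CAP: usize = 131_072;
pub const PER_FLAG_CAP: usize = 10_000;
pub const DEGRADED_CAP: usize = 32_768;

/// Hypothetical full-fidelity key: one bucket per flag, variant, and targeting key.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct FullKey {
    pub flag: String,
    pub variant: String,
    pub targeting_key: String,
}

/// Hypothetical degraded key: the targeting key is dropped to bound cardinality.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct DegradedKey {
    pub flag: String,
    pub variant: String,
}

/// Illustrative aggregator; caps are parameters so small values can be tested.
pub struct Aggregator {
    global_cap: usize,
    per_flag_cap: usize,
    degraded_cap: usize,
    pub full: HashMap<FullKey, u64>,
    pub degraded: HashMap<DegradedKey, u64>,
    per_flag_buckets: HashMap<String, usize>,
    pub dropped_degraded_overflow: u64,
}

impl Aggregator {
    pub fn new(global_cap: usize, per_flag_cap: usize, degraded_cap: usize) -> Self {
        Aggregator {
            global_cap,
            per_flag_cap,
            degraded_cap,
            full: HashMap::new(),
            degraded: HashMap::new(),
            per_flag_buckets: HashMap::new(),
            dropped_degraded_overflow: 0,
        }
    }

    pub fn record(&mut self, flag: &str, variant: &str, targeting_key: &str) {
        let full_key = FullKey {
            flag: flag.to_string(),
            variant: variant.to_string(),
            targeting_key: targeting_key.to_string(),
        };
        // An existing full bucket always absorbs the evaluation.
        if let Some(count) = self.full.get_mut(&full_key) {
            *count += 1;
            return;
        }
        // A new full bucket is allowed only while both caps have room.
        let flag_buckets = self.per_flag_buckets.get(flag).copied().unwrap_or(0);
        if self.full.len() < self.global_cap && flag_buckets < self.per_flag_cap {
            self.full.insert(full_key, 1);
            *self.per_flag_buckets.entry(flag.to_string()).or_insert(0) += 1;
            return;
        }
        // Otherwise fall back to the degraded tier.
        let degraded_key = DegradedKey {
            flag: flag.to_string(),
            variant: variant.to_string(),
        };
        if let Some(count) = self.degraded.get_mut(&degraded_key) {
            *count += 1;
            return;
        }
        if self.degraded.len() < self.degraded_cap {
            self.degraded.insert(degraded_key, 1);
            return;
        }
        // Both tiers are full: count the loss instead of growing further.
        self.dropped_degraded_overflow += 1;
    }
}
```

With the real caps this would be constructed as `Aggregator::new(GLOBAL_CAP, PER_FLAG_CAP, DEGRADED_CAP)`; passing small caps makes each fallback step easy to exercise in a unit test.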
…EVP aggregator race
ddog_ffe_evaluate() records into the global EVP_AGGREGATOR; without EVP_TEST_LOCK the test ran concurrently with degraded_tier_overflow tests, causing dropped_degraded_overflow to be 2 instead of 1.
… + regen Cargo.lock
Points dd-trace-php's libdatadog submodule at the local PREP-01 commit containing the flagevaluation EVP emitter (FfeFlagEvaluationBatch), so components-rs builds against it via the datadog-ffe path dep with the flagevaluation-evp feature.
NOTE: 89a2ba7fc is local/unpushed; re-point to the merged upstream libdatadog SHA before any PR.
The Rust C-export ddog_ffe_flush_flag_evaluation_batch (components-rs/ffe.rs) was added without a matching prototype in the committed cbindgen header components-rs/datadog.h. tracer/ffe.c calls it, so PHP8's stricter toolchain fails with -Werror=implicit-function-declaration (ddtrace.so link Error 2). PHP7 only warned and linked, masking the bug. Prototype matches the Rust signature (SidecarTransport**/InstanceId*/QueueId*/CharSlice x3).
…ow drops
The full-tier EVP flagevaluation drain previously emitted context: None and drained the degraded-overflow drop count silently.
- Full tier now carries the pruned evaluation context (shared prune_context bounds: <=256 fields, string values >256 bytes skipped) plus context.dd.service, matching the degraded tier's cap enforcement. The pruned context is captured once per bucket at insertion and carried verbatim into the drained event.
- The degraded-tier overflow drop counter is read-and-reset at drain and logged via tracing::warn when non-zero, so an undersized degradedCap is observable instead of a silent loss of legitimate counts.
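The two behaviours in this commit (bounded context pruning and a read-and-reset drop counter) can be sketched as follows. The bounds of 256 fields and 256 bytes come from the commit message; the `ContextValue` type, the function names, and the choice to skip oversized strings before applying the field limit are assumptions for illustration, not the shared prune_context implementation.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Bounds quoted in the commit message.
pub const MAX_CONTEXT_FIELDS: usize = 256;
pub const MAX_STRING_VALUE_BYTES: usize = 256;

/// Hypothetical context value type, used only for this sketch.
#[derive(Clone, Debug, PartialEq)]
pub enum ContextValue {
    Str(String),
    Num(f64),
    Bool(bool),
}

/// Skips string values longer than the byte bound, then keeps at most
/// MAX_CONTEXT_FIELDS of the remaining fields (ordering is an assumption).
pub fn prune_context_sketch(fields: &[(String, ContextValue)]) -> Vec<(String, ContextValue)> {
    fields
        .iter()
        .filter(|(_, value)| match value {
            ContextValue::Str(s) => s.len() <= MAX_STRING_VALUE_BYTES,
            _ => true,
        })
        .take(MAX_CONTEXT_FIELDS)
        .cloned()
        .collect()
}

/// Hypothetical global counter for evaluations lost to degraded-tier overflow.
pub static DROPPED_DEGRADED_OVERFLOW: AtomicU64 = AtomicU64::new(0);

/// Read-and-reset in one atomic step, so increments that race with the
/// drain are neither lost nor reported twice.
pub fn take_dropped_degraded_overflow() -> u64 {
    DROPPED_DEGRADED_OVERFLOW.swap(0, Ordering::Relaxed)
}
```

Using `swap` rather than a separate load and store is what makes the drain safe against concurrent increments; the caller would log the returned value when it is non-zero.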
…low surfacing
- ddog_ffe_evaluate_populates_evp_aggregator_for_flush / _respects_killswitch: drive the real FFI entry point ddog_ffe_evaluate (the function the PHP/C layer calls) and assert it feeds the aggregator that the sidecar flush drains, closing the 'unit-green but emits nothing' gap that earlier tests left uncovered.
- full_tier_event_carries_pruned_context / _prunes_oversized_string_values / _empty_context_emits_no_context_object: assert the full tier carries the pruned context and enforces the field/value bounds.
- drain_resets_degraded_overflow_drop_counter: assert drain reads-and-resets the observable overflow drop counter.
…ncode-safe wire + reliable enqueue)
Bump the libdatadog submodule to the bincode-safe flagevaluation fix (DataDog/libdatadog#2117): the worker->sidecar IPC is bincode, which the old serde_json::Value + skip_serializing_if wire types could not deserialize, so the sidecar silently dropped every batch.
- Stringify the pruned full-tier context (JSON object string) at drain so the bincode wire stays plain; the sidecar flusher re-expands it into a JSON object for the POST.
- Use sidecar_blocking::enqueue_actions_reliable for the one-shot RSHUTDOWN flush.
…2446-evp-flagevaluation-php # Conflicts: # libdatadog
Benchmarks [ tracer ]
Benchmark execution time: 2026-06-23 17:42:30
Comparing candidate commit 37176c9 in PR branch
Some scenarios are present only in baseline or only in candidate runs. If you didn't create or remove some scenarios in your branch, this may be a sign of crashed benchmarks 💥💥💥
Scenarios present only in candidate:
Found 3 performance improvements and 1 performance regression! Performance is the same for 190 metrics, 0 unstable metrics.
…2446-evp-flagevaluation-php # Conflicts: # libdatadog
Motivation
Customers need PHP services to report server-side feature-flag evaluation counts through the same backend contract as the other SDKs, including both the PHP 7 extension path and the PHP 8 OpenFeature path. This contribution adds native PHP EVP `flagevaluation` delivery through the shared libdatadog sidecar path while preserving existing OTel metric and exposure behavior, giving APM a backend-verifiable rollout signal for approval.
Justification
The 5 MiB limit is unlikely for small applications, but it is reachable at the scale this rollout is designed to support. Using compact JSON estimates against a 5,242,880-byte body limit: a minimal degraded row is about 137 bytes, so about 37,991 rows fit; a small full row is about 277 bytes, so about 18,859 rows fit; a normal full row with 10 context attributes is about 588 bytes, so about 8,901 rows fit; a max bounded-context row with 256 256-character fields is about 68,341 bytes, so only about 76 rows fit.
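The row counts above can be checked with a few lines of arithmetic. The sketch below assumes one extra byte per row for the JSON array separator and ignores any envelope overhead; that assumption is mine rather than something stated in the PR, but it reproduces the four quoted figures. The function names are hypothetical.

```rust
/// 5 MiB body limit, i.e. 5,242,880 bytes.
pub const BODY_LIMIT_BYTES: u64 = 5 * 1024 * 1024;

/// Rows that fit in one body, assuming each row costs its own size plus
/// one separator byte (an assumption that matches the quoted estimates).
pub fn rows_per_body(row_bytes: u64) -> u64 {
    BODY_LIMIT_BYTES / (row_bytes + 1)
}

/// Number of POST bodies needed for `total_rows` rows of a given size,
/// or None when a single row does not fit in a body at all.
pub fn bodies_needed(total_rows: u64, row_bytes: u64) -> Option<u64> {
    let per_body = rows_per_body(row_bytes);
    if per_body == 0 {
        return None;
    }
    // Ceiling division without relying on newer integer helpers.
    Some((total_rows + per_body - 1) / per_body)
}
```

For example, `rows_per_body(277)` gives 18,859, and `bodies_needed(125_000, 277)` gives `Some(7)`, so 125,000 small full rows need at least seven separate POST bodies.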
The target scale is 2,500 flags x 50 full-fidelity buckets, or 125,000 rows. Even small full rows add up to roughly 33 MiB at that scale (125,000 x 277 bytes), so byte splitting is a real compliance requirement rather than only defensive hardening. Existing backpressure bounds the queue and aggregate cardinality, but it does not bound the final encoded POST body. Async posting avoids blocking customer evaluations, but a single oversized request can still be rejected with 413.
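To make the byte-splitting requirement concrete, here is a rough sketch of packing rows into size-bounded bodies, following the degrade-then-drop order of the flowchart under Decisions. It is not the libdatadog sidecar code: the `Row` shape, the naive JSON encoding (no string escaping), and the `plan_bodies` function are all hypothetical, and logging of drops is left out.

```rust
/// Hypothetical row shape; the real wire type has more fields.
#[derive(Clone, Debug, PartialEq)]
pub struct Row {
    pub flag: String,
    pub count: u64,
    pub targeting_key: Option<String>,
    pub context_json: Option<String>,
}

impl Row {
    /// Naive JSON encoding for size estimation only (no string escaping).
    pub fn encode(&self) -> String {
        let mut out = format!("{{\"flag\":\"{}\",\"count\":{}", self.flag, self.count);
        if let Some(tk) = &self.targeting_key {
            out.push_str(&format!(",\"targeting_key\":\"{}\"", tk));
        }
        if let Some(ctx) = &self.context_json {
            out.push_str(&format!(",\"context\":{}", ctx));
        }
        out.push('}');
        out
    }

    pub fn can_degrade(&self) -> bool {
        self.targeting_key.is_some() || self.context_json.is_some()
    }

    /// Degrading omits targeting_key and context, as in the flowchart.
    pub fn degrade(&mut self) {
        self.targeting_key = None;
        self.context_json = None;
    }
}

pub struct Plan {
    pub bodies: Vec<String>,
    pub dropped: u64,
}

/// Greedily packs rows into JSON array bodies of at most `limit` bytes.
pub fn plan_bodies(rows: Vec<Row>, limit: usize) -> Plan {
    let mut bodies = Vec::new();
    let mut current = String::from("[");
    let mut dropped = 0;
    for mut row in rows {
        let mut encoded = row.encode();
        // A row that cannot fit alone (row plus two brackets) is degraded once.
        if encoded.len() + 2 > limit && row.can_degrade() {
            row.degrade();
            encoded = row.encode();
        }
        // Still too large after degrading: drop it and count the loss.
        if encoded.len() + 2 > limit {
            dropped += 1;
            continue;
        }
        // Start a new body when appending (separator, row, closing bracket)
        // would push the current one past the limit.
        let separator = if current.len() > 1 { 1 } else { 0 };
        if current.len() + separator + encoded.len() + 1 > limit {
            current.push(']');
            bodies.push(std::mem::replace(&mut current, String::from("[")));
        }
        if current.len() > 1 {
            current.push(',');
        }
        current.push_str(&encoded);
    }
    if current.len() > 1 {
        current.push(']');
        bodies.push(current);
    }
    Plan { bodies, dropped }
}
```

The sketch checks size before sending rather than reacting to a 413, which keeps every emitted body under the limit by construction. A real implementation would presumably merge degraded rows into matching degraded buckets rather than emitting them as separate rows.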
Changes
- `reason` out of EVP payloads and aggregation keys.
- `libdatadog` to the companion flagevaluation sidecar delivery head.
Decisions
- `reason` is not a hidden aggregate key.
flowchart TD
    A[PHP extension flushes aggregated rows] --> B[send bounded IPC chunks to sidecar]
    B --> C[sidecar coalesces and serializes candidate JSON]
    C --> D{POST body <= 5 MiB?}
    D -- yes --> E[post asynchronously through Agent EVP proxy]
    D -- no --> F{single full row can degrade?}
    F -- yes --> G[omit targeting_key and context]
    G --> C
    F -- no --> H[drop, log, count]
Validation Evidence
Payload Limit Follow-Up
- `37176c9b4` `libdatadog` submodule pointer now references `46734bc1e`, which contains the sidecar EVP byte-splitting/degrade/drop fix.
- `cargo nextest run -p datadog-sidecar ffe_flagevaluation_flusher`, `cargo check -p datadog-sidecar`, `cargo +nightly-2026-02-08 fmt --all -- --check`, and `cargo +stable clippy -p datadog-sidecar --all-targets --all-features -- -D warnings` passed.
Dogfooding App
- `ffe-dogfooding` `app-php7` and `app-php8-openfeature` were run with local PHP artifacts and the companion libdatadog sidecar changes.
- `ffe-dogfooding-string-flag` through PHP 7 with public-safe targeting keys:
  - `php7-evp-agent-20260622T2252-alpha`
  - `php7-evp-agent-20260622T2252-bravo`
  - `php7-evp-agent-20260622T2252-charlie`
- `ffe-dogfooding-string-flag` through PHP 8 OpenFeature with public-safe targeting keys:
  - `php8of-evp-agent-20260622T2255-alpha`
  - `php8of-evp-agent-20260622T2255-bravo`
  - `php8of-evp-agent-20260622T2255-charlie`
- `variant_2`.
System Tests
Staging End-To-End
- `flagevaluation` rows for the exact PHP 7 and PHP 8 OpenFeature targeting keys above.
- `flag.key=ffe-dogfooding-string-flag`, `variant.key=variant_2`, `allocation.key=allocation-override-392dd7c149f8`, and `evaluation_count=1`.