Skip to content

feat(server): add evaluateFlags() API for single-call flag evaluation#498

Merged
dmarticus merged 4 commits intomainfrom
posthog-code/posthog-server-evaluate-flags-api
May 1, 2026
Merged

feat(server): add evaluateFlags() API for single-call flag evaluation#498
dmarticus merged 4 commits intomainfrom
posthog-code/posthog-server-evaluate-flags-api

Conversation

@dmarticus
Copy link
Copy Markdown
Contributor

@dmarticus dmarticus commented Apr 27, 2026

Problem

Phase 1 + Phase 2 of the Server SDK Feature Flag Evaluations RFC for the JVM posthog-server package. Companion to the Node SDK PR (PostHog/posthog-js#3476) and the Python SDK PR (PostHog/posthog-python#539).

Today every flag check on posthog-server fires its own /flags request, and capture(appendFeatureFlags = true) silently fires yet another on every captured event. The flag values on a captured event can diverge from the ones the code actually branched on when person/group properties differ between calls. appendFeatureFlags also attaches every evaluated flag to every event, which bloats properties on high-volume events.

Tracking RFC: requests-for-comments-internal#1020.

Changes

New API (Phase 1)

postHog.evaluateFlags(distinctId, …) returns a PostHogFeatureFlagEvaluations snapshot:

val flags = postHog.evaluateFlags(distinctId, personProperties = mapOf("plan" to "enterprise"))
if (flags.isEnabled("new-dashboard")) {
    renderNewDashboard()
}
postHog.capture(distinctId, "page_viewed", flags = flags)

A single /flags request powers both branching and event enrichment. isEnabled() and getFlag() fire $feature_flag_called events (deduped through the existing per-distinct-id LRU) with the full metadata — $feature_flag_id, $feature_flag_version, $feature_flag_reason, $feature_flag_request_id, $feature_flag_error, plus $feature_flag_definitions_loaded_at for locally-evaluated flags — so experiment exposure tracking keeps working.

Java callers get a PostHogEvaluateFlagsOptions.builder() analogue alongside the Kotlin named-args entry point:

PostHogFeatureFlagEvaluations flags = postHog.evaluateFlags(
    "user-123",
    PostHogEvaluateFlagsOptions.builder()
        .personProperty("plan", "enterprise")
        .flagKeys(List.of("new-dashboard"))
        .build()
);

Two layers of scoping

  • Network-level (flagKeys option): scopes the underlying /flags request itself. Goes into the request body as flag_keys_to_evaluate.

    val flags = postHog.evaluateFlags(distinctId, flagKeys = listOf("new-dashboard", "checkout-flow"))
  • Event-level (filter helpers): narrow which flags get attached to a captured event without re-fetching.

    postHog.capture(distinctId, "page_viewed", flags = flags.onlyAccessed())
    postHog.capture(distinctId, "page_viewed", flags = flags.only(listOf("new-dashboard")))

Both onlyAccessed() and only(...) clone the snapshot with their own accessed set so filtered views don't back-propagate into the parent.

Deprecations (Phase 2)

The legacy single-flag surface keeps working but is now @Deprecated:

  • isFeatureEnabled(...)
  • getFeatureFlag(...)
  • getFeatureFlagPayload(...)
  • getFeatureFlagResult(...)
  • capture(appendFeatureFlags = true) — emits a one-line deprecation log at runtime when truthy (Kotlin can't deprecate a single parameter value at compile time)

Each @Deprecated annotation includes a message pointing at evaluateFlags(...). Phase 3 (removal in next major) ships separately.

New config option

PostHogConfig.featureFlagsLogWarnings (default true) — set to false to silence the warnings emitted by the onlyAccessed() (empty-access fallback) and only(...) (unknown-key drop) filter helpers. Useful for callers who use those helpers conditionally.

Other request-body additions

evaluateFlags(...) also takes disableGeoip = true to forward geoip_disable into the /flags request body, parallel to the option that already exists in posthog-node and posthog-python.

Local evaluation

Transparent. When the poller resolves a flag, the snapshot carries locally_evaluated = true, reason "Evaluated locally", and $feature_flag_definitions_loaded_at is plumbed through, matching what the per-flag local path emits today.

Backwards compatibility

No breaking changes. All existing call paths return the same values they did before. Kotlin callers see a @Deprecated compile-time warning (silenceable with @Suppress("DEPRECATION")); the appendFeatureFlags = true runtime log goes through the standard config logger.

Internals

PostHogStateless.sendFeatureFlagCalled is split into captureFeatureFlagCalledEvent, which is shared between the per-flag accessor path and the new snapshot. Both paths now dedupe identically against the same PostHogFeatureFlagCalledCache LRU. The single-flag path also picks up $feature_flag_id / $feature_flag_version / $feature_flag_reason here — previously only the Android client emitted those, server SDK was missing them.

A small EvaluationsHost interface is what the snapshot calls back into, instead of holding a reference to the full client — keeps the snapshot easy to test in isolation with a fake host.

Response-level errors (errors_while_computing_flags, quota_limited) are propagated into snapshot $feature_flag_called events via EvaluateFlagsResult.responseError, matching the granularity of the per-flag path.

A new getFeatureFlagDetails(...) default method on PostHogFeatureFlagsInterface (returns null by default) lets the server SDK expose the cached FeatureFlag to the per-flag path without changing existing implementations.

Docs: posthog-server/USAGE.md updated with a Kotlin + Java example of the snapshot flow and the flagKeys vs only(...) distinction.

Tests

Two test files cover snapshot semantics and end-to-end flow:

  • PostHogFeatureFlagEvaluationsTest — snapshot accessors, full metadata on captures, filter helpers, onlyAccessed() empty-fallback warning, only(...) unknown-key warning, parent/child filter isolation, blank-distinctId no-op, locally-evaluated tagging, response-error propagation, featureFlagsLogWarnings = false host suppression.
  • PostHogEvaluateFlagsTest — end-to-end via MockWebServer: single /flags round-trip, no events fire before access, dedup across access, getFlagPayload no-event, capture(flags=…) doesn't issue a second request, flag_keys_to_evaluate body forwarding, blank-distinctId short-circuit, local evaluation tagging, quota_limited propagation, appendFeatureFlags = true deprecated path still works end-to-end.

./gradlew :posthog-server:check :posthog:check (tests + lint + apiCheck + animalsniffer) clean.


Created with PostHog Code

@dmarticus dmarticus force-pushed the posthog-code/posthog-server-evaluate-flags-api branch from 91d0f41 to f60ec17 Compare April 27, 2026 20:14
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 27, 2026

posthog-android Compliance Report

Date: 2026-04-30 21:47:36 UTC
Duration: 371ms

⚠️ Some Tests Failed

0/1 tests passed, 1 failed


Feature_Flags Tests

⚠️ 0/1 tests passed, 1 failed

View Details
Test Status Duration
Request Payload.Request With Person Properties Device Id 234ms

Failures

request_payload.request_with_person_properties_device_id

404, message='Not Found', url='http://sdk-adapter:8080/get_feature_flag'

@dmarticus dmarticus marked this pull request as ready for review April 27, 2026 22:12
@dmarticus dmarticus requested a review from a team as a code owner April 27, 2026 22:12
Comment on lines +82 to +89
if (accessedKeys.isEmpty()) {
if (host.warningsEnabled) {
host.logWarning(
"PostHogFeatureFlagEvaluations.onlyAccessed() called before any flag was " +
"accessed; returning all $${flagMap.size} flags instead of an empty snapshot.",
)
}
return cloneWith(flagMap.keys)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be returning an empty array

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 1fabff1onlyAccessed() now returns an empty snapshot when nothing has been accessed (no warning, no fall back to all flags).

Comment on lines +736 to +742
getFeatureFlagsFromLocalEvaluation(
distinctId,
groups,
personProperties,
groupProperties,
onlyEvaluateLocally,
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably out of scope for this PR, but noting that we don't support flagKeys in local evaluation

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also flagKeys and disableGeoip is not considered in the cache key

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 1fabff1 — local evaluation evaluates every defined flag (no way around that without partial flag-definition support), but the result is now filtered by flagKeys post-hoc so callers see the same scoped set regardless of which tier resolved them. Added a comment in the code calling that out.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 1fabff1flagKeys and disableGeoip are now part of FeatureFlagCacheKey, so successive evaluateFlags calls with different scopes don't return stale cached results. Added an integration test (evaluateFlags caches per (distinctId, flagKeys, disableGeoip) tuple) that verifies different flagKeys miss the cache.

flag: FeatureFlag?,
) {
synchronized(accessLock) { accessed.add(key) }
if (flag == null) return
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Python we'd continue to emit a$feature_flag_called for the missing flag (with $feature_flag_error: flag_missing)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 1fabff1recordAccess now still fires for flag == null, builds props with $feature_flag_error: flag_missing, and combines with the response-level error when both apply (e.g. errors_while_computing_flags,flag_missing). Two new tests cover this.

* are an inert read.
*/
public fun getFlagPayload(key: String): Any? {
return flagMap[key]?.metadata?.payload
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FeatureFlagMetadata.payload is typed as a String?, so either the return type is wrong or this isn't returning what we expect it to return.

Perhaps we deserialize and copy what FeatureFlagResult.getPayloadAs<T> does

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, is the payload always guaranteed to be a string now? I thought the API response could also be a deserialized JSON value, but maybe this has changed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 1fabff1 — split into two methods to make the contract clear:

  • getFlagPayload(key): String? returns the raw JSON-encoded payload string (matches FeatureFlagMetadata.payload).
  • getFlagPayloadAs<T>(key) (and the Class<T> overload) Gson-parses the raw string, mirroring FeatureFlagResult.getPayloadAs.

I couldn't delegate to FeatureFlagResult.getPayloadAs directly because that path treats payload as already-deserialized — when you pass the raw JSON string "\"hello\"" it toJsonTrees it as a primitive instead of parsing. The new method always Gson-parses so quoted JSON literals deserialize correctly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FeatureFlagMetadata.payload is typed as String? today, and the /flags v2 deserializer (in posthog/internal) keeps it as a string. So in this PR the snapshot stores it as the raw JSON string and getFlagPayloadAs<T> does the Gson parse. If the wire format is ever broadened to deserialize JSON values upfront we'd revisit, but matched the current behavior here.

Comment on lines +265 to +269
private fun appendFlagPropertiesFromMap(
properties: Map<String, Any>?,
flags: Map<String, FeatureFlag>?,
): Map<String, Any> {
val props = properties?.toMutableMap() ?: mutableMapOf()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: in Python we prefer user defined properties over generated properties. In this method we override user properties

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 1fabff1appendFlagPropertiesFromMap now uses putIfAbsent for both $feature/<key> and $active_feature_flags, so user-supplied values win. Added an integration test (capture preserves user-supplied feature properties over snapshot values).

if (host.warningsEnabled) {
host.logWarning(
"PostHogFeatureFlagEvaluations.onlyAccessed() called before any flag was " +
"accessed; returning all $${flagMap.size} flags instead of an empty snapshot.",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"accessed; returning all $${flagMap.size} flags instead of an empty snapshot.",
"accessed; returning all ${flagMap.size} flags instead of an empty snapshot.",

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moot now — the warning is gone (onlyAccessed() returns an empty snapshot, no warning). Thanks for catching the typo.

Comment on lines +38 to +40
/** Returns the immutable flag map this snapshot was built from. */
public val flags: Map<String, FeatureFlag>
get() = flagMap
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no guarantee that this is immutable when called from Java

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 1fabff1flagMap and locallyEvaluated are now wrapped in Collections.unmodifiableMap at construction. Java callers that try to mutate get an UnsupportedOperationException.

Comment on lines +470 to +476
@PostHogInternal
public fun captureFeatureFlagCalledEvent(
distinctId: String,
key: String,
value: Any?,
properties: Map<String, Any>,
) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be protected instead of public

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 1fabff1 — changed to protected. The inner EvaluationsHost anonymous object in PostHog calls it via this@PostHog.captureFeatureFlagCalledEvent(...), which is a subclass-internal access so the visibility constraint holds.

get() = flagMap.keys.toList()

/** Returns the immutable flag map this snapshot was built from. */
public val flags: Map<String, FeatureFlag>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be public? It leaks com.posthog.internal.FeatureFlag into the public API space

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 1fabff1, the flags getter is now internal. Same-package internal callers (PostHog.appendFlagPropertiesFromMap) still read it; tests use the constructor's parameter directly.

Comment on lines +140 to +146
/**
* Whether the SDK logs warnings emitted by `PostHogFeatureFlagEvaluations` filter helpers
* (`onlyAccessed()` with no recorded access, `only([...])` with unknown keys). Set to false
* to silence those messages.
* Defaults to true.
*/
public var featureFlagsLogWarnings: Boolean = true,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? The onlyAccessed case should be fixed, and it doesn't seem worth adding an additional configuration option just for the latter case

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 1fabff1. With onlyAccessed() now returning an empty snapshot, the only remaining warning is the rare only(...) typo case, which doesn't justify a dedicated config.

Adds a `PostHogFeatureFlagEvaluations` snapshot returned by
`PostHogInterface.evaluateFlags(distinctId)`. The snapshot exposes
`isEnabled`/`getFlag`/`getFlagPayload` plus `onlyAccessed()` and
`only(keys)` filters; accessor calls fire deduped `$feature_flag_called`
events with `$feature_flag_id`/`$feature_flag_version`/`$feature_flag_reason`
metadata, an empty-distinctId snapshot short-circuits all events, and
filtered clones keep their own access set.

`capture()` gains a `flags` parameter that takes a snapshot and attaches
`$feature/<key>` and `$active_feature_flags` without making a second
`/flags` request. The dedup helper on `PostHogStateless` is split into
`captureFeatureFlagCalledEvent` so both the per-flag accessor and the
snapshot share the same per-distinct-id LRU. Locally-evaluated flags
carry a "local_evaluation" reason and the snapshot stamps
`locally_evaluated=true` plus `$feature_flag_definitions_loaded_at` to
match posthog-node and posthog-python.

Adds `featureFlagsLogWarnings` config option to silence filter-helper
warnings, threads `flagKeys` and `geoip_disable` into the `/flags`
request body, and ships JUnit coverage for the snapshot, dedup,
empty-distinctId, local-evaluation, and `capture(flags=)` paths.

The existing `appendFeatureFlags = true` capture path is preserved
unchanged; deprecation of the per-flag accessors is Phase 2.

Generated-By: PostHog Code
Task-Id: 87de4c67-f607-4432-b8ee-3c059e368f81
Phase 1 (already on this branch): `evaluateFlags(distinctId)` snapshot,
`capture(flags = …)`, dedup helper extraction, full metadata on
`$feature_flag_called`.

Phase 2 (new in this commit):

- `@Deprecated` annotations on `isFeatureEnabled`, `getFeatureFlag`,
  `getFeatureFlagPayload`, and `getFeatureFlagResult` (all overloads on
  both `PostHogInterface` and the `PostHog` server class), with messages
  pointing callers at `evaluateFlags(...)`.
- Runtime deprecation log when `capture(appendFeatureFlags = true)` is
  used — mirrors the Python PR's "only-when-truthy" runtime warning,
  since Kotlin can't deprecate a single parameter value at compile time.
- All legacy paths keep working unchanged; deprecations can be silenced
  with `@Suppress("DEPRECATION")`. Removal is targeted at the next major.

Also: response-level errors (`errors_while_computing_flags`,
`quota_limited`) are now propagated into snapshot
`$feature_flag_called` events as `$feature_flag_error`, matching the
granularity the per-flag accessor path emits.

New tests:
- `responseError` propagates to `$feature_flag_called` (unit + integration).
- `appendFeatureFlags = true` deprecated path still attaches feature
  properties end-to-end.

Generated-By: PostHog Code
Task-Id: 87de4c67-f607-4432-b8ee-3c059e368f81
All twelve review comments from @dustinbyrne addressed:

- `flag_missing` now fires `$feature_flag_called` (with
  `$feature_flag_error: flag_missing`) for unknown keys, matching the
  per-flag accessor path. Combines with response-level errors when both
  apply.
- `getFlagPayload(key)` returns the raw `String?` payload (signature
  matches reality); new `getFlagPayloadAs<T>(key)` Gson-parses it,
  matching `FeatureFlagResult.getPayloadAs` semantics. Always parses,
  even when payload is a String, so quoted JSON literals deserialize.
- `onlyAccessed()` returns an empty snapshot when nothing has been
  accessed (no warning, no fall back to all flags). Removed the
  `featureFlagsLogWarnings` config option since the surviving
  `only(...)` warning isn't worth a dedicated config.
- `appendFlagPropertiesFromMap` uses `putIfAbsent` so user-supplied
  `$feature/<key>` and `$active_feature_flags` win over snapshot-derived
  values.
- `FeatureFlagCacheKey` now includes `flagKeys` and `disableGeoip` so
  successive `evaluateFlags` calls with different scopes don't return
  stale cached results.
- Local-evaluation path filters its result map by `flagKeys` post-hoc so
  `evaluateFlags(flagKeys = ...)` returns the same scoped set whether
  the underlying tier is local or remote.
- `flags` getter on `PostHogFeatureFlagEvaluations` is now `internal`
  (no longer leaks `com.posthog.internal.FeatureFlag` into the public
  API). Internal map and `locallyEvaluated` are wrapped in
  `Collections.unmodifiableMap` so Java callers can't mutate them.
- `PostHogStateless.captureFeatureFlagCalledEvent` is now `protected`
  instead of `public @PostHogInternal`.
- Tests updated for all new behavior; added coverage for `flag_missing`,
  combined-error formatting, `getFlagPayloadAs` for object/list/string
  payloads, user-property override on `capture(flags = ...)`, and cache
  key isolation across different `flagKeys` tuples.

Generated-By: PostHog Code
Task-Id: 87de4c67-f607-4432-b8ee-3c059e368f81
@dmarticus dmarticus force-pushed the posthog-code/posthog-server-evaluate-flags-api branch from 1fabff1 to 3443458 Compare April 30, 2026 21:11
The release workflow drives versioning + changelog from `.changeset/*.md`
files; manual entries under `## Next` get bypassed by the bot. Move the
content into a proper changeset (posthog-server: minor — additive API +
deprecations, no breaking removals) and clear the unused manual lines.

Generated-By: PostHog Code
Task-Id: 87de4c67-f607-4432-b8ee-3c059e368f81
@dmarticus dmarticus merged commit cd5fada into main May 1, 2026
13 checks passed
@dmarticus dmarticus deleted the posthog-code/posthog-server-evaluate-flags-api branch May 1, 2026 20:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants