Skip to content

feat(gateway): add channel-repo bindings store#672

Merged
marcusrbrown merged 1 commit into
mainfrom
feat/gateway-bindings-store
May 24, 2026
Merged

feat(gateway): add channel-repo bindings store#672
marcusrbrown merged 1 commit into
mainfrom
feat/gateway-bindings-store

Conversation

@marcusrbrown
Copy link
Copy Markdown
Collaborator

Adds the storage layer for channel-repo bindings — the first step in the Gateway v1 Unit 5 work that ends with a /add-project slash command.

Storage uses a two-key shape so the "one bound channel per repo" invariant is enforced atomically by S3:

  • Primary record at {prefix}/{identity}/{owner}/{repo}/bindings/repo.json — written with IfNoneMatch: '*'. Two concurrent setup flows for the same repo can't both win.
  • Channel-id lookup index at {prefix}/{identity}/_/_/bindings/by-channel/{channel_id}.json containing {owner, repo} — enables getBindingByChannelId(channelId) as a single GET + single GET. Used by the future @fro-bot mention dispatch.

If the secondary index write fails after the primary succeeded, the store rolls back the primary via conditionalDelete(primaryKey, {ifMatch: primaryEtag}). If the rollback itself fails, the caller gets PartialWriteError naming both keys for manual cleanup.

API

  • createBinding(binding) — atomic create. Returns both etags or BindingExistsError | StoreError | PartialWriteError.
  • getBindingByRepo(owner, repo) — single GET on the primary key.
  • getBindingByChannelId(channelId) — GET on the index, then GET on the primary. Orphan indexes return null cleanly.
  • listBindings() — LIST under {identity}/ enumerating all primary records.

Bindings parsed with runtime type guards (hasValidRepoBindingShape, hasValidChannelIndexShape) — no as Binding casts. Same posture as hasValidLockRecordShape from the coordination/lock module.

Test coverage: 22 new tests covering happy paths, race losers, orphan indexes, rollback success + failure, malformed JSON, and type-guard rejections.

Spec doc: this PR also lands the full Unit 5 implementation plan (docs/plans/2026-05-23-001-feat-gateway-unit-5-add-project-plan.md) so the remaining three PRs in this series have a single anchor doc. The plan went through coherence + feasibility + security + adversarial + scope-guardian + product-lens review before being committed.

Out of scope here — no Discord, no GitHub App, no slash command, no workspace HTTP service. Each lands in a separate follow-up PR.

@marcusrbrown marcusrbrown requested a review from fro-bot as a code owner May 23, 2026 23:34
@fro-bot fro-bot added the agent: working Agent is currently working on this label May 23, 2026
Copy link
Copy Markdown
Owner

@fro-bot fro-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verdict: CONDITIONAL

Can merge after addressing the two blocking issues below.

Blocking issues

1. buildChannelIndexKey is unsanitized and path-injection-vulnerable (store.ts:61-66)

function buildChannelIndexKey(storeConfig: ObjectStoreConfig, identity: string, channelId: string): string {
  return `${storeConfig.prefix}/${identity}/_/_/bindings/by-channel/${channelId}.json`
}

channelId is passed directly into the S3 key with no sanitization. Discord channel IDs are normally numeric snowflakes, but the store accepts any string. A caller passing `channelId = '../../../other-binding'` would construct a traversal path like `fro-bot-state/gateway///bindings/by-channel/../../../other-binding.json`. While S3 normalises paths server-side, the resulting key would point outside the channel-index namespace, silently overwriting or reading an unintended key.

Fix: run channelId through sanitizeKeyComponent (which blocks .., null bytes, and /) before interpolating, and return a Result<string, Error> so the caller can propagate the error:

function buildChannelIndexKey(
  storeConfig: ObjectStoreConfig,
  identity: string,
  channelId: string,
): Result<string, Error> {
  const sanitized = sanitizeKeyComponent(channelId)
  if (sanitized.success === false) return err(sanitized.error)
  return ok(`${storeConfig.prefix}/${identity}/_/_/bindings/by-channel/${sanitized.data}.json`)
}

All three call sites (createBinding, getBindingByChannelId) would need to unwrap the result and propagate StoreError.


2. listBindings silently skips malformed records instead of surfacing them (store.ts:229-244)

When getObject returns a key-not-found for a key returned by list, the store skips it (continue). But when parseRepoBinding fails (malformed JSON / shape guard rejection), the function returns err(parsed.error) immediately — aborting the entire listing with a ValidationError. These two behaviors are inconsistent with each other and surprising to callers.

More critically: a single corrupted record makes listBindings completely unusable. For an admin listing all bindings to diagnose a problem, this means they can't see any bindings until the bad record is manually deleted.

Either: (a) skip-and-log corrupted records (return them as a separate invalid array) or (b) document this as intentional and add a test that explicitly asserts the abort-on-first-validation-error behavior so future authors don't accidentally "fix" it to skip-and-continue.

Non-blocking concerns

Index write race: IfNoneMatch: '*' on the channel index may be too strict (store.ts:128)

If the same channel was previously bound to a different repo, that binding's deleteBinding (future PR) would delete the primary record but leave the index entry. A subsequent createBinding for the same channel would then fail the IfNoneMatch check on the index write, trigger a rollback of the perfectly-valid primary, and return a StoreError to the user — even though there was no actual conflict.

The PR description acknowledges deleteBinding is out of scope, but the choice of IfNoneMatch: '*' for the index will couple the index write semantics to the delete behavior in the follow-up PR. Worth flagging now so the delete implementation knows it must clean up the index entry, or this store will need a breaking change to use unconditional put for the index.

storeConfig.prefix and identity are also unsanitized in buildChannelIndexKey

The prefix is validated upstream at configuration time and identity originates from internal config, so these are low risk in practice. But for consistency with buildPrimaryKey (which routes through buildObjectStoreKey and therefore sanitizeKeyComponent), it would be cleaner to reuse the same validation path.

listBindings does not check adapter.list for presence (store.ts:212-218)

listBindings checks adapter.getObject == null but calls adapter.list without a null check. list is non-optional on ObjectStoreAdapter, so this can't actually fail at runtime — but the pattern is inconsistent with the other methods. Minor.

Missing tests

  • No test for listBindings when a record is corrupted mid-list (validates or documents the abort-on-first-error behavior from blocking issue 2).
  • No test for createBinding when the channel index key construction fails (needed once blocking issue 1 is fixed and buildChannelIndexKey returns a Result).
  • No test for listBindings when adapter.list itself returns an error.

Risk assessment: MED

The path-injection issue (blocking #1) is a security concern but is partially mitigated in practice by Discord's numeric snowflake IDs. The listBindings abort behavior (blocking #2) is a reliability concern that will surface in production when any stored record drifts from the current schema (e.g., after a future field addition). Both are fixable before merge.


Run Summary
Field Value
Event pull_request
Repository fro-bot/agent
Run ID 26346494370
Cache hit
Session ses_1a8cd533bffeZTtQa0IRoqBrJG

@fro-bot fro-bot removed the agent: working Agent is currently working on this label May 23, 2026
Two-key S3 shape: primary repo record enforces one-binding-per-repo via IfNoneMatch; channel-id index enables fast reverse lookup. Rollback handling for partial writes. 22 new tests. Ships the Unit 5 implementation plan alongside the storage layer.
@marcusrbrown marcusrbrown force-pushed the feat/gateway-bindings-store branch from 10d203b to 27aba91 Compare May 23, 2026 23:43
@fro-bot fro-bot added the agent: working Agent is currently working on this label May 23, 2026
Copy link
Copy Markdown
Owner

@fro-bot fro-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verdict: PASS

Both blocking issues from the previous review are resolved. Changes look correct.

Blocking issue 1 — fixed: buildChannelIndexKey now sanitizes identity and channelId through sanitizeKeyComponent and returns Result<string, Error>. All call sites propagate the error as StoreError. The function also validates identity for consistency with buildPrimaryKey.

Blocking issue 2 — fixed: listBindings now skips corrupted records with continue instead of aborting the listing. The comment at store.ts:265 documents the intent clearly.

All three missing tests added:

  • T1 (listBindings skips corrupted records) — covers the new skip-and-continue behavior.
  • T2 (createBinding returns StoreError when channelId is invalid) — exercises the traversal-attempt guard; asserts no S3 write is attempted.
  • T3 (listBindings returns StoreError when adapter.list fails) — fills the gap in list-error propagation coverage.

Blocking issues

None.

Non-blocking concerns

IfNoneMatch: '*' on channel index write remains (store.ts:147) — still flagged from last review. If deleteBinding (future PR) removes only the primary record and leaves the index, a subsequent createBinding for the same channel will fail the index IfNoneMatch check and roll back a valid primary write. This is a design coupling constraint, not a bug in this PR. The follow-up PR that implements deleteBinding must either (a) also delete the index entry, or (b) this store must switch to unconditional put for the index. Recommend adding a comment on line 147 noting this dependency so it's visible to the implementer.

Missing tests

None.

Risk assessment: LOW

The two security/reliability issues that elevated risk to MED are resolved. Remaining concern is a latent design coupling with the not-yet-written deleteBinding — low risk for this PR's blast radius.


Run Summary
Field Value
Event pull_request
Repository fro-bot/agent
Run ID 26346652102
Cache hit
Session ses_1a8cd533bffeZTtQa0IRoqBrJG

@fro-bot fro-bot removed the agent: working Agent is currently working on this label May 23, 2026
@marcusrbrown marcusrbrown merged commit 029d5f4 into main May 24, 2026
10 checks passed
@marcusrbrown marcusrbrown deleted the feat/gateway-bindings-store branch May 24, 2026 01:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants