feat: add telemetry audit mode with FileSystemSink #1014

Merged
Hweinstock merged 2 commits into aws:main from Hweinstock:feat/filesystem-sink-audit on Apr 30, 2026

Conversation

@Hweinstock
Contributor

@Hweinstock Hweinstock commented Apr 29, 2026

Description

Adds telemetry audit mode: when enabled, CLI commands write telemetry data to ~/.agentcore/telemetry/ as JSONL files for local inspection.

  • FilesystemSink — metric-agnostic MetricSink that appends { value, attrs } as JSONL. Each record() appends a line to the file.
  • TelemetryClientAccessor — static class managing the client lifecycle. init() creates eagerly, get() creates lazily if needed, shutdown() proxies to the client. Called from main() at startup/exit.
  • Audit activation: enabled by the AGENTCORE_TELEMETRY_AUDIT=1 env var or telemetry.audit: true in config. The agentcore telemetry audit command is intentionally not introduced, since we are not introducing this functionality to customers yet.
  • Instrumented commands: help and help.modes are wrapped with withCommandRun() as a proof of concept. The same or similar logic will follow for the rest of the commands.
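
The accessor lifecycle described in the list above might look roughly like the following. This is a minimal sketch under assumptions, not the code in this PR: the TelemetryClient interface, the createClient factory, and the "unknown" fallback used for lazy creation are hypothetical names introduced only for illustration.

```typescript
// Hypothetical client shape; the real TelemetryClient in this PR may differ.
interface TelemetryClient {
  entrypoint: string;
  shutdown(): Promise<void>;
}

// Hypothetical factory standing in for whatever constructs the real client.
function createClient(entrypoint: string): TelemetryClient {
  return {
    entrypoint,
    async shutdown(): Promise<void> {
      // A real client would flush its sinks here.
    },
  };
}

class TelemetryClientAccessor {
  private static client: TelemetryClient | undefined;

  // Eager creation, intended to be called once from main() at startup.
  static init(entrypoint: string): TelemetryClient {
    this.client = createClient(entrypoint);
    return this.client;
  }

  // Lazy creation: builds a client with a default entrypoint if init() never ran.
  static get(): TelemetryClient {
    if (!this.client) {
      this.client = createClient("unknown");
    }
    return this.client;
  }

  // Proxies to the client, clearing the reference first so a later get() starts fresh.
  static async shutdown(): Promise<void> {
    const client = this.client;
    this.client = undefined;
    await client?.shutdown();
  }
}
```

Keeping the client behind static methods lets command handlers call get() without threading a client instance through every action handler.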

Related Issue

Closes #

Documentation PR

N/A — hidden flag, no user-facing docs yet.

Type of Change

  • New feature

Testing

In addition to tests added, also tested e2e with packaged tarball.

  • I ran npm run test:unit and npm run test:integ
  • I ran npm run typecheck
  • I ran npm run lint
  • If I modified src/assets/, I ran npm run test:update-snapshots and committed the updated snapshots

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.

@github-actions github-actions Bot added the size/m (PR size: M) and agentcore-harness-reviewing (AgentCore Harness review in progress) labels Apr 29, 2026
@agentcore-cli-automation

Reviewed the sink implementation, public export change, and tests.

The implementation matches the existing MetricSink pattern (mirrors OtelMetricSink), buffering and flush semantics are reasonable for short-lived CLI commands, and error propagation works correctly with CompositeSink's Promise.allSettled. The command_group values used as filename components come from a closed enum via deriveCommandGroup, so path-traversal isn't a real concern with current wiring. Tests cover the meaningful edge cases (entrypoint stability, missing command_group fallback, empty-buffer no-op, shutdown logging behavior, directory auto-creation, repeat flushes).
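
To illustrate the Promise.allSettled point: a composite sink can fan shutdown out to every child sink so that one failing sink (for example, a file write error) does not stop the others from flushing. The sketch below assumes a simplified MetricSink interface and a hypothetical failures field; it is not the repository's CompositeSink.

```typescript
// Simplified, assumed interface; the repository's MetricSink may have more members.
interface MetricSink {
  record(value: number, attrs: Record<string, unknown>): void;
  shutdown(): Promise<void>;
}

class CompositeSink implements MetricSink {
  // Hypothetical field: collects rejection reasons so a caller could log them.
  readonly failures: unknown[] = [];

  constructor(private readonly sinks: MetricSink[]) {}

  record(value: number, attrs: Record<string, unknown>): void {
    for (const sink of this.sinks) {
      try {
        sink.record(value, attrs);
      } catch {
        // A throwing sink must not prevent the remaining sinks from recording.
      }
    }
  }

  async shutdown(): Promise<void> {
    // allSettled waits for every sink, whether it resolves or rejects.
    const results = await Promise.allSettled(this.sinks.map((sink) => sink.shutdown()));
    for (const result of results) {
      if (result.status === "rejected") {
        this.failures.push(result.reason);
      }
    }
  }
}
```

With Promise.all, the first rejection would settle the combined promise immediately and hide the outcome of the other sinks; allSettled reports each sink separately.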

LGTM — no blocking issues from me. 🚀

@github-actions github-actions Bot removed the agentcore-harness-reviewing (AgentCore Harness review in progress) label Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from a450075 to ee14f54 Compare April 29, 2026 14:16
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from ee14f54 to df58164 Compare April 29, 2026 14:18
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from df58164 to 4a59361 Compare April 29, 2026 14:21
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from 4a59361 to 2416e93 Compare April 29, 2026 14:26
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from 2416e93 to 3817b33 Compare April 29, 2026 14:32
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from 3817b33 to be67dce Compare April 29, 2026 14:34
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from be67dce to 85582a1 Compare April 29, 2026 14:35
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from 6d61a36 to 5f4f27c Compare April 29, 2026 14:47
@github-actions github-actions Bot removed the size/m PR size: M label Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from 0a05c8f to c24b9cf Compare April 29, 2026 16:31
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from c24b9cf to fa54cb7 Compare April 29, 2026 16:32
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from fa54cb7 to fa2076f Compare April 29, 2026 16:36
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from fa2076f to 56a429b Compare April 29, 2026 16:42
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock Hweinstock force-pushed the feat/filesystem-sink-audit branch from 56a429b to 9db18ad Compare April 29, 2026 16:46
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 29, 2026
@Hweinstock
Contributor Author

/strands review

@Hweinstock Hweinstock changed the title from "feat: add telemetry audit mode with FilesystemSink" to "feat: add telemetry audit mode with FileSystemSink" Apr 29, 2026
@Hweinstock Hweinstock marked this pull request as ready for review April 29, 2026 17:06
@Hweinstock Hweinstock requested a review from a team April 29, 2026 17:06
@github-actions
Contributor

Review Summary

I've reviewed the telemetry audit mode implementation. The code quality is solid and follows existing patterns well. No blocking issues found, but I have a few suggestions and questions for clarification.

✅ What's Working Well

  • Consistent architecture: FileSystemSink mirrors the existing OtelMetricSink pattern perfectly
  • Proper error handling: Telemetry failures never affect CLI behavior (try-catch in all the right places)
  • Good test coverage: Both unit tests for FileSystemSink and integration tests for the full flow
  • Security: Path traversal is not a concern since command_group comes from a closed enum via deriveCommandGroup
  • Async safety: The pendingWrite promise chain properly serializes writes while keeping record() synchronous
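
The pendingWrite pattern named in the last bullet can be sketched as follows. This is an illustrative reconstruction, not the PR's FileSystemSink: the class name JsonlAuditSink, its constructor arguments, and the decision to swallow write errors are assumptions. Only the idea of a synchronous record() that chains each append onto a single promise is taken from the review.

```typescript
import { appendFile, mkdir } from "node:fs/promises";
import { dirname } from "node:path";

// Hypothetical stand-in for the PR's sink, written only to show the promise chain.
class JsonlAuditSink {
  private pendingWrite: Promise<void> = Promise.resolve();
  private hasRecords = false;

  constructor(
    private readonly filePath: string,
    private readonly log: (message: string) => void = console.log,
  ) {}

  // Synchronous for callers: each append is queued behind the previous one,
  // so lines reach the file in the order record() was called.
  record(value: number, attrs: Record<string, unknown>): void {
    this.hasRecords = true;
    const line = JSON.stringify({ value, attrs }) + "\n";
    this.pendingWrite = this.pendingWrite
      .then(() => mkdir(dirname(this.filePath), { recursive: true }))
      .then(() => appendFile(this.filePath, line, "utf8"))
      .catch(() => {
        // Assumption of this sketch: write errors are swallowed so the chain
        // keeps going. The PR's sink may surface them at shutdown instead.
      });
  }

  // Waits for every queued append before reporting where the data went.
  async shutdown(): Promise<void> {
    await this.pendingWrite;
    if (this.hasRecords) {
      this.log(`[audit mode] Telemetry written to ${this.filePath}`);
    }
  }
}
```

Because every write is appended to the tail of one promise, two quick record() calls cannot interleave their file operations even though neither caller awaits anything.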

💡 Suggestions

1. TelemetryClientAccessor.init() timing (src/cli/cli.ts:226)

Observation: init() is called after the TUI path returns (lines 218-219), so TUI mode won't have telemetry initialized.

Context: This is actually fine for the current scope since:

  • TUI uses a separate rendering flow (doesn't call program.parseAsync)
  • Only CLI commands are instrumented with withCommandRun in action handlers
  • PR description states "CLI commands" as the scope

Suggestion: Consider documenting this limitation or adding a TODO comment for future TUI telemetry if planned.

2. FileSystemSink race condition in shutdown (src/cli/telemetry/sinks/filesystem-sink.ts:34-39)

Observation: shutdown() awaits pendingWrite, but new record() calls after that point could add more writes. The hasRecords flag is checked after the await, creating a race window.

Impact: Minimal in practice since commands finish before shutdown, but the interface technically allows it.

Suggestion: Consider setting hasRecords state before awaiting to make the behavior more deterministic:

async shutdown(): Promise<void> {
  const hadRecords = this.hasRecords;
  await this.pendingWrite;
  if (hadRecords) {
    this.log(`[audit mode] Telemetry written to ${this.filePath}`);
  }
}

❓ Questions

1. Entrypoint derivation from args[0] (src/cli/cli.ts:226)

Current behavior:

TelemetryClientAccessor.init(args[0] ?? 'unknown');

This uses the command name (e.g., "help") as the entrypoint, which becomes the filename prefix: help-{sessionId}.json.

Question: For subcommands like help modes, args[0] is still "help", not "help.modes". This means:

  • Multiple subcommands in the same session write to the same file (intended?)
  • The filename doesn't reflect the actual command executed

Is this intentional? Should it use:

  • A fixed prefix like "cli" for all commands?
  • The full command path including subcommands?
  • The command group (current approach)?
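
For reference, the naming scheme under discussion could be sketched like this. The helper name auditFilePath and its parameters are hypothetical (the PR's own helper is resolveAuditFilePath, whose signature is not shown in this thread). Only the {entrypoint}-{sessionId}.json shape and the ~/.agentcore/telemetry/ directory come from this conversation.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: builds "<baseDir>/<entrypoint>-<sessionId>.json".
function auditFilePath(
  entrypoint: string | undefined,
  sessionId: string,
  baseDir: string = join(homedir(), ".agentcore", "telemetry"),
): string {
  const prefix = entrypoint ?? "unknown";
  return join(baseDir, `${prefix}-${sessionId}.json`);
}

// args[0] is the top-level command, so "help" and "help modes" share one prefix.
const args: string[] = ["help", "modes"];
const examplePath = auditFilePath(args[0], "1234", "telemetry");
```

Under this scheme the prefix identifies the top-level entrypoint only; the specific subcommand is carried in each record's attrs rather than in the filename.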

2. Config file setting: telemetry.audit (src/cli/global-config.ts:21)

Observation: The global config schema includes telemetry.audit: boolean, but there's no CLI command to set it (no agentcore telemetry audit command).

Questions:

  • How do users set this? Manual JSON editing of ~/.agentcore/config.json?
  • The PR description says the command is "intentionally not introduced" — is this temporary?
  • Should we add a comment in the code explaining this is for internal use only?
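
As a rough illustration of the two activation paths, the check might be written as below. The GlobalConfig shape, the isAuditEnabled name, and the rule that either source alone enables audit mode are assumptions; the thread only names the AGENTCORE_TELEMETRY_AUDIT=1 env var and the telemetry.audit config key.

```typescript
// Hypothetical config shape; only the telemetry.audit key is taken from the PR.
interface GlobalConfig {
  telemetry?: { audit?: boolean };
}

// Assumed rule: audit mode is on if either the env var or the config flag is set.
function isAuditEnabled(
  env: Record<string, string | undefined>,
  config: GlobalConfig,
): boolean {
  const fromEnv = env["AGENTCORE_TELEMETRY_AUDIT"] === "1";
  const fromConfig = config.telemetry?.audit === true;
  return fromEnv || fromConfig;
}

const viaEnv = isAuditEnabled({ AGENTCORE_TELEMETRY_AUDIT: "1" }, {});
const viaConfig = isAuditEnabled({}, { telemetry: { audit: true } });
const disabled = isAuditEnabled({}, {});
```

A manually edited config would then contain an object of the form { "telemetry": { "audit": true } }, assuming the file is plain JSON as the question above suggests.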

3. File naming strategy (related to #1)

Current behavior: Multiple commands in a session (e.g., running both help and help modes in sequence) would write to the same file since both have command_group: "help".

Inside the file, they're distinguished:

{"value": 42, "attrs": {"command": "help", ...}}
{"value": 55, "attrs": {"command": "help.modes", ...}}

Question: Is this the intended UX for audit mode? Or should each command invocation get its own file?
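
If a shared per-session file is kept, separating commands when inspecting it is straightforward. The sketch below parses JSONL text and counts records per command; the AuditRecord type and both function names are hypothetical, and the sample lines are simplified versions of the two records shown above.

```typescript
// Hypothetical record type matching the { value, attrs } shape from the PR description.
interface AuditRecord {
  value: number;
  attrs: Record<string, unknown>;
}

// Parses JSONL text: one JSON object per non-empty line.
function parseAuditLines(contents: string): AuditRecord[] {
  return contents
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as AuditRecord);
}

// Counts records per attrs.command, using "unknown" when the attribute is absent.
function countByCommand(records: AuditRecord[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const record of records) {
    const raw = record.attrs["command"];
    const command = typeof raw === "string" ? raw : "unknown";
    counts.set(command, (counts.get(command) ?? 0) + 1);
  }
  return counts;
}

const sample =
  '{"value": 42, "attrs": {"command": "help"}}\n' +
  '{"value": 55, "attrs": {"command": "help.modes"}}\n';
const commandCounts = countByCommand(parseAuditLines(sample));
```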


🔍 Nits

  1. Import ordering (integ-tests/help.test.ts:1-7)
    Imports aren't alphabetically ordered (spawnAndCollect before runCLI). Not blocking, just inconsistent with typical conventions.

  2. Typo in PR description
    Description says "FilesystemSink" but code uses "FileSystemSink" (two words). Only affects the PR description.

  3. Magic string filter (integ-tests/help.test.ts:70)

    .filter(f => f.startsWith('help-'))

    This hardcodes the filename prefix. More robust: match any .json file or use the exact pattern from resolveAuditFilePath.
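
One way to act on nit 3 without hardcoding 'help-' is to match the whole filename shape. The sketch assumes filenames follow {entrypoint}-{sessionId}.json with a session ID made of letters, digits, and hyphens. The actual format produced by resolveAuditFilePath is not shown in this thread, so treat the pattern and the isAuditFile name as placeholders.

```typescript
// Assumed shape: lowercase entrypoint, a hyphen, a session ID, then ".json".
const AUDIT_FILE_PATTERN = /^[a-z][a-z0-9.]*-[A-Za-z0-9-]+\.json$/;

// Hypothetical predicate for use with Array.prototype.filter in a test.
function isAuditFile(name: string): boolean {
  return AUDIT_FILE_PATTERN.test(name);
}

const directoryListing = ["help-3f2a9c.json", "config.json", "notes.txt"];
const auditFiles = directoryListing.filter(isAuditFile);
```

A test that derives the expected name from the same helper the sink uses would be stricter still, at the cost of coupling the test to that helper.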


Overall Assessment

Status: ✅ Approve with suggestions

The implementation is solid and production-ready. The suggestions above are for polish and clarification, not correctness issues. The questions are mainly to ensure the design choices align with the intended use case (internal audit mode for development/debugging).

Great work on maintaining consistency with the existing codebase! 🚀

@Hweinstock
Contributor Author

  1. this is intentional. The prefix is meant to communicate the high-level entrypoint.
  2. intentional, we're not ready to expose this to customers.
  3. intentional, each file is session bound, so we want multiple commands within a session to be in the same file.

@Hweinstock Hweinstock merged commit 397c187 into aws:main Apr 30, 2026
24 checks passed

Labels

size/m PR size: M

3 participants