Skip to content

feat(templates): add conversation history persistence to HTTP agent templates#794

Open
aidandaly24 wants to merge 2 commits into
aws:mainfrom
aidandaly24:fix/template-conversation-history
Open

feat(templates): add conversation history persistence to HTTP agent templates#794
aidandaly24 wants to merge 2 commits into
aws:mainfrom
aidandaly24:fix/template-conversation-history

Conversation

@aidandaly24
Copy link
Copy Markdown
Contributor

@aidandaly24 aidandaly24 commented Apr 8, 2026

Description

All four HTTP agent templates (Strands, OpenAI Agents, Google ADK, LangGraph) were stateless per-invocation — each call started with zero conversation context, so multi-turn recall within a session didn't work. Strands additionally had a session-bleed risk because all invocations sharing a Python process used a single global _agent whose Agent.messages accumulated across calls.

Note on Strands and Runtime session isolation

AgentCore Runtime isolates each session_id to its own dedicated microVM (terminated and memory-sanitized at session end), so the Strands cross-session bleed cannot manifest in a deployed AgentCore Runtime agent. The bleed is real, however, in:

Keying the agent by session_id makes the template correct independent of the runtime's isolation guarantee, costs the same memory profile as the previous global, and aligns Strands with the per-session pattern the other three templates use.

Per-template fixes

  • Strands: replace global _agent with @lru_cache(maxsize=128) get_or_create_agent(session_id). Strands Agent accumulates messages internally across stream_async() calls, so reusing the per-session instance gives natural multi-turn memory and isolates sessions. Bound at 128 entries to cap process memory; evicted sessions silently start fresh on their next invoke (documented in a code comment, suggesting durable session stores for production). Applied to both the no-memory/no-config-bundle branch and the no-memory/config-bundle branch — hasMemory already keys per (session_id, user_id). The ConfigBundleHook callbacks re-read bundle state at invocation time, so caching the agent is safe and consistent with how hasMemory already attaches the hook.
  • OpenAI Agents: @lru_cache(maxsize=128) get_session(session_id) returning SQLiteSession, threaded via session= on Runner.run(). Auto-loads/saves prior history in-process. Same 128-session cap and documentation comment as Strands.
  • Google ADK: module-level _session_service + _runner with a get_or_create_session() helper that calls get_session() then create_session() if missing — avoids AlreadyExistsError on repeat calls. ADK's InMemorySessionService is itself a single instance retained across invocations, so no per-session cache cap is needed.
  • LangGraph: module-level _checkpointer = InMemorySaver() passed to create_react_agent(). thread_id mapped to context.session_id via configurable on graph.ainvoke(). InMemorySaver retains all checkpoints in process memory.

All conversation state is in-memory (best-effort) — persists across invocations within the runtime process and resets on cold starts. This is the correct default for starter templates; users can swap to durable backends (Strands FileSessionManager, OpenAI SQLiteSession with file path, ADK DatabaseSessionService, LangGraph persistent checkpointers) when they need durability.

Note: #810 (missing system prompts on OpenAI Agents and LangGraph) was already addressed by other PRs that landed on main after this PR was opened — both templates now ship INSTRUCTIONS / DEFAULT_SYSTEM_PROMPT constants. This PR's scope is therefore #808 + #809.

Related Issue

Closes #808
Closes #809

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update

Testing

  • npm run typecheck
  • npm run lint
  • npm run test:unit (275 test files passing)
  • npm run test:update-snapshots

End-to-end deploy + invoke testing (from earlier iteration of this branch):

  • Strands (Bedrock): Deployed, invoked 3x — recall ✅, session isolation ✅
  • OpenAI Agents (OpenAI): Deployed, invoked 3x — recall ✅, session isolation ✅
  • LangGraph (Bedrock): Deployed, invoked 3x — recall ✅, session isolation ✅
  • Google ADK (Gemini): Deployed, code structurally verified — Gemini API quota prevented live invocation testing

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@aidandaly24 aidandaly24 requested a review from a team April 8, 2026 20:59
@github-actions github-actions Bot added the size/s PR size: S label Apr 8, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 8, 2026

Package Tarball

aws-agentcore-0.14.1.tgz

How to install

gh release download pr-794-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.14.1.tgz

Copy link
Copy Markdown
Contributor

@notgitika notgitika left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 comments but LGTM otherwise

Comment thread src/assets/python/http/strands/base/main.py Outdated
Comment thread src/assets/python/http/openaiagents/base/main.py
@github-actions github-actions Bot added size/m PR size: M and removed size/s PR size: S labels Apr 13, 2026
@aidandaly24 aidandaly24 force-pushed the fix/template-conversation-history branch from 1e4f23a to ea2f0cc Compare April 13, 2026 16:22
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 13, 2026
@aidandaly24 aidandaly24 closed this May 7, 2026
@aidandaly24 aidandaly24 reopened this May 20, 2026
@github-actions github-actions Bot added size/m PR size: M agentcore-harness-reviewing AgentCore Harness review in progress and removed size/m PR size: M labels May 20, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label May 20, 2026
@agentcore-devx-automation
Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label May 20, 2026
@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label May 20, 2026
…emplates

Closes aws#808, aws#809.

All four HTTP agent templates were stateless per-invocation. Each call started
with zero conversation context, so multi-turn recall within a session did not
work. Strands additionally leaked history across sessions because all
invocations shared a single global Agent instance.

Each template now uses its framework's idiomatic per-session mechanism:

- Strands: replace global _agent with @lru_cache(128)
  get_or_create_agent(session_id). The Agent accumulates messages internally
  across stream_async() calls, so reusing the per-session instance gives
  multi-turn memory and isolates sessions. Bound at 128 entries to cap process
  memory. Only applied to the no-memory, no-config-bundle branch — hasMemory
  already keys per (session_id, user_id) and hasConfigBundle intentionally
  recreates the agent each invoke for hooks.
- OpenAI Agents: @lru_cache(128) get_session(session_id) returning
  SQLiteSession, threaded via session= on Runner.run(). Auto-loads/saves prior
  history in-process.
- Google ADK: module-level _session_service + _runner with a
  get_or_create_session() helper that calls get_session() then create_session()
  if missing. Avoids AlreadyExistsError on repeat calls.
- LangGraph: module-level _checkpointer = InMemorySaver() passed to
  create_react_agent(). thread_id mapped to context.session_id via configurable
  on graph.ainvoke.

All state is in-memory and best-effort: persists across invocations within the
runtime process, resets on cold starts. Users who need durability can swap to
each framework's persistent backend.

End-to-end deploy + invoke testing covered in the original PR description.
@aidandaly24 aidandaly24 force-pushed the fix/template-conversation-history branch from ea2f0cc to 5ad1381 Compare May 20, 2026 03:09
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 20, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label May 20, 2026
@agentcore-devx-automation
Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label May 20, 2026
@aidandaly24 aidandaly24 requested a review from notgitika May 20, 2026 03:24
@aidandaly24
Copy link
Copy Markdown
Contributor Author

@notgitika Pushed a fresh rebase onto main with both of your review comments addressed:

  • Strands: @lru_cache(maxsize=128) on get_or_create_agent (caps unbounded growth)
  • OpenAI Agents: instructions=INSTRUCTIONS on all four Agent() constructors (now using the existing INSTRUCTIONS constant rather than inlining the string)

Also dropped #810 from scope since both LangGraph and OpenAI templates already have system prompts on main from other PRs that landed in the meantime. Scope is now #808 + #809.

All checks green; ready for re-review when you have a chance.

tools=tools
)
return _agent
@lru_cache(maxsize=128)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens for turn 129? they would lose the whole history?

if so, can we add some logging so the user is aware?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the 129th session arrives, the session #1's agent will be evicted and its message history lost. The next call to session #1, will get a new Agent with no prior memory without any error/warning. lru_cache doesn't support eviction callbacks, so this cannot be logged. Maybe add a comment in the template, "Caches up to 128 active sessions; LRU eviction silently resets history for the oldest session. For production use, replace with a durable session store."

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep — turn 129 evicts session #1, and the next call to session #1 starts fresh with no warning. functools.lru_cache doesn't expose an eviction callback so we can't log on it directly. Took @padmak30's suggestion below and added a documentation comment above the cache in both Strands and OpenAI templates explaining the cap and pointing to durable session stores (Strands FileSessionManager, SQLiteSession with file path) for production use.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used your suggested wording (lightly adapted). Added the comment above the @lru_cache in both Strands and OpenAI Agents templates pointing users to FileSessionManager / SQLiteSession with a file path for production use. Thanks for the clean phrasing.

Comment on lines 168 to 175
{{#if hasConfigBundle}}
def create_agent():
return Agent(
model=load_model(),
system_prompt=DEFAULT_SYSTEM_PROMPT,
tools=tools,
hooks=[ConfigBundleHook()],
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config bundle path should also get conversation persistence across sessions

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in the latest commit. The hasConfigBundle branch now uses the same @lru_cache(128) get_or_create_agent(session_id) pattern as the non-bundle branch, with hooks=[ConfigBundleHook()] attached at construction. Verified safe to cache: ConfigBundleHook reads bundle state inside its callbacks (_inject_system_prompt, _override_tool_desc) at invocation time, not at hook registration, so reusing the same Agent across invocations still picks up bundle changes. The hasMemory branch already attaches the hook the same way to a cached agent, so this matches existing precedent in the template.

tools=tools
)
return _agent
@lru_cache(maxsize=128)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the 129th session arrives, the session #1's agent will be evicted and its message history lost. The next call to session #1, will get a new Agent with no prior memory without any error/warning. lru_cache doesn't support eviction callbacks, so this cannot be logged. Maybe add a comment in the template, "Caches up to 128 active sessions; LRU eviction silently resets history for the oldest session. For production use, replace with a durable session store."

… document LRU cap

Addresses follow-up review comments on aws#794:

- Strands hasConfigBundle branch now caches Agent per session_id via the same
  @lru_cache(128) get_or_create_agent() pattern as the non-bundle branch, with
  hooks=[ConfigBundleHook()] attached at construction. ConfigBundleHook reads
  bundle state inside its callbacks (not at registration), so caching the
  agent is safe and consistent with how the hasMemory branch already attaches
  the hook. Drops the create_agent() per-invoke pattern, which had no
  conversation persistence.

- Document the LRU cache cap in both Strands and OpenAI Agents templates:
  callers of an evicted session start fresh on the next invoke. Comment also
  points to durable session stores for production use.

Snapshot regenerated.
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 21, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label May 21, 2026
content = types.Content(role="user", parts=[types.Part(text=query)])
session, runner = await setup_session_and_runner(user_id, session_id)
runner = get_or_create_runner()
session = await get_or_create_session(user_id, session_id)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cross-user conversation access via client-supplied user_id.

user_id here flows from the entrypoint, where it is read from payload (caller-controlled): user_id = payload.get("user_id", "default_user"). Before this change that was harmless — every invocation built a fresh InMemorySessionService — but the new module-level _session_service now persists history keyed by (app_name, user_id, session_id) across invocations.

That means any caller can now set user_id to another user's identifier and, with a known/guessable session_id, attach to that user's stored conversation: reading prior messages on the next turn and appending new ones. The Strands template avoids this by sourcing user_id from context (getattr(context, 'user_id', 'default-user')); this template should do the same — read user_id from context, not payload, before it is used to scope persistent session state.

session_service = InMemorySessionService()
session = await session_service.create_session(
# Module-level session service and runner (preserves history across invocations)
_session_service = InMemorySessionService()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unbounded in-memory session growth (DoS / OOM in long-lived containers).

InMemorySessionService is a plain in-memory store with no built-in eviction. With this PR it is now process-lifetime scoped, and every distinct (user_id, session_id) invocation creates a new entry that is never freed. An attacker (or a noisy client) can rotate session_ids and grow the process's heap until it OOMs — the runtime container typically has a fixed memory budget, so this is a denial-of-service vector.

The sibling Strands and OpenAIAgents templates in this same PR explicitly cap their session caches with @lru_cache(maxsize=128) and call out LRU eviction in a comment. The googleadk template (and the langchain template, with InMemorySaver()) should apply the same mitigation, or call out durable-store guidance just as prominently. As written, these two templates are noticeably less safe to ship into production than the other two.

tools = [add_numbers]

# Module-level checkpointer preserves conversation history across invocations
_checkpointer = InMemorySaver()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unbounded in-memory checkpoint growth (DoS / OOM in long-lived containers).

InMemorySaver is a process-lifetime, unbounded dict-style store: every distinct thread_id (which this PR derives from context.session_id) creates checkpoint entries that are never evicted. In a long-running runtime container, a client that rotates session_id (or any caller producing many sessions over time) can grow the heap until the process OOMs — a denial-of-service vector against the runtime.

The sibling Strands and OpenAIAgents templates in this same PR explicitly cap their session caches with @lru_cache(maxsize=128) and document the trade-off in a comment. This template should either bound _checkpointer similarly (or wrap session→checkpointer mapping in an LRU), or include the same prominent guidance pointing users at a durable backend (e.g. SqliteSaver/PostgresSaver) before deploying. As written, this template is noticeably less safe than its peers in this PR.

@agentcore-devx-automation
Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label May 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/m PR size: M

Projects

None yet

3 participants