[integration] Agent workflows (big-agents)#4791

Open
mmabrouk wants to merge 76 commits into main from big-agents
@mmabrouk mmabrouk commented Jun 22, 2026

Member

Context

big-agents is the integration branch for the agent-workflows feature. Every agent PR targets big-agents (directly, or by stacking on one that does). The plan is to review and merge each sub-PR into big-agents, then merge big-agents into main as a single unit.

This PR is a draft tracker. It stays open until all the open sub-PRs below are merged into big-agents. The branch started from an empty commit, so the diff fills in as sub-PRs land.

Integrated PRs

Each box gets checked when that PR is merged into big-agents. Indented items stack on the item above them.

SDK and service

Runner

Frontend

Hosting

Sandbox-agent deployment

The three deployment PRs were originally opened against chore/sandbox-agent-core as #4787 / #4788 / #4789. After #4786 merged, they were re-pointed at big-agents, which closed the old numbers and reopened them as #4802 / #4803 / #4804.

Docs

Branch-only (no PR yet)

These design-doc branches are stacked on big-agents but have no PR. Open one if you want them reviewed separately; otherwise they fold in with the docs.

  • docs/agent-model-config-and-provider-auth
  • docs/agent-skills-config
  • docs/agent-code-tool-sandbox
  • docs/agent-harness-capabilities

Notes

mmabrouk added 17 commits June 19, 2026 18:27
…r image

Python `code` tools failed with `spawn python3 ENOENT` because neither runner image
installed python3 (code.ts spawns python3). Add it to both. Also rebuild the Pi extension
bundle from the mounted src on dev container start: the dev image bakes the bundle and only
mounts src, so an edited extension went stale and silently stopped registering custom tools
on the Rivet path. Adds a regression test for the extension tool-registration contract.
Found via the agent-workflows QA matrix (findings F-005, F-006).

Claude-Session: https://claude.ai/code/session_01KsGSJQwsUdgWcNSEt2P2qD
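
The `spawn python3 ENOENT` failure described above comes from launching an interpreter that is not installed. The fix in this commit is to install python3 in both runner images. As a separate illustration only, here is a minimal Python sketch of a preflight check that turns a missing interpreter into a clear error before spawning. This is not the runner's `code.ts` (which is TypeScript); `run_python_tool` and `MissingInterpreterError` are hypothetical names introduced for the example.

```python
import shutil
import subprocess
import sys


class MissingInterpreterError(RuntimeError):
    """Raised when the interpreter a code tool needs is not on PATH."""


def run_python_tool(source: str, interpreter: str = "python3") -> str:
    """Run `source` with `interpreter`, failing early if it is not installed."""
    # shutil.which returns None when the executable cannot be found,
    # which is the situation that otherwise surfaces as ENOENT at spawn time.
    resolved = shutil.which(interpreter)
    if resolved is None:
        raise MissingInterpreterError(
            f"{interpreter!r} not found on PATH; install it in the runner image"
        )
    completed = subprocess.run(
        [resolved, "-c", source],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


if __name__ == "__main__":
    # Use the current interpreter so the demo does not depend on a `python3` binary.
    print(run_python_tool("print(1 + 1)", interpreter=sys.executable))
```

A check like this does not replace installing the interpreter; it only makes the failure easier to diagnose than a bare spawn error.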
Adds docs/design/agent-workflows/qa/: the autohealing QA recipe (README), the Gherkin
scenario matrix with a live scoreboard, the findings log (F-001..F-010 in the open-issues
style), a reusable /invoke driver with captured runs, and the regression-test research plus
the replay-test skill draft. Produced by a live end-to-end QA pass across the harness x
environment x capability matrix; it documents and motivates the runner fixes in the sibling
PRs (#4776, #4778).

Claude-Session: https://claude.ai/code/session_01KsGSJQwsUdgWcNSEt2P2qD
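
To make the "harness x environment x capability matrix" concrete, the sketch below enumerates a cross product of the three axes and builds one run name per cell. The labels are borrowed from the captured run file names in this PR (for example `E1__smoke_chat_pi.json`); the `ENV__capability_harness` naming rule, the `matrix_cells` helper, and the assumption that every combination is exercised are illustrative only. The captured runs cover a subset of these cells, and some use other names.

```python
from itertools import product

# Axis labels taken from the captured run file names; treating them as a
# complete, uniform matrix is an assumption made for this sketch.
ENVIRONMENTS = ["E1", "E2", "E3"]
CAPABILITIES = ["smoke_chat", "code_tool", "builtin_bash"]
HARNESSES = ["pi", "agenta"]


def matrix_cells(environments, capabilities, harnesses):
    """Return one run name per (environment, capability, harness) combination."""
    return [
        f"{env}__{cap}_{harness}"
        for env, cap, harness in product(environments, capabilities, harnesses)
    ]


if __name__ == "__main__":
    for name in matrix_cells(ENVIRONMENTS, CAPABILITIES, HARNESSES):
        print(name)
```

Comparing such a generated list against the files actually present under `runs/` is one way to spot cells that have not been exercised yet.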
vercel Bot commented Jun 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agenta-documentation | Ready | Preview, Comment | Jun 24, 2026 7:30pm |

coderabbitai Bot commented Jun 22, 2026

Important

Review skipped

Too many files!

This PR contains 529 files, which is 379 over the limit of 150.

To get a review, narrow the scope:
• coderabbit review --type committed # exclude uncommitted changes
• coderabbit review --dir # limit to a subdirectory
• coderabbit review --base # compare against a closer base

Upgrade to a paid plan to raise the limit.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: fcb8d1e2-f14d-486d-a065-d15e451af075

📥 Commits

Reviewing files that changed from the base of the PR and between 2eed5d0 and 9cbcbfd.

⛔ Files ignored due to path filters (3)
  • docs/design/agent-workflows/archive/wp-1-pi-tracing/poc/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
  • services/agent/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
  • web/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (529)
  • .github/workflows/12-check-unit-tests.yml
  • .github/workflows/42-railway-build.yml
  • .github/workflows/43-railway-deploy.yml
  • .gitignore
  • api/entrypoints/routers.py
  • api/oss/src/apis/fastapi/tools/models.py
  • api/oss/src/apis/fastapi/tools/router.py
  • api/oss/src/apis/fastapi/vault/router.py
  • api/oss/src/apis/fastapi/workflows/exceptions.py
  • api/oss/src/apis/fastapi/workflows/router.py
  • api/oss/src/core/tools/dtos.py
  • api/oss/src/core/tools/exceptions.py
  • api/oss/src/core/tools/providers/composio/adapter.py
  • api/oss/src/core/tools/providers/composio/catalog.py
  • api/oss/src/core/tools/service.py
  • api/oss/src/core/workflows/dtos.py
  • api/oss/src/core/workflows/interfaces.py
  • api/oss/src/core/workflows/platform_catalog.py
  • api/oss/src/core/workflows/service.py
  • api/oss/src/core/workflows/types.py
  • api/oss/tests/pytest/unit/tools/__init__.py
  • api/oss/tests/pytest/unit/tools/test_agent_resolution.py
  • api/oss/tests/pytest/unit/tools/test_no_auth_connection.py
  • api/oss/tests/pytest/unit/workflows/test_flag_ownership.py
  • api/oss/tests/pytest/unit/workflows/test_platform_catalog.py
  • docs/design/agent-workflows/README.md
  • docs/design/agent-workflows/archive/README.md
  • docs/design/agent-workflows/archive/harness-port-redesign/README.md
  • docs/design/agent-workflows/archive/harness-port-redesign/implementation.md
  • docs/design/agent-workflows/archive/harness-port-redesign/plan.md
  • docs/design/agent-workflows/archive/harness-port-redesign/proposal.md
  • docs/design/agent-workflows/archive/harness-port-redesign/research.md
  • docs/design/agent-workflows/archive/harness-port-redesign/status.md
  • docs/design/agent-workflows/archive/old-rfcs/agent-protocol-rfc.md
  • docs/design/agent-workflows/archive/old-rfcs/streaming-and-sessions.md
  • docs/design/agent-workflows/archive/research/auth-secrets.md
  • docs/design/agent-workflows/archive/research/daytona-sandbox.md
  • docs/design/agent-workflows/archive/research/diskless-in-memory-config.md
  • docs/design/agent-workflows/archive/research/open-questions.md
  • docs/design/agent-workflows/archive/research/otel-instrumentation.md
  • docs/design/agent-workflows/archive/research/pi-interaction.md
  • docs/design/agent-workflows/archive/research/sandbox-sharing.md
  • docs/design/agent-workflows/archive/sdk-local-backend/status.md
  • docs/design/agent-workflows/archive/wp-1-pi-tracing/README.md
  • docs/design/agent-workflows/archive/wp-1-pi-tracing/integrating-the-tracing-extension.md
  • docs/design/agent-workflows/archive/wp-1-pi-tracing/poc/.env.example
  • docs/design/agent-workflows/archive/wp-1-pi-tracing/poc/README.md
  • docs/design/agent-workflows/archive/wp-1-pi-tracing/poc/agenta-otel.ts
  • docs/design/agent-workflows/archive/wp-1-pi-tracing/poc/package.json
  • docs/design/agent-workflows/archive/wp-1-pi-tracing/poc/run.ts
  • docs/design/agent-workflows/archive/wp-1-pi-tracing/tracing-in-the-agent-service.md
  • docs/design/agent-workflows/archive/wp-2-agent-service/README.md
  • docs/design/agent-workflows/archive/wp-2-agent-service/implementation-plan.md
  • docs/design/agent-workflows/archive/wp-2-agent-service/qa.md
  • docs/design/agent-workflows/archive/wp-3-daytona-sandbox/README.md
  • docs/design/agent-workflows/archive/wp-3-daytona-sandbox/poc/README.md
  • docs/design/agent-workflows/archive/wp-3-daytona-sandbox/poc/bench_coldstart.py
  • docs/design/agent-workflows/archive/wp-3-daytona-sandbox/poc/build_snapshot.py
  • docs/design/agent-workflows/archive/wp-3-daytona-sandbox/poc/cleanup.py
  • docs/design/agent-workflows/archive/wp-3-daytona-sandbox/poc/run_agent.py
  • docs/design/agent-workflows/archive/wp-4-multi-message-output/README.md
  • docs/design/agent-workflows/archive/wp-5-chat-vs-completion/README.md
  • docs/design/agent-workflows/archive/wp-6-workflow-type-and-template/README.md
  • docs/design/agent-workflows/archive/wp-7-tools/README.md
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/README.md
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/architecture.md
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/context.md
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/isolation-and-fork.md
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/plan.md
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/poc/commit_agent_config.py
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/poc/debug-events.ts
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/poc/dump-full.ts
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/poc/package.json
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/poc/spike.ts
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/research.md
  • docs/design/agent-workflows/archive/wp-8-rivet-acp-runtime/status.md
  • docs/design/agent-workflows/documentation/adapters/agenta.md
  • docs/design/agent-workflows/documentation/adapters/claude-code.md
  • docs/design/agent-workflows/documentation/adapters/pi.md
  • docs/design/agent-workflows/documentation/agent-configuration.md
  • docs/design/agent-workflows/documentation/agent-template.md
  • docs/design/agent-workflows/documentation/architecture.md
  • docs/design/agent-workflows/documentation/ground-truth.md
  • docs/design/agent-workflows/documentation/ports-and-adapters.md
  • docs/design/agent-workflows/documentation/protocol.md
  • docs/design/agent-workflows/documentation/running-the-agent.md
  • docs/design/agent-workflows/documentation/sessions.md
  • docs/design/agent-workflows/documentation/skills.md
  • docs/design/agent-workflows/documentation/tools.md
  • docs/design/agent-workflows/documentation/triggers.md
  • docs/design/agent-workflows/projects/capability-config/README.md
  • docs/design/agent-workflows/projects/capability-config/context.md
  • docs/design/agent-workflows/projects/capability-config/plan.md
  • docs/design/agent-workflows/projects/capability-config/proposal.md
  • docs/design/agent-workflows/projects/capability-config/research.md
  • docs/design/agent-workflows/projects/capability-config/status.md
  • docs/design/agent-workflows/projects/model-config/proposal.md
  • docs/design/agent-workflows/projects/model-config/research.md
  • docs/design/agent-workflows/projects/provider-model-auth/README.md
  • docs/design/agent-workflows/projects/provider-model-auth/build-notes.md
  • docs/design/agent-workflows/projects/provider-model-auth/context.md
  • docs/design/agent-workflows/projects/provider-model-auth/design.md
  • docs/design/agent-workflows/projects/provider-model-auth/explainer.md
  • docs/design/agent-workflows/projects/provider-model-auth/harness-provider-matrix.md
  • docs/design/agent-workflows/projects/provider-model-auth/plan.md
  • docs/design/agent-workflows/projects/provider-model-auth/research.md
  • docs/design/agent-workflows/projects/provider-model-auth/status.md
  • docs/design/agent-workflows/projects/qa/README.md
  • docs/design/agent-workflows/projects/qa/cleanup-plan.md
  • docs/design/agent-workflows/projects/qa/findings.md
  • docs/design/agent-workflows/projects/qa/implementation-plan.md
  • docs/design/agent-workflows/projects/qa/matrix.md
  • docs/design/agent-workflows/projects/qa/regression-skill-DRAFT.md
  • docs/design/agent-workflows/projects/qa/regression-testing-research.md
  • docs/design/agent-workflows/projects/qa/runs/E1__append_system_pi.json
  • docs/design/agent-workflows/projects/qa/runs/E1__builtin_bash_agenta.json
  • docs/design/agent-workflows/projects/qa/runs/E1__builtin_bash_pi.json
  • docs/design/agent-workflows/projects/qa/runs/E1__code_tool_agenta.json
  • docs/design/agent-workflows/projects/qa/runs/E1__code_tool_pi.json
  • docs/design/agent-workflows/projects/qa/runs/E1__smoke_chat_agenta.json
  • docs/design/agent-workflows/projects/qa/runs/E1__smoke_chat_pi.json
  • docs/design/agent-workflows/projects/qa/runs/E2__append_system_pi.json
  • docs/design/agent-workflows/projects/qa/runs/E2__builtin_bash_agenta.json
  • docs/design/agent-workflows/projects/qa/runs/E2__builtin_bash_pi.json
  • docs/design/agent-workflows/projects/qa/runs/E2__claude_code_tool.json
  • docs/design/agent-workflows/projects/qa/runs/E2__claude_smoke.json
  • docs/design/agent-workflows/projects/qa/runs/E2__code_tool_agenta.json
  • docs/design/agent-workflows/projects/qa/runs/E2__code_tool_pi.json
  • docs/design/agent-workflows/projects/qa/runs/E2__mcp_claude.json
  • docs/design/agent-workflows/projects/qa/runs/E2__smoke_chat_agenta.json
  • docs/design/agent-workflows/projects/qa/runs/E2__smoke_chat_pi.json
  • docs/design/agent-workflows/projects/qa/runs/E3__builtin_bash_pi.json
  • docs/design/agent-workflows/projects/qa/runs/E3__code_tool_agenta.json
  • docs/design/agent-workflows/projects/qa/runs/E3__code_tool_pi.json
  • docs/design/agent-workflows/projects/qa/runs/E3__smoke_chat_pi.json
  • docs/design/agent-workflows/projects/qa/scripts/mcp_qa_server.mjs
  • docs/design/agent-workflows/projects/qa/scripts/run_matrix.py
  • docs/design/agent-workflows/projects/research/opencode-architecture.md
  • docs/design/agent-workflows/projects/runner-interface/README.md
  • docs/design/agent-workflows/projects/sandbox-agent-refactor/sandbox-agent-refactor-plan.md
  • docs/design/agent-workflows/projects/sdk-local-tools/README.md
  • docs/design/agent-workflows/projects/sdk-local-tools/codebase-conventions.md
  • docs/design/agent-workflows/projects/sdk-local-tools/context.md
  • docs/design/agent-workflows/projects/sdk-local-tools/conventions-review.md
  • docs/design/agent-workflows/projects/sdk-local-tools/organization-proposal.md
  • docs/design/agent-workflows/projects/sdk-local-tools/plan.md
  • docs/design/agent-workflows/projects/sdk-local-tools/research.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/evidence/app-mcp-reassign.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/evidence/attach-orthogonal-mutation.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/evidence/description-default-inconsistency.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/evidence/gateway-no-logging.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/evidence/gateway-orthogonal-untested.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/evidence/handler-resolution-error.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/findings.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/metadata.json
  • docs/design/agent-workflows/projects/sdk-local-tools/review/plan.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/progress.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/questions.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/risks.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/scope.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/scorecard.md
  • docs/design/agent-workflows/projects/sdk-local-tools/review/summary.md
  • docs/design/agent-workflows/projects/sdk-local-tools/status.md
  • docs/design/agent-workflows/projects/sidecar-deployment-proposal/README.md
  • docs/design/agent-workflows/projects/sidecar-deployment-proposal/proposal.md
  • docs/design/agent-workflows/projects/sidecar-deployment-proposal/status.md
  • docs/design/agent-workflows/projects/skills-config/architecture.md
  • docs/design/agent-workflows/projects/skills-config/build-notes.md
  • docs/design/agent-workflows/projects/tool-resolution-layering/plan.md
  • docs/design/agent-workflows/projects/typescript-structure/README.md
  • docs/design/agent-workflows/projects/typescript-structure/context.md
  • docs/design/agent-workflows/projects/typescript-structure/plan.md
  • docs/design/agent-workflows/projects/typescript-structure/research.md
  • docs/design/agent-workflows/projects/typescript-structure/status.md
  • docs/design/agent-workflows/scratch/agent-coordination.md
  • docs/design/agent-workflows/scratch/branch-cleanup-report.md
  • docs/design/agent-workflows/scratch/branch-pr-cleanup-report.md
  • docs/design/agent-workflows/scratch/branch-pr-cleanup-status.md
  • docs/design/agent-workflows/scratch/capability-architecture.md
  • docs/design/agent-workflows/scratch/capability-map.md
  • docs/design/agent-workflows/scratch/dead-code-report.md
  • docs/design/agent-workflows/scratch/feature-matrix-test.md
  • docs/design/agent-workflows/scratch/flows-and-capabilities.md
  • docs/design/agent-workflows/scratch/implementation-review.md
  • docs/design/agent-workflows/scratch/meeting-alignment.md
  • docs/design/agent-workflows/scratch/notes-architecture.md
  • docs/design/agent-workflows/scratch/notes-config-runsh.md
  • docs/design/agent-workflows/scratch/notes-model-auth.md
  • docs/design/agent-workflows/scratch/notes-tools-mcp-capabilities.md
  • docs/design/agent-workflows/scratch/open-issues.md
  • docs/design/agent-workflows/scratch/pr-stack.md
  • docs/design/agent-workflows/scratch/status.md
  • docs/design/agent-workflows/trash/.gitkeep
  • docs/design/vault-named-secrets/README.md
  • docs/design/vault-named-secrets/context.md
  • docs/design/vault-named-secrets/plan.md
  • docs/design/vault-named-secrets/research.md
  • docs/design/vault-named-secrets/status.md
  • docs/docs/self-host/02-configuration.mdx
  • docs/docs/self-host/guides/04-deploy-on-railway.mdx
  • docs/docs/self-host/guides/07-deploy-the-agent-runner.mdx
  • docs/docs/self-host/guides/08-custom-agent-runner-images.mdx
  • docs/docs/self-host/guides/09-agent-daytona-sandboxes.mdx
  • docs/docs/self-host/infrastructure/01-architecture.mdx
  • examples/python/RAG_QA_chatbot/backend/agent_loop.py
  • examples/python/RAG_QA_chatbot/backend/contract_stream.py
  • examples/python/RAG_QA_chatbot/backend/main.py
  • examples/python/RAG_QA_chatbot/backend/rag.py
  • examples/python/RAG_QA_chatbot/env.example
  • examples/python/RAG_QA_chatbot/ingest/fix_urls.py
  • examples/python/RAG_QA_chatbot/ingest/loaders.py
  • examples/python/RAG_QA_chatbot/ingest/store.py
  • examples/python/RAG_QA_chatbot/run-agent-chat-slice.sh
  • hosting/docker-compose/ee/docker-compose.dev.yml
  • hosting/docker-compose/ee/docker-compose.gh.local.yml
  • hosting/docker-compose/ee/docker-compose.gh.yml
  • hosting/docker-compose/ee/env.ee.dev.example
  • hosting/docker-compose/ee/env.ee.gh.example
  • hosting/docker-compose/oss/docker-compose.dev.yml
  • hosting/docker-compose/oss/docker-compose.gh.local.yml
  • hosting/docker-compose/oss/docker-compose.gh.ssl.yml
  • hosting/docker-compose/oss/docker-compose.gh.yml
  • hosting/docker-compose/oss/env.oss.dev.example
  • hosting/docker-compose/oss/env.oss.gh.example
  • hosting/kubernetes/ee/values.ee.example.yaml
  • hosting/kubernetes/helm/templates/NOTES.txt
  • hosting/kubernetes/helm/templates/_helpers.tpl
  • hosting/kubernetes/helm/templates/sandbox-agent-deployment.yaml
  • hosting/kubernetes/helm/templates/sandbox-agent-service.yaml
  • hosting/kubernetes/helm/templates/secrets.yaml
  • hosting/kubernetes/helm/templates/services-deployment.yaml
  • hosting/kubernetes/helm/values.schema.json
  • hosting/kubernetes/helm/values.yaml
  • hosting/kubernetes/oss/values.oss.example.yaml
  • hosting/railway/oss/README.md
  • hosting/railway/oss/sandbox-agent/Dockerfile
  • hosting/railway/oss/scripts/bootstrap.sh
  • hosting/railway/oss/scripts/build-and-push-images.sh
  • hosting/railway/oss/scripts/configure.sh
  • hosting/railway/oss/scripts/deploy-from-images.sh
  • hosting/railway/oss/scripts/deploy-services.sh
  • hosting/railway/oss/scripts/preview-resolve-env.sh
  • sdks/python/agenta/__init__.py
  • sdks/python/agenta/sdk/agents/__init__.py
  • sdks/python/agenta/sdk/agents/adapters/__init__.py
  • sdks/python/agenta/sdk/agents/adapters/_runner_config.py
  • sdks/python/agenta/sdk/agents/adapters/agenta_builtins.py
  • sdks/python/agenta/sdk/agents/adapters/claude_settings.py
  • sdks/python/agenta/sdk/agents/adapters/harnesses.py
  • sdks/python/agenta/sdk/agents/adapters/local.py
  • sdks/python/agenta/sdk/agents/adapters/sandbox_agent.py
  • sdks/python/agenta/sdk/agents/adapters/vercel/__init__.py
  • sdks/python/agenta/sdk/agents/adapters/vercel/messages.py
  • sdks/python/agenta/sdk/agents/adapters/vercel/routing.py
  • sdks/python/agenta/sdk/agents/adapters/vercel/sse.py
  • sdks/python/agenta/sdk/agents/adapters/vercel/stream.py
  • sdks/python/agenta/sdk/agents/capabilities.py
  • sdks/python/agenta/sdk/agents/connections/__init__.py
  • sdks/python/agenta/sdk/agents/connections/errors.py
  • sdks/python/agenta/sdk/agents/connections/interfaces.py
  • sdks/python/agenta/sdk/agents/connections/models.py
  • sdks/python/agenta/sdk/agents/connections/resolver.py
  • sdks/python/agenta/sdk/agents/dtos.py
  • sdks/python/agenta/sdk/agents/errors.py
  • sdks/python/agenta/sdk/agents/interfaces.py
  • sdks/python/agenta/sdk/agents/mcp/__init__.py
  • sdks/python/agenta/sdk/agents/mcp/errors.py
  • sdks/python/agenta/sdk/agents/mcp/interfaces.py
  • sdks/python/agenta/sdk/agents/mcp/models.py
  • sdks/python/agenta/sdk/agents/mcp/parsing.py
  • sdks/python/agenta/sdk/agents/mcp/resolver.py
  • sdks/python/agenta/sdk/agents/mcp/wire.py
  • sdks/python/agenta/sdk/agents/platform/__init__.py
  • sdks/python/agenta/sdk/agents/platform/connection.py
  • sdks/python/agenta/sdk/agents/platform/connections.py
  • sdks/python/agenta/sdk/agents/platform/gateway.py
  • sdks/python/agenta/sdk/agents/platform/resolve.py
  • sdks/python/agenta/sdk/agents/platform/secrets.py
  • sdks/python/agenta/sdk/agents/skills/__init__.py
  • sdks/python/agenta/sdk/agents/skills/errors.py
  • sdks/python/agenta/sdk/agents/skills/models.py
  • sdks/python/agenta/sdk/agents/skills/parsing.py
  • sdks/python/agenta/sdk/agents/skills/wire.py
  • sdks/python/agenta/sdk/agents/streaming.py
  • sdks/python/agenta/sdk/agents/tools/__init__.py
  • sdks/python/agenta/sdk/agents/tools/compat.py
  • sdks/python/agenta/sdk/agents/tools/errors.py
  • sdks/python/agenta/sdk/agents/tools/interfaces.py
  • sdks/python/agenta/sdk/agents/tools/models.py
  • sdks/python/agenta/sdk/agents/tools/parsing.py
  • sdks/python/agenta/sdk/agents/tools/resolver.py
  • sdks/python/agenta/sdk/agents/utils/__init__.py
  • sdks/python/agenta/sdk/agents/utils/ts_runner.py
  • sdks/python/agenta/sdk/agents/utils/wire.py
  • sdks/python/agenta/sdk/decorators/routing.py
  • sdks/python/agenta/sdk/engines/running/interfaces.py
  • sdks/python/agenta/sdk/engines/running/registry.py
  • sdks/python/agenta/sdk/engines/running/utils.py
  • sdks/python/agenta/sdk/middlewares/running/normalizer.py
  • sdks/python/agenta/sdk/middlewares/running/resolver.py
  • sdks/python/agenta/sdk/models/workflows.py
  • sdks/python/agenta/sdk/utils/types.py
  • sdks/python/agenta/tests/agents/test_streaming.py
  • sdks/python/oss/tests/pytest/acceptance/workflows/test_new_uri_handlers.py
  • sdks/python/oss/tests/pytest/integration/agents/__init__.py
  • sdks/python/oss/tests/pytest/integration/agents/_in_process_backend.py
  • sdks/python/oss/tests/pytest/integration/agents/test_transport_roundtrip.py
  • sdks/python/oss/tests/pytest/unit/agents/__init__.py
  • sdks/python/oss/tests/pytest/unit/agents/adapters/__init__.py
  • sdks/python/oss/tests/pytest/unit/agents/adapters/test_claude_settings.py
  • sdks/python/oss/tests/pytest/unit/agents/conftest.py
  • sdks/python/oss/tests/pytest/unit/agents/connections/__init__.py
  • sdks/python/oss/tests/pytest/unit/agents/connections/test_capabilities.py
  • sdks/python/oss/tests/pytest/unit/agents/connections/test_dtos_model_ref.py
  • sdks/python/oss/tests/pytest/unit/agents/connections/test_models.py
  • sdks/python/oss/tests/pytest/unit/agents/connections/test_resolver.py
  • sdks/python/oss/tests/pytest/unit/agents/golden/run_request.claude.json
  • sdks/python/oss/tests/pytest/unit/agents/golden/run_request.pi.json
  • sdks/python/oss/tests/pytest/unit/agents/golden/run_result.error.json
  • sdks/python/oss/tests/pytest/unit/agents/golden/run_result.ok.json
  • sdks/python/oss/tests/pytest/unit/agents/mcp/__init__.py
  • sdks/python/oss/tests/pytest/unit/agents/mcp/test_resolver.py
  • sdks/python/oss/tests/pytest/unit/agents/platform/__init__.py
  • sdks/python/oss/tests/pytest/unit/agents/platform/conftest.py
  • sdks/python/oss/tests/pytest/unit/agents/platform/test_connection.py
  • sdks/python/oss/tests/pytest/unit/agents/platform/test_connections_http.py
  • sdks/python/oss/tests/pytest/unit/agents/platform/test_gateway_http.py
  • sdks/python/oss/tests/pytest/unit/agents/platform/test_resolve.py
  • sdks/python/oss/tests/pytest/unit/agents/platform/test_secrets_http.py
  • sdks/python/oss/tests/pytest/unit/agents/skills/__init__.py
  • sdks/python/oss/tests/pytest/unit/agents/skills/test_models.py
  • sdks/python/oss/tests/pytest/unit/agents/skills/test_parsing.py
  • sdks/python/oss/tests/pytest/unit/agents/skills/test_skills_e2e.py
  • sdks/python/oss/tests/pytest/unit/agents/skills/test_wire.py
  • sdks/python/oss/tests/pytest/unit/agents/test_dtos_agent_config.py
  • sdks/python/oss/tests/pytest/unit/agents/test_dtos_capabilities_events.py
  • sdks/python/oss/tests/pytest/unit/agents/test_dtos_content_blocks.py
  • sdks/python/oss/tests/pytest/unit/agents/test_dtos_harness_configs.py
  • sdks/python/oss/tests/pytest/unit/agents/test_environment_lifecycle.py
  • sdks/python/oss/tests/pytest/unit/agents/test_harness_adapters.py
  • sdks/python/oss/tests/pytest/unit/agents/test_runner_adapter_config.py
  • sdks/python/oss/tests/pytest/unit/agents/test_ui_messages.py
  • sdks/python/oss/tests/pytest/unit/agents/test_wire_contract.py
  • sdks/python/oss/tests/pytest/unit/agents/tools/__init__.py
  • sdks/python/oss/tests/pytest/unit/agents/tools/test_models.py
  • sdks/python/oss/tests/pytest/unit/agents/tools/test_parsing.py
  • sdks/python/oss/tests/pytest/unit/agents/tools/test_resolver.py
  • sdks/python/oss/tests/pytest/unit/test_normalizer_passthrough.py
  • sdks/python/oss/tests/pytest/unit/test_skill_config_catalog.py
  • sdks/python/oss/tests/pytest/unit/test_skill_flags.py
  • sdks/python/oss/tests/pytest/utils/test_messages_endpoint.py
  • sdks/python/oss/tests/pytest/utils/test_resolver_middleware.py
  • sdks/python/oss/tests/pytest/utils/test_routing.py
  • services/agent/.dockerignore
  • services/agent/AGENTS.md
  • services/agent/CLAUDE.md
  • services/agent/README.md
  • services/agent/config/AGENTS.md
  • services/agent/config/agent.json
  • services/agent/docker/Dockerfile
  • services/agent/docker/Dockerfile.dev
  • services/agent/docker/README.md
  • services/agent/package.json
  • services/agent/sandbox-images/daytona/README.md
  • services/agent/sandbox-images/daytona/build_snapshot.py
  • services/agent/scripts/build-extension.mjs
  • services/agent/skills/agenta-getting-started/SKILL.md
  • services/agent/src/cli.ts
  • services/agent/src/engines/pi.ts
  • services/agent/src/engines/sandbox_agent.ts
  • services/agent/src/engines/sandbox_agent/capabilities.ts
  • services/agent/src/engines/sandbox_agent/daemon.ts
  • services/agent/src/engines/sandbox_agent/daytona.ts
  • services/agent/src/engines/sandbox_agent/errors.ts
  • services/agent/src/engines/sandbox_agent/mcp.ts
  • services/agent/src/engines/sandbox_agent/model.ts
  • services/agent/src/engines/sandbox_agent/permissions.ts
  • services/agent/src/engines/sandbox_agent/pi-assets.ts
  • services/agent/src/engines/sandbox_agent/provider.ts
  • services/agent/src/engines/sandbox_agent/run-plan.ts
  • services/agent/src/engines/sandbox_agent/transcript.ts
  • services/agent/src/engines/sandbox_agent/usage.ts
  • services/agent/src/engines/sandbox_agent/workspace.ts
  • services/agent/src/engines/skills.ts
  • services/agent/src/entry.ts
  • services/agent/src/extensions/agenta.ts
  • services/agent/src/protocol.ts
  • services/agent/src/responder.ts
  • services/agent/src/server.ts
  • services/agent/src/tools/callback.ts
  • services/agent/src/tools/code.ts
  • services/agent/src/tools/dispatch.ts
  • services/agent/src/tools/mcp-bridge.ts
  • services/agent/src/tools/mcp-server.ts
  • services/agent/src/tools/public-spec.ts
  • services/agent/src/tools/relay.ts
  • services/agent/src/tracing/otel.ts
  • services/agent/src/version.ts
  • services/agent/tests/unit/cli.test.ts
  • services/agent/tests/unit/code-tool.test.ts
  • services/agent/tests/unit/continuation.test.ts
  • services/agent/tests/unit/extension-tools.test.ts
  • services/agent/tests/unit/mcp-servers.test.ts
  • services/agent/tests/unit/pi-capability-guard.test.ts
  • services/agent/tests/unit/pi-provider-env.test.ts
  • services/agent/tests/unit/responder.test.ts
  • services/agent/tests/unit/sandbox-agent-capabilities.test.ts
  • services/agent/tests/unit/sandbox-agent-daemon.test.ts
  • services/agent/tests/unit/sandbox-agent-daytona.test.ts
  • services/agent/tests/unit/sandbox-agent-errors.test.ts
  • services/agent/tests/unit/sandbox-agent-model.test.ts
  • services/agent/tests/unit/sandbox-agent-orchestration.test.ts
  • services/agent/tests/unit/sandbox-agent-permissions.test.ts
  • services/agent/tests/unit/sandbox-agent-pi-assets.test.ts
  • services/agent/tests/unit/sandbox-agent-provider.test.ts
  • services/agent/tests/unit/sandbox-agent-run-plan.test.ts
  • services/agent/tests/unit/sandbox-agent-usage.test.ts
  • services/agent/tests/unit/sandbox-agent-workspace.test.ts
  • services/agent/tests/unit/server.test.ts
  • services/agent/tests/unit/skills.test.ts
  • services/agent/tests/unit/stream-events.test.ts
  • services/agent/tests/unit/tool-bridge.test.ts
  • services/agent/tests/unit/tool-dispatch.test.ts
  • services/agent/tests/unit/tool-relay-permission.test.ts
  • services/agent/tests/unit/wire-contract.test.ts
  • services/agent/tests/utils/golden.ts
  • services/agent/tsconfig.json
  • services/agent/vitest.config.ts
  • services/entrypoints/main.py
  • services/oss/src/agent/__init__.py
  • services/oss/src/agent/app.py
  • services/oss/src/agent/config.py
  • services/oss/src/agent/schemas.py
  • services/oss/src/agent/secrets.py
  • services/oss/src/agent/tools/__init__.py
  • services/oss/src/agent/tools/gateway.py
  • services/oss/src/agent/tools/resolver.py
  • services/oss/src/agent/tools/secrets.py
  • services/oss/src/agent/tracing.py
  • services/oss/tests/pytest/integration/__init__.py
  • services/oss/tests/pytest/integration/agent/__init__.py
  • services/oss/tests/pytest/integration/agent/conftest.py
  • services/oss/tests/pytest/integration/agent/test_resolve_secrets_http.py
  • services/oss/tests/pytest/integration/agent/tools/__init__.py
  • services/oss/tests/pytest/integration/agent/tools/test_gateway_http.py
  • services/oss/tests/pytest/integration/agent/tools/test_secrets_http.py
  • services/oss/tests/pytest/unit/__init__.py
  • services/oss/tests/pytest/unit/agent/__init__.py
  • services/oss/tests/pytest/unit/agent/conftest.py
  • services/oss/tests/pytest/unit/agent/test_invoke_handler.py
  • services/oss/tests/pytest/unit/agent/test_secrets_mapping.py
  • services/oss/tests/pytest/unit/agent/test_select_backend.py
  • services/oss/tests/pytest/unit/agent/tools/__init__.py
  • services/oss/tests/pytest/unit/agent/tools/test_gateway_mapping.py
  • services/oss/tests/pytest/unit/agent/tools/test_resolution.py
  • web/ee/src/pages/w/[workspace_id]/p/[project_id]/apps/[app_id]/agent-chat/index.tsx
  • web/oss/package.json
  • web/oss/src/components/AgentChatSlice/AgentChatPanel.tsx
  • web/oss/src/components/AgentChatSlice/assets/agConfig.ts
  • web/oss/src/components/AgentChatSlice/assets/constants.ts
  • web/oss/src/components/AgentChatSlice/assets/files.ts
  • web/oss/src/components/AgentChatSlice/assets/loadSession.ts
  • web/oss/src/components/AgentChatSlice/assets/markdown.tsx
  • web/oss/src/components/AgentChatSlice/assets/rewind.ts
  • web/oss/src/components/AgentChatSlice/assets/toAgentaMessage.ts
  • web/oss/src/components/AgentChatSlice/assets/trace.ts
  • web/oss/src/components/AgentChatSlice/assets/transport.ts
  • web/oss/src/components/AgentChatSlice/components/AgentChatConversation.tsx
  • web/oss/src/components/AgentChatSlice/components/AgentMessage.tsx
  • web/oss/src/components/AgentChatSlice/components/SessionHistoryMenu.tsx
  • web/oss/src/components/AgentChatSlice/components/SessionTabLabel.tsx
  • web/oss/src/components/AgentChatSlice/components/ToolPart.tsx
  • web/oss/src/components/AgentChatSlice/index.tsx
  • web/oss/src/components/AgentChatSlice/state/sessions.ts
  • web/oss/src/components/Layout/Layout.tsx
  • web/oss/src/components/Playground/Playground.tsx
  • web/oss/src/components/SharedDrawers/SessionDrawer/assets/utils.ts
  • web/oss/src/components/SharedDrawers/SessionDrawer/components/SessionHeader/index.tsx
  • web/oss/src/components/SharedDrawers/TraceDrawer/components/TraceContent/components/TraceTypeHeader/index.tsx
  • web/oss/src/components/pages/app-management/components/CreateAppDropdown/index.tsx
  • web/oss/src/components/pages/app-management/modals/CreateAppTypeModal/index.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/assets/sessionCellStore.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/components/Cells/DurationCell.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/components/Cells/EndTimeCell.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/components/Cells/FirstInputCell.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/components/Cells/LastOutputCell.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/components/Cells/StartTimeCell.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/components/Cells/TotalCostCell.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/components/Cells/TotalLatencyCell.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/components/Cells/TotalUsageCell.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/components/Cells/TracesCountCell.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/index.tsx
  • web/oss/src/components/pages/prompts/assets/iconHelpers.tsx
  • web/oss/src/lib/helpers/dynamicEnv.ts
  • web/oss/src/pages/w/[workspace_id]/p/[project_id]/apps/[app_id]/agent-chat/index.tsx
  • web/oss/src/state/newObservability/atoms/queries.ts
  • web/oss/src/state/newObservability/selectors/tracing.ts
  • web/packages/agenta-entities/src/loadable/controller.ts
  • web/packages/agenta-entities/src/workflow/core/schema.ts
  • web/packages/agenta-entities/src/workflow/state/appUtils.ts
  • web/packages/agenta-entities/src/workflow/state/evaluatorUtils.ts
  • web/packages/agenta-entities/src/workflow/state/helpers.ts
  • web/packages/agenta-entities/src/workflow/state/molecule.ts
  • web/packages/agenta-entities/src/workflow/state/store.ts
  • web/packages/agenta-entities/tests/unit/derive-workflow-type-agent.test.ts
  • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentConfigControl.tsx
  • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/ClaudePermissionsControl.tsx
  • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/McpServerItemControl.tsx
  • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/SandboxPermissionControl.tsx
  • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/SchemaPropertyRenderer.tsx
  • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/SkillConfigControl.tsx
  • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/ToolItemControl.tsx
  • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/connectionUtils.ts
  • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/index.ts
  • web/packages/agenta-entity-ui/tests/unit/connectionUtils.test.ts
  • web/packages/agenta-entity-ui/tests/unit/skillConfigControl.test.ts
  • web/packages/agenta-playground-ui/src/components/ExecutionHeader/index.tsx
  • web/packages/agenta-playground-ui/src/components/ExecutionItems/index.tsx
  • web/packages/agenta-playground-ui/src/context/PlaygroundUIContext.tsx
  • web/packages/agenta-playground/src/index.ts
  • web/packages/agenta-playground/src/state/controllers/executionController.ts
  • web/packages/agenta-playground/src/state/execution/agentRequest.ts
  • web/packages/agenta-playground/src/state/execution/generationSelectors.ts
  • web/packages/agenta-playground/src/state/execution/index.ts
  • web/packages/agenta-playground/src/state/execution/selectors.ts
  • web/packages/agenta-playground/src/state/index.ts
  • web/packages/agenta-playground/tests/unit/agentMode.test.ts
  • web/packages/agenta-playground/tests/unit/agentRequest.test.ts

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 36.76% which is insufficient. The required threshold is 60.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title '[integration] Agent workflows (big-agents)' directly describes the PR as an integration tracking branch for the agent-workflows feature, which matches the stated objective of organizing and merging multiple related feature PRs. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The description matches the integration-tracking nature of the changes and references the same agent-workflows components added in the diff. |


mmabrouk added 5 commits June 22, 2026 14:16
…t connection flags

No-auth Composio toolkits (codeinterpreter, the composio meta-toolkit) could not be connected.
The adapter always POSTs an auth config, which Composio rejects for a no-auth toolkit
(Auth_Config_NoAuthApp), and resolve/execute required a connected-account id those toolkits do
not have, so the whole no-auth path was unreachable.

Detect a no-auth toolkit (every auth_config_details[].mode == NO_AUTH), skip the auth-config and
connected-account creation, and persist a usable connection with no Composio account. Resolve and
execute omit the account id for a no-auth connection (Composio runs those tools with no account).
Connection validity is now server-owned: a client can no longer send flags.is_valid to mark a
pending auth connection usable. Refresh on a no-auth connection is a no-op, not a not-found error.

Verified: connect now returns 200 instead of 500, resolve returns 200, and /tools/call ran
print(6*7) and returned 42. New test_no_auth_connection.py (11 tests); all 15 tools unit tests
pass, ruff clean. Reviewed by a second agent and Codex; their one blocker (client-settable
is_valid) is fixed here.

Claude-Session: https://claude.ai/code/session_01KsGSJQwsUdgWcNSEt2P2qD
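
The detection rule in the commit message above (every `auth_config_details[].mode == NO_AUTH`) could look roughly like the sketch below. It is a minimal illustration, not the adapter's code: the helper name `is_no_auth_toolkit`, the plain-dict input shape, and the choice to treat an empty `auth_config_details` list as "not no-auth" are all assumptions.

```python
from collections.abc import Mapping
from typing import Any


def is_no_auth_toolkit(toolkit: Mapping[str, Any]) -> bool:
    """Return True when every auth config entry declares mode NO_AUTH.

    Hypothetical helper: the dict shape follows the commit message, not a
    verified Composio response schema.
    """
    details = toolkit.get("auth_config_details") or []
    # Assumption: a toolkit with no entries is NOT treated as no-auth,
    # because all() over an empty list would otherwise return True.
    return bool(details) and all(
        isinstance(entry, Mapping) and entry.get("mode") == "NO_AUTH"
        for entry in details
    )
```

Under these assumptions, a caller would skip auth-config and connected-account creation only when the helper returns True, and fall back to the normal auth flow otherwise.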

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 10


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 76c33a7d-feff-4e5f-acc0-962498f74cfc

📥 Commits

Reviewing files that changed from the base of the PR and between a97e608 and 2eed5d0.

📒 Files selected for processing (70)
  • sdks/python/agenta/__init__.py
  • sdks/python/agenta/sdk/agents/__init__.py
  • sdks/python/agenta/sdk/agents/adapters/__init__.py
  • sdks/python/agenta/sdk/agents/adapters/_runner_config.py
  • sdks/python/agenta/sdk/agents/adapters/agenta_builtins.py
  • sdks/python/agenta/sdk/agents/adapters/harnesses.py
  • sdks/python/agenta/sdk/agents/adapters/in_process.py
  • sdks/python/agenta/sdk/agents/adapters/local.py
  • sdks/python/agenta/sdk/agents/adapters/sandbox_agent.py
  • sdks/python/agenta/sdk/agents/adapters/vercel/__init__.py
  • sdks/python/agenta/sdk/agents/adapters/vercel/messages.py
  • sdks/python/agenta/sdk/agents/adapters/vercel/routing.py
  • sdks/python/agenta/sdk/agents/adapters/vercel/sse.py
  • sdks/python/agenta/sdk/agents/adapters/vercel/stream.py
  • sdks/python/agenta/sdk/agents/dtos.py
  • sdks/python/agenta/sdk/agents/errors.py
  • sdks/python/agenta/sdk/agents/interfaces.py
  • sdks/python/agenta/sdk/agents/mcp/__init__.py
  • sdks/python/agenta/sdk/agents/mcp/errors.py
  • sdks/python/agenta/sdk/agents/mcp/interfaces.py
  • sdks/python/agenta/sdk/agents/mcp/models.py
  • sdks/python/agenta/sdk/agents/mcp/parsing.py
  • sdks/python/agenta/sdk/agents/mcp/resolver.py
  • sdks/python/agenta/sdk/agents/mcp/wire.py
  • sdks/python/agenta/sdk/agents/streaming.py
  • sdks/python/agenta/sdk/agents/tools/__init__.py
  • sdks/python/agenta/sdk/agents/tools/compat.py
  • sdks/python/agenta/sdk/agents/tools/errors.py
  • sdks/python/agenta/sdk/agents/tools/interfaces.py
  • sdks/python/agenta/sdk/agents/tools/models.py
  • sdks/python/agenta/sdk/agents/tools/parsing.py
  • sdks/python/agenta/sdk/agents/tools/resolver.py
  • sdks/python/agenta/sdk/agents/tools/wire.py
  • sdks/python/agenta/sdk/agents/ui_messages.py
  • sdks/python/agenta/sdk/agents/utils/__init__.py
  • sdks/python/agenta/sdk/agents/utils/ts_runner.py
  • sdks/python/agenta/sdk/agents/utils/wire.py
  • sdks/python/agenta/sdk/decorators/routing.py
  • sdks/python/agenta/sdk/engines/running/interfaces.py
  • sdks/python/agenta/sdk/engines/running/utils.py
  • sdks/python/agenta/sdk/middlewares/running/normalizer.py
  • sdks/python/agenta/sdk/models/workflows.py
  • sdks/python/agenta/sdk/utils/types.py
  • sdks/python/agenta/tests/agents/test_streaming.py
  • sdks/python/oss/tests/pytest/integration/agents/__init__.py
  • sdks/python/oss/tests/pytest/integration/agents/test_transport_roundtrip.py
  • sdks/python/oss/tests/pytest/unit/agents/__init__.py
  • sdks/python/oss/tests/pytest/unit/agents/conftest.py
  • sdks/python/oss/tests/pytest/unit/agents/golden/run_request.claude.json
  • sdks/python/oss/tests/pytest/unit/agents/golden/run_request.pi.json
  • sdks/python/oss/tests/pytest/unit/agents/golden/run_result.error.json
  • sdks/python/oss/tests/pytest/unit/agents/golden/run_result.ok.json
  • sdks/python/oss/tests/pytest/unit/agents/mcp/__init__.py
  • sdks/python/oss/tests/pytest/unit/agents/mcp/test_resolver.py
  • sdks/python/oss/tests/pytest/unit/agents/test_dtos_agent_config.py
  • sdks/python/oss/tests/pytest/unit/agents/test_dtos_capabilities_events.py
  • sdks/python/oss/tests/pytest/unit/agents/test_dtos_content_blocks.py
  • sdks/python/oss/tests/pytest/unit/agents/test_dtos_harness_configs.py
  • sdks/python/oss/tests/pytest/unit/agents/test_environment_lifecycle.py
  • sdks/python/oss/tests/pytest/unit/agents/test_harness_adapters.py
  • sdks/python/oss/tests/pytest/unit/agents/test_runner_adapter_config.py
  • sdks/python/oss/tests/pytest/unit/agents/test_ui_messages.py
  • sdks/python/oss/tests/pytest/unit/agents/test_wire_contract.py
  • sdks/python/oss/tests/pytest/unit/agents/tools/__init__.py
  • sdks/python/oss/tests/pytest/unit/agents/tools/test_models.py
  • sdks/python/oss/tests/pytest/unit/agents/tools/test_parsing.py
  • sdks/python/oss/tests/pytest/unit/agents/tools/test_resolver.py
  • sdks/python/oss/tests/pytest/unit/test_normalizer_passthrough.py
  • sdks/python/oss/tests/pytest/utils/test_messages_endpoint.py
  • sdks/python/oss/tests/pytest/utils/test_routing.py

Comment on lines +9 to +13
NOTE on packaging: the Node runner is NOT part of this Python wheel (``pip install agenta``
stays pure Python; the wheel contains zero ``.ts``/``.js``). How a standalone Pi user obtains
the runner -- an ``npx`` npm package, a local checkout, or a Docker sidecar over HTTP -- is an
open distribution decision; see ``docs/design/agent-workflows/typescript-structure/``. Do NOT
silently bundle a JS runner into the wheel.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align LocalBackend wording with the stated packaging contract.

Lines 9-13 say the wheel must not bundle a JS runner, but Line 30 and the NotImplementedError messages still say “bundled JS”. This contradiction will confuse integrators.

Suggested wording fix
-class LocalBackend(Backend):
-    """Run Pi (bundled JS) or Claude (``claude-agent-sdk``) on this machine."""
+class LocalBackend(Backend):
+    """Run Pi (external Node runner) or Claude (``claude-agent-sdk``) on this machine."""
...
         raise NotImplementedError(
-            "LocalBackend is not implemented yet (Phase 3: Pi via bundled JS, "
+            "LocalBackend is not implemented yet (Phase 3: Pi via external Node runner, "
             "Phase 4: Claude via claude-agent-sdk)."
         )
...
         raise NotImplementedError(
-            "LocalBackend is not implemented yet (Phase 3: Pi via bundled JS, "
+            "LocalBackend is not implemented yet (Phase 3: Pi via external Node runner, "
             "Phase 4: Claude via claude-agent-sdk)."
         )

Also applies to: 30-38, 50-53

Comment on lines +126 to +136
    def __init__(
        self,
        *,
        sandbox: str = "local",
        url: Optional[str] = None,
        command: Optional[Sequence[str]] = None,
        cwd: Optional[str] = None,
        timeout: float = float(os.getenv("AGENTA_AGENT_RUNNER_TIMEOUT_SECONDS", "180")),
    ) -> None:
        self._sandbox = sandbox
        self._url = url


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate sandbox at construction time.

Line 129 currently accepts any string; invalid values get sent over the wire and fail late. Restrict this to supported values (local, daytona) and raise a configuration error early.

Suggested validation
 from ..dtos import (
@@
 )
+from ..errors import AgentRunnerConfigurationError
@@
     def __init__(
         self,
         *,
         sandbox: str = "local",
@@
         timeout: float = float(os.getenv("AGENTA_AGENT_RUNNER_TIMEOUT_SECONDS", "180")),
     ) -> None:
+        allowed_sandboxes = {"local", "daytona"}
+        if sandbox not in allowed_sandboxes:
+            raise AgentRunnerConfigurationError(
+                f"Unsupported sandbox '{sandbox}'. Expected one of: {sorted(allowed_sandboxes)}."
+            )
         self._sandbox = sandbox
         self._url = url

Comment thread sdks/python/agenta/sdk/agents/adapters/vercel/messages.py
Comment on lines +693 to +699
llm_config = prompt_cfg.get("llm_config") or {}
model = llm_config.get("model") or defaults.model
instructions = _system_text(prompt_cfg.get("messages")) or defaults.instructions
raw_tools = llm_config.get("tools")
if raw_tools is None:
    raw_tools = prompt_cfg.get("tools")
else:


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard llm_config type before dictionary access.

Line 694 assumes prompt["llm_config"] is a dict. The `or {}` fallback only covers falsy values, so a truthy non-dict value (for example a string or a list) crashes this path with AttributeError instead of applying defaults.

Proposed fix
     prompt_cfg = params.get("prompt")
     if isinstance(prompt_cfg, dict):
-        llm_config = prompt_cfg.get("llm_config") or {}
+        raw_llm_config = prompt_cfg.get("llm_config")
+        llm_config = raw_llm_config if isinstance(raw_llm_config, dict) else {}
         model = llm_config.get("model") or defaults.model
         instructions = _system_text(prompt_cfg.get("messages")) or defaults.instructions
         raw_tools = llm_config.get("tools")
         if raw_tools is None:
             raw_tools = prompt_cfg.get("tools")

Comment on lines +222 to +232
sandbox = await self._sandbox()
if provisioning:
    await sandbox.add_files(provisioning)
return await self._backend.create_session(
    sandbox,
    config,
    harness=harness,
    secrets=session_config.secrets,
    trace=session_config.trace,
    session_id=session_config.session_id,
)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Destroy per-session sandbox on setup/session-creation failure.

If Line 224 (add_files) or Line 225 (create_session) raises, a per-session sandbox is left alive with no owner to tear it down.

Proposed fix
     async def create_session(
         self,
         config: HarnessAgentConfig,
         *,
         harness: HarnessType,
         session_config: SessionConfig,
         provisioning: Optional[Mapping[str, bytes]] = None,
     ) -> Session:
         """Provision a sandbox per policy, then open a session in it."""
         sandbox = await self._sandbox()
-        if provisioning:
-            await sandbox.add_files(provisioning)
-        return await self._backend.create_session(
-            sandbox,
-            config,
-            harness=harness,
-            secrets=session_config.secrets,
-            trace=session_config.trace,
-            session_id=session_config.session_id,
-        )
+        try:
+            if provisioning:
+                await sandbox.add_files(provisioning)
+            return await self._backend.create_session(
+                sandbox,
+                config,
+                harness=harness,
+                secrets=session_config.secrets,
+                trace=session_config.trace,
+                session_id=session_config.session_id,
+            )
+        except Exception:
+            if self._sandbox_per_session:
+                try:
+                    await sandbox.destroy()
+                except Exception:
+                    pass
+            raise
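
A self-contained way to exercise this cleanup pattern is sketched below. `FakeSandbox`, `open_session`, and `demo_sandbox_cleanup` are hypothetical stand-ins, not SDK classes. The sketch also makes one deliberate variation on the proposed fix: it catches `BaseException`, because `asyncio.CancelledError` is not an `Exception` subclass on Python 3.8+ and a cancelled setup would otherwise skip the teardown. Whether that is desirable here is a judgment call for the maintainers.

```python
import asyncio


class FakeSandbox:
    """Hypothetical stand-in for a per-session sandbox."""

    def __init__(self) -> None:
        self.destroyed = False

    async def add_files(self, files: dict) -> None:
        raise RuntimeError("upload failed")  # simulate a provisioning error

    async def destroy(self) -> None:
        self.destroyed = True


async def open_session(sandbox: FakeSandbox, files: dict, per_session: bool) -> str:
    try:
        await sandbox.add_files(files)
        return "session"
    except BaseException:
        # Tear down only sandboxes this call owns; shared ones stay alive.
        if per_session:
            try:
                await sandbox.destroy()
            except Exception:
                pass  # best-effort teardown; do not mask the original error
        raise


async def demo_sandbox_cleanup() -> bool:
    sandbox = FakeSandbox()
    try:
        await open_session(sandbox, {"a.txt": b"x"}, per_session=True)
    except RuntimeError:
        pass
    return sandbox.destroyed
```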

Comment on lines +315 to +321
session = await self.create_session(config)

def _absorb(result: AgentResult) -> None:
    if result.session_id:
        config.session_id = result.session_id

return session.stream(messages).on_result(_absorb).on_cleanup(session.destroy)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Ensure session cleanup if stream setup fails synchronously.

Line 321 only registers cleanup after session.stream(messages) succeeds. If stream construction raises, the session is leaked.

Proposed fix
         session = await self.create_session(config)
+        try:
+            run = session.stream(messages)
+        except Exception:
+            await session.destroy()
+            raise
 
         def _absorb(result: AgentResult) -> None:
             if result.session_id:
                 config.session_id = result.session_id
 
-        return session.stream(messages).on_result(_absorb).on_cleanup(session.destroy)
+        return run.on_result(_absorb).on_cleanup(session.destroy)
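
The leak is specific to a synchronous raise inside `session.stream(...)`, before any cleanup hook is attached. A minimal reproduction, using a hypothetical `FakeSession` whose `stream` always raises (not the SDK's session type), might look like this:

```python
import asyncio


class FakeSession:
    """Hypothetical session whose stream() fails during construction."""

    def __init__(self) -> None:
        self.destroyed = False

    def stream(self, messages: list) -> object:
        # Raises before the caller has a run object to attach cleanup to.
        raise ValueError("bad messages")

    async def destroy(self) -> None:
        self.destroyed = True


async def start_stream(session: FakeSession, messages: list) -> object:
    try:
        run = session.stream(messages)
    except Exception:
        await session.destroy()  # nothing else owns the session yet
        raise
    return run


async def demo_stream_cleanup() -> bool:
    session = FakeSession()
    try:
        await start_stream(session, [])
    except ValueError:
        pass
    return session.destroyed
```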

Comment on lines +7 to +20
from agenta.sdk.agents.tools.models import MissingSecretPolicy

from .errors import MissingMCPSecretError
from .interfaces import MCPSecretProvider
from .models import MCPServerConfig, ResolvedMCPServer


class MCPResolver:
    def __init__(
        self,
        *,
        secret_provider: MCPSecretProvider,
        missing_secret_policy: MissingSecretPolicy = MissingSecretPolicy.ERROR,
    ) -> None:


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Breaks declared layer direction by importing tools model into MCP.

MCPResolver currently depends on agenta.sdk.agents.tools.models.MissingSecretPolicy, but this cohort declares tools as depending on MCP, not the other way around. This reverse edge can create import-order fragility and circular dependency risk as the stack evolves. Move MissingSecretPolicy to a neutral/shared module (or MCP/shared contract module) and import it from both subsystems.

Possible direction
- from agenta.sdk.agents.tools.models import MissingSecretPolicy
+ from agenta.sdk.agents.shared.missing_secret_policy import MissingSecretPolicy

(then define/move the enum in that shared module and update tools imports accordingly)

Comment on lines +67 to +75
out = stdout.decode("utf-8", "replace")
err = stderr.decode("utf-8", "replace")
if not out.strip():
    raise RuntimeError(
        f"Agent runner returned no output. exit={proc.returncode} stderr={err[-2000:]}"
    )
try:
    return json.loads(out)
except json.JSONDecodeError as exc:


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Treat non-zero subprocess exit as transport failure even with parseable JSON.

Line 74 returns parsed JSON without checking proc.returncode; a crashed runner can look successful if it emitted partial/legacy JSON before exiting non-zero.

Suggested fix
@@ async def deliver_subprocess(...):
     out = stdout.decode("utf-8", "replace")
     err = stderr.decode("utf-8", "replace")
+    if proc.returncode not in (0, None):
+        raise RuntimeError(
+            "Agent runner exited non-zero. "
+            f"exit={proc.returncode} stderr={err[-2000:]} stdout={out[:500]}"
+        )
     if not out.strip():
         raise RuntimeError(
             f"Agent runner returned no output. exit={proc.returncode} stderr={err[-2000:]}"
         )
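
The scenario this guards against (valid JSON on stdout followed by a non-zero exit) can be reproduced with a child Python process. The sketch below is an assumption-laden illustration: `run_and_parse` and the two command strings are hypothetical, and it uses the current interpreter in place of the Node runner.

```python
import asyncio
import json
import sys


async def run_and_parse(code: str) -> dict:
    """Run `python -c code` and parse stdout as JSON, rejecting non-zero exits."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable,
        "-c",
        code,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate()
    out = stdout.decode("utf-8", "replace")
    err = stderr.decode("utf-8", "replace")
    # Check the exit code first: parseable output does not imply success.
    if proc.returncode != 0:
        raise RuntimeError(
            f"runner exited non-zero. exit={proc.returncode} stderr={err[-2000:]}"
        )
    if not out.strip():
        raise RuntimeError("runner returned no output")
    return json.loads(out)


# Both children print the same JSON; only the exit code differs.
HEALTHY = "import json; print(json.dumps({'ok': True}))"
CRASHING = "import json, sys; print(json.dumps({'ok': True})); sys.exit(3)"
```

Without the return-code check, both commands would yield the same parsed payload and the crash would go unnoticed.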

Comment thread sdks/python/agenta/sdk/agents/utils/ts_runner.py
# agenta:builtin:* — application-only (not evaluators)
("builtin", "chat"): (True, False, False),
("builtin", "completion"): (True, False, False),
("builtin", "agent"): (True, False, False),


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

is_agent is never inferred, so agent workflows keep WorkflowFlags.is_agent=False.

You added the built-in agent role mapping, but infer_flags_from_data still never computes/passes is_agent into WorkflowFlags, so the new agent flag/filter path won’t work as intended.

💡 Proposed fix
@@
-    is_chat = key == "chat" or _has_messages_input(inputs_schema)
+    is_chat = key == "chat" or _has_messages_input(inputs_schema)
+    is_agent = key == "agent"
@@
     return WorkflowFlags(
@@
         # schema-derived
         is_chat=is_chat,
+        is_agent=is_agent,
         # interface-derived
         has_url=has_url,
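
Reduced to its essentials, the proposed inference is a key comparison that must also be passed into the flags object. The sketch below uses a hypothetical `Flags` dataclass and `infer_flags` function rather than the real `WorkflowFlags` and `infer_flags_from_data`, whose full field set is not shown in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flags:
    """Hypothetical two-field stand-in for WorkflowFlags."""

    is_chat: bool = False
    is_agent: bool = False


def infer_flags(key: str, has_messages_input: bool = False) -> Flags:
    is_chat = key == "chat" or has_messages_input
    is_agent = key == "agent"
    # Computing is_agent is not enough; it must be passed to the constructor,
    # otherwise the dataclass default (False) silently wins.
    return Flags(is_chat=is_chat, is_agent=is_agent)
```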

mmabrouk added 2 commits June 23, 2026 12:10
…ckage

Move the Agenta-platform-backed tool and secret resolution out of the agent service
into a new SDK package (agenta.sdk.agents.platform) so a standalone SDK user with a
local backend resolves gateway tools and secrets the same way the service does.

- New SDK package: PlatformConnection, AgentaGatewayToolResolver, AgentaNamedSecretProvider
  + resolve_named_secrets, resolve_provider_keys, and three entrypoints resolve_tools /
  resolve_mcp / resolve_secrets.
- Service is now thin: client.py deleted (logic in PlatformConnection, timeout guarded);
  tools/{gateway,secrets}.py and secrets.py are re-export shims; resolver.py keeps only the
  AGENTA_AGENT_ENABLE_MCP gate; app.py calls the three entrypoints with symmetric helpers.
- Behavior-preserving: /run wire + resolved bundle unchanged (golden test green). Secret
  logs count-only; named secrets restricted to the requested set.
- Tests: SDK agents 164 + service agent unit 20; HTTP integration tests relocated to the SDK.

Claude-Session: https://claude.ai/code/session_019gCmobHk9Pi3Y2HDTw3Wrs

test(agent): add SDK platform conftest and gateway resolver test
…links

fix(docs): remove broken custom-agent-runner-images links
ci(agent): build and test sandbox-agent images
mmabrouk added 2 commits June 24, 2026 19:03
chore(railway): add sandbox-agent preview deployment
chore(kubernetes): deploy sandbox-agent sidecar
@mmabrouk mmabrouk marked this pull request as ready for review June 24, 2026 17:05
@dosubot dosubot Bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files.) and draft labels Jun 24, 2026
mmabrouk added 2 commits June 24, 2026 19:26
ClaudeAgentConfig must override wire_skills to return {} since Claude's
headless SDK cannot load inline skill packages. The override was lost in
the main->big-agents merge, regressing
test_invoke_cross_harness_same_body_divergent_configs.

Claude-Session: https://claude.ai/code/session_01DEZYALzKjh9ocjkscaBWRT
Two files failed ruff format --check on the integration branch.

Claude-Session: https://claude.ai/code/session_01DEZYALzKjh9ocjkscaBWRT
mmabrouk added 2 commits June 24, 2026 19:44
* fix(frontend): repair SessionsTable JSX from botched merge

The main->big-agents merge left a duplicate ternary and a mismatched
<div>/</SessionStoreProvider> wrapper, breaking prettier, eslint and the
web build. Unite both sides: wrap in SessionStoreProvider, keep one
table with store={store} and the flex-1 layout class.

Claude-Session: https://claude.ai/code/session_01DEZYALzKjh9ocjkscaBWRT

* fix(entities): add is_agent flag in ephemeral workflow build

The merge added a defaulted is_agent flag to workflowFlagsSchema, but the
agent-playground ephemeral workflow constructed its flags without it. With
the literal true/false flag values, the 'as Workflow' cast then failed
bidirectional overlap (TS2352), breaking the agenta-web build. Set
is_agent from the workflow type.

Claude-Session: https://claude.ai/code/session_01DEZYALzKjh9ocjkscaBWRT

* fix(playground,entity-ui): clear package type errors from merge

Two more tsc --noEmit failures broke the agenta-web build (turbo builds
each package before the app):
- agentRequest.ts: annotate headers as Record<string,string> so the
  conditional header-factory spread does not narrow away the index
  signature (Authorization access, TS2339).
- AgentConfigControl.tsx: drop the stale 'default' key from
  CONNECTION_MODE_LABELS; ConnectionMode is agenta|self_managed only
  after the provider-model-auth refactor (TS2353).

Claude-Session: https://claude.ai/code/session_01DEZYALzKjh9ocjkscaBWRT
…4825)

My earlier #4824 added a ClaudeAgentConfig.wire_skills override returning
{} (graceful-degrade), but that contradicts the authoritative behavior
from 08212c6 (fix(agent): materialize skills for Claude harness): the
runner materializes skills under .claude/skills/<name>, so Claude carries
them on the wire. The override broke the SDK unit test
test_claude_carries_skills_for_project_local_materialization.

Remove the override (Claude inherits the base wire_skills) and update the
stale services-test assertion to expect Claude carries the skill.

Claude-Session: https://claude.ai/code/session_01DEZYALzKjh9ocjkscaBWRT
@github-actions

github-actions Bot commented Jun 24, 2026

Contributor

Railway Preview Environment

Preview URL: https://gateway-production-0f05.up.railway.app/w
Project: agenta-oss-pr-4791
Image tag: pr-4791-237d7c6
Status: Deployed
Railway logs: Open logs
Workflow logs: View workflow run
Updated at: 2026-06-24T19:33:14.381Z

…4826)

* test(agent): align acceptance/integration tests with refactors

These suites were skipped while the web build was broken; once it passed
they ran and surfaced pre-existing drift on big-agents:

- sdk acceptance: the agent builtin now ships a registered interface (no
  in-process handler), so test_agent_alias_is_not_registered was stale.
  Renamed to assert the interface is registered and the handler is absent.
- services integration: gateway/secret resolution moved into the SDK
  platform package (#4772), so the agent_api_base/request_authorization/
  httpx/log module attributes the conftest patched no longer exist on the
  service shims. Patch the SDK platform connection derivation helpers and
  the SDK platform module httpx/log instead.

Claude-Session: https://claude.ai/code/session_01DEZYALzKjh9ocjkscaBWRT

* test(agent): patch SDK platform secrets module in resolve-secrets test

Claude-Session: https://claude.ai/code/session_01DEZYALzKjh9ocjkscaBWRT
The store path strips the server-owned is_platform flag before persisting
(_scrub_server_owned_flags), but the query path did not, so any
/workflows/query carrying is_platform (e.g. a client re-posting a
workflow's own echoed flags) built a JSONB containment filter for a key
that is never stored, matching zero rows.

Scrub server-owned flags on both the artifact and revision query builders,
symmetric with the write path. Platform-catalogue workflows are served
from the code catalog, not the DB, so is_platform must never gate a DB
containment query.

Fixes the skipped-then-surfaced acceptance test
test_query_workflows_by_flags (count 0 -> 1).

Claude-Session: https://claude.ai/code/session_01DEZYALzKjh9ocjkscaBWRT
)

#4827 scrubbed the server-owned is_platform flag from query filters
unconditionally, which broke test_query_with_explicit_is_platform_filters_on_it:
an explicit is_platform=True is a deliberate platform-catalogue filter and
must be preserved.

Use a query-specific scrub that drops a server-owned flag only when its
value is False (the echoed default that would otherwise match nothing,
since the key is scrubbed on write). An explicit True is kept. The write
path keeps the unconditional scrub.

Claude-Session: https://claude.ai/code/session_01DEZYALzKjh9ocjkscaBWRT
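
The difference between the write-path scrub and the query-path scrub described in the last two commit messages can be sketched as follows. The function names and the `SERVER_OWNED_FLAGS` set are hypothetical; only `is_platform` is named in the source, and the real helpers may operate on model objects rather than plain dicts.

```python
from typing import Any, Dict

# Assumption: is_platform is the only server-owned flag mentioned in the thread.
SERVER_OWNED_FLAGS = frozenset({"is_platform"})


def scrub_for_write(flags: Dict[str, Any]) -> Dict[str, Any]:
    """Write path: always drop server-owned flags before persisting."""
    return {k: v for k, v in flags.items() if k not in SERVER_OWNED_FLAGS}


def scrub_for_query(flags: Dict[str, Any]) -> Dict[str, Any]:
    """Query path: drop a server-owned flag only when its value is False.

    False is the echoed default; the key is never stored, so a containment
    filter on it would match no rows. An explicit True is a deliberate filter.
    """
    return {
        k: v
        for k, v in flags.items()
        if not (k in SERVER_OWNED_FLAGS and v is False)
    }
```

Under these assumptions, a client that re-posts a workflow's echoed flags (`is_platform=False`) queries as if the flag were absent, while an explicit `is_platform=True` still reaches the platform-catalogue filter.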

Labels

draft size:XXL This PR changes 1000+ lines, ignoring generated files.


2 participants