feat(llm): make SSE buffer caps configurable (tool-JSON, thinking, compaction) #4750

@bug-ops

Description

The SSE buffer caps added in #4727 are hardcoded constants and cannot be tuned per-deployment without a code change.

| Buffer | File | Constant | Current cap |
|---|---|---|---|
| Tool-call JSON | crates/zeph-llm/src/sse.rs:324 | MAX_TOOL_JSON_BUF | 4 MiB |
| Thinking | crates/zeph-llm/src/sse.rs:339 | MAX_THINKING_BUF | 1 MiB |
| Compaction | crates/zeph-llm/src/sse.rs:171 | MAX_COMPACTION_BUF | 32 KiB |

Expected Behavior

Deployments handling large tool payloads (e.g., code generation tools that emit large JSON) or extended thinking sessions should be able to raise the caps via config without rebuilding.

Suggested config path: [llm.stream_limits] with max_tool_json_bytes, max_thinking_bytes, max_compaction_bytes; default values unchanged from current constants.
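
A minimal sketch of how the suggested section could map onto a settings type. The struct name `StreamLimits` is hypothetical and not taken from the codebase; the field names follow the suggested keys, and the defaults mirror the current caps listed above.

```rust
/// Hypothetical settings type for the suggested `[llm.stream_limits]` section.
/// The struct name is an assumption; field names follow the suggested keys.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StreamLimits {
    /// Cap for the tool-call JSON buffer, in bytes.
    pub max_tool_json_bytes: usize,
    /// Cap for the thinking buffer, in bytes.
    pub max_thinking_bytes: usize,
    /// Cap for the compaction buffer, in bytes.
    pub max_compaction_bytes: usize,
}

impl Default for StreamLimits {
    /// Defaults match the current hardcoded constants, so behavior is
    /// unchanged when the section is absent from the config.
    fn default() -> Self {
        Self {
            max_tool_json_bytes: 4 * 1024 * 1024, // 4 MiB
            max_thinking_bytes: 1024 * 1024,      // 1 MiB
            max_compaction_bytes: 32 * 1024,      // 32 KiB
        }
    }
}
```

With a type like this, a deployment with large tool payloads could set only `max_tool_json_bytes = 16777216` under `[llm.stream_limits]` and leave the other two keys at their defaults. How the section is actually deserialized (for example via serde with `#[serde(default)]`) is left open here.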

Actual Behavior

The caps are compile-time constants. When a buffer exceeds its cap, a warning is logged and the data is discarded; no recovery is possible.
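
For illustration, the warn-and-discard behavior could look roughly like the sketch below once the cap is passed in at runtime instead of read from a constant. The helper `push_capped` is hypothetical, `eprintln!` stands in for whatever logging the crate uses, and the issue does not say whether the real code drops only the overflowing chunk or the whole buffer; this sketch assumes the former.

```rust
/// Hypothetical helper: appends `chunk` to `buf` unless the result would
/// exceed `cap` bytes. Returns `true` if the chunk was appended and `false`
/// if it was discarded.
pub fn push_capped(buf: &mut String, chunk: &str, cap: usize, label: &str) -> bool {
    // saturating_add avoids overflow when computing the prospective length.
    if buf.len().saturating_add(chunk.len()) > cap {
        // Stand-in for the crate's real logging (an assumption).
        eprintln!(
            "{label} buffer would exceed {cap} bytes; discarding {} bytes",
            chunk.len()
        );
        return false;
    }
    buf.push_str(chunk);
    true
}
```

The `cap` argument would come from the configured limit (for example `max_tool_json_bytes`) rather than from `MAX_TOOL_JSON_BUF`.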

Environment

  • Version: HEAD 6585ebf
  • Features: full

Logs / Evidence

Detected during CI-952 arch audit.

Metadata

Assignees

No one assigned

Labels

  • P4: Long-term / exploratory
  • enhancement: New feature or request
  • llm: zeph-llm crate (Ollama, Claude)
