Add LangSmith tracing plugin for Temporal workflows#1369

Draft
xumaple wants to merge 17 commits into main from maplexu/langsmith-plugin

@xumaple xumaple commented Mar 17, 2026

Summary

  • Adds temporalio.contrib.langsmith plugin that creates LangSmith trace hierarchies for Temporal operations (workflows, activities, signals, queries, updates, child workflows, Nexus)
  • Supports ambient @traceable context propagation through Temporal headers, replay-safe tracing, and an add_temporal_runs toggle for lightweight context-only mode
  • 48 tests covering unit, integration, and comprehensive end-to-end scenarios

🤖 Generated with Claude Code

CLAassistant commented Mar 17, 2026

CLA assistant check
All committers have signed the CLA.

xumaple and others added 4 commits March 16, 2026 21:44
Implements a LangSmith contrib plugin that creates trace hierarchies
for Temporal operations (workflows, activities, signals, queries,
updates, child workflows, Nexus). Supports ambient @traceable context
propagation, replay-safe tracing, and an add_temporal_runs toggle for
lightweight context-only mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…late

- Add ReplaySafeRunTree wrapper that handles replay skipping and sandbox
  safety (post/end/patch no-op during replay, sandbox_unrestricted in
  workflow context), inspired by OTel plugin's _ReplaySafeSpan pattern
- Add config.maybe_run() to eliminate repeated config kwargs at every
  call site
- Add _traced_call (client outbound) and _traced_outbound (workflow
  outbound) helpers to reduce interceptor methods to one-liners
- Fold _extract_context into _workflow_maybe_run for workflow inbound
- Remove _safe_post, _safe_patch helpers (internalized in wrapper)
- Remove in_workflow parameter from _maybe_run (wrapper detects it)
- Establish consistent wrapping invariant: all run references are
  ReplaySafeRunTree, unwrapping is unconditional ._run at RunTree
  constructor boundary
- Parametrize redundant unit tests (client outbound, workflow
  inbound/outbound) and remove duplicate test
- Remove _make_interceptor test helper, use LangSmithInterceptor directly
- Collapse plugin constructor tests into one, add comprehensive plugin
  integration test, remove redundant sandbox tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
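
The replay-safe wrapper described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: `FakeRun` is a hypothetical stand-in for a LangSmith run object, and the `is_replaying` callable is an assumed way of asking whether the workflow is currently replaying. Only the idea (post/end/patch become no-ops during replay, and the wrapped run is reachable via `._run`) comes from the commit message.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FakeRun:
    """Hypothetical stand-in for a run object; records calls instead of doing I/O."""

    name: str
    calls: List[str] = field(default_factory=list)

    def post(self) -> None:
        self.calls.append("post")

    def end(self) -> None:
        self.calls.append("end")

    def patch(self) -> None:
        self.calls.append("patch")


class ReplaySafeRun:
    """Wraps a run so that post/end/patch are skipped while replaying."""

    def __init__(self, run: FakeRun, is_replaying: Callable[[], bool]) -> None:
        # The wrapped run stays reachable as ._run for code that needs the raw object.
        self._run = run
        # Assumption: the caller supplies a cheap "are we replaying?" check.
        self._is_replaying = is_replaying

    def post(self) -> None:
        if not self._is_replaying():
            self._run.post()

    def end(self) -> None:
        if not self._is_replaying():
            self._run.end()

    def patch(self) -> None:
        if not self._is_replaying():
            self._run.patch()
```

Keeping the replay check inside the wrapper is what lets call sites drop helpers such as `_safe_post` and `_safe_patch`: every caller simply calls `post()` or `patch()` and the wrapper decides whether anything happens.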
Fix ruff I001 import sorting violations in _interceptor.py and
test_integration.py. Extract _get_current_run_safe() helper for
reading ambient LangSmith context with replay safety.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
xumaple force-pushed the maplexu/langsmith-plugin branch from b6a7751 to 80d981e on March 17, 2026 01:44
xumaple and others added 5 commits March 17, 2026 11:09
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
xumaple force-pushed the maplexu/langsmith-plugin branch from 2803b95 to 768ac70 on March 17, 2026 17:22
xumaple and others added 8 commits March 18, 2026 16:26
- Change add_temporal_runs default to False in both plugin and
  interceptor (reviewer preference for opt-in behavior)
- Rename plugin to langchain.LangSmithPlugin per organization.PluginName
  convention
- Prefix header key with _temporal- to avoid collisions
- Update all tests to explicitly pass add_temporal_runs=True

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
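
To illustrate why a `_temporal-` prefix on the header key helps avoid collisions, here is a small sketch of injecting and extracting a tracing context through a header mapping. The key name `_temporal-langsmith-context`, the JSON encoding, and both function names are assumptions made for this example; the PR does not state the actual key or wire format.

```python
import json
from typing import Dict, Mapping, Optional

# Hypothetical key name: only the "_temporal-" prefix is taken from the commit message.
HEADER_KEY = "_temporal-langsmith-context"


def inject_context(
    headers: Mapping[str, bytes], context: Mapping[str, str]
) -> Dict[str, bytes]:
    """Return a copy of headers with the tracing context stored under HEADER_KEY."""
    updated = dict(headers)  # never mutate the caller's mapping
    updated[HEADER_KEY] = json.dumps(dict(context), sort_keys=True).encode("utf-8")
    return updated


def extract_context(headers: Mapping[str, bytes]) -> Optional[Dict[str, str]]:
    """Read the tracing context back, or return None when the header is absent."""
    raw = headers.get(HEADER_KEY)
    if raw is None:
        return None
    return json.loads(raw.decode("utf-8"))
```

A user-defined header such as `langsmith-context` would not clash with the prefixed key, so the plugin's entry and application headers can coexist in the same mapping.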
- Add @traceable call (outer_chain) directly in ComprehensiveWorkflow
  to test non-deterministic tracing alongside deterministic replay
- Set max_cached_workflows=0 on all test workers to force replay on
  every workflow task, exposing header non-determinism
- Restructure comprehensive tests with mid-workflow worker restart:
  one shared collector across two worker lifetimes proves context
  propagates via headers, not cached plugin state
- Add is_waiting_for_signal query and poll helper for deterministic
  sync (no arbitrary sleeps)
- Consolidate make_mock_ls_client in conftest.py, remove unused
  fixtures, use raw client for polling to avoid trace contamination
- Tests are expected to fail (TDD): sandbox blocks @traceable in
  workflows, max_cached_workflows=0 exposes outputs=None on eviction

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
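
The poll helper mentioned in the commit above could look something like the sketch below. It is an assumption-laden illustration: `poll_until`, its parameters, and the bounded timeout are hypothetical, and in the real tests the `query` callable would presumably wrap a workflow query such as `is_waiting_for_signal`. The short interval only paces the polling loop; the test proceeds as soon as the condition holds rather than after a fixed sleep.

```python
import asyncio
from typing import Awaitable, Callable


async def poll_until(
    query: Callable[[], Awaitable[bool]],
    *,
    interval: float = 0.01,
    timeout: float = 5.0,
) -> None:
    """Repeatedly await query() until it returns True or the timeout elapses."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while not await query():
        if loop.time() >= deadline:
            raise TimeoutError("condition not reached before timeout")
        # Yield to the event loop between polls so other tasks can make progress.
        await asyncio.sleep(interval)
```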
Move RunTree.post()/patch() I/O off the workflow task thread to a
single-worker ThreadPoolExecutor, preventing deadlocks from
compressed_traces.lock contention with the LangSmith drain thread.

Key changes:
- _ReplaySafeRunTree.create_child() override propagates replay safety
  and deterministic IDs to nested @langsmith.traceable calls
- Executor-backed post()/patch() with FIFO ordering and fire-and-forget
  error logging via Future.add_done_callback
- _ContextBridgeRunTree for add_temporal_runs=False without external
  context: an invisible parent that produces root @traceable runs
- aio_to_thread patch simplified: removed harmful replay-time tracing
  disable, added error gate for async @traceable without plugin
- Plugin shutdown via SimplePlugin.run_context instead of dead method
- Fix misleading comments referencing test artifacts instead of
  production reasons, remove OTel cross-references
- Strict dump_runs catches dangling parent_run_id references
- Add **/CLAUDE.md to .gitignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
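
The executor-backed I/O described in the commit above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: `BackgroundSubmitter` and its method names are hypothetical, and the submitted callables stand in for the real `post()`/`patch()` calls. What it shows is the pattern the commit names: a single-worker `ThreadPoolExecutor` gives FIFO ordering, and `Future.add_done_callback` logs failures without the caller waiting on the result.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable

logger = logging.getLogger(__name__)


class BackgroundSubmitter:
    """Runs submitted callables one at a time, in order, on a background thread."""

    def __init__(self) -> None:
        # A single worker pulls from a FIFO queue, so tasks run in submission order.
        self._executor = ThreadPoolExecutor(
            max_workers=1, thread_name_prefix="trace-io"
        )

    def submit(self, fn: Callable[[], None]) -> "Future[None]":
        future = self._executor.submit(fn)
        # Fire-and-forget: errors are logged from the callback, never raised here.
        future.add_done_callback(self._log_error)
        return future

    @staticmethod
    def _log_error(future: "Future[None]") -> None:
        if future.cancelled():
            return
        exc = future.exception()
        if exc is not None:
            logger.error("background trace I/O failed: %s", exc)

    def shutdown(self) -> None:
        # Wait for queued work to drain before returning.
        self._executor.shutdown(wait=True)
```

Because the calling thread only enqueues work, it never blocks on a lock held by another thread doing network I/O, which is the deadlock scenario the commit describes.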
Replace ~35 Any annotations across _plugin.py and _interceptor.py with
precise types (langsmith.Client, RunTree, _ReplaySafeRunTree, specific
SDK interceptor input types, etc.). Add _InputWithHeaders Protocol for
private helpers matching the OTel interceptor pattern. Narrow return
types to match base class signatures exactly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
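
As a rough illustration of the Protocol approach from the commit above, the sketch below defines a structural type for "any input that carries headers" and uses it in a private helper. The names `InputWithHeaders`, `StartWorkflowInput`, `SignalInput`, and `header_keys` are hypothetical stand-ins; the real SDK input types and the plugin's `_InputWithHeaders` definition may differ, and `bytes` is used here in place of whatever payload type the headers actually hold.

```python
from dataclasses import dataclass, field
from typing import List, Mapping, Protocol


class InputWithHeaders(Protocol):
    """Structural type: anything with a headers mapping satisfies it."""

    headers: Mapping[str, bytes]


@dataclass
class StartWorkflowInput:
    """Hypothetical interceptor input type."""

    workflow: str
    headers: Mapping[str, bytes] = field(default_factory=dict)


@dataclass
class SignalInput:
    """Another hypothetical input type, unrelated by inheritance."""

    signal: str
    headers: Mapping[str, bytes] = field(default_factory=dict)


def header_keys(input: InputWithHeaders) -> List[str]:
    """Helper typed against the Protocol rather than Any."""
    return sorted(input.headers)
```

A type checker accepts both input classes for `header_keys` without a shared base class, which is what allows a helper to stay precisely typed while serving many interceptor methods.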
Prefix unused mock parameters with underscore (_args, _kwargs) and
rename unused variable (_collector) to satisfy basedpyright's
reportUnusedParameter and reportUnusedVariable checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove useless _get_current_run_safe wrapper (inline get_current_run_tree)
- Restore generic type params on interceptor return types (ActivityHandle[Any],
  ChildWorkflowHandle[Any, Any]) to match base class exactly
- Fix _make_bridge return type (Any → _ContextBridgeRunTree)
- Fix _poll_query helper types (Any → WorkflowHandle, Callable)
- Strengthen weak assertions in mixed sync/async integration tests
- Add _InputWithHeaders Protocol for private helper input params

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wrap all 5 activity definitions with @traceable as outer decorator to
test LangSmith tracing through the full activity execution path. Update
all 9 expected trace hierarchies to account for the additional @traceable
run nested under each RunActivity. Fix outputs assertion to only check
interceptor runs (colon-prefixed names) since @traceable captures actual
return values rather than the interceptor's {'status': 'ok'}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>