feat(pinecone): add Inference API instrumentation for embed and rerank #4269

Open
by-Kaimercer wants to merge 1 commit into traceloop:main from by-Kaimercer:main

Conversation

by-Kaimercer commented Jun 16, 2026

Summary

Adds OpenTelemetry tracing for Pinecone's Inference API calls:

  • pc.inference.embed(): generates vector embeddings
  • pc.inference.rerank(): reranks documents using cross-encoder models

New span attributes

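  • gen_ai.request.model: the embedding/reranking model used
  • gen_ai.usage.input_count: number of input texts/documents
  • gen_ai.usage.output_count: number of embeddings returned
  • pinecone.embedding.dimensionality: dimension of embedding vectors
  • pinecone.rerank.top_n: top_n parameter for reranking
  • pinecone.rerank.result_count: number of reranked results
  • pinecone.usage.read_units: read units consumed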
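
To make the attribute list concrete, here is a minimal sketch of how these values could be derived from an embed or rerank call. It is not the PR's code: `inference_attributes` is a hypothetical helper, the responses are `SimpleNamespace` stand-ins, and the response field names (`data`, `values`, `results`, `usage.read_units`) are assumptions based on the review snippets later in this thread. Only the attribute names and the `model`, `inputs`, `documents`, and `top_n` keyword names come from the PR itself.

```python
from types import SimpleNamespace


def inference_attributes(kwargs, response):
    """Collect span attributes for an embed or rerank call (illustrative only)."""
    attrs = {}

    # Request side: model, number of input texts or documents, and top_n.
    if kwargs.get("model") is not None:
        attrs["gen_ai.request.model"] = kwargs["model"]
    items = kwargs.get("inputs") or kwargs.get("documents")
    if items:
        attrs["gen_ai.usage.input_count"] = len(items)
    if kwargs.get("top_n") is not None:
        attrs["pinecone.rerank.top_n"] = kwargs["top_n"]

    # Embed response (assumed shape): embedding count and first vector length.
    data = getattr(response, "data", None)
    if data:
        attrs["gen_ai.usage.output_count"] = len(data)
        values = getattr(data[0], "values", None)
        if values:
            attrs["pinecone.embedding.dimensionality"] = len(values)

    # Rerank response (assumed shape): number of reranked results.
    results = getattr(response, "results", None)
    if results:
        attrs["pinecone.rerank.result_count"] = len(results)

    # Usage (assumed shape): read units, when the response reports them.
    usage = getattr(response, "usage", None)
    read_units = getattr(usage, "read_units", None)
    if read_units is not None:
        attrs["pinecone.usage.read_units"] = read_units

    return attrs


# Stand-in embed response with two 3-dimensional vectors.
embed_response = SimpleNamespace(
    data=[
        SimpleNamespace(values=[0.1, 0.2, 0.3]),
        SimpleNamespace(values=[0.4, 0.5, 0.6]),
    ],
    usage=SimpleNamespace(read_units=1),
)
embed_attrs = inference_attributes(
    {"model": "example-embed-model", "inputs": ["first text", "second text"]},
    embed_response,
)
```

Every lookup is guarded, so a response that lacks one of the assumed fields simply yields fewer attributes instead of raising.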

Design decisions

  • Spans use GenAI semantic conventions (gen_ai.*), matching the pattern used by other LLM provider instrumentations (OpenAI, Anthropic, etc.)
  • Inference spans don't set the vector DB vendor attribute (VECTOR_DB_VENDOR), since they're not DB operations
  • No SDK changes needed; the Pinecone instrumentation is already wired in

Files changed

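  • packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/__init__.py: added wrapped methods and inference attribute helpers; updated _instrument and _uninstrument
  • packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/version.py: bumped 0.61.0 → 0.62.0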

Fixes #1618

Summary by CodeRabbit

  • New Features

    • Enhanced observability for Pinecone inference operations: embed and rerank API calls are now instrumented with the model name, input and output counts, query content, embedding vector dimensionality, the top_n parameter, reranked result count, and read units consumed.
  • Version

    • Released version 0.62.0

Adds OpenTelemetry tracing for Pinecone's Inference API calls:
- pc.inference.embed() — generates vector embeddings
- pc.inference.rerank() — reranks documents using cross-encoder models

New span attributes:
- gen_ai.request.model — the embedding/reranking model used
- gen_ai.usage.input_count — number of input texts/documents
- gen_ai.usage.output_count — number of embeddings returned
- pinecone.embedding.dimensionality — dimension of embedding vectors
- pinecone.rerank.top_n — top_n parameter for reranking
- pinecone.rerank.result_count — number of reranked results
- pinecone.usage.read_units — read units consumed

Spans use GenAI semantic conventions (gen_ai.*) matching the pattern
used by other LLM provider instrumentations (OpenAI, Anthropic, etc.).

Bumped version to 0.62.0.

Fixes traceloop#1618
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Ayaan Khann seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

coderabbitai Bot commented Jun 16, 2026

📝 Walkthrough

The Pinecone instrumentation package is extended to trace PineconeInference.embed and PineconeInference.rerank calls. Two new span attribute helpers capture inference-specific request and response fields. The wrapper conditionally applies vector DB vendor attributes and routes inference calls through the new helpers. _instrument/_uninstrument are wired to the pinecone.inference module, and the version is bumped to 0.62.0.

Changes

Pinecone Inference Instrumentation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Inference span targets and attribute helpers | packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/__init__.py | WRAPPED_METHODS adds PineconeInference.embed and PineconeInference.rerank entries. _set_inference_input_attributes records model, input count, query text, and top_n. _set_inference_response_attributes records vector count, dimensionality, rerank result count, and usage read units. |
| Wrapper inference detection and response routing | packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/__init__.py | Span setup checks the pinecone.inference. name prefix to skip VECTOR_DB_VENDOR and dispatch to the inference input helper. Response handling similarly forks between the inference response helper and the existing vector DB query path. |
| Instrument/uninstrument wiring and version bump | packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/__init__.py, packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/version.py | _instrument wraps PineconeInference.embed and PineconeInference.rerank via pinecone.inference with debug logging on failure. _uninstrument adds matching unwrap calls. Version bumped from 0.61.0 to 0.62.0. |

Sequence Diagram(s)

sequenceDiagram
    participant App
    participant PineconeInferenceWrapper
    participant _set_inference_input_attributes
    participant PineconeInference
    participant _set_inference_response_attributes
    participant OTelSpan

    App->>PineconeInferenceWrapper: embed(model, inputs) / rerank(model, query, documents, top_n)
    PineconeInferenceWrapper->>OTelSpan: start span (pinecone.inference.embed / pinecone.inference.rerank)
    PineconeInferenceWrapper->>_set_inference_input_attributes: kwargs
    _set_inference_input_attributes->>OTelSpan: set model, input_count, query, top_n
    PineconeInferenceWrapper->>PineconeInference: call original method
    PineconeInference-->>PineconeInferenceWrapper: response
    PineconeInferenceWrapper->>_set_inference_response_attributes: response
    _set_inference_response_attributes->>OTelSpan: set vector_count, dimensionality / rerank_count, read_units
    PineconeInferenceWrapper-->>App: response
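
The flow in the diagram can be sketched in a few lines of Python. Everything here is a stand-in: `RecordingSpan` replaces a real OpenTelemetry span, `traced_call` is a hypothetical name for the wrapper, and the `VECTOR_DB_VENDOR` key is a placeholder string rather than the real constant's value. The point is only the fork on the `pinecone.inference.` span-name prefix.

```python
INFERENCE_PREFIX = "pinecone.inference."
VECTOR_DB_VENDOR = "vector_db.vendor"  # placeholder key; the real constant's value may differ


class RecordingSpan:
    """Minimal stand-in for an OpenTelemetry span."""

    def __init__(self, name):
        self.name = name
        self.attributes = {}

    def set_attribute(self, key, value):
        self.attributes[key] = value


def traced_call(span_name, func, **kwargs):
    """Run func under a span, forking on whether the span is an inference call."""
    span = RecordingSpan(span_name)
    is_inference = span_name.startswith(INFERENCE_PREFIX)

    if is_inference:
        # Inference path: request attributes only, no vendor tag.
        span.set_attribute("gen_ai.request.model", kwargs.get("model"))
    else:
        # Vector DB path: keep the vendor tag.
        span.set_attribute(VECTOR_DB_VENDOR, "Pinecone")

    response = func(**kwargs)

    if is_inference:
        # Inference response path: record how many items came back.
        span.set_attribute("gen_ai.usage.output_count", len(response))
    # The existing vector DB response handling would sit in an else branch here.

    return response, span


def fake_embed(model, inputs):
    return [[0.0, 0.0] for _ in inputs]


def fake_query(top_k):
    return []


_, embed_span = traced_call(
    "pinecone.inference.embed", fake_embed, model="example-model", inputs=["a", "b", "c"]
)
_, query_span = traced_call("pinecone.query", fake_query, top_k=3)
```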

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 A bunny hops through inference land,
Where embeddings bloom and rerankings planned.
Span names whisper pinecone.inference.*,
Read units counted, dimensions on the fix.
No vendor tag for the new API trail —
Version bumped, and tracing sets sail! 🌲

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 42.86%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and clearly describes the main change: adding Inference API instrumentation for embed and rerank operations, which matches the core objective of the pull request. |
| Linked Issues check | ✅ Passed | The PR successfully implements Inference API instrumentation (embed and rerank) with GenAI semantic conventions for span attributes, fully addressing the coding requirements from issue #1618. The Assistant API portion remains unaddressed but is explicitly out-of-scope for this PR. |
| Out of Scope Changes check | ✅ Passed | All changes are directly aligned with the PR objectives. Version bump is standard practice for feature releases, and both instrumentation additions and version updates are in-scope for adding new API instrumentation. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/__init__.py`:
- Around line 129-139: The Pinecone-specific span attributes used in the
set_span_attribute calls are incorrectly sourced from SpanAttributes instead of
AISpanAttributes. Replace SpanAttributes.PINECONE_EMBEDDING_DIMENSIONALITY with
AISpanAttributes.PINECONE_EMBEDDING_DIMENSIONALITY in the first
set_span_attribute call, and replace SpanAttributes.PINECONE_RERANK_RESULT_COUNT
with AISpanAttributes.PINECONE_RERANK_RESULT_COUNT in the second
set_span_attribute call. This ensures Pinecone-specific attributes are correctly
imported and referenced from the appropriate attributes class.
- Line 112: The call to set_span_attribute with
SpanAttributes.PINECONE_RERANK_TOP_N on line 112 is using the wrong attributes
class. Replace SpanAttributes with AISpanAttributes in the set_span_attribute
call for PINECONE_RERANK_TOP_N, since Pinecone-specific attributes are defined
in AISpanAttributes from opentelemetry.semconv_ai (as already correctly used in
lines 167-170 for PINECONE_USAGE_READ_UNITS and PINECONE_USAGE_WRITE_UNITS), not
in SpanAttributes from opentelemetry.semconv.trace. This will prevent an
AttributeError at runtime when the span attribute is accessed.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fb1c6386-ebf8-4703-a094-71d3cf4b5f76

📥 Commits

Reviewing files that changed from the base of the PR and between aa4a469 and e589df9.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/__init__.py
  • packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/version.py


    top_n = kwargs.get("top_n")
    if top_n is not None:
        set_span_attribute(span, SpanAttributes.PINECONE_RERANK_TOP_N, top_n)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the attribute exists in semconv_ai but not in standard semconv
echo "=== Checking AISpanAttributes for PINECONE_RERANK_TOP_N ==="
rg -n "PINECONE_RERANK_TOP_N" --type py

echo "=== Checking semconv_ai SpanAttributes definition ==="
python -c "from opentelemetry.semconv_ai import SpanAttributes; print([a for a in dir(SpanAttributes) if 'PINECONE' in a])" 2>/dev/null || echo "Could not import semconv_ai"

echo "=== Checking standard semconv SpanAttributes ==="
python -c "from opentelemetry.semconv.trace import SpanAttributes; print([a for a in dir(SpanAttributes) if 'PINECONE' in a])" 2>/dev/null || echo "Could not import semconv.trace"

Repository: traceloop/openllmetry

Length of output: 440


🏁 Script executed:

# Check the imports and the relevant code sections
head -50 packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/__init__.py | grep -E "^from|^import"

Repository: traceloop/openllmetry

Length of output: 997


🏁 Script executed:

# Read the _set_inference_input_attributes function around line 112
sed -n '88,115p' packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/__init__.py

Repository: traceloop/openllmetry

Length of output: 1004


🏁 Script executed:

# Check lines 167, 170 mentioned in the review
sed -n '165,175p' packages/opentelemetry-instrumentation-pinecone/opentelemetry/instrumentation/pinecone/__init__.py

Repository: traceloop/openllmetry

Length of output: 448


Change line 112 to use AISpanAttributes instead of SpanAttributes.

SpanAttributes from opentelemetry.semconv.trace does not contain Pinecone-specific attributes like PINECONE_RERANK_TOP_N. These attributes are defined in AISpanAttributes from opentelemetry.semconv_ai, as correctly used in lines 167-170 for PINECONE_USAGE_READ_UNITS and PINECONE_USAGE_WRITE_UNITS. Accessing SpanAttributes.PINECONE_RERANK_TOP_N will raise an AttributeError at runtime.

Fix
-        set_span_attribute(span, SpanAttributes.PINECONE_RERANK_TOP_N, top_n)
+        set_span_attribute(span, AISpanAttributes.PINECONE_RERANK_TOP_N, top_n)
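
The failure mode described above is ordinary Python attribute lookup: a constant referenced on a class that does not define it raises `AttributeError` when that line executes, not at import time. The two classes below are hypothetical stand-ins for the standard and AI attribute classes, and `rerank_top_n_key` is an illustrative helper; only the attribute name `pinecone.rerank.top_n` comes from the PR description.

```python
class StandardAttributes:
    """Stand-in for an attribute class with no Pinecone constants."""

    DB_SYSTEM = "db.system"


class AIAttributes:
    """Stand-in for the attribute class that does define them."""

    PINECONE_RERANK_TOP_N = "pinecone.rerank.top_n"


def rerank_top_n_key(attribute_class):
    # The lookup happens here, at call time.
    return attribute_class.PINECONE_RERANK_TOP_N


try:
    rerank_top_n_key(StandardAttributes)
    lookup_failed = False
except AttributeError:
    lookup_failed = True

fixed_key = rerank_top_n_key(AIAttributes)
```

Because the original line sits behind `if top_n is not None`, the lookup only runs when a caller passes `top_n`, so calls that omit it would not surface the error.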
Comment on lines +129 to +139
            set_span_attribute(
                span,
                SpanAttributes.PINECONE_EMBEDDING_DIMENSIONALITY,
                len(first_emb.values),
            )

    # For rerank responses
    if hasattr(response, "results") and response.results:
        set_span_attribute(
            span, SpanAttributes.PINECONE_RERANK_RESULT_COUNT, len(response.results)
        )

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Same issue: SpanAttributes does not contain Pinecone-specific attributes.

Lines 131 and 138 use SpanAttributes.PINECONE_EMBEDDING_DIMENSIONALITY and SpanAttributes.PINECONE_RERANK_RESULT_COUNT, but these Pinecone-specific attributes should come from AISpanAttributes.

🐛 Proposed fix
         if hasattr(first_emb, "values") and first_emb.values:
             set_span_attribute(
                 span,
-                SpanAttributes.PINECONE_EMBEDDING_DIMENSIONALITY,
+                AISpanAttributes.PINECONE_EMBEDDING_DIMENSIONALITY,
                 len(first_emb.values),
             )

     # For rerank responses
     if hasattr(response, "results") and response.results:
         set_span_attribute(
-            span, SpanAttributes.PINECONE_RERANK_RESULT_COUNT, len(response.results)
+            span, AISpanAttributes.PINECONE_RERANK_RESULT_COUNT, len(response.results)
         )
Development

Successfully merging this pull request may close these issues.

🚀 Feature: Support new Pinecone APIs
