Large StateVector and DensityMatrix results cannot be serialized efficiently in GateModelTaskResult #373

@AbeCoull

Description

Problem

Large analytical result types from the default simulator are efficient while they stay in the current process, but they don't serialize efficiently through GateModelTaskResult.

StateVector and DensityMatrix return np.ndarray values, and _generate_results() stores those arrays directly in ResultTypeValue.value. That works for local LocalSimulator(...).run(...).result() because the ndarray can pass through as a Python object.

The problem appears at a JSON boundary, for example when persisting a raw result, round-tripping it through GateModelTaskResult.parse_raw(...), or emitting a simulator result envelope from another process. ResultTypeValue.value is currently list | float | dict, with no compact ndarray representation. A raw ndarray is not JSON serializable, and converting it to nested Python lists inflates memory use well beyond the size of the original array.
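
A small sketch of the two failure modes described above, using only json and NumPy. The [real, imag] pair layout for complex entries is an assumption made for illustration, not necessarily what the simulator emits.

```python
import json

import numpy as np

# A 10-qubit-sized complex128 vector (values are arbitrary, not normalized).
rng = np.random.default_rng(0)
state = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)

# 1) A raw ndarray cannot be passed to the JSON encoder.
try:
    json.dumps({"value": state})
    ndarray_is_json_serializable = True
except TypeError:
    ndarray_is_json_serializable = False

# 2) Expanding to lists works, but the text is much larger than the array.
#    Assumption: each complex entry becomes a [real, imag] pair.
as_lists = [[z.real, z.imag] for z in state.tolist()]
json_text = json.dumps({"value": as_lists})

print("ndarray bytes:", state.nbytes)
print("JSON characters:", len(json_text))
```

The intermediate as_lists object is itself far larger than the ndarray, since every entry becomes a Python list holding two boxed floats.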

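Full density matrices are worse: at complex128 precision an n-qubit density matrix holds 4^n entries, so it takes 16 * 4^n bytes before any serialization overhead, compared with 16 * 2^n bytes for a state vector.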
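
To make the scaling concrete, here is a rough size calculation. It assumes complex128 entries (16 bytes each); the helper names are hypothetical and exist only for this illustration.

```python
BYTES_PER_COMPLEX128 = 16


def state_vector_bytes(n_qubits: int) -> int:
    """Raw size of a full state vector: 2^n complex128 amplitudes."""
    return BYTES_PER_COMPLEX128 * 2**n_qubits


def density_matrix_bytes(n_qubits: int) -> int:
    """Raw size of a full density matrix: 2^n x 2^n complex128 entries."""
    return BYTES_PER_COMPLEX128 * 4**n_qubits


for n in (10, 15, 20):
    sv_mib = state_vector_bytes(n) / 2**20
    dm_gib = density_matrix_bytes(n) / 2**30
    print(f"n={n}: state vector {sv_mib:.4f} MiB, density matrix {dm_gib:.4f} GiB")
```

Under these assumptions a 15-qubit density matrix is already 16 GiB of raw data, the same size as a 30-qubit state vector.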

Proposed fix

Add an optional binary ndarray payload to ResultTypeValue, for example:

from typing import Literal

from pydantic import BaseModel


class ResultTypeValue(BaseModel):
    type: Results
    value: list | float | dict | None = None

    # Optional compact payload for ndarray-valued results.
    binaryValue: str | None = None
    binaryEncoding: Literal["base64"] | None = None
    dtype: Literal["complex64", "complex128", "float64"] | None = None
    shape: list[int] | None = None

Use base64 text rather than raw bytes, since arbitrary ndarray bytes are not guaranteed to be valid UTF-8 under pydantic v1 JSON serialization.
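
A quick standard-library check of why raw bytes are awkward here: the in-memory bytes of ordinary floating-point values are generally not valid UTF-8, while their base64 form is plain ASCII. The sample values are arbitrary.

```python
import base64
import struct

# One complex128-style entry: real and imaginary parts as little-endian float64.
raw = struct.pack("<2d", -1.0, 0.5)

# The raw bytes are not valid UTF-8, so they cannot be carried as a JSON string.
try:
    raw.decode("utf-8")
    raw_is_utf8 = True
except UnicodeDecodeError:
    raw_is_utf8 = False

# base64 produces ASCII text that round-trips back to the same bytes.
encoded = base64.b64encode(raw).decode("ascii")
decoded = base64.b64decode(encoded)

print(raw_is_utf8, encoded, decoded == raw)
```

The cost is size: base64 emits 4 characters for every 3 input bytes, so the text is about one third larger than the raw buffer.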

For StateVector and DensityMatrix, the simulator can populate binaryValue, dtype, and shape instead of expanding the ndarray into value. SDK consumers can prefer binaryValue when present and fall back to value for backward compatibility.
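
The encode and decode paths could look roughly like the sketch below. It works on plain dicts shaped like the proposed fields rather than on the real ResultTypeValue model. The helper names encode_ndarray and decode_result_value are hypothetical, and fixing the byte order to little-endian is an assumption the proposal does not spell out.

```python
import base64
import json

import numpy as np

ALLOWED_DTYPES = ("complex64", "complex128", "float64")


def encode_ndarray(result_type: str, arr: np.ndarray) -> dict:
    """Build a JSON-safe payload with the proposed binary fields (hypothetical helper)."""
    if arr.dtype.name not in ALLOWED_DTYPES:
        raise ValueError(f"unsupported dtype: {arr.dtype.name}")
    # Assumption: always store little-endian bytes in C order.
    little = arr.dtype.newbyteorder("<")
    raw = arr.astype(little, copy=False).tobytes()
    return {
        "type": result_type,
        "value": None,
        "binaryValue": base64.b64encode(raw).decode("ascii"),
        "binaryEncoding": "base64",
        "dtype": arr.dtype.name,
        "shape": list(arr.shape),
    }


def decode_result_value(payload: dict):
    """Prefer binaryValue when present, else fall back to the legacy value field."""
    if payload.get("binaryValue") is None:
        return payload["value"]
    if payload.get("binaryEncoding") != "base64":
        raise ValueError("unsupported binaryEncoding")
    if payload["dtype"] not in ALLOWED_DTYPES:
        raise ValueError(f"unsupported dtype: {payload['dtype']}")
    dtype = np.dtype(payload["dtype"]).newbyteorder("<")
    raw = base64.b64decode(payload["binaryValue"])
    # np.frombuffer returns a read-only view over the decoded bytes.
    return np.frombuffer(raw, dtype=dtype).reshape(payload["shape"])


# Round trip through JSON text, as another process would see it.
state = np.array([1, 0, 0, 1j], dtype=np.complex128) / np.sqrt(2)
wire_text = json.dumps(encode_ndarray("statevector", state))
restored = decode_result_value(json.loads(wire_text))

# A legacy payload without binaryValue still decodes via the fallback.
legacy = decode_result_value({"type": "expectation", "value": 0.5})
```

Consumers that need a writable array can call .copy() on the decoded result, since np.frombuffer does not own its memory.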

Scope

In scope:

  • StateVector
  • full or reduced DensityMatrix
  • schema support in ResultTypeValue
  • SDK decode path from binaryValue
  • default simulator encode path

Out of scope:

  • expectation, variance, sample, and amplitude result types
  • changing the local in-process fast path
  • side-channel storage by URI, which may still be useful for very large results

Benefits

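The simulator core can already produce these results; today the result envelope becomes the limiting factor once serialization is required. A compact binary payload removes that bottleneck without changing the local in-process path.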
