chore: Python CUDA bridge: CI and buffer handoff ABI#8618
Conversation
Merging this PR will not alter performance
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | Simulation | chunked_varbinview_into_canonical[(1000, 10)] |
169 µs | 205.6 µs | -17.79% |
| ⚡ | Simulation | chunked_varbinview_canonical_into[(100, 100)] |
259.4 µs | 224.3 µs | +15.65% |
| ⚡ | Simulation | chunked_varbinview_into_canonical[(100, 100)] |
306.7 µs | 271.4 µs | +13% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] |
273.6 ns | 244.4 ns | +11.93% |
| ⚡ | Simulation | eq_i64_constant |
319.3 µs | 289.4 µs | +10.32% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing ad/pycudf3 (98966bf) with develop (2a15a9f)
Footnotes
-
4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
ac4165b to
59fb909
Compare
f2bb115 to
f9cf177
Compare
|
@claude review in depth |
|
Claude review automation is disabled for pull requests that modify Why:
Ask a human reviewer to inspect workflow changes directly. |
Add explicit GPU-runner CI coverage for the Python CUDA bridge through the vortex-data[cuda] optional-extra path. Extend the private metadata bridge to carry host buffer-export capsules instead of only a buffer count. The base Python package exports repr(C) VortexBufferExport descriptors, and vortex-python-cuda imports them into local BufferHandles before deserializing arrays through its own VortexSession. Tests now cover primitive, nullable, bool, and struct arrays across the bridge, plus the existing CUDA Arrow Device smoke path. Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
CI coverage for the Python CUDA bridge through the vortex-data[cuda] optional-extra path.
Extend the private metadata bridge to carry host buffer-export capsules instead of only a buffer count. The base Python package exports repr(C) VortexBufferExport descriptors, and vortex-python-cuda imports them into local BufferHandles before deserializing arrays through its own VortexSession.
Tests now cover primitive, nullable, bool, and struct arrays across the bridge, plus the existing CUDA Arrow Device smoke path.