Add a WebAssembly SIMD backend for reusable intrinsics kernels by teddygood · Pull Request #5685 · OpenMathLib/OpenBLAS

teddygood · 2026-03-18T09:19:26Z

Follow-up to #5676 and #4023.

This PR adds a WebAssembly SIMD backend for the shared SIMD intrinsics layer. The goal is to let existing reusable kernels that already use the common intrinsics API execute through WebAssembly SIMD on ARCH_WASM instead of falling back to scalar code. As part of that, it adds kernel/simd/intrin_wasm.h, wires it up from kernel/simd/intrin.h for __wasm_simd128__, and fills in the missing reduction helpers needed for backend completeness.
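For readers unfamiliar with the shared intrinsics layer, the idea described above can be sketched roughly as follows. This is not the content of kernel/simd/intrin_wasm.h, which is not shown in this thread. The names `v_f32`, `V_NLANES_F32`, `v_loadu_f32`, `v_zero_f32`, `v_add_f32`, `v_sum_f32`, and `sum_f32` are illustrative assumptions, loosely modeled on how such layers are usually written. Only the `wasm_*` calls come from the real `<wasm_simd128.h>` header. The scalar branch exists so the sketch also builds without a WebAssembly toolchain.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__wasm_simd128__)
#include <wasm_simd128.h>
/* WebAssembly SIMD path: one 128-bit register holds four floats. */
typedef v128_t v_f32;
#define V_NLANES_F32 4
static inline v_f32 v_loadu_f32(const float *p) { return wasm_v128_load(p); }
static inline v_f32 v_zero_f32(void) { return wasm_f32x4_splat(0.0f); }
static inline v_f32 v_add_f32(v_f32 a, v_f32 b) { return wasm_f32x4_add(a, b); }
/* Reduction helper (hypothetical name): add the four lanes into one scalar. */
static inline float v_sum_f32(v_f32 a) {
    return (wasm_f32x4_extract_lane(a, 0) + wasm_f32x4_extract_lane(a, 1)) +
           (wasm_f32x4_extract_lane(a, 2) + wasm_f32x4_extract_lane(a, 3));
}
#else
/* Scalar stand-in: a "vector" with a single lane. */
typedef float v_f32;
#define V_NLANES_F32 1
static inline v_f32 v_loadu_f32(const float *p) { return *p; }
static inline v_f32 v_zero_f32(void) { return 0.0f; }
static inline v_f32 v_add_f32(v_f32 a, v_f32 b) { return a + b; }
static inline float v_sum_f32(v_f32 a) { return a; }
#endif

/* A kernel written once against the generic names above. */
float sum_f32(const float *x, size_t n) {
    assert(x != NULL || n == 0);
    v_f32 acc = v_zero_f32();
    size_t i = 0;
    /* Vector body: consume V_NLANES_F32 elements per iteration. */
    for (; i + V_NLANES_F32 <= n; i += V_NLANES_F32)
        acc = v_add_f32(acc, v_loadu_f32(x + i));
    float total = v_sum_f32(acc);
    /* Scalar tail for the leftover elements. */
    for (; i < n; i++)
        total += x[i];
    return total;
}
```

The kernel body never mentions WebAssembly. Only the block selected by `__wasm_simd128__` does, which is what lets an existing kernel pick up a new backend without being rewritten.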

I also enabled the generic vector path for srot on ARCH_WASM. In local Pyodide/Emscripten testing, contiguous daxpy improved by about 1.23x–1.40x over the current baseline, and contiguous srot improved by about 1.41x–2.13x. Stride-2 cases were approximately flat. drot was evaluated too, but local testing did not show a clear or consistent benefit on WASM SIMD, so I left it out of this PR.
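As a rough illustration of what a vectorized contiguous srot does, here is a minimal sketch of the plane rotation x' = c*x + s*y, y' = c*y - s*x. It is not the kernel used by this PR. The function name `srot_contiguous` is hypothetical, and the sketch calls the WebAssembly intrinsics directly instead of going through the shared layer. It handles only unit-stride vectors and falls back to a plain loop when `__wasm_simd128__` is not defined.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__wasm_simd128__)
#include <wasm_simd128.h>
#endif

/* Apply a plane rotation to two contiguous float vectors in place. */
void srot_contiguous(size_t n, float *x, float *y, float c, float s) {
    assert((x != NULL && y != NULL) || n == 0);
    size_t i = 0;
#if defined(__wasm_simd128__)
    /* Broadcast the rotation coefficients once, outside the loop. */
    const v128_t vc = wasm_f32x4_splat(c);
    const v128_t vs = wasm_f32x4_splat(s);
    for (; i + 4 <= n; i += 4) {
        v128_t vx = wasm_v128_load(x + i);
        v128_t vy = wasm_v128_load(y + i);
        /* Both results are computed from the old values before storing. */
        v128_t nx = wasm_f32x4_add(wasm_f32x4_mul(vc, vx), wasm_f32x4_mul(vs, vy));
        v128_t ny = wasm_f32x4_sub(wasm_f32x4_mul(vc, vy), wasm_f32x4_mul(vs, vx));
        wasm_v128_store(x + i, nx);
        wasm_v128_store(y + i, ny);
    }
#endif
    /* Scalar tail, and the whole loop when SIMD is unavailable. */
    for (; i < n; i++) {
        float xi = x[i];
        float yi = y[i];
        x[i] = c * xi + s * yi;
        y[i] = c * yi - s * xi;
    }
}
```

A 128-bit register holds four floats but only two doubles, so a double-precision variant of the same loop would process half as many elements per instruction.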

This PR intentionally keeps the scope small. It does not add new WASM-specific kernels, and it does not try to cover every reusable kernel that could potentially use the shared intrinsics layer at once.

If this is not the direction you would prefer upstream, I would be happy to adjust it.

martin-frbg · 2026-03-18T09:44:35Z

Thanks. Having the universal SIMD header should also make it possible to use the SIMD-based optimizations from #2867 in the generic dot.c (SDOT/DDOT/DSDOT). The kernel file copied out of the riscv64_generic support currently uses its own copy of the more trivial generic code from arm/dot.c for some reason.

teddygood · 2026-03-18T09:50:44Z

Thanks, that is very helpful. Would you prefer to include that in this PR as well, or keep it as a follow-up once this backend is in place?

martin-frbg · 2026-03-18T10:03:47Z

As you prefer. Come to think of it, it should also be possible to trivially copy the "ifndef DOUBLE" part of the SIMD fallback kernel from x86_64 daxpy.c to saxpy.c (which currently has only a trivial C loop for when no assembly microkernel is available). But I guess that is all existing SIMD kernels then. :)

teddygood · 2026-03-18T10:21:08Z

Thanks, that is very helpful. In that case I’ll keep this PR to the intrinsics backend plus SAXPY, and leave the dot-related changes for a follow-up PR.

teddygood · 2026-03-18T13:06:09Z

I went ahead and added SAXPY in this PR by switching SAXPYKERNEL to x86_64/saxpy.c and adding the same kind of SIMD fallback used in daxpy.c. In local direct WASM benchmarking, contiguous saxpy improved by about 2.14x, 1.46x, and 1.10x for the sizes I tested, while a stride-2 case was roughly flat at about 1.04x.
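The general shape of an axpy kernel with a SIMD fast path can be sketched as below. This is not the code in x86_64/saxpy.c or daxpy.c. The function name `saxpy_sketch` is hypothetical, strides are assumed positive, and taking the vector path only when both strides are 1 is an assumption of this sketch. Under that assumption, a strided call runs the same scalar loop as before, which would be consistent with the roughly flat stride-2 result.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__wasm_simd128__)
#include <wasm_simd128.h>
#endif

/* y := alpha * x + y over n elements, with positive element strides. */
void saxpy_sketch(size_t n, float alpha, const float *x, size_t inc_x,
                  float *y, size_t inc_y) {
    assert((x != NULL && y != NULL) || n == 0);
    size_t i = 0;
    if (inc_x == 1 && inc_y == 1) {
#if defined(__wasm_simd128__)
        /* Contiguous fast path: four floats per 128-bit register. */
        const v128_t va = wasm_f32x4_splat(alpha);
        for (; i + 4 <= n; i += 4) {
            v128_t vx = wasm_v128_load(x + i);
            v128_t vy = wasm_v128_load(y + i);
            wasm_v128_store(y + i, wasm_f32x4_add(vy, wasm_f32x4_mul(va, vx)));
        }
#endif
        /* Scalar tail, or the full loop when SIMD is unavailable. */
        for (; i < n; i++)
            y[i] += alpha * x[i];
        return;
    }
    /* Strided case: elements are not adjacent, so use a plain loop. */
    for (; i < n; i++)
        y[i * inc_y] += alpha * x[i * inc_x];
}
```

Calling `saxpy_sketch(6, 2.0f, x, 1, y, 1)` exercises the contiguous path, while `saxpy_sketch(3, 1.0f, x, 2, y, 2)` touches every second element through the scalar loop.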

teddygood added 2 commits March 18, 2026 03:23

53d0be8 Add WebAssembly SIMD backend for universal intrinsics

7ff3588 Refine WebAssembly SIMD backend scope

martin-frbg added this to the 0.3.32 milestone Mar 18, 2026

99d0557 Enable SAXPY for WebAssembly SIMD backend

martin-frbg merged commit adba2c3 into OpenMathLib:develop Mar 18, 2026
79 of 80 checks passed

teddygood mentioned this pull request Mar 19, 2026

Use generic dot kernels for WASM128_GENERIC #5689

Merged

martin-frbg merged 3 commits into OpenMathLib:develop from teddygood:wasm-intrin-backend-exp
