SIMD32: lookback_set_op_kernel (Lb1E) variants dropped from native object after RetryManager; ocloc still succeeds #403

@pvelesko

Summary

When compiling a large SPIR-V module (rocThrust lookback_set_op_kernel with HAS_VALUES=true, mangled Lb1E) for Intel DG2 / Arc on the default SIMD32 path, IGC silently omits several kernel instantiations from the final native object, even though:

  • all variants are still present as OpEntryPoint in the input SPIR-V, and
  • ocloc compile reports “Build succeeded.”

The SIMD32 build log can show [RetryManager] Start recompilation (e.g. on the float template).

Workaround: set IGC_ForceOCLSIMDWidth=16 in the environment; all eight element-type specializations then appear in the object’s symbol table.
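To compare the default path against the workaround, the two offline compiles can be driven from a small script. The following is a minimal sketch, not the exact command line used for this report: the helper names `ocloc_invocation` and `compile_variant`, the output directory names, and the use of `-file` / `-out_dir` are assumptions added for illustration; only `ocloc compile -spirv_input -device dg2` and `IGC_ForceOCLSIMDWidth=16` come from the report itself.

```python
import os
import subprocess


def ocloc_invocation(spv_path, out_dir, simd_width=None):
    """Return (argv, env) for one offline ocloc compile of a SPIR-V file.

    simd_width=None leaves IGC on its default path (SIMD32 in this report);
    simd_width=16 applies the IGC_ForceOCLSIMDWidth=16 workaround.
    """
    # -file and -out_dir are assumed here; adjust to your ocloc version.
    argv = [
        "ocloc", "compile",
        "-file", spv_path,
        "-spirv_input",
        "-device", "dg2",
        "-out_dir", out_dir,
    ]
    env = dict(os.environ)
    # Start from a clean slate so a value inherited from the shell
    # cannot leak into the "default" build.
    env.pop("IGC_ForceOCLSIMDWidth", None)
    if simd_width is not None:
        env["IGC_ForceOCLSIMDWidth"] = str(simd_width)
    return argv, env


def compile_variant(spv_path, out_dir, simd_width=None):
    """Run ocloc for one variant; raises CalledProcessError on failure."""
    argv, env = ocloc_invocation(spv_path, out_dir, simd_width)
    subprocess.run(argv, env=env, check=True)
```

Calling `compile_variant("processed_0.spv", "out_simd32")` and then `compile_variant("processed_0.spv", "out_simd16", 16)` would leave one native object per directory for later symbol comparison. Because ocloc reports success in both cases, the exit status alone does not reveal the missing kernels.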

Affected pattern

thrust::hip_rocprim::__set_operations::lookback_set_op_kernel< default_set_operations_config<10000u, T, T>, /*HAS_VALUES=*/true, ...> with eight element-type pairs in SPIR-V: dd ff ii jj ss tt xx yy (Itanium-style ILj10000E**EELb1E in mangled names, where ** stands for the two-letter type pair).

Observed with offline ocloc: the SIMD32 processed_0_dg2.bin exports 5 distinct lookback_set_op_kernel Lb1E type encodings (dd ss tt xx yy), while SIMD16 exports 8. Missing from SIMD32 vs SPIR-V / SIMD16: ff (float), ii (int), jj (uint).
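The two-letter tags are pairs of Itanium builtin type codes, so the missing set can be decoded mechanically. The sketch below uses only the tag lists quoted above; the helper name `missing_specializations` is made up for illustration.

```python
# Itanium C++ ABI builtin type codes for the eight element types involved.
ITANIUM_BUILTIN = {
    "d": "double",
    "f": "float",
    "i": "int",
    "j": "unsigned int",
    "s": "short",
    "t": "unsigned short",
    "x": "long long",
    "y": "unsigned long long",
}

# Tag lists as reported in this issue.
spirv_tags = "dd ff ii jj ss tt xx yy".split()   # OpEntryPoint / SIMD16
simd32_tags = "dd ss tt xx yy".split()           # default SIMD32 object


def missing_specializations(expected, exported):
    """Return (tag, C++ type) for every expected tag absent from exported."""
    missing = sorted(set(expected) - set(exported))
    return [(tag, ITANIUM_BUILTIN[tag[0]]) for tag in missing]


print(missing_specializations(spirv_tags, simd32_tags))
# [('ff', 'float'), ('ii', 'int'), ('jj', 'unsigned int')]
```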

Downstream (Level Zero / HIP): zeModuleGetKernelNames then does not return those kernel names, so the launch fails with "Failed to find kernel via kernel name".

Reproducer (offline)

  1. Input: processed_0.spv — chipStar-processed SPIR-V (e.g. from CHIP_DUMP_PROCESSED_SPIRV while building rocThrust set_difference_by_key); same IL can be attached to this issue.

  2. Tools: spirv-as/spirv-dis (validation inventory), ocloc compile -spirv_input -device dg2, llvm-readobj --symbols on processed_0_dg2.bin.

  3. Check: Extract symbol Name: lines for lookback_set_op_kernel and distinct ILj10000E[dfijstxy]{2}EELb1E tags — compare default SIMD32 vs IGC_ForceOCLSIMDWidth=16 SIMD16.

Expected result when the bug reproduces: the SIMD32 object is missing ILj10000EffEELb1E, ILj10000EiiEELb1E, and ILj10000EjjEELb1E relative to SPIR-V and SIMD16; SPIR-V still lists all eight OpEntryPoint kernels.
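The check in step 3 can be scripted roughly as follows. This is a sketch under assumptions: it expects the text output of `llvm-readobj --symbols` (which prints one `Name:` line per symbol) to be captured beforehand, and the helper names `exported_tags` and `missing_in_simd32` are hypothetical. The regular expression is the tag pattern from step 3, anchored to symbols that mention lookback_set_op_kernel.

```python
import re

# Two-letter type pair inside an Lb1E (HAS_VALUES=true) instantiation.
TAG_RE = re.compile(
    r"lookback_set_op_kernel\S*?ILj10000E([dfijstxy]{2})EELb1E"
)


def exported_tags(readobj_text):
    """Collect distinct type-pair tags from `llvm-readobj --symbols` output."""
    tags = set()
    for line in readobj_text.splitlines():
        line = line.strip()
        if not line.startswith("Name:"):
            continue  # only symbol name lines are of interest
        match = TAG_RE.search(line)
        if match:
            tags.add(match.group(1))
    return tags


def missing_in_simd32(simd32_text, simd16_text):
    """Tags exported by the SIMD16 object but absent from the SIMD32 one."""
    return sorted(exported_tags(simd16_text) - exported_tags(simd32_text))
```

Feeding it the symbol dumps of the default and IGC_ForceOCLSIMDWidth=16 builds of processed_0_dg2.bin should return a non-empty list when the bug reproduces and an empty list otherwise.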

Environment

  • GPU / target: DG2 (reproduced on Intel Arc A770)
  • IGC / compute stack: via distro ocloc (version as shipped with Intel compute runtime)

Related

Distinct from subgroup-shuffle / OpUConvert miscompile issues — this is missing kernels in the final ELF after SIMD32 finalization / retry, not wrong arithmetic in a compiled kernel.
