Harden Wasmtime's compiled GC code against corruption #13321

Open
alexcrichton wants to merge 4 commits into bytecodealliance:main from alexcrichton:harden-gc-compild-code

@alexcrichton
Member

In the spirit of #13320 this commit goes through the compiled code for the GC proposal to ensure that, in the face of GC corruption, Wasmtime by default can recover and return a "bug" to the embedder. This was also discussed a bit in #13112, and the changes made here are:

  • Plumbing traps from translation into the runtime now uses a new CompiledTrap enum instead of just the normal Trap. This new enum has variants for InternalAssert (not previously present) and GcHeapCorrupted (now added).
  • Whether or not CompiledTrap::{InternalAssert,GcHeapCorrupted} is encoded into the final *.cwasm is now a Tunables configuration option. Internal asserts are not encoded by default but GC heap corruption is.
  • Traps caught as CompiledTrap::{InternalAssert,GcHeapCorrupted} are turned into WasmtimeBug and propagated upwards. All other traps stay as normal traps.
  • All memory accesses to the GC heap now use CompiledTrap::GcHeapCorrupted as their trap code. They are also no longer marked as readonly in a few places.
  • A few locations in GC translation using InternalAssert now use GcHeapCorrupted, such as the checked arithmetic around array lengths. Other assertions which are about control flow are left untouched.

The end state is that faults in the GC heap in compiled code itself should show up as a bug! on the other end by default. This requires extra trap-mapping metadata in *.cwasm files, but that is similar to wasm modules using linear memory, which already carry lots of trap metadata for loads/stores. Being able to catch InternalAssert as a first-class error (as opposed to a signal) is a debugging nicety I've added here, but it remains off by default to avoid bloating *.cwasm files for internal debugging.
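As a rough illustration of the behavior described above, here is a minimal, self-contained Rust sketch. It is not the code from this PR: the `Normal` variant, the `TrapEncoding` struct and its field names, the `Outcome` type, and the `classify` function are all hypothetical names chosen for the example. Only `CompiledTrap`, `InternalAssert`, `GcHeapCorrupted`, and the stated defaults come from the description.

```rust
/// Hypothetical stand-in for an ordinary wasm trap code.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trap {
    HeapOutOfBounds,
    IntegerOverflow,
}

/// Sketch of the trap kinds compiled code can raise.
/// The `Normal` wrapper is an assumption made for this example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompiledTrap {
    Normal(Trap),
    InternalAssert,
    GcHeapCorrupted,
}

/// Hypothetical pair of knobs modeling the tunables described above.
#[derive(Debug, Clone, Copy)]
pub struct TrapEncoding {
    pub internal_asserts: bool,
    pub gc_heap_corrupted: bool,
}

impl Default for TrapEncoding {
    /// Defaults as stated in the description: internal asserts are not
    /// encoded, GC heap corruption is.
    fn default() -> Self {
        Self { internal_asserts: false, gc_heap_corrupted: true }
    }
}

impl TrapEncoding {
    /// Whether metadata for `trap` would be written into the compiled artifact.
    pub fn should_encode(&self, trap: CompiledTrap) -> bool {
        match trap {
            CompiledTrap::Normal(_) => true,
            CompiledTrap::InternalAssert => self.internal_asserts,
            CompiledTrap::GcHeapCorrupted => self.gc_heap_corrupted,
        }
    }
}

/// What the embedder would observe (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Trap(Trap),
    Bug(&'static str),
}

/// Normal traps stay traps; the two internal kinds surface as bugs.
pub fn classify(trap: CompiledTrap) -> Outcome {
    match trap {
        CompiledTrap::Normal(t) => Outcome::Trap(t),
        CompiledTrap::InternalAssert => Outcome::Bug("internal assertion failed in compiled code"),
        CompiledTrap::GcHeapCorrupted => Outcome::Bug("GC heap corruption detected"),
    }
}
```

With the default settings in this sketch, `TrapEncoding::default().should_encode(CompiledTrap::InternalAssert)` is `false`, which models why an internal assert would still surface as a signal unless the option is turned on.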

Closes #13112

@alexcrichton alexcrichton requested review from a team as code owners May 7, 2026 23:07
@alexcrichton alexcrichton requested review from fitzgen and removed request for a team May 7, 2026 23:07
@alexcrichton alexcrichton force-pushed the harden-gc-compild-code branch from 8a0b0df to bf9d63e Compare May 7, 2026 23:11
@github-actions github-actions Bot added fuzzing Issues related to our fuzzing infrastructure wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:config Issues related to the configuration of Wasmtime labels May 8, 2026
@github-actions
github-actions Bot commented May 8, 2026

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "fuzzing", "wasmtime:api", "wasmtime:config"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: fuzzing

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@github-actions
github-actions Bot commented May 8, 2026

Label Messager: wasmtime:config

It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

  • If you added a new Config method, you wrote extensive documentation for
    it.

    Our documentation should be of the following form:

    Short, simple summary sentence.
    
    More details. These details can be multiple paragraphs. There should be
    information about not just the method, but its parameters and results as
    well.
    
    Is this method fallible? If so, when can it return an error?
    
    Can this method panic? If so, when does it panic?
    
    # Example
    
    Optional example here.
    
  • If you added a new Config method, or modified an existing one, you
    ensured that this configuration is exercised by the fuzz targets.

    For example, if you expose a new strategy for allocating the next instance
    slot inside the pooling allocator, you should ensure that at least one of our
    fuzz targets exercises that new strategy.

    Often, all that is required of you is to ensure that there is a knob for this
    configuration option in wasmtime_fuzzing::Config (or one
    of its nested structs).

    Rarely, this may require authoring a new fuzz target to specifically test this
    configuration. See our docs on fuzzing for more details.

  • If you are enabling a configuration option by default, make sure that it
    has been fuzzed for at least two weeks before turning it on by default.


To modify this label's message, edit the .github/label-messager/wasmtime-config.md file.

To add new label messages or remove existing label messages, edit the
.github/label-messager.json configuration file.

Learn more.

Implement this for both little and big-endian loads.
@alexcrichton alexcrichton force-pushed the harden-gc-compild-code branch from bf9d63e to c084e97 Compare May 8, 2026 16:45
@alexcrichton alexcrichton requested a review from a team as a code owner May 8, 2026 16:45
@alexcrichton
Member Author

I've now pushed up a significant follow-up to get tests passing. Notably, Pulley and the Cranelift backend for Pulley didn't support loads/stores with custom trap codes. Previously the only possible trapping loads/stores were little-endian ones with the HeapOutOfBounds trap code (aka wasm loads/stores). The stores here now have different trap codes (custom ones for Wasmtime) and additionally might be big-endian (as not all GC data is little-endian). To handle that I've made two changes to Pulley:

  • The AddrZ addressing mode now carries a custom u8 payload for a trap code. This inflates the size of the payload by one byte.
  • I've added big-endian variants of all loads/stores using AddrZ to the extended opcode namespace.

Together these are then able to handle all possible loads/stores that can trap coming out of GC translation.
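To make the two requirements concrete (a caller-chosen trap code and a choice of endianness), here is a simplified Rust model of a null-checked load. It is only a sketch and not Pulley's implementation: `load_u32_z`, the `Endian` enum, the two-variant `TrapCode` enum, and the use of a byte slice as "memory" are all assumptions made for illustration, and the trap code is passed as a plain parameter purely to keep the example small.

```rust
/// Hypothetical trap codes for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TrapCode {
    HeapOutOfBounds,
    GcHeapCorrupted,
}

/// Byte order of the value being loaded.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Endian {
    Little,
    Big,
}

/// Loads a `u32` from `memory` at `addr + offset`.
///
/// In this model an `addr` of zero is treated as a null sentinel and traps
/// with `code`, as does any access that falls outside `memory`.
pub fn load_u32_z(
    memory: &[u8],
    addr: usize,
    offset: usize,
    endian: Endian,
    code: TrapCode,
) -> Result<u32, TrapCode> {
    if addr == 0 {
        return Err(code);
    }
    // Checked arithmetic so an overflowing address traps instead of wrapping.
    let start = addr.checked_add(offset).ok_or(code)?;
    let end = start.checked_add(4).ok_or(code)?;
    let slice = memory.get(start..end).ok_or(code)?;

    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(slice);
    Ok(match endian {
        Endian::Little => u32::from_le_bytes(bytes),
        Endian::Big => u32::from_be_bytes(bytes),
    })
}
```

The same four bytes decode to different values depending on `endian`, which is why a little-endian-only load instruction is not enough once big-endian GC data is involved.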

@alexcrichton
Member Author

> The AddrZ addressing mode now carries a custom u8 payload for a trap code

Just kidding, that ran into trap encoding issues. Now the Cranelift *_z instructions have a Cranelift TrapCode and that gets plumbed into the MachBuffer like usual. The trap code associated in the interpreter with AddrZ is now None, meaning that we look up in trap side tables for the actual code. That gets everything to align correctly.
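The "look it up in a side table when the instruction carries no code" idea can be sketched in Rust as follows. This is a generic illustration rather than Wasmtime's actual data structure: `TrapTable`, `lookup`, `resolve`, and the `(offset, code)` layout are hypothetical names and shapes chosen for the example.

```rust
/// Hypothetical trap codes for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TrapCode {
    HeapOutOfBounds,
    GcHeapCorrupted,
}

/// Hypothetical side table mapping a code offset to the trap it can raise.
/// Entries are kept sorted by offset so lookups can use binary search.
pub struct TrapTable {
    entries: Vec<(u32, TrapCode)>,
}

impl TrapTable {
    /// Builds a table from unsorted `(offset, code)` pairs.
    pub fn new(mut entries: Vec<(u32, TrapCode)>) -> Self {
        entries.sort_by_key(|(offset, _)| *offset);
        Self { entries }
    }

    /// Returns the trap code recorded for exactly `pc_offset`, if any.
    pub fn lookup(&self, pc_offset: u32) -> Option<TrapCode> {
        self.entries
            .binary_search_by_key(&pc_offset, |(offset, _)| *offset)
            .ok()
            .map(|index| self.entries[index].1)
    }
}

/// Uses the code carried by the instruction when there is one; otherwise
/// (`None`) falls back to the side table for the faulting offset.
pub fn resolve(inline: Option<TrapCode>, table: &TrapTable, pc_offset: u32) -> Option<TrapCode> {
    inline.or_else(|| table.lookup(pc_offset))
}
```

In this model, an instruction whose inline code is `None` gets whatever the compiler recorded for its offset, so a single instruction shape can report different trap codes at different sites.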

@github-actions github-actions Bot added cranelift Issues related to the Cranelift code generator cranelift:meta Everything related to the meta-language. isle Related to the ISLE domain-specific language labels May 9, 2026
@github-actions
github-actions Bot commented May 9, 2026

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:meta", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Labels

cranelift:meta Everything related to the meta-language. cranelift Issues related to the Cranelift code generator fuzzing Issues related to our fuzzing infrastructure isle Related to the ISLE domain-specific language wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:config Issues related to the configuration of Wasmtime

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Should accesses to the GC heap use MemFlags::trusted()?

1 participant