feat: support callback struct arguments on AMD64 #42

Merged
kolkov merged 11 commits into go-webgpu:main from pekim:bugfix/fix-issue-41
May 9, 2026

@pekim pekim commented May 8, 2026

Fixes #41

Adds support for callback struct by-value arguments on AMD64 Unix (System V ABI).

It's the inverse of the struct type arg support in internal/arch/amd64/call_unix.go.
Instead of marshalling from a struct to registers and the stack, data from
registers and the stack are marshalled into a struct.
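
For readers less familiar with the ABI, the two directions can be sketched roughly as below. This is only an illustration, not the code in this PR: the pair and regs types and the packPair/unpackPair helpers are hypothetical names, and it assumes the struct is the first argument, so its INTEGER eightbyte lands in RDI and its SSE eightbyte in XMM0.

```go
package main

import (
	"fmt"
	"math"
)

// pair is a hypothetical 16-byte struct. Under System V its first
// eightbyte is classified INTEGER and its second SSE.
type pair struct {
	ID    int64
	Value float64
}

// regs is a hypothetical snapshot of the argument registers.
type regs struct {
	ints   [6]uint64 // RDI, RSI, RDX, RCX, R8, R9
	floats [8]uint64 // raw low 64 bits of XMM0..XMM7
}

// packPair is the outgoing-call direction: struct to registers.
func packPair(p pair) regs {
	var r regs
	r.ints[0] = uint64(p.ID)
	r.floats[0] = math.Float64bits(p.Value)
	return r
}

// unpackPair is the callback direction: registers back to a struct.
func unpackPair(r regs) pair {
	return pair{
		ID:    int64(r.ints[0]),
		Value: math.Float64frombits(r.floats[0]),
	}
}

func main() {
	r := packPair(pair{ID: 42, Value: 1.5})
	fmt.Printf("%+v\n", unpackPair(r))
}
```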

Pull Request Requirements

  • Code is formatted (go fmt ./...)
  • Linter passes (golangci-lint run)
  • All tests pass with race detector (go test -race ./...)
  • Benchmarks don't regress (FFI overhead < 200ns)
    • Benchmarks report over 400ns for me locally. But it's the same on main, so I guess that my hardware is just slow. In any case it doesn't look like there's any regression.
  • New code has tests (minimum 70% coverage, current: 89.1%)
    • On main coverage of the github.com/go-webgpu/goffi/ffi package is 87.3%. With this PR it's 91.0%.
  • Platform-specific code tested on target OS
  • Assembly changes validated on real hardware
    • N/A
  • Documentation updated (if applicable)
  • Commit messages follow conventions
  • No sensitive data (credentials, tokens, etc.)

I've tried to broadly follow the practices of the project. But I'm still getting familiar with it, so it's quite possible that I may have deviated too far in places.

I've tried to follow the instructions in https://github.com/go-webgpu/goffi/blob/main/CONTRIBUTING.md where applicable. But I've ignored the references to the develop branch, as it appears to be quite out of date. So I've targeted main.

tests

I'm also somewhat new to the intricacies of the function calling ABI. So while all of the tests that I've added pass, it wouldn't surprise me if I have any of the tests wrong. I have tried the implementation out with the use of clang_visitChildren that was my original use case that prompted #41. And it appears to work, so I have some confidence that the implementation might be broadly on the right lines.

data race

There was a data race when writing to the reflect.Value used for a new struct. For example at https://github.com/pekim/goffi/blob/bugfix/fix-issue-41/ffi/callback.go#L306.

					*(*float64)(valPtr) = getFloat()

Commit pekim@50eb215 works around this. But perhaps there's a better approach?

overhead

The use of a []byte and the subsequent copy (to address the data race) will introduce an overhead. But I've not yet managed to find a better solution.

pekim added 2 commits May 8, 2026 11:41
Add support for callback struct by-value arguments on AMD64 Unix
(System V ABI).

It's the inverse of the struct type arg support in internal/arch/amd64/call_unix.go.
Instead of marshalling from a struct to registers and the stack, data from
registers and the stack are marshalled into a struct.

Writing to valPtr, for example
    *(*float64)(valPtr) = getFloat()
would result in a data race
    checkptr: converted pointer straddles multiple allocations

Using a []byte to represent the struct value, and then copying the
slice into a struct reflect.Value avoids the data race.
@pekim pekim requested a review from kolkov as a code owner May 8, 2026 14:10

pekim commented May 8, 2026

...it wouldn't surprise me if I have any of the tests wrong

I'm a bit more confident now that I've added some end-to-end tests.

@kolkov kolkov left a comment

@pekim Thank you for implementing this — callback struct args are a significant feature, and your approach (all three size buckets + e2e tests with C library) is exactly right. purego doesn't support this, so goffi would be the first pure Go FFI library with callback struct args.

The getInt()/getFloat() refactoring is clean and the reflect.Type-based classification is well done. Four issues to fix before merge:

1. MEMORY class: double stackIdx increment (bug)

In the > 16B loop, the partial last chunk (bytesLeft < 8) calls getInt() which increments stackIdx, then the loop's stackIdx++ runs again — advancing by 2 instead of 1. Any argument following a >16B struct reads from the wrong slot.

// Current (line ~336):
chunk := frame[stackIdx]
if bytesLeft >= 8 {
    *(*uintptr)(chunkPtr) = chunk
} else {
    writePartial(chunkPtr, bytesLeft, getInt())  // getInt() does stackIdx++
}
stackIdx++  // second increment for partial chunk

Fix — use the pre-read chunk in both branches:

chunk := frame[stackIdx]
if bytesLeft >= 8 {
    *(*uintptr)(chunkPtr) = chunk
} else {
    writePartial(chunkPtr, bytesLeft, chunk)  // use pre-read value
}
stackIdx++  // single increment

To verify: add a test with MEMORY-class struct followed by a scalar argument (e.g., func(s TripleI64, extra int64)).
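
A standalone way to reason about the slot accounting is sketched below. It is not the code in callback.go: readMemoryStruct is a hypothetical helper, the frame is modelled as a []uint64, and little-endian byte order is assumed. The point is that each chunk, full or partial, consumes exactly one 8-byte stack slot.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readMemoryStruct copies size bytes of a MEMORY-class struct out of the
// stack frame, starting at slot stackIdx. It returns the struct bytes and
// the index of the next unread slot.
func readMemoryStruct(frame []uint64, stackIdx, size int) ([]byte, int) {
	out := make([]byte, size)
	for off := 0; off < size; off += 8 {
		chunk := frame[stackIdx] // read the slot once
		var tmp [8]byte
		binary.LittleEndian.PutUint64(tmp[:], chunk)
		// copy stops at the end of out, so a partial last chunk
		// only contributes size-off bytes.
		copy(out[off:], tmp[:])
		stackIdx++ // single increment per chunk
	}
	return out, stackIdx
}

func main() {
	// A 20-byte struct occupies three slots; slot 3 belongs to the
	// next stack argument.
	frame := []uint64{1, 2, 0x00ABCDEF, 99}
	data, next := readMemoryStruct(frame, 0, 20)
	fmt.Println(len(data), next, frame[next])
}
```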

2. Buffer overrun in structData allocation (bug)

make([]byte, sz) allocates exactly sz bytes, but *(*float64)(valPtr) = getFloat() writes 8 bytes and *(*uintptr)(valPtr) writes 8 bytes. For a 4-byte struct (e.g., struct{float32}), this overwrites 4 bytes past the allocation.

Fix:

structData := make([]byte, max(sz, 8))

This provides safe headroom. The final copy(valByteSlice, structData) already uses sz, so only the correct bytes are copied to the reflect.Value.

3. ARM64 build tag (crash)

callback_struct_args_test.go has //go:build (linux || darwin || freebsd) && (amd64 || arm64), but callback_arm64.go does not support reflect.Struct, so NewCallback would panic on ARM64.

Fix: change to amd64 only:

//go:build (linux || darwin || freebsd) && amd64

4. Windows e2e tests (crash)

struct_e2e_test.go includes Windows in the build tag. The new TestCallbackStruct* tests call NewCallback with struct-accepting functions, but callback_windows.go doesn't support reflect.Struct → panic.

Fix: add skip at the top of each TestCallbackStruct* function:

if runtime.GOOS == "windows" {
    t.Skip("callback struct args not supported on Windows")
}

Nice to have (non-blocking)

  1. classifyEightbyte simplification: consider using field.Offset from reflect.StructField instead of manually recomputing alignment. Simpler and guaranteed to match Go's actual layout.

  2. VoidTypeDescriptor: TestCallbackStructArg8B_FloatPair and other callback tests use SInt64TypeDescriptor as return type for void C functions. VoidTypeDescriptor would be more accurate.

  3. Nested struct classification: isStructAllFloats and classifyEightbyte don't recurse into nested struct fields. This is fine for the initial implementation but worth a comment noting the limitation.


Overall: excellent contribution. The four fixes are straightforward — happy to help if you have questions on any of them.

pekim added 8 commits May 9, 2026 11:06
It was unnecessary to call getInt for the partial chunk, as the chunk
is already in the chunk variable. And the getInt function would
increment stackIdx an extra time.

Fix the tests that test MEMORY class with a final partial chunk. The
structs used were all a multiple of 8 bytes rather than the intended
size.

pekim commented May 9, 2026

@kolkov, thank you for the clear and comprehensive review. It was very helpful.

1. MEMORY class: double stackIdx increment (bug)

Suggested change implemented.

To verify: add a test with MEMORY-class struct followed by a scalar argument (e.g., func(s TripleI64, extra int64)).

I don't think that such a test would be affected by this, as the extra scalar argument would be in an integer register not the stack.

However I did notice that my tests intended to exercise the partial chunk code were wrong. Their structs' sizes were all multiples of 8 because of Go's struct field alignment and padding. So I've fixed those tests.

2. Buffer overrun in structData allocation (bug)

Suggested change implemented.

3. ARM64 build tag (crash)

fixed

4. Windows e2e tests (crash)

fixed

5. classifyEightbyte simplification

Nice suggestion, implemented.

6. VoidTypeDescriptor

Oops, the perils of copy and paste. Fixed.

7. Nested struct classification

I added support for nested structs to isStructAllFloats, and added tests for the function.

When it came to classifyEightbyte I struggled to work out what the behaviour should be. So I've just added a comment noting the absence of support for nested structs.

commits

I've left all of the fixes as separate commits for ease of reviewing. I can squash all of the PR's commits once you're happy with everything.

@pekim pekim changed the title Bugfix/fix issue 41 feat: support callback struct arguments on AMD64 May 9, 2026

codecov Bot commented May 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@kolkov kolkov left a comment

All 7 fixes verified, CI 12/12 green. Excellent work — goffi is now the first pure Go FFI library with callback struct argument support. Welcome as a contributor, @pekim.

@kolkov kolkov merged commit 513b756 into go-webgpu:main May 9, 2026
13 checks passed

Development

Successfully merging this pull request may close these issues.

unsupported callback argument type: struct

2 participants