Merged
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,22 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.27.4] - 2026-05-13

### Fixed

- **Flaky TestThread_CallAsync on Windows CI** — replaced `time.Sleep(10ms)` with
channel-based synchronization. Same deterministic pattern as SnatchLock fix (f940eb7).
Verified: 100/100 passes with `-count=100`.

### Changed

- **goffi v0.5.0 → v0.5.1** — struct by-value argument passing (System V AMD64 ABI),
9-16B struct return via XMM registers (NSPoint, CGSize — critical for Metal backend),
callback struct arguments, CGO_ENABLED=1 support (race detector), E2E test infra.
Contributors: @jiyeyuran (CGO), @pekim (callbacks).
- **x/sys v0.43.0 → v0.44.0** — latest platform syscall definitions.

## [0.27.3] - 2026-05-11

### Added
2 changes: 1 addition & 1 deletion README.md
@@ -54,7 +54,7 @@ go get github.com/gogpu/wgpu
CGO_ENABLED=0 go build
```

> **Note:** wgpu uses Pure Go FFI via `cgo_import_dynamic`, which requires `CGO_ENABLED=0`. This enables zero C compiler dependency and easy cross-compilation.
> **Note:** wgpu uses Pure Go FFI via [goffi](https://github.com/go-webgpu/goffi). Both `CGO_ENABLED=0` (default, zero C compiler dependency) and `CGO_ENABLED=1` (for race detector or coexistence with CGO libraries) are supported.

---

21 changes: 15 additions & 6 deletions ROADMAP.md
@@ -19,7 +19,7 @@

---

## Current State: v0.27.0
## Current State: v0.27.4

✅ **All 5 HAL backends complete** (~127K LOC)
✅ **Three-layer WebGPU stack** — wgpu API → wgpu/core → wgpu/hal
@@ -53,12 +53,16 @@
✅ **Zero-init workgroup memory** — WebGPU spec default, plumbed through all layers
✅ **CopyTextureToTexture public API** — DMA hardware copy with sub-region support
✅ **Vulkan relay semaphores** — GPU-side submission ordering (Mesa ANV workaround)
✅ **Software SPIR-V interpreter** — CPU shader execution for vertex/fragment (Phase 1: triangle)
✅ **WASM platform split** — root package _native.go/_browser.go, core/hal excluded from WASM build
✅ **Vulkan command buffer free list** — batch alloc 16 CBs, pool reset (Khronos/NVIDIA/ARM/Mesa/Rust parity)
✅ **Damage-aware surface presentation** — `PresentWithDamage()` with compositor dirty rects. First WebGPU implementation. Software + Vulkan + DX12 + GLES backends.
✅ **Automatic resource lifecycle** — `runtime.AddCleanup` for Buffer/BindGroup (ADR-018, Rust Arc+Drop pattern). GC safety net prevents per-frame resource leaks.
✅ **Zero-allocation WriteBuffer batching** — pre-allocated BufferCopy + stack barrier arrays. All PendingWrites hot paths 0 allocs/op.
✅ **Full SPIR-V interpreter** — 7 phases (~10K LOC): vertex/fragment/compute on CPU, texture sampling, GLSL.std.450 intrinsics, control flow, atomics, workgroup shared memory. Shader debugger with breakpoints and JSON trace. For debugging/CI, not production.
✅ **DX12 timestamp queries** — CreateQuerySet, EndQuery, ResolveQueryData (Rust wgpu-hal parity)
✅ **Queue thread safety** — Submit/WriteBuffer/WriteTexture serialized via sync.Mutex (Rust wgpu-core parity)
✅ **GLES compute memory barriers** — glMemoryBarrier for storage→draw/dispatch transitions (Rust parity)
✅ **Software render pass instrumentation** — slog debug events + RenderPassStats for CI e2e assertions

### Remaining validation (planned)
- **Phase C** (P2): Spec compliance edge cases, feature gates
@@ -70,19 +74,20 @@
| Metal | macOS | ✅ Stable — naga MSL 91/91 |
| DX12 | Windows | ✅ Stable — TDR fixed, PendingWrites, deferred destruction |
| GLES | Windows, Linux | ✅ Stable — text rendering, SamplerBindMap, texture completeness |
| Software | Windows, Linux | ✅ Stable — windowed presentation (GDI/X11), macOS planned |
| Software | Windows, Linux | ✅ Stable — windowed presentation (GDI/X11), SPIR-V interpreter, macOS planned (#163) |

→ **See [CHANGELOG.md](CHANGELOG.md) for detailed per-version notes**

---

## Upcoming

### Next: v0.26.0
### Next

- [ ] GLES Phase 1 — CopyBufferToTexture, CopyTextureToTexture, glFenceSync
- [ ] macOS software presentation — CGImage + CALayer (#163, contributor @k-chimi)
- [ ] DX12 DeviceTextureTracker for proper barrier state tracking
- [ ] GLES global UNPACK_ALIGNMENT=1 (Rust pattern — set once at device open)
- [ ] Vulkan relay semaphores for multi-submission ordering (VK-SYNC-001)
- [ ] GetSurfaceCapabilities on all backends (currently Vulkan-only)
- [ ] DXIL as default DX12 shader path (currently opt-in via `GOGPU_DX12_DXIL=1`)

@@ -98,7 +103,7 @@
- [x] Text rendering on all GPU backends
- [x] Blend constant tracking + resource usage conflict detection
- [ ] Full render/compute pass validation (resource transitions)
- [ ] Late buffer binding size validation (SPIR-V reflection → min binding size)
- [x] Late buffer binding size validation (VAL-006, draw/dispatch-time checks)
- [ ] Comprehensive documentation
- [ ] Conformance test suite

@@ -144,6 +149,10 @@

| Version | Date | Highlights |
|---------|------|------------|
| **v0.27.4** | 2026-05 | goffi v0.5.1 (struct ABI, XMM return, CGO_ENABLED=1), x/sys v0.44.0, flaky TestThread_CallAsync fix |
| **v0.27.3** | 2026-05 | Software render pass instrumentation (slog + RenderPassStats), Metal MsgSend docs |
| **v0.27.2** | 2026-05 | DX12 timestamp queries, Queue mutex, GLES compute barriers, Vulkan timestampPeriod fix |
| **v0.27.1** | 2026-05 | MSAA resolve LoadOp=CLEAR, Vulkan offscreen ImageLayoutGeneral, persistent stencil, naga v0.17.13 |
| **v0.27.0** | 2026-05 | **Full SPIR-V interpreter** (7 phases, ~10K LOC), shader debugger, compute HAL, particles rendering, tagged union optimization, naga v0.17.11, flaky test fix |
| **v0.26.12** | 2026-05 | **Test coverage** (core 85.5%, root 78.4%), Metal entry point fix (#168 by @k-chimi), naga v0.17.10 |
| **v0.26.11** | 2026-04 | **DX12 indirect dispatch/draw** — ExecuteIndirect + CommandSignature (was last GPU backend with stubs) |
4 changes: 2 additions & 2 deletions go.mod
@@ -3,8 +3,8 @@ module github.com/gogpu/wgpu
 go 1.25.0

 require (
-	github.com/go-webgpu/goffi v0.5.0
+	github.com/go-webgpu/goffi v0.5.1
 	github.com/gogpu/gputypes v0.5.0
 	github.com/gogpu/naga v0.17.13
-	golang.org/x/sys v0.43.0
+	golang.org/x/sys v0.44.0
 )
4 changes: 4 additions & 0 deletions go.sum
@@ -1,8 +1,12 @@
github.com/go-webgpu/goffi v0.5.0 h1:EuvVRiRn9qAfCkYYXbHs9gz8NY+zv2/OA1N7gi56UVE=
github.com/go-webgpu/goffi v0.5.0/go.mod h1:wfoxNsJkU+5RFbV1kNN1kunhc1lFHuJKK3zpgx08/uM=
github.com/go-webgpu/goffi v0.5.1 h1:RSPR+YKT0tmbp5Uon+xwhN1veC9cehmqMptMkQuopok=
github.com/go-webgpu/goffi v0.5.1/go.mod h1:wfoxNsJkU+5RFbV1kNN1kunhc1lFHuJKK3zpgx08/uM=
github.com/gogpu/gputypes v0.5.0 h1:i2ED/9w6m6yLxf8XJT69/NIMSNTLO2y5F1LqvugCKIE=
github.com/gogpu/gputypes v0.5.0/go.mod h1:cnXrDMwTpWTvJLW1Vreop3PcT6a2YP/i3s91rPaOavw=
github.com/gogpu/naga v0.17.13 h1:VlponVgD1fEfNotx0874M4n7tnfum8YlMEB3pBdd2Ps=
github.com/gogpu/naga v0.17.13/go.mod h1:15sQaHKkbqXcwTN+hHYGLsA0WBBnkmYzne/eF5p5WEg=
golang.org/x/sys v0.43.0 h1:Rlag2XtaFTxp19wS8MXlJwTvoh8ArU6ezoyFsMyCTNI=
golang.org/x/sys v0.43.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
golang.org/x/sys v0.44.0 h1:ildZl3J4uzeKP07r2F++Op7E9B29JRUy+a27EibtBTQ=
golang.org/x/sys v0.44.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
19 changes: 6 additions & 13 deletions internal/thread/thread_test.go
@@ -6,21 +6,19 @@
 package thread

 import (
-	"sync/atomic"
 	"testing"
-	"time"
 )

 func TestThread_CallVoid(t *testing.T) {
 	th := New()
 	defer th.Stop()

-	var called atomic.Bool
+	var called bool
 	th.CallVoid(func() {
-		called.Store(true)
+		called = true
 	})

-	if !called.Load() {
+	if !called {
 		t.Error("CallVoid did not execute function")
 	}
 }
@@ -42,17 +40,12 @@ func TestThread_CallAsync(t *testing.T) {
 	th := New()
 	defer th.Stop()

-	var called atomic.Bool
+	done := make(chan struct{})
 	th.CallAsync(func() {
-		called.Store(true)
+		close(done)
 	})

-	// Wait for async call to complete
-	time.Sleep(10 * time.Millisecond)
-
-	if !called.Load() {
-		t.Error("CallAsync did not execute function")
-	}
+	<-done
 }

 func TestThread_Stop(t *testing.T) {
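For readers who want to try the channel-based synchronization from the `TestThread_CallAsync` change outside this repository, here is a minimal, self-contained sketch. The `worker` type and the `newWorker`, `callAsync`, `stop`, and `ranAsync` names are hypothetical stand-ins invented for illustration; they are not the API of `internal/thread`, which may be implemented quite differently. The idea is only that the callback closes a channel and the caller blocks on it, so no sleep duration has to be guessed.

```go
package main

import "fmt"

// worker is a hypothetical stand-in for the thread type under test:
// it runs submitted functions one at a time on a dedicated goroutine.
type worker struct {
	tasks chan func()
	quit  chan struct{}
}

func newWorker() *worker {
	w := &worker{tasks: make(chan func(), 16), quit: make(chan struct{})}
	go func() {
		defer close(w.quit)
		for fn := range w.tasks {
			fn()
		}
	}()
	return w
}

// callAsync queues fn and returns without waiting for it to run.
func (w *worker) callAsync(fn func()) { w.tasks <- fn }

// stop closes the queue and waits for the worker goroutine to exit.
func (w *worker) stop() {
	close(w.tasks)
	<-w.quit
}

// ranAsync queues a callback and blocks until it has run.
// The write to ran precedes close(done), and the close is ordered before
// the receive completes, so reading ran afterwards is race-free.
func ranAsync(w *worker) bool {
	done := make(chan struct{})
	ran := false
	w.callAsync(func() {
		ran = true
		close(done)
	})
	<-done
	return ran
}

func main() {
	w := newWorker()
	defer w.stop()
	fmt.Println("async call ran:", ranAsync(w))
}
```

If the callback never runs, a bare `<-done` blocks until the `go test` timeout fires; a `select` with `time.After` is one possible variant when a faster, more specific failure message is preferred.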