refactor(compute): split gpu_engine.go into focused files #77

Merged
dndungu merged 1 commit into main from e64-file-decomposition on Apr 6, 2026

@dndungu dndungu commented Apr 6, 2026

Summary

  • Split compute/gpu_engine.go (3,521 lines) into the core file plus 3 new focused files (gpu_engine_matmul.go already existed from E63)
  • gpu_engine.go: 2,245 lines (core struct, constructor, MatMul dispatch, quantized methods)
  • gpu_engine_elementwise.go: 400 lines (Add/Sub/Mul/Div, unary ops, fused ops)
  • gpu_engine_reduction.go: 221 lines (Softmax, Sum, ArgMax, TopK)
  • gpu_engine_memory.go: 695 lines (Copy, Zero, Reshape, Gather, Split, Concat)
  • gpu_engine_matmul.go: 240 lines (unchanged, created in E63)
  • Zero API changes, pure file reorganization

Closes E64 task T64.1.1.

Test plan

  • go build ./... passes
  • go vet ./compute/ clean
  • go test ./compute/ -timeout 120s passes
  • go test -race -timeout 120s ./compute/ passes
  • All three extracted files under 1,000 lines (gpu_engine.go itself remains at 2,245)

Extract methods from gpu_engine.go (3521 -> 2245 lines) into three
focused files to improve navigability:

- gpu_engine_elementwise.go (400 lines): Add/Sub/Mul/Div, scalar ops,
  Exp/Log/Sin/Cos/Tanh/Pow, Sqrt/Rsqrt, fused RoPE/SwiGLU/RMSNorm,
  CosineSimilarity, HadamardTransform

- gpu_engine_reduction.go (221 lines): Sum, Softmax, ReduceSum/Max/Mean,
  GPUArgmax, GPUScaledSoftmax, GPUFusedSoftmaxVMul

- gpu_engine_memory.go (695 lines): Transpose, Zero/Zeros/Copy, Gather,
  ScatterAdd, Fill, Split/Concat/Repeat/RepeatInterleave, Reshape,
  OneHot, ConvertFP16ToF32

Zero behavioral changes. All method signatures identical. Build, vet,
and race-detector tests pass.
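
One way to back the "all method signatures identical" claim with a check is to snapshot the exported method set before and after the move and compare the two. The sketch below is an illustration only, not how this PR was verified: engine is a hypothetical stand-in for the real engine type, and methodSignatures and sameAPI are made-up helpers. In Go, the method set of a type does not depend on which file of the package declares each method, so a pure file move should leave the snapshot unchanged.

```go
package main

import (
	"fmt"
	"reflect"
)

// engine is a hypothetical stand-in for the real GPU engine type.
type engine struct{}

// In a real split, these methods could live in different files of the
// same package; the method set of *engine would be the same either way.
func (e *engine) Add(a, b int) int { return a + b }

func (e *engine) Sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

// methodSignatures lists the exported methods of v as "Name signature"
// strings. reflect reports methods in lexicographic order by name.
func methodSignatures(v any) []string {
	t := reflect.TypeOf(v)
	out := make([]string, 0, t.NumMethod())
	for i := 0; i < t.NumMethod(); i++ {
		m := t.Method(i)
		out = append(out, m.Name+" "+m.Type.String())
	}
	return out
}

// sameAPI reports whether two snapshots describe the same method set.
func sameAPI(before, after []string) bool {
	return reflect.DeepEqual(before, after)
}

func main() {
	before := methodSignatures(&engine{})
	for _, s := range before {
		fmt.Println(s)
	}
	// After a refactor, take the snapshot again and compare.
	after := methodSignatures(&engine{})
	fmt.Println("API unchanged:", sameAPI(before, after))
}
```

This only covers exported methods on one type; unexported helpers and package-level functions would need a different check, such as the compiler and test suite runs listed in the test plan.
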
@dndungu dndungu merged commit 06f6b1b into main Apr 6, 2026
1 check passed
@dndungu dndungu deleted the e64-file-decomposition branch April 6, 2026 22:33