This file provides guidance for agents and contributors working with code in this repository.
MIGraphX is AMD's graph inference engine that accelerates machine learning model inference on ROCm. It uses a three-level IR hierarchy:
- Programs (
migraphx::program): Top-level containers for entire neural networks - Modules (
migraphx::module): DAGs of instructions representing computational subgraphs - Instructions (
migraphx::instruction): Atomic operations wrapping anoperationwith inputs
The compilation flow transforms high-level models (ONNX, TensorFlow) through optimization passes into target-specific code (GPU kernels via HIP/MIOpen/rocBLAS, CPU ops, or reference implementations).
ROCm ecosystem: ROCm, MIOpen, rocBLAS, hipBLASLt, HIP (paths under /opt/rocm)
Core libraries: Protobuf (ONNX), Half (IEEE 754 fp16), pybind11 (Python), nlohmann/json, MessagePack, SQLite3
Build tools: CMake 3.15+, rbuild (optional but recommended), ccache (via rbuild develop)
Language level: C++17 (set in CMake)
src/include/migraphx/*.hpp- Public API headerssrc/op/- Operation implementationssrc/targets/gpu/- GPU backend (HIP, MIOpen, rocBLAS lowering)src/targets/cpu/- CPU backendsrc/targets/ref/- Reference implementationsrc/driver/- CLI tools (main.cpp, perf.cpp, verify.cpp, passes.cpp)src/py/- Python bindings (pybind11)test/- Unit and integration tests (mirrorssrc/structure)test/gpu/- GPU-specific tests (build astest_gpu_<topic>)test/verify/- End-to-end verification teststest/onnx/- ONNX model parsing/verificationtools/andcmake/- Build helpers, scripts, format scripts, CMake modulesexamples/- Usage examples (C++ API, Python, diffusion, transformers, vision)docs/- Sphinx/Doxygen documentation; build artifacts indocs/_build/build/- Default generated build output
Development build (recommended):
rbuild develop -DGPU_TARGETS=$(/opt/rocm/bin/rocminfo | grep -o -m1 'gfx.*')Manual CMake build:
cmake -S . -B build -DGPU_TARGETS=$(/opt/rocm/bin/rocminfo | grep -o -m1 'gfx.*')
make -C build -j$(nproc)Build with dependencies (rbuild):
rbuild build -d depend -B build -DGPU_TARGETS=$(/opt/rocm/bin/rocminfo | grep -o -m1 'gfx.*')Run all tests:
make -C build checkInstall:
make -C build installBuild documentation:
make -C build doc
# Or manually in docs/:
pip3 install -r sphinx/requirements.txt
python3 -m sphinx -T -E -b html -d _build/doctrees -D language=en . _build/htmlFormat code:
python3 tools/format.py -iInstall git hooks:
./.githooks/install- Files and symbols:
snake_case(e.g.,compute_shape,eliminate_concat.cpp) - Template parameters:
CamelCase - Macros:
UPPER_CASEwithMIGRAPHX_prefix - Public headers go under
src/include/migraphx/*.hpp - Prefer clear names over abbreviations
- Use
using type_alias = intnottypedef int type_alias - Use
nullptrnotNULLor0 - Use
orandandinstead of&&and||
- 4-space indent, no tabs
- 100 column limit
- Braces on new line for classes/functions/control structures
- Align consecutive assignments
.clang-tidyruns via CMake; treat warnings seriously (they are enforced in CI)cppcheckis enabled with custom rules
-
Avoid raw loops - Prefer algorithms from
<migraphx/algorithm.hpp>and STL<algorithm>:// Good: Declarative, expresses intent clearly std::transform(in.begin(), in.end(), std::back_inserter(out), [](int i) { return i * 2; }); // Avoid: Raw loops are error-prone and less optimizable for (int i : in) { out.push_back(i * 2); } // Note: std::for_each should NOT be used as it doesn't encapsulate a raw loop // It's just a wrapper around a for loop - use std::transform or other algorithms instead
MIGraphX-specific algorithms (
#include <migraphx/algorithm.hpp>):transform_if(start, last, out, pred, f)- Transform with filteringtransform_accumulate(first, last, init, binop, unaryop)- Accumulate with projectiontransform_partial_sum(first, last, out, binop, unaryop)- Partial sum with projectiongroup_by(start, last, out, pred)- Group elements by predicategroup_unique(start, last, out, pred)- Group consecutive unique elementsgroup_find(start, last, pred, out)- Find and group matching rangesadjacent_remove_if(first, last, pred)- Remove based on adjacent pairsadjacent_for_each(first, last, f)- Iterate over adjacent pairsfor_each(first1, last1, first2, f)- Two-range iteration (like std::transform)
If no suitable algorithm exists: Add a new algorithm to
migraphx/algorithm.hpprather than using raw loops -
Memory management - Use
std::make_unique/shared, avoid rawnew/delete -
Non-memory resources - Use
MIGRAPHX_MANAGE_PTRfor C-style acquire/release APIs:using file_ptr = MIGRAPHX_MANAGE_PTR(FILE*, fclose); file_ptr f{fopen(filename, "r")}; // Automatically closed on scope exit
-
Type erasure over inheritance - MIGraphX uses type erasure extensively for
pass,operation,target- No need to inherit from base class - just implement required interface methods
- Generate boilerplate with
make generatewhen interfaces change - Enables value semantics with polymorphism, no virtual inheritance overhead
-
Pass design principles:
- Keep passes idempotent (running twice = running once)
- Keep passes deterministic (same input → same output)
- Always handle dynamic shapes in
compute_shape(unknown dimensions)
-
Generic programming - Write reusable, type-independent code using templates and STL algorithms
-
Avoid Casts - Don't use a cast unless absolutely necessary. Declare correct types. Casts indicate a type mismatch that should be resolved at the source.
-
Encapsulate Bit Manipulation - Put bit-twiddling and low-level operations behind well-named utility functions:
- Prefer
std::bitsetfor bit manipulation
- Prefer
-
Use std::tie for Lexicographical Comparisons - Use
std::tieorstd::lexicographical_compareinstead of manually writing lexicographical comparisons. -
Use shape class to compute offsets and indexing - The
migraphx::shapeclass provides methods for computing offsets, strides, and indexing. Use these instead of manual calculations with mod and division.
- Prefer correct, complete implementations over minimal ones.
- Use appropriate data structures and algorithms — don't brute-force what has a known better solution.
- When fixing a bug, fix the root cause, not the symptom.
- If something I asked for requires error handling or validation to work reliably, include it without asking.
Program (src/include/migraphx/program.hpp):
- Contains one or more modules
- Manages compilation via
program::compile(target) - Entry point:
get_main_module()returns main execution module
Module (src/include/migraphx/module.hpp):
- DAG of instructions
- Add instructions via
module::add_instruction(op, inputs...) - Add parameters via
module::add_parameter(name, shape)
Instruction (src/include/migraphx/instruction.hpp):
- References an operation and input instructions
- Mutable IR node/reference; passes may rewrite instructions in place via APIs such as
replace,set_normalized, andset_target_id
Operation (src/include/migraphx/operation.hpp):
- Defines computation semantics
- Must implement:
name(),compute_shape(inputs) - Optionally:
compute(ctx, output, inputs),finalize(ctx, shape, inputs)
A target provides get_passes, get_context, and memory ops; see GPU under src/targets/gpu/ and CPU under src/targets/cpu/.
- Parse model (ONNX/TF) → Generic IR
- Run optimization passes (
pass_manager.cpp) - Target lowering (
targets/{gpu,cpu}/) applies backend-specific transformations - Code generation (GPU: kernel fusion, HIP code objects; CPU: direct ops)
- Execution via
program::eval(params)orprogram::run(params)
migraphx::shape - Describes tensor properties (one of the most important classes):
- Data type (e.g.,
float_type,int32_type) - Dimensions/lengths (e.g.,
{1, 3, 224, 224}) - Memory layout/strides
- States:
.packed(),.transposed(),.broadcasted(),.standard()
migraphx::literal - Immutable data buffer with a shape (compile-time constants like weights)
migraphx::argument - Mutable data buffer with a shape (runtime results and inputs)
migraphx::tensor_view<T> - Type-safe multi-dimensional view combining T* pointer with a shape
Visitor pattern for type-safe data access:
migraphx::literal l{migraphx::shape{migraphx::shape::float_type, {2, 2}}, {1.0f, 2.0f, 3.0f, 4.0f}};
l.visit([](auto view) {
// 'view' is a tensor_view<float>
std::cout << view(1, 1) << std::endl; // Prints 4.0
});Add parameters to modules for runtime inputs:
migraphx::shape s{migraphx::shape::int32_type, {1}};
auto x = mm->add_parameter("x", s); // Creates placeholder instruction
// Later, evaluate with actual data:
migraphx::parameter_map params;
std::vector<int> data = {4};
params["x"] = migraphx::argument(s, data.data());
auto result = p.eval(params).back();- Create operation struct in
src/include/migraphx/op/<name>.hpp:
struct my_op {
std::string name() const { return "my_op"; }
shape compute_shape(std::vector<shape> inputs) const {
// Compute output shape from inputs
return inputs[0];
}
// Optional: for reference target
argument compute(context& ctx, const shape& output,
std::vector<argument> args) const {
// Compute implementation
}
};- Register operation (automatic via include pattern or explicit):
MIGRAPHX_REGISTER_OP(my_op)- Add tests in
test/op_shape_test.cpportest/my_op_test.cpp
Place headers in src/include/migraphx/op/ and implementation in src/ if not header-only. Reflect attributes if needed.
Simple implementation:
#include <migraphx/pass.hpp>
struct my_pass {
std::string name() const { return "my_pass"; }
void apply(module& m) const {
// Transform module instructions
}
// Or use module_pass_manager:
// void apply(module_pass_manager& mpm) const;
};Using matchers (preferred for pattern matching):
Create a matcher struct with matcher() and apply() methods:
namespace {
struct find_mul_by_one
{
auto matcher() const
{
// Match: mul(1, x) or mul(x, 1)
return match::name("mul")(match::either_arg(0, 1)(
match::is_constant(1.0f),
match::any().bind("x")));
}
void apply(module& m, const match::matcher_result& r) const
{
auto ins = r.result;
auto x = r.instructions["x"];
m.replace_instruction(ins, x);
}
};
} // namespace
void simplify_mul::apply(module& m) const
{
match::find_matches(m, find_mul_by_one{});
}Built-in matchers:
match::name("op_name")- Match operation by namematch::any()- Match any instructionmatch::is_constant(value)- Match literal with specific valuematch::args(m1, m2, ...)- Match positional argumentsmatch::arg(i)(m)- Match i-th argumentmatch::either_arg(0, 1)(m1, m2)- Match either argument order.bind("name")- Bind matched instruction to retrieve later
Custom matchers:
MIGRAPHX_PRED_MATCHER(broadcasted_shape, instruction_ref ins) {
return ins->get_shape().broadcasted();
}
// Usage: auto m = broadcasted_shape();Integration:
- Wire into target's
get_passes()method - For driver testing: register in
src/driver/passes.cpp - Add tests mirroring
test/eliminate_concat_test.cpp
- Implement target interface in
src/targets/mytarget/:
struct mytarget {
std::string name() const { return "mytarget"; }
std::vector<pass> get_passes(context& ctx,
const compile_options& options) const {
// Return optimization passes
}
context get_context() const {
// Return execution context
}
// Memory operations
argument allocate(const shape& s) const;
argument copy_to(const argument& arg) const;
argument copy_from(const argument& arg) const;
};- Register target:
MIGRAPHX_REGISTER_TARGET(mytarget)- Add target-specific tests in
test/mytarget/
- CTest-driven executables generated from files in
test/(seetest/CMakeLists.txt) - Naming: prefer
*_test.cpp; CMake auto-creates an executabletest_<topic>and registers it with CTest - Colocate helpers under
test/include/when needed - Coverage: CI uploads via Codecov; keep unit tests focused and fast
test/- Core unit tests (no specific backend), target-independenttest/gpu/- GPU-specific tests (requiresMIGRAPHX_ENABLE_GPU=Onand a ROCm device); build astest_gpu_<topic>test/ref/- Reference operator behavior teststest/verify/- Multi-target result comparison (GPU vs reference)test/onnx/parse/- ONNX model parsing tests (no execution)test/onnx/verify/- ONNX model execution + numerical comparisontest/tf/- TensorFlow parser teststest/api/- C/C++ API validationtest/py/- Python binding tests
# All tests
make -C build check
# Single test file
ctest -R '^test_<topic>$'
# Single test case with pattern
./bin/test_<topic> 'pattern*'
# GPU tests only
ctest -R '^test_gpu_'
# List all test cases in a file
./bin/test_<topic> --listBasic test structure:
#include "test.hpp" // From test/include/
TEST_CASE(my_feature) {
EXPECT(condition);
CHECK(other_condition);
}Module/Program test pattern:
#include <migraphx/program.hpp>
#include <migraphx/make_op.hpp>
#include "test.hpp"
static void run_pass(migraphx::module& m)
{
migraphx::run_passes(m, {migraphx::my_pass{}, migraphx::dead_code_elimination{}});
}
TEST_CASE(test_my_pass)
{
migraphx::module m1;
{
auto input = m1.add_parameter("input", {migraphx::shape::float_type, {3, 3}});
auto result = m1.add_instruction(migraphx::make_op("relu"), input);
m1.add_return({result});
}
run_pass(m1);
migraphx::module m2;
{
auto input = m2.add_parameter("input", {migraphx::shape::float_type, {3, 3}});
auto result = m2.add_instruction(migraphx::make_op("relu"), input);
m2.add_return({result});
}
EXPECT(m1 == m2);
}When adding a pass that rewrites structure:
- Build a small graph with the pattern to optimize
- Run only the new pass (plus prerequisites and dead_code_elimination for cleanup)
- Create a new module with the expected final structure and assert that it is the same
Use driver for inspection: migraphx-driver compile --text --apply-pass your_pass
- Use
verify_rms_rangefor tolerance-based comparisons with floating-point noise - Provide deterministic inputs (explicit literals or seeded random generators)
- Avoid flaky thresholds by using stable, reproducible test data
- Tests under
test/verifywill verify the results using the ref target - Each verify class should go into a separate .cpp file
Speed principles:
- Favor shapes < 64 elements for scalar algorithms
- Use single-digit iteration counts
- Reuse shapes across tests to exploit compiler cache hits
- To skip large-buffer tests, configure
-DMIGRAPHX_DISABLE_LARGE_BUFFER_TESTS=On
Edge cases to cover:
- Zero-length dimensions
- Dynamic shapes at extreme min/max
- Mixed type promotion (int + float → float)
- Broadcasting asymmetries
- Reduction axes ordering
Regression strategy:
- When fixing subtle bugs (e.g., stride miscalculation), add test reproducing original failure
- Keep golden ONNX/TF assets minimal (large models slow CI and obscure failures)
- For new operator parsing, a model containing just that op suffices
Test utilities: test/include/test.hpp, basic_ops.hpp, pointwise.hpp, reduce.hpp
The GPU backend (src/targets/gpu/) implements:
- Lowering (
lowering.cpp): Transform generic ops to GPU-specific ops - Fusion (
fuse_ops.cpp,fuse_ck.cpp; shared pointwise fusion insrc/fuse_pointwise.cpp): Kernel fusion optimizations - Code generation (
compile_hip*.cpp,compile_miopen.cpp): HIP/MIOpen/rocBLAS dispatch - Scheduling (
schedule_model.cpp): Select optimal implementation - Performance DB (
perfdb.cpp,problem_cache.cpp): Cache tuning results
Key subdirectories:
device/- Device-side utilities and kernelskernels/- Custom kernel implementationsinclude/migraphx/gpu/- GPU-specific headers
MIGraphX employs multiple fusion stages to reduce kernel overhead and improve memory locality:
1. Prefusion (GPU-specific):
- Replace recognizable operator clusters with custom placeholders before lowering
- Passes:
prefuse_ops,fuse_attention - Detects high-level patterns like transformer attention (Q*K^T scaling, softmax, V multiplication)
2. Pointwise/Reduce Fusion:
fuse_pointwise: Groups pure elementwise chains into submodules for single-kernel code generationfuse_reduce/fuse_pointwise_reduce: Combines reductions sharing compatible axes- Flags:
enable_rewrite_reshapes,enable_rewrite_broadcasts,enable_multi_output - Critical for layernorm and attention patterns
3. Backend Post-Lowering Fusion:
fuse_ops: Combine lowered ops with adjacent pointwise or layout adjustmentsfuse_ck: Leverage Composable Kernel library for GEMM + epilogues (bias, activation)fuse_mlir: Groups pointwise + reshape sequences around GEMM/Conv for MLIR pipeline
Troubleshooting missing fusion:
- Check if earlier passes canonicalized shapes (
simplify_reshapes) - Verify broadcasts are explicit/implicit as expected
- Check if dynamic dims prevent grouping
- Ensure op type lists support new data types (fp8/bf16)
- Use
MIGRAPHX_TRACE_COMPILE=1to confirm fusion pass execution
Target object provides:
get_passes(context&, compile_options&)- Ordered backend-specific pass list- Allocation model and context types
- Lowering rules mapping high-level ops to target intrinsics/library calls
Compile options control:
fast_math- Relaxed math transformationsexhaustive_tune- Full search for fastest GEMM/convolution implementation- Quantization toggles (fp16/int8/fp8) for op substitution
Memory planning:
memory_coloring- Assign buffers to reuse memory where lifetimes don't overlapadjust_allocation/replace_allocate- Coordinate with target allocation modelsschedule- Order instructions for optimal overlap/locality (disable:MIGRAPHX_DISABLE_SCHEDULE_PASS=1)
Library integration:
- GEMM/DOT: HIPBLASLt or Composable Kernel (
MIGRAPHX_SET_GEMM_PROVIDER) - Convolution: MIOpen if available (
MIGRAPHX_USE_MIOPEN), else custom kernels - Precision handling:
propagate_precision,rewrite_low_precisionfor mixed precision (fp16/fp8/bf16)
MIGraphX uses type erasure extensively to create extensible systems without traditional inheritance. This pattern is used for pass, operation, target, instruction, and other core abstractions. It enables value semantics with polymorphic behavior, no virtual inheritance overhead.
1. Define interface template in tools/include/<name>.hpp:
<%
interface('pass',
virtual('name', returns='std::string', const=True),
virtual('apply', returns='void', mpm='module_pass_manager &', const=True,
default='migraphx::detail::module_pass_manager_apply'),
virtual('apply', returns='void', p='program &', const=True,
default='migraphx::nop')
)
%>Syntax explanation:
interface('pass', ...)- Defines type-erased interface namedpassvirtual(...)- Defines one method in the interface- First arg: method name (e.g.,
'name') returns='...'- C++ return typeconst=True- Method is const- Method arguments as keyword args:
p='program &' default='...'- Optional default implementation fallback
- First arg: method name (e.g.,
2. Generate boilerplate:
make generate # Runs tools/te.py on templates in tools/include/migraphx/3. Generated header placed in src/include/migraphx/<name>.hpp contains wrapper class with concept/model pattern for virtual dispatch.
No inheritance required - just implement the interface:
struct my_custom_pass {
std::string name() const { return "my_custom_pass"; }
void apply(module& m) const { /* transformation logic */ }
};
// Use directly:
migraphx::pass p = my_custom_pass{}; // Value semantics!- Parses ONNX models into MIGraphX IR
- Located in
src/onnx/parse_*.cppfiles - Each ONNX operator has a corresponding parse function
Adding a new ONNX operator:
- Add parse function in
src/onnx/parse_<op>.cpp - Register in ONNX parser operator map
- Map ONNX attributes to MIGraphX operation parameters
- Handle shape inference and broadcasting
- Add test in
test/onnx/parse/(parsing only) andtest/onnx/verify/(execution)
TensorFlow Parser follows similar pattern in src/tf/.
Located in src/py/migraphx_py.cpp using pybind11.
Basic pattern:
import migraphx
# Parse model
model = migraphx.parse_onnx("model.onnx")
# Compile for target
model.compile(migraphx.get_target("gpu"))
# Run inference
results = model.run({"input": input_data})When adding C++ APIs, mirror in Python bindings following existing patterns.
The driver (./bin/migraphx-driver) is essential for development, debugging, and testing without writing C++ code.
| Command | Description |
|---|---|
op -l/--list |
Print all available MIGraphX operators |
params |
Print input/output parameter shapes |
read |
Load and print input graph |
compile |
Compile and print input graph |
run |
Compile, allocate parameters, evaluate, and print graph |
verify |
Run on reference and GPU, check output consistency |
perf |
Compile and run, then print performance report |
| Option | Description |
|---|---|
--onnx |
Load file as ONNX graph |
--tf |
Load file as TensorFlow graph |
--migraphx |
Load file as MIGraphX graph |
--gpu / --cpu / --ref |
Target backend for compilation |
--fp16 / --int8 |
Quantization mode |
--text / --json / --binary |
Output format |
--batch N |
Set batch size for model |
--input-dim @input 1 3 224 224 |
Set static dimensions |
-n N / --iterations N |
Number of iterations for perf |
--fill0 / --fill1 |
Fill parameters with 0s or 1s |
-o FILE / --output FILE |
Write output to file |
Compile and inspect IR:
migraphx-driver compile model.onnx --gpu --textApply a driver pass:
migraphx-driver compile model.onnx --apply-pass dead_code_elimination --textPerformance benchmarking:
migraphx-driver perf model.onnx --gpu -n 100Verify numerical correctness:
migraphx-driver verify model.onnx --atol 1e-5 --rtol 1e-5Register custom passes: Add to src/driver/passes.cpp for driver access.
Environment variables:
MIGRAPHX_TRACE_COMPILE=1- Trace compilation passesMIGRAPHX_TRACE_EVAL=1- Trace evaluationMIGRAPHX_DISABLE_SCHEDULE_PASS=1- Disable scheduling for debugging
Performance reporting:
program.perf_report(std::ostream&, iterations, parameter_map)- Get timing stats- Driver:
migraphx-driver perf --onnx model.onnx --gpu -n 50
IR inspection:
program.debug_print()- Print IR to stdout- GraphViz output via
graphviz.cpphelpers - ROCm profiling markers via
marker_roctx.*
MIGRAPHX_ENABLE_GPU- Enable GPU backend (default: ON if HIP available)MIGRAPHX_ENABLE_CPU- Enable CPU backend (default: OFF)MIGRAPHX_ENABLE_PYTHON- Build Python bindings (default: ON)MIGRAPHX_USE_MIOPEN- Use MIOpen library (default: ON)MIGRAPHX_USE_ROCBLAS- Use rocBLAS library (default: ON)MIGRAPHX_USE_HIPBLASLT- Use hipBLASLt (default: ON, requires rocBLAS)MIGRAPHX_USE_COMPOSABLEKERNEL- Use composable kernel JIT (default: ON)MIGRAPHX_DISABLE_LARGE_BUFFER_TESTS- Skip large-buffer testsBUILD_DEV- Development mode with extra checksGPU_TARGETS- Target GPU architectures (use detection snippet)
Example: cmake -S . -B build -DMIGRAPHX_ENABLE_GPU=On -DGPU_TARGETS=$(/opt/rocm/bin/rocminfo | grep -o -m1 'gfx.*')
- Missing GPU_TARGETS: Always specify via
$(/opt/rocm/bin/rocminfo | grep -o -m1 'gfx.*') - Dynamic shape errors: Ensure
compute_shape()handles unknown dimensions - Pass ordering: Passes have dependencies - check existing target pipelines
- Kernel packaging: New GPU kernels must be included in CMake install targets
- Build failures: Check clang-tidy warnings - they're enforced in CI
When implementing new features, search for analogous implementations:
- Operations:
grep -r "similar_op" src/op/ - Passes:
grep -r "similar_pass" src/ - GPU lowering: Check
src/targets/gpu/lowering.cpp - Tests: Mirror structure from
test/<similar_feature>_test.cpp - Matchers: See existing patterns in
src/matcher.cpp - ONNX ops: Look at similar operators in
src/onnx/parse_*.cpp
Key header files to understand:
src/include/migraphx/program.hpp- Top-level containersrc/include/migraphx/module.hpp- DAG of instructionssrc/include/migraphx/instruction.hpp- Graph nodessrc/include/migraphx/operation.hpp- Operation abstractionsrc/include/migraphx/shape.hpp- Tensor shape/layoutsrc/include/migraphx/pass.hpp- Pass interfacesrc/include/migraphx/matcher.hpp- Pattern matching DSL
Documentation:
- Official docs:
docs/directory (Sphinx + Doxygen) - Developer guides:
docs/directory (detailed contributor documentation) - Build documentation:
make -C build doc
- Commits: Use imperative, concise subject lines (optionally add a scope), e.g.,
fix: correct hip stream handling. Reference issues/PRs (Fixes #123). - PRs: Include problem statement, approach, and risk notes; link issues. Show test output for affected areas and add tests for new behavior. Ensure formatting/linters pass and CI is green.
- Do not use
$(nproc), runnprocfirst, then use the numeric value directly. - When running
make -j$(nproc), first resolve$(nproc)to a literal number, e.g. runnprocfirst, then use the numeric value directly in the make command (i.e.,make -j32ifnprocreturns32).
- When creating any temporary/scratch/generated file, write it under
tmp/at the workspace root. - Use
tmp/<task-slug>/<YYYYMMDD-HHMMSS>/for groups of files. - Never place temp files outside
tmp/or inside source directories.