diff --git a/.gitignore b/.gitignore
index f82e4f3..7bac171 100644
--- a/.gitignore
+++ b/.gitignore
@@ -41,5 +41,6 @@ nsys_analysis_build/
 **/NeMo
 
 # AI agent files
+CLAUDE.local.md
 GEMINI.md
 plans/
diff --git a/CHANGELOG b/CHANGELOG
index c401c1b..489c6f1 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,3 +1,40 @@
+## [v26.05.00] - 2026-05-19
+
+### Added
+
+- Kimi K2 MXFP8 pretrain support.
+- Nemotron 3 Nano (30B) and Super (120B) pretrain recipes.
+- Slurm topology checks and CPU governor reporting in the system info microbenchmark.
+- `llmb-run` job history and log handling.
+- `llmb-run` flags: `--env` for container env overrides, additional Slurm pass-through flags, and `dump-env` Megatron-Bridge mode.
+
+### Changed
+
+- Updated recipes to NeMo 26.04.00 where applicable.
+- Refreshed DeepSeek V3, Nemotron 3, and Qwen3 configurations.
+
+### Fixed
+
+- Legacy-parser grad-norm NaN handling.
+- Archive exclusion for `nsys_profile` and PyTorch profiling output directories.
+- Torchtitan container compatibility.
+
+### Removed
+
+- Deprecated Grok1 and Nemotron4 recipes.
+- Legacy `setup_script` installer path and Conda support.
+- Deprecated `llmb-run` commands.
+
+### Known Issues
+
+- DeepSeek V3 Megatron-Bridge on H100 requires `uv <=0.9.28` during setup.
+- EFA limitations remain for DeepSeek V3 (Megatron-Bridge H100, TorchTitan) and Qwen3 (30B H100, 235B H100); see Known Issues section of README for details.
+- Optional PCT fixed-core CPU binding may improve select workloads on Granite Rapids systems where PCT is enabled. See the README Known Issues section before applying the patch.
+
+### End of Support
+
+- LLMB `v25.12.x` and earlier are no longer supported as of `v26.05.00`. These release lines will not receive further updates, fixes, or support.
+
 ## [v26.02.01] - 2026-04-24
 
 ### Added
diff --git a/Exemplar_validation.md b/Exemplar_validation.md
index b7988ee..8be1f4f 100644
--- a/Exemplar_validation.md
+++ b/Exemplar_validation.md
@@ -33,8 +33,13 @@ While the benchmarks can be run independently, we recommend looping in your NVID
 
 ### **Run benchmark recipes via llmb-run**
 
-1. "llmb-run" is a tool that automates execution of the test suite, and is the recommended way to launch the suite.
-2. For an installed GPU type, executing `llmb-run exemplar` will launch the full Exemplar test suite (including running each test three times). See the [llmb-run README](cli/llmb-run/README.md) for more info.
+```bash
+llmb-run exemplar
+```
+
+This launches the full Exemplar test suite for the installed GPU type. `llmb-run` is the recommended tool for executing the suite; the `exemplar` subcommand is a convenience that launches every required workload in one go.
+
+If individual workloads fail, you can re-run them on their own — Exemplar requires a passing run for each workload in the suite, not a single end-to-end execution. See the [llmb-run README](cli/llmb-run/README.md) for repeat and profiling behavior.
 
 ### **Verify results**
 
@@ -44,12 +49,11 @@ While the benchmarks can be run independently, we recommend looping in your NVID
 
 ### **Optimize with NVIDIA**
 
-1. Work with your NVIDIA account team to investigate any tuning opportunities with NVIDIA performance experts.
+Work with your NVIDIA account team to investigate any tuning opportunities with NVIDIA performance experts.
 
 ### **Qualify for Exemplar**
 
-1. If approved, your cloud is recognized as an [NVIDIA Exemplar Cloud](https://www.nvidia.com/en-us/data-center/ai-cloud-performance/) for the selected platform(s).
-2. NVIDIA is happy to collaborate to support downstream efforts highlighting your achievement.
+If approved, your cloud is recognized as an [NVIDIA Exemplar Cloud](https://www.nvidia.com/en-us/data-center/ai-cloud-performance/) for the selected platform(s). NVIDIA is happy to collaborate to support downstream efforts highlighting your achievement.
 
 ## **Ongoing Expectations**
 
@@ -60,56 +64,56 @@ To start, contact your NVIDIA account team and reference this DGX Cloud Benchmar
 
 ## Exemplar Workload Recipes
 
-Scale: **512 GPUs** | Repeats: **3x** | Profiling: enabled for 1 of the 3 total runs
+Scale: **512 GPUs** | Repeats: **1** | Profiling: **disabled**
 
 ### GB300
 
-| Model | Size | Dtypes |
-| :---------- | :--- | :--------- |
-| DeepSeek-V3 | 671B | BF16, FP8 |
-| GPT (OSS) | 120B | BF16 |
-| Grok-1 | 314B | BF16, FP8 |
-| Llama 3.1 | 405B | FP8, NVFP4 |
-| Llama 3.1 | 70B | FP8, NVFP4 |
-| Nemotron-H | 56B | FP8 |
-| Nemotron-4 | 340B | BF16, FP8 |
-| Qwen3 | 235B | BF16 |
+| Model | Size | Dtypes |
+| :---------- | :--- | :--------------- |
+| DeepSeek-V3 | 671B | BF16, FP8, NVFP4 |
+| GPT (OSS) | 120B | BF16 |
+| Kimi-K2 | 1T | FP8 |
+| Llama 3.1 | 405B | FP8, NVFP4 |
+| Llama 3.1 | 70B | FP8, NVFP4 |
+| Nemotron-H | 56B | FP8 |
+| Nemotron 3 | 120B | BF16, FP8, NVFP4 |
+| Qwen3 | 235B | BF16 |
 
 ### GB200
 
+| Model | Size | Dtypes |
+| :---------- | :--- | :--------------- |
+| DeepSeek-V3 | 671B | BF16, FP8, NVFP4 |
+| GPT (OSS) | 120B | BF16 |
+| Kimi-K2 | 1T | FP8 |
+| Llama 3.1 | 405B | FP8, NVFP4 |
+| Llama 3.1 | 70B | FP8, NVFP4 |
+| Nemotron-H | 56B | FP8 |
+| Qwen3 | 235B | BF16 |
+
+### B300
+
 | Model | Size | Dtypes |
 | :---------- | :--- | :--------- |
-| DeepSeek-V3 | 671B | BF16, FP8 |
+| DeepSeek-V3 | 671B | BF16 |
 | GPT (OSS) | 120B | BF16 |
-| Grok-1 | 314B | BF16, FP8 |
 | Llama 3.1 | 405B | FP8, NVFP4 |
-| Llama 3.1 | 70B | FP8 |
+| Llama 3.1 | 70B | FP8, NVFP4 |
 | Nemotron-H | 56B | FP8 |
-| Nemotron-4 | 340B | BF16, FP8 |
+| Nemotron 3 | 120B | BF16 |
 | Qwen3 | 235B | BF16 |
 
-### B300
-
-| Model | Size | Dtypes |
-| :---------- | :--- | :----- |
-| DeepSeek-V3 | 671B | BF16 |
-| GPT (OSS) | 120B | BF16 |
-| Llama 3.1 | 405B | FP8 |
-| Llama 3.1 | 70B | FP8 |
-| Nemotron-H | 56B | FP8 |
-| Qwen3 | 235B | BF16 |
-
 ### B200
 
 | Model | Size | Dtypes |
 | :---------- | :--- | :--------- |
 | DeepSeek-V3 | 671B | BF16, FP8 |
 | GPT (OSS) | 120B | BF16 |
-| Grok-1 | 314B | BF16, FP8 |
+| Kimi-K2 | 1T | FP8 |
 | Llama 3.1 | 405B | FP8, NVFP4 |
 | Llama 3.1 | 70B | FP8, NVFP4 |
 | Nemotron-H | 56B | FP8 |
-| Nemotron-4 | 340B | BF16, FP8 |
+| Nemotron 3 | 120B | BF16, FP8 |
 | Qwen3 | 235B | BF16 |
 
 ### H100
@@ -118,8 +122,6 @@ Scale: **512 GPUs** | Repeats: **3x** | Profiling: enabled for 1 of the 3 total
 | :---------- | :--- | :-------- |
 | DeepSeek-V3 | 671B | FP8 |
 | GPT (OSS) | 120B | BF16 |
-| Grok-1 | 314B | BF16, FP8 |
 | Llama 3.1 | 70B | BF16, FP8 |
 | Nemotron-H | 56B | FP8 |
-| Nemotron-4 | 340B | BF16, FP8 |
 | Qwen3 | 235B | BF16 |
diff --git a/README.md b/README.md
index 3d89f52..eab887e 100644
--- a/README.md
+++ b/README.md
@@ -32,7 +32,9 @@ Depending on your cluster's job scheduler, ensure the following are met:
 - **Slurm Clusters**
   - Version 22.x or newer
   - `task/affinity` plugin required for process pinning
+  - PMIx support required; Slurm must be built with `--with-pmix` (verify with `srun --mpi=list`)
   - [Enroot](https://github.com/NVIDIA/enroot/) 4.0.0 or newer
+  - Enroot [extra hooks](https://github.com/NVIDIA/enroot/tree/main/conf/hooks/extra) (e.g. `50-slurm-pytorch.sh`) must be installed under `/etc/enroot/hooks.d/` — required for PyTorch distributed bootstrap.
   - [Pyxis](https://github.com/NVIDIA/pyxis)
 
 ## Quick Start Guide
@@ -120,12 +122,20 @@ Depending on your cluster's job scheduler, ensure the following are met:
 
    **Important:** Installation may take several hours, influenced by selected recipes, internet speed, and your current node's resources. Consider using a tool like `tmux` or `screen`.
 
-   This will set up a supported Python environment (reusing your current `uv`/venv/conda env if compatible, otherwise creating `../llmb_venv` one directory above the repo), then launch the interactive installer.
+   This will ensure the required `uv` version is available, set up a supported Python environment (reusing your active environment if compatible, otherwise creating a uv-managed `../llmb_venv` one directory above the repo), then launch the interactive installer.
 
    ```bash
    ./install.sh
    ```
 
+   To reuse container images across multiple installs on the same system, pass a writable shared image folder:
+
+   ```bash
+   ./install.sh -i /shared/llmb-images
+   ```
+
+   This forwards `-i` to `llmb-install` and avoids downloading images that already exist in that folder.
+
    The installer will:
 
    - Install `uv` (the required package manager) if it is not already present
@@ -156,6 +166,8 @@ Depending on your cluster's job scheduler, ensure the following are met:
    llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 256
    ```
 
+   For one workload/model-size target, `-w -s ` is usually easiest to read. For target lists, omit `-s` and pass comma-separated `-w` entries: `pretrain_llama3.1_70b,pretrain_nemotron-h` selects one Llama 3.1 model size plus Nemotron-H, while `pretrain_llama3.1,pretrain_nemotron-h` includes all installed Llama 3.1 model sizes plus Nemotron-H.
+
 8. (Optional) Package results for sharing:
 
    When you're ready to share results — for example, as part of [Exemplar Cloud certification](Exemplar_validation.md) — bundle all experiment data into a single archive:
@@ -182,7 +194,7 @@ After running the installer, the following directory structure is created:
 
 - `LLMB_REPO`: Directory containing the clone of the recipe repository.
 - `LLMB_INSTALL`: Top-level directory for all benchmarking artifacts (images, datasets, venvs, workloads, etc).
-- `LLMB_WORKLOAD`: Workload-specific directory, e.g. `${LLMB_INSTALL}/workloads/pretrain_nemotron4`.
+- `LLMB_WORKLOAD`: Workload-specific directory, e.g. `${LLMB_INSTALL}/workloads/pretrain_llama3.1`.
 - Results, logs, and checkpoints are stored under subfolders of `LLMB_WORKLOAD` (see below).
 
 **Example structure:**
@@ -193,7 +205,7 @@ $LLMB_INSTALL/
 ├── datasets/
 ├── venvs/
 └── workloads/
-    └── pretrain_nemotron4/ # <- $LLMB_WORKLOAD
+    └── pretrain_llama3.1/ # <- $LLMB_WORKLOAD
         ├── NeMo/
         ├── ...
         └── experiments/
@@ -218,77 +230,79 @@ The following tables list each benchmark used to evaluate the model's performanc
 
 ### GB300 Workloads
 
-| Type | Framework | Model | Container Version | Model Size | Scale (# of GPUs) | Precision | Model Access Required | Checkpointing | Cluster Type |
-| :------------: | :-------------: | :-----------------------------------------------------------: | :---------------: | :--------: | :---------------: | :--------: | :-------------------: | :-----------: | :----------: |
-| Pretrain | Megatron-Bridge | [GPT OSS 120B](gpt-oss/pretrain/README.md) | 26.02.01 | 120B | 64-512 | BF16 | No | No | Slurm |
-| Pretrain | Megatron-Bridge | [DeepSeek V3](deepseek_v3/pretrain/megatron_bridge/README.md) | 26.02.01 | 671B | 128-512 | FP8, BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.01 | 405B | 256-512 | NVFP4, FP8 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.01 | 70B | 64-512 | NVFP4, FP8 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.01 | 8B | 8-128 | NVFP4, FP8 | Yes | Yes | Slurm |
-| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.02.00 | 235B | 256-512 | BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.02.00 | 30B | 8-64 | BF16 | Yes | No | Slurm |
-| Pretrain | NeMo | [Nemotron4](nemotron4-15b/README.md) | 25.09.00 | 15B | 16-256 | FP8, BF16 | No | Yes | Slurm |
-| Pretrain | NeMo | [Nemotron4](nemotron4-340b/README.md) | 25.09.00 | 340B | 128-512 | FP8, BF16 | No | Yes | Slurm |
-| Pretrain | NeMo | [Grok1](grok1/README.md) | 25.09.00 | 314B | 128-512 | FP8, BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Nemotron-H](nemotron-h/README.md) | 26.02.01 | 56B | 32-512 | FP8 | No | No | Slurm |
-| Finetune | Megatron-Bridge | [Llama 3](llama3/finetune/README.md) | 26.02.01 | 70B | 8-16 | FP8, BF16 | Yes | No | Slurm |
-| Microbenchmark | TRT-LLM | [GPT-OSS](microbenchmarks/cpu_overhead/README.md) | 1.1.0rc5 | 120B | 1-4 | MXFP4 | Yes | No | Slurm |
+| Type | Framework | Model | Container Version | Model Size | Scale (# of GPUs) | Precision | Model Access Required | Checkpointing | Cluster Type |
+| :------------: | :-------------: | :-----------------------------------------------------------: | :---------------: | :--------: | :---------------: | :--------------: | :-------------------: | :-----------: | :----------: |
+| Pretrain | Megatron-Bridge | [GPT OSS 120B](gpt-oss/pretrain/README.md) | 26.02.01 | 120B | 64-512 | BF16 | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [DeepSeek V3](deepseek_v3/pretrain/megatron_bridge/README.md) | 26.04.00 | 671B | 128-512 | NVFP4, FP8, BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 405B | 256-512 | NVFP4, FP8 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 70B | 64-512 | NVFP4, FP8 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 8B | 8-128 | NVFP4, FP8 | Yes | Yes | Slurm |
+| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.04.00 | 235B | 256-512 | BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.04.00 | 30B | 8-64 | BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron-H](nemotron-h/README.md) | 26.02.01 | 56B | 32-512 | FP8 | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Kimi-K2](kimi-k2/README.md) | 26.04.00 | 1T | 256-512 | FP8 (MX) | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron 3 Nano](nemotron3/README.md) | 26.04.00 | 30B | 8-64 | FP8, BF16 | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron 3 Super](nemotron3/README.md) | 26.04.00 | 120B | 64-512 | NVFP4, FP8, BF16 | No | No | Slurm |
+| Finetune | Megatron-Bridge | [Llama 3](llama3/finetune/README.md) | 26.02.01 | 70B | 8-16 | FP8, BF16 | Yes | No | Slurm |
+| Microbenchmark | TRT-LLM | [GPT-OSS](microbenchmarks/cpu_overhead/README.md) | 1.1.0rc5 | 120B | 1-4 | MXFP4 | Yes | No | Slurm |
 
 ### GB200 Workloads
 
-| Type | Framework | Model | Container Version | Model Size | Scale (# of GPUs) | Precision | Model Access Required | Checkpointing | Cluster Type |
-| :------------: | :--------------: | :-----------------------------------------------------------: | :----------------: | :--------: | :---------------: | :--------: | :-------------------: | :-----------: | :----------: |
-| Pretrain | Megatron-Bridge | [GPT OSS 120B](gpt-oss/pretrain/README.md) | 26.02.01 | 120B | 64-512 | BF16 | No | No | Slurm |
-| Pretrain | NeMo | [Nemotron4](nemotron4-15b/README.md) | 25.09.00 | 15B | 16-256 | FP8, BF16 | No | Yes | Slurm |
-| Pretrain | NeMo | [Nemotron4](nemotron4-340b/README.md) | 25.07.01 | 340B | 128-512 | FP8, BF16 | No | Yes | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.01 | 405B | 256-512 | NVFP4, FP8 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.01 | 70B | 64-512 | FP8 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.01 | 8B | 8-128 | NVFP4, FP8 | Yes | Yes | Slurm |
-| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.02.01 | 235B | 256-512 | BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.02.01 | 30B | 8-64 | BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [DeepSeek V3](deepseek_v3/pretrain/megatron_bridge/README.md) | 26.02.01 | 671B | 256-512 | FP8, BF16 | Yes | No | Slurm |
-| Pretrain | TorchTitan | [DeepSeek V3](deepseek_v3/pretrain/torchtitan/README.md) | 25.12-py3 | 671B | 256 | FP8, BF16 | Yes | No | Slurm |
-| Pretrain | NeMo | [Grok1](grok1/README.md) | 25.09.00 | 314B | 128-512 | FP8, BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Nemotron-H](nemotron-h/README.md) | 26.02.01 | 56B | 32-512 | FP8 | No | No | Slurm |
-| Finetune | Megatron-Bridge | [Llama 3](llama3/finetune/README.md) | 26.02.01 | 70B | 8-16 | FP8, BF16 | Yes | No | Slurm |
-| Inference | TRT-LLM | [DeepSeek R1](deepseek_r1/inference/trtllm/README.md) | 1.1.0rc5 | 671B | 4 | NVFP4 | No | No | Slurm |
-| Inference | Dynamo | [DeepSeek R1](deepseek_r1/inference/dynamo/README.md) | 0.6.1 | 671B | 32 | NVFP4 | No | No | Slurm |
-| Inference | SGLang | [DeepSeek R1](deepseek_r1/inference/sglang/README.md) | v0.5.3-cu129-gb200 | 671B | 4 | NVFP4 | No | No | Slurm |
-| Inference | TRT-LLM | [Llama 3.3](llama3.3/inference/README.md) | 1.1.0rc5 | 70B | 1-4 | NVFP4 | Yes | No | Slurm |
-| Inference | Dynamo + TRT-LLM | [GPT-OSS Inference](gpt-oss/inference/k8s/README.md) | 0.5.1-rc0.pre3 | 120B | 4+ | MXFP4 | No | No | Kubernetes |
-| Inference | Dynamo + TRT-LLM | [GPT-OSS](gpt-oss/inference/slurm/README.md) | 0.5.1-rc0.pre3 | 120B | 4 | MXFP4 | No | No | Slurm |
-| Microbenchmark | TRT-LLM | [GPT-OSS](microbenchmarks/cpu_overhead/README.md) | 1.1.0rc5 | 120B | 1-4 | MXFP4 | Yes | No | Slurm |
+| Type | Framework | Model | Container Version | Model Size | Scale (# of GPUs) | Precision | Model Access Required | Checkpointing | Cluster Type |
+| :------------: | :--------------: | :-----------------------------------------------------------: | :----------------: | :--------: | :---------------: | :--------------: | :-------------------: | :-----------: | :----------: |
+| Pretrain | Megatron-Bridge | [GPT OSS 120B](gpt-oss/pretrain/README.md) | 26.02.01 | 120B | 64-512 | BF16 | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 405B | 256-512 | NVFP4, FP8 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 70B | 64-512 | NVFP4, FP8 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 8B | 8-128 | NVFP4, FP8 | Yes | Yes | Slurm |
+| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.04.00 | 235B | 256-512 | BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.04.00 | 30B | 8-64 | BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [DeepSeek V3](deepseek_v3/pretrain/megatron_bridge/README.md) | 26.04.00 | 671B | 256-512 | NVFP4, FP8, BF16 | Yes | No | Slurm |
+| Pretrain | TorchTitan | [DeepSeek V3](deepseek_v3/pretrain/torchtitan/README.md) | 25.12-py3 | 671B | 256 | FP8, BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron-H](nemotron-h/README.md) | 26.02.01 | 56B | 32-512 | FP8 | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Kimi-K2](kimi-k2/README.md) | 26.04.00 | 1T | 256-512 | FP8 (MX) | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron 3 Nano](nemotron3/README.md) | 26.04.00 | 30B | 8-64 | BF16 | No | No | Slurm |
+| Finetune | Megatron-Bridge | [Llama 3](llama3/finetune/README.md) | 26.02.01 | 70B | 8-16 | FP8, BF16 | Yes | No | Slurm |
+| Inference | TRT-LLM | [DeepSeek R1](deepseek_r1/inference/trtllm/README.md) | 1.1.0rc5 | 671B | 4 | NVFP4 | No | No | Slurm |
+| Inference | Dynamo | [DeepSeek R1](deepseek_r1/inference/dynamo/README.md) | 0.6.1 | 671B | 32 | NVFP4 | No | No | Slurm |
+| Inference | SGLang | [DeepSeek R1](deepseek_r1/inference/sglang/README.md) | v0.5.3-cu129-gb200 | 671B | 4 | NVFP4 | No | No | Slurm |
+| Inference | TRT-LLM | [Llama 3.3](llama3.3/inference/README.md) | 1.1.0rc5 | 70B | 1-4 | NVFP4 | Yes | No | Slurm |
+| Inference | Dynamo + TRT-LLM | [GPT-OSS Inference](gpt-oss/inference/k8s/README.md) | 0.5.1-rc0.pre3 | 120B | 4+ | MXFP4 | No | No | Kubernetes |
+| Inference | Dynamo + TRT-LLM | [GPT-OSS](gpt-oss/inference/slurm/README.md) | 0.5.1-rc0.pre3 | 120B | 4 | MXFP4 | No | No | Slurm |
+| Microbenchmark | TRT-LLM | [GPT-OSS](microbenchmarks/cpu_overhead/README.md) | 1.1.0rc5 | 120B | 1-4 | MXFP4 | Yes | No | Slurm |
 
 ### B300 Workloads
 
-| Type | Framework | Model | Container Version | Model Size | Scale (# of GPUs) | Precision | Model Access Required | Checkpointing | Cluster Type |
-| :------------: | :-------------: | :-----------------------------------------------------------: | :---------------: | :--------: | :---------------: | :-------: | :-------------------: | :-----------: | :----------: |
-| Pretrain | Megatron-Bridge | [GPT OSS 120B](gpt-oss/pretrain/README.md) | 26.02.01 | 120B | 64-512 | BF16 | No | No | Slurm |
-| Pretrain | Megatron-Bridge | [DeepSeek V3](deepseek_v3/pretrain/megatron_bridge/README.md) | 26.02.01 | 671B | 128-512 | BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.01 | 405B | 256-512 | FP8 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.01 | 70B | 64-512 | FP8 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.02.01 | 235B | 256-512 | BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.02.01 | 30B | 8-64 | BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Nemotron-H](nemotron-h/README.md) | 26.02.01 | 56B | 32-512 | FP8 | No | No | Slurm |
-| Finetune | Megatron-Bridge | [Llama 3](llama3/finetune/README.md) | 26.02.01 | 70B | 8-16 | FP8, BF16 | Yes | No | Slurm |
-| Microbenchmark | TRT-LLM | [GPT-OSS](microbenchmarks/cpu_overhead/README.md) | 1.1.0rc5 | 120B | 1-4 | MXFP4 | Yes | No | Slurm |
+| Type | Framework | Model | Container Version | Model Size | Scale (# of GPUs) | Precision | Model Access Required | Checkpointing | Cluster Type |
+| :------------: | :-------------: | :-----------------------------------------------------------: | :---------------: | :--------: | :---------------: | :--------: | :-------------------: | :-----------: | :----------: |
+| Pretrain | Megatron-Bridge | [GPT OSS 120B](gpt-oss/pretrain/README.md) | 26.02.01 | 120B | 64-512 | BF16 | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 405B | 256-512 | NVFP4, FP8 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 70B | 64-512 | NVFP4, FP8 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 8B | 8-128 | NVFP4, FP8 | Yes | Yes | Slurm |
+| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.04.00 | 235B | 256-512 | BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.04.00 | 30B | 8-64 | BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [DeepSeek V3](deepseek_v3/pretrain/megatron_bridge/README.md) | 26.02.01 | 671B | 128-512 | BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron-H](nemotron-h/README.md) | 26.02.01 | 56B | 32-512 | FP8 | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron 3 Nano](nemotron3/README.md) | 26.04.00 | 30B | 8-64 | FP8, BF16 | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron 3 Super](nemotron3/README.md) | 26.04.00 | 120B | 64-512 | BF16 | No | No | Slurm |
+| Finetune | Megatron-Bridge | [Llama 3](llama3/finetune/README.md) | 26.02.01 | 70B | 8-16 | FP8, BF16 | Yes | No | Slurm |
+| Microbenchmark | TRT-LLM | [GPT-OSS](microbenchmarks/cpu_overhead/README.md) | 1.1.0rc5 | 120B | 1-4 | MXFP4 | Yes | No | Slurm |
 
 ### B200 Workloads
 
 | Type | Framework | Model | Container Version | Model Size | Scale (# of GPUs) | Precision | Model Access Required | Checkpointing | Cluster Type |
 | :------------: | :--------------: | :-----------------------------------------------------------: | :------------------: | :--------: | :---------------: | :--------: | :-------------------: | :-----------: | :----------: |
 | Pretrain | Megatron-Bridge | [GPT OSS 120B](gpt-oss/pretrain/README.md) | 26.02.01 | 120B | 64-512 | BF16 | No | No | Slurm |
-| Pretrain | NeMo | [Nemotron4](nemotron4-15b/README.md) | 25.09.00 | 15B | 16-256 | FP8, BF16 | No | Yes | Slurm |
-| Pretrain | NeMo | [Nemotron4](nemotron4-340b/README.md) | 25.07.01 | 340B | 128-1024 | FP8, BF16 | No | Yes | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.00 | 405B | 256-1024 | NVFP4, FP8 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.00 | 70B | 64-1024 | NVFP4, FP8 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.00 | 8B | 8-128 | NVFP4, FP8 | Yes | Yes | Slurm |
-| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.02.01 | 235B | 256-512 | BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.02.01 | 30B | 8-64 | BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 405B | 256-512 | NVFP4, FP8 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 70B | 64-512 | NVFP4, FP8 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 8B | 8-128 | NVFP4, FP8 | Yes | Yes | Slurm |
+| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.04.00 | 235B | 256-512 | BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.04.00 | 30B | 8-64 | BF16 | Yes | No | Slurm |
 | Pretrain | Megatron-Bridge | [DeepSeek V3](deepseek_v3/pretrain/megatron_bridge/README.md) | 26.02.01 | 671B | 256-512 | FP8, BF16 | Yes | No | Slurm |
 | Pretrain | TorchTitan | [DeepSeek V3](deepseek_v3/pretrain/torchtitan/README.md) | 25.12-py3 | 671B | 256 | FP8, BF16 | Yes | No | Slurm |
-| Pretrain | NeMo | [Grok1](grok1/README.md) | 25.09.00 | 314B | 256-1024 | FP8, BF16 | Yes | No | Slurm |
 | Pretrain | Megatron-Bridge | [Nemotron-H](nemotron-h/README.md) | 26.02.01 | 56B | 32-512 | FP8 | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Kimi-K2](kimi-k2/README.md) | 26.04.00 | 1T | 256-512 | FP8 (MX) | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron 3 Nano](nemotron3/README.md) | 26.04.00 | 30B | 8-64 | FP8, BF16 | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron 3 Super](nemotron3/README.md) | 26.04.00 | 120B | 64-512 | FP8, BF16 | No | No | Slurm |
 | Finetune | Megatron-Bridge | [Llama 3](llama3/finetune/README.md) | 26.02.01 | 70B | 8-16 | FP8, BF16 | Yes | No | Slurm |
 | Inference | TRT-LLM | [DeepSeek R1](deepseek_r1/inference/trtllm/README.md) | 1.1.0rc5 | 671B | 4 | NVFP4 | No | No | Slurm |
 | Inference | Dynamo | [DeepSeek R1](deepseek_r1/inference/dynamo/README.md) | 0.6.1 | 671B | 32 | NVFP4 | No | No | Slurm |
@@ -304,19 +318,17 @@ Baseline performance metrics were collected using workloads on the NVIDIA DGX H1
 | Type | Framework | Model | Container Version | Model Size | Scale (# of GPUs) | Precision | Model Access Required | Checkpointing | Cluster Type |
 | :------------: | :-------------: | :-----------------------------------------------------------: | :---------------: | :--------: | :---------------: | :-------: | :-------------------: | :-----------: | :----------: |
 | Pretrain | Megatron-Bridge | [GPT OSS 120B](gpt-oss/pretrain/README.md) | 26.02.01 | 120B | 64-1024 | BF16 | No | No | Slurm |
-| Pretrain | NeMo | [Nemotron4](nemotron4-15b/README.md) | 25.09.00 | 15B | 16-256 | FP8, BF16 | No | Yes | Slurm |
-| Pretrain | NeMo | [Nemotron4](nemotron4-340b/README.md) | 25.09.00 | 340B | 256-2048 | FP8, BF16 | No | Yes | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.01 | 405B | 1024 | FP8, BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.01 | 70B | 64-1024 | FP8, BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.02.01 | 8B | 8-128 | FP8, BF16 | Yes | Yes | Slurm |
-| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.02.01 | 235B | 256-512 | BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.02.01 | 30B | 16-64 | BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 405B | 1024 | FP8, BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 70B | 64-512 | FP8, BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Llama 3.1](llama3.1/README.md) | 26.04.00 | 8B | 8-128 | FP8, BF16 | Yes | Yes | Slurm |
+| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.04.00 | 235B | 256-512 | BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Qwen3](qwen3/pretrain/README.md) | 26.04.00 | 30B | 16-64 | BF16 | Yes | No | Slurm |
 | Pretrain | Megatron-Bridge | [DeepSeek V3](deepseek_v3/pretrain/megatron_bridge/README.md) | 25.09.00 | 671B | 512-1024 | FP8 | Yes | No | Slurm |
 | Pretrain | Megatron-Bridge | [DeepSeek V3](deepseek_v3/pretrain/megatron_bridge/README.md) | 25.09.00 | 671B | 1024 | BF16 | Yes | No | Slurm |
 | Pretrain | TorchTitan | [DeepSeek V3](deepseek_v3/pretrain/torchtitan/README.md) | 25.12-py3 | 671B | 512-1024 | BF16 | Yes | No | Slurm |
-| Pretrain | NeMo | [Grok1](grok1/README.md) | 25.09.00 | 314B | 512-2048 | FP8, BF16 | Yes | No | Slurm |
-| Pretrain | Megatron-Bridge | [Nemotron-H](nemotron-h/README.md) | 26.02.01 | 56B | 32-1024 | FP8 | No | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron-H](nemotron-h/README.md) | 26.02.01 | 56B | 32-512 | FP8 | No | No | Slurm |
 | Finetune | Megatron-Bridge | [Llama 3](llama3/finetune/README.md) | 26.02.01 | 70B | 8-16 | FP8, BF16 | Yes | No | Slurm |
+| Pretrain | Megatron-Bridge | [Nemotron 3 Nano](nemotron3/README.md) | 26.04.00 | 30B | 16-64 | FP8, BF16 | No | No | Slurm |
 | Inference | TRT-LLM | [DeepSeek R1](deepseek_r1/inference/trtllm/README.md) | 1.1.0rc5 | 671B | 16 | FP8 | No | No | Slurm |
 | Inference | Dynamo | [DeepSeek R1](deepseek_r1/inference/dynamo/README.md) | 0.6.1 | 671B | 48 | FP8 | No | No | Slurm |
 | Inference | TRT-LLM | [Llama 3.3](llama3.3/inference/README.md) | 1.1.0rc5 | 70B | 2 | FP8 | Yes | No | Slurm |
@@ -339,6 +351,9 @@ Baseline performance metrics were collected using workloads on the NVIDIA DGX H1
 | Inference | NIM, SGLang | DeepSeek R1 | 1.7.2 | 671B | 16 | FP8 | No | No | Slurm | 25.08 |
 | Inference | NIM & NeMo Retriever (NVIDIA Enterprise RAG) | Llama 3.1 and 3.2 | instruct:1.3.3, rerank:1.3, embed:1.3.1 | 70b, 1b | 1-8 | N/A | Yes | No | Slurm | 25.08 |
 | Inference | TRT-LLM | Llama 4 | 1.0.0rc1 | 17b | 8 | FP8 | Yes | No | Slurm | 25.08 |
+| Pretrain | NeMo | Nemotron4 15B | 25.09.00 | 15B | 16-256 | FP8, BF16 | No | Yes | Slurm | 26.02.01 |
+| Pretrain | NeMo | Nemotron4 340B | 25.09.00 | 340B | 128-2048 | FP8, BF16 | No | Yes | Slurm | 26.02.01 |
+| Pretrain | NeMo | Grok1 | 25.09.00 | 314B | 128-2048 | FP8, BF16 | Yes | No | Slurm | 26.02.01 |
 
 ## Model Access Requirements
 
@@ -348,21 +363,21 @@ Some recipes additionally require approval for gated model repositories. In thos
 
 **Note:** approval processes are not immediate and may take some time.
 
-| Recipe Type | Recipe Name | HF Token Required | Additional Approval Required | Details/Link for Approval |
-| :------------- | :----------- | :---------------- | :--------------------------- | :-------------------------------------------------------------------------------------------------------- |
-| Pretrain | GPT OSS 120B | Yes | No | [HuggingFace GPT OSS 120B](https://huggingface.co/openai/gpt-oss-120b) |
-| Pretrain | Llama 3.1 | Yes | Yes | [HuggingFace Llama 3.1](https://huggingface.co/meta-llama/Llama-3.1-405B) |
-| Pretrain | DeepSeek V3 | Yes | No | N/A |
-| Pretrain | Grok1 | Yes | Yes | Grok1 recipe uses the [HuggingFace Llama 3](https://huggingface.co/meta-llama/Meta-Llama-3-70B) tokenizer |
-| Pretrain | Nemotron4 | Yes | No | N/A |
-| Pretrain | Qwen3 235B | Yes | No | [HuggingFace Qwen3 235B](https://huggingface.co/Qwen/Qwen3-235B-A22B) |
-| Pretrain | Qwen3 30B | Yes | No | [HuggingFace Qwen3 30B](https://huggingface.co/Qwen/Qwen3-30B-A3B) |
-| Pretrain | Nemotron-H | No | No | N/A |
-| Finetune | Llama 3 | Yes | Yes | [HuggingFace Llama 3 70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B) |
-| Inference | Llama 3.3 | Yes | Yes | [HuggingFace Llama 3.3 70B Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
-| Inference | DeepSeek R1 | Yes | No | N/A |
-| Inference | GPT-OSS | Yes | No | [HuggingFace GPT OSS 120B](https://huggingface.co/openai/gpt-oss-120b) |
-| Microbenchmark | CPU overhead | Yes | No | [HuggingFace GPT-OSS-120B](https://huggingface.co/openai/gpt-oss-120b) |
+| Recipe Type | Recipe Name | HF Token Required | Additional Approval Required | Details/Link for Approval |
+| :------------- | :--------------- | :---------------- | :--------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| Pretrain | GPT OSS 120B | Yes | No | [HuggingFace GPT OSS 120B](https://huggingface.co/openai/gpt-oss-120b) |
+| Pretrain | Llama 3.1 405B | Yes | Yes | [HuggingFace Llama 3.1 405B](https://huggingface.co/meta-llama/Llama-3.1-405B) |
+| Pretrain | Llama 3.1 8B/70B | Yes | Yes | [HuggingFace Llama 3 70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B) or [HuggingFace Llama 3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B); either grants Llama 3 family access |
+| Pretrain | DeepSeek V3 | Yes | No | N/A |
+| Pretrain | Qwen3 235B | Yes | No | [HuggingFace Qwen3 235B](https://huggingface.co/Qwen/Qwen3-235B-A22B) |
+| Pretrain | Qwen3 30B | Yes | No | [HuggingFace Qwen3 30B](https://huggingface.co/Qwen/Qwen3-30B-A3B) |
+| Pretrain | Nemotron-H | No | No | N/A |
+| Pretrain | Kimi-K2 | No | No | N/A |
+| Finetune | Llama 3 | Yes | Yes | [HuggingFace Llama 3 70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B) |
+| Inference | Llama 3.3 | Yes | Yes | [HuggingFace Llama 3.3 70B Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
+| Inference | DeepSeek R1 | Yes | No | N/A |
+| Inference | GPT-OSS | Yes | No | [HuggingFace GPT OSS 120B](https://huggingface.co/openai/gpt-oss-120b) |
+| Microbenchmark | CPU overhead | Yes | No | [HuggingFace GPT-OSS-120B](https://huggingface.co/openai/gpt-oss-120b) |
 
 # Reference Infrastructure
 
@@ -560,65 +575,34 @@ These usually mean that one of the GPUs is hanging. Possible resolutions:
 - re-running the job on a different set of nodes
 - rebooting affected nodes.
 
-## 2. Slurm job failed, need to find log files
+## 2. Slurm job failed, need to inspect logs
 
 ### Symptom
 
-A Slurm job failed during benchmark run. E.g., a nemotron benchmark job with ID=2041792 failed
-
-```
-sacct -j 2041792
-JobID JobName Partition Account AllocCPUS State ExitCode
------------- ---------- ---------- ---------- ---------- ---------- --------
-2041792 launch.sh batch test 224 FAILED 1:0
-2041792.bat+ batch test 224 FAILED 1:0
-2041792.ext+ extern test 224 COMPLETED 0:0
-2041792.0 bash test 224 FAILED 1:0
-```
+A benchmark job failed or needs inspection.
 
 ### Solution
 
-#### NeMo2 (e.g., Nemotron4)
-
-You can find log files associated with this run under `$LLMB_WORKLOAD/experiments/pretrain_nemotron4____` folder. The folder will have subfolders that will contain `log-account.pretrain_nemotron4____.out` files with a root cause error message.
-
-E.g., for the job failure above and assuming the nemotron 15b job ran on 16 GPUs, used version 25.05, and with precision bf16 the path will be under `$LLMB_WORKLOAD/experiments/pretrain_nemotron4_15b_bf16_gpus16_tp1_pp1_cp1_vp1_mbs2_gbs64/...`
-
-Search for errors in the `log-account.pretrain_nemotron4_15b_bf16_gpus16_tp1_pp1_cp1_vp1_mbs2_gbs64_3358926_0.out` file.
-
-## 3. Unable to use venv required by benchmark
-
-### Symptom
-
-If a benchmark requires virtual python environment (venv) but `virtualenv` executable isn't available on the login node and/or login nodes cannot be updated by non-sudo users, you would see errors like below when trying to setup venv
+From `$LLMB_INSTALL`, list jobs and find the Slurm job ID:
 
-```shell
-bash-5.2$ virtualenv
-bash: virtualenv: command not found
+```bash
+cd $LLMB_INSTALL
+llmb-run jobs
 ```
 
-### Solution
-
-There are alternative virtual environment options available like **conda**.
- -To install and activate conda virtual environment +Then show the active log: -```shell -# pick INSTALL_PATH with sufficient disk space -INSTALL_PATH=~ -wget -q https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O $INSTALL_PATH/miniconda.sh -bash $INSTALL_PATH/miniconda.sh -b -p $INSTALL_PATH/miniconda3 -$INSTALL_PATH/miniconda3/bin/conda init -source ~/.bashrc +```bash +llmb-run jobs log ``` -When you are finished running this benchmark you can deactivate the environment, run this command +By default, this prints the last 200 lines, not the full file. Use `--tail ` for more lines, `--follow` for running jobs, or `--path` to print the active log file path. -```shell -conda deactivate -``` +If `llmb-run` cannot find the job, run `llmb-run jobs rebuild` once to scan older submissions, then retry the log command. -## 4. NCCL InfiniBand QPS tuning +See the [llmb-run jobs command reference](cli/llmb-run/README.md#jobs-command) for the full command list and options. + +## 3. NCCL InfiniBand QPS tuning Some recipes set `NCCL_IB_QPS_PER_CONNECTION=4` by default. This controls the number of InfiniBand queue pairs NCCL uses per connection and can improve multi-node communication performance on certain cluster configurations. @@ -634,12 +618,12 @@ environment: **Option B** — Pass it inline when submitting a single job: ```bash -NCCL_IB_QPS_PER_CONNECTION=4 llmb-run submit -w -s --dtype --scale +NCCL_IB_QPS_PER_CONNECTION=4 llmb-run submit -w -s --dtype --scale ``` > **Note:** The optimal value may vary by cluster and workload. If you experience communication errors or degraded performance after changing this setting, try removing it or adjusting the value. -## 5. Why do I see Llama-3 downloads or pretrain_llama3 log names when using the llama3.1 recipe? +## 4. Why do I see Llama-3 downloads or pretrain_llama3 log names when using the llama3.1 recipe? The pretrain_llama3.1 workload is the user-facing recipe for 8B, 70B, and 405B. 
Internally, the 8B and 70B sizes reuse existing Megatron-Bridge llama3 configs instead of duplicating them under a separate llama3.1 name. As a result, setup output for 8B/70B may show Meta-Llama-3-\*, and experiment or log names may use the pretrain_llama3 prefix. This is expected and does not mean the wrong workload or model size was selected. @@ -681,15 +665,17 @@ Some workloads complete all timesteps but print errors during the cleanup phase. We now detect this case and convert the exit code so Slurm reports success when the run actually finished. Log files will still contain the cleanup errors. If the job completed all timesteps and Slurm shows COMPLETED, you can ignore cleanup errors in the logs. This will be fixed in a future release. -## 3. uv 0.9.29+ breaks all recipes that use nemo_run +## 3. DeepSeek V3 Megatron-Bridge on H100 requires uv \<=0.9.28 ### Issue -Nearly every recipe installs `nemo_run` and will fail with `uv` `0.9.29+` due to uv rejecting unknown fields in `pyproject.toml` files. +DeepSeek V3 Megatron-Bridge on H100 uses NeMo `25.09.00` and requires `uv <=0.9.28` during setup. Newer uv versions reject fields used by this recipe's `pyproject.toml` files. + +This does not affect other non-deprecated Megatron-Bridge recipes in this release. ### Workaround -Run `./install.sh` from this release. It enforces `uv <=0.9.28`, which avoids the strict parser breakage. +Run `./install.sh`; it selects a compatible uv version. For manual DeepSeek V3 H100 setup, use `uv <=0.9.28`. ## 4. NeMo 26.02.00 container EFA library conflict @@ -741,30 +727,29 @@ The NeMo `26.02.01` container fixes the NeMo `26.02.00` EFA library conflict abo - **DeepSeek V3 Megatron-Bridge on H100:** Not supported on EFA. The H100 recipe uses NeMo `25.09.00` and still has NVSHMEM/EFA initialization issues. - **DeepSeek V3 TorchTitan:** Not validated on EFA. The recipe uses PyTorch `25.12-py3` and has unresolved NVSHMEM/EFA issues. - **Qwen3 30B on H100:** Not supported on EFA. 
The H100 configuration uses EP=16, which requires expert-parallel communication between nodes over EFA and exposes the Megatron-Bridge EP communication issue tracked in [Megatron-Bridge #3343](https://github.com/NVIDIA-NeMo/Megatron-Bridge/issues/3343). -- **Grok1 and Nemotron4:** EFA failures have been observed with the older NeMo containers used by these recipes (`25.09.00` or `25.07.01`, depending on GPU type). If EFA failures occur, update the container with current NCCL, EFA, and AWS OFI NCCL packages. See the [AWS CSP section](#aws) for EFA update references. - **Qwen3 235B:** Supported on GB300/GB200 systems. H100 EFA is not validated in this release. -## 6. B300 PCT fixed-core binding for certain GNR systems +## 6. Priority Core Turbo fixed-core binding for Granite Rapids systems ### Issue The current Megatron-Bridge launch configuration does not include the fixed-core CPU binding (`-C $((SLURM_LOCALID * 16)),...`) used on the B300 reference configuration. Instead, it binds processes at the NUMA-node level only. -This is intentional as the general default: on B300 systems where Intel Granite Rapids (GNR) PCT is not available or not enabled, forcing this stricter binding can hurt performance or break recipes. However, on the small subset of GNR processors that support PCT, and only when PCT is enabled, restoring this fixed-core binding can provide the best performance for recipes like Qwen. +This is intentional as the general default. Priority Core Turbo (PCT) is a turbo-frequency capability on some Intel Xeon 6900/6700-series Granite Rapids processors that lets a small number of high-priority CPU cores run at elevated turbo frequency while lower-priority cores run at a reduced frequency. It is separate from Intel's broader Performance-core (P-core) and Efficient-core (E-core) processor-family terminology. 
The patch below matches the fixed-core binding used by the B300 reference configuration, but the underlying requirement is the host CPU's PCT configuration rather than the GPU model. -We refer to this as the "B300" pinning configuration because it matches the B300 reference configuration, but it is a CPU-platform-specific optimization rather than a B300 GPU feature. +Only use this tuning on clusters where your administrator has confirmed that the processors support PCT, that PCT is enabled, and that the high-priority core IDs match the reference binding pattern used by the patch. On GNR systems without PCT, on systems where PCT is disabled, or on systems with a different PCT core layout, forcing this stricter binding can hurt performance or break recipes. It is also workload dependent: Qwen3 benefits on the validated B300 reference configuration, and some additional workloads such as Nemotron3 may benefit on some systems, but this should not be treated as a blanket recommendation for every workload. ### Workaround -A patch file is provided at `qwen3/pretrain/b300_numa_cpu_pinning.patch` to restore this fixed-core binding. Apply it only if PCT is available and enabled on your system; in that case it will likely provide the best performance for recipes like Qwen. Do not apply it on systems without PCT. +A patch file is provided at `common/b300_numa_cpu_pinning.patch` to restore the fixed-core binding used by the B300 reference configuration. -The example below patches the Qwen3 pretrain workload only. Each workload has its own `Megatron-Bridge` checkout, so if you want the same change for another recipe you must apply an equivalent patch in that workload's `Megatron-Bridge` directory as well. +The example below patches the Qwen3 pretrain workload only. Each workload has its own `Megatron-Bridge` checkout, so if you want to test the same change for another recipe, apply the patch in that workload's `Megatron-Bridge` directory and compare performance before keeping it. 
-Apply the Qwen3 patch from the root of that workload's Megatron-Bridge installation: +Apply the patch from the root of the Qwen3 workload's Megatron-Bridge installation: ```bash cd $LLMB_INSTALL/workloads/pretrain_qwen3/Megatron-Bridge -git apply $LLMB_INSTALL/llmb_repo/qwen3/pretrain/b300_numa_cpu_pinning.patch +git apply $LLMB_INSTALL/llmb_repo/common/b300_numa_cpu_pinning.patch ``` # Support diff --git a/cli/llmb-install/CHANGELOG.md b/cli/llmb-install/CHANGELOG.md index f00e6a7..54f86be 100644 --- a/cli/llmb-install/CHANGELOG.md +++ b/cli/llmb-install/CHANGELOG.md @@ -6,6 +6,32 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/). This project uses [PEP 440](https://www.python.org/dev/peps/pep-0440/) versioning with semantic versioning semantics: **MAJOR.MINOR.PATCH** for feature parity with [SemVer](https://semver.org/). +## [1.9.1] - 2026-05-04 + +### Fixed + +- Accept trillion-parameter (`t`) model-size suffixes when selecting Exemplar Cloud workloads from `exemplar.yaml`. + +## [1.9.0] - 2026-04-24 + +### Changed + +- Fresh interactive and express installs now create recipe environments with `uv` without prompting for an environment type; existing conda/venv configs remain supported for resume, incremental, and headless compatibility. + +### Removed + +- Removed unsupported legacy `setup_script` installation support; recipes should use `setup.tasks`. + +### Fixed + +- Workloads without Python dependencies no longer run through the legacy scripted-install path or emit deprecation warnings. + +## [1.8.7] - 2026-04-01 + +### Changed + +- Migrated config models (`InstallConfig`, `SlurmConfig`, `SystemConfig`) from dataclasses to pydantic v2; field types are now enforced at construction (e.g. `Literal` for `venv_type`, `gpu_type` validated against `SUPPORTED_GPU_TYPES`). 
+ ## [1.8.6] - 2026-03-31 ### Fixed diff --git a/cli/llmb-install/README.md b/cli/llmb-install/README.md index 0c989f2..88fd183 100644 --- a/cli/llmb-install/README.md +++ b/cli/llmb-install/README.md @@ -30,16 +30,9 @@ uv tool install $LLMB_REPO/cli/llmb-install #### Option 2: Install as a Package (pip) -It is recommended to run installer in a virtual environment (uv, conda or venv with python 3.12.x). The installer has been tested with these three environment types; other solutions may work but are not officially supported. Make sure to have the environment activated -before running commands below. - -The installer supports multiple Python environment types with automatic detection and preference ordering: - -1. **UV** (Recommended) - Modern Python package manager with fast dependency resolution -2. **System venv** - Python 3.12+ virtual environments using system Python -3. **Conda** - Anaconda/Miniconda environments - -The installer will automatically detect available options and guide you through selection. No pre-activation required. +It is recommended to run the installer in a virtual environment with Python 3.12.x. +The top-level installer can run from an existing uv, venv, or conda environment, +but newly-created recipe environments use `uv`. ```bash # Install installer dependencies @@ -58,7 +51,7 @@ The installer will guide you through an interactive setup process covering: - Installation location selection - SLURM cluster configuration - Node architecture (x86_64/aarch64) -- Environment type (automatic detection with uv/venv/conda) +- Recipe environment setup with uv - Installation method (local/SLURM) - Workload selection @@ -119,30 +112,22 @@ Express mode uses saved system configuration (SLURM settings, GPU type, image fo ### System Requirements -- **Python**: 3.12+ (for venv support), OR conda/miniconda. `uv` is installed automatically if missing. +- **Python**: 3.12+ for the top-level installer environment. 
`uv` is required for recipe environments and is installed automatically by `install.sh` if missing. - **SLURM**: 22.x or newer with job scheduler access - **Enroot**: For container image management - **Network Access**: Required for downloading container images - **Disk Space**: Substantial space required (see [Storage Requirements](#storage-requirements)) -### Environment Options - -The installer automatically detects and offers available options in preference order: - -1. **UV** (Required): Fast, modern Python package manager - - - Installed automatically by `install.sh` if missing - - Benefits: Faster dependency resolution, automatic Python version management - -2. **System venv**: Uses system Python 3.12+ with venv module - - - Requires: Python 3.12+ with venv support - - Benefits: Standard library solution, no additional tools needed +### Environment Setup -3. **Conda** (Deprecated): Anaconda/Miniconda environments +Fresh recipe environments are created with `uv`. The bootstrap script still +detects an already-active uv, venv, or conda environment for the top-level +`llmb-install` and `llmb-run` tools, but the installer no longer prompts for a +recipe environment manager. - - Requires: conda or miniconda installation - - Benefits: Cross-platform compatibility, scientific package ecosystem +- `uv` is installed automatically by `install.sh` if missing +- Existing conda/venv recipe configs are still supported for resume, incremental, + and headless compatibility ### Python Dependencies @@ -273,26 +258,19 @@ Automatically selecting SLURM-based installation. ### Python Version Compatibility -**Issue**: No compatible environment available +**Issue**: uv is not available ```text -Error: No compatible environment options available. +Error: uv is required to create recipe environments. ``` -**Solutions** (in recommended order): +**Solution**: -1. **Install UV** (Recommended): +1. **Install uv**: ```bash curl -LsSf https://astral.sh/uv/install.sh | sh ``` -2. 
**Install conda/miniconda**: - ```bash - wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh - bash Miniconda3-latest-Linux-x86_64.sh -b -p $CONDA_INSTALL_PATH/miniconda3 - $CONDA_INSTALL_PATH/miniconda3/bin/conda init - source ~/.bashrc - ``` -3. **Upgrade system Python** to 3.12+ with venv support +2. Re-run `./install.sh` so it can install or pin the supported uv version. ### Cache Directory Warnings @@ -368,6 +346,7 @@ llmb-install express --help **Note on image folder**: - **Purpose**: Highly recommended for multi-user or multi-installation setups. Container images are 5-60 GB each and read-only, so sharing saves significant space with no downsides. +- **Requirement**: You need write access to the image folder. - **Persistence**: The image folder path is saved to `~/.config/llmb/system_config.yaml` after successful installation and automatically reused in future installs. - **Override**: Use `-i` flag to override the saved location for a specific installation, or for first time installs. @@ -485,9 +464,9 @@ This project uses `uv` for dependency management and `tox` for multi-environment ### Environment Setup 1. **Install uv**: [Follow official instructions](https://docs.astral.sh/uv/getting-started/installation/). -2. **Sync environment**: Creates a virtualenv and installs dependencies from `uv.lock`. +2. **Sync environment**: Creates a virtualenv and installs runtime plus development dependencies from `uv.lock`. 
```bash - uv sync + uv sync --extra dev ``` ### Managing Dependencies @@ -503,7 +482,7 @@ This project uses `uv` for dependency management and `tox` for multi-environment - **Quick (Current Python)**: ```bash - uv run pytest + uv run --extra dev pytest ``` - **Full Matrix (Multiple Python versions)**: ```bash diff --git a/cli/llmb-install/docs/headless-installation.md b/cli/llmb-install/docs/headless-installation.md index 2a235da..dac1178 100644 --- a/cli/llmb-install/docs/headless-installation.md +++ b/cli/llmb-install/docs/headless-installation.md @@ -81,7 +81,7 @@ llmb-install --play my_config.yaml Complete configuration file structure: ```yaml -venv_type: venv # 'venv', 'conda', or 'uv' +venv_type: uv # 'uv' is recommended; 'venv'/'conda' remain for compatibility install_path: /lustre/user/llmb # Installation directory slurm: account: myaccount diff --git a/cli/llmb-install/docs/recipe_guide.md b/cli/llmb-install/docs/recipe_guide.md index 48ef113..866d734 100644 --- a/cli/llmb-install/docs/recipe_guide.md +++ b/cli/llmb-install/docs/recipe_guide.md @@ -10,7 +10,7 @@ Each workload recipe requires a `metadata.yaml` file that defines: - **Container Images**: Runtime environment containers - **Repositories**: Git repositories for dependencies - **Downloads**: Offline assets (tokenizers, models, datasets) -- **Setup**: Virtual environment and dependency installation +- **Setup**: Optional virtual environment, dependency installation, and setup tasks - **Tools**: Workload-specific tool versions (e.g., nsys) - **Run Configuration**: GPU configs, model sizes, and test scales @@ -35,7 +35,7 @@ tools: # Optional # Tool versions setup: # Optional - # Dependencies and setup tasks + # Dependencies and setup tasks, if needed run: # Launch configuration and GPU configs @@ -47,10 +47,10 @@ Identifies the workload at a high level: ```yaml general: - workload: nemotron4 # workload model name + workload: qwen3 # workload model name workload_type: pretrain # Type of workload - 
framework: nemo2 # Framework used - model: nemotron4 # Optional: Override model name in llmb-config + framework: megatron_bridge # Framework used + model: qwen3 # Optional: Override model name in llmb-config ``` ### Fields @@ -60,6 +60,7 @@ general: - `pretrain` - Pre-training workloads - `inference` - Inference workloads - `finetune` - Fine-tuning workloads + - `microbenchmark` - Microbenchmark workloads - **`framework`** (string, required): Framework name (e.g., `nemo2`, `maxtext`, `megatron`) - **`model`** (string, optional): Model name to use in `llmb-config_jobid.yaml` for `model_info.model_name`. If not specified, defaults to the `workload` value. Useful when multiple workload directories share the same base model (e.g., `llama3.1` and `llama3.3` both use `model: llama3`) @@ -198,7 +199,7 @@ Existing recipes using `hf_tokenizers` should eventually migrate to the `hugging ```yaml downloads: hf_tokenizers: - - 'nvidia/Nemotron-4-340B-Base' + - 'Qwen/Qwen3-30B-A3B' ``` **Migrated (Tokenizer only):** @@ -206,7 +207,7 @@ downloads: ```yaml downloads: huggingface: - - repo_id: nvidia/Nemotron-4-340B-Base + - repo_id: Qwen/Qwen3-30B-A3B assets: [tokenizer] ``` @@ -222,12 +223,14 @@ downloads: - repo_id: Qwen/Qwen3-30B-A3B ``` -#### 2. Tokenizer-only (Nemotron Pattern) +#### 2. Tokenizer-only syntax + +Current recipes usually need both tokenizer and config assets. This example uses a current repository ID only to show the `assets: [tokenizer]` syntax for a recipe that intentionally needs tokenizer files only. ```yaml downloads: huggingface: - - repo_id: nvidia/Nemotron-4-340B-Base + - repo_id: Qwen/Qwen3-30B-A3B assets: [tokenizer] ``` @@ -291,7 +294,7 @@ For more details, see [tools.md](tools.md). ## Setup Section (Optional) -Defines virtual environment creation, dependencies, and setup tasks. +Defines virtual environment creation, dependencies, and setup tasks. Omit this section for image-only recipes that only need container downloads and run metadata. 
### Basic Setup with Dependencies @@ -368,17 +371,7 @@ setup: - `srun`: Run via SLURM srun - `sbatch`: Submit as SLURM batch job -### Legacy Setup Script - -> **⚠️ DEPRECATED:** The `setup_script` functionality is deprecated and will be removed in a future release. Please migrate to the `tasks` feature above for all setup operations. - -For backward compatibility only: - -```yaml -setup: - setup_script: "setup.sh" # Path to setup script (DEPRECATED - use tasks instead) - venv_req: true -``` +Setup tasks can be used with or without `dependencies`. If `venv_req: true` is set without dependencies, the installer creates an empty workload-specific virtual environment before running tasks. If `venv_req` is omitted or false, tasks run without a virtual environment. ## Run Section (Required) @@ -402,8 +395,21 @@ run: - **`nemo`**: NeMo launcher (nemo2 workloads) - **`megatron_bridge`**: Megatron bridge launcher +- **`configured_sbatch`**: SLURM sbatch submission with llmb-run-managed experiment directories - **`sbatch`**: Direct SLURM sbatch submission +### Launch Script Env Contract + +Launch scripts should treat values passed through `llmb-run submit --env KEY=value` or a YAML task spec `env:` block as explicit container-launch overrides. + +- `llmb-run` validates `--env` keys as bash-style environment variable names and exports the corresponding `KEY=value` pairs into the job environment. YAML `env:` entries from `-f` task files receive the same treatment. +- For `sbatch` and `configured_sbatch` launchers, `llmb-run` also exports `LLMB_CONTAINER_ENV=KEY1,KEY2,...`. + Launch scripts that invoke `srun` should pass this through to Pyxis `--container-env`, and may append additional keys if needed. +- For `nemo` and `megatron_bridge` launchers, `llmb-run` appends repeatable `-E KEY=value` flags into `CONFIG_OVERRIDES`. + Launch scripts should preserve that variable and may append additional override flags to it if needed. 
+ +This contract covers explicit `--env` values and YAML `env:` blocks. Environment variables from cluster config or workload config continue to flow through the normal job environment unless the launch script chooses to add them to its container override mechanism. + ### GPU Configs Define test configurations for each GPU type: @@ -412,14 +418,14 @@ Define test configurations for each GPU type: gpu_configs: h100: model_configs: - - model_size: '15b' - dtypes: ['fp8', 'bf16'] - scales: [16, 32, 64, 128] + - model_size: '30b' + dtypes: ['bf16'] + scales: [16, 32, 64] b200: model_configs: - - model_size: '15b' - dtypes: ['fp8'] - scales: [32, 64, 128, 256] + - model_size: '30b' + dtypes: ['bf16'] + scales: [8, 16, 32, 64] ``` **Supported GPU Types**: `h100`, `b200`, `gb200`, `gb300` @@ -432,9 +438,9 @@ Each model config specifies: ```yaml model_configs: - - model_size: '340b' - dtypes: ['fp8', 'bf16'] - scales: [128, 256, 512, 1024] + - model_size: '405b' + dtypes: ['fp8', 'nvfp4'] + scales: [256, 512] exact_scales: false # Optional: allow power-of-2 extension ``` @@ -544,74 +550,75 @@ Here's a complete `metadata.yaml` example: ```yaml general: - workload: nemotron4 + workload: qwen3 workload_type: pretrain - framework: nemo2 + framework: megatron_bridge container: - images: - - 'nvcr.io#nvidia/nemo:25.07.01' + images: + - 'nvcr.io#nvidia/nemo:26.04.00' repositories: - nemo: - url: "https://github.com/NVIDIA/NeMo.git" - commit: "763ffa8b00a2fca9f7a204e14111ed190de7d947" - megatron_core: - url: "https://github.com/NVIDIA/Megatron-LM.git" - commit: "ac198fc0d60a8c748597e01ca4c6887d3a7bcf3d" + megatron_bridge: + url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" + commit: "f4d10a3746d1220f2aef57d54d49303b9150d901" nemo_run: - url: "https://github.com/NVIDIA/NeMo-Run.git" - commit: "04f900a9c1cde79ce6beca6a175b4c62b99d7982" + url: "https://github.com/NVIDIA-NeMo/Run.git" + commit: "64b91e0187b93475ea0d54028317e349ced7ac1b" downloads: huggingface: - - repo_id: 
'nvidia/Nemotron-4-340B-Base' - assets: [tokenizer] - -tools: - nsys: - by_gpu: - h100: "2025.5.1.121-3638078" - gb200: "2025.5.1.121-3638078" - default: "2025.4.1.172-3634357" + - repo_id: 'Qwen/Qwen3-30B-A3B' + - repo_id: 'Qwen/Qwen3-235B-A22B' setup: venv_req: true dependencies: + git: + megatron_bridge: + repo_key: megatron_bridge + install_method: + type: clone pip: - - package: nemo - repo_key: nemo - install_target: '.[nlp]' - - 'scipy<1.13.0' - - 'bitsandbytes==0.46.0' - - package: megatron-core - repo_key: megatron_core - package: nemo_run repo_key: nemo_run run: - launcher_type: 'nemo' + launcher_type: 'megatron_bridge' launch_script: 'launch.sh' gpu_configs: - h100: + gb300: model_configs: - - model_size: '15b' - dtypes: ['fp8', 'bf16'] - scales: [16, 32, 64, 128, 256, 512, 1024, 2048] - proxy_scales: [8, 16] # Optional: reduced scales for debug/validation - - model_size: '340b' - dtypes: ['fp8', 'bf16'] - scales: [256, 512, 1024, 2048] - proxy_scales: [128, 256] # Optional: reduced scales for debug/validation + - model_size: '235b' + dtypes: ['bf16'] + scales: [256, 512] + - model_size: '30b' + dtypes: ['bf16'] + scales: [8, 16, 32, 64] + gb200: + model_configs: + - model_size: '235b' + dtypes: ['bf16'] + scales: [256, 512] + - model_size: '30b' + dtypes: ['bf16'] + scales: [8, 16, 32, 64] b200: model_configs: - - model_size: '15b' - dtypes: ['fp8', 'bf16'] - scales: [16, 32, 64, 128, 256, 512, 1024] - proxy_scales: [8, 16] - - model_size: '340b' - dtypes: ['fp8', 'bf16'] - scales: [128, 256, 512, 1024] + - model_size: '235b' + dtypes: ['bf16'] + scales: [256, 512] + - model_size: '30b' + dtypes: ['bf16'] + scales: [8, 16, 32, 64] + h100: + model_configs: + - model_size: '235b' + dtypes: ['bf16'] + scales: [256, 512] + - model_size: '30b' + dtypes: ['bf16'] + scales: [16, 32, 64] ``` ## Validation @@ -795,9 +802,9 @@ The complete schema is defined in `.gitlab/ci/metadata_schema.yaml`. 
Key enums a ### Enums - **GPU Types**: `h100`, `b200`, `gb200`, `gb300`, `default` (for by_gpu only) -- **Workload Types**: `pretrain`, `inference`, `finetune`, `tools` -- **Dtypes**: `fp8`, `bf16`, `nvfp4` -- **Launcher Types**: `nemo`, `megatron_bridge`, `sbatch` +- **Workload Types**: `pretrain`, `inference`, `finetune`, `microbenchmark`, `tools` +- **Dtypes**: `fp8`, `bf16`, `nvfp4`, `mxfp4` +- **Launcher Types**: `nemo`, `megatron_bridge`, `configured_sbatch`, `sbatch` - **Job Types**: `local`, `nemo2`, `srun`, `sbatch` ### Format Patterns diff --git a/cli/llmb-install/pyproject.toml b/cli/llmb-install/pyproject.toml index 71ec530..efd8780 100644 --- a/cli/llmb-install/pyproject.toml +++ b/cli/llmb-install/pyproject.toml @@ -4,12 +4,13 @@ build-backend = "setuptools.build_meta" [project] name = "llmb-install" -version = "1.8.6" +version = "1.9.1" requires-python = ">=3.10" description = "Installer for LLM benchmarking workloads" dependencies = [ "PyYAML~=6.0", + "pydantic>=2.0,<3", "questionary~=2.1", "rich>=10.0.0,<15", "transformers>4.57.3,<5.0.0", diff --git a/cli/llmb-install/src/llmb_install/config/headless.py b/cli/llmb-install/src/llmb_install/config/headless.py index b04b5c8..9f59a7b 100644 --- a/cli/llmb-install/src/llmb_install/config/headless.py +++ b/cli/llmb-install/src/llmb_install/config/headless.py @@ -24,9 +24,12 @@ import os from pathlib import Path -from typing import Any, Dict, Iterable +from typing import Any, Dict import yaml +from pydantic import ValidationError + +from llmb_install.config.models import PlayfileConfig def save_installation_config(config_file: str, config_data: Dict[str, Any]) -> None: @@ -50,20 +53,6 @@ def save_installation_config(config_file: str, config_data: Dict[str, Any]) -> N raise SystemExit(1) from e -def _missing_required_fields(data: Dict[str, Any], fields: Iterable[str]) -> list[str]: - return [field for field in fields if field not in data] - - -def _validate_required_strings(data: Dict[str, Any], fields: 
Iterable[str]) -> None: - blank_fields = [] - for field in fields: - value = data.get(field) - if not isinstance(value, str) or not value.strip(): - blank_fields.append(field) - if blank_fields: - raise ValueError(f"Configuration fields cannot be blank: {blank_fields}") - - def load_installation_config(config_file: str) -> Dict[str, Any]: """Load installation configuration from a YAML file. @@ -83,56 +72,8 @@ def load_installation_config(config_file: str) -> Dict[str, Any]: if not isinstance(config_data, dict): raise ValueError("Configuration file must contain a dictionary") - # TODO: Remove this deprecated-key check after next public release. - deprecated_keys = [key for key in ('slurm_info', 'env_vars') if key in config_data] - if deprecated_keys: - raise ValueError( - "Playfiles must use top-level slurm and environment_vars; " - "slurm_info/env_vars are no longer supported." - ) - - # Validate required fields - required_fields = [ - 'install_path', - 'venv_type', - 'gpu_type', - 'node_architecture', - 'install_method', - 'selected_workloads', - ] - - missing_fields = _missing_required_fields(config_data, required_fields) - if missing_fields: - raise ValueError(f"Configuration file is missing required fields: {missing_fields}") - - _validate_required_strings(config_data, ['install_path', 'venv_type', 'gpu_type', 'node_architecture']) - - selected_workloads = config_data.get('selected_workloads') - if not isinstance(selected_workloads, list): - raise ValueError("selected_workloads must be a list") - if not selected_workloads: - raise ValueError("selected_workloads cannot be empty") - invalid_workloads = [w for w in selected_workloads if not isinstance(w, str) or not w.strip()] - if invalid_workloads: - raise ValueError("selected_workloads must be a list of non-empty strings") - - env_vars = config_data.get('environment_vars') - if env_vars is not None and not isinstance(env_vars, dict): - raise ValueError("environment_vars must be a dictionary when provided") - - 
# Validate SLURM configuration when provided or required. - slurm_config = config_data.get('slurm') - if slurm_config is not None: - if not isinstance(slurm_config, dict): - raise ValueError("slurm must be a dictionary when provided") - - _validate_required_strings( - slurm_config, - ['account', 'gpu_partition', 'cpu_partition'], - ) - - if config_data.get('install_method') == 'slurm' and not slurm_config: - raise ValueError("slurm configuration is required when install_method is 'slurm'") + # Validate against playfile schema (structure, types, and playfile-specific rules) + PlayfileConfig.model_validate(config_data) print(f"✓ Configuration loaded from: {config_file}") return config_data @@ -143,6 +84,12 @@ def load_installation_config(config_file: str) -> Dict[str, Any]: except yaml.YAMLError as e: print(f"Error: Invalid YAML in configuration file {config_file}: {e}") raise SystemExit(1) from e + except ValidationError as e: + print(f"Error: Invalid configuration in {config_file}:") + for err in e.errors(): + loc = " -> ".join(str(part) for part in err["loc"]) + print(f" - {loc}: {err['msg']}") + raise SystemExit(1) from e except ValueError as e: print(f"Error: Invalid configuration in {config_file}: {e}") raise SystemExit(1) from e diff --git a/cli/llmb-install/src/llmb_install/config/models.py b/cli/llmb-install/src/llmb_install/config/models.py index 73b5c25..00847ba 100644 --- a/cli/llmb-install/src/llmb_install/config/models.py +++ b/cli/llmb-install/src/llmb_install/config/models.py @@ -22,180 +22,135 @@ """Configuration data models for LLMB Install.""" -from dataclasses import dataclass, field -from typing import Any, Dict, List, Optional +from typing import Annotated, Any, Dict, List, Literal, Optional +from pydantic import AfterValidator, BaseModel, BeforeValidator, model_validator -@dataclass -class SlurmConfig: +from llmb_install.constants import SUPPORTED_GPU_TYPES + + +def _check_gpu_type(v: str) -> str: + if v not in SUPPORTED_GPU_TYPES: + raise 
ValueError(f"Unsupported gpu_type '{v}'. Must be one of: {sorted(SUPPORTED_GPU_TYPES)}") + return v + + +def _coerce_env_vars(v: Any) -> Any: + if v is None: + return {} + if not isinstance(v, dict): + return v # let pydantic's type check handle it + null_keys = [k for k, val in v.items() if val is None] + if null_keys: + raise ValueError( + f"environment_vars contains null values for: {null_keys}. " + "Use empty strings ('') instead of null/blank values, or remove the keys entirely." + ) + return {k: str(val) for k, val in v.items()} + + +def _check_non_blank(v: str) -> str: + if not v.strip(): + raise ValueError("Value must not be blank or whitespace-only") + return v + + +GpuType = Annotated[str, AfterValidator(_check_gpu_type)] +NonBlankStr = Annotated[str, AfterValidator(_check_non_blank)] +EnvironmentVars = Annotated[Dict[str, str], BeforeValidator(_coerce_env_vars)] + + +class SlurmConfig(BaseModel): """SLURM cluster configuration.""" - account: str - gpu_partition: str - cpu_partition: str + account: NonBlankStr + gpu_partition: NonBlankStr + cpu_partition: NonBlankStr gpu_partition_gres: Optional[int] = None cpu_partition_gres: Optional[int] = None -@dataclass -class InstallConfig: +class ClusterSettings(BaseModel): + """Base cluster configuration shared across all config models. + + Contains the stable settings that describe a cluster environment: + GPU type, architecture, venv strategy, slurm configuration, etc. + """ + + venv_type: Literal['uv', 'venv', 'conda'] + gpu_type: GpuType + node_architecture: Literal['x86_64', 'aarch64'] + install_method: Literal['local', 'slurm'] = 'slurm' + slurm: Optional[SlurmConfig] = None + workload_selection_mode: Literal['custom', 'exemplar'] = 'custom' + environment_vars: EnvironmentVars = {} + image_folder: Optional[str] = None + + +class PlayfileConfig(ClusterSettings): + """Schema for headless playfile configuration. 
+ + Defines the fields that are valid in a playfile YAML, along with + playfile-specific validation rules (non-empty workloads, deprecated + key rejection). Slurm configuration is always required. + """ + + install_path: NonBlankStr + install_method: Literal['local', 'slurm'] # required in playfiles (no default) + selected_workloads: List[NonBlankStr] + slurm: SlurmConfig # required in playfiles (no default) + + @model_validator(mode='before') + @classmethod + def reject_deprecated_keys(cls, data: Any) -> Any: + if isinstance(data, dict): + # TODO: Remove this deprecated-key check after next public release. + deprecated_keys = [key for key in ('slurm_info', 'env_vars') if key in data] + if deprecated_keys: + raise ValueError( + "Playfiles must use top-level slurm and environment_vars; " + "slurm_info/env_vars are no longer supported." + ) + return data + + @model_validator(mode='after') + def validate_playfile_rules(self) -> 'PlayfileConfig': + if not self.selected_workloads: + raise ValueError("selected_workloads cannot be empty") + return self + + +class InstallConfig(ClusterSettings): """Central configuration object for LLMB installation.""" # Required fields (no defaults) - install_path: str - venv_type: str # 'uv', 'venv', 'conda' - gpu_type: str # 'h100', 'gb200', 'b200' - node_architecture: str # 'x86_64', 'aarch64' + install_path: NonBlankStr # Optional fields (with defaults) - slurm: Optional[SlurmConfig] = None - selected_workloads: List[str] = field(default_factory=list) - workload_selection_mode: str = 'custom' # 'custom' or 'exemplar' - install_method: str = 'local' # 'local' or 'slurm' - ui_mode: str = 'simple' # 'simple', 'rich', 'express' - environment_vars: Dict[str, str] = field(default_factory=dict) + selected_workloads: List[NonBlankStr] = [] + ui_mode: Literal['simple', 'rich', 'express'] = 'simple' cache_dirs_configured: bool = False - image_folder: Optional[str] = None # Shared container image folder - dev_mode: bool = False # Development 
mode: skip repo copying, use original location - llmb_repo: Optional[str] = None # Path to LLMB repository (original or copied) - is_incremental_install: bool = ( - False # True if this is an incremental install (adding workloads to existing installation) - ) - - def __post_init__(self): - """Enforce environment_vars contract: all values must be strings, no None.""" - if self.environment_vars: - null_keys = [k for k, v in self.environment_vars.items() if v is None] - if null_keys: - raise ValueError( - f"environment_vars contains null values for: {null_keys}. " - "Use empty strings ('') instead of null/blank values, or remove the keys entirely." - ) - self.environment_vars = {k: str(v) for k, v in self.environment_vars.items()} - - def to_dict(self) -> Dict[str, Any]: - """Convert config to dictionary for serialization.""" - result = { - 'install_path': self.install_path, - 'venv_type': self.venv_type, - 'gpu_type': self.gpu_type, - 'node_architecture': self.node_architecture, - 'selected_workloads': self.selected_workloads, - 'workload_selection_mode': self.workload_selection_mode, - 'install_method': self.install_method, - 'ui_mode': self.ui_mode, - 'environment_vars': self.environment_vars, - 'cache_dirs_configured': self.cache_dirs_configured, - 'image_folder': self.image_folder, - 'dev_mode': self.dev_mode, - 'llmb_repo': self.llmb_repo, - 'is_incremental_install': self.is_incremental_install, - } - - if self.slurm: - result['slurm'] = { - 'account': self.slurm.account, - 'gpu_partition': self.slurm.gpu_partition, - 'cpu_partition': self.slurm.cpu_partition, - 'gpu_partition_gres': self.slurm.gpu_partition_gres, - 'cpu_partition_gres': self.slurm.cpu_partition_gres, - } - else: - result['slurm'] = None - - return result + dev_mode: bool = False + llmb_repo: Optional[str] = None + is_incremental_install: bool = False def to_play_dict(self) -> Dict[str, Any]: - """Convert config to dictionary for headless playfiles.""" - data = self.to_dict() - denylist = { - 
'llmb_repo', - 'dev_mode', - 'ui_mode', - 'cache_dirs_configured', - 'is_incremental_install', - } - data = {key: value for key, value in data.items() if key not in denylist} + """Convert config to playfile-compatible dictionary.""" + data = PlayfileConfig.model_validate(self.model_dump()).model_dump() if data.get('image_folder') is None: data.pop('image_folder', None) return data - @classmethod - def from_dict(cls, data: Dict[str, Any]) -> 'InstallConfig': - """Create config from dictionary.""" - slurm_data = data.get('slurm') - slurm_config = None - if slurm_data: - slurm_config = SlurmConfig( - account=slurm_data['account'], - gpu_partition=slurm_data['gpu_partition'], - cpu_partition=slurm_data['cpu_partition'], - gpu_partition_gres=slurm_data.get('gpu_partition_gres'), - cpu_partition_gres=slurm_data.get('cpu_partition_gres'), - ) - - return cls( - install_path=data['install_path'], - venv_type=data['venv_type'], - gpu_type=data['gpu_type'], - node_architecture=data['node_architecture'], - slurm=slurm_config, - selected_workloads=data.get('selected_workloads', []), - workload_selection_mode=data.get('workload_selection_mode', 'custom'), - install_method=data.get('install_method', 'local'), - ui_mode=data.get('ui_mode', 'simple'), - environment_vars=data.get('environment_vars', {}), - cache_dirs_configured=data.get('cache_dirs_configured', False), - image_folder=data.get('image_folder'), - dev_mode=data.get('dev_mode', False), - llmb_repo=data.get('llmb_repo'), - is_incremental_install=data.get('is_incremental_install', False), - ) - def get_remaining_workloads(self, completed: List[str]) -> List[str]: - """Get workloads that still need to be installed. 
- - Args: - completed: List of completed workload names - - Returns: - List of workload names that still need to be installed - """ + """Get workloads that still need to be installed.""" return [w for w in self.selected_workloads if w not in completed] @property def locked_fields_for_resume(self) -> List[str]: - """Get list of fields that cannot be changed during resume edit. - - Returns: - List of field names that are locked during resume - """ + """Get list of fields that cannot be changed during resume edit.""" return ['install_path', 'gpu_type', 'node_architecture', 'venv_type', 'llmb_repo', 'dev_mode', 'image_folder'] @property def editable_fields_for_resume(self) -> List[str]: - """Get list of fields that can be changed during resume edit. - - Returns: - List of field names that can be edited during resume - """ + """Get list of fields that can be changed during resume edit.""" return ['slurm', 'install_method', 'selected_workloads'] - - def get_slurm_dict(self) -> Dict[str, Any]: - """Convert SLURM config to dictionary format for legacy code compatibility. 
- - Returns: - Dictionary containing SLURM configuration, or empty dict if no SLURM config - """ - if not self.slurm: - return {} - - return { - 'slurm': { - 'account': self.slurm.account, - 'gpu_partition': self.slurm.gpu_partition, - 'cpu_partition': self.slurm.cpu_partition, - 'gpu_partition_gres': self.slurm.gpu_partition_gres, - 'cpu_partition_gres': self.slurm.cpu_partition_gres, - } - } diff --git a/cli/llmb-install/src/llmb_install/config/system.py b/cli/llmb-install/src/llmb_install/config/system.py index cd4f3bc..4287e73 100644 --- a/cli/llmb-install/src/llmb_install/config/system.py +++ b/cli/llmb-install/src/llmb_install/config/system.py @@ -27,98 +27,27 @@ import os import stat -from dataclasses import dataclass from datetime import datetime, timedelta from pathlib import Path from typing import Any, Dict, List, Optional, Tuple import yaml +from pydantic import ValidationError -from llmb_install.config.models import InstallConfig, SlurmConfig +from llmb_install.config.models import ClusterSettings, InstallConfig from llmb_install.utils.logging import get_logger logger = get_logger(__name__) -@dataclass -class SystemConfig: +class SystemConfig(ClusterSettings): """Sanitized system configuration that persists across installs. Contains stable system settings but excludes per-install variables like - install_path and selected_workloads. + install_path and selected_workloads. Inherits all cluster settings from + ClusterSettings. 
""" - # Core system settings (stable) - venv_type: str # 'uv', 'venv', 'conda' - install_method: str # 'local' or 'slurm' - gpu_type: str # 'h100', 'gb200', 'b200' - node_architecture: str # 'x86_64', 'aarch64' - workload_selection_mode: str = 'custom' # 'custom' or 'exemplar' - - # SLURM settings (stable for a cluster) - slurm: Optional[SlurmConfig] = None - - # Environment variables (potentially stable) - environment_vars: Dict[str, str] = None - - # Shared container image folder (stable) - image_folder: Optional[str] = None - - def __post_init__(self): - """Initialize default values.""" - if self.environment_vars is None: - self.environment_vars = {} - - def to_dict(self) -> Dict[str, Any]: - """Convert to dictionary for serialization.""" - result = { - 'venv_type': self.venv_type, - 'install_method': self.install_method, - 'gpu_type': self.gpu_type, - 'node_architecture': self.node_architecture, - 'workload_selection_mode': self.workload_selection_mode, - 'environment_vars': self.environment_vars, - 'image_folder': self.image_folder, - } - - if self.slurm: - result['slurm'] = { - 'account': self.slurm.account, - 'gpu_partition': self.slurm.gpu_partition, - 'cpu_partition': self.slurm.cpu_partition, - 'gpu_partition_gres': self.slurm.gpu_partition_gres, - 'cpu_partition_gres': self.slurm.cpu_partition_gres, - } - else: - result['slurm'] = None - - return result - - @classmethod - def from_dict(cls, data: Dict[str, Any]) -> 'SystemConfig': - """Create from dictionary.""" - slurm_data = data.get('slurm') - slurm_config = None - if slurm_data: - slurm_config = SlurmConfig( - account=slurm_data['account'], - gpu_partition=slurm_data['gpu_partition'], - cpu_partition=slurm_data['cpu_partition'], - gpu_partition_gres=slurm_data.get('gpu_partition_gres'), - cpu_partition_gres=slurm_data.get('cpu_partition_gres'), - ) - - return cls( - venv_type=data['venv_type'], - install_method=data['install_method'], - gpu_type=data['gpu_type'], - 
node_architecture=data['node_architecture'], - workload_selection_mode=data.get('workload_selection_mode', 'custom'), - slurm=slurm_config, - environment_vars=data.get('environment_vars', {}), - image_folder=data.get('image_folder'), - ) - @classmethod def from_install_config(cls, install_config: InstallConfig) -> 'SystemConfig': """Extract system config from a complete install config. @@ -126,16 +55,7 @@ def from_install_config(cls, install_config: InstallConfig) -> 'SystemConfig': This sanitizes the install config by keeping only stable system settings and excluding per-install variables. """ - return cls( - venv_type=install_config.venv_type, - install_method=install_config.install_method, - gpu_type=install_config.gpu_type, - node_architecture=install_config.node_architecture, - workload_selection_mode=install_config.workload_selection_mode, - slurm=install_config.slurm, - environment_vars=install_config.environment_vars.copy(), - image_folder=install_config.image_folder, - ) + return cls.model_validate(install_config.model_dump()) def _get_system_config_dir() -> Path: @@ -180,7 +100,7 @@ def save(self, install_config: InstallConfig) -> None: # Write to temporary file first (atomic operation) temp_path = self.config_path.with_suffix('.tmp') with open(temp_path, 'w') as f: - yaml.safe_dump(system_config.to_dict(), f, default_flow_style=False, indent=2) + yaml.safe_dump(system_config.model_dump(), f, default_flow_style=False, indent=2) # Set restrictive permissions (0600) for security os.chmod(temp_path, stat.S_IRUSR | stat.S_IWUSR) @@ -217,12 +137,19 @@ def load(self) -> Optional[SystemConfig]: logger.warning(f"System config file is empty: {self.config_path}") return None - system_config = SystemConfig.from_dict(data) + system_config = SystemConfig.model_validate(data) logger.debug(f"Loaded system config from {self.config_path}") logger.debug(f"Available defaults: {list(data.keys())}") return system_config - except (yaml.YAMLError, KeyError, TypeError) as e: + 
except ValidationError as e: + details = "; ".join(f"{' -> '.join(str(p) for p in err['loc'])}: {err['msg']}" for err in e.errors()) + logger.warning( + f"System config at {self.config_path} is invalid ({details}). " + f"Ignoring saved defaults — you will be prompted for all settings." + ) + return None + except (yaml.YAMLError, KeyError, TypeError, ValueError) as e: logger.warning(f"Failed to load system config from {self.config_path}: {e}") return None @@ -300,7 +227,7 @@ def save_install_state( existing_cluster_config: For incremental installs, the original cluster config """ state_data = { - 'install_config': config.to_dict(), + 'install_config': config.model_dump(), 'completed_workloads': completed_workloads, 'workload_venvs': workload_venvs or {}, 'timestamp': datetime.now().isoformat(), @@ -374,7 +301,7 @@ def load_install_state(self) -> Optional[Tuple[InstallConfig, List[str], Dict[st return None # Reconstruct InstallConfig - install_config = InstallConfig.from_dict(install_config_data) + install_config = InstallConfig.model_validate(install_config_data) # Validate that install directory still exists if not os.path.exists(install_config.install_path): @@ -392,8 +319,17 @@ def load_install_state(self) -> Optional[Tuple[InstallConfig, List[str], Dict[st return (install_config, completed_workloads, workload_venvs, existing_cluster_config) - except (yaml.YAMLError, KeyError, TypeError) as e: + except ValidationError as e: + details = "; ".join(f"{' -> '.join(str(p) for p in err['loc'])}: {err['msg']}" for err in e.errors()) + logger.warning( + f"Resume state at {self.config_path} is invalid ({details}). " + f"Discarding saved progress — installation will start fresh." 
+ ) + self.clear_install_state() + return None + except (yaml.YAMLError, KeyError, TypeError, ValueError) as e: logger.warning(f"Failed to load install state from {self.config_path}: {e}") + self.clear_install_state() return None def clear_install_state(self) -> None: diff --git a/cli/llmb-install/src/llmb_install/core/dependency.py b/cli/llmb-install/src/llmb_install/core/dependency.py index 80901a4..746bb21 100644 --- a/cli/llmb-install/src/llmb_install/core/dependency.py +++ b/cli/llmb-install/src/llmb_install/core/dependency.py @@ -183,7 +183,8 @@ def group_workloads_by_dependencies( resolved_deps = _resolve_dependencies(workload_data) if not resolved_deps: - # Scripted workload installs without explicit dependencies + # Workloads without Python dependencies do not need dependency + # installation. They may still need image downloads or setup tasks. if None not in dep_groups: dep_groups[None] = [] dep_groups[None].append(key) @@ -214,29 +215,28 @@ def print_dependency_group_summary(dep_groups: Dict[Optional[str], List[str]]) - print("\nWorkload Installation Plan") print("=========================") - scripted_workloads = dep_groups.get(None, []) + no_dependency_workloads = dep_groups.get(None, []) dependency_groups = {k: v for k, v in dep_groups.items() if k is not None} - # Count individual installations (both scripted and unique dependency workloads) - individual_count = len(scripted_workloads) - individual_count += sum(1 for workloads in dependency_groups.values() if len(workloads) == 1) + unique_dependency_workloads = [ + group_workloads[0] for group_workloads in dependency_groups.values() if len(group_workloads) == 1 + ] # Count shared virtual environment groups shared_count = sum(len(workloads) for workloads in dependency_groups.values() if len(workloads) > 1) shared_groups_count = len([g for g in dependency_groups.values() if len(g) > 1]) - if individual_count > 0: - print(f"\nIndividual installations ({individual_count} workloads):") - print("Each 
workload will have its own virtual environment:") - - # Show scripted workloads - for workload in sorted(scripted_workloads): + if no_dependency_workloads: + print(f"\nNo dependency setup ({len(no_dependency_workloads)} workloads):") + print("These workloads do not declare Python dependencies:") + for workload in sorted(no_dependency_workloads): print(f" • {workload}") - # Show workloads with unique dependencies - for group_workloads in dependency_groups.values(): - if len(group_workloads) == 1: - print(f" • {group_workloads[0]}") + if unique_dependency_workloads: + print(f"\nIndividual installations ({len(unique_dependency_workloads)} workloads):") + print("Each workload will have its own virtual environment:") + for workload in sorted(unique_dependency_workloads): + print(f" • {workload}") if shared_count > 0: print(f"\nShared virtual environment groups ({shared_count} workloads in {shared_groups_count} groups):") diff --git a/cli/llmb-install/src/llmb_install/core/exemplar.py b/cli/llmb-install/src/llmb_install/core/exemplar.py index 183c3bb..d15dd4d 100644 --- a/cli/llmb-install/src/llmb_install/core/exemplar.py +++ b/cli/llmb-install/src/llmb_install/core/exemplar.py @@ -1,4 +1,4 @@ -# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
# SPDX-License-Identifier: MIT # # Permission is hereby granted, free of charge, to any person obtaining a @@ -39,7 +39,7 @@ def get_exemplar_workloads(llmb_repo: Path, gpu_type: str) -> List[str]: Each workload_full_name is parsed by splitting on the last '_': - Prefix becomes the base_key (e.g., 'pretrain_llama3.1') - - Suffix must match ^\d+(\.\d+)?b$ (lowercase, e.g., '70b', '3.5b') + - Suffix must match ^\d+(\.\d+)?[bt]$ (case-insensitive, e.g., '70b', '3.5b', '1t') Args: llmb_repo: Path to the LLMB repository root @@ -115,13 +115,14 @@ def get_exemplar_workloads(llmb_repo: Path, gpu_type: str) -> List[str]: # Parse workload_full_names to extract base keys base_keys = [] - size_suffix_pattern = re.compile(r'^\d+(\.\d+)?b$') + size_suffix_pattern = re.compile(r'^\d+(\.\d+)?[bt]$') for full_name in workload_full_names: if '_' not in full_name: raise ValueError( f"Invalid workload name '{full_name}' in workloads[{matched_key}]: " - f"expected format '_b' (e.g., 'pretrain_llama3.1_70b')" + f"expected format '_' (e.g., 'pretrain_llama3.1_70b' or " + f"'pretrain_kimi-k2_1t')" ) base_key, size_suffix = full_name.rsplit('_', 1) @@ -136,8 +137,8 @@ def get_exemplar_workloads(llmb_repo: Path, gpu_type: str) -> List[str]: if not size_suffix_pattern.match(size_suffix.lower()): raise ValueError( f"Invalid size suffix in '{full_name}' (workloads[{matched_key}]): " - f"'{size_suffix}' does not match pattern '^\\d+(\\.\\d+)?b$' (lowercase). " - f"Examples: '70b', '3.5b', '405b'" + f"'{size_suffix}' does not match pattern '^\\d+(\\.\\d+)?[bt]$'. 
" + f"Examples: '70b', '3.5b', '405b', '1t'" ) base_keys.append(base_key) diff --git a/cli/llmb-install/src/llmb_install/core/installer.py b/cli/llmb-install/src/llmb_install/core/installer.py index 8846180..26d96d0 100644 --- a/cli/llmb-install/src/llmb_install/core/installer.py +++ b/cli/llmb-install/src/llmb_install/core/installer.py @@ -58,8 +58,6 @@ from llmb_install.core.workload import ( build_workload_dict, filter_tools_from_workload_list, - install_scripted_workload, - run_post_install_script, run_setup_tasks, ) from llmb_install.downloads.huggingface import download_huggingface_files_for_workloads @@ -842,8 +840,8 @@ def _run_headless_mode(self, args: argparse.Namespace) -> None: config_data = load_installation_config(args.play) # Create InstallConfig from the loaded data - # InstallConfig.from_dict handles all necessary conversions and defaults - config = InstallConfig.from_dict(config_data) + # model_validate handles all necessary conversions and defaults + config = InstallConfig.model_validate(config_data) # Convert install_path to absolute path using pathlib # This is typically done in _collect_configuration but needed here for headless @@ -1101,13 +1099,15 @@ def _collect_configuration( # Create merged defaults dict - system config first, then resume overrides defaults = {} if system_config: - defaults.update(system_config.to_dict()) + defaults.update(system_config.model_dump()) if override_defaults: - defaults.update(override_defaults.to_dict()) + defaults.update(override_defaults.model_dump()) ui.log("Using saved installation configuration as additional defaults") # Not wild about this language. - # Collect basic configuration using merged defaults - venv_type = prompt_environment_type(ui, defaults.get('venv_type'), express_mode=False) + # Collect basic configuration using merged defaults. 
Fresh installs always + # create recipe environments with uv; legacy venv_type values are kept + # only for resume, incremental, and headless compatibility paths. + venv_type = prompt_environment_type(ui, defaults.get('venv_type')) print() # Loop until we get a valid install path (not an existing installation, or user accepts incremental) @@ -1778,8 +1778,9 @@ def _collect_incremental_configuration( def _collect_express_configuration(self, args: argparse.Namespace) -> InstallConfig: """Collect configuration for express mode using saved system config. - Express mode only prompts for install_path and workloads. All other values - are taken from the saved system configuration. + Express mode only prompts for install_path and workloads. Most values are + taken from the saved system configuration, but fresh installs always use + uv for newly-created recipe environments. Args: args: Parsed command line arguments @@ -1799,7 +1800,7 @@ def _collect_express_configuration(self, args: argparse.Namespace) -> InstallCon ui.print_section("Express Mode Configuration Summary") ui.log(f"GPU Type: {system_config.gpu_type.upper()}") ui.log(f"Architecture: {system_config.node_architecture}") - ui.log(f"Environment: {system_config.venv_type}") + ui.log("Environment: uv") ui.log(f"Install Method: {system_config.install_method}") if system_config.slurm: ui.log(f"SLURM Account: {system_config.slurm.account}") @@ -1850,8 +1851,10 @@ def _collect_express_configuration(self, args: argparse.Namespace) -> InstallCon dev_mode = getattr(args, 'dev_mode', False) self.root_dir = self._handle_repository_setup(install_path, dev_mode) - # Setup cache directories - setup_cache_directories(install_path, system_config.venv_type) + # Setup cache directories. Fresh express installs always use uv for + # recipe environments, regardless of stale saved venv_type defaults. 
+ venv_type = prompt_environment_type(ui, system_config.venv_type) + setup_cache_directories(install_path, venv_type) # Get workloads from CLI or prompt and determine selection mode workload_selection_mode = 'custom' # Default mode @@ -1937,12 +1940,13 @@ def _collect_express_configuration(self, args: argparse.Namespace) -> InstallCon cpu_partition_gres=system_config.slurm.cpu_partition_gres, ) - # Use all saved values from system config + # Use saved values from system config, except for recipe environment type + # which is forced to uv for fresh installs. config = InstallConfig( install_path=install_path, gpu_type=system_config.gpu_type, node_architecture=system_config.node_architecture, - venv_type=system_config.venv_type, + venv_type=venv_type, slurm=slurm_obj, environment_vars=env_vars, selected_workloads=selected, @@ -2234,8 +2238,7 @@ def _perform_installation( # Use image_folder from config effective_image_folder = install_config.image_folder - # Get SLURM dict for legacy calls - slurm_info = install_config.get_slurm_dict() + slurm_info = {'slurm': install_config.slurm.model_dump()} if install_config.slurm else {} fetch_container_images( required_images, @@ -2289,20 +2292,25 @@ def _perform_installation( try: for dep_hash, workload_keys in dep_groups.items(): - if dep_hash is None: # Scripted workloads - print("\n[Individual Installations - Legacy Setup Scripts]") + if dep_hash is None: + print("\n[No Dependency Setup]") print("-" * 60) for workload_key in workload_keys: - venv_path = install_scripted_workload( - workload_key, - filtered_workloads[workload_key], - install_config.install_path, - install_config.venv_type, - install_config.environment_vars, - install_config.gpu_type, - ) - workload_venvs[workload_key] = venv_path - # Execute any additional setup tasks defined for the workload + workload_dir = os.path.join(install_config.install_path, "workloads", workload_key) + os.makedirs(workload_dir, exist_ok=True) + print(f"Installing: {workload_key}") + 
+ venv_path = None + setup_config = filtered_workloads[workload_key].get('setup', {}) or {} + if setup_config.get('venv_req', False): + venvs_dir = os.path.join(install_config.install_path, "venvs") + os.makedirs(venvs_dir, exist_ok=True) + venv_path = os.path.join(venvs_dir, f"{workload_key}_venv") + create_virtual_environment(venv_path, install_config.venv_type) + workload_venvs[workload_key] = venv_path + else: + print(f"No virtual environment required for {workload_key}") + run_setup_tasks( workload_key, filtered_workloads[workload_key], @@ -2314,7 +2322,6 @@ def _perform_installation( install_config.gpu_type, ) - # Track completion for scripted workloads (after all workloads in this group complete) self._save_installation_progress( install_config, workload_keys, completed_workloads, workload_venvs, existing_cluster_config ) @@ -2368,26 +2375,8 @@ def _perform_installation( workload_venvs[workload_key] = venv_path - # Run post-install scripts and setup tasks - env = get_venv_environment(venv_path, install_config.venv_type) - env['LLMB_INSTALL'] = install_config.install_path - env['MANUAL_INSTALL'] = 'false' - env['GPU_TYPE'] = install_config.gpu_type - if install_config.environment_vars: - env.update(install_config.environment_vars) - for workload_key in workload_keys: workload_data = filtered_workloads[workload_key] - setup_config = workload_data.get('setup', {}) - setup_script = setup_config.get('setup_script') - if setup_script: - # Set workload-specific env var - env['LLMB_WORKLOAD'] = os.path.join( - install_config.install_path, "workloads", workload_key - ) - source_dir = workload_data['path'] - run_post_install_script(setup_script, source_dir, env) - # Execute new-style setup tasks (if any) run_setup_tasks( workload_key, workload_data, @@ -2452,19 +2441,9 @@ def _perform_installation( first_workload_dir = os.path.join(install_config.install_path, "workloads", first_workload_key) install_dependencies(venv_path, install_config.venv_type, dependencies, 
first_workload_dir, env) - # 4. For each workload, run post-install script (if any) + # 4. For each workload, run setup tasks (if any) for workload_key in workload_keys: workload_data = filtered_workloads[workload_key] - setup_config = workload_data.get('setup', {}) - setup_script = setup_config.get('setup_script') - if setup_script: - # Set workload-specific env var - env['LLMB_WORKLOAD'] = os.path.join( - install_config.install_path, "workloads", workload_key - ) - source_dir = workload_data['path'] - run_post_install_script(setup_script, source_dir, env) - # Execute new-style setup tasks (if any) run_setup_tasks( workload_key, workload_data, diff --git a/cli/llmb-install/src/llmb_install/core/workload.py b/cli/llmb-install/src/llmb_install/core/workload.py index 02c5d33..09bb572 100644 --- a/cli/llmb-install/src/llmb_install/core/workload.py +++ b/cli/llmb-install/src/llmb_install/core/workload.py @@ -36,10 +36,7 @@ import yaml from llmb_install.cluster.slurm import augment_env_for_job_type -from llmb_install.environment.venv_manager import ( - create_virtual_environment, - get_venv_environment, -) +from llmb_install.environment.venv_manager import get_venv_environment def find_metadata_files(root_dir: str) -> List[Path]: @@ -104,8 +101,7 @@ def get_setup_tasks(workload_data: Dict[str, Any]) -> List[Dict[str, Any]]: Behaviour: • If `setup.tasks` exists, return that list preserving order. - • Otherwise, return an empty list - legacy `setup_script` is handled - elsewhere by `run_post_install_script` for full backward compatibility. + • Otherwise, return an empty list. 
""" setup_cfg: Dict[str, Any] = workload_data.get("setup", {}) or {} tasks: List[Dict[str, Any]] = setup_cfg.get("tasks", []) or [] @@ -121,7 +117,7 @@ def run_setup_tasks( workload_key: str, workload_data: Dict[str, Any], venv_path: Optional[str], - venv_type: Optional[str], + venv_type: str, install_path: str, slurm_info: Dict[str, Any], global_env_vars: Dict[str, str], @@ -133,7 +129,7 @@ def run_setup_tasks( workload_key: Identifier such as "finetune_llama4-maverick". workload_data: Metadata dict for the workload. venv_path: Path to the venv to activate for this workload (may be None). - venv_type: 'venv' or 'conda'. + venv_type: Environment type for venv_path ('uv', 'venv', or 'conda'). install_path: Base installation path ($LLMB_INSTALL). slurm_info: Cluster SLURM config as gathered earlier. global_env_vars: Env vars collected from the user (e.g. HF_TOKEN). @@ -204,115 +200,3 @@ def run_setup_tasks( if stderr_msg: print(stderr_msg) raise - - -def run_post_install_script(setup_script: str, source_dir: str, env: Dict[str, str]): - """Run a post-install setup script within the correct environment. - - Distinct from the scripted workload install, as that also creates a venv. 
- - Args: - setup_script: The name of the setup script - source_dir: The directory where the script is located - env: The environment dictionary for running subprocesses - """ - print("\n⚠️ WARNING: setup_script functionality is deprecated and will be removed in a future release.") - print(" Please migrate to the 'tasks' feature in metadata.yaml for setup operations.") - print(" See documentation: docs/recipe_guide.md#setup-tasks\n") - - script_path = os.path.join(source_dir, setup_script) - print(f"Running post-install script: {script_path}") - try: - if not os.path.exists(script_path): - print(f"Warning: Post-install script {script_path} not found, skipping.") - return - - os.chmod(script_path, 0o755) - - subprocess.run([script_path], env=env, cwd=source_dir, check=True, text=True) - print("\n✓ Post-install script completed successfully.") - - except subprocess.CalledProcessError as e: - print(f"\nError running post-install script (return code: {e.returncode})") - raise - - -def install_scripted_workload( - workload_key: str, - workload_data: Dict[str, Any], - install_path: str, - venv_type: str, - env_vars: Dict[str, str], - gpu_type: str, -) -> Optional[str]: - """Install a workload whose dependencies are defined entirely by a shell script. - - This function is used for workloads that rely on a 'setup_script' to handle - their setup, rather than declaring dependencies in the metadata. It will - create a dedicated virtual environment for this workload. - - Args: - workload_key: The unique identifier for the workload. - workload_data: The dictionary of metadata for the workload. - install_path: The base installation directory for all workloads. - venv_type: The type of virtual environment to create ('venv' or 'conda'). - env_vars: The environment variables to pass to the setup script. - gpu_type: GPU type (e.g., 'h100', 'gb200'). - Returns: - The path to the created virtual environment, or None if no venv was required. 
- """ - print(f"\n\nInstalling {workload_key} (scripted method)") - print("-----------------------------------------") - print("⚠️ WARNING: Scripted workload installation is deprecated and will be removed in a future release.") - print(" Please migrate to dependency-based installation with 'tasks' feature in metadata.yaml.") - print(" See documentation: docs/recipe_guide.md#setup-tasks\n") - - target_dir = os.path.join(install_path, "workloads", workload_key) - os.makedirs(target_dir, exist_ok=True) - - env = os.environ.copy() - venv_path = None - - setup_config = workload_data.get('setup', {}) - if setup_config.get('venv_req', False): - venv_name = f"{workload_key}_venv" - venvs_dir = os.path.join(install_path, "venvs") - os.makedirs(venvs_dir, exist_ok=True) - venv_path = os.path.join(venvs_dir, venv_name) - create_virtual_environment(venv_path, venv_type) - env = get_venv_environment(venv_path, venv_type) - else: - print(f"No virtual environment required for {workload_key}") - - env['LLMB_INSTALL'] = install_path - env['LLMB_WORKLOAD'] = os.path.join(install_path, "workloads", workload_key) - # Signal to setup scripts that this is an automated install (prevents automatic sqsh downloads) - env['MANUAL_INSTALL'] = 'false' - env['GPU_TYPE'] = gpu_type - if env_vars: - env.update(env_vars) - - source_dir = workload_data['path'] - print(f"Installing {workload_key} to {target_dir}") - - setup_script = setup_config.get('setup_script') - if setup_script: - script_path = os.path.join(source_dir, setup_script) - print(f"Running setup script: {script_path}") - - if not os.path.exists(script_path): - print(f"Error: Setup script {script_path} not found!") - return venv_path - - os.chmod(script_path, 0o755) - - try: - subprocess.run([script_path], env=env, cwd=source_dir, check=True, text=True) - print(f"\n✓ Setup script for {workload_key} completed successfully.") - except subprocess.CalledProcessError as e: - print(f"\nError running setup script for {workload_key} (return 
code: {e.returncode})") - raise - else: - print(f"No setup script specified for {workload_key}") - - return venv_path diff --git a/cli/llmb-install/src/llmb_install/ui/prompts/environment.py b/cli/llmb-install/src/llmb_install/ui/prompts/environment.py index 57eb7ce..b386905 100644 --- a/cli/llmb-install/src/llmb_install/ui/prompts/environment.py +++ b/cli/llmb-install/src/llmb_install/ui/prompts/environment.py @@ -23,136 +23,40 @@ """Environment configuration prompts for LLMB installer.""" import os -import subprocess -import sys from typing import Dict, Optional -from llmb_install.constants import EXIT_CANCELLED, MAX_PYTHON_VERSION_TUPLE, MIN_PYTHON_VERSION_TUPLE -from llmb_install.environment.detector import ( - detect_virtual_environment, - get_system_python_path, - get_system_python_version, - has_active_conda_environment, - is_conda_installed, - is_uv_installed, - is_venv_installed, -) -from llmb_install.environment.venv_manager import get_clean_environment_for_subprocess +from llmb_install.environment.detector import is_uv_installed from llmb_install.ui.interface import UIInterface -def prompt_environment_type(ui: UIInterface, default: Optional[str] = None, express_mode: bool = False) -> str: - """Prompt the user to select their preferred environment type (uv, venv, or conda). +def prompt_environment_type(ui: UIInterface, default: Optional[str] = None) -> str: + """Resolve the recipe environment type for fresh interactive installs. + + uv is the only supported choice for newly-created recipe environments. The + installer still supports existing conda/venv configs elsewhere for resume, + incremental, and headless compatibility. 
Args: ui: UI interface for user interaction - default: Default environment type from system config (if available) - express_mode: Whether this is being called from express mode (shows default messages) + default: Deprecated saved environment type from system config, ignored for fresh installs Returns: - str: Selected environment type ('uv', 'venv', or 'conda') + str: Selected environment type ('uv') """ - # Detect system Python version (clean environment) - if detect_virtual_environment(): - env = get_clean_environment_for_subprocess() - current_python_version = get_system_python_version(env) - system_python_path = get_system_python_path(env) - if current_python_version is not None: - venv_available = ( - subprocess.run(['python3', '-m', 'venv', '--help'], env=env, capture_output=True, text=True).returncode - == 0 - ) - else: - venv_available = False # No system Python detected - else: - current_python_version = sys.version_info[:3] - system_python_path = sys.executable - venv_available = is_venv_installed() - - # Environment availability checks - conda_available = is_conda_installed() - uv_available = is_uv_installed() - - # Check compatibility - if current_python_version is not None: - python_compatible = MIN_PYTHON_VERSION_TUPLE <= current_python_version < MAX_PYTHON_VERSION_TUPLE - can_use_venv = venv_available and python_compatible - else: - python_compatible = False - can_use_venv = False # Can't use venv if no system Python detected - can_use_conda = conda_available - can_use_uv = uv_available - - # Print environment status ui.log("Environment Configuration") ui.log("------------------------") - if current_python_version is not None: - ui.log(f"System Python version: {'.'.join(map(str, current_python_version))}") - ui.log(f"System Python path: {system_python_path}") - else: - ui.log("System Python version: Not detected") - - ui.log( - f"Supported version range: [{'.'.join(map(str, MIN_PYTHON_VERSION_TUPLE))}, {'.'.join(map(str, MAX_PYTHON_VERSION_TUPLE))})" 
- ) - ui.log("") - # Determine available options - options = [] - if can_use_uv: - options.append("uv") - if can_use_venv: - options.append("venv") - if can_use_conda: - options.append({"name": "conda (deprecated)", "value": "conda"}) - - if not options: - if not python_compatible and current_python_version is not None: - ui.log(f"Error: System Python version {'.'.join(map(str, current_python_version))} is not supported.") - ui.log( - f"Please use Python version in range [{'.'.join(map(str, MIN_PYTHON_VERSION_TUPLE))}, {'.'.join(map(str, MAX_PYTHON_VERSION_TUPLE))})." - ) - else: - ui.log("Error: No supported environment managers found.") - ui.log("Please install at least one of: uv, venv (Python standard library), or conda.") + if not is_uv_installed(): + ui.log("Error: uv is required to create recipe environments.") + ui.log("Please install uv and rerun the installer.") raise SystemExit(1) - def _option_value(opt): - return opt["value"] if isinstance(opt, dict) else opt - - # Display available options - if len(options) > 1: - ui.log("Multiple environment options available:") - - # Use provided default if valid, otherwise no default - if default and any(_option_value(o) == default for o in options): - selected_default = default - if express_mode: - ui.log(f"Using saved default: {default}") - else: - selected_default = None - - selected = ui.prompt_select("Select your preferred environment type:", options, default=selected_default) - if selected is None: - # User cancelled (Ctrl-C) - ui.log("\nInstallation cancelled by user.") - raise SystemExit(EXIT_CANCELLED) + if default and default != "uv": + ui.log(f"Ignoring saved environment type '{default}'; new recipe environments use uv.") else: - selected = _option_value(options[0]) - ui.log(f"Using {selected} (only available option)") - - # Validate selection and warn if necessary - if selected == "venv" and detect_virtual_environment(): - ui.log("") - ui.log("Warning: You are currently in a virtual environment.") - 
ui.log("The installer will use system Python to create new virtual environments.") - - if selected == "conda" and detect_virtual_environment() and not has_active_conda_environment(): - ui.log("") - ui.log("Warning: You are in a non-conda virtual environment.") - ui.log("Consider deactivating it before using conda for installation.") + ui.log("Using uv for recipe environments.") - return selected + return "uv" def prompt_environment_variables( diff --git a/cli/llmb-install/uv.lock b/cli/llmb-install/uv.lock index 0053848..ba048b6 100644 --- a/cli/llmb-install/uv.lock +++ b/cli/llmb-install/uv.lock @@ -6,6 +6,15 @@ resolution-markers = [ "python_full_version < '3.11'", ] +[[package]] +name = "annotated-types" +version = "0.7.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/ee/67/531ea369ba64dcff5ec9c3402f9f51bf748cec26dde048a2f973a4eea7f5/annotated_types-0.7.0.tar.gz", hash = "sha256:aff07c09a53a08bc8cfccb9c85b05f1aa9a2a6f23728d790723543408344ce89", size = 16081, upload-time = "2024-05-20T21:33:25.928Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl", hash = "sha256:1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53", size = 13643, upload-time = "2024-05-20T21:33:24.1Z" }, +] + [[package]] name = "black" version = "26.1.0" @@ -371,10 +380,11 @@ wheels = [ [[package]] name = "llmb-install" -version = "1.8.6" +version = "1.9.1" source = { editable = "." 
} dependencies = [ { name = "huggingface-hub" }, + { name = "pydantic" }, { name = "pyyaml" }, { name = "questionary" }, { name = "rich" }, @@ -394,6 +404,7 @@ dev = [ requires-dist = [ { name = "black", marker = "extra == 'dev'", specifier = "~=26.1" }, { name = "huggingface-hub", specifier = ">=0.20.0" }, + { name = "pydantic", specifier = ">=2.0,<3" }, { name = "pytest", marker = "extra == 'dev'", specifier = ">=7.0" }, { name = "pytest-cov", marker = "extra == 'dev'", specifier = ">=4.0" }, { name = "pytest-mock", marker = "extra == 'dev'", specifier = ">=3.10" }, @@ -630,6 +641,139 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/ce/4f/5249960887b1fbe561d9ff265496d170b55a735b76724f10ef19f9e40716/prompt_toolkit-3.0.51-py3-none-any.whl", hash = "sha256:52742911fde84e2d423e2f9a4cf1de7d7ac4e51958f648d9540e0fb8db077b07", size = 387810, upload-time = "2025-04-15T09:18:44.753Z" }, ] +[[package]] +name = "pydantic" +version = "2.12.5" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "annotated-types" }, + { name = "pydantic-core" }, + { name = "typing-extensions" }, + { name = "typing-inspection" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/69/44/36f1a6e523abc58ae5f928898e4aca2e0ea509b5aa6f6f392a5d882be928/pydantic-2.12.5.tar.gz", hash = "sha256:4d351024c75c0f085a9febbb665ce8c0c6ec5d30e903bdb6394b7ede26aebb49", size = 821591, upload-time = "2025-11-26T15:11:46.471Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/5a/87/b70ad306ebb6f9b585f114d0ac2137d792b48be34d732d60e597c2f8465a/pydantic-2.12.5-py3-none-any.whl", hash = "sha256:e561593fccf61e8a20fc46dfc2dfe075b8be7d0188df33f221ad1f0139180f9d", size = 463580, upload-time = "2025-11-26T15:11:44.605Z" }, +] + +[[package]] +name = "pydantic-core" +version = "2.41.5" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "typing-extensions" }, +] +sdist = { url = 
"https://files.pythonhosted.org/packages/71/70/23b021c950c2addd24ec408e9ab05d59b035b39d97cdc1130e1bce647bb6/pydantic_core-2.41.5.tar.gz", hash = "sha256:08daa51ea16ad373ffd5e7606252cc32f07bc72b28284b6bc9c6df804816476e", size = 460952, upload-time = "2025-11-04T13:43:49.098Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/c6/90/32c9941e728d564b411d574d8ee0cf09b12ec978cb22b294995bae5549a5/pydantic_core-2.41.5-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:77b63866ca88d804225eaa4af3e664c5faf3568cea95360d21f4725ab6e07146", size = 2107298, upload-time = "2025-11-04T13:39:04.116Z" }, + { url = "https://files.pythonhosted.org/packages/fb/a8/61c96a77fe28993d9a6fb0f4127e05430a267b235a124545d79fea46dd65/pydantic_core-2.41.5-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:dfa8a0c812ac681395907e71e1274819dec685fec28273a28905df579ef137e2", size = 1901475, upload-time = "2025-11-04T13:39:06.055Z" }, + { url = "https://files.pythonhosted.org/packages/5d/b6/338abf60225acc18cdc08b4faef592d0310923d19a87fba1faf05af5346e/pydantic_core-2.41.5-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5921a4d3ca3aee735d9fd163808f5e8dd6c6972101e4adbda9a4667908849b97", size = 1918815, upload-time = "2025-11-04T13:39:10.41Z" }, + { url = "https://files.pythonhosted.org/packages/d1/1c/2ed0433e682983d8e8cba9c8d8ef274d4791ec6a6f24c58935b90e780e0a/pydantic_core-2.41.5-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:e25c479382d26a2a41b7ebea1043564a937db462816ea07afa8a44c0866d52f9", size = 2065567, upload-time = "2025-11-04T13:39:12.244Z" }, + { url = "https://files.pythonhosted.org/packages/b3/24/cf84974ee7d6eae06b9e63289b7b8f6549d416b5c199ca2d7ce13bbcf619/pydantic_core-2.41.5-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:f547144f2966e1e16ae626d8ce72b4cfa0caedc7fa28052001c94fb2fcaa1c52", size = 2230442, upload-time = "2025-11-04T13:39:13.962Z" }, + { url = 
"https://files.pythonhosted.org/packages/fd/21/4e287865504b3edc0136c89c9c09431be326168b1eb7841911cbc877a995/pydantic_core-2.41.5-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:6f52298fbd394f9ed112d56f3d11aabd0d5bd27beb3084cc3d8ad069483b8941", size = 2350956, upload-time = "2025-11-04T13:39:15.889Z" }, + { url = "https://files.pythonhosted.org/packages/a8/76/7727ef2ffa4b62fcab916686a68a0426b9b790139720e1934e8ba797e238/pydantic_core-2.41.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:100baa204bb412b74fe285fb0f3a385256dad1d1879f0a5cb1499ed2e83d132a", size = 2068253, upload-time = "2025-11-04T13:39:17.403Z" }, + { url = "https://files.pythonhosted.org/packages/d5/8c/a4abfc79604bcb4c748e18975c44f94f756f08fb04218d5cb87eb0d3a63e/pydantic_core-2.41.5-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:05a2c8852530ad2812cb7914dc61a1125dc4e06252ee98e5638a12da6cc6fb6c", size = 2177050, upload-time = "2025-11-04T13:39:19.351Z" }, + { url = "https://files.pythonhosted.org/packages/67/b1/de2e9a9a79b480f9cb0b6e8b6ba4c50b18d4e89852426364c66aa82bb7b3/pydantic_core-2.41.5-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:29452c56df2ed968d18d7e21f4ab0ac55e71dc59524872f6fc57dcf4a3249ed2", size = 2147178, upload-time = "2025-11-04T13:39:21Z" }, + { url = "https://files.pythonhosted.org/packages/16/c1/dfb33f837a47b20417500efaa0378adc6635b3c79e8369ff7a03c494b4ac/pydantic_core-2.41.5-cp310-cp310-musllinux_1_1_armv7l.whl", hash = "sha256:d5160812ea7a8a2ffbe233d8da666880cad0cbaf5d4de74ae15c313213d62556", size = 2341833, upload-time = "2025-11-04T13:39:22.606Z" }, + { url = "https://files.pythonhosted.org/packages/47/36/00f398642a0f4b815a9a558c4f1dca1b4020a7d49562807d7bc9ff279a6c/pydantic_core-2.41.5-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:df3959765b553b9440adfd3c795617c352154e497a4eaf3752555cfb5da8fc49", size = 2321156, upload-time = "2025-11-04T13:39:25.843Z" }, + { url = 
"https://files.pythonhosted.org/packages/7e/70/cad3acd89fde2010807354d978725ae111ddf6d0ea46d1ea1775b5c1bd0c/pydantic_core-2.41.5-cp310-cp310-win32.whl", hash = "sha256:1f8d33a7f4d5a7889e60dc39856d76d09333d8a6ed0f5f1190635cbec70ec4ba", size = 1989378, upload-time = "2025-11-04T13:39:27.92Z" }, + { url = "https://files.pythonhosted.org/packages/76/92/d338652464c6c367e5608e4488201702cd1cbb0f33f7b6a85a60fe5f3720/pydantic_core-2.41.5-cp310-cp310-win_amd64.whl", hash = "sha256:62de39db01b8d593e45871af2af9e497295db8d73b085f6bfd0b18c83c70a8f9", size = 2013622, upload-time = "2025-11-04T13:39:29.848Z" }, + { url = "https://files.pythonhosted.org/packages/e8/72/74a989dd9f2084b3d9530b0915fdda64ac48831c30dbf7c72a41a5232db8/pydantic_core-2.41.5-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:a3a52f6156e73e7ccb0f8cced536adccb7042be67cb45f9562e12b319c119da6", size = 2105873, upload-time = "2025-11-04T13:39:31.373Z" }, + { url = "https://files.pythonhosted.org/packages/12/44/37e403fd9455708b3b942949e1d7febc02167662bf1a7da5b78ee1ea2842/pydantic_core-2.41.5-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:7f3bf998340c6d4b0c9a2f02d6a400e51f123b59565d74dc60d252ce888c260b", size = 1899826, upload-time = "2025-11-04T13:39:32.897Z" }, + { url = "https://files.pythonhosted.org/packages/33/7f/1d5cab3ccf44c1935a359d51a8a2a9e1a654b744b5e7f80d41b88d501eec/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:378bec5c66998815d224c9ca994f1e14c0c21cb95d2f52b6021cc0b2a58f2a5a", size = 1917869, upload-time = "2025-11-04T13:39:34.469Z" }, + { url = "https://files.pythonhosted.org/packages/6e/6a/30d94a9674a7fe4f4744052ed6c5e083424510be1e93da5bc47569d11810/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:e7b576130c69225432866fe2f4a469a85a54ade141d96fd396dffcf607b558f8", size = 2063890, upload-time = "2025-11-04T13:39:36.053Z" }, + { url = 
"https://files.pythonhosted.org/packages/50/be/76e5d46203fcb2750e542f32e6c371ffa9b8ad17364cf94bb0818dbfb50c/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:6cb58b9c66f7e4179a2d5e0f849c48eff5c1fca560994d6eb6543abf955a149e", size = 2229740, upload-time = "2025-11-04T13:39:37.753Z" }, + { url = "https://files.pythonhosted.org/packages/d3/ee/fed784df0144793489f87db310a6bbf8118d7b630ed07aa180d6067e653a/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:88942d3a3dff3afc8288c21e565e476fc278902ae4d6d134f1eeda118cc830b1", size = 2350021, upload-time = "2025-11-04T13:39:40.94Z" }, + { url = "https://files.pythonhosted.org/packages/c8/be/8fed28dd0a180dca19e72c233cbf58efa36df055e5b9d90d64fd1740b828/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f31d95a179f8d64d90f6831d71fa93290893a33148d890ba15de25642c5d075b", size = 2066378, upload-time = "2025-11-04T13:39:42.523Z" }, + { url = "https://files.pythonhosted.org/packages/b0/3b/698cf8ae1d536a010e05121b4958b1257f0b5522085e335360e53a6b1c8b/pydantic_core-2.41.5-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:c1df3d34aced70add6f867a8cf413e299177e0c22660cc767218373d0779487b", size = 2175761, upload-time = "2025-11-04T13:39:44.553Z" }, + { url = "https://files.pythonhosted.org/packages/b8/ba/15d537423939553116dea94ce02f9c31be0fa9d0b806d427e0308ec17145/pydantic_core-2.41.5-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:4009935984bd36bd2c774e13f9a09563ce8de4abaa7226f5108262fa3e637284", size = 2146303, upload-time = "2025-11-04T13:39:46.238Z" }, + { url = "https://files.pythonhosted.org/packages/58/7f/0de669bf37d206723795f9c90c82966726a2ab06c336deba4735b55af431/pydantic_core-2.41.5-cp311-cp311-musllinux_1_1_armv7l.whl", hash = "sha256:34a64bc3441dc1213096a20fe27e8e128bd3ff89921706e83c0b1ac971276594", size = 2340355, upload-time = "2025-11-04T13:39:48.002Z" }, + { url = 
"https://files.pythonhosted.org/packages/e5/de/e7482c435b83d7e3c3ee5ee4451f6e8973cff0eb6007d2872ce6383f6398/pydantic_core-2.41.5-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:c9e19dd6e28fdcaa5a1de679aec4141f691023916427ef9bae8584f9c2fb3b0e", size = 2319875, upload-time = "2025-11-04T13:39:49.705Z" }, + { url = "https://files.pythonhosted.org/packages/fe/e6/8c9e81bb6dd7560e33b9053351c29f30c8194b72f2d6932888581f503482/pydantic_core-2.41.5-cp311-cp311-win32.whl", hash = "sha256:2c010c6ded393148374c0f6f0bf89d206bf3217f201faa0635dcd56bd1520f6b", size = 1987549, upload-time = "2025-11-04T13:39:51.842Z" }, + { url = "https://files.pythonhosted.org/packages/11/66/f14d1d978ea94d1bc21fc98fcf570f9542fe55bfcc40269d4e1a21c19bf7/pydantic_core-2.41.5-cp311-cp311-win_amd64.whl", hash = "sha256:76ee27c6e9c7f16f47db7a94157112a2f3a00e958bc626e2f4ee8bec5c328fbe", size = 2011305, upload-time = "2025-11-04T13:39:53.485Z" }, + { url = "https://files.pythonhosted.org/packages/56/d8/0e271434e8efd03186c5386671328154ee349ff0354d83c74f5caaf096ed/pydantic_core-2.41.5-cp311-cp311-win_arm64.whl", hash = "sha256:4bc36bbc0b7584de96561184ad7f012478987882ebf9f9c389b23f432ea3d90f", size = 1972902, upload-time = "2025-11-04T13:39:56.488Z" }, + { url = "https://files.pythonhosted.org/packages/5f/5d/5f6c63eebb5afee93bcaae4ce9a898f3373ca23df3ccaef086d0233a35a7/pydantic_core-2.41.5-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:f41a7489d32336dbf2199c8c0a215390a751c5b014c2c1c5366e817202e9cdf7", size = 2110990, upload-time = "2025-11-04T13:39:58.079Z" }, + { url = "https://files.pythonhosted.org/packages/aa/32/9c2e8ccb57c01111e0fd091f236c7b371c1bccea0fa85247ac55b1e2b6b6/pydantic_core-2.41.5-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:070259a8818988b9a84a449a2a7337c7f430a22acc0859c6b110aa7212a6d9c0", size = 1896003, upload-time = "2025-11-04T13:39:59.956Z" }, + { url = 
"https://files.pythonhosted.org/packages/68/b8/a01b53cb0e59139fbc9e4fda3e9724ede8de279097179be4ff31f1abb65a/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e96cea19e34778f8d59fe40775a7a574d95816eb150850a85a7a4c8f4b94ac69", size = 1919200, upload-time = "2025-11-04T13:40:02.241Z" }, + { url = "https://files.pythonhosted.org/packages/38/de/8c36b5198a29bdaade07b5985e80a233a5ac27137846f3bc2d3b40a47360/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:ed2e99c456e3fadd05c991f8f437ef902e00eedf34320ba2b0842bd1c3ca3a75", size = 2052578, upload-time = "2025-11-04T13:40:04.401Z" }, + { url = "https://files.pythonhosted.org/packages/00/b5/0e8e4b5b081eac6cb3dbb7e60a65907549a1ce035a724368c330112adfdd/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:65840751b72fbfd82c3c640cff9284545342a4f1eb1586ad0636955b261b0b05", size = 2208504, upload-time = "2025-11-04T13:40:06.072Z" }, + { url = "https://files.pythonhosted.org/packages/77/56/87a61aad59c7c5b9dc8caad5a41a5545cba3810c3e828708b3d7404f6cef/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:e536c98a7626a98feb2d3eaf75944ef6f3dbee447e1f841eae16f2f0a72d8ddc", size = 2335816, upload-time = "2025-11-04T13:40:07.835Z" }, + { url = "https://files.pythonhosted.org/packages/0d/76/941cc9f73529988688a665a5c0ecff1112b3d95ab48f81db5f7606f522d3/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:eceb81a8d74f9267ef4081e246ffd6d129da5d87e37a77c9bde550cb04870c1c", size = 2075366, upload-time = "2025-11-04T13:40:09.804Z" }, + { url = "https://files.pythonhosted.org/packages/d3/43/ebef01f69baa07a482844faaa0a591bad1ef129253ffd0cdaa9d8a7f72d3/pydantic_core-2.41.5-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:d38548150c39b74aeeb0ce8ee1d8e82696f4a4e16ddc6de7b1d8823f7de4b9b5", size = 2171698, 
upload-time = "2025-11-04T13:40:12.004Z" }, + { url = "https://files.pythonhosted.org/packages/b1/87/41f3202e4193e3bacfc2c065fab7706ebe81af46a83d3e27605029c1f5a6/pydantic_core-2.41.5-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:c23e27686783f60290e36827f9c626e63154b82b116d7fe9adba1fda36da706c", size = 2132603, upload-time = "2025-11-04T13:40:13.868Z" }, + { url = "https://files.pythonhosted.org/packages/49/7d/4c00df99cb12070b6bccdef4a195255e6020a550d572768d92cc54dba91a/pydantic_core-2.41.5-cp312-cp312-musllinux_1_1_armv7l.whl", hash = "sha256:482c982f814460eabe1d3bb0adfdc583387bd4691ef00b90575ca0d2b6fe2294", size = 2329591, upload-time = "2025-11-04T13:40:15.672Z" }, + { url = "https://files.pythonhosted.org/packages/cc/6a/ebf4b1d65d458f3cda6a7335d141305dfa19bdc61140a884d165a8a1bbc7/pydantic_core-2.41.5-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:bfea2a5f0b4d8d43adf9d7b8bf019fb46fdd10a2e5cde477fbcb9d1fa08c68e1", size = 2319068, upload-time = "2025-11-04T13:40:17.532Z" }, + { url = "https://files.pythonhosted.org/packages/49/3b/774f2b5cd4192d5ab75870ce4381fd89cf218af999515baf07e7206753f0/pydantic_core-2.41.5-cp312-cp312-win32.whl", hash = "sha256:b74557b16e390ec12dca509bce9264c3bbd128f8a2c376eaa68003d7f327276d", size = 1985908, upload-time = "2025-11-04T13:40:19.309Z" }, + { url = "https://files.pythonhosted.org/packages/86/45/00173a033c801cacf67c190fef088789394feaf88a98a7035b0e40d53dc9/pydantic_core-2.41.5-cp312-cp312-win_amd64.whl", hash = "sha256:1962293292865bca8e54702b08a4f26da73adc83dd1fcf26fbc875b35d81c815", size = 2020145, upload-time = "2025-11-04T13:40:21.548Z" }, + { url = "https://files.pythonhosted.org/packages/f9/22/91fbc821fa6d261b376a3f73809f907cec5ca6025642c463d3488aad22fb/pydantic_core-2.41.5-cp312-cp312-win_arm64.whl", hash = "sha256:1746d4a3d9a794cacae06a5eaaccb4b8643a131d45fbc9af23e353dc0a5ba5c3", size = 1976179, upload-time = "2025-11-04T13:40:23.393Z" }, + { url = 
"https://files.pythonhosted.org/packages/87/06/8806241ff1f70d9939f9af039c6c35f2360cf16e93c2ca76f184e76b1564/pydantic_core-2.41.5-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:941103c9be18ac8daf7b7adca8228f8ed6bb7a1849020f643b3a14d15b1924d9", size = 2120403, upload-time = "2025-11-04T13:40:25.248Z" }, + { url = "https://files.pythonhosted.org/packages/94/02/abfa0e0bda67faa65fef1c84971c7e45928e108fe24333c81f3bfe35d5f5/pydantic_core-2.41.5-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:112e305c3314f40c93998e567879e887a3160bb8689ef3d2c04b6cc62c33ac34", size = 1896206, upload-time = "2025-11-04T13:40:27.099Z" }, + { url = "https://files.pythonhosted.org/packages/15/df/a4c740c0943e93e6500f9eb23f4ca7ec9bf71b19e608ae5b579678c8d02f/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0cbaad15cb0c90aa221d43c00e77bb33c93e8d36e0bf74760cd00e732d10a6a0", size = 1919307, upload-time = "2025-11-04T13:40:29.806Z" }, + { url = "https://files.pythonhosted.org/packages/9a/e3/6324802931ae1d123528988e0e86587c2072ac2e5394b4bc2bc34b61ff6e/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:03ca43e12fab6023fc79d28ca6b39b05f794ad08ec2feccc59a339b02f2b3d33", size = 2063258, upload-time = "2025-11-04T13:40:33.544Z" }, + { url = "https://files.pythonhosted.org/packages/c9/d4/2230d7151d4957dd79c3044ea26346c148c98fbf0ee6ebd41056f2d62ab5/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:dc799088c08fa04e43144b164feb0c13f9a0bc40503f8df3e9fde58a3c0c101e", size = 2214917, upload-time = "2025-11-04T13:40:35.479Z" }, + { url = "https://files.pythonhosted.org/packages/e6/9f/eaac5df17a3672fef0081b6c1bb0b82b33ee89aa5cec0d7b05f52fd4a1fa/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:97aeba56665b4c3235a0e52b2c2f5ae9cd071b8a8310ad27bddb3f7fb30e9aa2", size = 2332186, upload-time = "2025-11-04T13:40:37.436Z" }, + { 
url = "https://files.pythonhosted.org/packages/cf/4e/35a80cae583a37cf15604b44240e45c05e04e86f9cfd766623149297e971/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:406bf18d345822d6c21366031003612b9c77b3e29ffdb0f612367352aab7d586", size = 2073164, upload-time = "2025-11-04T13:40:40.289Z" }, + { url = "https://files.pythonhosted.org/packages/bf/e3/f6e262673c6140dd3305d144d032f7bd5f7497d3871c1428521f19f9efa2/pydantic_core-2.41.5-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:b93590ae81f7010dbe380cdeab6f515902ebcbefe0b9327cc4804d74e93ae69d", size = 2179146, upload-time = "2025-11-04T13:40:42.809Z" }, + { url = "https://files.pythonhosted.org/packages/75/c7/20bd7fc05f0c6ea2056a4565c6f36f8968c0924f19b7d97bbfea55780e73/pydantic_core-2.41.5-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:01a3d0ab748ee531f4ea6c3e48ad9dac84ddba4b0d82291f87248f2f9de8d740", size = 2137788, upload-time = "2025-11-04T13:40:44.752Z" }, + { url = "https://files.pythonhosted.org/packages/3a/8d/34318ef985c45196e004bc46c6eab2eda437e744c124ef0dbe1ff2c9d06b/pydantic_core-2.41.5-cp313-cp313-musllinux_1_1_armv7l.whl", hash = "sha256:6561e94ba9dacc9c61bce40e2d6bdc3bfaa0259d3ff36ace3b1e6901936d2e3e", size = 2340133, upload-time = "2025-11-04T13:40:46.66Z" }, + { url = "https://files.pythonhosted.org/packages/9c/59/013626bf8c78a5a5d9350d12e7697d3d4de951a75565496abd40ccd46bee/pydantic_core-2.41.5-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:915c3d10f81bec3a74fbd4faebe8391013ba61e5a1a8d48c4455b923bdda7858", size = 2324852, upload-time = "2025-11-04T13:40:48.575Z" }, + { url = "https://files.pythonhosted.org/packages/1a/d9/c248c103856f807ef70c18a4f986693a46a8ffe1602e5d361485da502d20/pydantic_core-2.41.5-cp313-cp313-win32.whl", hash = "sha256:650ae77860b45cfa6e2cdafc42618ceafab3a2d9a3811fcfbd3bbf8ac3c40d36", size = 1994679, upload-time = "2025-11-04T13:40:50.619Z" }, + { url = 
"https://files.pythonhosted.org/packages/9e/8b/341991b158ddab181cff136acd2552c9f35bd30380422a639c0671e99a91/pydantic_core-2.41.5-cp313-cp313-win_amd64.whl", hash = "sha256:79ec52ec461e99e13791ec6508c722742ad745571f234ea6255bed38c6480f11", size = 2019766, upload-time = "2025-11-04T13:40:52.631Z" }, + { url = "https://files.pythonhosted.org/packages/73/7d/f2f9db34af103bea3e09735bb40b021788a5e834c81eedb541991badf8f5/pydantic_core-2.41.5-cp313-cp313-win_arm64.whl", hash = "sha256:3f84d5c1b4ab906093bdc1ff10484838aca54ef08de4afa9de0f5f14d69639cd", size = 1981005, upload-time = "2025-11-04T13:40:54.734Z" }, + { url = "https://files.pythonhosted.org/packages/ea/28/46b7c5c9635ae96ea0fbb779e271a38129df2550f763937659ee6c5dbc65/pydantic_core-2.41.5-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:3f37a19d7ebcdd20b96485056ba9e8b304e27d9904d233d7b1015db320e51f0a", size = 2119622, upload-time = "2025-11-04T13:40:56.68Z" }, + { url = "https://files.pythonhosted.org/packages/74/1a/145646e5687e8d9a1e8d09acb278c8535ebe9e972e1f162ed338a622f193/pydantic_core-2.41.5-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:1d1d9764366c73f996edd17abb6d9d7649a7eb690006ab6adbda117717099b14", size = 1891725, upload-time = "2025-11-04T13:40:58.807Z" }, + { url = "https://files.pythonhosted.org/packages/23/04/e89c29e267b8060b40dca97bfc64a19b2a3cf99018167ea1677d96368273/pydantic_core-2.41.5-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:25e1c2af0fce638d5f1988b686f3b3ea8cd7de5f244ca147c777769e798a9cd1", size = 1915040, upload-time = "2025-11-04T13:41:00.853Z" }, + { url = "https://files.pythonhosted.org/packages/84/a3/15a82ac7bd97992a82257f777b3583d3e84bdb06ba6858f745daa2ec8a85/pydantic_core-2.41.5-cp314-cp314-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:506d766a8727beef16b7adaeb8ee6217c64fc813646b424d0804d67c16eddb66", size = 2063691, upload-time = "2025-11-04T13:41:03.504Z" }, + { url = 
"https://files.pythonhosted.org/packages/74/9b/0046701313c6ef08c0c1cf0e028c67c770a4e1275ca73131563c5f2a310a/pydantic_core-2.41.5-cp314-cp314-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:4819fa52133c9aa3c387b3328f25c1facc356491e6135b459f1de698ff64d869", size = 2213897, upload-time = "2025-11-04T13:41:05.804Z" }, + { url = "https://files.pythonhosted.org/packages/8a/cd/6bac76ecd1b27e75a95ca3a9a559c643b3afcd2dd62086d4b7a32a18b169/pydantic_core-2.41.5-cp314-cp314-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2b761d210c9ea91feda40d25b4efe82a1707da2ef62901466a42492c028553a2", size = 2333302, upload-time = "2025-11-04T13:41:07.809Z" }, + { url = "https://files.pythonhosted.org/packages/4c/d2/ef2074dc020dd6e109611a8be4449b98cd25e1b9b8a303c2f0fca2f2bcf7/pydantic_core-2.41.5-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:22f0fb8c1c583a3b6f24df2470833b40207e907b90c928cc8d3594b76f874375", size = 2064877, upload-time = "2025-11-04T13:41:09.827Z" }, + { url = "https://files.pythonhosted.org/packages/18/66/e9db17a9a763d72f03de903883c057b2592c09509ccfe468187f2a2eef29/pydantic_core-2.41.5-cp314-cp314-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:2782c870e99878c634505236d81e5443092fba820f0373997ff75f90f68cd553", size = 2180680, upload-time = "2025-11-04T13:41:12.379Z" }, + { url = "https://files.pythonhosted.org/packages/d3/9e/3ce66cebb929f3ced22be85d4c2399b8e85b622db77dad36b73c5387f8f8/pydantic_core-2.41.5-cp314-cp314-musllinux_1_1_aarch64.whl", hash = "sha256:0177272f88ab8312479336e1d777f6b124537d47f2123f89cb37e0accea97f90", size = 2138960, upload-time = "2025-11-04T13:41:14.627Z" }, + { url = "https://files.pythonhosted.org/packages/a6/62/205a998f4327d2079326b01abee48e502ea739d174f0a89295c481a2272e/pydantic_core-2.41.5-cp314-cp314-musllinux_1_1_armv7l.whl", hash = "sha256:63510af5e38f8955b8ee5687740d6ebf7c2a0886d15a6d65c32814613681bc07", size = 2339102, upload-time = "2025-11-04T13:41:16.868Z" }, + { url 
= "https://files.pythonhosted.org/packages/3c/0d/f05e79471e889d74d3d88f5bd20d0ed189ad94c2423d81ff8d0000aab4ff/pydantic_core-2.41.5-cp314-cp314-musllinux_1_1_x86_64.whl", hash = "sha256:e56ba91f47764cc14f1daacd723e3e82d1a89d783f0f5afe9c364b8bb491ccdb", size = 2326039, upload-time = "2025-11-04T13:41:18.934Z" }, + { url = "https://files.pythonhosted.org/packages/ec/e1/e08a6208bb100da7e0c4b288eed624a703f4d129bde2da475721a80cab32/pydantic_core-2.41.5-cp314-cp314-win32.whl", hash = "sha256:aec5cf2fd867b4ff45b9959f8b20ea3993fc93e63c7363fe6851424c8a7e7c23", size = 1995126, upload-time = "2025-11-04T13:41:21.418Z" }, + { url = "https://files.pythonhosted.org/packages/48/5d/56ba7b24e9557f99c9237e29f5c09913c81eeb2f3217e40e922353668092/pydantic_core-2.41.5-cp314-cp314-win_amd64.whl", hash = "sha256:8e7c86f27c585ef37c35e56a96363ab8de4e549a95512445b85c96d3e2f7c1bf", size = 2015489, upload-time = "2025-11-04T13:41:24.076Z" }, + { url = "https://files.pythonhosted.org/packages/4e/bb/f7a190991ec9e3e0ba22e4993d8755bbc4a32925c0b5b42775c03e8148f9/pydantic_core-2.41.5-cp314-cp314-win_arm64.whl", hash = "sha256:e672ba74fbc2dc8eea59fb6d4aed6845e6905fc2a8afe93175d94a83ba2a01a0", size = 1977288, upload-time = "2025-11-04T13:41:26.33Z" }, + { url = "https://files.pythonhosted.org/packages/92/ed/77542d0c51538e32e15afe7899d79efce4b81eee631d99850edc2f5e9349/pydantic_core-2.41.5-cp314-cp314t-macosx_10_12_x86_64.whl", hash = "sha256:8566def80554c3faa0e65ac30ab0932b9e3a5cd7f8323764303d468e5c37595a", size = 2120255, upload-time = "2025-11-04T13:41:28.569Z" }, + { url = "https://files.pythonhosted.org/packages/bb/3d/6913dde84d5be21e284439676168b28d8bbba5600d838b9dca99de0fad71/pydantic_core-2.41.5-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:b80aa5095cd3109962a298ce14110ae16b8c1aece8b72f9dafe81cf597ad80b3", size = 1863760, upload-time = "2025-11-04T13:41:31.055Z" }, + { url = 
"https://files.pythonhosted.org/packages/5a/f0/e5e6b99d4191da102f2b0eb9687aaa7f5bea5d9964071a84effc3e40f997/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3006c3dd9ba34b0c094c544c6006cc79e87d8612999f1a5d43b769b89181f23c", size = 1878092, upload-time = "2025-11-04T13:41:33.21Z" }, + { url = "https://files.pythonhosted.org/packages/71/48/36fb760642d568925953bcc8116455513d6e34c4beaa37544118c36aba6d/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:72f6c8b11857a856bcfa48c86f5368439f74453563f951e473514579d44aa612", size = 2053385, upload-time = "2025-11-04T13:41:35.508Z" }, + { url = "https://files.pythonhosted.org/packages/20/25/92dc684dd8eb75a234bc1c764b4210cf2646479d54b47bf46061657292a8/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:5cb1b2f9742240e4bb26b652a5aeb840aa4b417c7748b6f8387927bc6e45e40d", size = 2218832, upload-time = "2025-11-04T13:41:37.732Z" }, + { url = "https://files.pythonhosted.org/packages/e2/09/f53e0b05023d3e30357d82eb35835d0f6340ca344720a4599cd663dca599/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:bd3d54f38609ff308209bd43acea66061494157703364ae40c951f83ba99a1a9", size = 2327585, upload-time = "2025-11-04T13:41:40Z" }, + { url = "https://files.pythonhosted.org/packages/aa/4e/2ae1aa85d6af35a39b236b1b1641de73f5a6ac4d5a7509f77b814885760c/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2ff4321e56e879ee8d2a879501c8e469414d948f4aba74a2d4593184eb326660", size = 2041078, upload-time = "2025-11-04T13:41:42.323Z" }, + { url = "https://files.pythonhosted.org/packages/cd/13/2e215f17f0ef326fc72afe94776edb77525142c693767fc347ed6288728d/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:d0d2568a8c11bf8225044aa94409e21da0cb09dcdafe9ecd10250b2baad531a9", size = 2173914, 
upload-time = "2025-11-04T13:41:45.221Z" }, + { url = "https://files.pythonhosted.org/packages/02/7a/f999a6dcbcd0e5660bc348a3991c8915ce6599f4f2c6ac22f01d7a10816c/pydantic_core-2.41.5-cp314-cp314t-musllinux_1_1_aarch64.whl", hash = "sha256:a39455728aabd58ceabb03c90e12f71fd30fa69615760a075b9fec596456ccc3", size = 2129560, upload-time = "2025-11-04T13:41:47.474Z" }, + { url = "https://files.pythonhosted.org/packages/3a/b1/6c990ac65e3b4c079a4fb9f5b05f5b013afa0f4ed6780a3dd236d2cbdc64/pydantic_core-2.41.5-cp314-cp314t-musllinux_1_1_armv7l.whl", hash = "sha256:239edca560d05757817c13dc17c50766136d21f7cd0fac50295499ae24f90fdf", size = 2329244, upload-time = "2025-11-04T13:41:49.992Z" }, + { url = "https://files.pythonhosted.org/packages/d9/02/3c562f3a51afd4d88fff8dffb1771b30cfdfd79befd9883ee094f5b6c0d8/pydantic_core-2.41.5-cp314-cp314t-musllinux_1_1_x86_64.whl", hash = "sha256:2a5e06546e19f24c6a96a129142a75cee553cc018ffee48a460059b1185f4470", size = 2331955, upload-time = "2025-11-04T13:41:54.079Z" }, + { url = "https://files.pythonhosted.org/packages/5c/96/5fb7d8c3c17bc8c62fdb031c47d77a1af698f1d7a406b0f79aaa1338f9ad/pydantic_core-2.41.5-cp314-cp314t-win32.whl", hash = "sha256:b4ececa40ac28afa90871c2cc2b9ffd2ff0bf749380fbdf57d165fd23da353aa", size = 1988906, upload-time = "2025-11-04T13:41:56.606Z" }, + { url = "https://files.pythonhosted.org/packages/22/ed/182129d83032702912c2e2d8bbe33c036f342cc735737064668585dac28f/pydantic_core-2.41.5-cp314-cp314t-win_amd64.whl", hash = "sha256:80aa89cad80b32a912a65332f64a4450ed00966111b6615ca6816153d3585a8c", size = 1981607, upload-time = "2025-11-04T13:41:58.889Z" }, + { url = "https://files.pythonhosted.org/packages/9f/ed/068e41660b832bb0b1aa5b58011dea2a3fe0ba7861ff38c4d4904c1c1a99/pydantic_core-2.41.5-cp314-cp314t-win_arm64.whl", hash = "sha256:35b44f37a3199f771c3eaa53051bc8a70cd7b54f333531c59e29fd4db5d15008", size = 1974769, upload-time = "2025-11-04T13:42:01.186Z" }, + { url = 
"https://files.pythonhosted.org/packages/11/72/90fda5ee3b97e51c494938a4a44c3a35a9c96c19bba12372fb9c634d6f57/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-macosx_10_12_x86_64.whl", hash = "sha256:b96d5f26b05d03cc60f11a7761a5ded1741da411e7fe0909e27a5e6a0cb7b034", size = 2115441, upload-time = "2025-11-04T13:42:39.557Z" }, + { url = "https://files.pythonhosted.org/packages/1f/53/8942f884fa33f50794f119012dc6a1a02ac43a56407adaac20463df8e98f/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-macosx_11_0_arm64.whl", hash = "sha256:634e8609e89ceecea15e2d61bc9ac3718caaaa71963717bf3c8f38bfde64242c", size = 1930291, upload-time = "2025-11-04T13:42:42.169Z" }, + { url = "https://files.pythonhosted.org/packages/79/c8/ecb9ed9cd942bce09fc888ee960b52654fbdbede4ba6c2d6e0d3b1d8b49c/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:93e8740d7503eb008aa2df04d3b9735f845d43ae845e6dcd2be0b55a2da43cd2", size = 1948632, upload-time = "2025-11-04T13:42:44.564Z" }, + { url = "https://files.pythonhosted.org/packages/2e/1b/687711069de7efa6af934e74f601e2a4307365e8fdc404703afc453eab26/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f15489ba13d61f670dcc96772e733aad1a6f9c429cc27574c6cdaed82d0146ad", size = 2138905, upload-time = "2025-11-04T13:42:47.156Z" }, + { url = "https://files.pythonhosted.org/packages/09/32/59b0c7e63e277fa7911c2fc70ccfb45ce4b98991e7ef37110663437005af/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-macosx_10_12_x86_64.whl", hash = "sha256:7da7087d756b19037bc2c06edc6c170eeef3c3bafcb8f532ff17d64dc427adfd", size = 2110495, upload-time = "2025-11-04T13:42:49.689Z" }, + { url = "https://files.pythonhosted.org/packages/aa/81/05e400037eaf55ad400bcd318c05bb345b57e708887f07ddb2d20e3f0e98/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-macosx_11_0_arm64.whl", hash = 
"sha256:aabf5777b5c8ca26f7824cb4a120a740c9588ed58df9b2d196ce92fba42ff8dc", size = 1915388, upload-time = "2025-11-04T13:42:52.215Z" }, + { url = "https://files.pythonhosted.org/packages/6e/0d/e3549b2399f71d56476b77dbf3cf8937cec5cd70536bdc0e374a421d0599/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c007fe8a43d43b3969e8469004e9845944f1a80e6acd47c150856bb87f230c56", size = 1942879, upload-time = "2025-11-04T13:42:56.483Z" }, + { url = "https://files.pythonhosted.org/packages/f7/07/34573da085946b6a313d7c42f82f16e8920bfd730665de2d11c0c37a74b5/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:76d0819de158cd855d1cbb8fcafdf6f5cf1eb8e470abe056d5d161106e38062b", size = 2139017, upload-time = "2025-11-04T13:42:59.471Z" }, + { url = "https://files.pythonhosted.org/packages/e6/b0/1a2aa41e3b5a4ba11420aba2d091b2d17959c8d1519ece3627c371951e73/pydantic_core-2.41.5-pp310-pypy310_pp73-macosx_10_12_x86_64.whl", hash = "sha256:b5819cd790dbf0c5eb9f82c73c16b39a65dd6dd4d1439dcdea7816ec9adddab8", size = 2103351, upload-time = "2025-11-04T13:43:02.058Z" }, + { url = "https://files.pythonhosted.org/packages/a4/ee/31b1f0020baaf6d091c87900ae05c6aeae101fa4e188e1613c80e4f1ea31/pydantic_core-2.41.5-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:5a4e67afbc95fa5c34cf27d9089bca7fcab4e51e57278d710320a70b956d1b9a", size = 1925363, upload-time = "2025-11-04T13:43:05.159Z" }, + { url = "https://files.pythonhosted.org/packages/e1/89/ab8e86208467e467a80deaca4e434adac37b10a9d134cd2f99b28a01e483/pydantic_core-2.41.5-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ece5c59f0ce7d001e017643d8d24da587ea1f74f6993467d85ae8a5ef9d4f42b", size = 2135615, upload-time = "2025-11-04T13:43:08.116Z" }, + { url = 
"https://files.pythonhosted.org/packages/99/0a/99a53d06dd0348b2008f2f30884b34719c323f16c3be4e6cc1203b74a91d/pydantic_core-2.41.5-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:16f80f7abe3351f8ea6858914ddc8c77e02578544a0ebc15b4c2e1a0e813b0b2", size = 2175369, upload-time = "2025-11-04T13:43:12.49Z" }, + { url = "https://files.pythonhosted.org/packages/6d/94/30ca3b73c6d485b9bb0bc66e611cff4a7138ff9736b7e66bcf0852151636/pydantic_core-2.41.5-pp310-pypy310_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:33cb885e759a705b426baada1fe68cbb0a2e68e34c5d0d0289a364cf01709093", size = 2144218, upload-time = "2025-11-04T13:43:15.431Z" }, + { url = "https://files.pythonhosted.org/packages/87/57/31b4f8e12680b739a91f472b5671294236b82586889ef764b5fbc6669238/pydantic_core-2.41.5-pp310-pypy310_pp73-musllinux_1_1_armv7l.whl", hash = "sha256:c8d8b4eb992936023be7dee581270af5c6e0697a8559895f527f5b7105ecd36a", size = 2329951, upload-time = "2025-11-04T13:43:18.062Z" }, + { url = "https://files.pythonhosted.org/packages/7d/73/3c2c8edef77b8f7310e6fb012dbc4b8551386ed575b9eb6fb2506e28a7eb/pydantic_core-2.41.5-pp310-pypy310_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:242a206cd0318f95cd21bdacff3fcc3aab23e79bba5cac3db5a841c9ef9c6963", size = 2318428, upload-time = "2025-11-04T13:43:20.679Z" }, + { url = "https://files.pythonhosted.org/packages/2f/02/8559b1f26ee0d502c74f9cca5c0d2fd97e967e083e006bbbb4e97f3a043a/pydantic_core-2.41.5-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:d3a978c4f57a597908b7e697229d996d77a6d3c94901e9edee593adada95ce1a", size = 2147009, upload-time = "2025-11-04T13:43:23.286Z" }, + { url = "https://files.pythonhosted.org/packages/5f/9b/1b3f0e9f9305839d7e84912f9e8bfbd191ed1b1ef48083609f0dabde978c/pydantic_core-2.41.5-pp311-pypy311_pp73-macosx_10_12_x86_64.whl", hash = "sha256:b2379fa7ed44ddecb5bfe4e48577d752db9fc10be00a6b7446e9663ba143de26", size = 2101980, upload-time = "2025-11-04T13:43:25.97Z" }, + { url = 
"https://files.pythonhosted.org/packages/a4/ed/d71fefcb4263df0da6a85b5d8a7508360f2f2e9b3bf5814be9c8bccdccc1/pydantic_core-2.41.5-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:266fb4cbf5e3cbd0b53669a6d1b039c45e3ce651fd5442eff4d07c2cc8d66808", size = 1923865, upload-time = "2025-11-04T13:43:28.763Z" }, + { url = "https://files.pythonhosted.org/packages/ce/3a/626b38db460d675f873e4444b4bb030453bbe7b4ba55df821d026a0493c4/pydantic_core-2.41.5-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:58133647260ea01e4d0500089a8c4f07bd7aa6ce109682b1426394988d8aaacc", size = 2134256, upload-time = "2025-11-04T13:43:31.71Z" }, + { url = "https://files.pythonhosted.org/packages/83/d9/8412d7f06f616bbc053d30cb4e5f76786af3221462ad5eee1f202021eb4e/pydantic_core-2.41.5-pp311-pypy311_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:287dad91cfb551c363dc62899a80e9e14da1f0e2b6ebde82c806612ca2a13ef1", size = 2174762, upload-time = "2025-11-04T13:43:34.744Z" }, + { url = "https://files.pythonhosted.org/packages/55/4c/162d906b8e3ba3a99354e20faa1b49a85206c47de97a639510a0e673f5da/pydantic_core-2.41.5-pp311-pypy311_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:03b77d184b9eb40240ae9fd676ca364ce1085f203e1b1256f8ab9984dca80a84", size = 2143141, upload-time = "2025-11-04T13:43:37.701Z" }, + { url = "https://files.pythonhosted.org/packages/1f/f2/f11dd73284122713f5f89fc940f370d035fa8e1e078d446b3313955157fe/pydantic_core-2.41.5-pp311-pypy311_pp73-musllinux_1_1_armv7l.whl", hash = "sha256:a668ce24de96165bb239160b3d854943128f4334822900534f2fe947930e5770", size = 2330317, upload-time = "2025-11-04T13:43:40.406Z" }, + { url = "https://files.pythonhosted.org/packages/88/9d/b06ca6acfe4abb296110fb1273a4d848a0bfb2ff65f3ee92127b3244e16b/pydantic_core-2.41.5-pp311-pypy311_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:f14f8f046c14563f8eb3f45f499cc658ab8d10072961e07225e507adb700e93f", size = 2316992, upload-time = "2025-11-04T13:43:43.602Z" }, + { url = 
"https://files.pythonhosted.org/packages/36/c7/cfc8e811f061c841d7990b0201912c3556bfeb99cdcb7ed24adc8d6f8704/pydantic_core-2.41.5-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:56121965f7a4dc965bff783d70b907ddf3d57f6eba29b6d2e5dabfaf07799c51", size = 2145302, upload-time = "2025-11-04T13:43:46.64Z" }, +] + [[package]] name = "pygments" version = "2.19.2" @@ -1125,6 +1269,18 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/18/67/36e9267722cc04a6b9f15c7f3441c2363321a3ea07da7ae0c0707beb2a9c/typing_extensions-4.15.0-py3-none-any.whl", hash = "sha256:f0fa19c6845758ab08074a0cfa8b7aecb71c999ca73d62883bc25cc018c4e548", size = 44614, upload-time = "2025-08-25T13:49:24.86Z" }, ] +[[package]] +name = "typing-inspection" +version = "0.4.2" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "typing-extensions" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/55/e3/70399cb7dd41c10ac53367ae42139cf4b1ca5f36bb3dc6c9d33acdb43655/typing_inspection-0.4.2.tar.gz", hash = "sha256:ba561c48a67c5958007083d386c3295464928b01faa735ab8547c5692e87f464", size = 75949, upload-time = "2025-10-01T02:14:41.687Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/dc/9b/47798a6c91d8bdb567fe2698fe81e0c6b7cb7ef4d13da4114b41d239f65d/typing_inspection-0.4.2-py3-none-any.whl", hash = "sha256:4ed1cacbdc298c220f1bd249ed5287caa16f34d44ef4e9c3d0cbad5b521545e7", size = 14611, upload-time = "2025-10-01T02:14:40.154Z" }, +] + [[package]] name = "urllib3" version = "2.6.3" diff --git a/cli/llmb-run/Bulk_Examples.md b/cli/llmb-run/Bulk_Examples.md index bde7a06..6390e3d 100644 --- a/cli/llmb-run/Bulk_Examples.md +++ b/cli/llmb-run/Bulk_Examples.md @@ -1,14 +1,13 @@ # Bulk Job Submission Examples Note: Bulk mode is now accessed via `llmb-run submit -f `. -The standalone `bulk` command is deprecated but still works. 
## YAML Header Format **Required format:** `workload_key_modelsize:` The header must include both the workload name and model size, separated by an underscore. -The model size must end with `b` (for billion parameters). +The model size must end with `b` (billions of parameters) or `t` (trillions of parameters). ### Valid Examples @@ -19,7 +18,10 @@ pretrain_llama3.1_70b: # Workload with decimal version pretrain_nemotron-h_56b: # Workload with hyphen tasks: [...] -pretrain_grok1_314b: # Large model +pretrain_deepseek-v3_671b: # Large model + tasks: [...] + +pretrain_kimi-k2_1t: # Trillion-parameter model tasks: [...] ``` @@ -29,7 +31,7 @@ pretrain_grok1_314b: # Large model pretrain_nemotron-h: # ❌ Missing model size tasks: [...] -pretrain_llama_7x: # ❌ Invalid format (must end with 'b') +pretrain_llama_7x: # ❌ Invalid format (must end with 'b' or 't') tasks: [...] ``` @@ -39,7 +41,7 @@ ______________________________________________________________________ ## Job Specification Formats -There are two supported file formats for bulk job submission: +There are two supported file formats for file-based bulk job submission: 1. **YAML Format** (.yaml) - **Recommended**. Supports all features including environment variables and overrides. 2. **Text Format** (.txt) - **Legacy**. Supports basic configurations (workload, model size, dtype, scale, repeats). @@ -78,10 +80,10 @@ pretrain_llama3.1_70b: ### With Proxy Configuration ```yaml -pretrain_nemotron4_340b: +pretrain_deepseek-v3_671b: tasks: - dtypes: 'bf16' - scales: [16] + scales: [64] repeats: 1 proxy: true # Altered configuration for debug workflows ``` @@ -91,17 +93,17 @@ pretrain_nemotron4_340b: ### With Environment Variables ```yaml -pretrain_grok1_314b: +pretrain_qwen3_235b: defaults: env: DEBUG: true tasks: - - dtypes: 'fp8' - scales: [128, 256] + - dtypes: 'bf16' + scales: [256, 512] repeats: 3 ``` -**Explanation**: This example sets global environment variables for all jobs. 
The workload will run with fp8 precision at two scales, with each configuration repeated 3 times. The environment variables will be applied to all 6 jobs. +**Explanation**: This example sets global environment variables for all jobs. The workload will run with bf16 precision at two scales, with each configuration repeated 3 times. The environment variables will be applied to all 6 jobs. ### Complex Configuration (Overrides & Profiling) @@ -145,10 +147,10 @@ pretrain_llama3.1_405b: scales: [256, 512] repeats: 3 -pretrain_grok1_314b: +pretrain_qwen3_235b: tasks: - - dtypes: 'fp8' - scales: [128, 256] + - dtypes: 'bf16' + scales: [256, 512] repeats: 2 ``` @@ -170,10 +172,10 @@ pretrain_llama3.1_405b: ### Mixed Text Example ``` -pretrain_nemotron4_340b: -(['bf16','fp8'], [128, 256], 2) +pretrain_qwen3_235b: +('bf16', [256, 512], 2) # True enables profiling -('fp8', [512], 1, True) +('bf16', [512], 1, True) ``` **Note**: The example above shows the correct way to add comments - on their own lines. Inline comments like `('fp8', [512], 1, True) # comment` will cause parsing errors. diff --git a/cli/llmb-run/CHANGELOG.md b/cli/llmb-run/CHANGELOG.md index 57bf160..62320b4 100644 --- a/cli/llmb-run/CHANGELOG.md +++ b/cli/llmb-run/CHANGELOG.md @@ -5,17 +5,74 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). -## [1.10.11] - 2026-04-14 +## [1.14.4] - 2026-05-15 + +### Changed + +- `llmb-run exemplar` now falls back to one non-profiled repeat when `config.repeats` and `config.profile` are omitted. + +## [1.14.3] - 2026-05-13 + +### Added + +- `llmb-run --version` now also prints the recipe version from the configured install's `release.yaml` when available. 
+
+## [1.14.2] - 2026-05-12
 
 ### Fixed
 
-- Archive now excludes experiment-level `checkpoints/` directories, preventing Megatron-Bridge `*.distcp` checkpoint shards from bloating archives.
+- Exclude `torch_profile/` PyTorch profiler output directories from `llmb-run archive`.
+
+## [1.14.1] - 2026-05-08
 
-## [1.10.10] - 2026-04-10
+### Fixed
+
+- Require supported pretraining job logs to reach their reported final iteration before showing parsed performance metrics.
+
+## [1.14.0] - 2026-05-05
+
+### Added
+
+- `llmb-run jobs`: local SQLite-backed job history with submission recording, Slurm status refresh, performance results for supported workload logs, detail view, launcher-aware log access, and rebuild from existing non-legacy `llmb-config_*.yaml` files.
+
+## [1.13.1] - 2026-05-04
+
+### Fixed
+
+- Accept trillion-parameter (`t`) model-size suffixes in workload parsing, bulk YAML headers, and Exemplar ordering.
+
+## [1.13.0] - 2026-04-21
+
+### Added
+
+- `llmb-run submit --dump-env` for Megatron-Bridge workloads, capturing a redacted rank-0 environment snapshot.
+
+### Removed
+
+- Removed deprecated `llmb-run single`, `llmb-run bulk`, and `llmb-run submit-all` commands. Use `llmb-run submit` for explicit, file-based, and discovery submissions.
+
+## [1.12.1] - 2026-04-21
+
+### Fixed
+
+- Archive now excludes `nsys_profile/` directories to avoid packaging large profiling artifacts.
+
+## [1.12.0] - 2026-04-21
+
+### Added
+
+- `llmb-run submit`: repeatable `--env KEY=value` flag for explicit job environment overrides. YAML task-spec `env:` blocks now receive the same treatment.
 
 ### Fixed
 
 - Archive now excludes `*.pt.trace.json` files produced by newer PyTorch profiling output.
+- Archive now excludes experiment-level `checkpoints/` directories, preventing Megatron-Bridge `*.distcp` checkpoint shards from bloating archives.
+ +## [1.11.0] - 2026-04-09 + +### Added + +- `llmb-run submit`: first-class Slurm submission flags `--nodelist`, `--exclude`, `--reservation`, `--segment`, `--nice`, and repeatable `--slurm-arg`. ## [1.10.9] - 2026-04-06 diff --git a/cli/llmb-run/README.md b/cli/llmb-run/README.md index dcd16fe..0bffde0 100644 --- a/cli/llmb-run/README.md +++ b/cli/llmb-run/README.md @@ -38,6 +38,9 @@ llmb-run list # Run your first job (example) llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 256 + +# Check submitted jobs and results +llmb-run jobs ``` **Note**: llmb-run requires access to `cluster_config.yaml` which is located in your installation directory. Always run llmb-run commands from this directory. @@ -101,7 +104,7 @@ Workload configuration: ## Commands -llmb-run's primary interface is the `submit` command, which handles all job submission modes. The `list` command is also available for discovery. +llmb-run's primary interface is the `submit` command, which handles all job submission modes. Use `list` to discover available workloads and `jobs` to inspect submitted jobs. ### CLI Structure (Global vs Command Options) @@ -129,35 +132,39 @@ llmb-run submit -v -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 256 ### Submit Command -The `submit` command is a unified interface for all job submissions. It supports three main workflows: +The `submit` command is a unified interface for all job submissions. It supports these common workflows: #### Choose a Submit Workflow Pick the workflow that matches how you want to run: -- **Explicit (single job)**: You provide `--workload`, `--model_size`, `--dtype`, and `--scale`. - - Pattern: `llmb-run submit -w -s --dtype --scale ` +- **Single explicit target**: You provide `--workload`, `--model-size`, `--dtype`, and `--scale`. + - Pattern: `llmb-run submit -w -s --dtype --scale ` +- **Target-list selection**: You provide comma-separated `--workload` targets and omit `--model-size`. 
+ - Pattern: `llmb-run submit -w _, --dtype --scale ` +- **File-based (batch; special cases)**: You provide an input file and llmb-run submits the jobs listed in it. + - Pattern: `llmb-run submit -f ` - **Auto-discovery (submit all / many)**: You provide discovery constraints and llmb-run generates jobs from installed workload metadata. - Pattern: `llmb-run submit --max-scale ` - Example: `llmb-run submit --max-scale 512` (submits eligible installed workloads up to 512 GPUs; see the section below for additional limiting flags) -- **File-based (batch; special cases)**: You provide an input file and llmb-run submits the jobs listed in it. - - Pattern: `llmb-run submit -f ` -#### 1. Single Job Submission (Explicit) +#### 1. Single Explicit Target -Submit a single workload with specific parameters. +Submit one workload/model-size target with explicit parameters. ```bash -llmb-run submit -w -s --dtype --scale +llmb-run submit -w -s --dtype --scale ``` **Required Flags:** -- `-w, --workload`: Name of the workload (e.g., `pretrain_llama3.1`) -- `-s, --model_size`: Model size (e.g., `405b`, `70b`). +- `-w, --workload`: Name of the workload (e.g., `pretrain_llama3.1`). +- `-s, --model-size`: Model size (e.g., `405b`, `70b`). - `--dtype`: Data type (e.g., `fp8`, `bf16`). - `--scale`: Number of GPUs. Accepts a single value or a comma-separated list. +Use this form when you are running one workload/model-size target. It keeps workload and model size separate, which is usually easiest to read for a first run. + **Examples:** ```bash @@ -171,9 +178,33 @@ llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 128,256,512 llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 16 --proxy ``` -#### 2. File-Based Submission (Batch) +#### 2. Target-List Selection + +Use a comma-separated `-w` list when you want to apply the same dtype and scale choices across multiple workload targets. 
Use `--scale` for exact scales, or `--min-scale` / `--max-scale` for scale discovery. In this form, `-s` is not used because it is a single global model-size flag. -Submit multiple jobs defined in a file. This replaces the old `bulk` command. +Each `-w` entry can be either: + +- `_` to select one model size for a multi-model-size workload. +- `` to select all installed model sizes for that workload. For workloads with only one model size, the suffix is optional. + +Normal dtype and scale validation still applies, so llmb-run only generates supported combinations for the selected targets. + +**Examples:** + +```bash +# Run one Llama 3.1 model size plus the single-model-size Nemotron-H workload +llmb-run submit -w pretrain_llama3.1_70b,pretrain_nemotron-h --dtype fp8 --scale 128 + +# Run all installed Llama 3.1 model sizes plus Nemotron-H +llmb-run submit -w pretrain_llama3.1,pretrain_nemotron-h --dtype fp8 --min-scale + +# Run target-list selection with scale discovery up to 512 GPUs +llmb-run submit -w pretrain_llama3.1_70b,pretrain_nemotron-h --dtype fp8 --max-scale 512 +``` + +#### 3. File-Based Submission (Batch) + +Submit multiple jobs defined in a file. ```bash llmb-run submit -f @@ -192,9 +223,9 @@ See [Bulk_Examples.md](Bulk_Examples.md) for detailed file format specifications llmb-run submit -f my_experiment.yaml ``` -#### 3. Auto-Discovery (Submit All) +#### 4. Auto-Discovery (Submit All) -Automatically discover and submit jobs for installed workloads based on metadata. This replaces the old `submit-all` command. +Automatically discover and submit jobs for installed workloads based on metadata. ```bash llmb-run submit --max-scale @@ -204,18 +235,20 @@ llmb-run submit --max-scale - `--max-scale`: Run all workloads up to this scale. - `--min-scale`: Run only the minimum supported scale for each workload. -- `--exact-scales`: Only use scales explicitly listed in workload metadata (no power-of-2 expansion beyond metadata max). 
-- `-w, --workload`: Limit discovery to specific workloads (comma-separated). +- `--exact-scales`: Only use scales explicitly listed in workload metadata. Use this with `--max-scale` when you want all officially listed scales up to a limit, without adding larger power-of-2 scales beyond a workload's metadata. +- `-w, --workload`: Limit discovery to specific base workloads or workload-size targets (comma-separated). - `--scale`: specific scales to run (comma-separated). - `--proxy`: In auto-discovery, only workloads with `proxy_scales` defined are included. +By default, `--max-scale` may extend a workload beyond its largest metadata-listed scale by adding power-of-2 scales up to the requested maximum. `--exact-scales` disables that expansion. For example, if a Llama 3.1 8B target lists scales up to 128 GPUs in metadata, `--max-scale 512 --exact-scales` will not add 256- or 512-GPU jobs for that target. + **Examples:** ```bash # Run all installed workloads up to 512 GPUs llmb-run submit --max-scale 512 -# Run up to 512 GPUs but only at metadata-supported scales (avoid scale expansion) +# Run all installed workloads up to 512 GPUs, but only at metadata-listed scales llmb-run submit --max-scale 512 --exact-scales # Run specific scales for all workloads @@ -224,12 +257,53 @@ llmb-run submit --scale 128,256 #### Submit Options (All Submit Modes) -These flags apply to all `llmb-run submit` modes (explicit, file-based, and auto-discovery): +These flags apply to all `llmb-run submit` modes (single explicit target, target-list, file-based, and auto-discovery): - `-r, --repeats `: Repeat each job N times (default: 1). - `-p, --profile`: Enable profiling for all submitted jobs. +- `--dump-env`: For Megatron-Bridge workloads, write a rank-0 environment snapshot to a separate job log file, similar to running `env` at job start. Common secret-like keys are redacted. This is ignored for other workloads. - `--proxy`: Use proxy scales. 
- `--dry-run`: Print the jobs that would be submitted without running them. +- `--force`: Bypass dtype/scale validation for one explicit task. Use only when you intentionally need to run a configuration outside workload metadata. + +#### Slurm Options + +These flags control Slurm submission parameters and apply to all `llmb-run submit` modes: + +- `--nice `: Lower the job priority via Slurm nice. +- `--nodelist `: Restrict the job to a specific node list. +- `--exclude `: Exclude specific nodes from the job. +- `--reservation `: Submit the job under a Slurm reservation. +- `--segment `: Set the Slurm segment size for the job. +- `--env `: Repeatable environment override for the submitted job. Use this when you need a variable treated as an explicit launcher/container override. +- `--slurm-arg `: Pass an arbitrary Slurm parameter. Repeatable. Accepts `key=value` pairs or bare flags (e.g., `exclusive`). Do **not** include a leading `--`. + +**Examples:** + +```bash +# Lower job priority +llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 256 --nice 100 + +# Pin to specific nodes +llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 256 --nodelist node[001-032] + +# Combine multiple Slurm options +llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 256 \ + --reservation my-reservation --exclude node099 + +# Pass explicit job environment overrides +llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 256 \ + --env NCCL_DEBUG=INFO --env OTHER_VAR=test + +# Pass arbitrary Slurm flags +llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 256 \ + --slurm-arg constraint=gpu --slurm-arg exclusive +``` + +**Notes:** + +- Parameters that have a dedicated flag (`nodelist`, `exclude`, `reservation`, `segment`, `nice`) must use that flag and cannot be passed via `--slurm-arg`. 
+- Slurm CLI flags **cannot** be combined with the `ADDITIONAL_SLURM_PARAMS` environment variable (set in the process environment, cluster config, workload config, or task overrides). If both are present, the command will fail with an error. ### List Command @@ -259,6 +333,38 @@ llmb-run list llmb-run list -w pretrain_llama3.1 ``` +### Jobs Command + +The jobs command shows the local job history for the current `$LLMB_INSTALL`. History is stored in `$LLMB_INSTALL/.llmb/jobs.sqlite3`. + +#### Basic Usage + +```bash +llmb-run jobs +llmb-run jobs list +``` + +Both commands refresh non-terminal Slurm jobs before printing the table. + +The jobs table shows workload, dtype, scale, job ID, Slurm status, elapsed time, and available performance results (`s/iter` and `TFLOPS/GPU`). Result columns are populated from supported NeMo 2 and Megatron-Bridge workload logs after a job reaches a terminal Slurm state. A failed or cancelled job can still show results if the log contains enough data. + +#### Other Commands + +- `llmb-run jobs show `: Show details for one job, including its log directory and any parsed results. +- `llmb-run jobs log `: Show the active log for one job. +- `llmb-run jobs log --follow`: Follow the active log. +- `llmb-run jobs log --path`: Print the active log file path. +- `llmb-run jobs log --dir`: Print the job log directory. +- `llmb-run jobs log --list`: List matching retry log files for the job. +- `llmb-run jobs refresh [job_id ...]`: Re-check specific jobs with Slurm and update any results shown in the jobs table. +- `llmb-run jobs rebuild`: Rebuild history by scanning `$LLMB_INSTALL/workloads/**/llmb-config_*.yaml`. + +Use `llmb-run jobs rebuild` once if you want to populate history from jobs that were submitted before the local history database existed. New jobs submitted with `llmb-run submit` are recorded automatically. + +`llmb-run jobs log` supports NeMo/Megatron-Bridge workload logs and managed `configured_sbatch` experiment directories. 
Legacy `sbatch` jobs are tracked when submitted by `llmb-run`, but log paths cannot be resolved reliably and are skipped by `jobs rebuild`. + +A `PURGED` status means `sacct` no longer has a record for the job (typically due to cluster accounting retention). It does not mean the job failed — it means llmb-run has lost track of it. + ### Exemplar Command (Cloud Certification) The exemplar command runs the cloud certification workload suite. @@ -272,7 +378,7 @@ llmb-run exemplar #### Options - `--dry-run`: Preview all jobs without submitting -- `-r, --repeats INTEGER`: Number of times to run each job (must be >= 1, default: 3). +- `-r, --repeats INTEGER`: Number of times to run each job (must be >= 1). If omitted, uses `exemplar.yaml` `config.repeats` with an llmb-run fallback of 1. - Profiling: Controlled by exemplar suite config (no CLI profiling flag). #### Behavior @@ -284,7 +390,7 @@ llmb-run exemplar - The per‑dtype configuration explicitly lists `scale: 512` (implicit ranges are not used). - The workload is listed under `workloads.installed` in `cluster_config.yaml`. - Enforces strict validation (install gating): if any workload that meets eligibility is not installed, the command fails. -- Runs 3 repetitions per job by default. When profiling is enabled in exemplar config, the last repeat is profiled and earlier repeats are non-profiled (default: 2 normal + 1 profiled). You can override repeat count for debugging via `-r/--repeats`. +- Repeat count and profiling are controlled by `exemplar.yaml` `config.repeats` and `config.profile`; when those keys are omitted, llmb-run falls back to 1 repeat with profiling disabled. When profiling is enabled, the last repeat is profiled and earlier repeats are non-profiled. You can override repeat count for debugging via `-r/--repeats`. 
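The repeat and profiling fallback described in the Behavior list above can be sketched in Python. This is an illustrative sketch under stated assumptions, not llmb-run's actual code: `resolve_exemplar_run_plan` and its `cli_repeats` parameter are hypothetical names, and the sketch assumes `exemplar.yaml` has already been loaded into a dict.

```python
from typing import Any, Optional


def resolve_exemplar_run_plan(yaml_data: dict[str, Any], cli_repeats: Optional[int] = None) -> list[bool]:
    """Return one profiling flag per repeat, in run order (hypothetical helper)."""
    config = yaml_data.get("config") or {}

    # An explicit -r/--repeats value wins; otherwise use config.repeats, falling back to 1.
    repeats = cli_repeats if cli_repeats is not None else config.get("repeats", 1)
    if repeats < 1:
        raise ValueError("repeats must be >= 1")

    # Profiling is off unless config.profile enables it; there is no CLI profiling flag.
    profile = bool(config.get("profile", False))

    # When profiling is enabled, only the last repeat is profiled.
    return [profile and index == repeats - 1 for index in range(repeats)]


if __name__ == "__main__":
    print(resolve_exemplar_run_plan({}))
    print(resolve_exemplar_run_plan({"config": {"repeats": 3, "profile": True}}))
```

With both keys omitted, the plan is a single non-profiled run. With `repeats: 3` and `profile: true`, the first two runs are non-profiled and the third is profiled.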
#### Troubleshooting Missing Workloads @@ -325,7 +431,7 @@ llmb-run archive --output /shared/results/my-cluster-results.tar.zst #### What's Included -The archive collects experiment data from `$LLMB_INSTALL/workloads/*/experiments/`, including logs and `llmb-config_*.yaml` metadata files. Profiling data (`.nsys-rep` files) is excluded to keep the archive compact — profiles are typically only needed for debugging and can be shared separately if requested. +The archive collects experiment data from `$LLMB_INSTALL/workloads/*/experiments/`, including logs and `llmb-config_*.yaml` metadata files. Profiling data, including Nsight reports and PyTorch profiler traces, is excluded to keep the archive compact — profiles are typically only needed for debugging and can be shared separately if requested. ### Job Configuration Files @@ -387,14 +493,6 @@ job_config: See [example_llmb_config.yaml](example_llmb_config.yaml) for a complete example. -### Deprecated Commands - -The following commands are deprecated and will be removed in a future release. Please migrate to `llmb-run submit`. 
- -- `single`: Replaced by `llmb-run submit` -- `bulk`: Replaced by `llmb-run submit -f ` -- `submit-all`: Replaced by `llmb-run submit` (with discovery flags like `--max-scale`) - ## Troubleshooting ### Common Issues and Solutions diff --git a/cli/llmb-run/example_llmb_config.yaml b/cli/llmb-run/example_llmb_config.yaml index 65699fe..06e4c32 100644 --- a/cli/llmb-run/example_llmb_config.yaml +++ b/cli/llmb-run/example_llmb_config.yaml @@ -25,14 +25,14 @@ job_info: job_id: '3913022' launch_time: '2025-07-01T21:05:18.298517' workload_info: - framework: nemo2 - gsw_version: '25.07' - fw_version: 25.04.01 + framework: megatron_bridge + gsw_version: '26.04' + fw_version: 26.04.00 workload_type: pretrain synthetic_dataset: true model_info: - model_name: nemotron4 - model_size: 15b + model_name: llama3.1 + model_size: 8b dtype: fp8 scale: 8 gpu_type: h100 @@ -45,7 +45,7 @@ cluster_info: slurm_gpu_partition: slurm_partition container_info: images: - - nvcr.io#nvidia/nemo:25.04.01 + - nvcr.io#nvidia/nemo:26.04.00 job_config: profile_enabled: false proxy: false diff --git a/cli/llmb-run/pyproject.toml b/cli/llmb-run/pyproject.toml index b856ed3..13d3606 100644 --- a/cli/llmb-run/pyproject.toml +++ b/cli/llmb-run/pyproject.toml @@ -5,10 +5,11 @@ build-backend = "setuptools.build_meta" [project] name = "llmb-run" -version = "1.10.11" +version = "1.14.4" description = "Lightweight tool for automating large batches of LLM benchmarking workloads" requires-python = ">=3.10" dependencies = [ + "pydantic>=2.0,<3", "pyyaml~=6.0", "rich>=10.0.0,<15", "typer~=0.12", diff --git a/cli/llmb-run/src/llmb_run/archive.py b/cli/llmb-run/src/llmb_run/archive.py index 7b7e709..9138fcc 100644 --- a/cli/llmb-run/src/llmb_run/archive.py +++ b/cli/llmb-run/src/llmb_run/archive.py @@ -43,7 +43,7 @@ _NCCL_WORKLOAD_NAME = "microbenchmark_nccl" _NCCL_TOP_LEVEL_PATTERNS = ("llmb-config_*.yaml", "slurm-*.out") _ARCHIVE_EXCLUDED_PATTERNS = ("*.nsys-rep", "*_trace.json", "*.pt.trace.json", "*.tar.*") 
-_ARCHIVE_EXCLUDED_DIRS = {"code", "checkpoints"} +_ARCHIVE_EXCLUDED_DIRS = {"code", "checkpoints", "nsys_profile", "torch_profile", "pytorch_profile"} @dataclass(frozen=True) diff --git a/cli/llmb-run/src/llmb_run/env_args.py b/cli/llmb-run/src/llmb_run/env_args.py new file mode 100644 index 0000000..a330bc4 --- /dev/null +++ b/cli/llmb-run/src/llmb_run/env_args.py @@ -0,0 +1,127 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: MIT +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+ +"""Parsing and launcher-contract helpers for explicit CLI environment overrides.""" + +from __future__ import annotations + +import re +import shlex +from typing import Iterable, Mapping + +LLMB_CONTAINER_ENV = 'LLMB_CONTAINER_ENV' +NEMO_ENV_OVERRIDE_VAR = 'CONFIG_OVERRIDES' + +_ENV_KEY_RE = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$') + + +def validate_env_key(key: object, *, source: str = 'env') -> str: + """Validate and return a shell-style environment variable name.""" + if not isinstance(key, str): + raise ValueError(f"{source} variable name '{key}' is invalid. Use shell-style environment variable names only.") + key = key.strip() + if not key: + raise ValueError(f"{source} must include a non-empty variable name.") + if not _ENV_KEY_RE.match(key): + raise ValueError(f"{source} variable name '{key}' is invalid. Use shell-style environment variable names only.") + return key + + +def validate_shell_safe_env_value(key: str, value: str) -> None: + """Reject env values that would be corrupted by downstream shell quoting. + + Shell-special characters get mangled when the nemo launcher wraps the job + command in `bash -c '...'` (Megatron-Bridge setup_experiment.py forwards + `sys.argv` wholesale into a single-quoted wrapper, so any inner single + quotes from shlex-quoted values collide with the outer ones and split the + command into bad positional args). Reject at the CLI/YAML boundary until + that upstream fix lands. + """ + if value and shlex.quote(value) != value: + raise ValueError( + f"Env value for '{key}' contains shell-special characters, which are not " + f"supported. Use only characters matching [A-Za-z0-9_@%+=:,./-] until the " + f"upstream Megatron-Bridge launcher fix lands." 
+ ) + + +def parse_cli_env_args(raw_env_args: Iterable[str] | None) -> dict[str, str]: + """Parse repeatable `--env KEY=value` CLI values into an insertion-ordered dict.""" + parsed: dict[str, str] = {} + + for raw_arg in raw_env_args or (): + if '=' not in raw_arg: + raise ValueError("`--env` must be in `KEY=value` form.") + + key, value = raw_arg.split('=', 1) + key = validate_env_key(key, source='`--env`') + if key in parsed: + raise ValueError(f"Duplicate environment variable '{key}' was specified more than once.") + validate_shell_safe_env_value(key, value) + + parsed[key] = value + + return parsed + + +def build_nemo_env_override_flags(overrides: Mapping[str, str]) -> str: + """Render explicit env overrides as repeatable `-E KEY=value` flags. + + Each `KEY=value` token is shell-quoted so values containing whitespace or + glob characters survive word-splitting when launch scripts expand the + resulting variable unquoted (e.g. `launcher ${CONFIG_OVERRIDES}`). Wrapping + the expansion in double quotes downstream would defeat this. 
+ """ + return ' '.join(f"-E {shlex.quote(f'{key}={value}')}" for key, value in overrides.items()) + + +def apply_sbatch_explicit_env_contract(env: dict[str, str], overrides: Mapping[str, str]) -> None: + """Expose explicit env keys to sbatch-style launch scripts.""" + if not overrides: + return + + existing_keys = [key for key in str(env.get(LLMB_CONTAINER_ENV, '')).split(',') if key] + env[LLMB_CONTAINER_ENV] = ','.join(dict.fromkeys([*existing_keys, *overrides])) + + +def apply_nemo_explicit_env_contract(env: dict[str, str], overrides: Mapping[str, str]) -> None: + """Expose explicit env overrides to Nemo launch scripts via repeatable `-E` flags.""" + if not overrides: + return + + override_flags = build_nemo_env_override_flags(overrides) + existing = str(env.get(NEMO_ENV_OVERRIDE_VAR, '')).strip() + env[NEMO_ENV_OVERRIDE_VAR] = f"{existing} {override_flags}".strip() if existing else override_flags + + +def build_nemo_workload_args(args: Iterable[str]) -> str: + """Render raw workload argv tokens for Nemo-style launch scripts.""" + return ' '.join(args) + + +def apply_nemo_workload_args_contract(env: dict[str, str], args: Iterable[str]) -> None: + """Expose extra workload argv tokens to Nemo-style launch scripts via CONFIG_OVERRIDES.""" + rendered_args = build_nemo_workload_args(args) + if not rendered_args: + return + + existing = str(env.get(NEMO_ENV_OVERRIDE_VAR, '')).strip() + env[NEMO_ENV_OVERRIDE_VAR] = f"{existing} {rendered_args}".strip() if existing else rendered_args diff --git a/cli/llmb-run/src/llmb_run/exemplar.py b/cli/llmb-run/src/llmb_run/exemplar.py index 874b0d3..14c98ab 100644 --- a/cli/llmb-run/src/llmb_run/exemplar.py +++ b/cli/llmb-run/src/llmb_run/exemplar.py @@ -26,14 +26,13 @@ """ import logging -import re from pathlib import Path from typing import Any, Dict, List, Optional, Tuple import yaml from llmb_run.config_manager import ClusterConfig -from llmb_run.metadata_utils import normalize_model_dtype_config, parse_workload_name +from 
llmb_run.metadata_utils import model_size_to_billions, normalize_model_dtype_config, parse_workload_name from llmb_run.task_generation import ValidationError from llmb_run.tasks import WorkloadTask @@ -180,7 +179,7 @@ def parse_exemplar_workload_name(workload_name: str) -> Tuple[str, str]: """Parse workload name into workload_key and model_size. The model_size is the suffix after the last underscore and must match the pattern: - - \\d+(\\.\\d+)?b (e.g., '7b', '70b', '3.5b') + - \\d+(\\.\\d+)?[bt] (e.g., '7b', '70b', '3.5b', '1t') Args: workload_name: Full workload name (e.g., 'pretrain_llama3.1_70b') @@ -196,7 +195,7 @@ def parse_exemplar_workload_name(workload_name: str) -> Tuple[str, str]: if model_size is None: raise ValidationError( f"Invalid workload name '{workload_name}': must contain at least one underscore " - f"with model size suffix (e.g., 'pretrain_llama3.1_70b')" + f"with model size suffix (e.g., 'pretrain_llama3.1_70b' or 'pretrain_kimi-k2_1t')" ) return workload_key, model_size @@ -241,24 +240,22 @@ def get_exemplar_configs_from_yaml(yaml_data: Dict[str, Any], gpu_type: str) -> def _extract_numeric_model_size(model_size: str) -> float: - """Extract numeric value from model size string for sorting. + """Extract model size in billions for sorting. 
Examples: "7b" -> 7.0 "70b" -> 70.0 "340b" -> 340.0 "405b" -> 405.0 + "1t" -> 1000.0 Args: - model_size: Model size string (e.g., "7b", "70b", "340b") + model_size: Model size string (e.g., "7b", "70b", "340b", "1t") Returns: Numeric value for sorting (defaults to 0.0 if no match) """ - match = re.match(r'^(\d+(?:\.\d+)?)', model_size) - if match: - return float(match.group(1)) - return 0.0 + return model_size_to_billions(model_size) def validate_yaml_config_against_metadata( @@ -459,7 +456,7 @@ def compute_and_validate_eligible_configs( Tuple of (eligible configs list, scale from YAML, repeats from YAML, profile from YAML) - eligible configs: List of (workload_key, model_size, dtype) tuples - scale: Scale value from YAML config (defaults to 512) - - repeats: Number of repeats from YAML config (defaults to 3) + - repeats: Number of repeats from YAML config (defaults to 1) - profile: If True, include exactly one profiled run per config (last repeat). If False, no profiling. Raises: @@ -476,8 +473,8 @@ def compute_and_validate_eligible_configs( # Extract config values from YAML config = yaml_data.get('config', {}) scale = config.get('scale', 512) - repeats = config.get('repeats', 3) - profile = config.get('profile', True) + repeats = config.get('repeats', 1) + profile = config.get('profile', False) # Get eligible configs from YAML and validate against metadata eligible_configs = get_eligible_exemplar_configs_from_yaml(yaml_data, gpu_type, workloads) @@ -508,7 +505,7 @@ def generate_exemplar_tasks( Args: workloads: Dictionary of all workloads from get_workloads() cluster_config: Cluster configuration dictionary - repeats: Number of repeats per configuration. If None, uses value from YAML config.repeats (default: 3) + repeats: Number of repeats per configuration. 
If None, uses value from YAML config.repeats (default: 1) Returns: List of WorkloadTask objects, deterministically ordered: diff --git a/cli/llmb-run/src/llmb_run/job_history.py b/cli/llmb-run/src/llmb_run/job_history.py new file mode 100644 index 0000000..229b70a --- /dev/null +++ b/cli/llmb-run/src/llmb_run/job_history.py @@ -0,0 +1,781 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: MIT +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# DEALINGS IN THE SOFTWARE. 
+ +"""Persistent llmb-run job history backed by SQLite.""" + +from __future__ import annotations + +import contextlib +import datetime +import json +import logging +import pathlib +import sqlite3 +import sys +from collections.abc import Iterator +from dataclasses import dataclass +from typing import Any + +import yaml +from rich import box +from rich.console import Console +from rich.table import Table + +from llmb_run.config_manager import ClusterConfig +from llmb_run.pretrain_log_parser import ( + PretrainLogParseResult, + PretrainLogParseStatus, + parse_latest_pretrain_job_log, + parser_name_for_framework, +) +from llmb_run.slurm_utils import SlurmAccountingRecord, SlurmJob, get_slurm_job_statuses, parse_slurm_job_id +from llmb_run.tasks import WorkloadTask + +logger = logging.getLogger('llmb_run.job_history') + +DB_SCHEMA_VERSION = 1 +HISTORY_DIR_NAME = ".llmb" +HISTORY_DB_NAME = "jobs.sqlite3" +# sacct accounting can lag behind sbatch by several seconds; don't mark a job +# PURGED if it was created within this window — sacct just hasn't seen it yet. 
+PURGE_GRACE_SECONDS = 300 +TERMINAL_STATES = { + "BOOT_FAIL", + "CANCELLED", + "COMPLETED", + "DEADLINE", + "FAILED", + "NODE_FAIL", + "OUT_OF_MEMORY", + "PURGED", + "TIMEOUT", +} + + +@dataclass(frozen=True) +class JobRecord: + job_id: int + launcher_type: str + workload_key: str | None = None + model_name: str | None = None + model_size: str | None = None + dtype: str | None = None + scale: int | None = None + profile_enabled: bool = False + proxy: bool = False + log_dir: str | None = None + llmb_config_path: str | None = None + submit_time: str | None = None + env_overrides_json: str | None = None + model_overrides_json: str | None = None + + +@dataclass(frozen=True) +class RebuildStats: + scanned: int + imported: int + skipped: int + db_path: pathlib.Path + refresh_error: str | None = None + + +def get_history_db_path(llmb_install: str | pathlib.Path) -> pathlib.Path: + return pathlib.Path(llmb_install) / HISTORY_DIR_NAME / HISTORY_DB_NAME + + +def base_slurm_state(state: str | None) -> str: + if not state: + return "" + return state.strip().split()[0].upper() + + +def is_terminal_state(state: str | None) -> bool: + return base_slurm_state(state) in TERMINAL_STATES + + +@contextlib.contextmanager +def _open_history_db(config: ClusterConfig) -> Iterator[sqlite3.Connection]: + """Open the history DB, creating the parent dir and ensuring the schema.""" + db_path = get_history_db_path(config.llmb_install) + db_path.parent.mkdir(parents=True, exist_ok=True) + with _connect(db_path) as conn: + _initialize_schema(conn) + yield conn + + +def record_job_submission( + config: ClusterConfig, task: WorkloadTask, slurm_job: SlurmJob, workloads: dict[str, Any] +) -> None: + """Best-effort record of a successfully submitted primary workload job. + + Uses wall-clock time for submit_time; sacct will overwrite it with the + canonical Slurm-side value on the next refresh. Logs a warning on any + failure; never raises. 
+ """ + if not slurm_job.job_id: + return + + try: + job_id = parse_slurm_job_id(slurm_job.job_id) + + llmb_config_path = slurm_job.llmb_config_path + log_dir = slurm_job.job_workdir or (str(pathlib.Path(llmb_config_path).parent) if llmb_config_path else None) + workload_info = workloads.get(task.workload_key, {}) + metadata = workload_info.get('metadata', {}) + launcher_type = metadata.get('run', {}).get('launcher_type') + if not launcher_type: + logger.warning(f"Unable to record job history: missing launcher type for workload {task.workload_key}") + return + + model_name = metadata.get('general', {}).get('model', workload_info.get('workload', '')) + if launcher_type == 'sbatch': + log_dir = None + + record = JobRecord( + job_id=job_id, + launcher_type=launcher_type, + workload_key=task.workload_key, + model_name=model_name, + model_size=task.model_size, + dtype=task.dtype, + scale=task.scale, + profile_enabled=task.profile, + proxy=task.proxy, + log_dir=log_dir, + llmb_config_path=llmb_config_path, + submit_time=_now_iso(), + env_overrides_json=_json_dumps(task.env_overrides), + model_overrides_json=_json_dumps(task.model_overrides), + ) + + upsert_static_job(config, record) + except Exception as e: + logger.warning(f"Unable to update llmb-run job history: {e}") + + +def upsert_static_job(config: ClusterConfig, record: JobRecord) -> None: + now = _now_iso() + + with _open_history_db(config) as conn: + # submit_time is INSERT-only so a follow-up rebuild can't clobber the + # canonical sacct value written by _update_slurm_record. + conn.execute( + """ + INSERT INTO jobs ( + job_id, + launcher_type, + workload_key, + model_name, + model_size, + dtype, + scale, + profile_enabled, + proxy, + log_dir, + llmb_config_path, + submit_time, + created_at, + updated_at, + env_overrides_json, + model_overrides_json + ) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) 
+ ON CONFLICT(job_id) DO UPDATE SET + launcher_type = excluded.launcher_type, + workload_key = excluded.workload_key, + model_name = excluded.model_name, + model_size = excluded.model_size, + dtype = excluded.dtype, + scale = excluded.scale, + profile_enabled = excluded.profile_enabled, + proxy = excluded.proxy, + log_dir = excluded.log_dir, + llmb_config_path = excluded.llmb_config_path, + updated_at = excluded.updated_at, + env_overrides_json = excluded.env_overrides_json, + model_overrides_json = excluded.model_overrides_json + """, + ( + record.job_id, + record.launcher_type, + record.workload_key, + record.model_name, + record.model_size, + record.dtype, + record.scale, + int(record.profile_enabled), + int(record.proxy), + record.log_dir, + record.llmb_config_path, + record.submit_time, + now, + now, + record.env_overrides_json, + record.model_overrides_json, + ), + ) + conn.commit() + + +def list_jobs(config: ClusterConfig) -> list[sqlite3.Row]: + with _open_history_db(config) as conn: + return list(conn.execute(""" + SELECT * + FROM jobs + ORDER BY + LOWER(COALESCE(NULLIF(workload_key, ''), NULLIF(model_name, ''), '')) ASC, + CAST(REPLACE(LOWER(COALESCE(NULLIF(model_size, ''), '0')), 'b', '') AS REAL) DESC, + LOWER(COALESCE(NULLIF(model_size, ''), '')) ASC, + LOWER(COALESCE(NULLIF(dtype, ''), '')) ASC, + scale ASC, + profile_enabled ASC, + COALESCE(NULLIF(submit_time, ''), created_at) DESC, + job_id DESC + """)) + + +def get_job(config: ClusterConfig, job_id: int) -> sqlite3.Row | None: + with _open_history_db(config) as conn: + return conn.execute("SELECT * FROM jobs WHERE job_id = ?", (job_id,)).fetchone() + + +def refresh_non_terminal_jobs(config: ClusterConfig) -> tuple[int, str | None]: + """Refresh sacct status for all non-terminal jobs. + + Returns (refreshed_count, error_message). error_message is None on success; + a short string when sacct itself failed (timeout, missing binary, etc.). 
+ """ + with _open_history_db(config) as conn: + rows = list(conn.execute("SELECT job_id, slurm_state FROM jobs ORDER BY job_id")) + job_ids = [int(row["job_id"]) for row in rows if not is_terminal_state(row["slurm_state"])] + + refreshed, error = _refresh_slurm_statuses(config, job_ids) + if error is None: + _update_terminal_job_results(config) + return refreshed, error + + +def refresh_requested_jobs(config: ClusterConfig, job_ids: list[int]) -> tuple[int, str | None]: + """Force-refresh the requested jobs and update terminal results.""" + refreshed, error = _refresh_slurm_statuses(config, job_ids) + if error is None: + _update_terminal_job_results(config, job_ids=job_ids, reparse_existing=True) + return refreshed, error + + +def _refresh_slurm_statuses(config: ClusterConfig, job_ids: list[int]) -> tuple[int, str | None]: + """Refresh sacct status for the given jobs. + + Jobs in ``job_ids`` that sacct succeeds on but does not return are marked + ``slurm_state = "PURGED"`` (sacct accounting has dropped them). Returns + (refreshed_count, error_message); error_message is None on success. + """ + if not job_ids: + return 0, None + + refreshed = 0 + for chunk in _chunks(sorted({int(job_id) for job_id in job_ids}), 200): + records = get_slurm_job_statuses(chunk) + if records is None: + return refreshed, "sacct query failed" + + with _open_history_db(config) as conn: + for job_id in chunk: + record = records.get(job_id) + if record is not None: + _update_slurm_record(conn, record) + refreshed += 1 + elif _mark_job_purged(conn, job_id): + refreshed += 1 + # else: still inside PURGE_GRACE_SECONDS — leave for the next refresh. 
+ conn.commit() + + return refreshed, None + + +def rebuild_history(config: ClusterConfig, workloads: dict[str, Any]) -> RebuildStats: + db_path = get_history_db_path(config.llmb_install) + workloads_root = pathlib.Path(config.llmb_install) / "workloads" + scanned = 0 + imported = 0 + skipped = 0 + imported_job_ids: list[int] = [] + + if not workloads_root.exists(): + return RebuildStats(scanned=0, imported=0, skipped=0, db_path=db_path) + + for config_path in sorted(workloads_root.glob("**/llmb-config_*.yaml")): + scanned += 1 + record = job_record_from_config(config.llmb_install, config_path, workloads) + if record is None: + skipped += 1 + continue + + upsert_static_job(config, record) + imported += 1 + imported_job_ids.append(record.job_id) + + _, refresh_error = _refresh_slurm_statuses(config, imported_job_ids) + if refresh_error is None: + _update_terminal_job_results(config) + return RebuildStats( + scanned=scanned, imported=imported, skipped=skipped, db_path=db_path, refresh_error=refresh_error + ) + + +def job_record_from_config( + llmb_install: str | pathlib.Path, config_path: pathlib.Path, workloads: dict[str, Any] +) -> JobRecord | None: + config_data = _load_llmb_config(config_path) + if not config_data: + return None + + job_info = config_data.get('job_info') or {} + model_info = config_data.get('model_info') or {} + job_config = config_data.get('job_config') or {} + + raw_job_id = job_info.get('job_id') or _job_id_from_config_filename(config_path) + try: + job_id = parse_slurm_job_id(raw_job_id) + except ValueError: + logger.debug(f"Skipping llmb config without a parseable job id: {config_path}") + return None + + workloads_root = pathlib.Path(llmb_install) / "workloads" + workload_key = None + try: + relative = config_path.relative_to(workloads_root) + if relative.parts: + workload_key = relative.parts[0] + except ValueError: + pass + + launcher_type = _launcher_type_for_workload(workloads, workload_key) + if not launcher_type: + 
logger.debug(f"Skipping llmb config without a known launcher type: {config_path}") + return None + if launcher_type == 'sbatch': + logger.debug(f"Skipping legacy sbatch llmb config during rebuild: {config_path}") + return None + + # Older llmb-config files call submit_time launch_time, but it is recorded + # immediately after Slurm submission so the values are interchangeable. + submit_time = job_info.get('submit_time') or job_info.get('launch_time') + + return JobRecord( + job_id=job_id, + launcher_type=launcher_type, + workload_key=workload_key, + model_name=model_info.get('model_name'), + model_size=model_info.get('model_size'), + dtype=model_info.get('dtype'), + scale=_as_int(model_info.get('scale')), + profile_enabled=bool(job_config.get('profile_enabled')), + proxy=bool(job_config.get('proxy')), + log_dir=str(config_path.parent), + llmb_config_path=str(config_path), + submit_time=submit_time, + env_overrides_json=_json_dumps(job_config.get('env_overrides') or {}), + model_overrides_json=_json_dumps(job_config.get('model_overrides') or {}), + ) + + +# States omitted from this map render in the terminal's default color. +# PENDING intentionally stays unstyled so RUNNING is easier to spot. +_SLURM_STATE_STYLES = { + "COMPLETED": "green", + "RUNNING": "cyan", + "REQUEUED": "yellow", + "PREEMPTED": "yellow", + "CANCELLED": "yellow", + "PURGED": "dim", + "FAILED": "red", + "BOOT_FAIL": "red", + "NODE_FAIL": "red", + "OUT_OF_MEMORY": "red", + "DEADLINE": "red", + "TIMEOUT": "red", +} + + +def format_jobs_table(rows: list[sqlite3.Row]) -> str: + if not rows: + return "No jobs found. Run `llmb-run jobs rebuild` to scan existing llmb-config files." + + # Color when stdout is a terminal; plain text when piped or under tests. 
+ use_color = sys.stdout.isatty() + + table = Table(box=box.SIMPLE_HEAD, header_style="bold", show_edge=False, pad_edge=False) + table.add_column("Workload", overflow="fold") + table.add_column("DType") + table.add_column("Scale", justify="right") + table.add_column("Job ID", justify="right") + table.add_column("Profile") + table.add_column("Submit Time") + table.add_column("Slurm Status") + table.add_column("Elapsed") + table.add_column("s/iter", justify="right") + table.add_column("TFLOPS/GPU", justify="right") + + for row in rows: + slurm_state = row["slurm_state"] or "" + style = _SLURM_STATE_STYLES.get(base_slurm_state(slurm_state)) if use_color else None + display_state = _display_slurm_state(slurm_state) + styled_state = f"[{style}]{display_state}[/{style}]" if style and display_state else display_state + + table.add_row( + _display_workload(row), + row["dtype"] or "", + str(row["scale"]) if row["scale"] is not None else "", + str(row["job_id"]), + _yes_no(row["profile_enabled"]), + _format_timestamp(row["submit_time"]), + styled_state, + row["elapsed"] or "", + _format_perf_metric(row, "train_step_time_seconds"), + _format_perf_metric(row, "tflops_per_gpu"), + ) + + console = Console( + width=160, + force_terminal=use_color, + color_system="truecolor" if use_color else None, + ) + with console.capture() as capture: + console.print(table) + return capture.get().rstrip() + + +def format_job_details(row: sqlite3.Row) -> str: + details = [ + ("Job ID", str(row["job_id"])), + ("Launcher", row["launcher_type"]), + ("Workload", row["workload_key"]), + ("Model", _display_model(row)), + ("DType", row["dtype"]), + ("Scale", str(row["scale"]) if row["scale"] is not None else ""), + ("Profile", _yes_no(row["profile_enabled"])), + ("Proxy", _yes_no(row["proxy"])), + ("Status", row["slurm_state"]), + ("Elapsed", row["elapsed"]), + ("Perf Parse", _display_perf_parse_status(row["perf_parse_status"])), + ("s/iter", _format_float(row["train_step_time_seconds"])), + 
("TFLOPS/GPU", _format_float(row["tflops_per_gpu"])), + ("Submit Time", _format_timestamp(row["submit_time"])), + ("Node List", row["node_list"]), + ("Exit Code", row["exit_code"]), + ("Log Dir", row["log_dir"]), + ] + width = max(len(label) for label, _ in details) + return "\n".join(f"{label.ljust(width)} : {value or ''}" for label, value in details) + + +def _connect(db_path: pathlib.Path) -> sqlite3.Connection: + conn = sqlite3.connect(db_path) + conn.row_factory = sqlite3.Row + conn.execute("PRAGMA busy_timeout = 5000") + return conn + + +def _initialize_schema(conn: sqlite3.Connection) -> None: + conn.execute(""" + CREATE TABLE IF NOT EXISTS metadata ( + key TEXT PRIMARY KEY, + value TEXT NOT NULL + ) + """) + stored = conn.execute("SELECT value FROM metadata WHERE key = 'schema_version'").fetchone() + if stored is not None: + try: + stored_version = int(stored[0]) + except (TypeError, ValueError) as e: + raise RuntimeError(f"Job history DB has an unparseable schema_version {stored[0]!r}.") from e + if stored_version > DB_SCHEMA_VERSION: + raise RuntimeError( + f"Job history DB schema_version {stored_version} is newer than this llmb-run " + f"build supports ({DB_SCHEMA_VERSION}). Upgrade llmb-run." + ) + conn.execute(""" + CREATE TABLE IF NOT EXISTS jobs ( + job_id INTEGER PRIMARY KEY, + launcher_type TEXT NOT NULL, + workload_key TEXT, + model_name TEXT, + model_size TEXT, + dtype TEXT, + scale INTEGER, + profile_enabled INTEGER NOT NULL DEFAULT 0, + proxy INTEGER NOT NULL DEFAULT 0, + log_dir TEXT, + llmb_config_path TEXT, + created_at TEXT, + updated_at TEXT, + slurm_state TEXT, + elapsed TEXT, + submit_time TEXT, + node_list TEXT, + exit_code TEXT, + last_status_refresh TEXT, + env_overrides_json TEXT, + model_overrides_json TEXT, + train_step_time_seconds REAL, + tflops_per_gpu REAL, + perf_parse_status TEXT + ) + """) + conn.execute( + """ + INSERT INTO metadata (key, value) + VALUES ('schema_version', ?) 
+ ON CONFLICT(key) DO UPDATE SET value = excluded.value + """, + (str(DB_SCHEMA_VERSION),), + ) + + +def _update_slurm_record(conn: sqlite3.Connection, record: SlurmAccountingRecord) -> None: + now = _now_iso() + conn.execute( + """ + UPDATE jobs + SET + slurm_state = ?, + elapsed = ?, + submit_time = ?, + node_list = ?, + exit_code = ?, + last_status_refresh = ?, + updated_at = ? + WHERE job_id = ? + """, + ( + record.state, + record.elapsed, + record.submit_time, + record.node_list, + record.exit_code, + now, + now, + record.job_id, + ), + ) + + +def _mark_job_purged(conn: sqlite3.Connection, job_id: int) -> bool: + """Flag a job as PURGED when sacct reports nothing for it. + + Skips jobs created within ``PURGE_GRACE_SECONDS`` so a freshly submitted + job that hasn't propagated to sacct yet isn't mislabeled. + Returns True when the row was updated; False when the grace period blocked it. + """ + now_dt = datetime.datetime.now(datetime.timezone.utc).replace(microsecond=0) + cutoff = (now_dt - datetime.timedelta(seconds=PURGE_GRACE_SECONDS)).isoformat() + now = now_dt.isoformat() + cursor = conn.execute( + """ + UPDATE jobs + SET slurm_state = ?, last_status_refresh = ?, updated_at = ? + WHERE job_id = ? AND (created_at IS NULL OR created_at < ?) + """, + ("PURGED", now, now, job_id, cutoff), + ) + return cursor.rowcount > 0 + + +def _update_terminal_job_results( + config: ClusterConfig, *, job_ids: list[int] | None = None, reparse_existing: bool = False +) -> None: + """Update result columns for terminal jobs, skipping previous attempts by default.""" + with _open_history_db(config) as conn: + filters = [ + "launcher_type IN ('nemo', 'megatron_bridge')", + "log_dir IS NOT NULL", + "llmb_config_path IS NOT NULL", + ] + params: list[Any] = [] + if not reparse_existing: + filters.append("perf_parse_status IS NULL") + if job_ids: + job_ids = sorted({int(job_id) for job_id in job_ids}) + filters.append(f"job_id IN ({','.join('?' 
for _ in job_ids)})") + params.extend(job_ids) + + rows = list( + conn.execute( + f""" + SELECT * + FROM jobs + WHERE {' AND '.join(filters)} + ORDER BY job_id + """, + params, + ) + ) + + for row in rows: + if not is_terminal_state(row["slurm_state"]): + continue + + result = _parse_job_performance(row) + if result is None: + continue + + metrics = result.metrics + now = _now_iso() + conn.execute( + """ + UPDATE jobs + SET + perf_parse_status = ?, + train_step_time_seconds = ?, + tflops_per_gpu = ?, + updated_at = ? + WHERE job_id = ? + """, + ( + result.status.value, + metrics.time_mean_seconds if metrics else None, + metrics.tflops_per_gpu_mean if metrics else None, + now, + row["job_id"], + ), + ) + + conn.commit() + + +def _parse_job_performance(row: sqlite3.Row) -> PretrainLogParseResult | None: + framework = _framework_from_llmb_config(row["llmb_config_path"]) + if parser_name_for_framework(framework) is None: + return None + + try: + return parse_latest_pretrain_job_log(row["log_dir"], int(row["job_id"]), framework) + except OSError as e: + logger.debug(f"Unable to parse perf log for job {row['job_id']}: {e}") + return None + + +def _framework_from_llmb_config(config_path: str | pathlib.Path | None) -> str | None: + if not config_path: + return None + + config_data = _load_llmb_config(pathlib.Path(config_path)) + framework = (config_data.get('workload_info') or {}).get('framework') + return str(framework) if framework else None + + +def _display_model(row: sqlite3.Row) -> str: + model_name = row["model_name"] or row["workload_key"] or "" + model_size = row["model_size"] or "" + if model_name and model_size: + return f"{model_name}_{model_size}" + return model_name or model_size + + +def _display_workload(row: sqlite3.Row) -> str: + workload_key = row["workload_key"] or "" + model_size = row["model_size"] or "" + if workload_key and model_size: + return f"{workload_key}_{model_size}" + return workload_key or _display_model(row) + + +def 
_display_slurm_state(state: str | None) -> str: + if base_slurm_state(state) == "CANCELLED": + return "CANCELLED" + return state or "" + + +def _format_timestamp(value: str | None) -> str: + if not value: + return "" + return value.replace("T", " ")[:16] + + +def _format_float(value: float | None) -> str: + return "" if value is None else f"{value:.2f}" + + +def _format_perf_metric(row: sqlite3.Row, field: str) -> str: + if row["perf_parse_status"] == PretrainLogParseStatus.INVALID_GRAD_NORM.value: + return "Invalid" + return _format_float(row[field]) + + +def _display_perf_parse_status(status: str | None) -> str: + if status == PretrainLogParseStatus.INVALID_GRAD_NORM.value: + return "invalid: grad_norm=nan" + return status or "" + + +def _yes_no(value: object) -> str: + return "Yes" if value else "-" + + +def _load_llmb_config(config_path: pathlib.Path) -> dict[str, Any]: + try: + with config_path.open('r') as f: + data = yaml.safe_load(f) + except (OSError, yaml.YAMLError) as e: + logger.debug(f"Unable to read llmb config {config_path}: {e}") + return {} + + return data if isinstance(data, dict) else {} + + +def _launcher_type_for_workload(workloads: dict[str, Any], workload_key: str | None) -> str | None: + if not workload_key: + return None + workload_info = workloads.get(workload_key) + if not isinstance(workload_info, dict): + return None + metadata = workload_info.get('metadata') + if not isinstance(metadata, dict): + return None + launcher_type = (metadata.get('run') or {}).get('launcher_type') + return str(launcher_type) if launcher_type else None + + +def _job_id_from_config_filename(config_path: pathlib.Path) -> str | None: + name = config_path.name + prefix = "llmb-config_" + suffix = ".yaml" + if name.startswith(prefix) and name.endswith(suffix): + return name[len(prefix) : -len(suffix)] + return None + + +def _json_dumps(value: Any) -> str: + return json.dumps(value or {}, sort_keys=True) + + +def _as_int(value: object) -> int | None: + try: + return 
int(value) + except (TypeError, ValueError): + return None + + +def _chunks(values: list[int], size: int) -> list[list[int]]: + return [values[index : index + size] for index in range(0, len(values), size)] + + +def _now_iso() -> str: + return datetime.datetime.now(datetime.timezone.utc).replace(microsecond=0).isoformat() diff --git a/cli/llmb-run/src/llmb_run/job_launcher.py b/cli/llmb-run/src/llmb_run/job_launcher.py index 69b0c7b..9b9822d 100644 --- a/cli/llmb-run/src/llmb_run/job_launcher.py +++ b/cli/llmb-run/src/llmb_run/job_launcher.py @@ -37,9 +37,20 @@ NVLINK_DOMAIN_SIZE, SLURM_OUTPUT_PATTERN, ) +from llmb_run.env_args import ( + apply_nemo_explicit_env_contract, + apply_nemo_workload_args_contract, + apply_sbatch_explicit_env_contract, +) +from llmb_run.job_history import record_job_submission from llmb_run.nsys_mount_handler import get_tool_mounts from llmb_run.run_config import create_llmb_config -from llmb_run.slurm_utils import SlurmJob, get_cluster_name +from llmb_run.slurm_args import ( + ADDITIONAL_SLURM_PARAMS_KEY, + SlurmArgs, + validate_no_additional_slurm_params_conflict, +) +from llmb_run.slurm_utils import SlurmJob, get_cluster_name, parse_slurm_job_id from llmb_run.tasks import format_task_output logger = logging.getLogger('llmb_run.job_launcher') @@ -284,6 +295,24 @@ def get_gpu_type(self, task): """Determine GPU type for a task from cluster config only.""" return self.config.gpu_type + def resolve_task_slurm_args(self, task, *, workload_config=None) -> SlurmArgs | None: + """Resolve per-task canonical Slurm args in one place.""" + cli_args = task.slurm_args + if not cli_args: + return None + + workload_environment = {} + if workload_config: + workload_environment = workload_config.get('environment', {}) + + validate_no_additional_slurm_params_conflict( + cli_args=cli_args, + cluster_environment=self.config.environment, + workload_environment=workload_environment, + task_environment=task.env_overrides, + ) + return cli_args + class 
SbatchLauncher(JobLauncher): """Launcher for SLURM sbatch jobs. @@ -292,6 +321,7 @@ class SbatchLauncher(JobLauncher): def launch(self, task): """Launch a task using the legacy sbatch method.""" + workload_config = self.config.workload_config(task.workload_key) job = { "workload": task.workload_key, "LLMB_INSTALL": self.config.llmb_install, @@ -317,6 +347,11 @@ def launch(self, task): ntasks_per_node = str(gpus_per_node) if task.scale >= gpus_per_node else str(task.scale) llmb_workload = f"{self.config.llmb_install}/workloads/{job['workload']}" + try: + slurm_args = self.resolve_task_slurm_args(task, workload_config=workload_config) + except ValueError as e: + logger.error(format_task_output(task, prefix="ERROR: ", suffix=str(e))) + return SlurmJob(job_id=None, job_workdir=None) cmd = [ "sbatch", @@ -329,8 +364,8 @@ def launch(self, task): f"--ntasks-per-node={ntasks_per_node}", ] - if task.extra_slurm_params and 'nice' in task.extra_slurm_params: - cmd.append(f"--nice={task.extra_slurm_params['nice']}") + if slurm_args: + cmd.extend(slurm_args.to_sbatch_args()) cmd.append(self.get_launch_script(task)) @@ -367,6 +402,7 @@ def launch(self, task): # Convert all task override values to strings task_env = {k: str(v) for k, v in task.env_overrides.items()} env.update(task_env) + apply_sbatch_explicit_env_contract(env, task.explicit_env_overrides) # Handle model parameter overrides if task.model_overrides: @@ -375,17 +411,20 @@ def launch(self, task): try: logger.debug(f"Command: {cmd}") result = subprocess.run(cmd, capture_output=True, check=True, text=True, env=env, cwd=job['dir']) - job_id = result.stdout.strip() + job_id = parse_slurm_job_id(result.stdout) logger.info(format_task_output(task, prefix="SUBMITTED: ", suffix=f"jobid={job_id}")) # Create llmb-config.yaml file in the current directory - create_llmb_config(task, job_id, None, self.config, self.workloads) + config_path = create_llmb_config(task, str(job_id), None, self.config, self.workloads) # TODO: This 
job directory is not correct. - return SlurmJob(job_id=job_id, job_workdir=None) + return SlurmJob(job_id=job_id, job_workdir=None, llmb_config_path=config_path) except subprocess.CalledProcessError as e: logger.error(format_task_output(task, prefix="ERROR: ", suffix=f"error={e.stderr.strip()}")) return SlurmJob(job_id=None, job_workdir=None) + except ValueError as e: + logger.error(format_task_output(task, prefix="ERROR: ", suffix=str(e))) + return SlurmJob(job_id=None, job_workdir=None) class ConfiguredSbatchLauncher(JobLauncher): @@ -396,6 +435,7 @@ class ConfiguredSbatchLauncher(JobLauncher): def launch(self, task): """Launch a task using sbatch with a pre-created experiment directory.""" + workload_config = self.config.workload_config(task.workload_key) # Working directory for sbatch (where the launch script lives) script_dir = self.workloads[task.workload_key]["dir"] # Base directory for experiments (under $LLMB_INSTALL/workloads/) @@ -430,6 +470,11 @@ def launch(self, task): num_nodes = (task.scale + gpus_per_node - 1) // gpus_per_node ntasks_per_node = str(gpus_per_node) if task.scale >= gpus_per_node else str(task.scale) + try: + slurm_args = self.resolve_task_slurm_args(task, workload_config=workload_config) + except ValueError as e: + logger.error(format_task_output(task, prefix="ERROR: ", suffix=str(e))) + return SlurmJob(job_id=None, job_workdir=None) cmd = [ "sbatch", @@ -442,8 +487,9 @@ def launch(self, task): f"--ntasks-per-node={ntasks_per_node}", ] - if task.extra_slurm_params and 'nice' in task.extra_slurm_params: - cmd.append(f"--nice={task.extra_slurm_params['nice']}") + segment_override = slurm_args.get_named_param('segment') if slurm_args else None + if slurm_args: + cmd.extend(slurm_args.to_sbatch_args()) cmd.append(self.get_launch_script(task)) @@ -472,15 +518,6 @@ def launch(self, task): env['ENABLE_VBOOST'] = 'true' logger.debug("Automatically enabled VBoost for 'eos' cluster") - # Set SBATCH_SEGMENT_SIZE for GB200/GB300 if not already 
set in environment - if ( - gpu_type in {'gb200', 'gb300'} - and 'SBATCH_SEGMENT_SIZE' not in env - and 'SBATCH_SEGMENT_SIZE' not in task.env_overrides - ): - env['SBATCH_SEGMENT_SIZE'] = str(_compute_segment_size(num_nodes)) - logger.debug(f"Set SBATCH_SEGMENT_SIZE={env['SBATCH_SEGMENT_SIZE']} for {gpu_type}") - # Handle environment variables from config if self.config.environment: env_vars = {k: str(v) for k, v in self.config.environment.items()} @@ -489,31 +526,49 @@ def launch(self, task): # Convert all task override values to strings task_env = {k: str(v) for k, v in task.env_overrides.items()} env.update(task_env) + apply_sbatch_explicit_env_contract(env, task.explicit_env_overrides) # Handle model parameter overrides if task.model_overrides: env.update({k.upper(): str(v) for k, v in task.model_overrides.items()}) - # Backward-compat: also pass segment size as a CLI flag for older Slurm - # versions where SBATCH_SEGMENT_SIZE did not exist but --segment did. - if 'SBATCH_SEGMENT_SIZE' in env: - cmd.insert(-1, f"--segment={env['SBATCH_SEGMENT_SIZE']}") + # Resolve segment as a --segment flag. CLI override is already in cmd + # via to_sbatch_args(). Otherwise, check for SBATCH_SEGMENT_SIZE from any + # env source, or auto-detect for GB200/GB300. + # NOTE: Newer Slurm versions support SBATCH_SEGMENT_SIZE as an env var + # natively; if we ever drop support for older versions, we could set the + # env var instead of the flag. + if segment_override is None: + env_segment = env.get('SBATCH_SEGMENT_SIZE') + if env_segment is not None: + cmd.insert(-1, f"--segment={env_segment}") + elif gpu_type in {'gb200', 'gb300'}: + computed = _compute_segment_size(num_nodes) + cmd.insert(-1, f"--segment={computed}") + logger.debug(f"Auto-detected segment size {computed} for {gpu_type}") + + # Remove SBATCH_SEGMENT_SIZE from the subprocess env — we pass --segment + # as a flag instead. 
+ env.pop('SBATCH_SEGMENT_SIZE', None) try: logger.debug(f"Command: {cmd}") result = subprocess.run(cmd, capture_output=True, check=True, text=True, env=env, cwd=script_dir) - job_id = result.stdout.strip() + job_id = parse_slurm_job_id(result.stdout) logger.info( format_task_output(task, prefix="SUBMITTED: ", suffix=f"jobid={job_id} workdir={experiment_dir}") ) # Create llmb-config.yaml file in the experiment directory - create_llmb_config(task, job_id, experiment_dir, self.config, self.workloads) + config_path = create_llmb_config(task, str(job_id), experiment_dir, self.config, self.workloads) - return SlurmJob(job_id=job_id, job_workdir=experiment_dir) + return SlurmJob(job_id=job_id, job_workdir=experiment_dir, llmb_config_path=config_path) except subprocess.CalledProcessError as e: logger.error(format_task_output(task, prefix="ERROR: ", suffix=f"error={e.stderr.strip()}")) return SlurmJob(job_id=None, job_workdir=None) + except ValueError as e: + logger.error(format_task_output(task, prefix="ERROR: ", suffix=str(e))) + return SlurmJob(job_id=None, job_workdir=None) class Nemo2Launcher(JobLauncher): @@ -534,6 +589,15 @@ def launch(self, task): return SlurmJob(job_id=None, job_workdir=None) try: + slurm_args = self.resolve_task_slurm_args(task, workload_config=workload_config) + except ValueError as e: + logger.error(format_task_output(task, prefix="ERROR: ", suffix=str(e))) + return SlurmJob(job_id=None, job_workdir=None) + + try: + workload_metadata = self.workloads[task.workload_key].get('metadata', {}) + launcher_type = workload_metadata.get('run', {}).get('launcher_type') + # Get venv environment with the correct type env = get_venv_environment(venv_path, venv_type) @@ -575,14 +639,19 @@ def launch(self, task): # Convert all task override values to strings task_env = {k: str(v) for k, v in task.env_overrides.items()} env.update(task_env) + if launcher_type == 'megatron_bridge': + apply_nemo_workload_args_contract(env, task.extra_workload_args) + 
apply_nemo_explicit_env_contract(env, task.explicit_env_overrides) # Handle model parameter overrides if task.model_overrides: env.update({k.upper(): str(v) for k, v in task.model_overrides.items()}) + if slurm_args: + env[ADDITIONAL_SLURM_PARAMS_KEY] = slurm_args.to_additional_slurm_params() + # Handle custom tool mounts if needed (workarounds for container bugs during profiling) try: - workload_metadata = self.workloads[task.workload_key].get('metadata', {}) tool_mounts = get_tool_mounts( llmb_install=self.config.llmb_install, workload_metadata=workload_metadata, @@ -641,34 +710,18 @@ def launch(self, task): ) return SlurmJob(job_id=None, job_workdir=None) else: + try: + parsed_job_id = parse_slurm_job_id(job_id) + except ValueError as e: + logger.error(format_task_output(task, prefix="ERROR: ", suffix=str(e))) + return SlurmJob(job_id=None, job_workdir=None) logger.info(format_task_output(task, prefix="LAUNCHED: ", suffix=f"jobid={job_id}")) logger.info(f"JobID: {job_id}, Workdir: {local_dir}") - # Apply scontrol nice if specified - if nice_value := (task.extra_slurm_params or {}).get('nice'): - scontrol_cmd = ["scontrol", "update", f"jobid={job_id}", f"nice={nice_value}"] - try: - scontrol_result = subprocess.run( - scontrol_cmd, capture_output=True, check=False, text=True, timeout=10 - ) - if scontrol_result.returncode != 0: - # Log non-zero return codes for debugging, as this can happen if the job already started. - logger.debug( - f"scontrol failed for job {job_id} (nice={nice_value}): {scontrol_result.stderr.strip()}" - ) - except FileNotFoundError: - logger.warning("`scontrol` command not found. 
Cannot apply nice value.") - except subprocess.TimeoutExpired: - logger.warning(f"Timeout executing `scontrol` for job {job_id}.") - except Exception as e: - logger.warning( - f"An unexpected error occurred while executing scontrol for job {job_id}: {e}" - ) - # Create llmb-config.yaml file in the experiment directory - create_llmb_config(task, job_id, local_dir, self.config, self.workloads) + config_path = create_llmb_config(task, str(parsed_job_id), local_dir, self.config, self.workloads) - return SlurmJob(job_id=job_id, job_workdir=local_dir) + return SlurmJob(job_id=parsed_job_id, job_workdir=local_dir, llmb_config_path=config_path) else: console.print("\n[bold red]Launch Error:[/bold red]") console.print(result.stderr) @@ -735,6 +788,8 @@ def run_tests(config, task_list, workloads): failed_tasks.append(task) continue + record_job_submission(config, task, slurm_job, workloads) + # Run post-processing pipeline (resparse, upload, workload inspector) if slurm_job.job_id and slurm_job.job_workdir: try: diff --git a/cli/llmb-run/src/llmb_run/job_logs.py b/cli/llmb-run/src/llmb_run/job_logs.py new file mode 100644 index 0000000..4983a22 --- /dev/null +++ b/cli/llmb-run/src/llmb_run/job_logs.py @@ -0,0 +1,87 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: MIT +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +# DEALINGS IN THE SOFTWARE. + +"""Helpers for locating and reading llmb-run job log files.""" + +from __future__ import annotations + +import pathlib +import re +import subprocess +from collections import deque +from dataclasses import dataclass + + +@dataclass(frozen=True) +class JobLogFile: + path: pathlib.Path + retry: int | None = None + + +def find_job_logs(log_dir: str | pathlib.Path, job_id: int) -> list[JobLogFile]: + """Find retry-numbered workload log files for one Slurm job.""" + directory = pathlib.Path(log_dir) + if not directory.is_dir(): + raise FileNotFoundError(f"Log directory not found: {directory}") + + pattern = re.compile(rf"^log-.*_{re.escape(str(job_id))}_(\d+)\.out$") + logs = [ + JobLogFile(path=path, retry=int(match.group(1))) + for path in directory.iterdir() + if path.is_file() and (match := pattern.match(path.name)) + ] + return sorted(logs, key=lambda f: f.retry) + + +def find_configured_sbatch_logs(log_dir: str | pathlib.Path, job_id: int) -> list[JobLogFile]: + """Find configured_sbatch logs, preferring 
workload logs over Slurm stdout.""" + logs = find_job_logs(log_dir, job_id) + if logs: + return logs + + slurm_log = pathlib.Path(log_dir) / f"slurm-{job_id}.out" + if slurm_log.is_file(): + return [JobLogFile(path=slurm_log)] + + return [] + + +def active_job_log(logs: list[JobLogFile]) -> JobLogFile | None: + if not logs: + return None + return logs[-1] + + +def read_tail(path: str | pathlib.Path, line_count: int) -> str: + """Read the last line_count lines from a text log file.""" + if line_count < 1: + raise ValueError("--tail must be at least 1.") + + with pathlib.Path(path).open("r", errors="replace") as f: + lines = deque(f, maxlen=line_count) + + return "".join(lines).rstrip("\n") + + +def follow_tail(path: str | pathlib.Path, line_count: int) -> int: + """Follow a log using the platform tail command.""" + if line_count < 1: + raise ValueError("--tail must be at least 1.") + + result = subprocess.run(["tail", "-n", str(line_count), "-f", str(path)], check=False) + return result.returncode diff --git a/cli/llmb-run/src/llmb_run/main.py b/cli/llmb-run/src/llmb_run/main.py index 9488082..021accf 100644 --- a/cli/llmb-run/src/llmb_run/main.py +++ b/cli/llmb-run/src/llmb_run/main.py @@ -21,18 +21,32 @@ import logging import os +import pathlib import sys from importlib.metadata import PackageNotFoundError from importlib.metadata import version as package_version from typing import Annotated, Optional import typer +import yaml from llmb_run.archive import run_archive from llmb_run.config_manager import ClusterConfig, get_cluster_config +from llmb_run.env_args import parse_cli_env_args from llmb_run.exemplar import generate_exemplar_tasks +from llmb_run.job_history import ( + format_job_details, + format_jobs_table, + get_job, + list_jobs, + rebuild_history, + refresh_non_terminal_jobs, + refresh_requested_jobs, +) from llmb_run.job_launcher import run_tests +from llmb_run.job_logs import active_job_log, find_configured_sbatch_logs, find_job_logs, follow_tail, 
read_tail from llmb_run.metadata_utils import parse_workload_name +from llmb_run.slurm_args import build_cli_slurm_args, validate_no_additional_slurm_params_conflict from llmb_run.task_generation import TaskGenerationRequest, ValidationError, generate_tasks from llmb_run.tasks import ( format_task_output, @@ -88,6 +102,13 @@ def format(self, record): add_completion=True, context_settings={"help_option_names": ["-h", "--help"]}, ) +jobs_app = typer.Typer( + help='View llmb-run job history and logs.', + no_args_is_help=False, + invoke_without_command=True, + context_settings={"help_option_names": ["-h", "--help"]}, +) +app.add_typer(jobs_app, name="jobs") class AppContext: @@ -99,6 +120,31 @@ def __init__(self): self.verbose = False +def _get_recipe_version() -> Optional[tuple[str, str]]: + """Return (recipe_version, abs_repo_path) if discoverable, else None. + + Lightweight: never raises, never logs. Uses the same cluster_config.yaml + resolution as the launcher (CWD takes precedence over $LLMB_INSTALL). 
+ """ + try: + candidates = [pathlib.Path.cwd() / 'cluster_config.yaml'] + if install := os.environ.get('LLMB_INSTALL'): + candidates.append(pathlib.Path(install) / 'cluster_config.yaml') + cfg_path = next((p for p in candidates if p.exists()), None) + if cfg_path is None: + return None + cfg = yaml.safe_load(cfg_path.read_text()) or {} + repo = cfg.get('llmb_repo') or (cfg.get('launcher') or {}).get('llmb_repo') + if not repo: + return None + release_data = yaml.safe_load((pathlib.Path(repo) / 'release.yaml').read_text()) or {} + if version := release_data.get('llmb_version'): + return str(version), str(pathlib.Path(repo).resolve()) + return None + except Exception: + return None + + def version_callback(value: bool): if value: try: @@ -106,6 +152,10 @@ def version_callback(value: bool): except PackageNotFoundError: version = "unknown" typer.echo(f"llmb-run {version}") + recipe = _get_recipe_version() + if recipe is not None: + recipe_version, repo_path = recipe + typer.echo(f"Recipe Version {recipe_version} ({repo_path})") raise typer.Exit() @@ -149,6 +199,11 @@ def main_callback( logger.error(f"Configuration error: {e}") raise typer.Exit(code=EXIT_VALIDATION_ERROR) from e + # Archive and most jobs history commands only require cluster config. + # `jobs rebuild` loads workload metadata in its command handler. + if ctx.invoked_subcommand in {'archive', 'jobs'}: + return + # Best-effort gsw-common image freshness check try: from llmb_run.internal.image_updater import check_gsw_common_update @@ -159,10 +214,6 @@ def main_callback( except Exception as e: logger.debug(f"gsw-common update check skipped: {e}") - # Archive only requires cluster config; skip workload metadata loading. 
- if ctx.invoked_subcommand == 'archive': - return - # Load workloads try: app_ctx.workloads = get_workloads(app_ctx.cluster_config) @@ -252,7 +303,7 @@ def report_validation_results(validated_tasks, error_summary, task_list, cluster error_summary: Dictionary of validation errors task_list: Original list of all tasks cluster_config: Cluster configuration - mode_name: Name of the mode for error messages (e.g., "bulk", "submit-all") + mode_name: Name of the mode for error messages (e.g., "submit", "exemplar") """ cluster_gpu_type = cluster_config.gpu_type @@ -297,11 +348,191 @@ def report_validation_results(validated_tasks, error_summary, task_list, cluster logger.debug(f"✅ All {len(task_list)} tasks validated successfully.") +def _ctx_app_context(ctx: typer.Context) -> AppContext: + current = ctx + while current is not None: + if isinstance(current.obj, AppContext): + return current.obj + current = current.parent + raise RuntimeError("Missing llmb-run application context.") + + +def _jobs_list_impl(ctx: typer.Context) -> None: + app_ctx = _ctx_app_context(ctx) + _, refresh_error = refresh_non_terminal_jobs(app_ctx.cluster_config) + rows = list_jobs(app_ctx.cluster_config) + typer.echo(format_jobs_table(rows)) + if refresh_error: + # Print after the table so users notice it even when the table scrolls. 
+ logger.warning(f"sacct unavailable; status may be stale ({refresh_error}).") + + +def _get_job_or_exit(app_ctx: AppContext, job_id: int): + row = get_job(app_ctx.cluster_config, job_id) + if row is None: + logger.error(f"Job {job_id} was not found in llmb-run history.") + raise typer.Exit(code=EXIT_VALIDATION_ERROR) + return row + + +@jobs_app.callback() +def jobs_callback(ctx: typer.Context): + """View llmb-run job history.""" + if ctx.invoked_subcommand is None: + _jobs_list_impl(ctx) + + +@jobs_app.command(name="list") +def jobs_list(ctx: typer.Context): + """List known jobs and refresh non-terminal Slurm states.""" + _jobs_list_impl(ctx) + + +@jobs_app.command(name="show") +def jobs_show(ctx: typer.Context, job_id: Annotated[int, typer.Argument(help='Slurm job ID to show.')]): + """Show details for a single job, including its log directory.""" + app_ctx = _ctx_app_context(ctx) + _, refresh_error = refresh_non_terminal_jobs(app_ctx.cluster_config) + row = _get_job_or_exit(app_ctx, job_id) + typer.echo(format_job_details(row)) + if refresh_error: + logger.warning(f"sacct unavailable; status may be stale ({refresh_error}).") + + +@jobs_app.command(name="log") +def jobs_log( + ctx: typer.Context, + job_id: Annotated[int, typer.Argument(help='Slurm job ID to inspect.')], + tail_lines: Annotated[int, typer.Option('--tail', min=1, help='Number of lines to show.')] = 200, + follow: Annotated[ + bool, typer.Option('-f', '--follow', help='Follow the active log file after printing the initial tail.') + ] = False, + print_path: Annotated[bool, typer.Option('--path', help='Print the active log file path only.')] = False, + print_dir: Annotated[bool, typer.Option('--dir', help='Print the job log directory only.')] = False, + list_files: Annotated[bool, typer.Option('--list', help='List all matching retry log files for the job.')] = False, +): + """Show or follow the active log for a single job.""" + app_ctx = _ctx_app_context(ctx) + row = _get_job_or_exit(app_ctx, job_id) 
+ launcher_type = row["launcher_type"] + if launcher_type == 'sbatch': + logger.error(f"Job {job_id} uses legacy sbatch logging, which llmb-run cannot resolve reliably.") + raise typer.Exit(code=EXIT_VALIDATION_ERROR) + + log_dir = row["log_dir"] + if not log_dir: + logger.error(f"Job {job_id} does not have a log directory recorded. Run `llmb-run jobs rebuild` to rescan.") + raise typer.Exit(code=EXIT_VALIDATION_ERROR) + + selected_modes = sum(bool(value) for value in (print_path, print_dir, list_files)) + if selected_modes > 1: + logger.error("Use only one of --path, --dir, or --list.") + raise typer.Exit(code=EXIT_VALIDATION_ERROR) + if follow and selected_modes: + logger.error("--follow can only be used when printing log contents.") + raise typer.Exit(code=EXIT_VALIDATION_ERROR) + + if print_dir: + typer.echo(log_dir) + return + + try: + if launcher_type in {'nemo', 'megatron_bridge'}: + logs = find_job_logs(log_dir, job_id) + elif launcher_type == 'configured_sbatch': + logs = find_configured_sbatch_logs(log_dir, job_id) + else: + logger.error(f"Job {job_id} has unsupported launcher type '{launcher_type}'.") + raise typer.Exit(code=EXIT_VALIDATION_ERROR) + except FileNotFoundError as e: + logger.error(str(e)) + raise typer.Exit(code=EXIT_VALIDATION_ERROR) from e + + active_log = active_job_log(logs) + if list_files: + if not logs: + logger.info(f"No log files found for job {job_id} in {log_dir}.") + return + for log_file in logs: + suffix = " (active)" if log_file == active_log else "" + label = str(log_file.retry) if log_file.retry is not None else "slurm" + typer.echo(f"{label}: {log_file.path}{suffix}") + return + + if active_log is None: + logger.error(f"No log file found for job {job_id} in {log_dir}.") + raise typer.Exit(code=EXIT_VALIDATION_ERROR) + + if print_path: + typer.echo(active_log.path) + return + + try: + if follow: + raise typer.Exit(code=follow_tail(active_log.path, tail_lines)) + tail_output = read_tail(active_log.path, tail_lines) + except 
ValueError as e: + logger.error(str(e)) + raise typer.Exit(code=EXIT_VALIDATION_ERROR) from e + except OSError as e: + logger.error(f"Unable to read log file {active_log.path}: {e}") + raise typer.Exit(code=EXIT_SYSTEM_ERROR) from e + + if tail_output: + typer.echo(tail_output) + + +@jobs_app.command(name="refresh") +def jobs_refresh( + ctx: typer.Context, + job_ids: Annotated[list[int], typer.Argument(help='Slurm job ID(s) to force refresh.')], +): + """Force-refresh one or more job records.""" + app_ctx = _ctx_app_context(ctx) + requested_job_ids = sorted({int(job_id) for job_id in job_ids}) + missing_job_ids = [job_id for job_id in requested_job_ids if get_job(app_ctx.cluster_config, job_id) is None] + if missing_job_ids: + ids = ", ".join(str(job_id) for job_id in missing_job_ids) + logger.error(f"Job ID(s) not found in llmb-run history: {ids}.") + raise typer.Exit(code=EXIT_VALIDATION_ERROR) + + refreshed, refresh_error = refresh_requested_jobs(app_ctx.cluster_config, requested_job_ids) + if refresh_error: + logger.error(f"sacct unavailable: {refresh_error}.") + raise typer.Exit(code=EXIT_SYSTEM_ERROR) + logger.info(f"Refreshed {refreshed} job status records.") + + +@jobs_app.command(name="rebuild") +def jobs_rebuild(ctx: typer.Context): + """Rebuild job history by scanning llmb-config files under $LLMB_INSTALL.""" + app_ctx = _ctx_app_context(ctx) + try: + workloads = get_workloads(app_ctx.cluster_config) + except Exception as e: + logger.error(f"Failed to load workloads: {e}") + raise typer.Exit(code=EXIT_SYSTEM_ERROR) from e + + stats = rebuild_history(app_ctx.cluster_config, workloads) + logger.info(f"Job history database: {stats.db_path}") + logger.info(f"Scanned {stats.scanned} llmb-config files; imported {stats.imported}, skipped {stats.skipped}.") + if stats.refresh_error: + logger.warning(f"sacct unavailable; refreshed statuses may be stale ({stats.refresh_error}).") + + def _submit_impl(ctx: typer.Context, request: TaskGenerationRequest, dryrun: 
bool, mode_name: str = "submit"): """Shared implementation for all submission commands.""" app_ctx: AppContext = ctx.obj try: + # Early check: fail fast if CLI slurm flags conflict with env/cluster-level + # ADDITIONAL_SLURM_PARAMS. Per-workload/task checks remain in the launcher. + if request.slurm_args: + validate_no_additional_slurm_params_conflict( + cli_args=request.slurm_args, + cluster_environment=app_ctx.cluster_config.environment, + ) + # Generate tasks task_list = generate_tasks(request) @@ -325,6 +556,16 @@ def _submit_impl(ctx: typer.Context, request: TaskGenerationRequest, dryrun: boo # Report results report_validation_results(validated_tasks, error_summary, task_list, app_ctx.cluster_config, mode_name) + if request.slurm_args: + for task in validated_tasks: + workload_environment = app_ctx.cluster_config.workload_config(task.workload_key).get("environment", {}) + validate_no_additional_slurm_params_conflict( + cli_args=request.slurm_args, + cluster_environment=app_ctx.cluster_config.environment, + workload_environment=workload_environment, + task_environment=task.env_overrides, + ) + # Print the concrete jobs we’re about to submit (kept concise; launcher output follows). logger.info(f"Jobs ({len(validated_tasks)}):") for task in validated_tasks: @@ -357,7 +598,7 @@ def submit( typer.Option( '-s', '--model-size', - help='Size of the model (e.g., 7b, 13b). Requires explicit single workload via -w.', + help='Size of the model (e.g., 7b, 13b, 1t). Requires explicit single workload via -w.', ), ] = None, dtype: Annotated[ @@ -389,6 +630,13 @@ def submit( ] = None, repeats: Annotated[int, typer.Option('-r', '--repeats', help='Number of repeats for each test configuration.')] = 1, profile: Annotated[bool, typer.Option('-p', '--profile', help='Enable Profiling for jobs.')] = False, + dump_env: Annotated[ + bool, + typer.Option( + '--dump-env', + help='Write a redacted rank-0 environment snapshot for Megatron-Bridge workloads. 
Ignored for other workloads.', + ), + ] = False, proxy: Annotated[bool, typer.Option('--proxy', help='Use proxy scales.')] = False, dryrun: Annotated[ bool, @@ -402,7 +650,39 @@ def submit( ), ] = False, nice: Annotated[ - Optional[int], typer.Option('--nice', help='Lower the priority of the job using Slurm --nice feature.') + Optional[int], + typer.Option('--nice', help='Lower the job priority via Slurm nice.', rich_help_panel='Slurm'), + ] = None, + nodelist: Annotated[ + Optional[str], + typer.Option('--nodelist', help='Restrict the job to a specific node list.', rich_help_panel='Slurm'), + ] = None, + exclude: Annotated[ + Optional[str], typer.Option('--exclude', help='Exclude specific nodes from the job.', rich_help_panel='Slurm') + ] = None, + reservation: Annotated[ + Optional[str], + typer.Option('--reservation', help='Submit the job under a Slurm reservation.', rich_help_panel='Slurm'), + ] = None, + segment: Annotated[ + Optional[int], + typer.Option('--segment', help='Set the Slurm segment size for the job.', rich_help_panel='Slurm'), + ] = None, + slurm_arg_values: Annotated[ + Optional[list[str]], + typer.Option( + '--slurm-arg', + help='Repeatable raw Slurm parameter in `key=value` or bare-flag form.', + rich_help_panel='Slurm', + ), + ] = None, + env_values: Annotated[ + Optional[list[str]], + typer.Option( + '--env', + help='Repeatable environment variable override in `KEY=value` form.', + rich_help_panel='Slurm', + ), ] = None, ): """ @@ -410,9 +690,24 @@ def submit( """ app_ctx: AppContext = ctx.obj - extra_slurm_params = {} - if nice is not None: - extra_slurm_params['nice'] = nice + try: + explicit_env_overrides = parse_cli_env_args(env_values) + except ValueError as e: + logger.error(str(e)) + raise typer.Exit(code=EXIT_VALIDATION_ERROR) from e + + try: + slurm_args = build_cli_slurm_args( + nodelist=nodelist, + exclude=exclude, + reservation=reservation, + segment=segment, + nice=nice, + slurm_args=slurm_arg_values, + ) + except ValueError as 
e: + logger.error(str(e)) + raise typer.Exit(code=EXIT_VALIDATION_ERROR) from e request = TaskGenerationRequest( workloads=app_ctx.workloads, @@ -429,7 +724,9 @@ def submit( profile=profile, proxy=proxy, force=force, - extra_slurm_params=extra_slurm_params, + slurm_args=slurm_args, + explicit_env_overrides=explicit_env_overrides, + extra_workload_args=("--dump_env",) if dump_env else (), ) _submit_impl(ctx, request, dryrun, mode_name="submit") @@ -461,169 +758,6 @@ def list_workloads( raise typer.Exit(code=EXIT_VALIDATION_ERROR) -@app.command() -def single( - ctx: typer.Context, - workload: Annotated[ - str, typer.Option('-w', '--workload', help='Name of the workload (e.g., "pretraining_nemotron").') - ], - model_size: Annotated[str, typer.Option('-s', '--model-size', help='Size of the model (e.g., 7b, 13b).')], - dtype: Annotated[str, typer.Option('--dtype', help='Data type (e.g., fp16, bf16).')], - scale: Annotated[str, typer.Option('--scale', help='Scale parameter indicating the number of GPUs.')], - profile: Annotated[bool, typer.Option('-p', '--profile', help='Enable Profiling for job.')] = False, - dryrun: Annotated[ - bool, typer.Option('-d', '--dryrun', help='List the job to be submitted without actually submitting it.') - ] = False, - force: Annotated[ - bool, - typer.Option( - '--force', - help='Bypass dtype/scale validation for one explicit task. Use with caution.', - ), - ] = False, -): - """ - (DEPRECATED) Submit a single job. Use 'llmb-run submit' instead. - """ - logger.warning("⚠️ 'single' command is deprecated. 
Please use 'llmb-run submit' instead.") - logger.warning(" Use:") - logger.warning(f" ↳ llmb-run submit -w {workload} -s {model_size} --dtype {dtype} --scale {scale}") - - app_ctx: AppContext = ctx.obj - - request = TaskGenerationRequest( - workloads=app_ctx.workloads, - cluster_config=app_ctx.cluster_config, - workload=workload, - model_size=model_size, - dtype=dtype, - scale=scale, # Passed as string, TaskGenerationRequest handles parsing - profile=profile, - force=force, - ) - - _submit_impl(ctx, request, dryrun, mode_name="single") - - -@app.command() -def bulk( - ctx: typer.Context, - input_file: Annotated[ - str, typer.Argument(help='Path to the workload specification file (simple .txt or advanced .yaml).') - ], - dryrun: Annotated[ - bool, typer.Option('-d', '--dryrun', help='List all jobs to be submitted without actually submitting them.') - ] = False, -): - """ - (DEPRECATED) Submit multiple jobs from a specification file. Use 'llmb-run submit -f' instead. - """ - logger.warning("⚠️ 'bulk' command is deprecated. Please use 'llmb-run submit -f ' instead.") - logger.warning(" Use:") - logger.warning(f" ↳ llmb-run submit -f {input_file}{' --dry-run' if dryrun else ''}") - - app_ctx: AppContext = ctx.obj - - request = TaskGenerationRequest( - workloads=app_ctx.workloads, - cluster_config=app_ctx.cluster_config, - file_path=input_file, - ) - - _submit_impl(ctx, request, dryrun, mode_name="bulk") - - -@app.command() -def submit_all( - ctx: typer.Context, - max_scale: Annotated[ - Optional[int], typer.Option('--max-scale', help='Maximum scale (number of GPUs) to test up to.') - ] = None, - min_scale: Annotated[ - bool, - typer.Option( - '--min-scale', help='When set, only run the minimum scale per the metadata for all installed workloads.' - ), - ] = False, - scales: Annotated[ - Optional[str], - typer.Option( - '--scales', - help='Comma-separated list of specific scales to run (e.g., "8,16,32" or "16"). 
Mutually exclusive with --min-scale and --max-scale.', - ), - ] = None, - dtype: Annotated[ - Optional[str], - typer.Option( - '--dtype', - help='Comma separated list of dtypes to run. If unset, run all available dtypes per metadata for a workload.', - ), - ] = None, - workloads: Annotated[ - Optional[str], - typer.Option( - '-w', - '--workloads', - help='Comma separated list of workloads to run. Reduces scope to only the specified workloads.', - ), - ] = None, - repeats: Annotated[int, typer.Option('--repeats', help='Number of repeats for each test configuration.')] = 1, - profile: Annotated[bool, typer.Option('-p', '--profile', help='Enable profiling for all jobs.')] = False, - dryrun: Annotated[ - bool, typer.Option('-d', '--dryrun', help='List all jobs to be submitted without actually submitting them.') - ] = False, - nice: Annotated[ - Optional[int], typer.Option('--nice', help='Lower the priority of the job using Slurm --nice feature.') - ] = None, -): - """ - (DEPRECATED) Submit jobs for all installed recipes. Use 'llmb-run submit' instead. - """ - logger.warning("⚠️ 'submit-all' command is deprecated. 
Please use 'llmb-run submit' instead.") - logger.warning(" Use:") - submit_all_cmd = "llmb-run submit" - if workloads: - submit_all_cmd += f" -w {workloads}" - if dtype: - submit_all_cmd += f" --dtype {dtype}" - if scales: - submit_all_cmd += f" --scale {scales}" - if max_scale is not None: - submit_all_cmd += f" --max-scale {max_scale}" - if min_scale: - submit_all_cmd += " --min-scale" - if repeats != 1: - submit_all_cmd += f" -r {repeats}" - if profile: - submit_all_cmd += " -p" - if nice is not None: - submit_all_cmd += f" --nice {nice}" - if dryrun: - submit_all_cmd += " --dry-run" - logger.warning(f" ↳ {submit_all_cmd}") - - app_ctx: AppContext = ctx.obj - - extra_slurm_params = {} - if nice is not None: - extra_slurm_params['nice'] = nice - - request = TaskGenerationRequest( - workloads=app_ctx.workloads, - cluster_config=app_ctx.cluster_config, - workload=workloads, - dtype=dtype, - scale=scales, - max_scale=max_scale, - min_scale=min_scale, - repeats=repeats, - profile=profile, - extra_slurm_params=extra_slurm_params, - ) - - _submit_impl(ctx, request, dryrun, mode_name="submit-all") - - @app.command() def exemplar( ctx: typer.Context, @@ -637,7 +771,7 @@ def exemplar( '-r', '--repeats', min=1, - help='Number of repeats for each test configuration. If not provided, uses value from exemplar.yaml config.repeats (default: 3).', + help='Number of repeats for each test configuration. If not provided, uses exemplar.yaml config.repeats (fallback: 1).', ), ] = None, ): @@ -647,7 +781,7 @@ def exemplar( Runs workloads listed in exemplar.yaml for your cluster's GPU type. All workloads must be installed. - Defaults: scale=512, profile=true, repeats=3 (override with -r). + Fallbacks when omitted from exemplar.yaml: scale=512, profile=false, repeats=1 (override repeats with -r). If profile=true, the last repeat is profiled and earlier repeats are non-profiled. 
""" app_ctx: AppContext = ctx.obj diff --git a/cli/llmb-run/src/llmb_run/metadata_utils.py b/cli/llmb-run/src/llmb_run/metadata_utils.py index c3b3a9f..da0e83c 100644 --- a/cli/llmb-run/src/llmb_run/metadata_utils.py +++ b/cli/llmb-run/src/llmb_run/metadata_utils.py @@ -59,6 +59,7 @@ # Known dtype keys supported by our tooling. Extend as needed. _KNOWN_DTYPES = {"fp8", "bf16", "nvfp4", "mxfp4"} +_MODEL_SIZE_SUFFIX_RE = re.compile(r'^(?P\d+(?:\.\d+)?)(?P[bt])$', re.IGNORECASE) def normalize_model_dtype_config(model_config: dict) -> Dict[str, Dict[str, object]]: @@ -144,7 +145,7 @@ def normalize_model_dtype_config(model_config: dict) -> Dict[str, Dict[str, obje def parse_workload_name(workload_name: str) -> tuple[str, str | None]: """Parse workload name into base workload_key and optional model_size suffix. - The model_size suffix pattern is: _[.]b at end of string + The model_size suffix pattern is: _[.](b|t) at end of string Args: workload_name: Full workload name (e.g., 'pretrain_llama3.1_70b') @@ -157,6 +158,8 @@ def parse_workload_name(workload_name: str) -> tuple[str, str | None]: ('pretrain_foo', '7b') >>> parse_workload_name('pretrain_bar_340b') ('pretrain_bar', '340b') + >>> parse_workload_name('pretrain_kimi-k2_1t') + ('pretrain_kimi-k2', '1t') >>> parse_workload_name('pretrain_baz') ('pretrain_baz', None) >>> parse_workload_name('pretrain_invalid_7x') @@ -169,7 +172,24 @@ def parse_workload_name(workload_name: str) -> tuple[str, str | None]: potential_size = parts[1] # Check if last segment matches model size pattern - if re.match(r'^\d+(\.\d+)?b$', potential_size): - return parts[0], potential_size + if _MODEL_SIZE_SUFFIX_RE.match(potential_size): + return parts[0], potential_size.lower() return workload_name, None + + +def model_size_to_billions(model_size: str) -> float: + """Convert a model size suffix to billions for numeric comparisons. 
+ + Examples: + "70b" -> 70.0 + "1t" -> 1000.0 + "1.5t" -> 1500.0 + """ + match = _MODEL_SIZE_SUFFIX_RE.match(model_size) + if not match: + return 0.0 + + value = float(match.group('value')) + unit = match.group('unit').lower() + return value * 1000 if unit == 't' else value diff --git a/cli/llmb-run/src/llmb_run/nsys_mount_handler.py b/cli/llmb-run/src/llmb_run/nsys_mount_handler.py index 104e4c3..cf2e1eb 100644 --- a/cli/llmb-run/src/llmb_run/nsys_mount_handler.py +++ b/cli/llmb-run/src/llmb_run/nsys_mount_handler.py @@ -49,6 +49,7 @@ 'nvcr.io#nvidia/nemo:25.11.01': '/usr/local/cuda-13.0/NsightSystems-cli-2025.5.1', 'nvcr.io#nvidia/nemo:26.02.00': '/usr/local/cuda-13.0/NsightSystems-cli-2025.5.1', 'nvcr.io#nvidia/nemo:26.02.01': '/usr/local/cuda-13.0/NsightSystems-cli-2026.1.0', + 'nvcr.io#nvidia/nemo:26.04.00': '/usr/local/cuda-13.1/NsightSystems-cli-2026.1.1/', } # Container image to CUPTI library path lookup table diff --git a/cli/llmb-run/src/llmb_run/pretrain_log_parser.py b/cli/llmb-run/src/llmb_run/pretrain_log_parser.py new file mode 100644 index 0000000..33cc368 --- /dev/null +++ b/cli/llmb-run/src/llmb_run/pretrain_log_parser.py @@ -0,0 +1,353 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: MIT +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +# DEALINGS IN THE SOFTWARE. + +"""Internal pretraining log parsers for llmb-run.""" + +from __future__ import annotations + +import pathlib +import re +import statistics +from dataclasses import dataclass +from enum import Enum + +from llmb_run.job_logs import active_job_log, find_job_logs + +MIN_ITERATION = 35 +MAX_ITERATION = 44 + +_NUMBER = r"([0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)" +_ITERATION_RE = re.compile(r"iteration\s+(\d+)\s*/\s*(\d+)") +_NEMO_TIME_RE = re.compile(rf"train_step_timing in s:\s*{_NUMBER}") +_NEMO_TFLOPS_RE = re.compile(rf"TFLOPS_per_GPU:\s*{_NUMBER}") +_MBRIDGE_TIME_RE = re.compile(rf"elapsed time per iteration \(ms\):\s*{_NUMBER}") +_MBRIDGE_TFLOPS_RE = re.compile(rf"{_NUMBER}\s*(?:MODEL_TFLOP/s/GPU|TFLOP/s/GPU)") +_MBRIDGE_NAN_GRAD_NORM_RE = re.compile(r"\bgrad[ _]norm\s*:\s*nan\b", re.IGNORECASE) + + +class PretrainLogParseStatus(str, Enum): + SUCCESS = "success" + INCOMPLETE = "incomplete" + INVALID_GRAD_NORM = "invalid_grad_norm" + NO_DATA = "no_data" + NO_LOG = "no_log" + UNSUPPORTED_FRAMEWORK = "unsupported_framework" + + +@dataclass(frozen=True) +class PretrainLogMetrics: + """Averaged pretraining log metrics normalized for downstream use.""" + + time_mean_seconds: float + time_std_seconds: float + time_sample_count: int + tflops_per_gpu_mean: float | None + tflops_per_gpu_std: float | None + tflops_sample_count: int + + +@dataclass(frozen=True) +class PretrainLogParseResult: + """Structured output from one pretraining log parse.""" + + status: PretrainLogParseStatus + parser: str | None + framework: str + log_path: pathlib.Path | None = None + metrics: PretrainLogMetrics | None = None + min_iteration: int = MIN_ITERATION + max_iteration: int = MAX_ITERATION + max_iteration_seen: int | None = 
None + invalid_grad_norm_iteration: int | None = None + final_iteration_seen: bool = False + + @property + def succeeded(self) -> bool: + return self.status == PretrainLogParseStatus.SUCCESS + + +def parse_latest_pretrain_job_log( + log_dir: str | pathlib.Path, + job_id: int, + framework: str, + min_iteration: int = MIN_ITERATION, + max_iteration: int = MAX_ITERATION, +) -> PretrainLogParseResult: + """Parse the most recent retry log for one tracked job.""" + + parser = parser_name_for_framework(framework) + if parser is None: + return PretrainLogParseResult( + status=PretrainLogParseStatus.UNSUPPORTED_FRAMEWORK, + parser=None, + framework=framework, + min_iteration=min_iteration, + max_iteration=max_iteration, + ) + + try: + logs = find_job_logs(log_dir, job_id) + except FileNotFoundError: + logs = [] + + active_log = active_job_log(logs) + if active_log is None: + return PretrainLogParseResult( + status=PretrainLogParseStatus.NO_LOG, + parser=parser, + framework=framework, + min_iteration=min_iteration, + max_iteration=max_iteration, + ) + + return parse_pretrain_log(active_log.path, framework, min_iteration, max_iteration) + + +def parse_pretrain_log( + log_path: str | pathlib.Path, + framework: str, + min_iteration: int = MIN_ITERATION, + max_iteration: int = MAX_ITERATION, +) -> PretrainLogParseResult: + """Parse one pretraining log using the parser selected by framework name.""" + + parser = parser_name_for_framework(framework) + path = pathlib.Path(log_path) + + if parser == "nemo": + return _parse_nemo_log(path, framework, min_iteration, max_iteration) + if parser == "megatron_bridge": + return _parse_megatron_bridge_log(path, framework, min_iteration, max_iteration) + + return PretrainLogParseResult( + status=PretrainLogParseStatus.UNSUPPORTED_FRAMEWORK, + parser=None, + framework=framework, + log_path=path, + min_iteration=min_iteration, + max_iteration=max_iteration, + ) + + +def parser_name_for_framework(framework: str | None) -> str | None: + """Return 
the parser family for a workload framework string.""" + + if not framework: + return None + + normalized = framework.strip().lower() + if normalized == "nemo2": + return "nemo" + if normalized == "megatron_bridge": + return "megatron_bridge" + + return None + + +def _parse_nemo_log( + log_path: pathlib.Path, framework: str, min_iteration: int, max_iteration: int +) -> PretrainLogParseResult: + times_seconds: list[float] = [] + tflops: list[float] = [] + perf_iterations_seen: set[int] = set() + max_iteration_seen: int | None = None + final_iteration_seen = False + + with log_path.open("r", errors="replace") as f: + for line in f: + iteration_marker = _iteration_marker_from_line(line) + if iteration_marker is None: + continue + + iteration, final_iteration = iteration_marker + if iteration == final_iteration: + final_iteration_seen = True + + if not min_iteration <= iteration <= max_iteration: + continue + + time_match = _NEMO_TIME_RE.search(line) + if not time_match: + continue + + times_seconds.append(float(time_match.group(1))) + perf_iterations_seen.add(iteration) + max_iteration_seen = _max_optional(max_iteration_seen, iteration) + + tflops_match = _NEMO_TFLOPS_RE.search(line) + if tflops_match: + tflops.append(float(tflops_match.group(1))) + + return _build_result( + parser="nemo", + framework=framework, + log_path=log_path, + times_seconds=times_seconds, + tflops=tflops, + perf_iterations_seen=perf_iterations_seen, + min_iteration=min_iteration, + max_iteration=max_iteration, + max_iteration_seen=max_iteration_seen, + final_iteration_seen=final_iteration_seen, + ) + + +def _parse_megatron_bridge_log( + log_path: pathlib.Path, framework: str, min_iteration: int, max_iteration: int +) -> PretrainLogParseResult: + # Timing and TFLOPS lines are emitted independently and the AWK in + # common/parse_train_timing_mbridge.sh paired each iteration with the most + # recent TFLOPS line, dropping any TFLOPS samples that arrived back-to-back. 
+ # Collect them as parallel lists and pair positionally so every sample counts. + timing_samples: list[tuple[int, float]] = [] + tflops_samples: list[float] = [] + invalid_grad_norm_iteration: int | None = None + final_iteration_seen = False + + with log_path.open("r", errors="replace") as f: + for line in f: + tflops_match = _MBRIDGE_TFLOPS_RE.search(line) + if tflops_match: + tflops_samples.append(float(tflops_match.group(1))) + + iteration_marker = _iteration_marker_from_line(line) + if iteration_marker is None: + continue + + iteration, final_iteration = iteration_marker + if iteration == final_iteration: + final_iteration_seen = True + + if invalid_grad_norm_iteration is None and _MBRIDGE_NAN_GRAD_NORM_RE.search(line): + invalid_grad_norm_iteration = iteration + + time_match = _MBRIDGE_TIME_RE.search(line) + if not time_match: + continue + + timing_samples.append((iteration, float(time_match.group(1)) / 1000.0)) + + if invalid_grad_norm_iteration is not None: + return PretrainLogParseResult( + status=PretrainLogParseStatus.INVALID_GRAD_NORM, + parser="megatron_bridge", + framework=framework, + log_path=log_path, + min_iteration=min_iteration, + max_iteration=max_iteration, + invalid_grad_norm_iteration=invalid_grad_norm_iteration, + ) + + times_seconds: list[float] = [] + tflops: list[float] = [] + perf_iterations_seen: set[int] = set() + max_iteration_seen: int | None = None + for index, (iteration, time_seconds) in enumerate(timing_samples): + if not min_iteration <= iteration <= max_iteration: + continue + + times_seconds.append(time_seconds) + perf_iterations_seen.add(iteration) + max_iteration_seen = _max_optional(max_iteration_seen, iteration) + if index < len(tflops_samples): + tflops.append(tflops_samples[index]) + + return _build_result( + parser="megatron_bridge", + framework=framework, + log_path=log_path, + times_seconds=times_seconds, + tflops=tflops, + perf_iterations_seen=perf_iterations_seen, + min_iteration=min_iteration, + 
max_iteration=max_iteration, + max_iteration_seen=max_iteration_seen, + final_iteration_seen=final_iteration_seen, + ) + + +def _build_result( + parser: str, + framework: str, + log_path: pathlib.Path, + times_seconds: list[float], + tflops: list[float], + perf_iterations_seen: set[int], + min_iteration: int, + max_iteration: int, + max_iteration_seen: int | None, + final_iteration_seen: bool, +) -> PretrainLogParseResult: + if not times_seconds: + return PretrainLogParseResult( + status=PretrainLogParseStatus.NO_DATA, + parser=parser, + framework=framework, + log_path=log_path, + min_iteration=min_iteration, + max_iteration=max_iteration, + max_iteration_seen=max_iteration_seen, + final_iteration_seen=final_iteration_seen, + ) + + expected_perf_iterations = set(range(min_iteration, max_iteration + 1)) + if not final_iteration_seen or not expected_perf_iterations.issubset(perf_iterations_seen): + return PretrainLogParseResult( + status=PretrainLogParseStatus.INCOMPLETE, + parser=parser, + framework=framework, + log_path=log_path, + min_iteration=min_iteration, + max_iteration=max_iteration, + max_iteration_seen=max_iteration_seen, + final_iteration_seen=final_iteration_seen, + ) + + return PretrainLogParseResult( + status=PretrainLogParseStatus.SUCCESS, + parser=parser, + framework=framework, + log_path=log_path, + metrics=PretrainLogMetrics( + time_mean_seconds=statistics.mean(times_seconds), + time_std_seconds=_stdev_or_zero(times_seconds), + time_sample_count=len(times_seconds), + tflops_per_gpu_mean=statistics.mean(tflops) if tflops else None, + tflops_per_gpu_std=_stdev_or_zero(tflops) if tflops else None, + tflops_sample_count=len(tflops), + ), + min_iteration=min_iteration, + max_iteration=max_iteration, + max_iteration_seen=max_iteration_seen, + final_iteration_seen=final_iteration_seen, + ) + + +def _iteration_marker_from_line(line: str) -> tuple[int, int] | None: + match = _ITERATION_RE.search(line) + if not match: + return None + return 
int(match.group(1)), int(match.group(2)) + + +def _stdev_or_zero(values: list[float]) -> float: + return statistics.stdev(values) if len(values) > 1 else 0.0 + + +def _max_optional(current: int | None, value: int) -> int: + return value if current is None else max(current, value) diff --git a/cli/llmb-run/src/llmb_run/run_config.py b/cli/llmb-run/src/llmb_run/run_config.py index 8ed6dfc..c87e3b4 100644 --- a/cli/llmb-run/src/llmb_run/run_config.py +++ b/cli/llmb-run/src/llmb_run/run_config.py @@ -344,6 +344,7 @@ def create_llmb_config(task, job_id, workdir, config: ClusterConfig, workloads): llmb_config = { 'job_info': { 'job_id': job_id, + 'launcher_type': metadata.get('run', {}).get('launcher_type', ''), 'launch_time': datetime.now().isoformat(), 'experiment_id': experiment_id, }, diff --git a/cli/llmb-run/src/llmb_run/slurm_args.py b/cli/llmb-run/src/llmb_run/slurm_args.py new file mode 100644 index 0000000..1c8a792 --- /dev/null +++ b/cli/llmb-run/src/llmb_run/slurm_args.py @@ -0,0 +1,172 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: MIT +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +# DEALINGS IN THE SOFTWARE. + +"""Shared parsing and serialization helpers for Slurm submit args.""" + +from __future__ import annotations + +import os +from typing import Iterable, Mapping + +from pydantic import BaseModel, ConfigDict + +ADDITIONAL_SLURM_PARAMS_KEY = 'ADDITIONAL_SLURM_PARAMS' +FIRST_CLASS_SLURM_KEYS = ('nodelist', 'exclude', 'reservation', 'segment', 'nice') + + +class SlurmParam(BaseModel): + """A single Slurm parameter rendered as key=value or a bare flag.""" + + model_config = ConfigDict(frozen=True) + + key: str + value: str | None = None + + def render(self) -> str: + return self.key if self.value is None else f"{self.key}={self.value}" + + def as_sbatch_arg(self) -> str: + return f"--{self.render()}" + + +class SlurmArgs(BaseModel): + """Canonical Slurm submit arguments shared across launchers.""" + + model_config = ConfigDict(frozen=True) + + named_params: dict[str, str] = {} + passthrough_params: tuple[SlurmParam, ...] 
= () + + def is_empty(self) -> bool: + return not self.named_params and not self.passthrough_params + + def get_named_param(self, key: str) -> str | None: + return self.named_params.get(key) + + def iter_params(self) -> Iterable[SlurmParam]: + for key in FIRST_CLASS_SLURM_KEYS: + if key in self.named_params: + yield SlurmParam(key=key, value=self.named_params[key]) + yield from self.passthrough_params + + def to_additional_slurm_params(self) -> str: + return ';'.join(param.render() for param in self.iter_params()) + + def to_sbatch_args(self) -> list[str]: + return [param.as_sbatch_arg() for param in self.iter_params()] + + +def build_cli_slurm_args( + *, + nodelist: str | None = None, + exclude: str | None = None, + reservation: str | None = None, + segment: int | None = None, + nice: int | None = None, + slurm_args: Iterable[str] | None = None, +) -> SlurmArgs | None: + """Build canonical Slurm args from first-class CLI flags.""" + named_params: dict[str, str] = {} + if nodelist is not None: + named_params['nodelist'] = nodelist + if exclude is not None: + named_params['exclude'] = exclude + if reservation is not None: + named_params['reservation'] = reservation + if segment is not None: + named_params['segment'] = str(segment) + if nice is not None: + named_params['nice'] = str(nice) + + seen_keys = set(named_params) + passthrough_params: list[SlurmParam] = [] + + for raw_arg in slurm_args or (): + raw_arg = raw_arg.strip() + if not raw_arg: + raise ValueError("`--slurm-arg` cannot be empty.") + if raw_arg.startswith('--'): + raise ValueError( + "`--slurm-arg` values should not include a leading '--'. " + "Use `constraint=gpu` or `exclusive` instead." 
+ ) + + if '=' in raw_arg: + key, value = raw_arg.split('=', 1) + key = key.strip() + value = value.strip() + if not key or not value: + raise ValueError("`--slurm-arg` assignments must be in `key=value` form with non-empty key and value.") + param = SlurmParam(key=key, value=value) + else: + key = raw_arg + param = SlurmParam(key=key, value=None) + + if key in FIRST_CLASS_SLURM_KEYS: + raise ValueError(f"`{key}` has a dedicated flag. " f"Use `--{key}` instead of `--slurm-arg {raw_arg}`.") + + if key in seen_keys: + raise ValueError(f"Duplicate Slurm parameter '{key}' was specified more than once.") + + seen_keys.add(key) + passthrough_params.append(param) + + if not named_params and not passthrough_params: + return None + + return SlurmArgs(named_params=named_params, passthrough_params=tuple(passthrough_params)) + + +def validate_no_additional_slurm_params_conflict( + *, + cli_args: SlurmArgs | None, + cluster_environment: Mapping[str, object] | None = None, + workload_environment: Mapping[str, object] | None = None, + task_environment: Mapping[str, object] | None = None, +) -> None: + """Ensure first-class CLI Slurm args do not mix with direct env-var injection.""" + if cli_args is None or cli_args.is_empty(): + return + + sources: list[str] = [] + if _has_non_empty_value(os.environ, ADDITIONAL_SLURM_PARAMS_KEY): + sources.append('process environment') + if _has_non_empty_value(cluster_environment, ADDITIONAL_SLURM_PARAMS_KEY): + sources.append('cluster config environment') + if _has_non_empty_value(workload_environment, ADDITIONAL_SLURM_PARAMS_KEY): + sources.append('workload config environment') + if _has_non_empty_value(task_environment, ADDITIONAL_SLURM_PARAMS_KEY): + sources.append('task environment overrides') + + if sources: + source_list = ', '.join(sources) + raise ValueError( + f"Cannot combine first-class Slurm CLI flags with {ADDITIONAL_SLURM_PARAMS_KEY} from {source_list}." 
+ ) + + +def _has_non_empty_value(env: Mapping[str, object] | None, key: str) -> bool: + if not env: + return False + value = env.get(key) + if value is None: + return False + return str(value).strip() != '' diff --git a/cli/llmb-run/src/llmb_run/slurm_utils.py b/cli/llmb-run/src/llmb_run/slurm_utils.py index 55a459b..e1d35b4 100644 --- a/cli/llmb-run/src/llmb_run/slurm_utils.py +++ b/cli/llmb-run/src/llmb_run/slurm_utils.py @@ -1,4 +1,4 @@ -# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. # SPDX-License-Identifier: MIT # # Permission is hereby granted, free of charge, to any person obtaining a @@ -22,18 +22,40 @@ """SLURM utilities for job management.""" import logging +import re import shlex import subprocess from dataclasses import dataclass logger = logging.getLogger('llmb_run.slurm_utils') +SACCT_TIMEOUT_SECONDS = 30 + @dataclass class SlurmJob: job_id: int | None - job_status: str = None - job_workdir: str = None + job_status: str | None = None + job_workdir: str | None = None + llmb_config_path: str | None = None + + +@dataclass(frozen=True) +class SlurmAccountingRecord: + job_id: int + state: str + elapsed: str + submit_time: str + node_list: str + exit_code: str + + +def parse_slurm_job_id(raw_job_id: object) -> int: + """Parse a Slurm job id from sbatch/NeMo output.""" + match = re.match(r'\s*(\d+)', str(raw_job_id or '')) + if not match: + raise ValueError(f"Unable to parse Slurm job id from '{raw_job_id}'.") + return int(match.group(1)) def get_slurm_job_status(jobid: int): @@ -56,6 +78,64 @@ def get_slurm_job_status(jobid: int): return None +def get_slurm_job_statuses(job_ids: list[int]) -> dict[int, SlurmAccountingRecord] | None: + """Get Slurm accounting records for multiple jobs with one sacct call. + + Returns a dict (possibly empty) on success. 
Job ids that sacct does not + know about are simply absent from the dict — sacct does not error for + unknown ids. Returns None when sacct itself could not be queried (timeout, + missing binary, non-zero exit), so callers can distinguish "no records + found" from "could not refresh". + """ + if not job_ids: + return {} + + unique_job_ids = sorted({int(job_id) for job_id in job_ids}) + cmd = [ + "sacct", + "-X", + "-P", + "--noheader", + f"--jobs={','.join(str(job_id) for job_id in unique_job_ids)}", + "--format=JobIDRaw,State,Elapsed,Submit,NodeList,ExitCode", + ] + + try: + result = subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=SACCT_TIMEOUT_SECONDS) + except (OSError, subprocess.CalledProcessError, subprocess.TimeoutExpired) as e: + stderr = getattr(e, 'stderr', '') or str(e) + logger.warning(f"Unable to refresh Slurm job status with sacct: {stderr}") + return None + + records: dict[int, SlurmAccountingRecord] = {} + for line in result.stdout.splitlines(): + if not line.strip(): + continue + + fields = line.rstrip('\n').split('|') + if len(fields) < 6: + logger.debug(f"Skipping unexpected sacct output line: {line}") + continue + + raw_job_id, state, elapsed, submit_time, node_list, exit_code = fields[:6] + try: + job_id = parse_slurm_job_id(raw_job_id) + except ValueError: + logger.debug(f"Skipping sacct output with invalid job id: {line}") + continue + + records[job_id] = SlurmAccountingRecord( + job_id=job_id, + state=state.strip(), + elapsed=elapsed.strip(), + submit_time=submit_time.strip(), + node_list=node_list.strip(), + exit_code=exit_code.strip(), + ) + + return records + + def get_cluster_name(): """Get the cluster name from SLURM configuration. 
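The parsing half of `get_slurm_job_statuses` can be tried without a Slurm cluster. The following is a minimal sketch, not the code from this change: `Record` and `parse_sacct_lines` are hypothetical names, and `SAMPLE` only imitates what `sacct -X -P --noheader --format=JobIDRaw,State,Elapsed,Submit,NodeList,ExitCode` might print. It shows the same skip-on-malformed-row behavior, assuming six pipe-delimited columns per row.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:  # hypothetical stand-in for SlurmAccountingRecord
    job_id: int
    state: str
    elapsed: str
    submit_time: str
    node_list: str
    exit_code: str


def parse_sacct_lines(stdout: str) -> dict[int, Record]:
    """Parse pipe-delimited sacct rows, silently skipping malformed ones."""
    records: dict[int, Record] = {}
    for line in stdout.splitlines():
        if not line.strip():
            continue  # blank line
        fields = line.split('|')
        if len(fields) < 6:
            continue  # fewer columns than the requested --format
        match = re.match(r'\s*(\d+)', fields[0])
        if not match:
            continue  # first column does not start with a numeric job id
        job_id = int(match.group(1))
        records[job_id] = Record(job_id, *(f.strip() for f in fields[1:6]))
    return records


# Assumed sample output; real sacct rows may differ in detail.
SAMPLE = (
    "1234|COMPLETED|00:10:05|2026-05-01T10:00:00|node[001-004]|0:0\n"
    "\n"
    "bad line\n"
    "5678|FAILED|00:00:12|2026-05-01T11:00:00|node005|1:0\n"
)
records = parse_sacct_lines(SAMPLE)
```

Because unknown or malformed rows are dropped rather than raised, a caller would see an empty dict for "no records found", which is distinct from the `None` the real function returns when `sacct` itself cannot be queried.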
diff --git a/cli/llmb-run/src/llmb_run/task_generation.py b/cli/llmb-run/src/llmb_run/task_generation.py index 2bbdc27..95f6ef1 100644 --- a/cli/llmb-run/src/llmb_run/task_generation.py +++ b/cli/llmb-run/src/llmb_run/task_generation.py @@ -22,8 +22,8 @@ """Unified task generation logic for llmb-run.""" import logging -from dataclasses import dataclass -from typing import Any, Dict, List, Optional +from dataclasses import dataclass, field +from typing import TYPE_CHECKING, Any, Dict, List, Optional from llmb_run.config_manager import ClusterConfig from llmb_run.metadata_utils import normalize_model_dtype_config, parse_workload_name @@ -40,6 +40,9 @@ logger = logging.getLogger('llmb_run.task_generation') +if TYPE_CHECKING: + from llmb_run.slurm_args import SlurmArgs + class ValidationError(Exception): """Custom exception for validation errors during task generation.""" @@ -69,7 +72,9 @@ class TaskGenerationRequest: profile: bool = False proxy: bool = False force: bool = False - extra_slurm_params: Optional[Dict[str, Any]] = None + slurm_args: Optional['SlurmArgs'] = None + explicit_env_overrides: dict[str, str] = field(default_factory=dict) + extra_workload_args: tuple[str, ...] = () def validate(self) -> None: """Validate parameter combinations.""" @@ -84,6 +89,8 @@ def validate(self) -> None: # Model size restrictions if self.model_size: + self.model_size = self.model_size.strip().lower() + if not self.workload: raise ValidationError("--model-size requires --workload") @@ -94,7 +101,7 @@ def validate(self) -> None: "To target specific model sizes, append them to the workload name.\n" "Workloads without a suffix will run ALL available sizes.\n\n" "Example:\n" - " llmb-run submit -w pretrain_llama3.1_70b,pretrain_nemotron-h" + " llmb-run submit -w pretrain_llama3.1_70b,pretrain_kimi-k2_1t,pretrain_nemotron-h" ) # Strip redundant size suffix from workload name when -s is also provided. 
@@ -172,19 +179,19 @@ def generate_tasks(request: TaskGenerationRequest) -> List[WorkloadTask]: raise ValueError(str(e)) from e if request.file_path: - return _generate_from_file(request) - - if request.force: - return _generate_forced_explicit_task(request) - - if request.model_size: + tasks = _generate_from_file(request) + elif request.force: + tasks = _generate_forced_explicit_task(request) + elif request.model_size: # Has explicit model size # Always use discovery/targeted mode logic which supports metadata-backed # generation, implicit dtypes, and various scale specifications. - return _generate_explicit_workload_with_scale_discovery(request) + tasks = _generate_explicit_workload_with_scale_discovery(request) else: # Discovery mode: workload names include size - return _generate_discovery_tasks(request) + tasks = _generate_discovery_tasks(request) + + return _apply_task_generation_modifiers(tasks, request) def parse_comma_list(value: Optional[str]) -> List[str]: @@ -198,8 +205,8 @@ def parse_comma_list(value: Optional[str]) -> List[str]: def _generate_explicit_workload_with_scale_discovery(request: TaskGenerationRequest) -> List[WorkloadTask]: """Generate tasks for explicit workload with scale discovery. 
- Example: llmb-run submit -w pretrain_nemotron4 -s 340b -d fp8 --max-scale 512 - Generates: nemotron4_340b at all supported scales up to 512 + Example: llmb-run submit -w pretrain_kimi-k2 -s 1t -d fp8 --max-scale 512 + Generates: pretrain_kimi-k2_1t at all supported scales up to 512 """ workload_key = request.workload model_size = request.model_size @@ -217,7 +224,7 @@ def _generate_explicit_workload_with_scale_discovery(request: TaskGenerationRequ else: specific_scales = None - return generate_submit_all_tasks( + tasks = generate_submit_all_tasks( request.workloads, request.cluster_config, request.max_scale, @@ -228,9 +235,10 @@ def _generate_explicit_workload_with_scale_discovery(request: TaskGenerationRequ dtype_filter=dtype_filter, workload_filter=workload_filter, specific_scales=specific_scales, - extra_slurm_params=request.extra_slurm_params, + slurm_args=request.slurm_args, proxy=request.proxy, ) + return tasks def _generate_discovery_tasks(request: TaskGenerationRequest) -> List[WorkloadTask]: @@ -247,7 +255,7 @@ def _generate_discovery_tasks(request: TaskGenerationRequest) -> List[WorkloadTa else: specific_scales = None - return generate_submit_all_tasks( + tasks = generate_submit_all_tasks( request.workloads, request.cluster_config, request.max_scale, @@ -258,9 +266,10 @@ def _generate_discovery_tasks(request: TaskGenerationRequest) -> List[WorkloadTa dtype_filter=dtype_filter, workload_filter=workload_filter, specific_scales=specific_scales, - extra_slurm_params=request.extra_slurm_params, + slurm_args=request.slurm_args, proxy=request.proxy, ) + return tasks def _generate_from_file(request: TaskGenerationRequest) -> List[WorkloadTask]: @@ -273,10 +282,9 @@ def _generate_from_file(request: TaskGenerationRequest) -> List[WorkloadTask]: else: tasks = gen_tasks(tasks_parsed) - # Propagate extra_slurm_params to all generated tasks - if request.extra_slurm_params: + if request.slurm_args: for task in tasks: - task.extra_slurm_params = 
request.extra_slurm_params + task.slurm_args = request.slurm_args return tasks @@ -287,7 +295,7 @@ def _generate_forced_explicit_task(request: TaskGenerationRequest) -> List[Workl workload_key = parse_comma_list(request.workload)[0] model_size = parse_comma_list(request.model_size)[0] else: - # Resolve from workload_size name (e.g., "pretrain_foo_7b" -> "pretrain_foo", "7b") + # Resolve from workload_size name (e.g., "pretrain_foo_1t" -> "pretrain_foo", "1t") workload_key, model_size = parse_workload_name(parse_comma_list(request.workload)[0]) dtype = parse_comma_list(request.dtype)[0] @@ -318,7 +326,7 @@ def _generate_forced_explicit_task(request: TaskGenerationRequest) -> List[Workl ) ) - return [ + tasks = [ WorkloadTask( workload_key=workload_key, model_size=model_size, @@ -326,10 +334,41 @@ def _generate_forced_explicit_task(request: TaskGenerationRequest) -> List[Workl scale=scale, profile=request.profile, proxy=request.proxy, - extra_slurm_params=request.extra_slurm_params or {}, + slurm_args=request.slurm_args, ) for _ in range(request.repeats) ] + return tasks + + +def _apply_task_generation_modifiers(tasks: List[WorkloadTask], request: TaskGenerationRequest) -> List[WorkloadTask]: + """Apply request-level modifiers to generated tasks.""" + tasks = _apply_explicit_env_overrides(tasks, request.explicit_env_overrides) + tasks = _apply_extra_workload_args(tasks, request.extra_workload_args) + return tasks + + +def _apply_explicit_env_overrides(tasks: List[WorkloadTask], overrides: dict[str, str]) -> List[WorkloadTask]: + """Apply explicit CLI env vars to generated tasks.""" + if not overrides: + return tasks + + for task in tasks: + task.env_overrides = {**task.env_overrides, **overrides} + task.explicit_env_overrides = {**task.explicit_env_overrides, **overrides} + + return tasks + + +def _apply_extra_workload_args(tasks: List[WorkloadTask], args: tuple[str, ...]) -> List[WorkloadTask]: + """Apply request-level extra workload args to generated tasks.""" + if 
not args: + return tasks + + for task in tasks: + task.extra_workload_args = (*task.extra_workload_args, *args) + + return tasks def generate_submit_all_tasks( @@ -343,7 +382,7 @@ def generate_submit_all_tasks( dtype_filter=None, workload_filter=None, specific_scales=None, - extra_slurm_params: Optional[Dict[str, Any]] = None, + slurm_args: Optional['SlurmArgs'] = None, proxy=False, ): """Generate tasks for all installed workloads up to max_scale. @@ -363,7 +402,7 @@ def generate_submit_all_tasks( dtype_filter: List of dtypes to filter by, or None for all (default: None) workload_filter: List of workloads to filter by, or None for all (default: None) specific_scales: List of specific scales to run, or None to use max_scale/min_scale logic (default: None) - extra_slurm_params: Optional dictionary of extra Slurm parameters to apply to jobs. + slurm_args: Optional canonical Slurm submit args to apply to jobs. proxy: If True, use proxy_scales instead of production scales (default: False) Returns: @@ -402,10 +441,9 @@ def generate_submit_all_tasks( # Apply workload filter if specified if workload_filter: - # Check if workload_key matches any filter (either exact match or filter starts with workload_key) + # Match a base workload filter or a model-size-specific filter for the same workload. 
workload_matches = False for filter_item in workload_filter: - # Exact match or filter starts with workload (e.g., pretrain_nemotron matches pretrain_nemotron_340b) if workload_key == filter_item or filter_item.startswith(workload_key + '_'): workload_matches = True break @@ -436,7 +474,7 @@ def generate_submit_all_tasks( dtype_filter=dtype_filter, workload_filter=workload_filter, specific_scales=specific_scales, - extra_slurm_params=extra_slurm_params, + slurm_args=slurm_args, proxy=proxy, ) @@ -457,7 +495,7 @@ def _generate_workload_tasks( dtype_filter=None, workload_filter=None, specific_scales=None, - extra_slurm_params: Optional[Dict[str, Any]] = None, + slurm_args: Optional['SlurmArgs'] = None, proxy=False, ): """Generate tasks for a single workload and add them to task_list. @@ -475,7 +513,7 @@ def _generate_workload_tasks( dtype_filter: List of dtypes to filter by, or None for all (default: None) workload_filter: List of workload filters, may include workload_modelsize (default: None) specific_scales: List of specific scales to run, or None to use max_scale/min_scale logic (default: None) - extra_slurm_params: Optional dictionary of extra Slurm parameters to apply to jobs. + slurm_args: Optional canonical Slurm submit args to apply to jobs. 
proxy: If True, use proxy_scales instead of production scales (default: False) """ metadata = workload_data['metadata'] @@ -611,7 +649,7 @@ def _generate_workload_tasks( scale=scale, profile=profile, proxy=proxy, - extra_slurm_params=extra_slurm_params or {}, + slurm_args=slurm_args, ) ) diff --git a/cli/llmb-run/src/llmb_run/task_loader.py b/cli/llmb-run/src/llmb_run/task_loader.py index a535108..6078b51 100644 --- a/cli/llmb-run/src/llmb_run/task_loader.py +++ b/cli/llmb-run/src/llmb_run/task_loader.py @@ -29,6 +29,7 @@ import yaml from llmb_run.config_manager import ClusterConfig +from llmb_run.env_args import validate_env_key, validate_shell_safe_env_value from llmb_run.metadata_utils import parse_workload_name from llmb_run.tasks import WorkloadTask from llmb_run.workload_validator import ( @@ -57,10 +58,10 @@ def get_tasks_simple(workloads, input_file, cluster_config: ClusterConfig | None (dtype_list, scale_list, repeats, profile=False) Example: - pretraining_grok1_314b: - (['fp8', 'bf16'], [128, 256], 3) + pretrain_llama3.1_70b: + ('bf16', [128, 256], 3) # With profiling enabled - ('fp8', [128, 256, 512], 1, True) + ('bf16', [512], 1, True) Note: Inline trailing comments on task lines are not supported. Put comments on their own lines instead. 
@@ -181,8 +182,9 @@ def get_tasks_yaml(input_file, workloads=None, cluster_config: ClusterConfig | N f" Examples:\n" f" • pretrain_llama3.1_70b:\n" f" • pretrain_nemotron-h_56b:\n" + f" • pretrain_kimi-k2_1t:\n" f"\n" - f"Model size must match pattern: _b or _.b\n" + f"Model size must match pattern: _(b|t) or _.(b|t)\n" f"\n" f"To see available workloads:\n" f" llmb-run list" @@ -229,6 +231,14 @@ def get_tasks_yaml(input_file, workloads=None, cluster_config: ClusterConfig | N # Env Overrides task_env = merge_dicts(default_env, overrides.get("env", {})) + normalized_task_env = {} + for _env_key, _env_value in task_env.items(): + env_key = validate_env_key(_env_key, source='YAML env') + if env_key in normalized_task_env: + raise ValueError(f"Duplicate YAML env variable '{env_key}' was specified more than once.") + validate_shell_safe_env_value(env_key, str(_env_value)) + normalized_task_env[env_key] = _env_value + task_env = normalized_task_env # Model Specific Overrides param_overrides = overrides.get("params", {}) @@ -324,7 +334,15 @@ def flatten_yaml_tasks(advanced_tasks): w, m, dt, scale, profile, proxy, env_overrides, model_overrides = t task_list.append( WorkloadTask( - w, m, dt, scale, profile, proxy, env_overrides=env_overrides, model_overrides=model_overrides + w, + m, + dt, + scale, + profile, + proxy, + env_overrides=env_overrides, + explicit_env_overrides=env_overrides, + model_overrides=model_overrides, ) ) return task_list diff --git a/cli/llmb-run/src/llmb_run/tasks.py b/cli/llmb-run/src/llmb_run/tasks.py index 8745a7a..a4e84c0 100644 --- a/cli/llmb-run/src/llmb_run/tasks.py +++ b/cli/llmb-run/src/llmb_run/tasks.py @@ -23,7 +23,10 @@ import logging from dataclasses import dataclass, field -from typing import Any, Dict +from typing import TYPE_CHECKING + +if TYPE_CHECKING: + from llmb_run.slurm_args import SlurmArgs logger = logging.getLogger('llmb_run.tasks') @@ -37,8 +40,10 @@ class WorkloadTask: profile: bool = False proxy: bool = False 
env_overrides: dict = field(default_factory=dict) + explicit_env_overrides: dict = field(default_factory=dict) model_overrides: dict = field(default_factory=dict) - extra_slurm_params: Dict[str, Any] = field(default_factory=dict) + slurm_args: 'SlurmArgs | None' = None + extra_workload_args: tuple[str, ...] = () def format_task_output(task, prefix="", suffix=""): diff --git a/cli/llmb-run/src/llmb_run/workload_validator.py b/cli/llmb-run/src/llmb_run/workload_validator.py index 31f6052..bb0b1fb 100644 --- a/cli/llmb-run/src/llmb_run/workload_validator.py +++ b/cli/llmb-run/src/llmb_run/workload_validator.py @@ -26,13 +26,13 @@ Throughout this codebase, workloads use a consistent naming structure: - workload_key: Full identifier in format "{workload_type}_{workload}" - Example: "pretrain_nemotron4" + Example: "pretrain_" - workload_type: Category of workload (e.g., "pretrain", "finetune", "inference") Example: "pretrain" -- workload: Base workload name (e.g., "nemotron4", "llama3.1") - Example: "nemotron4" +- workload: Base workload name + Example: "" The relationship is: workload_key = f"{workload_type}_{workload}" This format is established when loading metadata and used consistently throughout the tool. 
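The `workload_validator.py` changes sort model sizes with the key `(model_size_to_billions(size), size)` instead of a plain string sort. The body of `model_size_to_billions` is not part of this diff, so the sketch below uses a hypothetical stand-in, `size_to_billions`, and assumes sizes look like `8b`, `70b`, or `1t`, with `t` meaning 1000 billion. It is only meant to show why a numeric key matters once trillion-scale sizes appear.

```python
import re


def size_to_billions(size: str) -> float:
    """Hypothetical stand-in for model_size_to_billions (assumed behavior)."""
    match = re.fullmatch(r'(\d+(?:\.\d+)?)([bt])', size.strip().lower())
    if not match:
        raise ValueError(f"Unrecognized model size '{size}'.")
    value = float(match.group(1))
    # Assumption: a 't' suffix denotes trillions, i.e. 1000 billion.
    return value * 1000 if match.group(2) == 't' else value


sizes = {'70b', '1t', '8b', '405b'}

# Plain string sort orders character by character, so '1t' lands first.
plain = sorted(sizes)

# Numeric key first, raw string second as a deterministic tie-breaker.
by_size = sorted(sizes, key=lambda s: (size_to_billions(s), s))
```

With the tuple key, sizes that map to the same number still sort deterministically by their string form. The real helper may treat unparseable sizes differently from the `ValueError` raised here.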
@@ -47,7 +47,7 @@ from llmb_run.config_manager import ClusterConfig from llmb_run.constants import EXCLUDE_WORKLOADS, METADATA_FILE_PATTERN -from llmb_run.metadata_utils import normalize_model_dtype_config +from llmb_run.metadata_utils import model_size_to_billions, normalize_model_dtype_config logger = logging.getLogger('llmb_run.workload_validator') @@ -164,7 +164,10 @@ def validate_workload_with_details( break if not model_config: - available_sizes = sorted({config.get('model_size') for config in model_configs if config.get('model_size')}) + available_sizes = sorted( + {config.get('model_size') for config in model_configs if config.get('model_size')}, + key=lambda size: (model_size_to_billions(size), size), + ) gpu_info = f" for GPU type '{cluster_gpu_type}'" error_msg = f"Model size '{model_size}' not supported for workload '{workload_key}'{gpu_info}." return False, ValidationErrorType.MODEL_SIZE_NOT_SUPPORTED, error_msg, available_sizes @@ -375,7 +378,7 @@ def print_avail_workloads(workloads, cluster_config, cluster_gpu_type=None, verb model_details[model_size]['gpu_types'].add(gpu_type) if verbose: - for model_size in sorted(all_model_sizes): + for model_size in sorted(all_model_sizes, key=lambda size: (model_size_to_billions(size), size)): details = model_details[model_size] logger.info(f" {model_size}:") if details['dtypes']: @@ -432,7 +435,7 @@ def print_avail_workloads(workloads, cluster_config, cluster_gpu_type=None, verb if len(details['gpu_types']) > 1 or not cluster_gpu_type: logger.info(f" GPU types: {', '.join(sorted(details['gpu_types']))}") else: - model_sizes = sorted(all_model_sizes) + model_sizes = sorted(all_model_sizes, key=lambda size: (model_size_to_billions(size), size)) logger.info(f" Model sizes: {', '.join(model_sizes)}") if not verbose: diff --git a/cli/llmb-run/uv.lock b/cli/llmb-run/uv.lock index 0c7b3d3..a7c1218 100644 --- a/cli/llmb-run/uv.lock +++ b/cli/llmb-run/uv.lock @@ -2,9 +2,27 @@ version = 1 revision = 3 requires-python = 
">=3.10" +[[package]] +name = "annotated-doc" +version = "0.0.4" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/57/ba/046ceea27344560984e26a590f90bc7f4a75b06701f653222458922b558c/annotated_doc-0.0.4.tar.gz", hash = "sha256:fbcda96e87e9c92ad167c2e53839e57503ecfda18804ea28102353485033faa4", size = 7288, upload-time = "2025-11-10T22:07:42.062Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/1e/d3/26bf1008eb3d2daa8ef4cacc7f3bfdc11818d111f7e2d0201bc6e3b49d45/annotated_doc-0.0.4-py3-none-any.whl", hash = "sha256:571ac1dc6991c450b25a9c2d84a3705e2ae7a53467b5d111c24fa8baabbed320", size = 5303, upload-time = "2025-11-10T22:07:40.673Z" }, +] + +[[package]] +name = "annotated-types" +version = "0.7.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/ee/67/531ea369ba64dcff5ec9c3402f9f51bf748cec26dde048a2f973a4eea7f5/annotated_types-0.7.0.tar.gz", hash = "sha256:aff07c09a53a08bc8cfccb9c85b05f1aa9a2a6f23728d790723543408344ce89", size = 16081, upload-time = "2024-05-20T21:33:25.928Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl", hash = "sha256:1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53", size = 13643, upload-time = "2024-05-20T21:33:24.1Z" }, +] + [[package]] name = "black" -version = "26.1.0" +version = "26.3.1" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "click" }, @@ -16,46 +34,46 @@ dependencies = [ { name = "tomli", marker = "python_full_version < '3.11'" }, { name = "typing-extensions", marker = "python_full_version < '3.11'" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/13/88/560b11e521c522440af991d46848a2bde64b5f7202ec14e1f46f9509d328/black-26.1.0.tar.gz", hash = "sha256:d294ac3340eef9c9eb5d29288e96dc719ff269a88e27b396340459dd85da4c58", size 
= 658785, upload-time = "2026-01-18T04:50:11.993Z" } +sdist = { url = "https://files.pythonhosted.org/packages/e1/c5/61175d618685d42b005847464b8fb4743a67b1b8fdb75e50e5a96c31a27a/black-26.3.1.tar.gz", hash = "sha256:2c50f5063a9641c7eed7795014ba37b0f5fa227f3d408b968936e24bc0566b07", size = 666155, upload-time = "2026-03-12T03:36:03.593Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/51/1b/523329e713f965ad0ea2b7a047eeb003007792a0353622ac7a8cb2ee6fef/black-26.1.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:ca699710dece84e3ebf6e92ee15f5b8f72870ef984bf944a57a777a48357c168", size = 1849661, upload-time = "2026-01-18T04:59:12.425Z" }, - { url = "https://files.pythonhosted.org/packages/14/82/94c0640f7285fa71c2f32879f23e609dd2aa39ba2641f395487f24a578e7/black-26.1.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:5e8e75dabb6eb83d064b0db46392b25cabb6e784ea624219736e8985a6b3675d", size = 1689065, upload-time = "2026-01-18T04:59:13.993Z" }, - { url = "https://files.pythonhosted.org/packages/f0/78/474373cbd798f9291ed8f7107056e343fd39fef42de4a51c7fd0d360840c/black-26.1.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:eb07665d9a907a1a645ee41a0df8a25ffac8ad9c26cdb557b7b88eeeeec934e0", size = 1751502, upload-time = "2026-01-18T04:59:15.971Z" }, - { url = "https://files.pythonhosted.org/packages/29/89/59d0e350123f97bc32c27c4d79563432d7f3530dca2bff64d855c178af8b/black-26.1.0-cp310-cp310-win_amd64.whl", hash = "sha256:7ed300200918147c963c87700ccf9966dceaefbbb7277450a8d646fc5646bf24", size = 1400102, upload-time = "2026-01-18T04:59:17.8Z" }, - { url = "https://files.pythonhosted.org/packages/e1/bc/5d866c7ae1c9d67d308f83af5462ca7046760158bbf142502bad8f22b3a1/black-26.1.0-cp310-cp310-win_arm64.whl", hash = "sha256:c5b7713daea9bf943f79f8c3b46f361cc5229e0e604dcef6a8bb6d1c37d9df89", size = 1207038, upload-time = "2026-01-18T04:59:19.543Z" }, - { url = 
"https://files.pythonhosted.org/packages/30/83/f05f22ff13756e1a8ce7891db517dbc06200796a16326258268f4658a745/black-26.1.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:3cee1487a9e4c640dc7467aaa543d6c0097c391dc8ac74eb313f2fbf9d7a7cb5", size = 1831956, upload-time = "2026-01-18T04:59:21.38Z" }, - { url = "https://files.pythonhosted.org/packages/7d/f2/b2c570550e39bedc157715e43927360312d6dd677eed2cc149a802577491/black-26.1.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:d62d14ca31c92adf561ebb2e5f2741bf8dea28aef6deb400d49cca011d186c68", size = 1672499, upload-time = "2026-01-18T04:59:23.257Z" }, - { url = "https://files.pythonhosted.org/packages/7a/d7/990d6a94dc9e169f61374b1c3d4f4dd3037e93c2cc12b6f3b12bc663aa7b/black-26.1.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:fb1dafbbaa3b1ee8b4550a84425aac8874e5f390200f5502cf3aee4a2acb2f14", size = 1735431, upload-time = "2026-01-18T04:59:24.729Z" }, - { url = "https://files.pythonhosted.org/packages/36/1c/cbd7bae7dd3cb315dfe6eeca802bb56662cc92b89af272e014d98c1f2286/black-26.1.0-cp311-cp311-win_amd64.whl", hash = "sha256:101540cb2a77c680f4f80e628ae98bd2bd8812fb9d72ade4f8995c5ff019e82c", size = 1400468, upload-time = "2026-01-18T04:59:27.381Z" }, - { url = "https://files.pythonhosted.org/packages/59/b1/9fe6132bb2d0d1f7094613320b56297a108ae19ecf3041d9678aec381b37/black-26.1.0-cp311-cp311-win_arm64.whl", hash = "sha256:6f3977a16e347f1b115662be07daa93137259c711e526402aa444d7a88fdc9d4", size = 1207332, upload-time = "2026-01-18T04:59:28.711Z" }, - { url = "https://files.pythonhosted.org/packages/f5/13/710298938a61f0f54cdb4d1c0baeb672c01ff0358712eddaf29f76d32a0b/black-26.1.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:6eeca41e70b5f5c84f2f913af857cf2ce17410847e1d54642e658e078da6544f", size = 1878189, upload-time = "2026-01-18T04:59:30.682Z" }, - { url = 
"https://files.pythonhosted.org/packages/79/a6/5179beaa57e5dbd2ec9f1c64016214057b4265647c62125aa6aeffb05392/black-26.1.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:dd39eef053e58e60204f2cdf059e2442e2eb08f15989eefe259870f89614c8b6", size = 1700178, upload-time = "2026-01-18T04:59:32.387Z" }, - { url = "https://files.pythonhosted.org/packages/8c/04/c96f79d7b93e8f09d9298b333ca0d31cd9b2ee6c46c274fd0f531de9dc61/black-26.1.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:9459ad0d6cd483eacad4c6566b0f8e42af5e8b583cee917d90ffaa3778420a0a", size = 1777029, upload-time = "2026-01-18T04:59:33.767Z" }, - { url = "https://files.pythonhosted.org/packages/49/f9/71c161c4c7aa18bdda3776b66ac2dc07aed62053c7c0ff8bbda8c2624fe2/black-26.1.0-cp312-cp312-win_amd64.whl", hash = "sha256:a19915ec61f3a8746e8b10adbac4a577c6ba9851fa4a9e9fbfbcf319887a5791", size = 1406466, upload-time = "2026-01-18T04:59:35.177Z" }, - { url = "https://files.pythonhosted.org/packages/4a/8b/a7b0f974e473b159d0ac1b6bcefffeb6bec465898a516ee5cc989503cbc7/black-26.1.0-cp312-cp312-win_arm64.whl", hash = "sha256:643d27fb5facc167c0b1b59d0315f2674a6e950341aed0fc05cf307d22bf4954", size = 1216393, upload-time = "2026-01-18T04:59:37.18Z" }, - { url = "https://files.pythonhosted.org/packages/79/04/fa2f4784f7237279332aa735cdfd5ae2e7730db0072fb2041dadda9ae551/black-26.1.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:ba1d768fbfb6930fc93b0ecc32a43d8861ded16f47a40f14afa9bb04ab93d304", size = 1877781, upload-time = "2026-01-18T04:59:39.054Z" }, - { url = "https://files.pythonhosted.org/packages/cf/ad/5a131b01acc0e5336740a039628c0ab69d60cf09a2c87a4ec49f5826acda/black-26.1.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:2b807c240b64609cb0e80d2200a35b23c7df82259f80bef1b2c96eb422b4aac9", size = 1699670, upload-time = "2026-01-18T04:59:41.005Z" }, - { url = 
"https://files.pythonhosted.org/packages/da/7c/b05f22964316a52ab6b4265bcd52c0ad2c30d7ca6bd3d0637e438fc32d6e/black-26.1.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:1de0f7d01cc894066a1153b738145b194414cc6eeaad8ef4397ac9abacf40f6b", size = 1775212, upload-time = "2026-01-18T04:59:42.545Z" }, - { url = "https://files.pythonhosted.org/packages/a6/a3/e8d1526bea0446e040193185353920a9506eab60a7d8beb062029129c7d2/black-26.1.0-cp313-cp313-win_amd64.whl", hash = "sha256:91a68ae46bf07868963671e4d05611b179c2313301bd756a89ad4e3b3db2325b", size = 1409953, upload-time = "2026-01-18T04:59:44.357Z" }, - { url = "https://files.pythonhosted.org/packages/c7/5a/d62ebf4d8f5e3a1daa54adaab94c107b57be1b1a2f115a0249b41931e188/black-26.1.0-cp313-cp313-win_arm64.whl", hash = "sha256:be5e2fe860b9bd9edbf676d5b60a9282994c03fbbd40fe8f5e75d194f96064ca", size = 1217707, upload-time = "2026-01-18T04:59:45.719Z" }, - { url = "https://files.pythonhosted.org/packages/6a/83/be35a175aacfce4b05584ac415fd317dd6c24e93a0af2dcedce0f686f5d8/black-26.1.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:9dc8c71656a79ca49b8d3e2ce8103210c9481c57798b48deeb3a8bb02db5f115", size = 1871864, upload-time = "2026-01-18T04:59:47.586Z" }, - { url = "https://files.pythonhosted.org/packages/a5/f5/d33696c099450b1274d925a42b7a030cd3ea1f56d72e5ca8bbed5f52759c/black-26.1.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:b22b3810451abe359a964cc88121d57f7bce482b53a066de0f1584988ca36e79", size = 1701009, upload-time = "2026-01-18T04:59:49.443Z" }, - { url = "https://files.pythonhosted.org/packages/1b/87/670dd888c537acb53a863bc15abbd85b22b429237d9de1b77c0ed6b79c42/black-26.1.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:53c62883b3f999f14e5d30b5a79bd437236658ad45b2f853906c7cbe79de00af", size = 1767806, upload-time = "2026-01-18T04:59:50.769Z" }, - { url = 
"https://files.pythonhosted.org/packages/fe/9c/cd3deb79bfec5bcf30f9d2100ffeec63eecce826eb63e3961708b9431ff1/black-26.1.0-cp314-cp314-win_amd64.whl", hash = "sha256:f016baaadc423dc960cdddf9acae679e71ee02c4c341f78f3179d7e4819c095f", size = 1433217, upload-time = "2026-01-18T04:59:52.218Z" }, - { url = "https://files.pythonhosted.org/packages/4e/29/f3be41a1cf502a283506f40f5d27203249d181f7a1a2abce1c6ce188035a/black-26.1.0-cp314-cp314-win_arm64.whl", hash = "sha256:66912475200b67ef5a0ab665011964bf924745103f51977a78b4fb92a9fc1bf0", size = 1245773, upload-time = "2026-01-18T04:59:54.457Z" }, - { url = "https://files.pythonhosted.org/packages/e4/3d/51bdb3ecbfadfaf825ec0c75e1de6077422b4afa2091c6c9ba34fbfc0c2d/black-26.1.0-py3-none-any.whl", hash = "sha256:1054e8e47ebd686e078c0bb0eaf31e6ce69c966058d122f2c0c950311f9f3ede", size = 204010, upload-time = "2026-01-18T04:50:09.978Z" }, + { url = "https://files.pythonhosted.org/packages/32/a8/11170031095655d36ebc6664fe0897866f6023892396900eec0e8fdc4299/black-26.3.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:86a8b5035fce64f5dcd1b794cf8ec4d31fe458cf6ce3986a30deb434df82a1d2", size = 1866562, upload-time = "2026-03-12T03:39:58.639Z" }, + { url = "https://files.pythonhosted.org/packages/69/ce/9e7548d719c3248c6c2abfd555d11169457cbd584d98d179111338423790/black-26.3.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:5602bdb96d52d2d0672f24f6ffe5218795736dd34807fd0fd55ccd6bf206168b", size = 1703623, upload-time = "2026-03-12T03:40:00.347Z" }, + { url = "https://files.pythonhosted.org/packages/7f/0a/8d17d1a9c06f88d3d030d0b1d4373c1551146e252afe4547ed601c0e697f/black-26.3.1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:6c54a4a82e291a1fee5137371ab488866b7c86a3305af4026bdd4dc78642e1ac", size = 1768388, upload-time = "2026-03-12T03:40:01.765Z" }, + { url = 
"https://files.pythonhosted.org/packages/52/79/c1ee726e221c863cde5164f925bacf183dfdf0397d4e3f94889439b947b4/black-26.3.1-cp310-cp310-win_amd64.whl", hash = "sha256:6e131579c243c98f35bce64a7e08e87fb2d610544754675d4a0e73a070a5aa3a", size = 1412969, upload-time = "2026-03-12T03:40:03.252Z" }, + { url = "https://files.pythonhosted.org/packages/73/a5/15c01d613f5756f68ed8f6d4ec0a1e24b82b18889fa71affd3d1f7fad058/black-26.3.1-cp310-cp310-win_arm64.whl", hash = "sha256:5ed0ca58586c8d9a487352a96b15272b7fa55d139fc8496b519e78023a8dab0a", size = 1220345, upload-time = "2026-03-12T03:40:04.892Z" }, + { url = "https://files.pythonhosted.org/packages/17/57/5f11c92861f9c92eb9dddf515530bc2d06db843e44bdcf1c83c1427824bc/black-26.3.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:28ef38aee69e4b12fda8dba75e21f9b4f979b490c8ac0baa7cb505369ac9e1ff", size = 1851987, upload-time = "2026-03-12T03:40:06.248Z" }, + { url = "https://files.pythonhosted.org/packages/54/aa/340a1463660bf6831f9e39646bf774086dbd8ca7fc3cded9d59bbdf4ad0a/black-26.3.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:bf9bf162ed91a26f1adba8efda0b573bc6924ec1408a52cc6f82cb73ec2b142c", size = 1689499, upload-time = "2026-03-12T03:40:07.642Z" }, + { url = "https://files.pythonhosted.org/packages/f3/01/b726c93d717d72733da031d2de10b92c9fa4c8d0c67e8a8a372076579279/black-26.3.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:474c27574d6d7037c1bc875a81d9be0a9a4f9ee95e62800dab3cfaadbf75acd5", size = 1754369, upload-time = "2026-03-12T03:40:09.279Z" }, + { url = "https://files.pythonhosted.org/packages/e3/09/61e91881ca291f150cfc9eb7ba19473c2e59df28859a11a88248b5cbbc4d/black-26.3.1-cp311-cp311-win_amd64.whl", hash = "sha256:5e9d0d86df21f2e1677cc4bd090cd0e446278bcbbe49bf3659c308c3e402843e", size = 1413613, upload-time = "2026-03-12T03:40:10.943Z" }, + { url = 
"https://files.pythonhosted.org/packages/16/73/544f23891b22e7efe4d8f812371ab85b57f6a01b2fc45e3ba2e52ba985b8/black-26.3.1-cp311-cp311-win_arm64.whl", hash = "sha256:9a5e9f45e5d5e1c5b5c29b3bd4265dcc90e8b92cf4534520896ed77f791f4da5", size = 1219719, upload-time = "2026-03-12T03:40:12.597Z" }, + { url = "https://files.pythonhosted.org/packages/dc/f8/da5eae4fc75e78e6dceb60624e1b9662ab00d6b452996046dfa9b8a6025b/black-26.3.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:b5e6f89631eb88a7302d416594a32faeee9fb8fb848290da9d0a5f2903519fc1", size = 1895920, upload-time = "2026-03-12T03:40:13.921Z" }, + { url = "https://files.pythonhosted.org/packages/2c/9f/04e6f26534da2e1629b2b48255c264cabf5eedc5141d04516d9d68a24111/black-26.3.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:41cd2012d35b47d589cb8a16faf8a32ef7a336f56356babd9fcf70939ad1897f", size = 1718499, upload-time = "2026-03-12T03:40:15.239Z" }, + { url = "https://files.pythonhosted.org/packages/04/91/a5935b2a63e31b331060c4a9fdb5a6c725840858c599032a6f3aac94055f/black-26.3.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:0f76ff19ec5297dd8e66eb64deda23631e642c9393ab592826fd4bdc97a4bce7", size = 1794994, upload-time = "2026-03-12T03:40:17.124Z" }, + { url = "https://files.pythonhosted.org/packages/e7/0a/86e462cdd311a3c2a8ece708d22aba17d0b2a0d5348ca34b40cdcbea512e/black-26.3.1-cp312-cp312-win_amd64.whl", hash = "sha256:ddb113db38838eb9f043623ba274cfaf7d51d5b0c22ecb30afe58b1bb8322983", size = 1420867, upload-time = "2026-03-12T03:40:18.83Z" }, + { url = "https://files.pythonhosted.org/packages/5b/e5/22515a19cb7eaee3440325a6b0d95d2c0e88dd180cb011b12ae488e031d1/black-26.3.1-cp312-cp312-win_arm64.whl", hash = "sha256:dfdd51fc3e64ea4f35873d1b3fb25326773d55d2329ff8449139ebaad7357efb", size = 1230124, upload-time = "2026-03-12T03:40:20.425Z" }, + { url = 
"https://files.pythonhosted.org/packages/f5/77/5728052a3c0450c53d9bb3945c4c46b91baa62b2cafab6801411b6271e45/black-26.3.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:855822d90f884905362f602880ed8b5df1b7e3ee7d0db2502d4388a954cc8c54", size = 1895034, upload-time = "2026-03-12T03:40:21.813Z" }, + { url = "https://files.pythonhosted.org/packages/52/73/7cae55fdfdfbe9d19e9a8d25d145018965fe2079fa908101c3733b0c55a0/black-26.3.1-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:8a33d657f3276328ce00e4d37fe70361e1ec7614da5d7b6e78de5426cb56332f", size = 1718503, upload-time = "2026-03-12T03:40:23.666Z" }, + { url = "https://files.pythonhosted.org/packages/e1/87/af89ad449e8254fdbc74654e6467e3c9381b61472cc532ee350d28cfdafb/black-26.3.1-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:f1cd08e99d2f9317292a311dfe578fd2a24b15dbce97792f9c4d752275c1fa56", size = 1793557, upload-time = "2026-03-12T03:40:25.497Z" }, + { url = "https://files.pythonhosted.org/packages/43/10/d6c06a791d8124b843bf325ab4ac7d2f5b98731dff84d6064eafd687ded1/black-26.3.1-cp313-cp313-win_amd64.whl", hash = "sha256:c7e72339f841b5a237ff14f7d3880ddd0fc7f98a1199e8c4327f9a4f478c1839", size = 1422766, upload-time = "2026-03-12T03:40:27.14Z" }, + { url = "https://files.pythonhosted.org/packages/59/4f/40a582c015f2d841ac24fed6390bd68f0fc896069ff3a886317959c9daf8/black-26.3.1-cp313-cp313-win_arm64.whl", hash = "sha256:afc622538b430aa4c8c853f7f63bc582b3b8030fd8c80b70fb5fa5b834e575c2", size = 1232140, upload-time = "2026-03-12T03:40:28.882Z" }, + { url = "https://files.pythonhosted.org/packages/d5/da/e36e27c9cebc1311b7579210df6f1c86e50f2d7143ae4fcf8a5017dc8809/black-26.3.1-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:2d6bfaf7fd0993b420bed691f20f9492d53ce9a2bcccea4b797d34e947318a78", size = 1889234, upload-time = "2026-03-12T03:40:30.964Z" }, + { url = 
"https://files.pythonhosted.org/packages/0e/7b/9871acf393f64a5fa33668c19350ca87177b181f44bb3d0c33b2d534f22c/black-26.3.1-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:f89f2ab047c76a9c03f78d0d66ca519e389519902fa27e7a91117ef7611c0568", size = 1720522, upload-time = "2026-03-12T03:40:32.346Z" }, + { url = "https://files.pythonhosted.org/packages/03/87/e766c7f2e90c07fb7586cc787c9ae6462b1eedab390191f2b7fc7f6170a9/black-26.3.1-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:b07fc0dab849d24a80a29cfab8d8a19187d1c4685d8a5e6385a5ce323c1f015f", size = 1787824, upload-time = "2026-03-12T03:40:33.636Z" }, + { url = "https://files.pythonhosted.org/packages/ac/94/2424338fb2d1875e9e83eed4c8e9c67f6905ec25afd826a911aea2b02535/black-26.3.1-cp314-cp314-win_amd64.whl", hash = "sha256:0126ae5b7c09957da2bdbd91a9ba1207453feada9e9fe51992848658c6c8e01c", size = 1445855, upload-time = "2026-03-12T03:40:35.442Z" }, + { url = "https://files.pythonhosted.org/packages/86/43/0c3338bd928afb8ee7471f1a4eec3bdbe2245ccb4a646092a222e8669840/black-26.3.1-cp314-cp314-win_arm64.whl", hash = "sha256:92c0ec1f2cc149551a2b7b47efc32c866406b6891b0ee4625e95967c8f4acfb1", size = 1258109, upload-time = "2026-03-12T03:40:36.832Z" }, + { url = "https://files.pythonhosted.org/packages/8e/0d/52d98722666d6fc6c3dd4c76df339501d6efd40e0ff95e6186a7b7f0befd/black-26.3.1-py3-none-any.whl", hash = "sha256:2bd5aa94fc267d38bb21a70d7410a89f1a1d318841855f698746f8e7f51acd1b", size = 207542, upload-time = "2026-03-12T03:36:01.668Z" }, ] [[package]] name = "click" -version = "8.3.1" +version = "8.3.3" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "colorama", marker = "sys_platform == 'win32'" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/3d/fa/656b739db8587d7b5dfa22e22ed02566950fbfbcdc20311993483657a5c0/click-8.3.1.tar.gz", hash = "sha256:12ff4785d337a1bb490bb7e9c2b1ee5da3112e94a8622f26a6c77f5d2fc6842a", size = 295065, upload-time = 
"2025-11-15T20:45:42.706Z" } +sdist = { url = "https://files.pythonhosted.org/packages/bb/63/f9e1ea081ce35720d8b92acde70daaedace594dc93b693c869e0d5910718/click-8.3.3.tar.gz", hash = "sha256:398329ad4837b2ff7cbe1dd166a4c0f8900c3ca3a218de04466f38f6497f18a2", size = 328061, upload-time = "2026-04-22T15:11:27.506Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/98/78/01c019cdb5d6498122777c1a43056ebb3ebfeef2076d9d026bfe15583b2b/click-8.3.1-py3-none-any.whl", hash = "sha256:981153a64e25f12d547d3426c367a4857371575ee7ad18df2a6183ab0545b2a6", size = 108274, upload-time = "2025-11-15T20:45:41.139Z" }, + { url = "https://files.pythonhosted.org/packages/ae/44/c1221527f6a71a01ec6fbad7fa78f1d50dfa02217385cf0fa3eec7087d59/click-8.3.3-py3-none-any.whl", hash = "sha256:a2bf429bb3033c89fa4936ffb35d5cb471e3719e1f3c8a7c3fff0b8314305613", size = 110502, upload-time = "2026-04-22T15:11:25.044Z" }, ] [[package]] @@ -69,101 +87,115 @@ wheels = [ [[package]] name = "coverage" -version = "7.13.3" +version = "7.13.5" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/11/43/3e4ac666cc35f231fa70c94e9f38459299de1a152813f9d2f60fc5f3ecaf/coverage-7.13.3.tar.gz", hash = "sha256:f7f6182d3dfb8802c1747eacbfe611b669455b69b7c037484bb1efbbb56711ac", size = 826832, upload-time = "2026-02-03T14:02:30.944Z" } +sdist = { url = "https://files.pythonhosted.org/packages/9d/e0/70553e3000e345daff267cec284ce4cbf3fc141b6da229ac52775b5428f1/coverage-7.13.5.tar.gz", hash = "sha256:c81f6515c4c40141f83f502b07bbfa5c240ba25bbe73da7b33f1e5b6120ff179", size = 915967, upload-time = "2026-03-17T10:33:18.341Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/ab/07/1c8099563a8a6c389a31c2d0aa1497cee86d6248bb4b9ba5e779215db9f9/coverage-7.13.3-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:0b4f345f7265cdbdb5ec2521ffff15fa49de6d6c39abf89fc7ad68aa9e3a55f0", size = 219143, upload-time = "2026-02-03T13:59:40.459Z" }, - { url = 
"https://files.pythonhosted.org/packages/69/39/a892d44af7aa092cab70e0cc5cdbba18eeccfe1d6930695dab1742eef9e9/coverage-7.13.3-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:96c3be8bae9d0333e403cc1a8eb078a7f928b5650bae94a18fb4820cc993fb9b", size = 219663, upload-time = "2026-02-03T13:59:41.951Z" }, - { url = "https://files.pythonhosted.org/packages/9a/25/9669dcf4c2bb4c3861469e6db20e52e8c11908cf53c14ec9b12e9fd4d602/coverage-7.13.3-cp310-cp310-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:d6f4a21328ea49d38565b55599e1c02834e76583a6953e5586d65cb1efebd8f8", size = 246424, upload-time = "2026-02-03T13:59:43.418Z" }, - { url = "https://files.pythonhosted.org/packages/f3/68/d9766c4e298aca62ea5d9543e1dd1e4e1439d7284815244d8b7db1840bfb/coverage-7.13.3-cp310-cp310-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:fc970575799a9d17d5c3fafd83a0f6ccf5d5117cdc9ad6fbd791e9ead82418b0", size = 248228, upload-time = "2026-02-03T13:59:44.816Z" }, - { url = "https://files.pythonhosted.org/packages/f0/e2/eea6cb4a4bd443741adf008d4cccec83a1f75401df59b6559aca2bdd9710/coverage-7.13.3-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:87ff33b652b3556b05e204ae20793d1f872161b0fa5ec8a9ac76f8430e152ed6", size = 250103, upload-time = "2026-02-03T13:59:46.271Z" }, - { url = "https://files.pythonhosted.org/packages/db/77/664280ecd666c2191610842177e2fab9e5dbdeef97178e2078fed46a3d2c/coverage-7.13.3-cp310-cp310-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:7df8759ee57b9f3f7b66799b7660c282f4375bef620ade1686d6a7b03699e75f", size = 247107, upload-time = "2026-02-03T13:59:48.53Z" }, - { url = "https://files.pythonhosted.org/packages/2b/df/2a672eab99e0d0eba52d8a63e47dc92245eee26954d1b2d3c8f7d372151f/coverage-7.13.3-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:f45c9bcb16bee25a798ccba8a2f6a1251b19de6a0d617bb365d7d2f386c4e20e", size = 248143, upload-time = 
"2026-02-03T13:59:50.027Z" }, - { url = "https://files.pythonhosted.org/packages/a5/dc/a104e7a87c13e57a358b8b9199a8955676e1703bb372d79722b54978ae45/coverage-7.13.3-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:318b2e4753cbf611061e01b6cc81477e1cdfeb69c36c4a14e6595e674caadb56", size = 246148, upload-time = "2026-02-03T13:59:52.025Z" }, - { url = "https://files.pythonhosted.org/packages/2b/89/e113d3a58dc20b03b7e59aed1e53ebc9ca6167f961876443e002b10e3ae9/coverage-7.13.3-cp310-cp310-musllinux_1_2_riscv64.whl", hash = "sha256:24db3959de8ee394eeeca89ccb8ba25305c2da9a668dd44173394cbd5aa0777f", size = 246414, upload-time = "2026-02-03T13:59:53.859Z" }, - { url = "https://files.pythonhosted.org/packages/3f/60/a3fd0a6e8d89b488396019a2268b6a1f25ab56d6d18f3be50f35d77b47dc/coverage-7.13.3-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:be14d0622125edef21b3a4d8cd2d138c4872bf6e38adc90fd92385e3312f406a", size = 247023, upload-time = "2026-02-03T13:59:55.454Z" }, - { url = "https://files.pythonhosted.org/packages/19/fa/de4840bb939dbb22ba0648a6d8069fa91c9cf3b3fca8b0d1df461e885b3d/coverage-7.13.3-cp310-cp310-win32.whl", hash = "sha256:53be4aab8ddef18beb6188f3a3fdbf4d1af2277d098d4e618be3a8e6c88e74be", size = 221751, upload-time = "2026-02-03T13:59:57.383Z" }, - { url = "https://files.pythonhosted.org/packages/de/87/233ff8b7ef62fb63f58c78623b50bef69681111e0c4d43504f422d88cda4/coverage-7.13.3-cp310-cp310-win_amd64.whl", hash = "sha256:bfeee64ad8b4aae3233abb77eb6b52b51b05fa89da9645518671b9939a78732b", size = 222686, upload-time = "2026-02-03T13:59:58.825Z" }, - { url = "https://files.pythonhosted.org/packages/ec/09/1ac74e37cf45f17eb41e11a21854f7f92a4c2d6c6098ef4a1becb0c6d8d3/coverage-7.13.3-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:5907605ee20e126eeee2abe14aae137043c2c8af2fa9b38d2ab3b7a6b8137f73", size = 219276, upload-time = "2026-02-03T14:00:00.296Z" }, - { url = 
"https://files.pythonhosted.org/packages/2e/cb/71908b08b21beb2c437d0d5870c4ec129c570ca1b386a8427fcdb11cf89c/coverage-7.13.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:a88705500988c8acad8b8fd86c2a933d3aa96bec1ddc4bc5cb256360db7bbd00", size = 219776, upload-time = "2026-02-03T14:00:02.414Z" }, - { url = "https://files.pythonhosted.org/packages/09/85/c4f3dd69232887666a2c0394d4be21c60ea934d404db068e6c96aa59cd87/coverage-7.13.3-cp311-cp311-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:7bbb5aa9016c4c29e3432e087aa29ebee3f8fda089cfbfb4e6d64bd292dcd1c2", size = 250196, upload-time = "2026-02-03T14:00:04.197Z" }, - { url = "https://files.pythonhosted.org/packages/9c/cc/560ad6f12010344d0778e268df5ba9aa990aacccc310d478bf82bf3d302c/coverage-7.13.3-cp311-cp311-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:0c2be202a83dde768937a61cdc5d06bf9fb204048ca199d93479488e6247656c", size = 252111, upload-time = "2026-02-03T14:00:05.639Z" }, - { url = "https://files.pythonhosted.org/packages/f0/66/3193985fb2c58e91f94cfbe9e21a6fdf941e9301fe2be9e92c072e9c8f8c/coverage-7.13.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0f45e32ef383ce56e0ca099b2e02fcdf7950be4b1b56afaab27b4ad790befe5b", size = 254217, upload-time = "2026-02-03T14:00:07.738Z" }, - { url = "https://files.pythonhosted.org/packages/c5/78/f0f91556bf1faa416792e537c523c5ef9db9b1d32a50572c102b3d7c45b3/coverage-7.13.3-cp311-cp311-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:6ed2e787249b922a93cd95c671cc9f4c9797a106e81b455c83a9ddb9d34590c0", size = 250318, upload-time = "2026-02-03T14:00:09.224Z" }, - { url = "https://files.pythonhosted.org/packages/6f/aa/fc654e45e837d137b2c1f3a2cc09b4aea1e8b015acd2f774fa0f3d2ddeba/coverage-7.13.3-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:05dd25b21afffe545e808265897c35f32d3e4437663923e0d256d9ab5031fb14", size = 251909, upload-time = 
"2026-02-03T14:00:10.712Z" }, - { url = "https://files.pythonhosted.org/packages/73/4d/ab53063992add8a9ca0463c9d92cce5994a29e17affd1c2daa091b922a93/coverage-7.13.3-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:46d29926349b5c4f1ea4fca95e8c892835515f3600995a383fa9a923b5739ea4", size = 249971, upload-time = "2026-02-03T14:00:12.402Z" }, - { url = "https://files.pythonhosted.org/packages/29/25/83694b81e46fcff9899694a1b6f57573429cdd82b57932f09a698f03eea5/coverage-7.13.3-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:fae6a21537519c2af00245e834e5bf2884699cc7c1055738fd0f9dc37a3644ad", size = 249692, upload-time = "2026-02-03T14:00:13.868Z" }, - { url = "https://files.pythonhosted.org/packages/d4/ef/d68fc304301f4cb4bf6aefa0045310520789ca38dabdfba9dbecd3f37919/coverage-7.13.3-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:c672d4e2f0575a4ca2bf2aa0c5ced5188220ab806c1bb6d7179f70a11a017222", size = 250597, upload-time = "2026-02-03T14:00:15.461Z" }, - { url = "https://files.pythonhosted.org/packages/8d/85/240ad396f914df361d0f71e912ddcedb48130c71b88dc4193fe3c0306f00/coverage-7.13.3-cp311-cp311-win32.whl", hash = "sha256:fcda51c918c7a13ad93b5f89a58d56e3a072c9e0ba5c231b0ed81404bf2648fb", size = 221773, upload-time = "2026-02-03T14:00:17.462Z" }, - { url = "https://files.pythonhosted.org/packages/2f/71/165b3a6d3d052704a9ab52d11ea64ef3426745de517dda44d872716213a7/coverage-7.13.3-cp311-cp311-win_amd64.whl", hash = "sha256:d1a049b5c51b3b679928dd35e47c4a2235e0b6128b479a7596d0ef5b42fa6301", size = 222711, upload-time = "2026-02-03T14:00:19.449Z" }, - { url = "https://files.pythonhosted.org/packages/51/d0/0ddc9c5934cdd52639c5df1f1eb0fdab51bb52348f3a8d1c7db9c600d93a/coverage-7.13.3-cp311-cp311-win_arm64.whl", hash = "sha256:79f2670c7e772f4917895c3d89aad59e01f3dbe68a4ed2d0373b431fad1dcfba", size = 221377, upload-time = "2026-02-03T14:00:20.968Z" }, - { url = 
"https://files.pythonhosted.org/packages/94/44/330f8e83b143f6668778ed61d17ece9dc48459e9e74669177de02f45fec5/coverage-7.13.3-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:ed48b4170caa2c4420e0cd27dc977caaffc7eecc317355751df8373dddcef595", size = 219441, upload-time = "2026-02-03T14:00:22.585Z" }, - { url = "https://files.pythonhosted.org/packages/08/e7/29db05693562c2e65bdf6910c0af2fd6f9325b8f43caf7a258413f369e30/coverage-7.13.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:8f2adf4bcffbbec41f366f2e6dffb9d24e8172d16e91da5799c9b7ed6b5716e6", size = 219801, upload-time = "2026-02-03T14:00:24.186Z" }, - { url = "https://files.pythonhosted.org/packages/90/ae/7f8a78249b02b0818db46220795f8ac8312ea4abd1d37d79ea81db5cae81/coverage-7.13.3-cp312-cp312-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:01119735c690786b6966a1e9f098da4cd7ca9174c4cfe076d04e653105488395", size = 251306, upload-time = "2026-02-03T14:00:25.798Z" }, - { url = "https://files.pythonhosted.org/packages/62/71/a18a53d1808e09b2e9ebd6b47dad5e92daf4c38b0686b4c4d1b2f3e42b7f/coverage-7.13.3-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:8bb09e83c603f152d855f666d70a71765ca8e67332e5829e62cb9466c176af23", size = 254051, upload-time = "2026-02-03T14:00:27.474Z" }, - { url = "https://files.pythonhosted.org/packages/4a/0a/eb30f6455d04c5a3396d0696cad2df0269ae7444bb322f86ffe3376f7bf9/coverage-7.13.3-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b607a40cba795cfac6d130220d25962931ce101f2f478a29822b19755377fb34", size = 255160, upload-time = "2026-02-03T14:00:29.024Z" }, - { url = "https://files.pythonhosted.org/packages/7b/7e/a45baac86274ce3ed842dbb84f14560c673ad30535f397d89164ec56c5df/coverage-7.13.3-cp312-cp312-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:44f14a62f5da2e9aedf9080e01d2cda61df39197d48e323538ec037336d68da8", size = 251709, upload-time = 
"2026-02-03T14:00:30.641Z" }, - { url = "https://files.pythonhosted.org/packages/c0/df/dd0dc12f30da11349993f3e218901fdf82f45ee44773596050c8f5a1fb25/coverage-7.13.3-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:debf29e0b157769843dff0981cc76f79e0ed04e36bb773c6cac5f6029054bd8a", size = 253083, upload-time = "2026-02-03T14:00:32.14Z" }, - { url = "https://files.pythonhosted.org/packages/ab/32/fc764c8389a8ce95cb90eb97af4c32f392ab0ac23ec57cadeefb887188d3/coverage-7.13.3-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:824bb95cd71604031ae9a48edb91fd6effde669522f960375668ed21b36e3ec4", size = 251227, upload-time = "2026-02-03T14:00:34.721Z" }, - { url = "https://files.pythonhosted.org/packages/dd/ca/d025e9da8f06f24c34d2da9873957cfc5f7e0d67802c3e34d0caa8452130/coverage-7.13.3-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:8f1010029a5b52dc427c8e2a8dbddb2303ddd180b806687d1acd1bb1d06649e7", size = 250794, upload-time = "2026-02-03T14:00:36.278Z" }, - { url = "https://files.pythonhosted.org/packages/45/c7/76bf35d5d488ec8f68682eb8e7671acc50a6d2d1c1182de1d2b6d4ffad3b/coverage-7.13.3-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:cd5dee4fd7659d8306ffa79eeaaafd91fa30a302dac3af723b9b469e549247e0", size = 252671, upload-time = "2026-02-03T14:00:38.368Z" }, - { url = "https://files.pythonhosted.org/packages/bf/10/1921f1a03a7c209e1cb374f81a6b9b68b03cdb3ecc3433c189bc90e2a3d5/coverage-7.13.3-cp312-cp312-win32.whl", hash = "sha256:f7f153d0184d45f3873b3ad3ad22694fd73aadcb8cdbc4337ab4b41ea6b4dff1", size = 221986, upload-time = "2026-02-03T14:00:40.442Z" }, - { url = "https://files.pythonhosted.org/packages/3c/7c/f5d93297f8e125a80c15545edc754d93e0ed8ba255b65e609b185296af01/coverage-7.13.3-cp312-cp312-win_amd64.whl", hash = "sha256:03a6e5e1e50819d6d7436f5bc40c92ded7e484e400716886ac921e35c133149d", size = 222793, upload-time = "2026-02-03T14:00:42.106Z" }, - { url = 
"https://files.pythonhosted.org/packages/43/59/c86b84170015b4555ebabca8649bdf9f4a1f737a73168088385ed0f947c4/coverage-7.13.3-cp312-cp312-win_arm64.whl", hash = "sha256:51c4c42c0e7d09a822b08b6cf79b3c4db8333fffde7450da946719ba0d45730f", size = 221410, upload-time = "2026-02-03T14:00:43.726Z" }, - { url = "https://files.pythonhosted.org/packages/81/f3/4c333da7b373e8c8bfb62517e8174a01dcc373d7a9083698e3b39d50d59c/coverage-7.13.3-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:853c3d3c79ff0db65797aad79dee6be020efd218ac4510f15a205f1e8d13ce25", size = 219468, upload-time = "2026-02-03T14:00:45.829Z" }, - { url = "https://files.pythonhosted.org/packages/d6/31/0714337b7d23630c8de2f4d56acf43c65f8728a45ed529b34410683f7217/coverage-7.13.3-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:f75695e157c83d374f88dcc646a60cb94173304a9258b2e74ba5a66b7614a51a", size = 219839, upload-time = "2026-02-03T14:00:47.407Z" }, - { url = "https://files.pythonhosted.org/packages/12/99/bd6f2a2738144c98945666f90cae446ed870cecf0421c767475fcf42cdbe/coverage-7.13.3-cp313-cp313-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:2d098709621d0819039f3f1e471ee554f55a0b2ac0d816883c765b14129b5627", size = 250828, upload-time = "2026-02-03T14:00:49.029Z" }, - { url = "https://files.pythonhosted.org/packages/6f/99/97b600225fbf631e6f5bfd3ad5bcaf87fbb9e34ff87492e5a572ff01bbe2/coverage-7.13.3-cp313-cp313-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:16d23d6579cf80a474ad160ca14d8b319abaa6db62759d6eef53b2fc979b58c8", size = 253432, upload-time = "2026-02-03T14:00:50.655Z" }, - { url = "https://files.pythonhosted.org/packages/5f/5c/abe2b3490bda26bd4f5e3e799be0bdf00bd81edebedc2c9da8d3ef288fa8/coverage-7.13.3-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:00d34b29a59d2076e6f318b30a00a69bf63687e30cd882984ed444e753990cc1", size = 254672, upload-time = "2026-02-03T14:00:52.757Z" }, - { url = 
"https://files.pythonhosted.org/packages/31/ba/5d1957c76b40daff53971fe0adb84d9c2162b614280031d1d0653dd010c1/coverage-7.13.3-cp313-cp313-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:ab6d72bffac9deb6e6cb0f61042e748de3f9f8e98afb0375a8e64b0b6e11746b", size = 251050, upload-time = "2026-02-03T14:00:54.332Z" }, - { url = "https://files.pythonhosted.org/packages/69/dc/dffdf3bfe9d32090f047d3c3085378558cb4eb6778cda7de414ad74581ed/coverage-7.13.3-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:e129328ad1258e49cae0123a3b5fcb93d6c2fa90d540f0b4c7cdcdc019aaa3dc", size = 252801, upload-time = "2026-02-03T14:00:56.121Z" }, - { url = "https://files.pythonhosted.org/packages/87/51/cdf6198b0f2746e04511a30dc9185d7b8cdd895276c07bdb538e37f1cd50/coverage-7.13.3-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:2213a8d88ed35459bda71597599d4eec7c2ebad201c88f0bfc2c26fd9b0dd2ea", size = 250763, upload-time = "2026-02-03T14:00:58.719Z" }, - { url = "https://files.pythonhosted.org/packages/d7/1a/596b7d62218c1d69f2475b69cc6b211e33c83c902f38ee6ae9766dd422da/coverage-7.13.3-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:00dd3f02de6d5f5c9c3d95e3e036c3c2e2a669f8bf2d3ceb92505c4ce7838f67", size = 250587, upload-time = "2026-02-03T14:01:01.197Z" }, - { url = "https://files.pythonhosted.org/packages/f7/46/52330d5841ff660f22c130b75f5e1dd3e352c8e7baef5e5fef6b14e3e991/coverage-7.13.3-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:f9bada7bc660d20b23d7d312ebe29e927b655cf414dadcdb6335a2075695bd86", size = 252358, upload-time = "2026-02-03T14:01:02.824Z" }, - { url = "https://files.pythonhosted.org/packages/36/8a/e69a5be51923097ba7d5cff9724466e74fe486e9232020ba97c809a8b42b/coverage-7.13.3-cp313-cp313-win32.whl", hash = "sha256:75b3c0300f3fa15809bd62d9ca8b170eb21fcf0100eb4b4154d6dc8b3a5bbd43", size = 222007, upload-time = "2026-02-03T14:01:04.876Z" }, - { url = 
"https://files.pythonhosted.org/packages/0a/09/a5a069bcee0d613bdd48ee7637fa73bc09e7ed4342b26890f2df97cc9682/coverage-7.13.3-cp313-cp313-win_amd64.whl", hash = "sha256:a2f7589c6132c44c53f6e705e1a6677e2b7821378c22f7703b2cf5388d0d4587", size = 222812, upload-time = "2026-02-03T14:01:07.296Z" }, - { url = "https://files.pythonhosted.org/packages/3d/4f/d62ad7dfe32f9e3d4a10c178bb6f98b10b083d6e0530ca202b399371f6c1/coverage-7.13.3-cp313-cp313-win_arm64.whl", hash = "sha256:123ceaf2b9d8c614f01110f908a341e05b1b305d6b2ada98763b9a5a59756051", size = 221433, upload-time = "2026-02-03T14:01:09.156Z" }, - { url = "https://files.pythonhosted.org/packages/04/b2/4876c46d723d80b9c5b695f1a11bf5f7c3dabf540ec00d6edc076ff025e6/coverage-7.13.3-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:cc7fd0f726795420f3678ac82ff882c7fc33770bd0074463b5aef7293285ace9", size = 220162, upload-time = "2026-02-03T14:01:11.409Z" }, - { url = "https://files.pythonhosted.org/packages/fc/04/9942b64a0e0bdda2c109f56bda42b2a59d9d3df4c94b85a323c1cae9fc77/coverage-7.13.3-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:d358dc408edc28730aed5477a69338e444e62fba0b7e9e4a131c505fadad691e", size = 220510, upload-time = "2026-02-03T14:01:13.038Z" }, - { url = "https://files.pythonhosted.org/packages/5a/82/5cfe1e81eae525b74669f9795f37eb3edd4679b873d79d1e6c1c14ee6c1c/coverage-7.13.3-cp313-cp313t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:5d67b9ed6f7b5527b209b24b3df9f2e5bf0198c1bbf99c6971b0e2dcb7e2a107", size = 261801, upload-time = "2026-02-03T14:01:14.674Z" }, - { url = "https://files.pythonhosted.org/packages/0b/ec/a553d7f742fd2cd12e36a16a7b4b3582d5934b496ef2b5ea8abeb10903d4/coverage-7.13.3-cp313-cp313t-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:59224bfb2e9b37c1335ae35d00daa3a5b4e0b1a20f530be208fff1ecfa436f43", size = 263882, upload-time = "2026-02-03T14:01:16.343Z" }, - { url = 
"https://files.pythonhosted.org/packages/e1/58/8f54a2a93e3d675635bc406de1c9ac8d551312142ff52c9d71b5e533ad45/coverage-7.13.3-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ae9306b5299e31e31e0d3b908c66bcb6e7e3ddca143dea0266e9ce6c667346d3", size = 266306, upload-time = "2026-02-03T14:01:18.02Z" }, - { url = "https://files.pythonhosted.org/packages/1a/be/e593399fd6ea1f00aee79ebd7cc401021f218d34e96682a92e1bae092ff6/coverage-7.13.3-cp313-cp313t-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:343aaeb5f8bb7bcd38620fd7bc56e6ee8207847d8c6103a1e7b72322d381ba4a", size = 261051, upload-time = "2026-02-03T14:01:19.757Z" }, - { url = "https://files.pythonhosted.org/packages/5c/e5/e9e0f6138b21bcdebccac36fbfde9cf15eb1bbcea9f5b1f35cd1f465fb91/coverage-7.13.3-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:b2182129f4c101272ff5f2f18038d7b698db1bf8e7aa9e615cb48440899ad32e", size = 263868, upload-time = "2026-02-03T14:01:21.487Z" }, - { url = "https://files.pythonhosted.org/packages/9a/bf/de72cfebb69756f2d4a2dde35efcc33c47d85cd3ebdf844b3914aac2ef28/coverage-7.13.3-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:94d2ac94bd0cc57c5626f52f8c2fffed1444b5ae8c9fc68320306cc2b255e155", size = 261498, upload-time = "2026-02-03T14:01:23.097Z" }, - { url = "https://files.pythonhosted.org/packages/f2/91/4a2d313a70fc2e98ca53afd1c8ce67a89b1944cd996589a5b1fe7fbb3e5c/coverage-7.13.3-cp313-cp313t-musllinux_1_2_riscv64.whl", hash = "sha256:65436cde5ecabe26fb2f0bf598962f0a054d3f23ad529361326ac002c61a2a1e", size = 260394, upload-time = "2026-02-03T14:01:24.949Z" }, - { url = "https://files.pythonhosted.org/packages/40/83/25113af7cf6941e779eb7ed8de2a677865b859a07ccee9146d4cc06a03e3/coverage-7.13.3-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:db83b77f97129813dbd463a67e5335adc6a6a91db652cc085d60c2d512746f96", size = 262579, upload-time = "2026-02-03T14:01:26.703Z" }, - { url = 
"https://files.pythonhosted.org/packages/1e/19/a5f2b96262977e82fb9aabbe19b4d83561f5d063f18dde3e72f34ffc3b2f/coverage-7.13.3-cp313-cp313t-win32.whl", hash = "sha256:dfb428e41377e6b9ba1b0a32df6db5409cb089a0ed1d0a672dc4953ec110d84f", size = 222679, upload-time = "2026-02-03T14:01:28.553Z" }, - { url = "https://files.pythonhosted.org/packages/81/82/ef1747b88c87a5c7d7edc3704799ebd650189a9158e680a063308b6125ef/coverage-7.13.3-cp313-cp313t-win_amd64.whl", hash = "sha256:5badd7e596e6b0c89aa8ec6d37f4473e4357f982ce57f9a2942b0221cd9cf60c", size = 223740, upload-time = "2026-02-03T14:01:30.776Z" }, - { url = "https://files.pythonhosted.org/packages/1c/4c/a67c7bb5b560241c22736a9cb2f14c5034149ffae18630323fde787339e4/coverage-7.13.3-cp313-cp313t-win_arm64.whl", hash = "sha256:989aa158c0eb19d83c76c26f4ba00dbb272485c56e452010a3450bdbc9daafd9", size = 221996, upload-time = "2026-02-03T14:01:32.495Z" }, - { url = "https://files.pythonhosted.org/packages/5e/b3/677bb43427fed9298905106f39c6520ac75f746f81b8f01104526a8026e4/coverage-7.13.3-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:c6f6169bbdbdb85aab8ac0392d776948907267fcc91deeacf6f9d55f7a83ae3b", size = 219513, upload-time = "2026-02-03T14:01:34.29Z" }, - { url = "https://files.pythonhosted.org/packages/42/53/290046e3bbf8986cdb7366a42dab3440b9983711eaff044a51b11006c67b/coverage-7.13.3-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:2f5e731627a3d5ef11a2a35aa0c6f7c435867c7ccbc391268eb4f2ca5dbdcc10", size = 219850, upload-time = "2026-02-03T14:01:35.984Z" }, - { url = "https://files.pythonhosted.org/packages/ea/2b/ab41f10345ba2e49d5e299be8663be2b7db33e77ac1b85cd0af985ea6406/coverage-7.13.3-cp314-cp314-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:9db3a3285d91c0b70fab9f39f0a4aa37d375873677efe4e71e58d8321e8c5d39", size = 250886, upload-time = "2026-02-03T14:01:38.287Z" }, - { url = 
"https://files.pythonhosted.org/packages/72/2d/b3f6913ee5a1d5cdd04106f257e5fac5d048992ffc2d9995d07b0f17739f/coverage-7.13.3-cp314-cp314-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:06e49c5897cb12e3f7ecdc111d44e97c4f6d0557b81a7a0204ed70a8b038f86f", size = 253393, upload-time = "2026-02-03T14:01:40.118Z" }, - { url = "https://files.pythonhosted.org/packages/f0/f6/b1f48810ffc6accf49a35b9943636560768f0812330f7456aa87dc39aff5/coverage-7.13.3-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:fb25061a66802df9fc13a9ba1967d25faa4dae0418db469264fd9860a921dde4", size = 254740, upload-time = "2026-02-03T14:01:42.413Z" }, - { url = "https://files.pythonhosted.org/packages/57/d0/e59c54f9be0b61808f6bc4c8c4346bd79f02dd6bbc3f476ef26124661f20/coverage-7.13.3-cp314-cp314-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:99fee45adbb1caeb914da16f70e557fb7ff6ddc9e4b14de665bd41af631367ef", size = 250905, upload-time = "2026-02-03T14:01:44.163Z" }, - { url = "https://files.pythonhosted.org/packages/d5/f7/5291bcdf498bafbee3796bb32ef6966e9915aebd4d0954123c8eae921c32/coverage-7.13.3-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:318002f1fd819bdc1651c619268aa5bc853c35fa5cc6d1e8c96bd9cd6c828b75", size = 252753, upload-time = "2026-02-03T14:01:45.974Z" }, - { url = "https://files.pythonhosted.org/packages/a0/a9/1dcafa918c281554dae6e10ece88c1add82db685be123e1b05c2056ff3fb/coverage-7.13.3-cp314-cp314-musllinux_1_2_i686.whl", hash = "sha256:71295f2d1d170b9977dc386d46a7a1b7cbb30e5405492529b4c930113a33f895", size = 250716, upload-time = "2026-02-03T14:01:48.844Z" }, - { url = "https://files.pythonhosted.org/packages/44/bb/4ea4eabcce8c4f6235df6e059fbc5db49107b24c4bdffc44aee81aeca5a8/coverage-7.13.3-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:5b1ad2e0dc672625c44bc4fe34514602a9fd8b10d52ddc414dc585f74453516c", size = 250530, upload-time = "2026-02-03T14:01:50.793Z" }, - { url = 
"https://files.pythonhosted.org/packages/6d/31/4a6c9e6a71367e6f923b27b528448c37f4e959b7e4029330523014691007/coverage-7.13.3-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:b2beb64c145593a50d90db5c7178f55daeae129123b0d265bdb3cbec83e5194a", size = 252186, upload-time = "2026-02-03T14:01:52.607Z" }, - { url = "https://files.pythonhosted.org/packages/27/92/e1451ef6390a4f655dc42da35d9971212f7abbbcad0bdb7af4407897eb76/coverage-7.13.3-cp314-cp314-win32.whl", hash = "sha256:3d1aed4f4e837a832df2f3b4f68a690eede0de4560a2dbc214ea0bc55aabcdb4", size = 222253, upload-time = "2026-02-03T14:01:55.071Z" }, - { url = "https://files.pythonhosted.org/packages/8a/98/78885a861a88de020c32a2693487c37d15a9873372953f0c3c159d575a43/coverage-7.13.3-cp314-cp314-win_amd64.whl", hash = "sha256:9f9efbbaf79f935d5fbe3ad814825cbce4f6cdb3054384cb49f0c0f496125fa0", size = 223069, upload-time = "2026-02-03T14:01:56.95Z" }, - { url = "https://files.pythonhosted.org/packages/eb/fb/3784753a48da58a5337972abf7ca58b1fb0f1bda21bc7b4fae992fd28e47/coverage-7.13.3-cp314-cp314-win_arm64.whl", hash = "sha256:31b6e889c53d4e6687ca63706148049494aace140cffece1c4dc6acadb70a7b3", size = 221633, upload-time = "2026-02-03T14:01:58.758Z" }, - { url = "https://files.pythonhosted.org/packages/40/f9/75b732d9674d32cdbffe801ed5f770786dd1c97eecedef2125b0d25102dc/coverage-7.13.3-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:c5e9787cec750793a19a28df7edd85ac4e49d3fb91721afcdc3b86f6c08d9aa8", size = 220243, upload-time = "2026-02-03T14:02:01.109Z" }, - { url = "https://files.pythonhosted.org/packages/cf/7e/2868ec95de5a65703e6f0c87407ea822d1feb3619600fbc3c1c4fa986090/coverage-7.13.3-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:e5b86db331c682fd0e4be7098e6acee5e8a293f824d41487c667a93705d415ca", size = 220515, upload-time = "2026-02-03T14:02:02.862Z" }, - { url = 
"https://files.pythonhosted.org/packages/7d/eb/9f0d349652fced20bcaea0f67fc5777bd097c92369f267975732f3dc5f45/coverage-7.13.3-cp314-cp314t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:edc7754932682d52cf6e7a71806e529ecd5ce660e630e8bd1d37109a2e5f63ba", size = 261874, upload-time = "2026-02-03T14:02:04.727Z" }, - { url = "https://files.pythonhosted.org/packages/ee/a5/6619bc4a6c7b139b16818149a3e74ab2e21599ff9a7b6811b6afde99f8ec/coverage-7.13.3-cp314-cp314t-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:d3a16d6398666510a6886f67f43d9537bfd0e13aca299688a19daa84f543122f", size = 264004, upload-time = "2026-02-03T14:02:06.634Z" }, - { url = "https://files.pythonhosted.org/packages/29/b7/90aa3fc645a50c6f07881fca4fd0ba21e3bfb6ce3a7078424ea3a35c74c9/coverage-7.13.3-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:303d38b19626c1981e1bb067a9928236d88eb0e4479b18a74812f05a82071508", size = 266408, upload-time = "2026-02-03T14:02:09.037Z" }, - { url = "https://files.pythonhosted.org/packages/62/55/08bb2a1e4dcbae384e638f0effef486ba5987b06700e481691891427d879/coverage-7.13.3-cp314-cp314t-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:284e06eadfe15ddfee2f4ee56631f164ef897a7d7d5a15bca5f0bb88889fc5ba", size = 260977, upload-time = "2026-02-03T14:02:11.755Z" }, - { url = "https://files.pythonhosted.org/packages/9b/76/8bd4ae055a42d8fb5dd2230e5cf36ff2e05f85f2427e91b11a27fea52ed7/coverage-7.13.3-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:d401f0864a1d3198422816878e4e84ca89ec1c1bf166ecc0ae01380a39b888cd", size = 263868, upload-time = "2026-02-03T14:02:13.565Z" }, - { url = "https://files.pythonhosted.org/packages/e3/f9/ba000560f11e9e32ec03df5aa8477242c2d95b379c99ac9a7b2e7fbacb1a/coverage-7.13.3-cp314-cp314t-musllinux_1_2_i686.whl", hash = "sha256:3f379b02c18a64de78c4ccdddf1c81c2c5ae1956c72dacb9133d7dd7809794ab", size = 261474, upload-time = 
"2026-02-03T14:02:16.069Z" }, - { url = "https://files.pythonhosted.org/packages/90/4b/4de4de8f9ca7af4733bfcf4baa440121b7dbb3856daf8428ce91481ff63b/coverage-7.13.3-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:7a482f2da9086971efb12daca1d6547007ede3674ea06e16d7663414445c683e", size = 260317, upload-time = "2026-02-03T14:02:17.996Z" }, - { url = "https://files.pythonhosted.org/packages/05/71/5cd8436e2c21410ff70be81f738c0dddea91bcc3189b1517d26e0102ccb3/coverage-7.13.3-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:562136b0d401992118d9b49fbee5454e16f95f85b120a4226a04d816e33fe024", size = 262635, upload-time = "2026-02-03T14:02:20.405Z" }, - { url = "https://files.pythonhosted.org/packages/e7/f8/2834bb45bdd70b55a33ec354b8b5f6062fc90e5bb787e14385903a979503/coverage-7.13.3-cp314-cp314t-win32.whl", hash = "sha256:ca46e5c3be3b195098dd88711890b8011a9fa4feca942292bb84714ce5eab5d3", size = 223035, upload-time = "2026-02-03T14:02:22.323Z" }, - { url = "https://files.pythonhosted.org/packages/26/75/f8290f0073c00d9ae14056d2b84ab92dff21d5370e464cb6cb06f52bf580/coverage-7.13.3-cp314-cp314t-win_amd64.whl", hash = "sha256:06d316dbb3d9fd44cca05b2dbcfbef22948493d63a1f28e828d43e6cc505fed8", size = 224142, upload-time = "2026-02-03T14:02:24.143Z" }, - { url = "https://files.pythonhosted.org/packages/03/01/43ac78dfea8946c4a9161bbc034b5549115cb2b56781a4b574927f0d141a/coverage-7.13.3-cp314-cp314t-win_arm64.whl", hash = "sha256:299d66e9218193f9dc6e4880629ed7c4cd23486005166247c283fb98531656c3", size = 222166, upload-time = "2026-02-03T14:02:26.005Z" }, - { url = "https://files.pythonhosted.org/packages/7d/fb/70af542d2d938c778c9373ce253aa4116dbe7c0a5672f78b2b2ae0e1b94b/coverage-7.13.3-py3-none-any.whl", hash = "sha256:90a8af9dba6429b2573199622d72e0ebf024d6276f16abce394ad4d181bb0910", size = 211237, upload-time = "2026-02-03T14:02:27.986Z" }, + { url = 
"https://files.pythonhosted.org/packages/69/33/e8c48488c29a73fd089f9d71f9653c1be7478f2ad6b5bc870db11a55d23d/coverage-7.13.5-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:e0723d2c96324561b9aa76fb982406e11d93cdb388a7a7da2b16e04719cf7ca5", size = 219255, upload-time = "2026-03-17T10:29:51.081Z" }, + { url = "https://files.pythonhosted.org/packages/da/bd/b0ebe9f677d7f4b74a3e115eec7ddd4bcf892074963a00d91e8b164a6386/coverage-7.13.5-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:52f444e86475992506b32d4e5ca55c24fc88d73bcbda0e9745095b28ef4dc0cf", size = 219772, upload-time = "2026-03-17T10:29:52.867Z" }, + { url = "https://files.pythonhosted.org/packages/48/cc/5cb9502f4e01972f54eedd48218bb203fe81e294be606a2bc93970208013/coverage-7.13.5-cp310-cp310-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:704de6328e3d612a8f6c07000a878ff38181ec3263d5a11da1db294fa6a9bdf8", size = 246532, upload-time = "2026-03-17T10:29:54.688Z" }, + { url = "https://files.pythonhosted.org/packages/7d/d8/3217636d86c7e7b12e126e4f30ef1581047da73140614523af7495ed5f2d/coverage-7.13.5-cp310-cp310-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:a1a6d79a14e1ec1832cabc833898636ad5f3754a678ef8bb4908515208bf84f4", size = 248333, upload-time = "2026-03-17T10:29:56.221Z" }, + { url = "https://files.pythonhosted.org/packages/2b/30/2002ac6729ba2d4357438e2ed3c447ad8562866c8c63fc16f6dfc33afe56/coverage-7.13.5-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:79060214983769c7ba3f0cee10b54c97609dca4d478fa1aa32b914480fd5738d", size = 250211, upload-time = "2026-03-17T10:29:57.938Z" }, + { url = "https://files.pythonhosted.org/packages/6c/85/552496626d6b9359eb0e2f86f920037c9cbfba09b24d914c6e1528155f7d/coverage-7.13.5-cp310-cp310-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:356e76b46783a98c2a2fe81ec79df4883a1e62895ea952968fb253c114e7f930", size = 252125, 
upload-time = "2026-03-17T10:29:59.388Z" }, + { url = "https://files.pythonhosted.org/packages/44/21/40256eabdcbccdb6acf6b381b3016a154399a75fe39d406f790ae84d1f3c/coverage-7.13.5-cp310-cp310-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:0cef0cdec915d11254a7f549c1170afecce708d30610c6abdded1f74e581666d", size = 247219, upload-time = "2026-03-17T10:30:01.199Z" }, + { url = "https://files.pythonhosted.org/packages/b1/e8/96e2a6c3f21a0ea77d7830b254a1542d0328acc8d7bdf6a284ba7e529f77/coverage-7.13.5-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:dc022073d063b25a402454e5712ef9e007113e3a676b96c5f29b2bda29352f40", size = 248248, upload-time = "2026-03-17T10:30:03.317Z" }, + { url = "https://files.pythonhosted.org/packages/da/ba/8477f549e554827da390ec659f3c38e4b6d95470f4daafc2d8ff94eaa9c2/coverage-7.13.5-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:9b74db26dfea4f4e50d48a4602207cd1e78be33182bc9cbf22da94f332f99878", size = 246254, upload-time = "2026-03-17T10:30:04.832Z" }, + { url = "https://files.pythonhosted.org/packages/55/59/bc22aef0e6aa179d5b1b001e8b3654785e9adf27ef24c93dc4228ebd5d68/coverage-7.13.5-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:ad146744ca4fd09b50c482650e3c1b1f4dfa1d4792e0a04a369c7f23336f0400", size = 250067, upload-time = "2026-03-17T10:30:06.535Z" }, + { url = "https://files.pythonhosted.org/packages/de/1b/c6a023a160806a5137dca53468fd97530d6acad24a22003b1578a9c2e429/coverage-7.13.5-cp310-cp310-musllinux_1_2_riscv64.whl", hash = "sha256:c555b48be1853fe3997c11c4bd521cdd9a9612352de01fa4508f16ec341e6fe0", size = 246521, upload-time = "2026-03-17T10:30:08.486Z" }, + { url = "https://files.pythonhosted.org/packages/2d/3f/3532c85a55aa2f899fa17c186f831cfa1aa434d88ff792a709636f64130e/coverage-7.13.5-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:7034b5c56a58ae5e85f23949d52c14aca2cfc6848a31764995b7de88f13a1ea0", size = 247126, upload-time = "2026-03-17T10:30:09.966Z" }, + { url = 
"https://files.pythonhosted.org/packages/aa/2e/b9d56af4a24ef45dfbcda88e06870cb7d57b2b0bfa3a888d79b4c8debd76/coverage-7.13.5-cp310-cp310-win32.whl", hash = "sha256:eb7fdf1ef130660e7415e0253a01a7d5a88c9c4d158bcf75cbbd922fd65a5b58", size = 221860, upload-time = "2026-03-17T10:30:11.393Z" }, + { url = "https://files.pythonhosted.org/packages/9f/cc/d938417e7a4d7f0433ad4edee8bb2acdc60dc7ac5af19e2a07a048ecbee3/coverage-7.13.5-cp310-cp310-win_amd64.whl", hash = "sha256:3e1bb5f6c78feeb1be3475789b14a0f0a5b47d505bfc7267126ccbd50289999e", size = 222788, upload-time = "2026-03-17T10:30:12.886Z" }, + { url = "https://files.pythonhosted.org/packages/4b/37/d24c8f8220ff07b839b2c043ea4903a33b0f455abe673ae3c03bbdb7f212/coverage-7.13.5-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:66a80c616f80181f4d643b0f9e709d97bcea413ecd9631e1dedc7401c8e6695d", size = 219381, upload-time = "2026-03-17T10:30:14.68Z" }, + { url = "https://files.pythonhosted.org/packages/35/8b/cd129b0ca4afe886a6ce9d183c44d8301acbd4ef248622e7c49a23145605/coverage-7.13.5-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:145ede53ccbafb297c1c9287f788d1bc3efd6c900da23bf6931b09eafc931587", size = 219880, upload-time = "2026-03-17T10:30:16.231Z" }, + { url = "https://files.pythonhosted.org/packages/55/2f/e0e5b237bffdb5d6c530ce87cc1d413a5b7d7dfd60fb067ad6d254c35c76/coverage-7.13.5-cp311-cp311-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:0672854dc733c342fa3e957e0605256d2bf5934feeac328da9e0b5449634a642", size = 250303, upload-time = "2026-03-17T10:30:17.748Z" }, + { url = "https://files.pythonhosted.org/packages/92/be/b1afb692be85b947f3401375851484496134c5554e67e822c35f28bf2fbc/coverage-7.13.5-cp311-cp311-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:ec10e2a42b41c923c2209b846126c6582db5e43a33157e9870ba9fb70dc7854b", size = 252218, upload-time = "2026-03-17T10:30:19.804Z" }, + { url = 
"https://files.pythonhosted.org/packages/da/69/2f47bb6fa1b8d1e3e5d0c4be8ccb4313c63d742476a619418f85740d597b/coverage-7.13.5-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:be3d4bbad9d4b037791794ddeedd7d64a56f5933a2c1373e18e9e568b9141686", size = 254326, upload-time = "2026-03-17T10:30:21.321Z" }, + { url = "https://files.pythonhosted.org/packages/d5/d0/79db81da58965bd29dabc8f4ad2a2af70611a57cba9d1ec006f072f30a54/coverage-7.13.5-cp311-cp311-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:4d2afbc5cc54d286bfb54541aa50b64cdb07a718227168c87b9e2fb8f25e1743", size = 256267, upload-time = "2026-03-17T10:30:23.094Z" }, + { url = "https://files.pythonhosted.org/packages/e5/32/d0d7cc8168f91ddab44c0ce4806b969df5f5fdfdbb568eaca2dbc2a04936/coverage-7.13.5-cp311-cp311-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:3ad050321264c49c2fa67bb599100456fc51d004b82534f379d16445da40fb75", size = 250430, upload-time = "2026-03-17T10:30:25.311Z" }, + { url = "https://files.pythonhosted.org/packages/4d/06/a055311d891ddbe231cd69fdd20ea4be6e3603ffebddf8704b8ca8e10a3c/coverage-7.13.5-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:7300c8a6d13335b29bb76d7651c66af6bd8658517c43499f110ddc6717bfc209", size = 252017, upload-time = "2026-03-17T10:30:27.284Z" }, + { url = "https://files.pythonhosted.org/packages/d6/f6/d0fd2d21e29a657b5f77a2fe7082e1568158340dceb941954f776dce1b7b/coverage-7.13.5-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:eb07647a5738b89baab047f14edd18ded523de60f3b30e75c2acc826f79c839a", size = 250080, upload-time = "2026-03-17T10:30:29.481Z" }, + { url = "https://files.pythonhosted.org/packages/4e/ab/0d7fb2efc2e9a5eb7ddcc6e722f834a69b454b7e6e5888c3a8567ecffb31/coverage-7.13.5-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:9adb6688e3b53adffefd4a52d72cbd8b02602bfb8f74dcd862337182fd4d1a4e", size = 253843, upload-time = "2026-03-17T10:30:31.301Z" }, + { 
url = "https://files.pythonhosted.org/packages/ba/6f/7467b917bbf5408610178f62a49c0ed4377bb16c1657f689cc61470da8ce/coverage-7.13.5-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:7c8d4bc913dd70b93488d6c496c77f3aff5ea99a07e36a18f865bca55adef8bd", size = 249802, upload-time = "2026-03-17T10:30:33.358Z" }, + { url = "https://files.pythonhosted.org/packages/75/2c/1172fb689df92135f5bfbbd69fc83017a76d24ea2e2f3a1154007e2fb9f8/coverage-7.13.5-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:0e3c426ffc4cd952f54ee9ffbdd10345709ecc78a3ecfd796a57236bfad0b9b8", size = 250707, upload-time = "2026-03-17T10:30:35.2Z" }, + { url = "https://files.pythonhosted.org/packages/67/21/9ac389377380a07884e3b48ba7a620fcd9dbfaf1d40565facdc6b36ec9ef/coverage-7.13.5-cp311-cp311-win32.whl", hash = "sha256:259b69bb83ad9894c4b25be2528139eecba9a82646ebdda2d9db1ba28424a6bf", size = 221880, upload-time = "2026-03-17T10:30:36.775Z" }, + { url = "https://files.pythonhosted.org/packages/af/7f/4cd8a92531253f9d7c1bbecd9fa1b472907fb54446ca768c59b531248dc5/coverage-7.13.5-cp311-cp311-win_amd64.whl", hash = "sha256:258354455f4e86e3e9d0d17571d522e13b4e1e19bf0f8596bcf9476d61e7d8a9", size = 222816, upload-time = "2026-03-17T10:30:38.891Z" }, + { url = "https://files.pythonhosted.org/packages/12/a6/1d3f6155fb0010ca68eba7fe48ca6c9da7385058b77a95848710ecf189b1/coverage-7.13.5-cp311-cp311-win_arm64.whl", hash = "sha256:bff95879c33ec8da99fc9b6fe345ddb5be6414b41d6d1ad1c8f188d26f36e028", size = 221483, upload-time = "2026-03-17T10:30:40.463Z" }, + { url = "https://files.pythonhosted.org/packages/a0/c3/a396306ba7db865bf96fc1fb3b7fd29bcbf3d829df642e77b13555163cd6/coverage-7.13.5-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:460cf0114c5016fa841214ff5564aa4864f11948da9440bc97e21ad1f4ba1e01", size = 219554, upload-time = "2026-03-17T10:30:42.208Z" }, + { url = 
"https://files.pythonhosted.org/packages/a6/16/a68a19e5384e93f811dccc51034b1fd0b865841c390e3c931dcc4699e035/coverage-7.13.5-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:0e223ce4b4ed47f065bfb123687686512e37629be25cc63728557ae7db261422", size = 219908, upload-time = "2026-03-17T10:30:43.906Z" }, + { url = "https://files.pythonhosted.org/packages/29/72/20b917c6793af3a5ceb7fb9c50033f3ec7865f2911a1416b34a7cfa0813b/coverage-7.13.5-cp312-cp312-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:6e3370441f4513c6252bf042b9c36d22491142385049243253c7e48398a15a9f", size = 251419, upload-time = "2026-03-17T10:30:45.545Z" }, + { url = "https://files.pythonhosted.org/packages/8c/49/cd14b789536ac6a4778c453c6a2338bc0a2fb60c5a5a41b4008328b9acc1/coverage-7.13.5-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:03ccc709a17a1de074fb1d11f217342fb0d2b1582ed544f554fc9fc3f07e95f5", size = 254159, upload-time = "2026-03-17T10:30:47.204Z" }, + { url = "https://files.pythonhosted.org/packages/9d/00/7b0edcfe64e2ed4c0340dac14a52ad0f4c9bd0b8b5e531af7d55b703db7c/coverage-7.13.5-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:3f4818d065964db3c1c66dc0fbdac5ac692ecbc875555e13374fdbe7eedb4376", size = 255270, upload-time = "2026-03-17T10:30:48.812Z" }, + { url = "https://files.pythonhosted.org/packages/93/89/7ffc4ba0f5d0a55c1e84ea7cee39c9fc06af7b170513d83fbf3bbefce280/coverage-7.13.5-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:012d5319e66e9d5a218834642d6c35d265515a62f01157a45bcc036ecf947256", size = 257538, upload-time = "2026-03-17T10:30:50.77Z" }, + { url = "https://files.pythonhosted.org/packages/81/bd/73ddf85f93f7e6fa83e77ccecb6162d9415c79007b4bc124008a4995e4a7/coverage-7.13.5-cp312-cp312-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = 
"sha256:8dd02af98971bdb956363e4827d34425cb3df19ee550ef92855b0acb9c7ce51c", size = 251821, upload-time = "2026-03-17T10:30:52.5Z" }, + { url = "https://files.pythonhosted.org/packages/a0/81/278aff4e8dec4926a0bcb9486320752811f543a3ce5b602cc7a29978d073/coverage-7.13.5-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:f08fd75c50a760c7eb068ae823777268daaf16a80b918fa58eea888f8e3919f5", size = 253191, upload-time = "2026-03-17T10:30:54.543Z" }, + { url = "https://files.pythonhosted.org/packages/70/ee/fe1621488e2e0a58d7e94c4800f0d96f79671553488d401a612bebae324b/coverage-7.13.5-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:843ea8643cf967d1ac7e8ecd4bb00c99135adf4816c0c0593fdcc47b597fcf09", size = 251337, upload-time = "2026-03-17T10:30:56.663Z" }, + { url = "https://files.pythonhosted.org/packages/37/a6/f79fb37aa104b562207cc23cb5711ab6793608e246cae1e93f26b2236ed9/coverage-7.13.5-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:9d44d7aa963820b1b971dbecd90bfe5fe8f81cff79787eb6cca15750bd2f79b9", size = 255404, upload-time = "2026-03-17T10:30:58.427Z" }, + { url = "https://files.pythonhosted.org/packages/75/f0/ed15262a58ec81ce457ceb717b7f78752a1713556b19081b76e90896e8d4/coverage-7.13.5-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:7132bed4bd7b836200c591410ae7d97bf7ae8be6fc87d160b2bd881df929e7bf", size = 250903, upload-time = "2026-03-17T10:31:00.093Z" }, + { url = "https://files.pythonhosted.org/packages/0f/e9/9129958f20e7e9d4d56d51d42ccf708d15cac355ff4ac6e736e97a9393d2/coverage-7.13.5-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:a698e363641b98843c517817db75373c83254781426e94ada3197cabbc2c919c", size = 252780, upload-time = "2026-03-17T10:31:01.916Z" }, + { url = "https://files.pythonhosted.org/packages/a4/d7/0ad9b15812d81272db94379fe4c6df8fd17781cc7671fdfa30c76ba5ff7b/coverage-7.13.5-cp312-cp312-win32.whl", hash = "sha256:bdba0a6b8812e8c7df002d908a9a2ea3c36e92611b5708633c50869e6d922fdf", size = 222093, upload-time = "2026-03-17T10:31:03.642Z" 
}, + { url = "https://files.pythonhosted.org/packages/29/3d/821a9a5799fac2556bcf0bd37a70d1d11fa9e49784b6d22e92e8b2f85f18/coverage-7.13.5-cp312-cp312-win_amd64.whl", hash = "sha256:d2c87e0c473a10bffe991502eac389220533024c8082ec1ce849f4218dded810", size = 222900, upload-time = "2026-03-17T10:31:05.651Z" }, + { url = "https://files.pythonhosted.org/packages/d4/fa/2238c2ad08e35cf4f020ea721f717e09ec3152aea75d191a7faf3ef009a8/coverage-7.13.5-cp312-cp312-win_arm64.whl", hash = "sha256:bf69236a9a81bdca3bff53796237aab096cdbf8d78a66ad61e992d9dac7eb2de", size = 221515, upload-time = "2026-03-17T10:31:07.293Z" }, + { url = "https://files.pythonhosted.org/packages/74/8c/74fedc9663dcf168b0a059d4ea756ecae4da77a489048f94b5f512a8d0b3/coverage-7.13.5-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:5ec4af212df513e399cf11610cc27063f1586419e814755ab362e50a85ea69c1", size = 219576, upload-time = "2026-03-17T10:31:09.045Z" }, + { url = "https://files.pythonhosted.org/packages/0c/c9/44fb661c55062f0818a6ffd2685c67aa30816200d5f2817543717d4b92eb/coverage-7.13.5-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:941617e518602e2d64942c88ec8499f7fbd49d3f6c4327d3a71d43a1973032f3", size = 219942, upload-time = "2026-03-17T10:31:10.708Z" }, + { url = "https://files.pythonhosted.org/packages/5f/13/93419671cee82b780bab7ea96b67c8ef448f5f295f36bf5031154ec9a790/coverage-7.13.5-cp313-cp313-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:da305e9937617ee95c2e39d8ff9f040e0487cbf1ac174f777ed5eddd7a7c1f26", size = 250935, upload-time = "2026-03-17T10:31:12.392Z" }, + { url = "https://files.pythonhosted.org/packages/ac/68/1666e3a4462f8202d836920114fa7a5ee9275d1fa45366d336c551a162dd/coverage-7.13.5-cp313-cp313-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:78e696e1cc714e57e8b25760b33a8b1026b7048d270140d25dafe1b0a1ee05a3", size = 253541, upload-time = "2026-03-17T10:31:14.247Z" }, + { url = 
"https://files.pythonhosted.org/packages/4e/5e/3ee3b835647be646dcf3c65a7c6c18f87c27326a858f72ab22c12730773d/coverage-7.13.5-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:02ca0eed225b2ff301c474aeeeae27d26e2537942aa0f87491d3e147e784a82b", size = 254780, upload-time = "2026-03-17T10:31:16.193Z" }, + { url = "https://files.pythonhosted.org/packages/44/b3/cb5bd1a04cfcc49ede6cd8409d80bee17661167686741e041abc7ee1b9a9/coverage-7.13.5-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:04690832cbea4e4663d9149e05dba142546ca05cb1848816760e7f58285c970a", size = 256912, upload-time = "2026-03-17T10:31:17.89Z" }, + { url = "https://files.pythonhosted.org/packages/1b/66/c1dceb7b9714473800b075f5c8a84f4588f887a90eb8645282031676e242/coverage-7.13.5-cp313-cp313-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:0590e44dd2745c696a778f7bab6aa95256de2cbc8b8cff4f7db8ff09813d6969", size = 251165, upload-time = "2026-03-17T10:31:19.605Z" }, + { url = "https://files.pythonhosted.org/packages/b7/62/5502b73b97aa2e53ea22a39cf8649ff44827bef76d90bf638777daa27a9d/coverage-7.13.5-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:d7cfad2d6d81dd298ab6b89fe72c3b7b05ec7544bdda3b707ddaecff8d25c161", size = 252908, upload-time = "2026-03-17T10:31:21.312Z" }, + { url = "https://files.pythonhosted.org/packages/7d/37/7792c2d69854397ca77a55c4646e5897c467928b0e27f2d235d83b5d08c6/coverage-7.13.5-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:e092b9499de38ae0fbfbc603a74660eb6ff3e869e507b50d85a13b6db9863e15", size = 250873, upload-time = "2026-03-17T10:31:23.565Z" }, + { url = "https://files.pythonhosted.org/packages/a3/23/bc866fb6163be52a8a9e5d708ba0d3b1283c12158cefca0a8bbb6e247a43/coverage-7.13.5-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:48c39bc4a04d983a54a705a6389512883d4a3b9862991b3617d547940e9f52b1", size = 255030, upload-time = "2026-03-17T10:31:25.58Z" }, + { url 
= "https://files.pythonhosted.org/packages/7d/8b/ef67e1c222ef49860701d346b8bbb70881bef283bd5f6cbba68a39a086c7/coverage-7.13.5-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:2d3807015f138ffea1ed9afeeb8624fd781703f2858b62a8dd8da5a0994c57b6", size = 250694, upload-time = "2026-03-17T10:31:27.316Z" }, + { url = "https://files.pythonhosted.org/packages/46/0d/866d1f74f0acddbb906db212e096dee77a8e2158ca5e6bb44729f9d93298/coverage-7.13.5-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:ee2aa19e03161671ec964004fb74b2257805d9710bf14a5c704558b9d8dbaf17", size = 252469, upload-time = "2026-03-17T10:31:29.472Z" }, + { url = "https://files.pythonhosted.org/packages/7a/f5/be742fec31118f02ce42b21c6af187ad6a344fed546b56ca60caacc6a9a0/coverage-7.13.5-cp313-cp313-win32.whl", hash = "sha256:ce1998c0483007608c8382f4ff50164bfc5bd07a2246dd272aa4043b75e61e85", size = 222112, upload-time = "2026-03-17T10:31:31.526Z" }, + { url = "https://files.pythonhosted.org/packages/66/40/7732d648ab9d069a46e686043241f01206348e2bbf128daea85be4d6414b/coverage-7.13.5-cp313-cp313-win_amd64.whl", hash = "sha256:631efb83f01569670a5e866ceb80fe483e7c159fac6f167e6571522636104a0b", size = 222923, upload-time = "2026-03-17T10:31:33.633Z" }, + { url = "https://files.pythonhosted.org/packages/48/af/fea819c12a095781f6ccd504890aaddaf88b8fab263c4940e82c7b770124/coverage-7.13.5-cp313-cp313-win_arm64.whl", hash = "sha256:f4cd16206ad171cbc2470dbea9103cf9a7607d5fe8c242fdf1edf36174020664", size = 221540, upload-time = "2026-03-17T10:31:35.445Z" }, + { url = "https://files.pythonhosted.org/packages/23/d2/17879af479df7fbbd44bd528a31692a48f6b25055d16482fdf5cdb633805/coverage-7.13.5-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:0428cbef5783ad91fe240f673cc1f76b25e74bbfe1a13115e4aa30d3f538162d", size = 220262, upload-time = "2026-03-17T10:31:37.184Z" }, + { url = 
"https://files.pythonhosted.org/packages/5b/4c/d20e554f988c8f91d6a02c5118f9abbbf73a8768a3048cb4962230d5743f/coverage-7.13.5-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:e0b216a19534b2427cc201a26c25da4a48633f29a487c61258643e89d28200c0", size = 220617, upload-time = "2026-03-17T10:31:39.245Z" }, + { url = "https://files.pythonhosted.org/packages/29/9c/f9f5277b95184f764b24e7231e166dfdb5780a46d408a2ac665969416d61/coverage-7.13.5-cp313-cp313t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:972a9cd27894afe4bc2b1480107054e062df08e671df7c2f18c205e805ccd806", size = 261912, upload-time = "2026-03-17T10:31:41.324Z" }, + { url = "https://files.pythonhosted.org/packages/d5/f6/7f1ab39393eeb50cfe4747ae8ef0e4fc564b989225aa1152e13a180d74f8/coverage-7.13.5-cp313-cp313t-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:4b59148601efcd2bac8c4dbf1f0ad6391693ccf7a74b8205781751637076aee3", size = 263987, upload-time = "2026-03-17T10:31:43.724Z" }, + { url = "https://files.pythonhosted.org/packages/a0/d7/62c084fb489ed9c6fbdf57e006752e7c516ea46fd690e5ed8b8617c7d52e/coverage-7.13.5-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:505d7083c8b0c87a8fa8c07370c285847c1f77739b22e299ad75a6af6c32c5c9", size = 266416, upload-time = "2026-03-17T10:31:45.769Z" }, + { url = "https://files.pythonhosted.org/packages/a9/f6/df63d8660e1a0bff6125947afda112a0502736f470d62ca68b288ea762d8/coverage-7.13.5-cp313-cp313t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:60365289c3741e4db327e7baff2a4aaacf22f788e80fa4683393891b70a89fbd", size = 267558, upload-time = "2026-03-17T10:31:48.293Z" }, + { url = "https://files.pythonhosted.org/packages/5b/02/353ca81d36779bd108f6d384425f7139ac3c58c750dcfaafe5d0bee6436b/coverage-7.13.5-cp313-cp313t-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = 
"sha256:1b88c69c8ef5d4b6fe7dea66d6636056a0f6a7527c440e890cf9259011f5e606", size = 261163, upload-time = "2026-03-17T10:31:50.125Z" }, + { url = "https://files.pythonhosted.org/packages/2c/16/2e79106d5749bcaf3aee6d309123548e3276517cd7851faa8da213bc61bf/coverage-7.13.5-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:5b13955d31d1633cf9376908089b7cebe7d15ddad7aeaabcbe969a595a97e95e", size = 263981, upload-time = "2026-03-17T10:31:51.961Z" }, + { url = "https://files.pythonhosted.org/packages/29/c7/c29e0c59ffa6942030ae6f50b88ae49988e7e8da06de7ecdbf49c6d4feae/coverage-7.13.5-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:f70c9ab2595c56f81a89620e22899eea8b212a4041bd728ac6f4a28bf5d3ddd0", size = 261604, upload-time = "2026-03-17T10:31:53.872Z" }, + { url = "https://files.pythonhosted.org/packages/40/48/097cdc3db342f34006a308ab41c3a7c11c3f0d84750d340f45d88a782e00/coverage-7.13.5-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:084b84a8c63e8d6fc7e3931b316a9bcafca1458d753c539db82d31ed20091a87", size = 265321, upload-time = "2026-03-17T10:31:55.997Z" }, + { url = "https://files.pythonhosted.org/packages/bb/1f/4994af354689e14fd03a75f8ec85a9a68d94e0188bbdab3fc1516b55e512/coverage-7.13.5-cp313-cp313t-musllinux_1_2_riscv64.whl", hash = "sha256:ad14385487393e386e2ea988b09d62dd42c397662ac2dabc3832d71253eee479", size = 260502, upload-time = "2026-03-17T10:31:58.308Z" }, + { url = "https://files.pythonhosted.org/packages/22/c6/9bb9ef55903e628033560885f5c31aa227e46878118b63ab15dc7ba87797/coverage-7.13.5-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:7f2c47b36fe7709a6e83bfadf4eefb90bd25fbe4014d715224c4316f808e59a2", size = 262688, upload-time = "2026-03-17T10:32:00.141Z" }, + { url = "https://files.pythonhosted.org/packages/14/4f/f5df9007e50b15e53e01edea486814783a7f019893733d9e4d6caad75557/coverage-7.13.5-cp313-cp313t-win32.whl", hash = "sha256:67e9bc5449801fad0e5dff329499fb090ba4c5800b86805c80617b4e29809b2a", size = 222788, upload-time = 
"2026-03-17T10:32:02.246Z" }, + { url = "https://files.pythonhosted.org/packages/e1/98/aa7fccaa97d0f3192bec013c4e6fd6d294a6ed44b640e6bb61f479e00ed5/coverage-7.13.5-cp313-cp313t-win_amd64.whl", hash = "sha256:da86cdcf10d2519e10cabb8ac2de03da1bcb6e4853790b7fbd48523332e3a819", size = 223851, upload-time = "2026-03-17T10:32:04.416Z" }, + { url = "https://files.pythonhosted.org/packages/3d/8b/e5c469f7352651e5f013198e9e21f97510b23de957dd06a84071683b4b60/coverage-7.13.5-cp313-cp313t-win_arm64.whl", hash = "sha256:0ecf12ecb326fe2c339d93fc131816f3a7367d223db37817208905c89bded911", size = 222104, upload-time = "2026-03-17T10:32:06.65Z" }, + { url = "https://files.pythonhosted.org/packages/8e/77/39703f0d1d4b478bfd30191d3c14f53caf596fac00efb3f8f6ee23646439/coverage-7.13.5-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:fbabfaceaeb587e16f7008f7795cd80d20ec548dc7f94fbb0d4ec2e038ce563f", size = 219621, upload-time = "2026-03-17T10:32:08.589Z" }, + { url = "https://files.pythonhosted.org/packages/e2/3e/51dff36d99ae14639a133d9b164d63e628532e2974d8b1edb99dd1ebc733/coverage-7.13.5-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:9bb2a28101a443669a423b665939381084412b81c3f8c0fcfbac57f4e30b5b8e", size = 219953, upload-time = "2026-03-17T10:32:10.507Z" }, + { url = "https://files.pythonhosted.org/packages/6a/6c/1f1917b01eb647c2f2adc9962bd66c79eb978951cab61bdc1acab3290c07/coverage-7.13.5-cp314-cp314-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:bd3a2fbc1c6cccb3c5106140d87cc6a8715110373ef42b63cf5aea29df8c217a", size = 250992, upload-time = "2026-03-17T10:32:12.41Z" }, + { url = "https://files.pythonhosted.org/packages/22/e5/06b1f88f42a5a99df42ce61208bdec3bddb3d261412874280a19796fc09c/coverage-7.13.5-cp314-cp314-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:6c36ddb64ed9d7e496028d1d00dfec3e428e0aabf4006583bb1839958d280510", size = 253503, upload-time = "2026-03-17T10:32:14.449Z" }, + { url = 
"https://files.pythonhosted.org/packages/80/28/2a148a51e5907e504fa7b85490277734e6771d8844ebcc48764a15e28155/coverage-7.13.5-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:380e8e9084d8eb38db3a9176a1a4f3c0082c3806fa0dc882d1d87abc3c789247", size = 254852, upload-time = "2026-03-17T10:32:16.56Z" }, + { url = "https://files.pythonhosted.org/packages/61/77/50e8d3d85cc0b7ebe09f30f151d670e302c7ff4a1bf6243f71dd8b0981fa/coverage-7.13.5-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:e808af52a0513762df4d945ea164a24b37f2f518cbe97e03deaa0ee66139b4d6", size = 257161, upload-time = "2026-03-17T10:32:19.004Z" }, + { url = "https://files.pythonhosted.org/packages/3b/c4/b5fd1d4b7bf8d0e75d997afd3925c59ba629fc8616f1b3aae7605132e256/coverage-7.13.5-cp314-cp314-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:e301d30dd7e95ae068671d746ba8c34e945a82682e62918e41b2679acd2051a0", size = 251021, upload-time = "2026-03-17T10:32:21.344Z" }, + { url = "https://files.pythonhosted.org/packages/f8/66/6ea21f910e92d69ef0b1c3346ea5922a51bad4446c9126db2ae96ee24c4c/coverage-7.13.5-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:800bc829053c80d240a687ceeb927a94fd108bbdc68dfbe505d0d75ab578a882", size = 252858, upload-time = "2026-03-17T10:32:23.506Z" }, + { url = "https://files.pythonhosted.org/packages/9e/ea/879c83cb5d61aa2a35fb80e72715e92672daef8191b84911a643f533840c/coverage-7.13.5-cp314-cp314-musllinux_1_2_i686.whl", hash = "sha256:0b67af5492adb31940ee418a5a655c28e48165da5afab8c7fa6fd72a142f8740", size = 250823, upload-time = "2026-03-17T10:32:25.516Z" }, + { url = "https://files.pythonhosted.org/packages/8a/fb/616d95d3adb88b9803b275580bdeee8bd1b69a886d057652521f83d7322f/coverage-7.13.5-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:c9136ff29c3a91e25b1d1552b5308e53a1e0653a23e53b6366d7c2dcbbaf8a16", size = 255099, upload-time = "2026-03-17T10:32:27.944Z" }, + { url 
= "https://files.pythonhosted.org/packages/1c/93/25e6917c90ec1c9a56b0b26f6cad6408e5f13bb6b35d484a0d75c9cf000d/coverage-7.13.5-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:cff784eef7f0b8f6cb28804fbddcfa99f89efe4cc35fb5627e3ac58f91ed3ac0", size = 250638, upload-time = "2026-03-17T10:32:29.914Z" }, + { url = "https://files.pythonhosted.org/packages/fc/7b/dc1776b0464145a929deed214aef9fb1493f159b59ff3c7eeeedf91eddd0/coverage-7.13.5-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:68a4953be99b17ac3c23b6efbc8a38330d99680c9458927491d18700ef23ded0", size = 252295, upload-time = "2026-03-17T10:32:31.981Z" }, + { url = "https://files.pythonhosted.org/packages/ea/fb/99cbbc56a26e07762a2740713f3c8f9f3f3106e3a3dd8cc4474954bccd34/coverage-7.13.5-cp314-cp314-win32.whl", hash = "sha256:35a31f2b1578185fbe6aa2e74cea1b1d0bbf4c552774247d9160d29b80ed56cc", size = 222360, upload-time = "2026-03-17T10:32:34.233Z" }, + { url = "https://files.pythonhosted.org/packages/8d/b7/4758d4f73fb536347cc5e4ad63662f9d60ba9118cb6785e9616b2ce5d7fa/coverage-7.13.5-cp314-cp314-win_amd64.whl", hash = "sha256:2aa055ae1857258f9e0045be26a6d62bdb47a72448b62d7b55f4820f361a2633", size = 223174, upload-time = "2026-03-17T10:32:36.369Z" }, + { url = "https://files.pythonhosted.org/packages/2c/f2/24d84e1dfe70f8ac9fdf30d338239860d0d1d5da0bda528959d0ebc9da28/coverage-7.13.5-cp314-cp314-win_arm64.whl", hash = "sha256:1b11eef33edeae9d142f9b4358edb76273b3bfd30bc3df9a4f95d0e49caf94e8", size = 221739, upload-time = "2026-03-17T10:32:38.736Z" }, + { url = "https://files.pythonhosted.org/packages/60/5b/4a168591057b3668c2428bff25dd3ebc21b629d666d90bcdfa0217940e84/coverage-7.13.5-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:10a0c37f0b646eaff7cce1874c31d1f1ccb297688d4c747291f4f4c70741cc8b", size = 220351, upload-time = "2026-03-17T10:32:41.196Z" }, + { url = 
"https://files.pythonhosted.org/packages/f5/21/1fd5c4dbfe4a58b6b99649125635df46decdfd4a784c3cd6d410d303e370/coverage-7.13.5-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:b5db73ba3c41c7008037fa731ad5459fc3944cb7452fc0aa9f822ad3533c583c", size = 220612, upload-time = "2026-03-17T10:32:43.204Z" }, + { url = "https://files.pythonhosted.org/packages/d6/fe/2a924b3055a5e7e4512655a9d4609781b0d62334fa0140c3e742926834e2/coverage-7.13.5-cp314-cp314t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:750db93a81e3e5a9831b534be7b1229df848b2e125a604fe6651e48aa070e5f9", size = 261985, upload-time = "2026-03-17T10:32:45.514Z" }, + { url = "https://files.pythonhosted.org/packages/d7/0d/c8928f2bd518c45990fe1a2ab8db42e914ef9b726c975facc4282578c3eb/coverage-7.13.5-cp314-cp314t-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:9ddb4f4a5479f2539644be484da179b653273bca1a323947d48ab107b3ed1f29", size = 264107, upload-time = "2026-03-17T10:32:47.971Z" }, + { url = "https://files.pythonhosted.org/packages/ef/ae/4ae35bbd9a0af9d820362751f0766582833c211224b38665c0f8de3d487f/coverage-7.13.5-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d8a7a2049c14f413163e2bdabd37e41179b1d1ccb10ffc6ccc4b7a718429c607", size = 266513, upload-time = "2026-03-17T10:32:50.1Z" }, + { url = "https://files.pythonhosted.org/packages/9c/20/d326174c55af36f74eac6ae781612d9492f060ce8244b570bb9d50d9d609/coverage-7.13.5-cp314-cp314t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:e1c85e0b6c05c592ea6d8768a66a254bfb3874b53774b12d4c89c481eb78cb90", size = 267650, upload-time = "2026-03-17T10:32:52.391Z" }, + { url = "https://files.pythonhosted.org/packages/7a/5e/31484d62cbd0eabd3412e30d74386ece4a0837d4f6c3040a653878bfc019/coverage-7.13.5-cp314-cp314t-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = 
"sha256:777c4d1eff1b67876139d24288aaf1817f6c03d6bae9c5cc8d27b83bcfe38fe3", size = 261089, upload-time = "2026-03-17T10:32:54.544Z" }, + { url = "https://files.pythonhosted.org/packages/e9/d8/49a72d6de146eebb0b7e48cc0f4bc2c0dd858e3d4790ab2b39a2872b62bd/coverage-7.13.5-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:6697e29b93707167687543480a40f0db8f356e86d9f67ddf2e37e2dfd91a9dab", size = 263982, upload-time = "2026-03-17T10:32:56.803Z" }, + { url = "https://files.pythonhosted.org/packages/06/3b/0351f1bd566e6e4dd39e978efe7958bde1d32f879e85589de147654f57bb/coverage-7.13.5-cp314-cp314t-musllinux_1_2_i686.whl", hash = "sha256:8fdf453a942c3e4d99bd80088141c4c6960bb232c409d9c3558e2dbaa3998562", size = 261579, upload-time = "2026-03-17T10:32:59.466Z" }, + { url = "https://files.pythonhosted.org/packages/5d/ce/796a2a2f4017f554d7810f5c573449b35b1e46788424a548d4d19201b222/coverage-7.13.5-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:32ca0c0114c9834a43f045a87dcebd69d108d8ffb666957ea65aa132f50332e2", size = 265316, upload-time = "2026-03-17T10:33:01.847Z" }, + { url = "https://files.pythonhosted.org/packages/3d/16/d5ae91455541d1a78bc90abf495be600588aff8f6db5c8b0dae739fa39c9/coverage-7.13.5-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:8769751c10f339021e2638cd354e13adeac54004d1941119b2c96fe5276d45ea", size = 260427, upload-time = "2026-03-17T10:33:03.945Z" }, + { url = "https://files.pythonhosted.org/packages/48/11/07f413dba62db21fb3fad5d0de013a50e073cc4e2dc4306e770360f6dfc8/coverage-7.13.5-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:cec2d83125531bd153175354055cdb7a09987af08a9430bd173c937c6d0fba2a", size = 262745, upload-time = "2026-03-17T10:33:06.285Z" }, + { url = "https://files.pythonhosted.org/packages/91/15/d792371332eb4663115becf4bad47e047d16234b1aff687b1b18c58d60ae/coverage-7.13.5-cp314-cp314t-win32.whl", hash = "sha256:0cd9ed7a8b181775459296e402ca4fb27db1279740a24e93b3b41942ebe4b215", size = 223146, upload-time = 
"2026-03-17T10:33:08.756Z" }, + { url = "https://files.pythonhosted.org/packages/db/51/37221f59a111dca5e85be7dbf09696323b5b9f13ff65e0641d535ed06ea8/coverage-7.13.5-cp314-cp314t-win_amd64.whl", hash = "sha256:301e3b7dfefecaca37c9f1aa6f0049b7d4ab8dd933742b607765d757aca77d43", size = 224254, upload-time = "2026-03-17T10:33:11.174Z" }, + { url = "https://files.pythonhosted.org/packages/54/83/6acacc889de8987441aa7d5adfbdbf33d288dad28704a67e574f1df9bcbb/coverage-7.13.5-cp314-cp314t-win_arm64.whl", hash = "sha256:9dacc2ad679b292709e0f5fc1ac74a6d4d5562e424058962c7bb0c658ad25e45", size = 222276, upload-time = "2026-03-17T10:33:13.466Z" }, + { url = "https://files.pythonhosted.org/packages/9e/ee/a4cf96b8ce1e566ed238f0659ac2d3f007ed1d14b181bcb684e19561a69a/coverage-7.13.5-py3-none-any.whl", hash = "sha256:34b02417cf070e173989b3db962f7ed56d2f644307b2cf9d5a0f258e13084a61", size = 211346, upload-time = "2026-03-17T10:33:15.691Z" }, ] [package.optional-dependencies] @@ -194,9 +226,10 @@ wheels = [ [[package]] name = "llmb-run" -version = "1.10.11" +version = "1.14.4" source = { editable = "." 
} dependencies = [ + { name = "pydantic" }, { name = "pyyaml" }, { name = "rich" }, { name = "shellingham" }, @@ -216,6 +249,7 @@ dev = [ [package.metadata] requires-dist = [ { name = "black", marker = "extra == 'dev'", specifier = "~=26.1" }, + { name = "pydantic", specifier = ">=2.0,<3" }, { name = "pytest", marker = "extra == 'dev'", specifier = ">=7.0" }, { name = "pytest-cov", marker = "extra == 'dev'", specifier = ">=4.0" }, { name = "pytest-mock", marker = "extra == 'dev'" }, @@ -260,29 +294,29 @@ wheels = [ [[package]] name = "packaging" -version = "26.0" +version = "26.2" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/65/ee/299d360cdc32edc7d2cf530f3accf79c4fca01e96ffc950d8a52213bd8e4/packaging-26.0.tar.gz", hash = "sha256:00243ae351a257117b6a241061796684b084ed1c516a08c48a3f7e147a9d80b4", size = 143416, upload-time = "2026-01-21T20:50:39.064Z" } +sdist = { url = "https://files.pythonhosted.org/packages/d7/f1/e7a6dd94a8d4a5626c03e4e99c87f241ba9e350cd9e6d75123f992427270/packaging-26.2.tar.gz", hash = "sha256:ff452ff5a3e828ce110190feff1178bb1f2ea2281fa2075aadb987c2fb221661", size = 228134, upload-time = "2026-04-24T20:15:23.917Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/b7/b9/c538f279a4e237a006a2c98387d081e9eb060d203d8ed34467cc0f0b9b53/packaging-26.0-py3-none-any.whl", hash = "sha256:b36f1fef9334a5588b4166f8bcd26a14e521f2b55e6b9de3aaa80d3ff7a37529", size = 74366, upload-time = "2026-01-21T20:50:37.788Z" }, + { url = "https://files.pythonhosted.org/packages/df/b2/87e62e8c3e2f4b32e5fe99e0b86d576da1312593b39f47d8ceef365e95ed/packaging-26.2-py3-none-any.whl", hash = "sha256:5fc45236b9446107ff2415ce77c807cee2862cb6fac22b8a73826d0693b0980e", size = 100195, upload-time = "2026-04-24T20:15:22.081Z" }, ] [[package]] name = "pathspec" -version = "1.0.4" +version = "1.1.1" source = { registry = "https://pypi.org/simple" } -sdist = { url = 
"https://files.pythonhosted.org/packages/fa/36/e27608899f9b8d4dff0617b2d9ab17ca5608956ca44461ac14ac48b44015/pathspec-1.0.4.tar.gz", hash = "sha256:0210e2ae8a21a9137c0d470578cb0e595af87edaa6ebf12ff176f14a02e0e645", size = 131200, upload-time = "2026-01-27T03:59:46.938Z" } +sdist = { url = "https://files.pythonhosted.org/packages/5a/82/42f767fc1c1143d6fd36efb827202a2d997a375e160a71eb2888a925aac1/pathspec-1.1.1.tar.gz", hash = "sha256:17db5ecd524104a120e173814c90367a96a98d07c45b2e10c2f3919fff91bf5a", size = 135180, upload-time = "2026-04-27T01:46:08.907Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/ef/3c/2c197d226f9ea224a9ab8d197933f9da0ae0aac5b6e0f884e2b8d9c8e9f7/pathspec-1.0.4-py3-none-any.whl", hash = "sha256:fb6ae2fd4e7c921a165808a552060e722767cfa526f99ca5156ed2ce45a5c723", size = 55206, upload-time = "2026-01-27T03:59:45.137Z" }, + { url = "https://files.pythonhosted.org/packages/f1/d9/7fb5aa316bc299258e68c73ba3bddbc499654a07f151cba08f6153988714/pathspec-1.1.1-py3-none-any.whl", hash = "sha256:a00ce642f577bf7f473932318056212bc4f8bfdf53128c78bbd5af0b9b20b189", size = 57328, upload-time = "2026-04-27T01:46:07.06Z" }, ] [[package]] name = "platformdirs" -version = "4.5.1" +version = "4.9.6" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/cf/86/0248f086a84f01b37aaec0fa567b397df1a119f73c16f6c7a9aac73ea309/platformdirs-4.5.1.tar.gz", hash = "sha256:61d5cdcc6065745cdd94f0f878977f8de9437be93de97c1c12f853c9c0cdcbda", size = 21715, upload-time = "2025-12-05T13:52:58.638Z" } +sdist = { url = "https://files.pythonhosted.org/packages/9f/4a/0883b8e3802965322523f0b200ecf33d31f10991d0401162f4b23c698b42/platformdirs-4.9.6.tar.gz", hash = "sha256:3bfa75b0ad0db84096ae777218481852c0ebc6c727b3168c1b9e0118e458cf0a", size = 29400, upload-time = "2026-04-09T00:04:10.812Z" } wheels = [ - { url = 
"https://files.pythonhosted.org/packages/cb/28/3bfe2fa5a7b9c46fe7e13c97bda14c895fb10fa2ebf1d0abb90e0cea7ee1/platformdirs-4.5.1-py3-none-any.whl", hash = "sha256:d03afa3963c806a9bed9d5125c8f4cb2fdaf74a55ab60e5d59b3fde758104d31", size = 18731, upload-time = "2025-12-05T13:52:56.823Z" }, + { url = "https://files.pythonhosted.org/packages/75/a6/a0a304dc33b49145b21f4808d763822111e67d1c3a32b524a1baf947b6e1/platformdirs-4.9.6-py3-none-any.whl", hash = "sha256:e61adb1d5e5cb3441b4b7710bea7e4c12250ca49439228cc1021c00dcfac0917", size = 21348, upload-time = "2026-04-09T00:04:09.463Z" }, ] [[package]] @@ -294,18 +328,149 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/54/20/4d324d65cc6d9205fabedc306948156824eb9f0ee1633355a8f7ec5c66bf/pluggy-1.6.0-py3-none-any.whl", hash = "sha256:e920276dd6813095e9377c0bc5566d94c932c33b27a3e3945d8389c374dd4746", size = 20538, upload-time = "2025-05-15T12:30:06.134Z" }, ] +[[package]] +name = "pydantic" +version = "2.13.3" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "annotated-types" }, + { name = "pydantic-core" }, + { name = "typing-extensions" }, + { name = "typing-inspection" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/d9/e4/40d09941a2cebcb20609b86a559817d5b9291c49dd6f8c87e5feffbe703a/pydantic-2.13.3.tar.gz", hash = "sha256:af09e9d1d09f4e7fe37145c1f577e1d61ceb9a41924bf0094a36506285d0a84d", size = 844068, upload-time = "2026-04-20T14:46:43.632Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/f3/0a/fd7d723f8f8153418fb40cf9c940e82004fce7e987026b08a68a36dd3fe7/pydantic-2.13.3-py3-none-any.whl", hash = "sha256:6db14ac8dfc9a1e57f87ea2c0de670c251240f43cb0c30a5130e9720dc612927", size = 471981, upload-time = "2026-04-20T14:46:41.402Z" }, +] + +[[package]] +name = "pydantic-core" +version = "2.46.3" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "typing-extensions" }, +] +sdist = { url = 
"https://files.pythonhosted.org/packages/2a/ef/f7abb56c49382a246fd2ce9c799691e3c3e7175ec74b14d99e798bcddb1a/pydantic_core-2.46.3.tar.gz", hash = "sha256:41c178f65b8c29807239d47e6050262eb6bf84eb695e41101e62e38df4a5bc2c", size = 471412, upload-time = "2026-04-20T14:40:56.672Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/22/98/b50eb9a411e87483b5c65dba4fa430a06bac4234d3403a40e5a9905ebcd0/pydantic_core-2.46.3-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:1da3786b8018e60349680720158cc19161cc3b4bdd815beb0a321cd5ce1ad5b1", size = 2108971, upload-time = "2026-04-20T14:43:51.945Z" }, + { url = "https://files.pythonhosted.org/packages/08/4b/f364b9d161718ff2217160a4b5d41ce38de60aed91c3689ebffa1c939d23/pydantic_core-2.46.3-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:cc0988cb29d21bf4a9d5cf2ef970b5c0e38d8d8e107a493278c05dc6c1dda69f", size = 1949588, upload-time = "2026-04-20T14:44:10.386Z" }, + { url = "https://files.pythonhosted.org/packages/8f/8b/30bd03ee83b2f5e29f5ba8e647ab3c456bf56f2ec72fdbcc0215484a0854/pydantic_core-2.46.3-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:27f9067c3bfadd04c55484b89c0d267981b2f3512850f6f66e1e74204a4e4ce3", size = 1975986, upload-time = "2026-04-20T14:43:57.106Z" }, + { url = "https://files.pythonhosted.org/packages/3c/54/13ccf954d84ec275d5d023d5786e4aa48840bc9f161f2838dc98e1153518/pydantic_core-2.46.3-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:a642ac886ecf6402d9882d10c405dcf4b902abeb2972cd5fb4a48c83cd59279a", size = 2055830, upload-time = "2026-04-20T14:44:15.499Z" }, + { url = "https://files.pythonhosted.org/packages/be/0e/65f38125e660fdbd72aa858e7dfae893645cfa0e7b13d333e174a367cd23/pydantic_core-2.46.3-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:79f561438481f28681584b89e2effb22855e2179880314bcddbf5968e935e807", size = 2222340, upload-time = "2026-04-20T14:41:51.353Z" }, + { url = 
"https://files.pythonhosted.org/packages/d1/88/f3ab7739efe0e7e80777dbb84c59eb98518e3f57ea433206194c2e425272/pydantic_core-2.46.3-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:57a973eae4665352a47cf1a99b4ee864620f2fe663a217d7a8da68a1f3a5bfda", size = 2280727, upload-time = "2026-04-20T14:41:30.461Z" }, + { url = "https://files.pythonhosted.org/packages/2a/6d/c228219080817bec4982f9531cadb18da6aaa770fdeb114f49c237ac2c9f/pydantic_core-2.46.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:83d002b97072a53ea150d63e0a3adfae5670cef5aa8a6e490240e482d3b22e57", size = 2092158, upload-time = "2026-04-20T14:44:07.305Z" }, + { url = "https://files.pythonhosted.org/packages/0f/b1/525a16711e7c6d61635fac3b0bd54600b5c5d9f60c6fc5aaab26b64a2297/pydantic_core-2.46.3-cp310-cp310-manylinux_2_31_riscv64.whl", hash = "sha256:b40ddd51e7c44b28cfaef746c9d3c506d658885e0a46f9eeef2ee815cbf8e045", size = 2116626, upload-time = "2026-04-20T14:42:34.118Z" }, + { url = "https://files.pythonhosted.org/packages/ef/7c/17d30673351439a6951bf54f564cf2443ab00ae264ec9df00e2efd710eb5/pydantic_core-2.46.3-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:ac5ec7fb9b87f04ee839af2d53bcadea57ded7d229719f56c0ed895bff987943", size = 2160691, upload-time = "2026-04-20T14:41:14.023Z" }, + { url = "https://files.pythonhosted.org/packages/86/66/af8adbcbc0886ead7f1a116606a534d75a307e71e6e08226000d51b880d2/pydantic_core-2.46.3-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:a3b11c812f61b3129c4905781a2601dfdfdea5fe1e6c1cfb696b55d14e9c054f", size = 2182543, upload-time = "2026-04-20T14:40:48.886Z" }, + { url = "https://files.pythonhosted.org/packages/b0/37/6de71e0f54c54a4190010f57deb749e1ddf75c568ada3b1320b70067f121/pydantic_core-2.46.3-cp310-cp310-musllinux_1_1_armv7l.whl", hash = "sha256:1108da631e602e5b3c38d6d04fe5bb3bfa54349e6918e3ca6cf570b2e2b2f9d4", size = 2324513, upload-time = "2026-04-20T14:42:36.121Z" }, + { url = 
"https://files.pythonhosted.org/packages/51/b1/9fc74ce94f603d5ef59ff258ca9c2c8fb902fb548d340a96f77f4d1c3b7f/pydantic_core-2.46.3-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:de885175515bcfa98ae618c1df7a072f13d179f81376c8007112af20567fd08a", size = 2361853, upload-time = "2026-04-20T14:43:24.886Z" }, + { url = "https://files.pythonhosted.org/packages/40/d0/4c652fc592db35f100279ee751d5a145aca1b9a7984b9684ba7c1b5b0535/pydantic_core-2.46.3-cp310-cp310-win32.whl", hash = "sha256:d11058e3201527d41bc6b545c79187c9e4bf85e15a236a6007f0e991518882b7", size = 1980465, upload-time = "2026-04-20T14:44:46.239Z" }, + { url = "https://files.pythonhosted.org/packages/27/b8/a920453c38afbe1f355e1ea0b0d94a0a3e0b0879d32d793108755fa171d5/pydantic_core-2.46.3-cp310-cp310-win_amd64.whl", hash = "sha256:3612edf65c8ea67ac13616c4d23af12faef1ae435a8a93e5934c2a0cbbdd1fd6", size = 2073884, upload-time = "2026-04-20T14:43:01.201Z" }, + { url = "https://files.pythonhosted.org/packages/22/a2/1ba90a83e85a3f94c796b184f3efde9c72f2830dcda493eea8d59ba78e6d/pydantic_core-2.46.3-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:ab124d49d0459b2373ecf54118a45c28a1e6d4192a533fbc915e70f556feb8e5", size = 2106740, upload-time = "2026-04-20T14:41:20.932Z" }, + { url = "https://files.pythonhosted.org/packages/b6/f6/99ae893c89a0b9d3daec9f95487aa676709aa83f67643b3f0abaf4ab628a/pydantic_core-2.46.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:cca67d52a5c7a16aed2b3999e719c4bcf644074eac304a5d3d62dd70ae7d4b2c", size = 1948293, upload-time = "2026-04-20T14:43:42.115Z" }, + { url = "https://files.pythonhosted.org/packages/3e/b8/2e8e636dc9e3f16c2e16bf0849e24be82c5ee82c603c65fc0326666328fc/pydantic_core-2.46.3-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5c024e08c0ba23e6fd68c771a521e9d6a792f2ebb0fa734296b36394dc30390e", size = 1973222, upload-time = "2026-04-20T14:41:57.841Z" }, + { url = 
"https://files.pythonhosted.org/packages/34/36/0e730beec4d83c5306f417afbd82ff237d9a21e83c5edf675f31ed84c1fe/pydantic_core-2.46.3-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:6645ce7eec4928e29a1e3b3d5c946621d105d3e79f0c9cddf07c2a9770949287", size = 2053852, upload-time = "2026-04-20T14:40:43.077Z" }, + { url = "https://files.pythonhosted.org/packages/4b/f0/3071131f47e39136a17814576e0fada9168569f7f8c0e6ac4d1ede6a4958/pydantic_core-2.46.3-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:a712c7118e6c5ea96562f7b488435172abb94a3c53c22c9efc1412264a45cbbe", size = 2221134, upload-time = "2026-04-20T14:43:03.349Z" }, + { url = "https://files.pythonhosted.org/packages/2f/a9/a2dc023eec5aa4b02a467874bad32e2446957d2adcab14e107eab502e978/pydantic_core-2.46.3-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:69a868ef3ff206343579021c40faf3b1edc64b1cc508ff243a28b0a514ccb050", size = 2279785, upload-time = "2026-04-20T14:41:19.285Z" }, + { url = "https://files.pythonhosted.org/packages/0a/44/93f489d16fb63fbd41c670441536541f6e8cfa1e5a69f40bc9c5d30d8c90/pydantic_core-2.46.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:cc7e8c32db809aa0f6ea1d6869ebc8518a65d5150fdfad8bcae6a49ae32a22e2", size = 2089404, upload-time = "2026-04-20T14:43:10.108Z" }, + { url = "https://files.pythonhosted.org/packages/2a/78/8692e3aa72b2d004f7a5d937f1dfdc8552ba26caf0bec75f342c40f00dec/pydantic_core-2.46.3-cp311-cp311-manylinux_2_31_riscv64.whl", hash = "sha256:3481bd1341dc85779ee506bc8e1196a277ace359d89d28588a9468c3ecbe63fa", size = 2114898, upload-time = "2026-04-20T14:44:51.475Z" }, + { url = "https://files.pythonhosted.org/packages/6a/62/e83133f2e7832532060175cebf1f13748f4c7e7e7165cdd1f611f174494b/pydantic_core-2.46.3-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:8690eba565c6d68ffd3a8655525cbdd5246510b44a637ee2c6c03a7ebfe64d3c", size = 2157856, upload-time = 
"2026-04-20T14:43:46.64Z" }, + { url = "https://files.pythonhosted.org/packages/6d/ec/6a500e3ad7718ee50583fae79c8651f5d37e3abce1fa9ae177ae65842c53/pydantic_core-2.46.3-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:4de88889d7e88d50d40ee5b39d5dac0bcaef9ba91f7e536ac064e6b2834ecccf", size = 2180168, upload-time = "2026-04-20T14:42:00.302Z" }, + { url = "https://files.pythonhosted.org/packages/d8/53/8267811054b1aa7fc1dc7ded93812372ef79a839f5e23558136a6afbfde1/pydantic_core-2.46.3-cp311-cp311-musllinux_1_1_armv7l.whl", hash = "sha256:e480080975c1ef7f780b8f99ed72337e7cc5efea2e518a20a692e8e7b278eb8b", size = 2322885, upload-time = "2026-04-20T14:41:05.253Z" }, + { url = "https://files.pythonhosted.org/packages/c8/c1/1c0acdb3aa0856ddc4ecc55214578f896f2de16f400cf51627eb3c26c1c4/pydantic_core-2.46.3-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:de3a5c376f8cd94da9a1b8fd3dd1c16c7a7b216ed31dc8ce9fd7a22bf13b836e", size = 2360328, upload-time = "2026-04-20T14:41:43.991Z" }, + { url = "https://files.pythonhosted.org/packages/f0/d0/ef39cd0f4a926814f360e71c1adeab48ad214d9727e4deb48eedfb5bce1a/pydantic_core-2.46.3-cp311-cp311-win32.whl", hash = "sha256:fc331a5314ffddd5385b9ee9d0d2fee0b13c27e0e02dad71b1ae5d6561f51eeb", size = 1979464, upload-time = "2026-04-20T14:43:12.215Z" }, + { url = "https://files.pythonhosted.org/packages/18/9c/f41951b0d858e343f1cf09398b2a7b3014013799744f2c4a8ad6a3eec4f2/pydantic_core-2.46.3-cp311-cp311-win_amd64.whl", hash = "sha256:b5b9c6cf08a8a5e502698f5e153056d12c34b8fb30317e0c5fd06f45162a6346", size = 2070837, upload-time = "2026-04-20T14:41:47.707Z" }, + { url = "https://files.pythonhosted.org/packages/9f/1e/264a17cd582f6ed50950d4d03dd5fefd84e570e238afe1cb3e25cf238769/pydantic_core-2.46.3-cp311-cp311-win_arm64.whl", hash = "sha256:5dfd51cf457482f04ec49491811a2b8fd5b843b64b11eecd2d7a1ee596ea78a6", size = 2053647, upload-time = "2026-04-20T14:42:27.535Z" }, + { url = 
"https://files.pythonhosted.org/packages/4b/cb/5b47425556ecc1f3fe18ed2a0083188aa46e1dd812b06e406475b3a5d536/pydantic_core-2.46.3-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:b11b59b3eee90a80a36701ddb4576d9ae31f93f05cb9e277ceaa09e6bf074a67", size = 2101946, upload-time = "2026-04-20T14:40:52.581Z" }, + { url = "https://files.pythonhosted.org/packages/a1/4f/2fb62c2267cae99b815bbf4a7b9283812c88ca3153ef29f7707200f1d4e5/pydantic_core-2.46.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:af8653713055ea18a3abc1537fe2ebc42f5b0bbb768d1eb79fd74eb47c0ac089", size = 1951612, upload-time = "2026-04-20T14:42:42.996Z" }, + { url = "https://files.pythonhosted.org/packages/50/6e/b7348fd30d6556d132cddd5bd79f37f96f2601fe0608afac4f5fb01ec0b3/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:75a519dab6d63c514f3a81053e5266c549679e4aa88f6ec57f2b7b854aceb1b0", size = 1977027, upload-time = "2026-04-20T14:42:02.001Z" }, + { url = "https://files.pythonhosted.org/packages/82/11/31d60ee2b45540d3fb0b29302a393dbc01cd771c473f5b5147bcd353e593/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:a6cd87cb1575b1ad05ba98894c5b5c96411ef678fa2f6ed2576607095b8d9789", size = 2063008, upload-time = "2026-04-20T14:44:17.952Z" }, + { url = "https://files.pythonhosted.org/packages/8a/db/3a9d1957181b59258f44a2300ab0f0be9d1e12d662a4f57bb31250455c52/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:f80a55484b8d843c8ada81ebf70a682f3f00a3d40e378c06cf17ecb44d280d7d", size = 2233082, upload-time = "2026-04-20T14:40:57.934Z" }, + { url = "https://files.pythonhosted.org/packages/9c/e1/3277c38792aeb5cfb18c2f0c5785a221d9ff4e149abbe1184d53d5f72273/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:3861f1731b90c50a3266316b9044f5c9b405eecb8e299b0a7120596334e4fe9c", size = 2304615, upload-time = "2026-04-20T14:42:12.584Z" }, + { 
url = "https://files.pythonhosted.org/packages/5e/d5/e3d9717c9eba10855325650afd2a9cba8e607321697f18953af9d562da2f/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fb528e295ed31570ac3dcc9bfdd6e0150bc11ce6168ac87a8082055cf1a67395", size = 2094380, upload-time = "2026-04-20T14:43:05.522Z" }, + { url = "https://files.pythonhosted.org/packages/a1/20/abac35dedcbfd66c6f0b03e4e3564511771d6c9b7ede10a362d03e110d9b/pydantic_core-2.46.3-cp312-cp312-manylinux_2_31_riscv64.whl", hash = "sha256:367508faa4973b992b271ba1494acaab36eb7e8739d1e47be5035fb1ea225396", size = 2135429, upload-time = "2026-04-20T14:41:55.549Z" }, + { url = "https://files.pythonhosted.org/packages/6c/a5/41bfd1df69afad71b5cf0535055bccc73022715ad362edbc124bc1e021d7/pydantic_core-2.46.3-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:5ad3c826fe523e4becf4fe39baa44286cff85ef137c729a2c5e269afbfd0905d", size = 2174582, upload-time = "2026-04-20T14:41:45.96Z" }, + { url = "https://files.pythonhosted.org/packages/79/65/38d86ea056b29b2b10734eb23329b7a7672ca604df4f2b6e9c02d4ee22fe/pydantic_core-2.46.3-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:ec638c5d194ef8af27db69f16c954a09797c0dc25015ad6123eb2c73a4d271ca", size = 2187533, upload-time = "2026-04-20T14:40:55.367Z" }, + { url = "https://files.pythonhosted.org/packages/b6/55/a1129141678a2026badc539ad1dee0a71d06f54c2f06a4bd68c030ac781b/pydantic_core-2.46.3-cp312-cp312-musllinux_1_1_armv7l.whl", hash = "sha256:28ed528c45446062ee66edb1d33df5d88828ae167de76e773a3c7f64bd14e976", size = 2332985, upload-time = "2026-04-20T14:44:13.05Z" }, + { url = "https://files.pythonhosted.org/packages/d7/60/cb26f4077719f709e54819f4e8e1d43f4091f94e285eb6bd21e1190a7b7c/pydantic_core-2.46.3-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:aed19d0c783886d5bd86d80ae5030006b45e28464218747dcf83dabfdd092c7b", size = 2373670, upload-time = "2026-04-20T14:41:53.421Z" }, + { url = 
"https://files.pythonhosted.org/packages/6b/7e/c3f21882bdf1d8d086876f81b5e296206c69c6082551d776895de7801fa0/pydantic_core-2.46.3-cp312-cp312-win32.whl", hash = "sha256:06d5d8820cbbdb4147578c1fe7ffcd5b83f34508cb9f9ab76e807be7db6ff0a4", size = 1966722, upload-time = "2026-04-20T14:44:30.588Z" }, + { url = "https://files.pythonhosted.org/packages/57/be/6b5e757b859013ebfbd7adba02f23b428f37c86dcbf78b5bb0b4ffd36e99/pydantic_core-2.46.3-cp312-cp312-win_amd64.whl", hash = "sha256:c3212fda0ee959c1dd04c60b601ec31097aaa893573a3a1abd0a47bcac2968c1", size = 2072970, upload-time = "2026-04-20T14:42:54.248Z" }, + { url = "https://files.pythonhosted.org/packages/bf/f8/a989b21cc75e9a32d24192ef700eea606521221a89faa40c919ce884f2b1/pydantic_core-2.46.3-cp312-cp312-win_arm64.whl", hash = "sha256:f1f8338dd7a7f31761f1f1a3c47503a9a3b34eea3c8b01fa6ee96408affb5e72", size = 2035963, upload-time = "2026-04-20T14:44:20.4Z" }, + { url = "https://files.pythonhosted.org/packages/9b/3c/9b5e8eb9821936d065439c3b0fb1490ffa64163bfe7e1595985a47896073/pydantic_core-2.46.3-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:12bc98de041458b80c86c56b24df1d23832f3e166cbaff011f25d187f5c62c37", size = 2102109, upload-time = "2026-04-20T14:41:24.219Z" }, + { url = "https://files.pythonhosted.org/packages/91/97/1c41d1f5a19f241d8069f1e249853bcce378cdb76eec8ab636d7bc426280/pydantic_core-2.46.3-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:85348b8f89d2c3508b65b16c3c33a4da22b8215138d8b996912bb1532868885f", size = 1951820, upload-time = "2026-04-20T14:42:14.236Z" }, + { url = "https://files.pythonhosted.org/packages/30/b4/d03a7ae14571bc2b6b3c7b122441154720619afe9a336fa3a95434df5e2f/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1105677a6df914b1fb71a81b96c8cce7726857e1717d86001f29be06a25ee6f8", size = 1977785, upload-time = "2026-04-20T14:42:31.648Z" }, + { url = 
"https://files.pythonhosted.org/packages/ae/0c/4086f808834b59e3c8f1aa26df8f4b6d998cdcf354a143d18ef41529d1fe/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:87082cd65669a33adeba5470769e9704c7cf026cc30afb9cc77fd865578ebaad", size = 2062761, upload-time = "2026-04-20T14:40:37.093Z" }, + { url = "https://files.pythonhosted.org/packages/fa/71/a649be5a5064c2df0db06e0a512c2281134ed2fcc981f52a657936a7527c/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:60e5f66e12c4f5212d08522963380eaaeac5ebd795826cfd19b2dfb0c7a52b9c", size = 2232989, upload-time = "2026-04-20T14:42:59.254Z" }, + { url = "https://files.pythonhosted.org/packages/a2/84/7756e75763e810b3a710f4724441d1ecc5883b94aacb07ca71c5fb5cfb69/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:b6cdf19bf84128d5e7c37e8a73a0c5c10d51103a650ac585d42dd6ae233f2b7f", size = 2303975, upload-time = "2026-04-20T14:41:32.287Z" }, + { url = "https://files.pythonhosted.org/packages/6c/35/68a762e0c1e31f35fa0dac733cbd9f5b118042853698de9509c8e5bf128b/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:031bb17f4885a43773c8c763089499f242aee2ea85cf17154168775dccdecf35", size = 2095325, upload-time = "2026-04-20T14:42:47.685Z" }, + { url = "https://files.pythonhosted.org/packages/77/bf/1bf8c9a8e91836c926eae5e3e51dce009bf495a60ca56060689d3df3f340/pydantic_core-2.46.3-cp313-cp313-manylinux_2_31_riscv64.whl", hash = "sha256:bcf2a8b2982a6673693eae7348ef3d8cf3979c1d63b54fca7c397a635cc68687", size = 2133368, upload-time = "2026-04-20T14:41:22.766Z" }, + { url = "https://files.pythonhosted.org/packages/e5/50/87d818d6bab915984995157ceb2380f5aac4e563dddbed6b56f0ed057aba/pydantic_core-2.46.3-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:28e8cf2f52d72ced402a137145923a762cbb5081e48b34312f7a0c8f55928ec3", size = 2173908, upload-time = 
"2026-04-20T14:42:52.044Z" }, + { url = "https://files.pythonhosted.org/packages/91/88/a311fb306d0bd6185db41fa14ae888fb81d0baf648a761ae760d30819d33/pydantic_core-2.46.3-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:17eaface65d9fc5abb940003020309c1bf7a211f5f608d7870297c367e6f9022", size = 2186422, upload-time = "2026-04-20T14:43:29.55Z" }, + { url = "https://files.pythonhosted.org/packages/8f/79/28fd0d81508525ab2054fef7c77a638c8b5b0afcbbaeee493cf7c3fef7e1/pydantic_core-2.46.3-cp313-cp313-musllinux_1_1_armv7l.whl", hash = "sha256:93fd339f23408a07e98950a89644f92c54d8729719a40b30c0a30bb9ebc55d23", size = 2332709, upload-time = "2026-04-20T14:42:16.134Z" }, + { url = "https://files.pythonhosted.org/packages/b3/21/795bf5fe5c0f379308b8ef19c50dedab2e7711dbc8d0c2acf08f1c7daa05/pydantic_core-2.46.3-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:23cbdb3aaa74dfe0837975dbf69b469753bbde8eacace524519ffdb6b6e89eb7", size = 2372428, upload-time = "2026-04-20T14:41:10.974Z" }, + { url = "https://files.pythonhosted.org/packages/45/b3/ed14c659cbe7605e3ef063077680a64680aec81eb1a04763a05190d49b7f/pydantic_core-2.46.3-cp313-cp313-win32.whl", hash = "sha256:610eda2e3838f401105e6326ca304f5da1e15393ae25dacae5c5c63f2c275b13", size = 1965601, upload-time = "2026-04-20T14:41:42.128Z" }, + { url = "https://files.pythonhosted.org/packages/ef/bb/adb70d9a762ddd002d723fbf1bd492244d37da41e3af7b74ad212609027e/pydantic_core-2.46.3-cp313-cp313-win_amd64.whl", hash = "sha256:68cc7866ed863db34351294187f9b729964c371ba33e31c26f478471c52e1ed0", size = 2071517, upload-time = "2026-04-20T14:43:36.096Z" }, + { url = "https://files.pythonhosted.org/packages/52/eb/66faefabebfe68bd7788339c9c9127231e680b11906368c67ce112fdb47f/pydantic_core-2.46.3-cp313-cp313-win_arm64.whl", hash = "sha256:f64b5537ac62b231572879cd08ec05600308636a5d63bcbdb15063a466977bec", size = 2035802, upload-time = "2026-04-20T14:43:38.507Z" }, + { url = 
"https://files.pythonhosted.org/packages/7f/db/a7bcb4940183fda36022cd18ba8dd12f2dff40740ec7b58ce7457befa416/pydantic_core-2.46.3-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:afa3aa644f74e290cdede48a7b0bee37d1c35e71b05105f6b340d484af536d9b", size = 2097614, upload-time = "2026-04-20T14:44:38.374Z" }, + { url = "https://files.pythonhosted.org/packages/24/35/e4066358a22e3e99519db370494c7528f5a2aa1367370e80e27e20283543/pydantic_core-2.46.3-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:ced3310e51aa425f7f77da8bbbb5212616655bedbe82c70944320bc1dbe5e018", size = 1951896, upload-time = "2026-04-20T14:40:53.996Z" }, + { url = "https://files.pythonhosted.org/packages/87/92/37cf4049d1636996e4b888c05a501f40a43ff218983a551d57f9d5e14f0d/pydantic_core-2.46.3-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e29908922ce9da1a30b4da490bd1d3d82c01dcfdf864d2a74aacee674d0bfa34", size = 1979314, upload-time = "2026-04-20T14:41:49.446Z" }, + { url = "https://files.pythonhosted.org/packages/d8/36/9ff4d676dfbdfb2d591cf43f3d90ded01e15b1404fd101180ed2d62a2fd3/pydantic_core-2.46.3-cp314-cp314-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:0c9ff69140423eea8ed2d5477df3ba037f671f5e897d206d921bc9fdc39613e7", size = 2056133, upload-time = "2026-04-20T14:42:23.574Z" }, + { url = "https://files.pythonhosted.org/packages/bc/f0/405b442a4d7ba855b06eec8b2bf9c617d43b8432d099dfdc7bf999293495/pydantic_core-2.46.3-cp314-cp314-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:b675ab0a0d5b1c8fdb81195dc5bcefea3f3c240871cdd7ff9a2de8aa50772eb2", size = 2228726, upload-time = "2026-04-20T14:44:22.816Z" }, + { url = "https://files.pythonhosted.org/packages/e7/f8/65cd92dd5a0bd89ba277a98ecbfaf6fc36bbd3300973c7a4b826d6ab1391/pydantic_core-2.46.3-cp314-cp314-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:0087084960f209a9a4af50ecd1fb063d9ad3658c07bb81a7a53f452dacbfb2ba", size = 2301214, upload-time = "2026-04-20T14:44:48.792Z" }, + { 
url = "https://files.pythonhosted.org/packages/fd/86/ef96a4c6e79e7a2d0410826a68fbc0eccc0fd44aa733be199d5fcac3bb87/pydantic_core-2.46.3-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ed42e6cc8e1b0e2b9b96e2276bad70ae625d10d6d524aed0c93de974ae029f9f", size = 2099927, upload-time = "2026-04-20T14:41:40.196Z" }, + { url = "https://files.pythonhosted.org/packages/6d/53/269caf30e0096e0a8a8f929d1982a27b3879872cca2d917d17c2f9fdf4fe/pydantic_core-2.46.3-cp314-cp314-manylinux_2_31_riscv64.whl", hash = "sha256:f1771ce258afb3e4201e67d154edbbae712a76a6081079fe247c2f53c6322c22", size = 2128789, upload-time = "2026-04-20T14:41:15.868Z" }, + { url = "https://files.pythonhosted.org/packages/00/b0/1a6d9b6a587e118482910c244a1c5acf4d192604174132efd12bf0ac486f/pydantic_core-2.46.3-cp314-cp314-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:a7610b6a5242a6c736d8ad47fd5fff87fcfe8f833b281b1c409c3d6835d9227f", size = 2173815, upload-time = "2026-04-20T14:44:25.152Z" }, + { url = "https://files.pythonhosted.org/packages/87/56/e7e00d4041a7e62b5a40815590114db3b535bf3ca0bf4dca9f16cef25246/pydantic_core-2.46.3-cp314-cp314-musllinux_1_1_aarch64.whl", hash = "sha256:ff5e7783bcc5476e1db448bf268f11cb257b1c276d3e89f00b5727be86dd0127", size = 2181608, upload-time = "2026-04-20T14:41:28.933Z" }, + { url = "https://files.pythonhosted.org/packages/e8/22/4bd23c3d41f7c185d60808a1de83c76cf5aeabf792f6c636a55c3b1ec7f9/pydantic_core-2.46.3-cp314-cp314-musllinux_1_1_armv7l.whl", hash = "sha256:9d2e32edcc143bc01e95300671915d9ca052d4f745aa0a49c48d4803f8a85f2c", size = 2326968, upload-time = "2026-04-20T14:42:03.962Z" }, + { url = "https://files.pythonhosted.org/packages/24/ac/66cd45129e3915e5ade3b292cb3bc7fd537f58f8f8dbdaba6170f7cabb74/pydantic_core-2.46.3-cp314-cp314-musllinux_1_1_x86_64.whl", hash = "sha256:6e42d83d1c6b87fa56b521479cff237e626a292f3b31b6345c15a99121b454c1", size = 2369842, upload-time = "2026-04-20T14:41:35.52Z" }, + { url = 
"https://files.pythonhosted.org/packages/a2/51/dd4248abb84113615473aa20d5545b7c4cd73c8644003b5259686f93996c/pydantic_core-2.46.3-cp314-cp314-win32.whl", hash = "sha256:07bc6d2a28c3adb4f7c6ae46aa4f2d2929af127f587ed44057af50bf1ce0f505", size = 1959661, upload-time = "2026-04-20T14:41:00.042Z" }, + { url = "https://files.pythonhosted.org/packages/20/eb/59980e5f1ae54a3b86372bd9f0fa373ea2d402e8cdcd3459334430f91e91/pydantic_core-2.46.3-cp314-cp314-win_amd64.whl", hash = "sha256:8940562319bc621da30714617e6a7eaa6b98c84e8c685bcdc02d7ed5e7c7c44e", size = 2071686, upload-time = "2026-04-20T14:43:16.471Z" }, + { url = "https://files.pythonhosted.org/packages/8c/db/1cf77e5247047dfee34bc01fa9bca134854f528c8eb053e144298893d370/pydantic_core-2.46.3-cp314-cp314-win_arm64.whl", hash = "sha256:5dcbbcf4d22210ced8f837c96db941bdb078f419543472aca5d9a0bb7cddc7df", size = 2026907, upload-time = "2026-04-20T14:43:31.732Z" }, + { url = "https://files.pythonhosted.org/packages/57/c0/b3df9f6a543276eadba0a48487b082ca1f201745329d97dbfa287034a230/pydantic_core-2.46.3-cp314-cp314t-macosx_10_12_x86_64.whl", hash = "sha256:d0fe3dce1e836e418f912c1ad91c73357d03e556a4d286f441bf34fed2dbeecf", size = 2095047, upload-time = "2026-04-20T14:42:37.982Z" }, + { url = "https://files.pythonhosted.org/packages/66/57/886a938073b97556c168fd99e1a7305bb363cd30a6d2c76086bf0587b32a/pydantic_core-2.46.3-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:9ce92e58abc722dac1bf835a6798a60b294e48eb0e625ec9fd994b932ac5feee", size = 1934329, upload-time = "2026-04-20T14:43:49.655Z" }, + { url = "https://files.pythonhosted.org/packages/0b/7c/b42eaa5c34b13b07ecb51da21761297a9b8eb43044c864a035999998f328/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a03e6467f0f5ab796a486146d1b887b2dc5e5f9b3288898c1b1c3ad974e53e4a", size = 1974847, upload-time = "2026-04-20T14:42:10.737Z" }, + { url = 
"https://files.pythonhosted.org/packages/e6/9b/92b42db6543e7de4f99ae977101a2967b63122d4b6cf7773812da2d7d5b5/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:2798b6ba041b9d70acfb9071a2ea13c8456dd1e6a5555798e41ba7b0790e329c", size = 2041742, upload-time = "2026-04-20T14:40:44.262Z" }, + { url = "https://files.pythonhosted.org/packages/0f/19/46fbe1efabb5aa2834b43b9454e70f9a83ad9c338c1291e48bdc4fecf167/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:9be3e221bdc6d69abf294dcf7aff6af19c31a5cdcc8f0aa3b14be29df4bd03b1", size = 2236235, upload-time = "2026-04-20T14:41:27.307Z" }, + { url = "https://files.pythonhosted.org/packages/77/da/b3f95bc009ad60ec53120f5d16c6faa8cabdbe8a20d83849a1f2b8728148/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:f13936129ce841f2a5ddf6f126fea3c43cd128807b5a59588c37cf10178c2e64", size = 2282633, upload-time = "2026-04-20T14:44:33.271Z" }, + { url = "https://files.pythonhosted.org/packages/cc/6e/401336117722e28f32fb8220df676769d28ebdf08f2f4469646d404c43a3/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:28b5f2ef03416facccb1c6ef744c69793175fd27e44ef15669201601cf423acb", size = 2109679, upload-time = "2026-04-20T14:44:41.065Z" }, + { url = "https://files.pythonhosted.org/packages/fc/53/b289f9bc8756a32fe718c46f55afaeaf8d489ee18d1a1e7be1db73f42cc4/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_31_riscv64.whl", hash = "sha256:830d1247d77ad23852314f069e9d7ddafeec5f684baf9d7e7065ed46a049c4e6", size = 2108342, upload-time = "2026-04-20T14:42:50.144Z" }, + { url = "https://files.pythonhosted.org/packages/10/5b/8292fc7c1f9111f1b2b7c1b0dcf1179edcd014fc3ea4517499f50b829d71/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:d0793c90c1a3c74966e7975eaef3ed30ebdff3260a0f815a62a22adc17e4c01c", size = 2157208, upload-time = 
"2026-04-20T14:42:08.133Z" }, + { url = "https://files.pythonhosted.org/packages/2b/9e/f80044e9ec07580f057a89fc131f78dda7a58751ddf52bbe05eaf31db50f/pydantic_core-2.46.3-cp314-cp314t-musllinux_1_1_aarch64.whl", hash = "sha256:d2d0aead851b66f5245ec0c4fb2612ef457f8bbafefdf65a2bf9d6bac6140f47", size = 2167237, upload-time = "2026-04-20T14:42:25.412Z" }, + { url = "https://files.pythonhosted.org/packages/f8/84/6781a1b037f3b96be9227edbd1101f6d3946746056231bf4ac48cdff1a8d/pydantic_core-2.46.3-cp314-cp314t-musllinux_1_1_armv7l.whl", hash = "sha256:2f40e4246676beb31c5ce77c38a55ca4e465c6b38d11ea1bd935420568e0b1ab", size = 2312540, upload-time = "2026-04-20T14:40:40.313Z" }, + { url = "https://files.pythonhosted.org/packages/3e/db/19c0839feeb728e7df03255581f198dfdf1c2aeb1e174a8420b63c5252e5/pydantic_core-2.46.3-cp314-cp314t-musllinux_1_1_x86_64.whl", hash = "sha256:cf489cf8986c543939aeee17a09c04d6ffb43bfef8ca16fcbcc5cfdcbed24dba", size = 2369556, upload-time = "2026-04-20T14:41:09.427Z" }, + { url = "https://files.pythonhosted.org/packages/e0/15/3228774cb7cd45f5f721ddf1b2242747f4eb834d0c491f0c02d606f09fed/pydantic_core-2.46.3-cp314-cp314t-win32.whl", hash = "sha256:ffe0883b56cfc05798bf994164d2b2ff03efe2d22022a2bb080f3b626176dd56", size = 1949756, upload-time = "2026-04-20T14:41:25.717Z" }, + { url = "https://files.pythonhosted.org/packages/b8/2a/c79cf53fd91e5a87e30d481809f52f9a60dd221e39de66455cf04deaad37/pydantic_core-2.46.3-cp314-cp314t-win_amd64.whl", hash = "sha256:706d9d0ce9cf4593d07270d8e9f53b161f90c57d315aeec4fb4fd7a8b10240d8", size = 2051305, upload-time = "2026-04-20T14:43:18.627Z" }, + { url = "https://files.pythonhosted.org/packages/0b/db/d8182a7f1d9343a032265aae186eb063fe26ca4c40f256b21e8da4498e89/pydantic_core-2.46.3-cp314-cp314t-win_arm64.whl", hash = "sha256:77706aeb41df6a76568434701e0917da10692da28cb69d5fb6919ce5fdb07374", size = 2026310, upload-time = "2026-04-20T14:41:01.778Z" }, + { url = 
"https://files.pythonhosted.org/packages/66/7f/03dbad45cd3aa9083fbc93c210ae8b005af67e4136a14186950a747c6874/pydantic_core-2.46.3-graalpy311-graalpy242_311_native-macosx_10_12_x86_64.whl", hash = "sha256:9715525891ed524a0a1eb6d053c74d4d4ad5017677fb00af0b7c2644a31bae46", size = 2105683, upload-time = "2026-04-20T14:42:19.779Z" }, + { url = "https://files.pythonhosted.org/packages/26/22/4dc186ac8ea6b257e9855031f51b62a9637beac4d68ac06bee02f046f836/pydantic_core-2.46.3-graalpy311-graalpy242_311_native-macosx_11_0_arm64.whl", hash = "sha256:9d2f400712a99a013aff420ef1eb9be077f8189a36c1e3ef87660b4e1088a874", size = 1940052, upload-time = "2026-04-20T14:43:59.274Z" }, + { url = "https://files.pythonhosted.org/packages/0d/ca/d376391a5aff1f2e8188960d7873543608130a870961c2b6b5236627c116/pydantic_core-2.46.3-graalpy311-graalpy242_311_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:bd2aab0e2e9dc2daf36bd2686c982535d5e7b1d930a1344a7bb6e82baab42a76", size = 1988172, upload-time = "2026-04-20T14:41:17.469Z" }, + { url = "https://files.pythonhosted.org/packages/0e/6b/523b9f85c23788755d6ab949329de692a2e3a584bc6beb67fef5e035aa9d/pydantic_core-2.46.3-graalpy311-graalpy242_311_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4e9d76736da5f362fabfeea6a69b13b7f2be405c6d6966f06b2f6bfff7e64531", size = 2128596, upload-time = "2026-04-20T14:40:41.707Z" }, + { url = "https://files.pythonhosted.org/packages/34/42/f426db557e8ab2791bc7562052299944a118655496fbff99914e564c0a94/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-macosx_10_12_x86_64.whl", hash = "sha256:b12dd51f1187c2eb489af8e20f880362db98e954b54ab792fa5d92e8bcc6b803", size = 2091877, upload-time = "2026-04-20T14:43:27.091Z" }, + { url = "https://files.pythonhosted.org/packages/5c/4f/86a832a9d14df58e663bfdf4627dc00d3317c2bd583c4fb23390b0f04b8e/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-macosx_11_0_arm64.whl", hash = 
"sha256:f00a0961b125f1a47af7bcc17f00782e12f4cd056f83416006b30111d941dfa3", size = 1932428, upload-time = "2026-04-20T14:40:45.781Z" }, + { url = "https://files.pythonhosted.org/packages/11/1a/fe857968954d93fb78e0d4b6df5c988c74c4aaa67181c60be7cfe327c0ca/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:57697d7c056aca4bbb680200f96563e841a6386ac1129370a0102592f4dddff5", size = 1997550, upload-time = "2026-04-20T14:44:02.425Z" }, + { url = "https://files.pythonhosted.org/packages/17/eb/9d89ad2d9b0ba8cd65393d434471621b98912abb10fbe1df08e480ba57b5/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fd35aa21299def8db7ef4fe5c4ff862941a9a158ca7b63d61e66fe67d30416b4", size = 2137657, upload-time = "2026-04-20T14:42:45.149Z" }, + { url = "https://files.pythonhosted.org/packages/1f/da/99d40830684f81dec901cac521b5b91c095394cc1084b9433393cde1c2df/pydantic_core-2.46.3-pp311-pypy311_pp73-macosx_10_12_x86_64.whl", hash = "sha256:13afdd885f3d71280cf286b13b310ee0f7ccfefd1dbbb661514a474b726e2f25", size = 2107973, upload-time = "2026-04-20T14:42:06.175Z" }, + { url = "https://files.pythonhosted.org/packages/99/a5/87024121818d75bbb2a98ddbaf638e40e7a18b5e0f5492c9ca4b1b316107/pydantic_core-2.46.3-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:f91c0aff3e3ee0928edd1232c57f643a7a003e6edf1860bc3afcdc749cb513f3", size = 1947191, upload-time = "2026-04-20T14:43:14.319Z" }, + { url = "https://files.pythonhosted.org/packages/60/62/0c1acfe10945b83a6a59d19fbaa92f48825381509e5701b855c08f13db76/pydantic_core-2.46.3-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6529d1d128321a58d30afcc97b49e98836542f68dd41b33c2e972bb9e5290536", size = 2123791, upload-time = "2026-04-20T14:43:22.766Z" }, + { url = 
"https://files.pythonhosted.org/packages/75/3e/3b2393b4c8f44285561dc30b00cf307a56a2eff7c483a824db3b8221ca51/pydantic_core-2.46.3-pp311-pypy311_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:975c267cff4f7e7272eacbe50f6cc03ca9a3da4c4fbd66fffd89c94c1e311aa1", size = 2153197, upload-time = "2026-04-20T14:44:27.932Z" }, + { url = "https://files.pythonhosted.org/packages/ba/75/5af02fb35505051eee727c061f2881c555ab4f8ddb2d42da715a42c9731b/pydantic_core-2.46.3-pp311-pypy311_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:2b8e4f2bbdf71415c544b4b1138b8060db7b6611bc927e8064c769f64bed651c", size = 2181073, upload-time = "2026-04-20T14:43:20.729Z" }, + { url = "https://files.pythonhosted.org/packages/10/92/7e0e1bd9ca3c68305db037560ca2876f89b2647deb2f8b6319005de37505/pydantic_core-2.46.3-pp311-pypy311_pp73-musllinux_1_1_armv7l.whl", hash = "sha256:e61ea8e9fff9606d09178f577ff8ccdd7206ff73d6552bcec18e1033c4254b85", size = 2315886, upload-time = "2026-04-20T14:44:04.826Z" }, + { url = "https://files.pythonhosted.org/packages/b8/d8/101655f27eaf3e44558ead736b2795d12500598beed4683f279396fa186e/pydantic_core-2.46.3-pp311-pypy311_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:b504bda01bafc69b6d3c7a0c7f039dcf60f47fab70e06fe23f57b5c75bdc82b8", size = 2360528, upload-time = "2026-04-20T14:40:47.431Z" }, + { url = "https://files.pythonhosted.org/packages/07/0f/1c34a74c8d07136f0d729ffe5e1fdab04fbdaa7684f61a92f92511a84a15/pydantic_core-2.46.3-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:b00b76f7142fc60c762ce579bd29c8fa44aaa56592dd3c54fab3928d0d4ca6ff", size = 2184144, upload-time = "2026-04-20T14:42:57Z" }, +] + [[package]] name = "pygments" -version = "2.19.2" +version = "2.20.0" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/b0/77/a5b8c569bf593b0140bde72ea885a803b82086995367bf2037de0159d924/pygments-2.19.2.tar.gz", hash = "sha256:636cb2477cec7f8952536970bc533bc43743542f70392ae026374600add5b887", size = 4968631, 
upload-time = "2025-06-21T13:39:12.283Z" } +sdist = { url = "https://files.pythonhosted.org/packages/c3/b2/bc9c9196916376152d655522fdcebac55e66de6603a76a02bca1b6414f6c/pygments-2.20.0.tar.gz", hash = "sha256:6757cd03768053ff99f3039c1a36d6c0aa0b263438fcab17520b30a303a82b5f", size = 4955991, upload-time = "2026-03-29T13:29:33.898Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/c7/21/705964c7812476f378728bdf590ca4b771ec72385c533964653c68e86bdc/pygments-2.19.2-py3-none-any.whl", hash = "sha256:86540386c03d588bb81d44bc3928634ff26449851e99741617ecb9037ee5ec0b", size = 1225217, upload-time = "2025-06-21T13:39:07.939Z" }, + { url = "https://files.pythonhosted.org/packages/f4/7e/a72dd26f3b0f4f2bf1dd8923c85f7ceb43172af56d63c7383eb62b332364/pygments-2.20.0-py3-none-any.whl", hash = "sha256:81a9e26dd42fd28a23a2d169d86d7ac03b46e2f8b59ed4698fb4785f946d0176", size = 1231151, upload-time = "2026-03-29T13:29:30.038Z" }, ] [[package]] name = "pytest" -version = "9.0.2" +version = "9.0.3" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "colorama", marker = "sys_platform == 'win32'" }, @@ -316,23 +481,23 @@ dependencies = [ { name = "pygments" }, { name = "tomli", marker = "python_full_version < '3.11'" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/d1/db/7ef3487e0fb0049ddb5ce41d3a49c235bf9ad299b6a25d5780a89f19230f/pytest-9.0.2.tar.gz", hash = "sha256:75186651a92bd89611d1d9fc20f0b4345fd827c41ccd5c299a868a05d70edf11", size = 1568901, upload-time = "2025-12-06T21:30:51.014Z" } +sdist = { url = "https://files.pythonhosted.org/packages/7d/0d/549bd94f1a0a402dc8cf64563a117c0f3765662e2e668477624baeec44d5/pytest-9.0.3.tar.gz", hash = "sha256:b86ada508af81d19edeb213c681b1d48246c1a91d304c6c81a427674c17eb91c", size = 1572165, upload-time = "2026-04-07T17:16:18.027Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/3b/ab/b3226f0bd7cdcf710fbede2b3548584366da3b19b5021e74f5bde2a8fa3f/pytest-9.0.2-py3-none-any.whl", hash = 
"sha256:711ffd45bf766d5264d487b917733b453d917afd2b0ad65223959f59089f875b", size = 374801, upload-time = "2025-12-06T21:30:49.154Z" }, + { url = "https://files.pythonhosted.org/packages/d4/24/a372aaf5c9b7208e7112038812994107bc65a84cd00e0354a88c2c77a617/pytest-9.0.3-py3-none-any.whl", hash = "sha256:2c5efc453d45394fdd706ade797c0a81091eccd1d6e4bccfcd476e2b8e0ab5d9", size = 375249, upload-time = "2026-04-07T17:16:16.13Z" }, ] [[package]] name = "pytest-cov" -version = "7.0.0" +version = "7.1.0" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "coverage", extra = ["toml"] }, { name = "pluggy" }, { name = "pytest" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/5e/f7/c933acc76f5208b3b00089573cf6a2bc26dc80a8aece8f52bb7d6b1855ca/pytest_cov-7.0.0.tar.gz", hash = "sha256:33c97eda2e049a0c5298e91f519302a1334c26ac65c1a483d6206fd458361af1", size = 54328, upload-time = "2025-09-09T10:57:02.113Z" } +sdist = { url = "https://files.pythonhosted.org/packages/b1/51/a849f96e117386044471c8ec2bd6cfebacda285da9525c9106aeb28da671/pytest_cov-7.1.0.tar.gz", hash = "sha256:30674f2b5f6351aa09702a9c8c364f6a01c27aae0c1366ae8016160d1efc56b2", size = 55592, upload-time = "2026-03-21T20:11:16.284Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/ee/49/1377b49de7d0c1ce41292161ea0f721913fa8722c19fb9c1e3aa0367eecb/pytest_cov-7.0.0-py3-none-any.whl", hash = "sha256:3b8e9558b16cc1479da72058bdecf8073661c7f57f7d3c5f22a1c23507f2d861", size = 22424, upload-time = "2025-09-09T10:57:00.695Z" }, + { url = "https://files.pythonhosted.org/packages/9d/7a/d968e294073affff457b041c2be9868a40c1c71f4a35fcc1e45e5493067b/pytest_cov-7.1.0-py3-none-any.whl", hash = "sha256:a0461110b7865f9a271aa1b51e516c9a95de9d696734a2f71e3e78f46e1d4678", size = 22876, upload-time = "2026-03-21T20:11:14.438Z" }, ] [[package]] @@ -452,40 +617,40 @@ wheels = [ [[package]] name = "rich" -version = "14.3.2" +version = "14.3.4" source = { registry = "https://pypi.org/simple" } 
dependencies = [ { name = "markdown-it-py" }, { name = "pygments" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/74/99/a4cab2acbb884f80e558b0771e97e21e939c5dfb460f488d19df485e8298/rich-14.3.2.tar.gz", hash = "sha256:e712f11c1a562a11843306f5ed999475f09ac31ffb64281f73ab29ffdda8b3b8", size = 230143, upload-time = "2026-02-01T16:20:47.908Z" } +sdist = { url = "https://files.pythonhosted.org/packages/e9/67/cae617f1351490c25a4b8ac3b8b63a4dda609295d8222bad12242dfdc629/rich-14.3.4.tar.gz", hash = "sha256:817e02727f2b25b40ef56f5aa2217f400c8489f79ca8f46ea2b70dd5e14558a9", size = 230524, upload-time = "2026-04-11T02:57:45.419Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/ef/45/615f5babd880b4bd7d405cc0dc348234c5ffb6ed1ea33e152ede08b2072d/rich-14.3.2-py3-none-any.whl", hash = "sha256:08e67c3e90884651da3239ea668222d19bea7b589149d8014a21c633420dbb69", size = 309963, upload-time = "2026-02-01T16:20:46.078Z" }, + { url = "https://files.pythonhosted.org/packages/b3/76/6d163cfac87b632216f71879e6b2cf17163f773ff59c00b5ff4900a80fa3/rich-14.3.4-py3-none-any.whl", hash = "sha256:07e7adb4690f68864777b1450859253bed81a99a31ac321ac1817b2313558952", size = 310480, upload-time = "2026-04-11T02:57:47.484Z" }, ] [[package]] name = "ruff" -version = "0.15.0" +version = "0.15.12" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/c8/39/5cee96809fbca590abea6b46c6d1c586b49663d1d2830a751cc8fc42c666/ruff-0.15.0.tar.gz", hash = "sha256:6bdea47cdbea30d40f8f8d7d69c0854ba7c15420ec75a26f463290949d7f7e9a", size = 4524893, upload-time = "2026-02-03T17:53:35.357Z" } +sdist = { url = "https://files.pythonhosted.org/packages/99/43/3291f1cc9106f4c63bdce7a8d0df5047fe8422a75b091c16b5e9355e0b11/ruff-0.15.12.tar.gz", hash = "sha256:ecea26adb26b4232c0c2ca19ccbc0083a68344180bba2a600605538ce51a40a6", size = 4643852, upload-time = "2026-04-24T18:17:14.305Z" } wheels = [ - { url = 
"https://files.pythonhosted.org/packages/bc/88/3fd1b0aa4b6330d6aaa63a285bc96c9f71970351579152d231ed90914586/ruff-0.15.0-py3-none-linux_armv6l.whl", hash = "sha256:aac4ebaa612a82b23d45964586f24ae9bc23ca101919f5590bdb368d74ad5455", size = 10354332, upload-time = "2026-02-03T17:52:54.892Z" }, - { url = "https://files.pythonhosted.org/packages/72/f6/62e173fbb7eb75cc29fe2576a1e20f0a46f671a2587b5f604bfb0eaf5f6f/ruff-0.15.0-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:dcd4be7cc75cfbbca24a98d04d0b9b36a270d0833241f776b788d59f4142b14d", size = 10767189, upload-time = "2026-02-03T17:53:19.778Z" }, - { url = "https://files.pythonhosted.org/packages/99/e4/968ae17b676d1d2ff101d56dc69cf333e3a4c985e1ec23803df84fc7bf9e/ruff-0.15.0-py3-none-macosx_11_0_arm64.whl", hash = "sha256:d747e3319b2bce179c7c1eaad3d884dc0a199b5f4d5187620530adf9105268ce", size = 10075384, upload-time = "2026-02-03T17:53:29.241Z" }, - { url = "https://files.pythonhosted.org/packages/a2/bf/9843c6044ab9e20af879c751487e61333ca79a2c8c3058b15722386b8cae/ruff-0.15.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:650bd9c56ae03102c51a5e4b554d74d825ff3abe4db22b90fd32d816c2e90621", size = 10481363, upload-time = "2026-02-03T17:52:43.332Z" }, - { url = "https://files.pythonhosted.org/packages/55/d9/4ada5ccf4cd1f532db1c8d44b6f664f2208d3d93acbeec18f82315e15193/ruff-0.15.0-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:a6664b7eac559e3048223a2da77769c2f92b43a6dfd4720cef42654299a599c9", size = 10187736, upload-time = "2026-02-03T17:53:00.522Z" }, - { url = "https://files.pythonhosted.org/packages/86/e2/f25eaecd446af7bb132af0a1d5b135a62971a41f5366ff41d06d25e77a91/ruff-0.15.0-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6f811f97b0f092b35320d1556f3353bf238763420ade5d9e62ebd2b73f2ff179", size = 10968415, upload-time = "2026-02-03T17:53:15.705Z" }, - { url = 
"https://files.pythonhosted.org/packages/e7/dc/f06a8558d06333bf79b497d29a50c3a673d9251214e0d7ec78f90b30aa79/ruff-0.15.0-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:761ec0a66680fab6454236635a39abaf14198818c8cdf691e036f4bc0f406b2d", size = 11809643, upload-time = "2026-02-03T17:53:23.031Z" }, - { url = "https://files.pythonhosted.org/packages/dd/45/0ece8db2c474ad7df13af3a6d50f76e22a09d078af63078f005057ca59eb/ruff-0.15.0-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:940f11c2604d317e797b289f4f9f3fa5555ffe4fb574b55ed006c3d9b6f0eb78", size = 11234787, upload-time = "2026-02-03T17:52:46.432Z" }, - { url = "https://files.pythonhosted.org/packages/8a/d9/0e3a81467a120fd265658d127db648e4d3acfe3e4f6f5d4ea79fac47e587/ruff-0.15.0-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bcbca3d40558789126da91d7ef9a7c87772ee107033db7191edefa34e2c7f1b4", size = 11112797, upload-time = "2026-02-03T17:52:49.274Z" }, - { url = "https://files.pythonhosted.org/packages/b2/cb/8c0b3b0c692683f8ff31351dfb6241047fa873a4481a76df4335a8bff716/ruff-0.15.0-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:9a121a96db1d75fa3eb39c4539e607f628920dd72ff1f7c5ee4f1b768ac62d6e", size = 11033133, upload-time = "2026-02-03T17:53:33.105Z" }, - { url = "https://files.pythonhosted.org/packages/f8/5e/23b87370cf0f9081a8c89a753e69a4e8778805b8802ccfe175cc410e50b9/ruff-0.15.0-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:5298d518e493061f2eabd4abd067c7e4fb89e2f63291c94332e35631c07c3662", size = 10442646, upload-time = "2026-02-03T17:53:06.278Z" }, - { url = "https://files.pythonhosted.org/packages/e1/9a/3c94de5ce642830167e6d00b5c75aacd73e6347b4c7fc6828699b150a5ee/ruff-0.15.0-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:afb6e603d6375ff0d6b0cee563fa21ab570fd15e65c852cb24922cef25050cf1", size = 10195750, upload-time = "2026-02-03T17:53:26.084Z" }, - { url = 
"https://files.pythonhosted.org/packages/30/15/e396325080d600b436acc970848d69df9c13977942fb62bb8722d729bee8/ruff-0.15.0-py3-none-musllinux_1_2_i686.whl", hash = "sha256:77e515f6b15f828b94dc17d2b4ace334c9ddb7d9468c54b2f9ed2b9c1593ef16", size = 10676120, upload-time = "2026-02-03T17:53:09.363Z" }, - { url = "https://files.pythonhosted.org/packages/8d/c9/229a23d52a2983de1ad0fb0ee37d36e0257e6f28bfd6b498ee2c76361874/ruff-0.15.0-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:6f6e80850a01eb13b3e42ee0ebdf6e4497151b48c35051aab51c101266d187a3", size = 11201636, upload-time = "2026-02-03T17:52:57.281Z" }, - { url = "https://files.pythonhosted.org/packages/6f/b0/69adf22f4e24f3677208adb715c578266842e6e6a3cc77483f48dd999ede/ruff-0.15.0-py3-none-win32.whl", hash = "sha256:238a717ef803e501b6d51e0bdd0d2c6e8513fe9eec14002445134d3907cd46c3", size = 10465945, upload-time = "2026-02-03T17:53:12.591Z" }, - { url = "https://files.pythonhosted.org/packages/51/ad/f813b6e2c97e9b4598be25e94a9147b9af7e60523b0cb5d94d307c15229d/ruff-0.15.0-py3-none-win_amd64.whl", hash = "sha256:dd5e4d3301dc01de614da3cdffc33d4b1b96fb89e45721f1598e5532ccf78b18", size = 11564657, upload-time = "2026-02-03T17:52:51.893Z" }, - { url = "https://files.pythonhosted.org/packages/f6/b0/2d823f6e77ebe560f4e397d078487e8d52c1516b331e3521bc75db4272ca/ruff-0.15.0-py3-none-win_arm64.whl", hash = "sha256:c480d632cc0ca3f0727acac8b7d053542d9e114a462a145d0b00e7cd658c515a", size = 10865753, upload-time = "2026-02-03T17:53:03.014Z" }, + { url = "https://files.pythonhosted.org/packages/c3/6e/e78ffb61d4686f3d96ba3df2c801161843746dcbcbb17a1e927d4829312b/ruff-0.15.12-py3-none-linux_armv6l.whl", hash = "sha256:f86f176e188e94d6bdbc09f09bfd9dc729059ad93d0e7390b5a73efe19f8861c", size = 10640713, upload-time = "2026-04-24T18:17:22.841Z" }, + { url = "https://files.pythonhosted.org/packages/ae/08/a317bc231fb9e7b93e4ef3089501e51922ff88d6936ce5cf870c4fe55419/ruff-0.15.12-py3-none-macosx_10_12_x86_64.whl", hash = 
"sha256:e3bcd123364c3770b8e1b7baaf343cc99a35f197c5c6e8af79015c666c423a6c", size = 11069267, upload-time = "2026-04-24T18:17:30.105Z" }, + { url = "https://files.pythonhosted.org/packages/aa/a4/f828e9718d3dce1f5f11c39c4f65afd32783c8b2aebb2e3d259e492c47bd/ruff-0.15.12-py3-none-macosx_11_0_arm64.whl", hash = "sha256:fe87510d000220aa1ed530d4448a7c696a0cae1213e5ec30e5874287b66557b5", size = 10397182, upload-time = "2026-04-24T18:17:07.177Z" }, + { url = "https://files.pythonhosted.org/packages/71/e0/3310fc6d1b5e1fdea22bf3b1b807c7e187b581021b0d7d4514cccdb5fb71/ruff-0.15.12-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:84a1630093121375a3e2a95b4a6dc7b59e2b4ee76216e32d81aae550a832d002", size = 10758012, upload-time = "2026-04-24T18:16:55.759Z" }, + { url = "https://files.pythonhosted.org/packages/11/c1/a606911aee04c324ddaa883ae418f3569792fd3c4a10c50e0dd0a2311e1e/ruff-0.15.12-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:fb129f40f114f089ebe0ca56c0d251cf2061b17651d464bb6478dc01e69f11f5", size = 10447479, upload-time = "2026-04-24T18:16:51.677Z" }, + { url = "https://files.pythonhosted.org/packages/9d/68/4201e8444f0894f21ab4aeeaee68aa4f10b51613514a20d80bd628d57e88/ruff-0.15.12-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:b0c862b172d695db7598426b8af465e7e9ac00a3ea2a3630ee67eb82e366aaa6", size = 11234040, upload-time = "2026-04-24T18:17:16.529Z" }, + { url = "https://files.pythonhosted.org/packages/34/ff/8a6d6cf4ccc23fd67060874e832c18919d1557a0611ebef03fdb01fff11e/ruff-0.15.12-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:2849ea9f3484c3aca43a82f484210370319e7170df4dfe4843395ddf6c57bc33", size = 12087377, upload-time = "2026-04-24T18:17:04.944Z" }, + { url = "https://files.pythonhosted.org/packages/85/f6/c669cf73f5152f623d34e69866a46d5e6185816b19fcd5b6dd8a2d299922/ruff-0.15.12-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = 
"sha256:9e77c7e51c07fe396826d5969a5b846d9cd4c402535835fb6e21ce8b28fef847", size = 11367784, upload-time = "2026-04-24T18:17:25.409Z" }, + { url = "https://files.pythonhosted.org/packages/e8/39/c61d193b8a1daaa8977f7dea9e8d8ba866e02ea7b65d32f6861693aa4c12/ruff-0.15.12-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:83b2f4f2f3b1026b5fb449b467d9264bf22067b600f7b6f41fc5958909f449d0", size = 11344088, upload-time = "2026-04-24T18:17:12.258Z" }, + { url = "https://files.pythonhosted.org/packages/c2/8d/49afab3645e31e12c590acb6d3b5b69d7aab5b81926dbaf7461f9441f37a/ruff-0.15.12-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:9ba3b8f1afd7e2e43d8943e55f249e13f9682fde09711644a6e7290eb4f3e339", size = 11271770, upload-time = "2026-04-24T18:17:02.457Z" }, + { url = "https://files.pythonhosted.org/packages/46/06/33f41fe94403e2b755481cdfb9b7ef3e4e0ed031c4581124658d935d52b4/ruff-0.15.12-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:e852ba9fdc890655e1d78f2df1499efbe0e54126bd405362154a75e2bde159c5", size = 10719355, upload-time = "2026-04-24T18:17:27.648Z" }, + { url = "https://files.pythonhosted.org/packages/0d/59/18aa4e014debbf559670e4048e39260a85c7fcee84acfd761ac01e7b8d35/ruff-0.15.12-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:dd8aed930da53780d22fc70bdf84452c843cf64f8cb4eb38984319c24c5cd5fd", size = 10462758, upload-time = "2026-04-24T18:17:32.347Z" }, + { url = "https://files.pythonhosted.org/packages/25/e7/cc9f16fd0f3b5fddcbd7ec3d6ae30c8f3fde1047f32a4093a98d633c6570/ruff-0.15.12-py3-none-musllinux_1_2_i686.whl", hash = "sha256:01da3988d225628b709493d7dc67c3b9b12c0210016b08690ef9bd27970b262b", size = 10953498, upload-time = "2026-04-24T18:17:20.674Z" }, + { url = "https://files.pythonhosted.org/packages/72/7a/a9ba7f98c7a575978698f4230c5e8cc54bbc761af34f560818f933dafa0c/ruff-0.15.12-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:9cae0f92bd5700d1213188b31cd3bdd2b315361296d10b96b8e2337d3d11f53e", size = 11447765, upload-time = 
"2026-04-24T18:17:09.755Z" }, + { url = "https://files.pythonhosted.org/packages/ea/f9/0ae446942c846b8266059ad8a30702a35afae55f5cdc54c5adf8d7afdc27/ruff-0.15.12-py3-none-win32.whl", hash = "sha256:d0185894e038d7043ba8fd6aee7499ece6462dc0ea9f1e260c7451807c714c20", size = 10657277, upload-time = "2026-04-24T18:17:18.591Z" }, + { url = "https://files.pythonhosted.org/packages/33/f1/9614e03e1cdcbf9437570b5400ced8a720b5db22b28d8e0f1bda429f660d/ruff-0.15.12-py3-none-win_amd64.whl", hash = "sha256:c87a162d61ab3adca47c03f7f717c68672edec7d1b5499e652331780fe74950d", size = 11837758, upload-time = "2026-04-24T18:17:00.113Z" }, + { url = "https://files.pythonhosted.org/packages/c0/98/6beb4b351e472e5f4c4613f7c35a5290b8be2497e183825310c4c3a3984b/ruff-0.15.12-py3-none-win_arm64.whl", hash = "sha256:a538f7a82d061cee7be55542aca1d86d1393d55d81d4fcc314370f4340930d4f", size = 11120821, upload-time = "2026-04-24T18:16:57.979Z" }, ] [[package]] @@ -499,71 +664,71 @@ wheels = [ [[package]] name = "tomli" -version = "2.4.0" +version = "2.4.1" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/82/30/31573e9457673ab10aa432461bee537ce6cef177667deca369efb79df071/tomli-2.4.0.tar.gz", hash = "sha256:aa89c3f6c277dd275d8e243ad24f3b5e701491a860d5121f2cdd399fbb31fc9c", size = 17477, upload-time = "2026-01-11T11:22:38.165Z" } +sdist = { url = "https://files.pythonhosted.org/packages/22/de/48c59722572767841493b26183a0d1cc411d54fd759c5607c4590b6563a6/tomli-2.4.1.tar.gz", hash = "sha256:7c7e1a961a0b2f2472c1ac5b69affa0ae1132c39adcb67aba98568702b9cc23f", size = 17543, upload-time = "2026-03-25T20:22:03.828Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/3c/d9/3dc2289e1f3b32eb19b9785b6a006b28ee99acb37d1d47f78d4c10e28bf8/tomli-2.4.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:b5ef256a3fd497d4973c11bf142e9ed78b150d36f5773f1ca6088c230ffc5867", size = 153663, upload-time = "2026-01-11T11:21:45.27Z" }, - { url = 
"https://files.pythonhosted.org/packages/51/32/ef9f6845e6b9ca392cd3f64f9ec185cc6f09f0a2df3db08cbe8809d1d435/tomli-2.4.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:5572e41282d5268eb09a697c89a7bee84fae66511f87533a6f88bd2f7b652da9", size = 148469, upload-time = "2026-01-11T11:21:46.873Z" }, - { url = "https://files.pythonhosted.org/packages/d6/c2/506e44cce89a8b1b1e047d64bd495c22c9f71f21e05f380f1a950dd9c217/tomli-2.4.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:551e321c6ba03b55676970b47cb1b73f14a0a4dce6a3e1a9458fd6d921d72e95", size = 236039, upload-time = "2026-01-11T11:21:48.503Z" }, - { url = "https://files.pythonhosted.org/packages/b3/40/e1b65986dbc861b7e986e8ec394598187fa8aee85b1650b01dd925ca0be8/tomli-2.4.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5e3f639a7a8f10069d0e15408c0b96a2a828cfdec6fca05296ebcdcc28ca7c76", size = 243007, upload-time = "2026-01-11T11:21:49.456Z" }, - { url = "https://files.pythonhosted.org/packages/9c/6f/6e39ce66b58a5b7ae572a0f4352ff40c71e8573633deda43f6a379d56b3e/tomli-2.4.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:1b168f2731796b045128c45982d3a4874057626da0e2ef1fdd722848b741361d", size = 240875, upload-time = "2026-01-11T11:21:50.755Z" }, - { url = "https://files.pythonhosted.org/packages/aa/ad/cb089cb190487caa80204d503c7fd0f4d443f90b95cf4ef5cf5aa0f439b0/tomli-2.4.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:133e93646ec4300d651839d382d63edff11d8978be23da4cc106f5a18b7d0576", size = 246271, upload-time = "2026-01-11T11:21:51.81Z" }, - { url = "https://files.pythonhosted.org/packages/0b/63/69125220e47fd7a3a27fd0de0c6398c89432fec41bc739823bcc66506af6/tomli-2.4.0-cp311-cp311-win32.whl", hash = "sha256:b6c78bdf37764092d369722d9946cb65b8767bfa4110f902a1b2542d8d173c8a", size = 96770, upload-time = "2026-01-11T11:21:52.647Z" }, - { url = 
"https://files.pythonhosted.org/packages/1e/0d/a22bb6c83f83386b0008425a6cd1fa1c14b5f3dd4bad05e98cf3dbbf4a64/tomli-2.4.0-cp311-cp311-win_amd64.whl", hash = "sha256:d3d1654e11d724760cdb37a3d7691f0be9db5fbdaef59c9f532aabf87006dbaa", size = 107626, upload-time = "2026-01-11T11:21:53.459Z" }, - { url = "https://files.pythonhosted.org/packages/2f/6d/77be674a3485e75cacbf2ddba2b146911477bd887dda9d8c9dfb2f15e871/tomli-2.4.0-cp311-cp311-win_arm64.whl", hash = "sha256:cae9c19ed12d4e8f3ebf46d1a75090e4c0dc16271c5bce1c833ac168f08fb614", size = 94842, upload-time = "2026-01-11T11:21:54.831Z" }, - { url = "https://files.pythonhosted.org/packages/3c/43/7389a1869f2f26dba52404e1ef13b4784b6b37dac93bac53457e3ff24ca3/tomli-2.4.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:920b1de295e72887bafa3ad9f7a792f811847d57ea6b1215154030cf131f16b1", size = 154894, upload-time = "2026-01-11T11:21:56.07Z" }, - { url = "https://files.pythonhosted.org/packages/e9/05/2f9bf110b5294132b2edf13fe6ca6ae456204f3d749f623307cbb7a946f2/tomli-2.4.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:7d6d9a4aee98fac3eab4952ad1d73aee87359452d1c086b5ceb43ed02ddb16b8", size = 149053, upload-time = "2026-01-11T11:21:57.467Z" }, - { url = "https://files.pythonhosted.org/packages/e8/41/1eda3ca1abc6f6154a8db4d714a4d35c4ad90adc0bcf700657291593fbf3/tomli-2.4.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:36b9d05b51e65b254ea6c2585b59d2c4cb91c8a3d91d0ed0f17591a29aaea54a", size = 243481, upload-time = "2026-01-11T11:21:58.661Z" }, - { url = "https://files.pythonhosted.org/packages/d2/6d/02ff5ab6c8868b41e7d4b987ce2b5f6a51d3335a70aa144edd999e055a01/tomli-2.4.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:1c8a885b370751837c029ef9bc014f27d80840e48bac415f3412e6593bbc18c1", size = 251720, upload-time = "2026-01-11T11:22:00.178Z" }, - { url = 
"https://files.pythonhosted.org/packages/7b/57/0405c59a909c45d5b6f146107c6d997825aa87568b042042f7a9c0afed34/tomli-2.4.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:8768715ffc41f0008abe25d808c20c3d990f42b6e2e58305d5da280ae7d1fa3b", size = 247014, upload-time = "2026-01-11T11:22:01.238Z" }, - { url = "https://files.pythonhosted.org/packages/2c/0e/2e37568edd944b4165735687cbaf2fe3648129e440c26d02223672ee0630/tomli-2.4.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:7b438885858efd5be02a9a133caf5812b8776ee0c969fea02c45e8e3f296ba51", size = 251820, upload-time = "2026-01-11T11:22:02.727Z" }, - { url = "https://files.pythonhosted.org/packages/5a/1c/ee3b707fdac82aeeb92d1a113f803cf6d0f37bdca0849cb489553e1f417a/tomli-2.4.0-cp312-cp312-win32.whl", hash = "sha256:0408e3de5ec77cc7f81960c362543cbbd91ef883e3138e81b729fc3eea5b9729", size = 97712, upload-time = "2026-01-11T11:22:03.777Z" }, - { url = "https://files.pythonhosted.org/packages/69/13/c07a9177d0b3bab7913299b9278845fc6eaaca14a02667c6be0b0a2270c8/tomli-2.4.0-cp312-cp312-win_amd64.whl", hash = "sha256:685306e2cc7da35be4ee914fd34ab801a6acacb061b6a7abca922aaf9ad368da", size = 108296, upload-time = "2026-01-11T11:22:04.86Z" }, - { url = "https://files.pythonhosted.org/packages/18/27/e267a60bbeeee343bcc279bb9e8fbed0cbe224bc7b2a3dc2975f22809a09/tomli-2.4.0-cp312-cp312-win_arm64.whl", hash = "sha256:5aa48d7c2356055feef06a43611fc401a07337d5b006be13a30f6c58f869e3c3", size = 94553, upload-time = "2026-01-11T11:22:05.854Z" }, - { url = "https://files.pythonhosted.org/packages/34/91/7f65f9809f2936e1f4ce6268ae1903074563603b2a2bd969ebbda802744f/tomli-2.4.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:84d081fbc252d1b6a982e1870660e7330fb8f90f676f6e78b052ad4e64714bf0", size = 154915, upload-time = "2026-01-11T11:22:06.703Z" }, - { url = "https://files.pythonhosted.org/packages/20/aa/64dd73a5a849c2e8f216b755599c511badde80e91e9bc2271baa7b2cdbb1/tomli-2.4.0-cp313-cp313-macosx_11_0_arm64.whl", hash = 
"sha256:9a08144fa4cba33db5255f9b74f0b89888622109bd2776148f2597447f92a94e", size = 149038, upload-time = "2026-01-11T11:22:07.56Z" }, - { url = "https://files.pythonhosted.org/packages/9e/8a/6d38870bd3d52c8d1505ce054469a73f73a0fe62c0eaf5dddf61447e32fa/tomli-2.4.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c73add4bb52a206fd0c0723432db123c0c75c280cbd67174dd9d2db228ebb1b4", size = 242245, upload-time = "2026-01-11T11:22:08.344Z" }, - { url = "https://files.pythonhosted.org/packages/59/bb/8002fadefb64ab2669e5b977df3f5e444febea60e717e755b38bb7c41029/tomli-2.4.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:1fb2945cbe303b1419e2706e711b7113da57b7db31ee378d08712d678a34e51e", size = 250335, upload-time = "2026-01-11T11:22:09.951Z" }, - { url = "https://files.pythonhosted.org/packages/a5/3d/4cdb6f791682b2ea916af2de96121b3cb1284d7c203d97d92d6003e91c8d/tomli-2.4.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:bbb1b10aa643d973366dc2cb1ad94f99c1726a02343d43cbc011edbfac579e7c", size = 245962, upload-time = "2026-01-11T11:22:11.27Z" }, - { url = "https://files.pythonhosted.org/packages/f2/4a/5f25789f9a460bd858ba9756ff52d0830d825b458e13f754952dd15fb7bb/tomli-2.4.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:4cbcb367d44a1f0c2be408758b43e1ffb5308abe0ea222897d6bfc8e8281ef2f", size = 250396, upload-time = "2026-01-11T11:22:12.325Z" }, - { url = "https://files.pythonhosted.org/packages/aa/2f/b73a36fea58dfa08e8b3a268750e6853a6aac2a349241a905ebd86f3047a/tomli-2.4.0-cp313-cp313-win32.whl", hash = "sha256:7d49c66a7d5e56ac959cb6fc583aff0651094ec071ba9ad43df785abc2320d86", size = 97530, upload-time = "2026-01-11T11:22:13.865Z" }, - { url = "https://files.pythonhosted.org/packages/3b/af/ca18c134b5d75de7e8dc551c5234eaba2e8e951f6b30139599b53de9c187/tomli-2.4.0-cp313-cp313-win_amd64.whl", hash = "sha256:3cf226acb51d8f1c394c1b310e0e0e61fecdd7adcb78d01e294ac297dd2e7f87", 
size = 108227, upload-time = "2026-01-11T11:22:15.224Z" }, - { url = "https://files.pythonhosted.org/packages/22/c3/b386b832f209fee8073c8138ec50f27b4460db2fdae9ffe022df89a57f9b/tomli-2.4.0-cp313-cp313-win_arm64.whl", hash = "sha256:d20b797a5c1ad80c516e41bc1fb0443ddb5006e9aaa7bda2d71978346aeb9132", size = 94748, upload-time = "2026-01-11T11:22:16.009Z" }, - { url = "https://files.pythonhosted.org/packages/f3/c4/84047a97eb1004418bc10bdbcfebda209fca6338002eba2dc27cc6d13563/tomli-2.4.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:26ab906a1eb794cd4e103691daa23d95c6919cc2fa9160000ac02370cc9dd3f6", size = 154725, upload-time = "2026-01-11T11:22:17.269Z" }, - { url = "https://files.pythonhosted.org/packages/a8/5d/d39038e646060b9d76274078cddf146ced86dc2b9e8bbf737ad5983609a0/tomli-2.4.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:20cedb4ee43278bc4f2fee6cb50daec836959aadaf948db5172e776dd3d993fc", size = 148901, upload-time = "2026-01-11T11:22:18.287Z" }, - { url = "https://files.pythonhosted.org/packages/73/e5/383be1724cb30f4ce44983d249645684a48c435e1cd4f8b5cded8a816d3c/tomli-2.4.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:39b0b5d1b6dd03684b3fb276407ebed7090bbec989fa55838c98560c01113b66", size = 243375, upload-time = "2026-01-11T11:22:19.154Z" }, - { url = "https://files.pythonhosted.org/packages/31/f0/bea80c17971c8d16d3cc109dc3585b0f2ce1036b5f4a8a183789023574f2/tomli-2.4.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a26d7ff68dfdb9f87a016ecfd1e1c2bacbe3108f4e0f8bcd2228ef9a766c787d", size = 250639, upload-time = "2026-01-11T11:22:20.168Z" }, - { url = "https://files.pythonhosted.org/packages/2c/8f/2853c36abbb7608e3f945d8a74e32ed3a74ee3a1f468f1ffc7d1cb3abba6/tomli-2.4.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:20ffd184fb1df76a66e34bd1b36b4a4641bd2b82954befa32fe8163e79f1a702", size = 246897, upload-time = "2026-01-11T11:22:21.544Z" }, - { 
url = "https://files.pythonhosted.org/packages/49/f0/6c05e3196ed5337b9fe7ea003e95fd3819a840b7a0f2bf5a408ef1dad8ed/tomli-2.4.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:75c2f8bbddf170e8effc98f5e9084a8751f8174ea6ccf4fca5398436e0320bc8", size = 254697, upload-time = "2026-01-11T11:22:23.058Z" }, - { url = "https://files.pythonhosted.org/packages/f3/f5/2922ef29c9f2951883525def7429967fc4d8208494e5ab524234f06b688b/tomli-2.4.0-cp314-cp314-win32.whl", hash = "sha256:31d556d079d72db7c584c0627ff3a24c5d3fb4f730221d3444f3efb1b2514776", size = 98567, upload-time = "2026-01-11T11:22:24.033Z" }, - { url = "https://files.pythonhosted.org/packages/7b/31/22b52e2e06dd2a5fdbc3ee73226d763b184ff21fc24e20316a44ccc4d96b/tomli-2.4.0-cp314-cp314-win_amd64.whl", hash = "sha256:43e685b9b2341681907759cf3a04e14d7104b3580f808cfde1dfdb60ada85475", size = 108556, upload-time = "2026-01-11T11:22:25.378Z" }, - { url = "https://files.pythonhosted.org/packages/48/3d/5058dff3255a3d01b705413f64f4306a141a8fd7a251e5a495e3f192a998/tomli-2.4.0-cp314-cp314-win_arm64.whl", hash = "sha256:3d895d56bd3f82ddd6faaff993c275efc2ff38e52322ea264122d72729dca2b2", size = 96014, upload-time = "2026-01-11T11:22:26.138Z" }, - { url = "https://files.pythonhosted.org/packages/b8/4e/75dab8586e268424202d3a1997ef6014919c941b50642a1682df43204c22/tomli-2.4.0-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:5b5807f3999fb66776dbce568cc9a828544244a8eb84b84b9bafc080c99597b9", size = 163339, upload-time = "2026-01-11T11:22:27.143Z" }, - { url = "https://files.pythonhosted.org/packages/06/e3/b904d9ab1016829a776d97f163f183a48be6a4deb87304d1e0116a349519/tomli-2.4.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:c084ad935abe686bd9c898e62a02a19abfc9760b5a79bc29644463eaf2840cb0", size = 159490, upload-time = "2026-01-11T11:22:28.399Z" }, - { url = 
"https://files.pythonhosted.org/packages/e3/5a/fc3622c8b1ad823e8ea98a35e3c632ee316d48f66f80f9708ceb4f2a0322/tomli-2.4.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0f2e3955efea4d1cfbcb87bc321e00dc08d2bcb737fd1d5e398af111d86db5df", size = 269398, upload-time = "2026-01-11T11:22:29.345Z" }, - { url = "https://files.pythonhosted.org/packages/fd/33/62bd6152c8bdd4c305ad9faca48f51d3acb2df1f8791b1477d46ff86e7f8/tomli-2.4.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:0e0fe8a0b8312acf3a88077a0802565cb09ee34107813bba1c7cd591fa6cfc8d", size = 276515, upload-time = "2026-01-11T11:22:30.327Z" }, - { url = "https://files.pythonhosted.org/packages/4b/ff/ae53619499f5235ee4211e62a8d7982ba9e439a0fb4f2f351a93d67c1dd2/tomli-2.4.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:413540dce94673591859c4c6f794dfeaa845e98bf35d72ed59636f869ef9f86f", size = 273806, upload-time = "2026-01-11T11:22:32.56Z" }, - { url = "https://files.pythonhosted.org/packages/47/71/cbca7787fa68d4d0a9f7072821980b39fbb1b6faeb5f5cf02f4a5559fa28/tomli-2.4.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:0dc56fef0e2c1c470aeac5b6ca8cc7b640bb93e92d9803ddaf9ea03e198f5b0b", size = 281340, upload-time = "2026-01-11T11:22:33.505Z" }, - { url = "https://files.pythonhosted.org/packages/f5/00/d595c120963ad42474cf6ee7771ad0d0e8a49d0f01e29576ee9195d9ecdf/tomli-2.4.0-cp314-cp314t-win32.whl", hash = "sha256:d878f2a6707cc9d53a1be1414bbb419e629c3d6e67f69230217bb663e76b5087", size = 108106, upload-time = "2026-01-11T11:22:34.451Z" }, - { url = "https://files.pythonhosted.org/packages/de/69/9aa0c6a505c2f80e519b43764f8b4ba93b5a0bbd2d9a9de6e2b24271b9a5/tomli-2.4.0-cp314-cp314t-win_amd64.whl", hash = "sha256:2add28aacc7425117ff6364fe9e06a183bb0251b03f986df0e78e974047571fd", size = 120504, upload-time = "2026-01-11T11:22:35.764Z" }, - { url = 
"https://files.pythonhosted.org/packages/b3/9f/f1668c281c58cfae01482f7114a4b88d345e4c140386241a1a24dcc9e7bc/tomli-2.4.0-cp314-cp314t-win_arm64.whl", hash = "sha256:2b1e3b80e1d5e52e40e9b924ec43d81570f0e7d09d11081b797bc4692765a3d4", size = 99561, upload-time = "2026-01-11T11:22:36.624Z" }, - { url = "https://files.pythonhosted.org/packages/23/d1/136eb2cb77520a31e1f64cbae9d33ec6df0d78bdf4160398e86eec8a8754/tomli-2.4.0-py3-none-any.whl", hash = "sha256:1f776e7d669ebceb01dee46484485f43a4048746235e683bcdffacdf1fb4785a", size = 14477, upload-time = "2026-01-11T11:22:37.446Z" }, + { url = "https://files.pythonhosted.org/packages/f4/11/db3d5885d8528263d8adc260bb2d28ebf1270b96e98f0e0268d32b8d9900/tomli-2.4.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:f8f0fc26ec2cc2b965b7a3b87cd19c5c6b8c5e5f436b984e85f486d652285c30", size = 154704, upload-time = "2026-03-25T20:21:10.473Z" }, + { url = "https://files.pythonhosted.org/packages/6d/f7/675db52c7e46064a9aa928885a9b20f4124ecb9bc2e1ce74c9106648d202/tomli-2.4.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:4ab97e64ccda8756376892c53a72bd1f964e519c77236368527f758fbc36a53a", size = 149454, upload-time = "2026-03-25T20:21:12.036Z" }, + { url = "https://files.pythonhosted.org/packages/61/71/81c50943cf953efa35bce7646caab3cf457a7d8c030b27cfb40d7235f9ee/tomli-2.4.1-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:96481a5786729fd470164b47cdb3e0e58062a496f455ee41b4403be77cb5a076", size = 237561, upload-time = "2026-03-25T20:21:13.098Z" }, + { url = "https://files.pythonhosted.org/packages/48/c1/f41d9cb618acccca7df82aaf682f9b49013c9397212cb9f53219e3abac37/tomli-2.4.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5a881ab208c0baf688221f8cecc5401bd291d67e38a1ac884d6736cbcd8247e9", size = 243824, upload-time = "2026-03-25T20:21:14.569Z" }, + { url = 
"https://files.pythonhosted.org/packages/22/e4/5a816ecdd1f8ca51fb756ef684b90f2780afc52fc67f987e3c61d800a46d/tomli-2.4.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:47149d5bd38761ac8be13a84864bf0b7b70bc051806bc3669ab1cbc56216b23c", size = 242227, upload-time = "2026-03-25T20:21:15.712Z" }, + { url = "https://files.pythonhosted.org/packages/6b/49/2b2a0ef529aa6eec245d25f0c703e020a73955ad7edf73e7f54ddc608aa5/tomli-2.4.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:ec9bfaf3ad2df51ace80688143a6a4ebc09a248f6ff781a9945e51937008fcbc", size = 247859, upload-time = "2026-03-25T20:21:17.001Z" }, + { url = "https://files.pythonhosted.org/packages/83/bd/6c1a630eaca337e1e78c5903104f831bda934c426f9231429396ce3c3467/tomli-2.4.1-cp311-cp311-win32.whl", hash = "sha256:ff2983983d34813c1aeb0fa89091e76c3a22889ee83ab27c5eeb45100560c049", size = 97204, upload-time = "2026-03-25T20:21:18.079Z" }, + { url = "https://files.pythonhosted.org/packages/42/59/71461df1a885647e10b6bb7802d0b8e66480c61f3f43079e0dcd315b3954/tomli-2.4.1-cp311-cp311-win_amd64.whl", hash = "sha256:5ee18d9ebdb417e384b58fe414e8d6af9f4e7a0ae761519fb50f721de398dd4e", size = 108084, upload-time = "2026-03-25T20:21:18.978Z" }, + { url = "https://files.pythonhosted.org/packages/b8/83/dceca96142499c069475b790e7913b1044c1a4337e700751f48ed723f883/tomli-2.4.1-cp311-cp311-win_arm64.whl", hash = "sha256:c2541745709bad0264b7d4705ad453b76ccd191e64aa6f0fc66b69a293a45ece", size = 95285, upload-time = "2026-03-25T20:21:20.309Z" }, + { url = "https://files.pythonhosted.org/packages/c1/ba/42f134a3fe2b370f555f44b1d72feebb94debcab01676bf918d0cb70e9aa/tomli-2.4.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:c742f741d58a28940ce01d58f0ab2ea3ced8b12402f162f4d534dfe18ba1cd6a", size = 155924, upload-time = "2026-03-25T20:21:21.626Z" }, + { url = "https://files.pythonhosted.org/packages/dc/c7/62d7a17c26487ade21c5422b646110f2162f1fcc95980ef7f63e73c68f14/tomli-2.4.1-cp312-cp312-macosx_11_0_arm64.whl", hash = 
"sha256:7f86fd587c4ed9dd76f318225e7d9b29cfc5a9d43de44e5754db8d1128487085", size = 150018, upload-time = "2026-03-25T20:21:23.002Z" }, + { url = "https://files.pythonhosted.org/packages/5c/05/79d13d7c15f13bdef410bdd49a6485b1c37d28968314eabee452c22a7fda/tomli-2.4.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ff18e6a727ee0ab0388507b89d1bc6a22b138d1e2fa56d1ad494586d61d2eae9", size = 244948, upload-time = "2026-03-25T20:21:24.04Z" }, + { url = "https://files.pythonhosted.org/packages/10/90/d62ce007a1c80d0b2c93e02cab211224756240884751b94ca72df8a875ca/tomli-2.4.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:136443dbd7e1dee43c68ac2694fde36b2849865fa258d39bf822c10e8068eac5", size = 253341, upload-time = "2026-03-25T20:21:25.177Z" }, + { url = "https://files.pythonhosted.org/packages/1a/7e/caf6496d60152ad4ed09282c1885cca4eea150bfd007da84aea07bcc0a3e/tomli-2.4.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:5e262d41726bc187e69af7825504c933b6794dc3fbd5945e41a79bb14c31f585", size = 248159, upload-time = "2026-03-25T20:21:26.364Z" }, + { url = "https://files.pythonhosted.org/packages/99/e7/c6f69c3120de34bbd882c6fba7975f3d7a746e9218e56ab46a1bc4b42552/tomli-2.4.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:5cb41aa38891e073ee49d55fbc7839cfdb2bc0e600add13874d048c94aadddd1", size = 253290, upload-time = "2026-03-25T20:21:27.46Z" }, + { url = "https://files.pythonhosted.org/packages/d6/2f/4a3c322f22c5c66c4b836ec58211641a4067364f5dcdd7b974b4c5da300c/tomli-2.4.1-cp312-cp312-win32.whl", hash = "sha256:da25dc3563bff5965356133435b757a795a17b17d01dbc0f42fb32447ddfd917", size = 98141, upload-time = "2026-03-25T20:21:28.492Z" }, + { url = "https://files.pythonhosted.org/packages/24/22/4daacd05391b92c55759d55eaee21e1dfaea86ce5c571f10083360adf534/tomli-2.4.1-cp312-cp312-win_amd64.whl", hash = "sha256:52c8ef851d9a240f11a88c003eacb03c31fc1c9c4ec64a99a0f922b93874fda9", 
size = 108847, upload-time = "2026-03-25T20:21:29.386Z" }, + { url = "https://files.pythonhosted.org/packages/68/fd/70e768887666ddd9e9f5d85129e84910f2db2796f9096aa02b721a53098d/tomli-2.4.1-cp312-cp312-win_arm64.whl", hash = "sha256:f758f1b9299d059cc3f6546ae2af89670cb1c4d48ea29c3cacc4fe7de3058257", size = 95088, upload-time = "2026-03-25T20:21:30.677Z" }, + { url = "https://files.pythonhosted.org/packages/07/06/b823a7e818c756d9a7123ba2cda7d07bc2dd32835648d1a7b7b7a05d848d/tomli-2.4.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:36d2bd2ad5fb9eaddba5226aa02c8ec3fa4f192631e347b3ed28186d43be6b54", size = 155866, upload-time = "2026-03-25T20:21:31.65Z" }, + { url = "https://files.pythonhosted.org/packages/14/6f/12645cf7f08e1a20c7eb8c297c6f11d31c1b50f316a7e7e1e1de6e2e7b7e/tomli-2.4.1-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:eb0dc4e38e6a1fd579e5d50369aa2e10acfc9cace504579b2faabb478e76941a", size = 149887, upload-time = "2026-03-25T20:21:33.028Z" }, + { url = "https://files.pythonhosted.org/packages/5c/e0/90637574e5e7212c09099c67ad349b04ec4d6020324539297b634a0192b0/tomli-2.4.1-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c7f2c7f2b9ca6bdeef8f0fa897f8e05085923eb091721675170254cbc5b02897", size = 243704, upload-time = "2026-03-25T20:21:34.51Z" }, + { url = "https://files.pythonhosted.org/packages/10/8f/d3ddb16c5a4befdf31a23307f72828686ab2096f068eaf56631e136c1fdd/tomli-2.4.1-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:f3c6818a1a86dd6dca7ddcaaf76947d5ba31aecc28cb1b67009a5877c9a64f3f", size = 251628, upload-time = "2026-03-25T20:21:36.012Z" }, + { url = "https://files.pythonhosted.org/packages/e3/f1/dbeeb9116715abee2485bf0a12d07a8f31af94d71608c171c45f64c0469d/tomli-2.4.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:d312ef37c91508b0ab2cee7da26ec0b3ed2f03ce12bd87a588d771ae15dcf82d", size = 247180, upload-time = "2026-03-25T20:21:37.136Z" }, + { 
url = "https://files.pythonhosted.org/packages/d3/74/16336ffd19ed4da28a70959f92f506233bd7cfc2332b20bdb01591e8b1d1/tomli-2.4.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:51529d40e3ca50046d7606fa99ce3956a617f9b36380da3b7f0dd3dd28e68cb5", size = 251674, upload-time = "2026-03-25T20:21:38.298Z" }, + { url = "https://files.pythonhosted.org/packages/16/f9/229fa3434c590ddf6c0aa9af64d3af4b752540686cace29e6281e3458469/tomli-2.4.1-cp313-cp313-win32.whl", hash = "sha256:2190f2e9dd7508d2a90ded5ed369255980a1bcdd58e52f7fe24b8162bf9fedbd", size = 97976, upload-time = "2026-03-25T20:21:39.316Z" }, + { url = "https://files.pythonhosted.org/packages/6a/1e/71dfd96bcc1c775420cb8befe7a9d35f2e5b1309798f009dca17b7708c1e/tomli-2.4.1-cp313-cp313-win_amd64.whl", hash = "sha256:8d65a2fbf9d2f8352685bc1364177ee3923d6baf5e7f43ea4959d7d8bc326a36", size = 108755, upload-time = "2026-03-25T20:21:40.248Z" }, + { url = "https://files.pythonhosted.org/packages/83/7a/d34f422a021d62420b78f5c538e5b102f62bea616d1d75a13f0a88acb04a/tomli-2.4.1-cp313-cp313-win_arm64.whl", hash = "sha256:4b605484e43cdc43f0954ddae319fb75f04cc10dd80d830540060ee7cd0243cd", size = 95265, upload-time = "2026-03-25T20:21:41.219Z" }, + { url = "https://files.pythonhosted.org/packages/3c/fb/9a5c8d27dbab540869f7c1f8eb0abb3244189ce780ba9cd73f3770662072/tomli-2.4.1-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:fd0409a3653af6c147209d267a0e4243f0ae46b011aa978b1080359fddc9b6cf", size = 155726, upload-time = "2026-03-25T20:21:42.23Z" }, + { url = "https://files.pythonhosted.org/packages/62/05/d2f816630cc771ad836af54f5001f47a6f611d2d39535364f148b6a92d6b/tomli-2.4.1-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:a120733b01c45e9a0c34aeef92bf0cf1d56cfe81ed9d47d562f9ed591a9828ac", size = 149859, upload-time = "2026-03-25T20:21:43.386Z" }, + { url = 
"https://files.pythonhosted.org/packages/ce/48/66341bdb858ad9bd0ceab5a86f90eddab127cf8b046418009f2125630ecb/tomli-2.4.1-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:559db847dc486944896521f68d8190be1c9e719fced785720d2216fe7022b662", size = 244713, upload-time = "2026-03-25T20:21:44.474Z" }, + { url = "https://files.pythonhosted.org/packages/df/6d/c5fad00d82b3c7a3ab6189bd4b10e60466f22cfe8a08a9394185c8a8111c/tomli-2.4.1-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:01f520d4f53ef97964a240a035ec2a869fe1a37dde002b57ebc4417a27ccd853", size = 252084, upload-time = "2026-03-25T20:21:45.62Z" }, + { url = "https://files.pythonhosted.org/packages/00/71/3a69e86f3eafe8c7a59d008d245888051005bd657760e96d5fbfb0b740c2/tomli-2.4.1-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:7f94b27a62cfad8496c8d2513e1a222dd446f095fca8987fceef261225538a15", size = 247973, upload-time = "2026-03-25T20:21:46.937Z" }, + { url = "https://files.pythonhosted.org/packages/67/50/361e986652847fec4bd5e4a0208752fbe64689c603c7ae5ea7cb16b1c0ca/tomli-2.4.1-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:ede3e6487c5ef5d28634ba3f31f989030ad6af71edfb0055cbbd14189ff240ba", size = 256223, upload-time = "2026-03-25T20:21:48.467Z" }, + { url = "https://files.pythonhosted.org/packages/8c/9a/b4173689a9203472e5467217e0154b00e260621caa227b6fa01feab16998/tomli-2.4.1-cp314-cp314-win32.whl", hash = "sha256:3d48a93ee1c9b79c04bb38772ee1b64dcf18ff43085896ea460ca8dec96f35f6", size = 98973, upload-time = "2026-03-25T20:21:49.526Z" }, + { url = "https://files.pythonhosted.org/packages/14/58/640ac93bf230cd27d002462c9af0d837779f8773bc03dee06b5835208214/tomli-2.4.1-cp314-cp314-win_amd64.whl", hash = "sha256:88dceee75c2c63af144e456745e10101eb67361050196b0b6af5d717254dddf7", size = 109082, upload-time = "2026-03-25T20:21:50.506Z" }, + { url = 
"https://files.pythonhosted.org/packages/d5/2f/702d5e05b227401c1068f0d386d79a589bb12bf64c3d2c72ce0631e3bc49/tomli-2.4.1-cp314-cp314-win_arm64.whl", hash = "sha256:b8c198f8c1805dc42708689ed6864951fd2494f924149d3e4bce7710f8eb5232", size = 96490, upload-time = "2026-03-25T20:21:51.474Z" }, + { url = "https://files.pythonhosted.org/packages/45/4b/b877b05c8ba62927d9865dd980e34a755de541eb65fffba52b4cc495d4d2/tomli-2.4.1-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:d4d8fe59808a54658fcc0160ecfb1b30f9089906c50b23bcb4c69eddc19ec2b4", size = 164263, upload-time = "2026-03-25T20:21:52.543Z" }, + { url = "https://files.pythonhosted.org/packages/24/79/6ab420d37a270b89f7195dec5448f79400d9e9c1826df982f3f8e97b24fd/tomli-2.4.1-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:7008df2e7655c495dd12d2a4ad038ff878d4ca4b81fccaf82b714e07eae4402c", size = 160736, upload-time = "2026-03-25T20:21:53.674Z" }, + { url = "https://files.pythonhosted.org/packages/02/e0/3630057d8eb170310785723ed5adcdfb7d50cb7e6455f85ba8a3deed642b/tomli-2.4.1-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1d8591993e228b0c930c4bb0db464bdad97b3289fb981255d6c9a41aedc84b2d", size = 270717, upload-time = "2026-03-25T20:21:55.129Z" }, + { url = "https://files.pythonhosted.org/packages/7a/b4/1613716072e544d1a7891f548d8f9ec6ce2faf42ca65acae01d76ea06bb0/tomli-2.4.1-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:734e20b57ba95624ecf1841e72b53f6e186355e216e5412de414e3c51e5e3c41", size = 278461, upload-time = "2026-03-25T20:21:56.228Z" }, + { url = "https://files.pythonhosted.org/packages/05/38/30f541baf6a3f6df77b3df16b01ba319221389e2da59427e221ef417ac0c/tomli-2.4.1-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:8a650c2dbafa08d42e51ba0b62740dae4ecb9338eefa093aa5c78ceb546fcd5c", size = 274855, upload-time = "2026-03-25T20:21:57.653Z" }, + { url = 
"https://files.pythonhosted.org/packages/77/a3/ec9dd4fd2c38e98de34223b995a3b34813e6bdadf86c75314c928350ed14/tomli-2.4.1-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:504aa796fe0569bb43171066009ead363de03675276d2d121ac1a4572397870f", size = 283144, upload-time = "2026-03-25T20:21:59.089Z" }, + { url = "https://files.pythonhosted.org/packages/ef/be/605a6261cac79fba2ec0c9827e986e00323a1945700969b8ee0b30d85453/tomli-2.4.1-cp314-cp314t-win32.whl", hash = "sha256:b1d22e6e9387bf4739fbe23bfa80e93f6b0373a7f1b96c6227c32bef95a4d7a8", size = 108683, upload-time = "2026-03-25T20:22:00.214Z" }, + { url = "https://files.pythonhosted.org/packages/12/64/da524626d3b9cc40c168a13da8335fe1c51be12c0a63685cc6db7308daae/tomli-2.4.1-cp314-cp314t-win_amd64.whl", hash = "sha256:2c1c351919aca02858f740c6d33adea0c5deea37f9ecca1cc1ef9e884a619d26", size = 121196, upload-time = "2026-03-25T20:22:01.169Z" }, + { url = "https://files.pythonhosted.org/packages/5a/cd/e80b62269fc78fc36c9af5a6b89c835baa8af28ff5ad28c7028d60860320/tomli-2.4.1-cp314-cp314t-win_arm64.whl", hash = "sha256:eab21f45c7f66c13f2a9e0e1535309cee140182a9cdae1e041d02e47291e8396", size = 100393, upload-time = "2026-03-25T20:22:02.137Z" }, + { url = "https://files.pythonhosted.org/packages/7b/61/cceae43728b7de99d9b847560c262873a1f6c98202171fd5ed62640b494b/tomli-2.4.1-py3-none-any.whl", hash = "sha256:0d85819802132122da43cb86656f8d1f8c6587d54ae7dcaf30e90533028b49fe", size = 14583, upload-time = "2026-03-25T20:22:03.012Z" }, ] [[package]] name = "typer" -version = "0.21.1" +version = "0.25.1" source = { registry = "https://pypi.org/simple" } dependencies = [ + { name = "annotated-doc" }, { name = "click" }, { name = "rich" }, { name = "shellingham" }, - { name = "typing-extensions" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/36/bf/8825b5929afd84d0dabd606c67cd57b8388cb3ec385f7ef19c5cc2202069/typer-0.21.1.tar.gz", hash = "sha256:ea835607cd752343b6b2b7ce676893e5a0324082268b48f27aa058bdb7d2145d", size = 110371, 
upload-time = "2026-01-06T11:21:10.989Z" } +sdist = { url = "https://files.pythonhosted.org/packages/e4/51/9aed62104cea109b820bbd6c14245af756112017d309da813ef107d42e7e/typer-0.25.1.tar.gz", hash = "sha256:9616eb8853a09ffeabab1698952f33c6f29ffdbceb4eaeecf571880e8d7664cc", size = 122276, upload-time = "2026-04-30T19:32:16.964Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/a0/1d/d9257dd49ff2ca23ea5f132edf1281a0c4f9de8a762b9ae399b670a59235/typer-0.21.1-py3-none-any.whl", hash = "sha256:7985e89081c636b88d172c2ee0cfe33c253160994d47bdfdc302defd7d1f1d01", size = 47381, upload-time = "2026-01-06T11:21:09.824Z" }, + { url = "https://files.pythonhosted.org/packages/3f/f9/2b3ff4e56e5fa7debfaf9eb135d0da96f3e9a1d5b27222223c7296336e5f/typer-0.25.1-py3-none-any.whl", hash = "sha256:75caa44ed46a03fb2dab8808753ffacdbfea88495e74c85a28c5eefcf5f39c89", size = 58409, upload-time = "2026-04-30T19:32:18.271Z" }, ] [[package]] @@ -575,6 +740,18 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/18/67/36e9267722cc04a6b9f15c7f3441c2363321a3ea07da7ae0c0707beb2a9c/typing_extensions-4.15.0-py3-none-any.whl", hash = "sha256:f0fa19c6845758ab08074a0cfa8b7aecb71c999ca73d62883bc25cc018c4e548", size = 44614, upload-time = "2025-08-25T13:49:24.86Z" }, ] +[[package]] +name = "typing-inspection" +version = "0.4.2" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "typing-extensions" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/55/e3/70399cb7dd41c10ac53367ae42139cf4b1ca5f36bb3dc6c9d33acdb43655/typing_inspection-0.4.2.tar.gz", hash = "sha256:ba561c48a67c5958007083d386c3295464928b01faa735ab8547c5692e87f464", size = 75949, upload-time = "2025-10-01T02:14:41.687Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/dc/9b/47798a6c91d8bdb567fe2698fe81e0c6b7cb7ef4d13da4114b41d239f65d/typing_inspection-0.4.2-py3-none-any.whl", hash = "sha256:4ed1cacbdc298c220f1bd249ed5287caa16f34d44ef4e9c3d0cbad5b521545e7", size = 14611, 
upload-time = "2025-10-01T02:14:40.154Z" },
+]
+
 [[package]]
 name = "zstandard"
 version = "0.25.0"
diff --git a/common/b300_numa_cpu_pinning.patch b/common/b300_numa_cpu_pinning.patch
new file mode 100644
index 0000000..9cbe8bd
--- /dev/null
+++ b/common/b300_numa_cpu_pinning.patch
@@ -0,0 +1,21 @@
+# Restore optional Priority Core Turbo-related fixed-core CPU binding for the B300 reference configuration.
+# Apply only on systems where Intel Granite Rapids Priority Core Turbo Technology (PCT) is available and enabled and the PCT
+# high-priority core IDs match this B300 reference binding pattern. Validate the
+# affected workload before using the patched configuration for benchmark submissions.
+#
+# Apply from the root of the Megatron-Bridge checkout:
+# git apply /path/to/b300_numa_cpu_pinning.patch
+#
+--- a/scripts/performance/utils/executors.py
++++ b/scripts/performance/utils/executors.py
+@@ -124,6 +124,8 @@ def slurm_executor(
+             segment = segment_candidate
+             break
+ 
+     numa_divisor = 2 if gpu.lower() in ["gb200", "gb300"] else 4
+     numa_cmd = f"numactl --cpunodebind=$((SLURM_LOCALID/{numa_divisor})) --membind=$((SLURM_LOCALID/{numa_divisor}))"
++    if gpu.lower() in ["b300"]:
++        numa_cmd += " -C $((SLURM_LOCALID * 16)),$((SLURM_LOCALID * 16 + 1))"
+     custom_bash_cmds.append(numa_cmd)
+ 
+     launcher = SlurmTemplate(
diff --git a/common/parse_train_timing_mbridge.sh b/common/parse_train_timing_mbridge.sh
index 1253cda..804fd97 100755
--- a/common/parse_train_timing_mbridge.sh
+++ b/common/parse_train_timing_mbridge.sh
@@ -107,6 +107,7 @@ output_result() {
     local tflops_mean="$5"
     local tflops_std_dev="$6"
     local max_iter="$7"
+    local invalid_iter="${8:-}"
     local display_name
     display_name=$(shorten_filename "$filename")
 
@@ -121,6 +122,12 @@ output_result() {
                 else
                     printf "%-90s %8s %13s %12s %19s %18s\n" "$display_name" "Failed" "-" "-" "-" "-"
                 fi
+            elif [[ $status == "Invalid" ]]; then
+                if [[ -n $invalid_iter && $invalid_iter != "unknown" ]]; then
+                    printf "%-90s %8s %13s %12s %19s %18s\n" "$display_name" "Invalid" "grad_norm=nan" "iter $invalid_iter" "-" "-"
+                else
+                    printf "%-90s %8s %13s %12s %19s %18s\n" "$display_name" "Invalid" "grad_norm=nan" "-" "-" "-"
+                fi
             fi
             ;;
         csv)
@@ -132,12 +139,24 @@ output_result() {
                 else
                     echo "$filename,Failed,,,,,"
                 fi
+            elif [[ $status == "Invalid" ]]; then
+                if [[ -n $invalid_iter && $invalid_iter != "unknown" ]]; then
+                    echo "$filename,Invalid,,,,,grad_norm=nan@iteration_$invalid_iter"
+                else
+                    echo "$filename,Invalid,,,,,grad_norm=nan"
+                fi
             fi
             ;;
         json)
             # JSON entries are collected in json_results in the main loop
             if [[ $status == "Success" ]]; then
                 json_results+=("{\"filename\": \"$filename\", \"status\": \"Success\", \"time_mean_ms\": $time_mean, \"time_std_ms\": $time_std_dev, \"tflops_mean\": ${tflops_mean:-null}, \"tflops_std\": ${tflops_std_dev:-null}}")
+            elif [[ $status == "Invalid" ]]; then
+                if [[ -n $invalid_iter && $invalid_iter != "unknown" ]]; then
+                    json_results+=("{\"filename\": \"$filename\", \"status\": \"Invalid\", \"reason\": \"grad_norm=nan\", \"invalid_iteration\": $invalid_iter}")
+                else
+                    json_results+=("{\"filename\": \"$filename\", \"status\": \"Invalid\", \"reason\": \"grad_norm=nan\"}")
+                fi
             else
                 if [[ -n $max_iter ]]; then
                     json_results+=("{\"filename\": \"$filename\", \"status\": \"Failed\", \"max_iteration\": $max_iter}")
@@ -178,9 +197,10 @@ output_footer() {
     local files_processed="$1"
     local incomplete_count="$2"
     local failed_early_count="$3"
-    local total_experiment_files="$4"
+    local invalid_count="$4"
+    local total_experiment_files="$5"
 
-    local failed_count=$((incomplete_count + failed_early_count))
+    local failed_count=$((incomplete_count + failed_early_count + invalid_count))
 
     case "$OUTPUT_FORMAT" in
         table)
@@ -188,6 +208,9 @@ output_footer() {
             echo "Summary:"
             echo " Success experiments: $files_processed"
             echo " Failed experiments: $failed_count"
+            if [[ $invalid_count -gt 0 ]]; then
+                echo " Invalid grad norm experiments: $invalid_count"
+            fi
             if [[ $total_experiment_files -gt 0 ]]; then
                 echo " Success rate: $((files_processed * 100 / total_experiment_files))%"
             else
@@ -196,7 +219,11 @@ output_footer() {
             ;;
         csv)
             # CSV doesn't need footer for parsing, but we can add a comment
-            echo "# Summary: $files_processed success, $failed_count failed, $total_experiment_files total"
+            if [[ $invalid_count -gt 0 ]]; then
+                echo "# Summary: $files_processed success, $failed_count failed ($invalid_count invalid_grad_norm), $total_experiment_files total"
+            else
+                echo "# Summary: $files_processed success, $failed_count failed, $total_experiment_files total"
+            fi
             ;;
         json)
             # Remove trailing comma from last entry and close JSON
@@ -204,6 +231,7 @@ output_footer() {
             echo ' "summary": {'
             echo " \"success_experiments\": $files_processed,"
             echo " \"failed_experiments\": $failed_count,"
+            echo " \"invalid_grad_norm_experiments\": $invalid_count,"
             if [[ $total_experiment_files -gt 0 ]]; then
                 echo " \"success_rate\": $((files_processed * 100 / total_experiment_files))"
             else
@@ -232,6 +260,7 @@ fi
 files_processed=0
 incomplete_count=0
 failed_early_count=0
+invalid_count=0
 
 # Store results for JSON formatting
 declare -a json_results
@@ -248,8 +277,8 @@ while IFS= read -r file; do
             ;;
     esac
 
-    # Check if file contains any of the new-format timing data
-    has_timing_data=$(grep -q -E "elapsed time per iteration \(ms\):|MODEL_TFLOP\/s\/GPU|TFLOP\/s\/GPU" "$file" 2> /dev/null && echo "yes" || echo "no")
+    # Check if file contains parseable timing data or an invalid grad norm marker.
+    has_timing_data=$(grep -q -i -E "elapsed time per iteration \(ms\):|MODEL_TFLOP\/s\/GPU|TFLOP\/s\/GPU|grad[ _]norm[[:space:]]*:[[:space:]]*nan" "$file" 2> /dev/null && echo "yes" || echo "no")
 
     if [[ $has_timing_data == "yes" ]]; then
         # AWK now:
@@ -258,6 +287,17 @@ while IFS= read -r file; do
         # - when an iteration line with "elapsed time per iteration (ms)" is found within the iteration window,
         # it pairs that elapsed time with the last_tflop (and then clears last_tflop so it isn't reused).
         result=$(awk -v min_iter="$MIN_ITERATION" -v max_iter="$MAX_ITERATION" '
+        # Reject completed-looking jobs with invalid gradients anywhere in the step output.
+        tolower($0) ~ /grad[ _]norm[[:space:]]*:[[:space:]]*nan([[:space:]]|[|]|$)/ {
+            if (invalid_grad_norm_iter == "") {
+                if (match($0, /iteration[[:space:]]*([0-9]+)/, invalid_iter_arr)) {
+                    invalid_grad_norm_iter = invalid_iter_arr[1] + 0
+                } else {
+                    invalid_grad_norm_iter = "unknown"
+                }
+            }
+        }
+
         # capture the numeric token right before MODEL_TFLOP/s/GPU (handles 1234.5 and scientific)
         /MODEL_TFLOP\/s\/GPU/ || /TFLOP\/s\/GPU/ {
             if (match($0, /([0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?)\s*(MODEL_TFLOP\/s\/GPU|TFLOP\/s\/GPU)/, tf_arr)) {
@@ -289,7 +329,9 @@ while IFS= read -r file; do
         }
 
         END {
-            if (count > 0) {
+            if (invalid_grad_norm_iter != "") {
+                print "INVALID_GRAD_NORM:" invalid_grad_norm_iter
+            } else if (count > 0) {
                 if (max_found < max_iter) {
                     print "INCOMPLETE:" max_found
                 } else {
@@ -326,6 +368,18 @@ while IFS= read -r file; do
                 output_result "$filename" "Failed" "" "" "" "" "$max_found"
             fi
             incomplete_count=$((incomplete_count + 1))
+        elif [[ $result == INVALID_GRAD_NORM:* ]]; then
+            invalid_iter=${result#INVALID_GRAD_NORM:}
+            if [[ $OUTPUT_FORMAT == "json" ]]; then
+                if [[ $invalid_iter =~ ^[0-9]+$ ]]; then
+                    json_results+=("{\"filename\": \"$filename\", \"status\": \"Invalid\", \"reason\": \"grad_norm=nan\", \"invalid_iteration\": $invalid_iter}")
+                else
+                    json_results+=("{\"filename\": \"$filename\", \"status\":
\"Invalid\", \"reason\": \"grad_norm=nan\"}") + fi + else + output_result "$filename" "Invalid" "" "" "" "" "" "$invalid_iter" + fi + invalid_count=$((invalid_count + 1)) elif [[ $result == COMPLETE:* ]]; then # Parse mean and std dev from result stats=${result#COMPLETE:} @@ -360,8 +414,8 @@ while IFS= read -r file; do fi done <<< "$out_files" -# Calculate total experiment files (complete + incomplete + failed early) -total_experiment_files=$((files_processed + incomplete_count + failed_early_count)) +# Calculate total experiment files (complete + incomplete + failed early + invalid) +total_experiment_files=$((files_processed + incomplete_count + failed_early_count + invalid_count)) # Output JSON results without trailing comma if [[ $OUTPUT_FORMAT == "json" ]]; then @@ -374,7 +428,7 @@ if [[ $OUTPUT_FORMAT == "json" ]]; then done fi -output_footer "$files_processed" "$incomplete_count" "$failed_early_count" "$total_experiment_files" +output_footer "$files_processed" "$incomplete_count" "$failed_early_count" "$invalid_count" "$total_experiment_files" if [ $files_processed -eq 0 ]; then echo "Error: No valid complete elapsed-time and MODEL_TFLOPS data found in any .out files" >&2 diff --git a/deepseek_v3/pretrain/megatron_bridge/README.md b/deepseek_v3/pretrain/megatron_bridge/README.md index 7582f00..e891068 100644 --- a/deepseek_v3/pretrain/megatron_bridge/README.md +++ b/deepseek_v3/pretrain/megatron_bridge/README.md @@ -6,27 +6,30 @@ This recipe contains information and scripts to produce performance results for | Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA | | --------- | :--: | :----: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: | -| BF16 | 128 | 4096 | 61 | 1 | 4 | 1 | 32 | 32 | 4 | 1 | 2048 | 64 | -| BF16 | 256 | 4096 | 61 | 1 | 4 | 1 | 64 | 64 | 4 | 1 | 4096 | 64 | -| BF16 | 512 | 4096 | 61 | 1 | 4 | 1 | 64 | 128 | 4 | 1 | 8192 | 64 | +| NVFP4 | 128 | 4096 | 61 | 1 | 4 | 1 | 32 | 32 | 4 | 2 | 2048 | 32 
|
 | FP8 | 128 | 4096 | 61 | 1 | 4 | 1 | 32 | 32 | 4 | 2 | 2048 | 32 |
+| BF16 | 128 | 4096 | 61 | 1 | 4 | 1 | 32 | 32 | 4 | 1 | 2048 | 64 |
+| NVFP4 | 256 | 4096 | 61 | 1 | 2 | 1 | 32 | 128 | 8 | 2 | 4096 | 16 |
 | FP8 | 256 | 4096 | 61 | 1 | 2 | 1 | 32 | 128 | 8 | 2 | 4096 | 16 |
+| BF16 | 256 | 4096 | 61 | 1 | 4 | 1 | 64 | 64 | 4 | 1 | 4096 | 64 |
+| NVFP4 | 512 | 4096 | 61 | 1 | 2 | 1 | 32 | 256 | 8 | 2 | 8192 | 16 |
 | FP8 | 512 | 4096 | 61 | 1 | 2 | 1 | 32 | 256 | 8 | 2 | 8192 | 16 |
+| BF16 | 512 | 4096 | 61 | 1 | 4 | 1 | 64 | 128 | 4 | 1 | 8192 | 64 |

 ## GB200

-| Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA |
-| --------- | :--: | :----: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: |
-| BF16/FP8 | 256 | 4096 | 61 | 1 | 4 | 1 | 64 | 64 | 4 | 1 | 4096 | 64 |
-| BF16/FP8 | 512 | 4096 | 61 | 1 | 4 | 1 | 64 | 128 | 4 | 1 | 8192 | 64 |
+| Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA |
+| -------------- | :--: | :----: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: |
+| NVFP4/FP8/BF16 | 256 | 4096 | 61 | 1 | 4 | 1 | 64 | 64 | 4 | 1 | 4096 | 64 |
+| NVFP4/FP8/BF16 | 512 | 4096 | 61 | 1 | 4 | 1 | 64 | 128 | 4 | 1 | 8192 | 64 |

 ## B300

 | Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA |
 | --------- | :--: | :----: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: |
-| BF16 | 128 | 4096 | 61 | 1 | 8 | 1 | 8 | 16 | N/A | 1 | 2048 | 128 |
-| BF16 | 256 | 4096 | 61 | 1 | 8 | 1 | 8 | 32 | N/A | 1 | 4096 | 128 |
-| BF16 | 512 | 4096 | 61 | 1 | 8 | 1 | 8 | 64 | N/A | 1 | 8192 | 128 |
+| BF16 | 128 | 4096 | 61 | 1 | 8 | 1 | 8 | 16 | 1 | 1 | 2048 | 128 |
+| BF16 | 256 | 4096 | 61 | 1 | 8 | 1 | 8 | 32 | 1 | 1 | 4096 | 128 |
+| BF16 | 512 | 4096 | 61 | 1 | 8 | 1 | 8 | 64 | 1 | 1 | 8192 | 128 |

 ## B200

@@ -44,74 +47,58 @@ This recipe contains information and scripts to produce performance results for

 # Performance Measurement and Analysis

-Performance for Deepseek-v3 training is measured by the achieved GPU FLOPS via the `TFLOPS_per_GPU` metric, which indicates computational throughput efficiency. Additionally, training step timing (seconds per iteration) is captured and logged for every training step in the main training log file [see Output Locations](#output-locations).
-
-Since the early training steps typically take much longer time (with input prefetch, activation memory allocation, and JIT compilation), we use the `parse_train_timing_mbridge.sh` script to analyze iterations 35-44 and calculate mean and standard deviation for reliable performance metrics for both TFLOPS per GPU and timing measurements.
+Performance is reported as:

-### Running the parse_train_timing_mbridge.sh script
+- `s/iter` — wall-clock seconds per training step
+- `TFLOPS/GPU` — sustained FLOPS achieved per GPU

-To analyze training timing from your experiment results, run the script from the workload directory. In an installed environment, recipe files are available under `$LLMB_INSTALL/llmb_repo` (a copy created by the installer).
+Each benchmark runs 50 steps; iterations 35–44 are averaged to skip warmup (input prefetch, activation allocation, JIT compilation).

-```bash
-# Basic usage - parses results in the directory named 'experiments' in the current folder
-$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh
+## Viewing results with `llmb-run jobs`

-# Specify a different experiments directory
-$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh /path/to/experiments
+Each `llmb-run jobs` command refreshes Slurm state and parses the training log for any job that has finished (succeeded, failed, or cancelled) — there is no background updater. Run from `$LLMB_INSTALL`:

-# Output in CSV format
-$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --format=csv
+```bash
+# List all jobs you've submitted, with parsed metrics
+llmb-run jobs

-# Output in JSON format
-$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --format=json
+# Full details for one job (Job ID comes from the listing above)
+llmb-run jobs show

-# Show full filenames instead of shortened versions
-$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --full-names
+# Open the training log; --follow tails it, --dir prints the experiment directory
+llmb-run jobs log
 ```

-Example output:
+Example `llmb-run jobs` output (illustrative values):

-```shell
-Elapsed Time (ms) and TFLOPS/GPU Analysis (iterations 35-44)
-================================================================================
-Experiment Status Time Mean (ms) Time Std (ms) TFLOPS_per_GPU Mean TFLOPS_per_GPU Std
------------------------------------------------------------------------------------------- -------- ------------- ------------ ------------------- ------------------
-pretrain_deepseek_v3_bf16_gpus256_tp1_pp4_cp1_vp4_ep64_mbs1_gbs2048_992591 Success 11071.480 8.236 769.50 0.58
+```text
+ Workload DType Scale Job ID Profile Submit Time Slurm Status Elapsed s/iter TFLOPS/GPU
+ pretrain_example_8b bf16 128 1234567 No 2026-04-17 13:42 COMPLETED 00:12:34 4.21 1234.56
+ pretrain_example_70b fp8 256 1234589 No 2026-04-17 14:05 RUNNING 00:03:11
 ```

-To obtain throughput as a tokens per second measurement, follow this formula:
+Blank `s/iter` or `TFLOPS/GPU` means the job has not finished yet, or the log did not contain enough completed iterations. See the [llmb-run README](../../../cli/llmb-run/README.md#jobs-command) for the full command reference.

-```shell
-(throughput in tokens per second) = (sequence length) * (global batch size) / training_step_timing
-```
-
-E.g. 4096 * 2048 / 11.072 = 757641
+## Derived metrics

-To calculate time to train estimate:
+To convert step time into tokens per second:

-```shell
-(time to train in days) = (total tokens) / (throughput in tokens per second) / (number of seconds in a day)
+```text
+(throughput in tokens/sec) = (sequence length) * (global batch size) / (s/iter)
 ```

-E.g. 1e12 / 757641 / 86400 = 15.28 days
+To estimate time-to-train for a target token budget:

-To calculate the model flops utilization (MFU):
-
-```shell
-MFU = (achieved TFLOPS_per_GPU) / (peak GPU FLOPS)
+```text
+(time to train in days) = (total tokens) / (throughput in tokens/sec) / 86400
 ```

-E.g. DeepSeek-V3 BF16 on 256x GB200 GPUs (GBS=2048)
-
-```shell
-peak FLOPS for GB200 BF16 = 2.45 PFLOPS
-achieved TFLOPS_per_GPU = 769.50 TFLOPS
+To compute model FLOPs utilization (MFU):

-MFU = 769.50e+12 / 2.45e+15 = 31.41%
+```text
+MFU = TFLOPS/GPU / (peak GPU FLOPS)
 ```

-**Peak theoretical throughput across GPUs and Data Types (in TFLOPS)**
-
 For peak theoretical throughput values used in MFU calculations, see the [Peak Theoretical Throughput](../../../README.md#peak-theoretical-throughput) section in the main README.

 # Prerequisites
@@ -173,31 +160,33 @@ llmb-run submit -w pretrain_deepseek-v3 --dtype fp8 --scale 512

 ### Additional SLURM Parameters

-Use a SLURM reservation:
+For `llmb-run submit`, use the built-in Slurm flags instead of `ADDITIONAL_SLURM_PARAMS`.
+
+Use a Slurm reservation:

 ```bash
-ADDITIONAL_SLURM_PARAMS="reservation=my_reservation" llmb-run submit -w pretrain_deepseek-v3 --dtype bf16 --scale 256
+llmb-run submit -w pretrain_deepseek-v3 --dtype bf16 --scale 256 --reservation my_reservation
 ```

 Run on specific nodes:

 ```bash
-ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002" llmb-run submit -w pretrain_deepseek-v3 --dtype bf16 --scale 256
+llmb-run submit -w pretrain_deepseek-v3 --dtype bf16 --scale 256 --nodelist node001,node002
 ```

 Exclude specific nodes:

 ```bash
-ADDITIONAL_SLURM_PARAMS="exclude=node003,node004" llmb-run submit -w pretrain_deepseek-v3 --dtype bf16 --scale 256
+llmb-run submit -w pretrain_deepseek-v3 --dtype bf16 --scale 256 --exclude node003,node004
 ```

-Combine multiple parameters (semicolon-separated):
+Combine multiple parameters:

 ```bash
-ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002;reservation=my_reservation;exclusive" llmb-run submit -w pretrain_deepseek-v3 --dtype bf16 --scale 256
+llmb-run submit -w pretrain_deepseek-v3 --dtype bf16 --scale 256 --nodelist node001,node002 --reservation my_reservation --slurm-arg exclusive
 ```

-For more details on llmb-run usage, see the [llmb-run documentation](../../../cli/llmb-run/README.md).
+For more details on `llmb-run` usage, see the [llmb-run documentation](../../../cli/llmb-run/README.md).

 ## Direct Method

@@ -270,7 +259,7 @@ The `` typically follows the pattern: `pretrain_deepseek_v3_671

 **Key files:**

-- `log-.out` - Contains training step timing and performance metrics analyzed by `parse_train_timing_mbridge.sh`
+- `log-.out` - Contains training step timing and performance metrics parsed by `llmb-run jobs`
 - `nsys_profile/` - Contains profiling traces when using the `-p` flag with `llmb-run` or when `ENABLE_PROFILE=true`

 # Profiling
@@ -334,10 +323,17 @@ PyTorch Profiling is intended for rare, advanced debugging scenarios such as NCC

 > **Note:** This option is mutually exclusive with Nsight profiling (`ENABLE_PROFILE`).
Both cannot be enabled at the same time. +> **Note:** PyTorch profiling is **not supported on H100** (nemo:25.09.00). The launch script will abort with an error if `ENABLE_PYTORCH_PROFILE=true` is set for H100. + **Example command:** ```shell ENABLE_PYTORCH_PROFILE=true llmb-run submit -w pretrain_deepseek-v3 --dtype bf16 --scale 256 ``` -For details on the PyTorch Profiler and how to view resulting traces, see the [PyTorch Profiler documentation](https://docs.pytorch.org/tutorials/recipes/recipes/profiler_recipe.html). +The trace file location depends on the GPU type and the NeMo version used: + +- **GB300, GB200, B300** (nemo:26.04.00): `torch_profile/rank-N.json.gz` +- **B200** (nemo:26.02.01): `pytorch_profile/` + +In both cases `N` is the rank number. For details on the PyTorch Profiler and how to view resulting traces, see the [PyTorch Profiler documentation](https://docs.pytorch.org/tutorials/recipes/recipes/profiler_recipe.html). diff --git a/deepseek_v3/pretrain/megatron_bridge/launch.sh b/deepseek_v3/pretrain/megatron_bridge/launch.sh index b0f624c..b563384 100755 --- a/deepseek_v3/pretrain/megatron_bridge/launch.sh +++ b/deepseek_v3/pretrain/megatron_bridge/launch.sh @@ -43,8 +43,10 @@ DTYPE=${DTYPE,,} if [[ $GPU_TYPE == "h100" ]]; then FW_VERSION=25.09.00 -else +elif [[ $GPU_TYPE == "b300" ]] || [[ $GPU_TYPE == "b200" ]]; then FW_VERSION=26.02.01 +else + FW_VERSION=26.04.00 fi if [[ $DTYPE == "fp8" ]]; then @@ -93,7 +95,10 @@ if [[ -n ${RUN_CONF_MOUNTS:-""} ]]; then CONTAINER_MOUNTS+="${RUN_CONF_MOUNTS}" fi -CONFIG_OVERRIDES="" +CONFIG_OVERRIDES="${CONFIG_OVERRIDES:-}" +if [[ -n ${CONFIG_OVERRIDES} ]]; then + CONFIG_OVERRIDES+=" " +fi if [[ -n ${CONTAINER_MOUNTS} ]]; then CONFIG_OVERRIDES+=" --custom_mounts $CONTAINER_MOUNTS" fi @@ -103,11 +108,16 @@ if [[ $PROFILE_ENABLED == "true" ]] && [[ $PYTORCH_PROFILE_ENABLED == "true" ]]; exit 1 fi +if [[ $PYTORCH_PROFILE_ENABLED == "true" ]] && [[ $GPU_TYPE == "h100" ]]; then + echo "Error: PyTorch profiling is 
not supported on H100 (nemo:25.09.00). Use GB300, GB200, B300, or B200." >&2 + exit 1 +fi + if [[ $PROFILE_ENABLED == "true" ]]; then CONFIG_OVERRIDES+=" --enable_nsys " CONFIG_OVERRIDES+=" --profiling_start_step=$PROFILE_START_STEP " CONFIG_OVERRIDES+=" --profiling_stop_step=$PROFILE_STOP_STEP " - if [[ $FW_VERSION == "26.02.01" ]]; then + if [[ $FW_VERSION == "26.04.00" ]] || [[ $FW_VERSION == "26.02.01" ]]; then PROFILE_RANKS=$(seq -s, 0 $((JOB_TOTAL_GPUS - 1))) CONFIG_OVERRIDES+=" --profiling_ranks=$PROFILE_RANKS" CONFIG_OVERRIDES+=" --nsys_trace=cuda " @@ -149,9 +159,13 @@ if [[ $GPU_TYPE == "h100" ]] && [[ $DTYPE == "fp8" ]]; then CONFIG_OVERRIDES+=" --fp8_recipe cs " fi +if [[ $FW_VERSION == "26.04.00" ]]; then + CONFIG_OVERRIDES+=" --packager none " +fi + # run command pushd $LLMB_WORKLOAD/Megatron-Bridge -if [[ $FW_VERSION == "26.02.01" ]]; then +if [[ $FW_VERSION == "26.04.00" ]] || [[ $FW_VERSION == "26.02.01" ]]; then python3 scripts/performance/setup_experiment.py \ --container_image $IMAGE \ --compute_dtype $COMPUTE_TYPE \ diff --git a/deepseek_v3/pretrain/megatron_bridge/metadata.yaml b/deepseek_v3/pretrain/megatron_bridge/metadata.yaml index 72b159b..4e5deb5 100644 --- a/deepseek_v3/pretrain/megatron_bridge/metadata.yaml +++ b/deepseek_v3/pretrain/megatron_bridge/metadata.yaml @@ -29,6 +29,10 @@ container: images: by_gpu: default: + - 'nvcr.io#nvidia/nemo:26.04.00' + b300: + - 'nvcr.io#nvidia/nemo:26.02.01' + b200: - 'nvcr.io#nvidia/nemo:26.02.01' h100: - 'nvcr.io#nvidia/nemo:25.09.00' @@ -43,7 +47,21 @@ repositories: default: megatron_bridge: url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" - commit: "aeead1ae667d795ebe725cb4a608581a21f402cc" + commit: "fab68031197b64934027e188c0cb417fdf1e1d7a" + nemo_run: + url: "https://github.com/NVIDIA-NeMo/Run.git" + commit: "64b91e0187b93475ea0d54028317e349ced7ac1b" + b300: + megatron_bridge: + url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" + commit: 
"f07871e23f637f4ae87d92256babc51fdbb12f39" + nemo_run: + url: "https://github.com/NVIDIA-NeMo/Run.git" + commit: "525d68bfce2d6baed86ed3d7d0edbae07833ea0d" + b200: + megatron_bridge: + url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" + commit: "f07871e23f637f4ae87d92256babc51fdbb12f39" nemo_run: url: "https://github.com/NVIDIA-NeMo/Run.git" commit: "525d68bfce2d6baed86ed3d7d0edbae07833ea0d" @@ -100,14 +118,14 @@ run: gb300: model_configs: - model_size: '671b' - dtypes: ['bf16', 'fp8'] + dtypes: ['bf16', 'fp8', 'nvfp4'] scales: [128, 256, 512] proxy_scales: [64] gb200: model_configs: - model_size: '671b' - dtypes: ['bf16', 'fp8'] + dtypes: ['bf16', 'fp8', 'nvfp4'] scales: [256, 512] proxy_scales: [64] @@ -129,3 +147,4 @@ run: dtypes: bf16: [1024] fp8: [512, 1024] + diff --git a/deepseek_v3/pretrain/torchtitan/README.md b/deepseek_v3/pretrain/torchtitan/README.md index fbbe004..8d6cf99 100644 --- a/deepseek_v3/pretrain/torchtitan/README.md +++ b/deepseek_v3/pretrain/torchtitan/README.md @@ -125,31 +125,33 @@ llmb-run submit -w pretrain_deepseek-v3-torchtitan --dtype fp8 --scale 256 ### Additional SLURM Parameters -Use a SLURM reservation: +For `llmb-run submit`, use the built-in Slurm flags. This workload uses the `configured_sbatch` launcher, so these flags are applied to the outer `sbatch` submission. 
+
+Use a Slurm reservation:

 ```bash
-ADDITIONAL_SLURM_PARAMS="reservation=my_reservation" llmb-run submit -w pretrain_deepseek-v3-torchtitan --dtype bf16 --scale 256
+llmb-run submit -w pretrain_deepseek-v3-torchtitan --dtype bf16 --scale 256 --reservation my_reservation
 ```

 Run on specific nodes:

 ```bash
-ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002" llmb-run submit -w pretrain_deepseek-v3-torchtitan --dtype bf16 --scale 256
+llmb-run submit -w pretrain_deepseek-v3-torchtitan --dtype bf16 --scale 256 --nodelist node001,node002
 ```

 Exclude specific nodes:

 ```bash
-ADDITIONAL_SLURM_PARAMS="exclude=node003,node004" llmb-run submit -w pretrain_deepseek-v3-torchtitan --dtype bf16 --scale 256
+llmb-run submit -w pretrain_deepseek-v3-torchtitan --dtype bf16 --scale 256 --exclude node003,node004
 ```

-Combine multiple parameters (semicolon-separated):
+Combine multiple parameters:

 ```bash
-ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002;reservation=my_reservation;exclusive" llmb-run submit -w pretrain_deepseek-v3-torchtitan --dtype bf16 --scale 256
+llmb-run submit -w pretrain_deepseek-v3-torchtitan --dtype bf16 --scale 256 --nodelist node001,node002 --reservation my_reservation --slurm-arg exclusive
 ```

-For more details on llmb-run usage, see the [llmb-run documentation](../../../cli/llmb-run/README.md).
+For more details on `llmb-run` usage, see the [llmb-run documentation](../../../cli/llmb-run/README.md).

 Most environment-variable overrides documented below can also be passed through `llmb-run` by prefixing them on the submit command.
For example: @@ -197,8 +199,8 @@ TRAINING_STEPS=5000 LOCAL_BATCH_SIZE=8 llmb-run submit -w pretrain_deepseek-v3-t - `EP_COMM_BACKEND`: Expert parallel communication backend (default: `deepep` for H100/B200, `hybridep` for GB200) - `RUN_CONF_IMAGE`: Override container image path - `RUN_CONF_MOUNTS`: Additional container mounts -- `ADDITIONAL_SLURM_PARAMS`: Extra `srun` flags (for example `nodelist=...`, `reservation=...`, `exclusive`), semicolon-separated - - Example: `"nodelist=node001,node002;reservation=my_reservation;exclusive"` + +To control the batch allocation in the direct method, pass flags directly to `sbatch` or use the corresponding `SBATCH_*` environment variables. ## Running the Launch Script @@ -208,6 +210,12 @@ TRAINING_STEPS=5000 LOCAL_BATCH_SIZE=8 llmb-run submit -w pretrain_deepseek-v3-t GPU_TYPE= JOB_TOTAL_GPUS= sbatch launch.sh ``` +For example, to use a reservation with the direct method: + +```bash +GPU_TYPE= JOB_TOTAL_GPUS= sbatch --reservation=my_reservation launch.sh +``` + ### Example Commands Train on H100 GPUs (minimum configuration): diff --git a/deepseek_v3/pretrain/torchtitan/launch.sh b/deepseek_v3/pretrain/torchtitan/launch.sh index 19b148c..47076dc 100644 --- a/deepseek_v3/pretrain/torchtitan/launch.sh +++ b/deepseek_v3/pretrain/torchtitan/launch.sh @@ -118,30 +118,6 @@ export DATASET_PATH=${DATASET_PATH:-$LLMB_INSTALL/datasets/c4} export SEQ_LEN=${SEQ_LEN:-4096} export TRAINING_STEPS=${TRAINING_STEPS:-60} -# Handle additional SLURM parameters from environment variable -ADDITIONAL_SLURM_PARAMS=${ADDITIONAL_SLURM_PARAMS:-""} -ADDITIONAL_SRUN_ARGS="" -if [ -n "$ADDITIONAL_SLURM_PARAMS" ]; then - # Parse semicolon-separated params: key=value pairs or standalone flags - IFS=';' read -ra PARAMS <<< "$ADDITIONAL_SLURM_PARAMS" - for param in "${PARAMS[@]}"; do - param=$(echo "$param" | xargs) # Trim whitespace - if [ -n "$param" ]; then - if [[ $param == *"="* ]]; then - # Key=value pair - key=$(echo "$param" | cut -d'=' -f1 | xargs) - 
value=$(echo "$param" | cut -d'=' -f2- | xargs) - if [ -n "$key" ] && [ -n "$value" ]; then - ADDITIONAL_SRUN_ARGS+=" --${key}=${value}" - fi - else - # Standalone flag (no value) - ADDITIONAL_SRUN_ARGS+=" --${param}" - fi - fi - done -fi - CONTAINER_MOUNTS="$TORCHTITAN_HOME:$TORCHTITAN_HOME:rw,$SLURM_LOG_DIR:$SLURM_LOG_DIR:rw,$LLMB_OUTPUT_DIR:$LLMB_OUTPUT_DIR:rw,$LLMB_REPO:$LLMB_REPO:ro,$DATASET_PATH:$DATASET_PATH:ro" if [[ -n ${RUN_CONF_MOUNTS:-} ]]; then CONTAINER_MOUNTS+=",${RUN_CONF_MOUNTS}" @@ -173,6 +149,11 @@ if [[ $DTYPE == "fp8" ]]; then MXFP8_TRAIN_ARGS="--model.converters=quantize.linear.mx,quantize.grouped_mm.mx --quantize.linear.mx.recipe_name=mxfp8_cublas --quantize.grouped_mm.mx.fqns=experts --quantize.grouped_mm.mx.recipe_name=mxfp8 " fi +CONTAINER_ENV_KEYS="LOG_RANK" +if [[ -n ${LLMB_CONTAINER_ENV:-} ]]; then + CONTAINER_ENV_KEYS+=",${LLMB_CONTAINER_ENV}" +fi + TRAIN_CMD="\ cd $TORCHTITAN_HOME; \ ulimit -c 0; \ @@ -203,9 +184,8 @@ $PROFILE_ARGS" srun --container-image="$IMAGE" \ --container-name=deepseek-v3-torchtitan \ --container-mounts="$CONTAINER_MOUNTS" \ - --container-env="LOG_RANK" \ + --container-env="$CONTAINER_ENV_KEYS" \ --output "$SLURM_LOG_DIR/log-torchtitan_${MODEL_NAME}_${JOB_TOTAL_GPUS}gpus_%j_${SLURM_RESTART_COUNT:-0}.out" \ --no-container-mount-home \ --container-writable \ - $ADDITIONAL_SRUN_ARGS \ bash -c "$TRAIN_CMD" diff --git a/devzone-repro.md b/devzone-repro.md index 924cc95..3857e3f 100644 --- a/devzone-repro.md +++ b/devzone-repro.md @@ -4,24 +4,23 @@ This repository provides instructions reproduce inference performance data from ## Prerequisites -Before configuring the orchestrator, ensure you have downloaded the required model weights from Hugging Face: +Before configuring the orchestrator, ensure you have downloaded the required NVFP4 model weights from Hugging Face: - **DeepSeek-R1 (DSR1):** [DeepSeek-R1-0528-NVFP4-v2](https://huggingface.co/nvidia/DeepSeek-R1-0528-NVFP4-v2) -- **DeepSeek-V4-Pro (DSv4-Pro):** 
[DeepSeek-V4-Pro](https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro) -- **gpt-oss-120b:** [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) -- **Qwen3.5-397B (NVFP4):** [Qwen3.5-397B-A17B-NVFP4](https://huggingface.co/nvidia/Qwen3.5-397B-A17B-NVFP4) +- **Qwen3.5-397B:** [Qwen/Qwen3.5-397B-A17B](https://huggingface.co/Qwen/Qwen3.5-397B-A17B) - **Kimi-K2.5:** [Kimi-K2.5-NVFP4](https://huggingface.co/nvidia/Kimi-K2.5-NVFP4) ## Environment Setup -Benchmarking is orchestrated using [srt-slurm](https://github.com/NVIDIA/srt-slurm), a command-line tool for distributed LLM inference benchmarks on SLURM clusters. (Support for benchmarking Kubernetes clusters coming soon.) +Benchmarking is orchestrated using [srt-slurm](https://github.com/ishandhanani/srt-slurm), a command-line tool for distributed LLM inference benchmarks on SLURM clusters. (Support for benchmarking Kubernetes clusters coming soon.) 1. **Clone and Install:** ```bash # Enter a directory on NFS, accessible by all nodes of your cluster. -git clone https://github.com/NVIDIA/srt-slurm.git +git clone https://github.com/ishandhanani/srt-slurm.git cd srt-slurm +git checkout recipes/moe # Initialize virtual environment and install dependencies (not shown) uv venv @@ -37,17 +36,16 @@ make setup ARCH=aarch64 # or ARCH=x86_64 ``` 3. **Configure Model Paths:** - The setup script will generate an `srtslurm.yaml` file. Edit this file to append your local model paths. The alias on the left must match the `model.path` value used in the recipe YAMLs: + The setup script will generate an `srtslurm.yaml` file. 
Edit this file to append your local model paths: ```yaml -model_paths: +model_path: dsr1: /path/to/local/dsr1 - dsv4-pro: /path/to/local/dsv4-pro - qwen3.5-nvfp4: /path/to/local/qwen3.5-397b-nvfp4 - kimi-k25-nvfp4: /path/to/local/kimi-k2.5-nvfp4 + qwen3.5-397b: /path/to/local/qwen3.5-397b + kimi-k2.5: /path/to/local/kimi-k2.5 ``` -Depending on your cluster configuration, you may need to specify additional arguments in srtslurm.yaml (SLURM account/partition, container image aliases, GPU-per-node defaults, etc.). See https://github.com/NVIDIA/srt-slurm/blob/main/srtslurm.yaml.example for the full reference. +Depending on your cluster configuration, you may need to specify additional arguments in srtslurm.yaml. See https://github.com/ishandhanani/srt-slurm/blob/main/srtslurm.yaml.example for details. ## Running the Benchmarks @@ -59,13 +57,12 @@ srtctl apply -f Available benchmarking configurations for published performance data are mapped below. Select the recipe that matches your target performance profile. 
-| Model | 1K/1K | 8K/1K | 128K/8K | -| :--------------- | :------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| **DSR1** | [GB300](https://github.com/NVIDIA/srt-slurm/blob/main/recipes/gb300-fp4/1k1k/max_tpt.yaml) | [GB300](https://github.com/NVIDIA/srt-slurm/blob/main/recipes/gb300-fp4/8k1k/max_tpt.yaml) | [GB300](https://github.com/NVIDIA/srt-slurm/blob/main/recipes/gb300-fp4/128k8k/maxthroughput-ctx3_pp4_gen1_dep8_batch32_eplb0_mtp0.yaml) | -| **DSv4-Pro** | [GB300](https://github.com/NVIDIA/srt-slurm/blob/main/recipes/dsv4-pro/sglang/gb300-fp4/1k1k/agg/stp/agg-max-tpt-tep.yaml) | _Coming soon_ | | -| **gpt-oss-120b** | [B200](https://github.com/NVIDIA/srt-slurm/blob/main/recipes/trtllm/gpt-oss-120b/b200-fp4/1k1k/agg-tp1.yaml) | [B200](https://github.com/NVIDIA/srt-slurm/blob/main/recipes/trtllm/gpt-oss-120b/b200-fp4/8k1k/agg-tp1.yaml) | | -| **Kimi-K2.5** | _Coming soon_ | _Coming soon_ | | -| **Qwen3.5-397B** | [GB200](https://github.com/NVIDIA/srt-slurm/blob/main/recipes/qwen3.5/nvfp4/agg/stp_prefix_off/tp4.yaml) | _Coming soon_ | | +| Model | 1K/1K | 8K/1K | 1K/8K | 128K/8K | +| :--------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------ | 
:----------------------------------------------------------------------------------------------- | +| **DSR1** | | [GB300](https://github.com/ishandhanani/srt-slurm/blob/main/recipes/gb300-fp4/1k1k/max_tpt.yaml) | [GB300](https://github.com/ishandhanani/srt-slurm/tree/main/recipes/gb300-fp4/8k1k) | [GB300](https://github.com/ishandhanani/srt-slurm/blob/main/recipes/gb300-fp4/1k8k/max-tpt.yaml) | +| **gpt-oss-120b** | [B200](https://github.com/ishandhanani/srt-slurm/tree/main/recipes/trtllm/b200-fp4/1k1k/mtp), [H200](https://github.com/ishandhanani/srt-slurm/tree/main/recipes/trtllm/h200/1k1k) | [B200](https://github.com/ishandhanani/srt-slurm/tree/main/recipes/trtllm/b200-fp4/8k1k/mtp), [H200](https://github.com/ishandhanani/srt-slurm/tree/main/recipes/trtllm/h200/8k1k/mtp) | | | +| **Kimi-K2.5** | [B200](https://github.com/ishandhanani/srt-slurm/tree/recipes/moe/recipes/kimi-k2.5/b200/1k1k) | [B200](https://github.com/ishandhanani/srt-slurm/tree/recipes/moe/recipes/kimi-k2.5/b200/8k1k) | [B200](https://github.com/ishandhanani/srt-slurm/tree/recipes/moe/recipes/kimi-k2.5/b200/1k8k) | | +| **Qwen3.5-397B** | [B200](https://github.com/ishandhanani/srt-slurm/tree/recipes/moe/recipes/qwen3.5-397b/b200/1k1k) | [B200](https://github.com/ishandhanani/srt-slurm/tree/recipes/moe/recipes/qwen3.5-397b/b200/8k1k) | [B200](https://github.com/ishandhanani/srt-slurm/tree/recipes/moe/recipes/qwen3.5-397b/b200/1k8k) | | ## Support diff --git a/exemplar.yaml b/exemplar.yaml index 6b475cc..081904b 100644 --- a/exemplar.yaml +++ b/exemplar.yaml @@ -20,41 +20,39 @@ # DEALINGS IN THE SOFTWARE. 
config: scale: 512 - repeats: 3 - profile: true + repeats: 1 + profile: false workloads: gb300: - pretrain_deepseek-v3_671b: - dtypes: [bf16, fp8] + dtypes: [bf16, fp8, nvfp4] - pretrain_gpt_oss_120b: dtypes: [bf16] - - pretrain_grok1_314b: - dtypes: [bf16, fp8] + - pretrain_kimi-k2_1t: + dtypes: [fp8] - pretrain_llama3.1_405b: dtypes: [fp8, nvfp4] - pretrain_llama3.1_70b: dtypes: [fp8, nvfp4] - pretrain_nemotron-h_56b: dtypes: [fp8] - - pretrain_nemotron4-340b_340b: - dtypes: [bf16, fp8] + - pretrain_nemotron_3_120b: + dtypes: [bf16, fp8, nvfp4] - pretrain_qwen3_235b: dtypes: [bf16] gb200: - pretrain_deepseek-v3_671b: - dtypes: [bf16, fp8] + dtypes: [bf16, fp8, nvfp4] - pretrain_gpt_oss_120b: dtypes: [bf16] - - pretrain_grok1_314b: - dtypes: [bf16, fp8] + - pretrain_kimi-k2_1t: + dtypes: [fp8] - pretrain_llama3.1_405b: dtypes: [fp8, nvfp4] - pretrain_llama3.1_70b: - dtypes: [fp8] + dtypes: [fp8, nvfp4] - pretrain_nemotron-h_56b: dtypes: [fp8] - - pretrain_nemotron4-340b_340b: - dtypes: [bf16, fp8] - pretrain_qwen3_235b: dtypes: [bf16] b300: @@ -63,11 +61,13 @@ workloads: - pretrain_gpt_oss_120b: dtypes: [bf16] - pretrain_llama3.1_405b: - dtypes: [fp8] + dtypes: [fp8, nvfp4] - pretrain_llama3.1_70b: - dtypes: [fp8] + dtypes: [fp8, nvfp4] - pretrain_nemotron-h_56b: dtypes: [fp8] + - pretrain_nemotron_3_120b: + dtypes: [bf16] - pretrain_qwen3_235b: dtypes: [bf16] b200: @@ -75,15 +75,15 @@ workloads: dtypes: [bf16, fp8] - pretrain_gpt_oss_120b: dtypes: [bf16] - - pretrain_grok1_314b: - dtypes: [bf16, fp8] + - pretrain_kimi-k2_1t: + dtypes: [fp8] - pretrain_llama3.1_405b: dtypes: [fp8, nvfp4] - pretrain_llama3.1_70b: dtypes: [fp8, nvfp4] - pretrain_nemotron-h_56b: dtypes: [fp8] - - pretrain_nemotron4-340b_340b: + - pretrain_nemotron_3_120b: dtypes: [bf16, fp8] - pretrain_qwen3_235b: dtypes: [bf16] @@ -92,13 +92,9 @@ workloads: dtypes: [fp8] - pretrain_gpt_oss_120b: dtypes: [bf16] - - pretrain_grok1_314b: - dtypes: [bf16, fp8] - pretrain_llama3.1_70b: dtypes: [bf16, 
fp8] - pretrain_nemotron-h_56b: dtypes: [fp8] - - pretrain_nemotron4-340b_340b: - dtypes: [bf16, fp8] - pretrain_qwen3_235b: dtypes: [bf16] diff --git a/gpt-oss/pretrain/README.md b/gpt-oss/pretrain/README.md index ed8135c..d7b3a3d 100644 --- a/gpt-oss/pretrain/README.md +++ b/gpt-oss/pretrain/README.md @@ -34,61 +34,58 @@ This recipe contains information and scripts to produce performance results for # Performance Measurement and Analysis -Performance for GPT OSS training is measured by the achieved GPU FLOPS via the `TFLOPS_per_GPU` metric, which indicates computational throughput efficiency. Additionally, training step timing (milliseconds per iteration) is captured and logged for every training step in the main training log file [see Output Locations](#output-locations). +Performance is reported as: -Since the early training steps typically take much longer time (with input prefetch, activation memory allocation, and JIT compilation), we use the `parse_train_timing_mbridge.sh` script to analyze iterations 35-44 and calculate mean and standard deviation for reliable performance metrics for both TFLOPS per GPU and timing measurements. +- `s/iter` — wall-clock seconds per training step +- `TFLOPS/GPU` — sustained FLOPS achieved per GPU -### Running the parse_train_timing_mbridge.sh script +Each benchmark runs 50 steps; iterations 35–44 are averaged to skip warmup (input prefetch, activation allocation, JIT compilation). -To analyze training timing from your experiment results, run the script from the workload directory. In an installed environment, recipe files are available under `$LLMB_INSTALL/llmb_repo` (a copy created by the installer). +## Viewing results with `llmb-run jobs` + +Each `llmb-run jobs` command refreshes Slurm state and parses the training log for any job that has finished (succeeded, failed, or cancelled) — there is no background updater. 
Run from `$LLMB_INSTALL`: ```bash -# Basic usage - parses results in the directory named 'experiments' in the current folder -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh +# List all jobs you've submitted, with parsed metrics +llmb-run jobs -# Specify a different experiments directory -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh /path/to/experiments +# Full details for one job (Job ID comes from the listing above) +llmb-run jobs show -# Output in CSV format -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --format=csv +# Open the training log; --follow tails it, --dir prints the experiment directory +llmb-run jobs log +``` -# Output in JSON format -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --format=json +Example `llmb-run jobs` output (illustrative values): -# Show full filenames instead of shortened versions -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --full-names +```text + Workload DType Scale Job ID Profile Submit Time Slurm Status Elapsed s/iter TFLOPS/GPU + pretrain_example_8b bf16 128 1234567 No 2026-04-17 13:42 COMPLETED 00:12:34 4.21 1234.56 + pretrain_example_70b fp8 256 1234589 No 2026-04-17 14:05 RUNNING 00:03:11 ``` -Example output: +Blank `s/iter` or `TFLOPS/GPU` means the job has not finished yet, or the log did not contain enough completed iterations. See the [llmb-run README](../../cli/llmb-run/README.md#jobs-command) for the full command reference. 
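The warmup-skipping average described above, and the reason a metric can come back blank, can be sketched in a few lines. This is a minimal illustration and not the parser that `llmb-run jobs` uses: the function name `window_mean`, the 1-based iteration numbering, and the idea that per-step times are already available as a list of seconds are all assumptions made for the example.

```python
from statistics import mean
from typing import Optional, Sequence


def window_mean(step_times_s: Sequence[float], first: int = 35, last: int = 44) -> Optional[float]:
    """Average step times over iterations first..last (inclusive, 1-based by assumption).

    Returns None when the run never reached iteration `last`, which is the
    situation that would leave a metric column blank.
    """
    if len(step_times_s) < last:
        return None
    return mean(step_times_s[first - 1:last])


# Synthetic run: 34 slow warmup steps at 9.0 s, then 16 steady steps at 4.0 s.
times = [9.0] * 34 + [4.0] * 16
print(window_mean(times))       # steady-state average, warmup excluded
print(window_mean(times[:40]))  # None: the run stopped before iteration 44
```

Averaging a fixed late window instead of all 50 steps keeps the slow first iterations from inflating the reported `s/iter`.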
-```shell -Elapsed Time (ms) and MODEL_TFLOPS/GPU Analysis (iterations 35-44) -================================================================================ -Experiment Status Time Mean (ms) Time Std (ms) MODEL_TFLOPS_per_GPU Mean MODEL_TFLOPS_per_GPU Std ------------------------------------------------------------------------------------------- -------- ------------- ------------ ------------------- ------------------ -pretrain_gpt_oss_120b_bf16_gpus64_tp1_pp1_cp1_vpNone_ep64_etp1_mbs4_gbs1280_1697683 Success 5197.940 7.013 428.11 0.59 -``` +## Derived metrics -To obtain throughput as a tokens per second measurement, follow this formula: +To convert step time into tokens per second: -```shell -(throughput in tokens per second) = (sequence length) * (global batch size) / training_step_timing +```text +(throughput in tokens/sec) = (sequence length) * (global batch size) / (s/iter) ``` -To calculate time to train estimate: +To estimate time-to-train for a target token budget: -```shell -(time to train in days) = (total tokens) / (throughput in tokens per second) / (number of seconds in a day) +```text +(time to train in days) = (total tokens) / (throughput in tokens/sec) / 86400 ``` -To calculate the model flops utilization (MFU): +To compute model FLOPs utilization (MFU): -```shell -MFU = (achieved TFLOPS_per_GPU) / (peak GPU FLOPS) +```text +MFU = TFLOPS/GPU / (peak GPU FLOPS) ``` -**Peak theoretical throughput across GPUs and Data Types (in TFLOPS)** - For peak theoretical throughput values used in MFU calculations, see the [Peak Theoretical Throughput](../../README.md#peak-theoretical-throughput) section in the main README. # Prerequisites @@ -138,31 +135,33 @@ llmb-run submit -w pretrain_gpt_oss --scale 64 ### Additional SLURM Parameters -Use a SLURM reservation: +For `llmb-run submit`, use the built-in Slurm flags instead of `ADDITIONAL_SLURM_PARAMS`. 
+ +Use a Slurm reservation: ```bash -ADDITIONAL_SLURM_PARAMS="reservation=my_reservation" llmb-run submit -w pretrain_gpt_oss --scale 128 +llmb-run submit -w pretrain_gpt_oss --scale 128 --reservation my_reservation ``` Run on specific nodes: ```bash -ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002" llmb-run submit -w pretrain_gpt_oss --scale 128 +llmb-run submit -w pretrain_gpt_oss --scale 128 --nodelist node001,node002 ``` Exclude specific nodes: ```bash -ADDITIONAL_SLURM_PARAMS="exclude=node003,node004" llmb-run submit -w pretrain_gpt_oss --scale 128 +llmb-run submit -w pretrain_gpt_oss --scale 128 --exclude node003,node004 ``` -Combine multiple parameters (semicolon-separated): +Combine multiple parameters: ```bash -ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002;reservation=my_reservation;exclusive" llmb-run submit -w pretrain_gpt_oss --scale 128 +llmb-run submit -w pretrain_gpt_oss --scale 128 --nodelist node001,node002 --reservation my_reservation --slurm-arg exclusive ``` -For more details on llmb-run usage, see the [llmb-run documentation](../../cli/llmb-run/README.md). +For more details on `llmb-run` usage, see the [llmb-run documentation](../../cli/llmb-run/README.md). 
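For scripts that still build the older semicolon-separated `ADDITIONAL_SLURM_PARAMS` string, the mapping shown in the examples above can be sketched as a small converter. This is an illustrative helper and not part of `llmb-run`: the function name is hypothetical, only the three dedicated flags used above (`--reservation`, `--nodelist`, `--exclude`) are mapped, and routing every other entry through `--slurm-arg` is an assumption that should be checked against the llmb-run README.

```python
import shlex

# Flags that appear in the examples above; anything else falls back to
# --slurm-arg (an assumption, not confirmed llmb-run behavior).
DEDICATED_FLAGS = {"reservation", "nodelist", "exclude"}


def slurm_params_to_flags(params: str) -> list[str]:
    """Translate 'key=value;key=value;flag' into llmb-run submit arguments."""
    args: list[str] = []
    for entry in filter(None, (p.strip() for p in params.split(";"))):
        key, sep, value = entry.partition("=")
        if sep and key in DEDICATED_FLAGS:
            args += [f"--{key}", value]
        else:
            args += ["--slurm-arg", entry]
    return args


legacy = "nodelist=node001,node002;reservation=my_reservation;exclusive"
print(shlex.join(slurm_params_to_flags(legacy)))
# --nodelist node001,node002 --reservation my_reservation --slurm-arg exclusive
```

The printed arguments match the combined example above, so the helper can serve as a quick check when migrating existing job scripts.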
## Direct Method @@ -226,7 +225,7 @@ The `` typically follows these patterns: **Key files:** -- `log-.out` - Contains training step timing and performance metrics analyzed by `parse_train_timing_mbridge.sh` +- `log-.out` - Contains training step timing and performance metrics parsed by `llmb-run jobs` - `nsys_profile/` - Contains profiling traces when using the `-p` flag with `llmb-run` or when `ENABLE_PROFILE=true` # Profiling diff --git a/gpt-oss/pretrain/launch.sh b/gpt-oss/pretrain/launch.sh index 55e2e5f..e1bc832 100755 --- a/gpt-oss/pretrain/launch.sh +++ b/gpt-oss/pretrain/launch.sh @@ -87,7 +87,10 @@ if [[ -n ${RUN_CONF_MOUNTS:-""} ]]; then CONTAINER_MOUNTS+="${RUN_CONF_MOUNTS}" fi -CONFIG_OVERRIDES="" +CONFIG_OVERRIDES="${CONFIG_OVERRIDES:-}" +if [[ -n ${CONFIG_OVERRIDES} ]]; then + CONFIG_OVERRIDES+=" " +fi TIME_LIMIT=${TIME_LIMIT:-"00:30:00"} MAX_STEPS=${MAX_STEPS:-50} diff --git a/gpt-oss/pretrain/metadata.yaml b/gpt-oss/pretrain/metadata.yaml index 63ad649..39e45a1 100644 --- a/gpt-oss/pretrain/metadata.yaml +++ b/gpt-oss/pretrain/metadata.yaml @@ -32,7 +32,7 @@ container: repositories: megatron_bridge: url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" - commit: "aeead1ae667d795ebe725cb4a608581a21f402cc" + commit: "f07871e23f637f4ae87d92256babc51fdbb12f39" nemo_run: url: "https://github.com/NVIDIA-NeMo/Run.git" commit: "525d68bfce2d6baed86ed3d7d0edbae07833ea0d" diff --git a/install.sh b/install.sh index 5f6d995..a2be776 100755 --- a/install.sh +++ b/install.sh @@ -160,7 +160,7 @@ get_python_version() { # Check if we're in a virtual environment (standard venv/virtualenv or conda) in_virtual_env() { - [[ -n ${VIRTUAL_ENV:-} ]] || [[ -n ${CONDA_DEFAULT_ENV:-} ]] + [[ -n ${VIRTUAL_ENV:-} ]] || [[ -n ${CONDA_DEFAULT_ENV:-} ]] || [[ -n ${CONDA_PREFIX:-} ]] } # Create a temporary summary file for llmb-install to write structured output. 
@@ -429,12 +429,31 @@ install_package() { popd > /dev/null } +install_optional_package() { + local package_name="$1" + local package_dir="$2" + + if [[ ! -d $package_dir ]]; then + return 0 + fi + + if [[ ! -f $package_dir/pyproject.toml ]]; then + echo "⚠️ Skipping optional $package_name: missing $package_dir/pyproject.toml" >&2 + return 0 + fi + + if ! install_package "$package_name" "$package_dir"; then + echo "⚠️ Optional package $package_name failed to install; continuing." >&2 + fi +} + echo "" echo "📦 Installing core tools..." # Install runner and installer dependencies install_package "llmb-run" "cli/llmb-run" install_package "llmb-install" "cli/llmb-install" +install_optional_package "llmb-collector" "cli/llmb-collector" echo "✅ Core tools installed successfully" diff --git a/kimi-k2/README.md b/kimi-k2/README.md new file mode 100644 index 0000000..0a52c77 --- /dev/null +++ b/kimi-k2/README.md @@ -0,0 +1,304 @@ +# Overview + +This recipe contains information and scripts to produce performance results for the Kimi-K2 (1T parameter MoE) pre-training workloads. The scripts help perform environment setup and launch benchmark jobs. Configurations use weak scaling methodology (global batch size scales proportionally with GPU count). + +The tables below list the GPU counts in `metadata.yaml` for this recipe (256 and 512), with parallelism and batch sizes from Megatron-Bridge `configs/kimi/kimi_workload_base_configs.py` (recipe `kimi_k2`). 
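The weak-scaling statement can be checked against the 256- and 512-GPU rows of the tables that follow: under weak scaling, the global batch size divided by the GPU count stays constant for a given platform. The sketch below is only a sanity check with values copied from those tables; the names `CONFIGS` and `samples_per_gpu` are hypothetical and not part of the recipe.

```python
# (GPUs, global batch size) pairs copied from the per-platform tables in this README.
CONFIGS = {
    "GB300": [(256, 4096), (512, 8192)],
    "GB200": [(256, 2048), (512, 4096)],
    "B200": [(256, 2048), (512, 4096)],
}

SEQ_LEN = 4096  # SeqLen column, identical for every row


def samples_per_gpu(rows):
    """Return the per-GPU batch share, or raise if it changes with scale."""
    shares = {gbs / gpus for gpus, gbs in rows}
    if len(shares) != 1:
        raise ValueError(f"not weak-scaled: {sorted(shares)}")
    return int(shares.pop())


for platform, rows in CONFIGS.items():
    share = samples_per_gpu(rows)
    print(f"{platform}: {share} samples/GPU/step, {share * SEQ_LEN} tokens/GPU/step")
```

Doubling the GPU count doubles the global batch size on every platform, so the work per GPU per step is unchanged between the two scales.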
+ +## GB300 + +| Precision | GPUs | SeqLen | TP | PP | CP | EP | DP | VP | MBS | GBS | +| --------- | :--: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | +| FP8 (MX) | 256 | 4096 | 1 | 4 | 1 | 64 | 64 | 4 | 2 | 4096 | +| FP8 (MX) | 512 | 4096 | 1 | 4 | 1 | 64 | 128 | 4 | 2 | 8192 | + +## GB200 + +| Precision | GPUs | SeqLen | TP | PP | CP | EP | DP | VP | MBS | GBS | +| --------- | :--: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | +| FP8 (MX) | 256 | 4096 | 1 | 4 | 1 | 64 | 64 | 4 | 1 | 2048 | +| FP8 (MX) | 512 | 4096 | 1 | 4 | 1 | 64 | 128 | 4 | 1 | 4096 | + +## B200 + +| Precision | GPUs | SeqLen | TP | PP | CP | EP | DP | VP | MBS | GBS | +| --------- | :--: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | +| FP8 (MX) | 256 | 4096 | 1 | 16 | 1 | 16 | 1 | N/A | 1 | 2048 | +| FP8 (MX) | 512 | 4096 | 1 | 16 | 1 | 16 | 2 | N/A | 1 | 4096 | + +# Performance Measurement and Analysis + +Performance is reported as: + +- `s/iter` — wall-clock seconds per training step +- `TFLOPS/GPU` — sustained FLOPS achieved per GPU + +Each benchmark runs 50 steps; iterations 35–44 are averaged to skip warmup (input prefetch, activation allocation, JIT compilation). + +## Viewing results with `llmb-run jobs` + +Each `llmb-run jobs` command refreshes Slurm state and parses the training log for any job that has finished (succeeded, failed, or cancelled) — there is no background updater. 
Run from `$LLMB_INSTALL`: + +```bash +# List all jobs you've submitted, with parsed metrics +llmb-run jobs + +# Full details for one job (Job ID comes from the listing above) +llmb-run jobs show + +# Open the training log; --follow tails it, --dir prints the experiment directory +llmb-run jobs log +``` + +Example `llmb-run jobs` output (illustrative values): + +```text + Workload DType Scale Job ID Profile Submit Time Slurm Status Elapsed s/iter TFLOPS/GPU + pretrain_example_8b bf16 128 1234567 No 2026-04-17 13:42 COMPLETED 00:12:34 4.21 1234.56 + pretrain_example_70b fp8 256 1234589 No 2026-04-17 14:05 RUNNING 00:03:11 +``` + +Blank `s/iter` or `TFLOPS/GPU` means the job has not finished yet, or the log did not contain enough completed iterations. See the [llmb-run README](../cli/llmb-run/README.md#jobs-command) for the full command reference. + +## Derived metrics + +To convert step time into tokens per second: + +```text +(throughput in tokens/sec) = (sequence length) * (global batch size) / (s/iter) +``` + +To estimate time-to-train for a target token budget: + +```text +(time to train in days) = (total tokens) / (throughput in tokens/sec) / 86400 +``` + +To compute model FLOPs utilization (MFU): + +```text +MFU = TFLOPS/GPU / (peak GPU FLOPS) +``` + +For peak theoretical throughput values used in MFU calculations, see the [Peak Theoretical Throughput](../README.md#peak-theoretical-throughput) section in the main README. + +# Prerequisites + +Requires Python 3.12.x, or conda. + +## Request Access + +No special access required to run this benchmark. + +## Slurm + +We reference a number of Slurm commands and parameters in this document. A brief summary is included below. These are a guide and might not apply to all environments. Consult your system administrator for parameters specific to your system. + +**Common parameters:** + +- `SBATCH_PARTITION` or `-p` – Partition (or queue) to use. +- `SBATCH_ACCOUNT` or `-A` – Slurm account for accounting. 
+- `SBATCH_GPUS_PER_NODE` or `--gres=gpu:` – Set to all GPUs per node if your cluster uses GRES. + +These can be set via environment variables or the corresponding `sbatch` flags. + +## Prepare environment + +Use the **installer** referenced in the [main README](../README.md) to prepare the recipe environment. + +Directory layout and key variables: + +- `LLMB_INSTALL`: Top-level directory for all benchmarking artifacts (images, datasets, venvs, workloads, etc.). +- `LLMB_WORKLOAD`: Workload-specific directory, e.g. `${LLMB_INSTALL}/workloads/pretrain_kimi-k2`. +- Results, logs, and checkpoints are stored under subfolders of `LLMB_WORKLOAD` (see [Output Locations](#output-locations)). + +# Prepare Dataset + +Kimi-K2 training in this recipe uses synthetic data; no dataset preparation is required. + +# Run Training + +After the environment is prepared, run training. The run executes for the first 50 steps and then stops. Logs and results are written under `${LLMB_WORKLOAD}/experiments/` (see [Output Locations](#output-locations)). + +## Using llmb-run (Recommended) + +The easiest way to run benchmarks is using the llmb-run launcher tool. This method handles configuration automatically and provides a streamlined interface. + +```bash +# Navigate to your installation directory +cd $LLMB_INSTALL + +# Example: Kimi-K2 1T, FP8 MX, 256 GB300 GPUs (cluster `gpu_type` in llmb-run config must match, e.g. gb300) +llmb-run submit -w pretrain_kimi-k2 -s 1t --dtype fp8 --scale 256 + +# 512 GPUs (same pattern; set `gpu_type` in your llmb-run cluster config to match the cluster) +llmb-run submit -w pretrain_kimi-k2 -s 1t --dtype fp8 --scale 512 +``` + +### Additional SLURM Parameters + +For `llmb-run submit`, use the built-in Slurm flags instead of `ADDITIONAL_SLURM_PARAMS`. 
+ +Use a Slurm reservation: + +```bash +llmb-run submit -w pretrain_kimi-k2 -s 1t --dtype fp8 --scale 256 --reservation my_reservation +``` + +Run on specific nodes: + +```bash +llmb-run submit -w pretrain_kimi-k2 -s 1t --dtype fp8 --scale 256 --nodelist node001,node002 +``` + +Exclude specific nodes: + +```bash +llmb-run submit -w pretrain_kimi-k2 -s 1t --dtype fp8 --scale 256 --exclude node003,node004 +``` + +Combine multiple parameters: + +```bash +llmb-run submit -w pretrain_kimi-k2 -s 1t --dtype fp8 --scale 256 --nodelist node001,node002 --reservation my_reservation --slurm-arg exclusive +``` + +For more details on `llmb-run` usage, see the [llmb-run documentation](../cli/llmb-run/README.md). + +## Direct Method + +Alternatively, you can run training directly using the launch script. This method provides more control over individual parameters and environment variables. + +**Important**: + +- Ensure your virtual environment is activated before running the training commands below. If you used the installer with conda, run `conda activate $LLMB_INSTALL/venvs/`. If you used the installer with python venv, run `source $LLMB_INSTALL/venvs//bin/activate`. +- Run the launch script from the installed recipe directory: `cd $LLMB_INSTALL/llmb_repo/kimi-k2/` + +### Command Template + +```shell +JOB_TOTAL_GPUS= GPU_TYPE= [DTYPE=] [ADDITIONAL_SLURM_PARAMS=] ./launch.sh +``` + +### Environment Variables + +**Required:** + +- `JOB_TOTAL_GPUS`: Number of GPUs to use. +- `GPU_TYPE`: `gb300`, `gb200`, or `b200` (listed in `metadata.yaml`). Megatron-Bridge also defines an `h100` Kimi-K2 preset; use `GPU_TYPE=h100` with eight GPUs per node if you run that recipe directly. + +**Optional:** + +- `DTYPE`: Precision format (fixed: `fp8`). + +- `FP8_RECIPE`: FP8 recipe (fixed: `mx`; default in `launch.sh`). + +- `ADDITIONAL_SLURM_PARAMS`: Extra `sbatch` flags (e.g. 
`--nodelist`, `--reservation`), semicolon-separated + + - Example: `"nodelist=node001,node002;reservation=my_reservation;exclusive"` + +**Note:** This workload only supports FP8 precision and the **MX** FP8 recipe (`fp8_mx` / `FP8_RECIPE=mx`). + +### Example Commands + +Kimi-K2 1T, FP8 MX, 256 GB300 GPUs: + +```shell +JOB_TOTAL_GPUS=256 GPU_TYPE=gb300 DTYPE=fp8 ./launch.sh +``` + +Kimi-K2 1T, FP8 MX, 512 GB300 GPUs: + +```shell +JOB_TOTAL_GPUS=512 GPU_TYPE=gb300 DTYPE=fp8 ./launch.sh +``` + +Kimi-K2 1T, FP8 MX, 256 GB200 GPUs: + +```shell +JOB_TOTAL_GPUS=256 GPU_TYPE=gb200 DTYPE=fp8 ./launch.sh +``` + +Kimi-K2 1T, FP8 MX, 512 GB200 GPUs: + +```shell +JOB_TOTAL_GPUS=512 GPU_TYPE=gb200 DTYPE=fp8 ./launch.sh +``` + +Kimi-K2 1T, FP8 MX, 256 B200 GPUs: + +```shell +JOB_TOTAL_GPUS=256 GPU_TYPE=b200 DTYPE=fp8 ./launch.sh +``` + +Kimi-K2 1T, FP8 MX, 512 B200 GPUs: + +```shell +JOB_TOTAL_GPUS=512 GPU_TYPE=b200 DTYPE=fp8 ./launch.sh +``` + +# Output Locations + +Benchmark results are saved under `$LLMB_WORKLOAD/experiments/` with the following structure: + +```text +experiments/ +├── / +│ └── _/ +│ ├── / +│ │ ├── log-.out # Main training log (used for timing analysis) +│ │ ├── sbatch_.out # Batch script output +│ │ └── nsys_profile/ # Profiling output (when enabled) +│ │ └── *.nsys-rep +│ └── [batch scripts and other files] +``` + +The `` typically follows the pattern: `pretrain_kimi-k2_1t___`. + +**Key files:** + +- `log-.out` – Training step timing and metrics parsed by `llmb-run jobs`. +- `nsys_profile/` – Profiling traces when using the `-p` flag with `llmb-run` or when `ENABLE_PROFILE=true`. + +# Run Nsight Profiling + +To enable Nsight Systems profiling, set `ENABLE_PROFILE=true` or use the `-p` flag when submitting your job. The job runs 50 steps; steps 45–50 are profiled. + +Install the latest [Nsight Systems](https://docs.nvidia.com/nsight-systems/) to view the resulting reports. 
+ +## Profiling job details + +- **MPI Ranks:** all ranks +- **Steps profiled:** 45–50 (configurable) +- **Output:** under the same experiment directory as training results +- **Filename pattern:** `profile_{SLURM_JOBID}_{SLURM_NODEID}_{SLURM_PROCID}.nsys-rep` + +**Example:** + +```shell +ENABLE_PROFILE=true JOB_TOTAL_GPUS=256 GPU_TYPE=gb300 ./launch.sh +``` + +```shell +llmb-run submit -w pretrain_kimi-k2 -s 1t --dtype fp8 --scale 256 -p +``` + +## Optional profiling options + +- `PROFILE_START_STEP`: first step to profile (default 45). +- `PROFILE_STOP_STEP`: last step to profile (default 50). +- `ENABLE_GPU_METRICS`: set to `true` to collect GPU metrics during Nsight profiling (default false). + +**Example with GPU metrics:** + +```shell +ENABLE_PROFILE=true ENABLE_GPU_METRICS=true JOB_TOTAL_GPUS=256 GPU_TYPE=gb300 ./launch.sh +``` + +## Viewing results + +- Install the [Nsight Systems client](https://developer.nvidia.com/nsight-systems/get-started) on your machine. +- Copy the generated `.nsys-rep` files (e.g. to `/home/nsight-traces/`). +- In Nsight Systems: File → Open and select one or more `.nsys-rep` files. +- For multi-GPU runs, see the [Multi-Report Analysis Guide](https://docs.nvidia.com/nsight-systems/UserGuide/index.html#multi-report-analysis). + +See the [Nsight Systems tutorials](https://developer.nvidia.com/nsight-systems/get-started#tutorials) for a quick start. diff --git a/kimi-k2/launch.sh b/kimi-k2/launch.sh new file mode 100755 index 0000000..812b347 --- /dev/null +++ b/kimi-k2/launch.sh @@ -0,0 +1,150 @@ +#!/bin/bash +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: MIT +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +# DEALINGS IN THE SOFTWARE. + +if [ ${BASH_VERSION:0:1} -lt 4 ] || { [ ${BASH_VERSION:0:1} -eq 4 ] && [ ${BASH_VERSION:2:1} -lt 2 ]; }; then + printf "Unsupported %s version: %s\n" "${BASH}" "${BASH_VERSION}" >&2 + echo "Requires Bash 4.2 or greater." >&2 + exit 1 +fi + +set -eu -o pipefail + +export WORKLOAD_TYPE=pretrain +export MODEL_NAME=kimi-k2 +export FW_VERSION=26.04.00 + +export OPENBLAS_NUM_THREADS=1 # Required for login nodes with tight memory restrictions. Do not remove.
+ +export LLMB_WORKLOAD=$LLMB_INSTALL/workloads/${WORKLOAD_TYPE}_${MODEL_NAME} +export NEMORUN_HOME=$LLMB_WORKLOAD +export LLMB_REPO=$PWD + +export IMAGE=${RUN_CONF_IMAGE:-$LLMB_INSTALL/images/nvidia+nemo+$FW_VERSION.sqsh} + +DTYPE=${DTYPE:-fp8} +DTYPE=${DTYPE,,} +FP8_RECIPE=${FP8_RECIPE:-mx} +FP8_RECIPE=${FP8_RECIPE,,} +COMPUTE_TYPE=${DTYPE}_${FP8_RECIPE} +GPU_TYPE=${GPU_TYPE:?GPU_TYPE is a required variable.} +GPU_TYPE=${GPU_TYPE,,} +JOB_TOTAL_GPUS=${JOB_TOTAL_GPUS:?JOB_TOTAL_GPUS is a required variable.} + +PROFILE_ENABLED=${ENABLE_PROFILE:-false} +PROFILE_ENABLED=${PROFILE_ENABLED,,} +PROFILE_START_STEP=${PROFILE_START_STEP:-45} +PROFILE_STOP_STEP=${PROFILE_STOP_STEP:-50} +GPU_METRICS_ENABLED=${ENABLE_GPU_METRICS:-false} +GPU_METRICS_ENABLED=${GPU_METRICS_ENABLED,,} +ENABLE_VBOOST=${ENABLE_VBOOST:-false} +ENABLE_VBOOST=${ENABLE_VBOOST,,} +if [[ $GPU_TYPE == "b200" ]]; then + TIME_LIMIT=${TIME_LIMIT:-"00:45:00"} +else + TIME_LIMIT=${TIME_LIMIT:-"00:20:00"} +fi +MAX_STEPS=${MAX_STEPS:-50} + +if [[ $DTYPE != "fp8" ]]; then + echo "Error: Kimi-K2 supports fp8 only." + exit 1 +fi +if [[ $FP8_RECIPE != "mx" ]]; then + echo "Error: Kimi-K2 supports FP8 MX recipe only (FP8_RECIPE=mx)." 
+ exit 1 +fi + +# Handle additional SLURM parameters from environment variable +ADDITIONAL_SLURM_PARAMS=${ADDITIONAL_SLURM_PARAMS:-""} + +# Add additional SLURM parameters if provided +SLURM_ARGS="" +if [ -n "$ADDITIONAL_SLURM_PARAMS" ]; then + SLURM_ARGS="--additional_slurm_params ${ADDITIONAL_SLURM_PARAMS}" +fi + +CONTAINER_MOUNTS="" +export HF_HOME="$LLMB_INSTALL/.cache/huggingface" +CONTAINER_MOUNTS="$HF_HOME" +if [[ -n ${RUN_CONF_MOUNTS:-""} ]]; then + if [[ -n ${CONTAINER_MOUNTS} ]]; then + CONTAINER_MOUNTS+="," + fi + CONTAINER_MOUNTS+="${RUN_CONF_MOUNTS}" +fi + +CONFIG_OVERRIDES="${CONFIG_OVERRIDES:-}" +if [[ -n ${CONFIG_OVERRIDES} ]]; then + CONFIG_OVERRIDES+=" " +fi +if [[ -n ${CONTAINER_MOUNTS} ]]; then + CONFIG_OVERRIDES+=" --custom_mounts $CONTAINER_MOUNTS" +fi + +if [[ $PROFILE_ENABLED == "true" ]]; then + CONFIG_OVERRIDES+=" --enable_nsys " + CONFIG_OVERRIDES+=" --profiling_start_step=$PROFILE_START_STEP " + CONFIG_OVERRIDES+=" --profiling_stop_step=$PROFILE_STOP_STEP " + PROFILE_RANKS=$(seq -s, 0 $((JOB_TOTAL_GPUS - 1))) + CONFIG_OVERRIDES+=" --profiling_ranks=$PROFILE_RANKS" + CONFIG_OVERRIDES+=" --nsys_trace=cuda " + CONFIG_OVERRIDES+=" --nsys_extra_args=--nvtx-domain-include=NCCL " + if [[ $GPU_METRICS_ENABLED == true ]]; then + CONFIG_OVERRIDES+=" --profiling_gpu_metrics " + fi +fi + +if [[ $ENABLE_VBOOST == true ]]; then + CONFIG_OVERRIDES+=" --enable_vboost true " +fi + +if [[ $GPU_TYPE == "gb300" ]] || [[ $GPU_TYPE == "gb200" ]]; then + GPUS_PER_NODE=4 +elif [[ $GPU_TYPE == "b200" ]] || [[ $GPU_TYPE == "h100" ]]; then + GPUS_PER_NODE=8 +else + echo "Error: GPU_TYPE must be gb300, gb200, b200, or h100 for Kimi-K2." 
+ exit 1 +fi + +# run command +pushd $LLMB_WORKLOAD/Megatron-Bridge + +python3 scripts/performance/setup_experiment.py \ + --container_image $IMAGE \ + --compute_dtype $COMPUTE_TYPE \ + --gpu $GPU_TYPE \ + --num_gpus $JOB_TOTAL_GPUS \ + --gpus_per_node $GPUS_PER_NODE \ + --offline \ + --model_family_name kimi \ + --model_recipe_name kimi_k2 \ + ${CONFIG_OVERRIDES} \ + --account $SBATCH_ACCOUNT \ + --partition $SBATCH_PARTITION \ + --log_dir $NEMORUN_HOME \ + --time_limit $TIME_LIMIT \ + --max_steps $MAX_STEPS \ + --packager none \ + $SLURM_ARGS + +popd diff --git a/kimi-k2/metadata.yaml b/kimi-k2/metadata.yaml new file mode 100644 index 0000000..3db7306 --- /dev/null +++ b/kimi-k2/metadata.yaml @@ -0,0 +1,80 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: MIT +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +# DEALINGS IN THE SOFTWARE. 
+ +# Setup +general: + workload: kimi-k2 + workload_type: pretrain + framework: megatron_bridge + +container: + images: + - 'nvcr.io#nvidia/nemo:26.04.00' + +downloads: + huggingface: + - repo_id: 'moonshotai/Kimi-K2-Instruct' + assets: [config] + +repositories: + megatron_bridge: + url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" + commit: "fab68031197b64934027e188c0cb417fdf1e1d7a" + + nemo_run: + url: "https://github.com/NVIDIA-NeMo/Run.git" + commit: "64b91e0187b93475ea0d54028317e349ced7ac1b" + +setup: + venv_req: true + dependencies: + git: + megatron_bridge: + repo_key: megatron_bridge + install_method: + type: clone + pip: + - package: nemo_run + repo_key: nemo_run + +# Run +run: + launcher_type: 'megatron_bridge' + launch_script: 'launch.sh' + + gpu_configs: + gb300: + model_configs: + - model_size: '1t' + dtypes: ['fp8'] + scales: [512, 256] + + gb200: + model_configs: + - model_size: '1t' + dtypes: ['fp8'] + scales: [512, 256] + + b200: + model_configs: + - model_size: '1t' + dtypes: ['fp8'] + scales: [512, 256] diff --git a/llama3.1/README.md b/llama3.1/README.md index 2506d50..32187c5 100644 --- a/llama3.1/README.md +++ b/llama3.1/README.md @@ -8,65 +8,74 @@ This recipe contains information and scripts to produce performance results for ### FP8 -| Llama3.1 Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | -| ------------------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | -| 405b | 256-512 | FP8 | 8192 | 126 | False | 4 | 8 | 1 | 1 | 1 | GPUs/32 | 4 | 1 | GPUs\*6 | 192 | False | -| 70b | 64-512 | FP8 | 8192 | 80 | True | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 2 | GPUs\*4 | 2 | False | -| 8b | 8-128 | FP8 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 4 | GPUs\*16 | 4 | False | +| Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | +| 
---------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | +| 405b | 256-512 | FP8 | 8192 | 126 | False | 4 | 8 | 1 | 1 | 1 | GPUs/32 | 4 | 1 | GPUs\*6 | 192 | False | +| 70b | 64-512 | FP8 | 8192 | 80 | True | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 2 | GPUs\*4 | 2 | False | +| 8b | 8-128 | FP8 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 4 | GPUs\*16 | 4 | False | ### NVFP4 -| Llama3.1 Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | -| ------------------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | -| 405b | 256-512 | NVFP4 | 8192 | 126 | False | 4 | 8 | 1 | 1 | 1 | GPUs/32 | 4 | 1 | GPUs\*6 | 192 | False | -| 70b | 64-512 | NVFP4 | 8192 | 80 | False | 1 | 4 | 1 | 1 | 1 | GPUs/4 | 5 | 1 | GPUs\*4 | 16 | False | -| 8b | 8-128 | NVFP4 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 4 | GPUs\*16 | 4 | False | +| Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | +| ---------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | +| 405b | 256-512 | NVFP4 | 8192 | 126 | False | 4 | 8 | 1 | 1 | 1 | GPUs/32 | 4 | 1 | GPUs\*6 | 192 | False | +| 70b | 64-512 | NVFP4 | 8192 | 80 | False | 1 | 4 | 1 | 1 | 1 | GPUs/4 | 5 | 1 | GPUs\*4 | 16 | False | +| 8b | 8-128 | NVFP4 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 4 | GPUs\*16 | 4 | False | ## GB200 ### FP8 -| Llama3.1 Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | -| ------------------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | -| 405b | 256-512 | FP8 | 8192 | 126 | False | 4 | 16 | 1 | 1 | 1 | GPUs/64 | 
4 | 1 | GPUs\*6 | 384 | False | -| 70b | 64-512 | FP8 | 8192 | 80 | True | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 2 | GPUs\*4 | 2 | False | -| 8b | 8-128 | FP8 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 2 | GPUs\*16 | 8 | False | +| Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | +| ---------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | +| 405b | 256-512 | FP8 | 8192 | 126 | False | 4 | 16 | 1 | 1 | 1 | GPUs/64 | 4 | 1 | GPUs\*6 | 384 | False | +| 70b | 64-512 | FP8 | 8192 | 80 | True | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 2 | GPUs\*4 | 2 | False | +| 8b | 8-128 | FP8 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 2 | GPUs\*16 | 8 | False | ### NVFP4 -| Llama3.1 Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | -| ------------------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | -| 405b | 256-512 | NVFP4 | 8192 | 126 | False | 4 | 16 | 1 | 1 | 1 | GPUs/64 | 8 | 1 | GPUs\*6 | 384 | False | -| 8b | 8-128 | NVFP4 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 4 | GPUs\*16 | 4 | False | +| Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | +| ---------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | +| 405b | 256-512 | NVFP4 | 8192 | 126 | False | 4 | 16 | 1 | 1 | 1 | GPUs/64 | 8 | 1 | GPUs\*6 | 384 | False | +| 8b | 8-128 | NVFP4 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 4 | GPUs\*16 | 4 | False | ## B300 ### FP8 -| Llama3.1 Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | -| ------------------- | :------: | :------: | :----: | :----: | :---: | :-: | :-: 
| :-: | :-: | :-: | :-----: | :-: | :-: | :-----: | :-: | :---: | -| 405b | 256-1024 | FP8 | 8192 | 126 | False | 2 | 8 | 2 | 1 | 1 | GPUs/32 | 4 | 1 | GPUs\*6 | 192 | False | -| 70b | 64-128 | FP8 | 8192 | 80 | True | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 1 | GPUs\*4 | 4 | False | -| 70b | 256-512 | FP8 | 8192 | 80 | False | 1 | 4 | 1 | 1 | 1 | GPUs/4 | 5 | 1 | GPUs\*4 | 16 | False | +| Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | +| ---------- | :---------: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | +| 405b | 256-512 | FP8 | 8192 | 126 | False | 4 | 8 | 1 | 1 | 1 | GPUs/32 | 4 | 1 | GPUs\*6 | 192 | False | +| 70b | **64** | FP8 | 8192 | 80 | True | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 1 | GPUs\*4 | 4 | False | +| 70b | **128-512** | FP8 | 8192 | 80 | False | 1 | 4 | 1 | 1 | 1 | GPUs/4 | 5 | 1 | GPUs\*4 | 16 | False | +| 8b | 8-128 | FP8 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 4 | GPUs\*16 | 4 | False | + +### NVFP4 + +| Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | +| ---------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | +| 405b | 256-512 | NVFP4 | 8192 | 126 | False | 4 | 8 | 1 | 1 | 1 | GPUs/32 | 4 | 1 | GPUs\*6 | 192 | False | +| 70b | 64-512 | NVFP4 | 8192 | 80 | False | 1 | 4 | 1 | 1 | 1 | GPUs/4 | 5 | 1 | GPUs\*4 | 16 | False | +| 8b | 8-128 | NVFP4 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 4 | GPUs\*16 | 4 | False | ## B200 ### FP8 -| Llama3.1 Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | -| ------------------- | :------: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | -| 405b | 256-1024 | FP8 | 8192 | 126 | False | 4 | 
8 | 2 | 1 | 1 | GPUs/64 | 8 | 1 | GPUs\*6 | 384 | False | -| 70b | 64-128 | FP8 | 8192 | 80 | True | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 1 | GPUs\*4 | 4 | False | -| 70b | 256-1024 | FP8 | 8192 | 80 | False | 2 | 4 | 1 | 1 | 1 | GPUs/8 | 5 | 1 | GPUs\*4 | 32 | False | -| 8b | 8-128 | FP8 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 2 | GPUs\*16 | 8 | False | +| Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | +| ---------- | :----------: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | +| 405b | 256-1024 | FP8 | 8192 | 126 | False | 4 | 8 | 2 | 1 | 1 | GPUs/64 | 8 | 1 | GPUs\*6 | 384 | False | +| 70b | **64** | FP8 | 8192 | 80 | True | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 1 | GPUs\*4 | 4 | False | +| 70b | **128-1024** | FP8 | 8192 | 80 | False | 2 | 4 | 1 | 1 | 1 | GPUs/8 | 5 | 1 | GPUs\*4 | 32 | False | +| 8b | 8-128 | FP8 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 2 | GPUs\*16 | 8 | False | ### NVFP4 -| Llama3.1 Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | -| ------------------- | :------: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | -| 405b | 256-1024 | NVFP4 | 8192 | 126 | False | 4 | 16 | 1 | 1 | 1 | GPUs/64 | 8 | 1 | GPUs\*6 | 384 | False | -| 70b | 64-1024 | NVFP4 | 8192 | 80 | False | 2 | 4 | 1 | 1 | 1 | GPUs/8 | 5 | 1 | GPUs\*4 | 32 | False | -| 8b | 8-128 | NVFP4 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 4 | GPUs\*16 | 4 | False | +| Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | +| ---------- | :------: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :-----: | :-: | :-: | :------: | :-: | :---: | +| 405b | 256-1024 | NVFP4 | 8192 | 126 | False | 4 | 16 | 1 | 1 | 1 | GPUs/64 | 8 
| 1 | GPUs\*6 | 384 | False | +| 70b | 64-1024 | NVFP4 | 8192 | 80 | False | 2 | 4 | 1 | 1 | 1 | GPUs/8 | 5 | 1 | GPUs\*4 | 32 | False | +| 8b | 8-128 | NVFP4 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 4 | GPUs\*16 | 4 | False | ## H100 @@ -74,91 +83,76 @@ This recipe contains information and scripts to produce performance results for ### BF16 -| Llama3.1 Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | -| ------------------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :------: | :-: | :-: | :------: | :-: | :---: | -| 405b | 1024 | BF16 | 8192 | 126 | False | 8 | 8 | 2 | 1 | 1 | GPUs/128 | 8 | 1 | 1536 | 192 | False | -| 70b | 64-1024 | BF16 | 8192 | 80 | False | 4 | 4 | 2 | 1 | 1 | GPUs/32 | 5 | 1 | GPUs\*4 | 128 | False | -| 8b | 8-128 | BF16 | 8192 | 32 | False | 1 | 1 | 2 | 1 | 1 | GPUs/2 | NA | 1 | GPUs\*16 | 32 | False | +| Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | +| ---------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :------: | :-: | :-: | :------: | :-: | :---: | +| 405b | 1024 | BF16 | 8192 | 126 | False | 8 | 8 | 2 | 1 | 1 | GPUs/128 | 8 | 1 | 1536 | 192 | False | +| 70b | 64-1024 | BF16 | 8192 | 80 | False | 4 | 4 | 2 | 1 | 1 | GPUs/32 | 5 | 1 | GPUs\*4 | 128 | False | +| 8b | 8-128 | BF16 | 8192 | 32 | False | 1 | 1 | 2 | 1 | 1 | GPUs/2 | NA | 1 | GPUs\*16 | 32 | False | ### FP8 -| Llama3.1 Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | -| ------------------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :------: | :-: | :-: | :------: | :-: | :---: | -| 405b | 1024 | FP8 | 8192 | 126 | False | 8 | 8 | 2 | 1 | 1 | GPUs/128 | 8 | 1 | 1536 | 192 | False | -| 70b | 64-1024 | FP8 | 8192 | 80 | False | 4 | 8 | 1 | 1 | 1 | GPUs/32 | 5 | 2 | 
GPUs\*4 | 4 | False | -| 8b | 8-128 | FP8 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 1 | GPUs\*16 | 16 | False | +| Model Size | GPUs | Datatype | SeqLen | Layers | FSDP | TP | PP | CP | EP | ETP | DP | VP | MBS | GBS | GA | CG | +| ---------- | :-----: | :------: | :----: | :----: | :---: | :-: | :-: | :-: | :-: | :-: | :------: | :-: | :-: | :------: | :-: | :---: | +| 405b | 1024 | FP8 | 8192 | 126 | False | 8 | 8 | 2 | 1 | 1 | GPUs/128 | 8 | 1 | 1536 | 192 | False | +| 70b | 64-1024 | FP8 | 8192 | 80 | False | 4 | 8 | 1 | 1 | 1 | GPUs/32 | 5 | 2 | GPUs\*4 | 4 | False | +| 8b | 8-128 | FP8 | 8192 | 32 | False | 1 | 1 | 1 | 1 | 1 | GPUs | NA | 1 | GPUs\*16 | 16 | False | # Performance Measurement and Analysis -Performance for Llama3.1 training is measured by the achieved GPU FLOPS via the `TFLOPS_per_GPU` metric, which indicates computational throughput efficiency. Additionally, training step timing (milliseconds per iteration) is captured and logged for every training step in the main training log file [see Output Locations](#output-locations). +Performance is reported as: -Since the early training steps typically take much longer time (with input prefetch, activation memory allocation, and JIT compilation), we use the `parse_train_timing_mbridge.sh` script to analyze iterations 35-44 and calculate mean and standard deviation for reliable performance metrics for both TFLOPS per GPU and timing measurements. +- `s/iter` — wall-clock seconds per training step +- `TFLOPS/GPU` — sustained FLOPS achieved per GPU -### Running the parse_train_timing_mbridge.sh script +Each benchmark runs 50 steps; iterations 35–44 are averaged to skip warmup (input prefetch, activation allocation, JIT compilation). -To analyze training timing from your experiment results, run the script from the workload directory. In an installed environment, recipe files are available under `$LLMB_INSTALL/llmb_repo` (a copy created by the installer). 
+## Viewing results with `llmb-run jobs` + +Each `llmb-run jobs` command refreshes Slurm state and parses the training log for any job that has finished (succeeded, failed, or cancelled) — there is no background updater. Run from `$LLMB_INSTALL`: ```bash -# Basic usage - parses results in the directory named 'experiments' in the current folder -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh +# List all jobs you've submitted, with parsed metrics +llmb-run jobs -# Specify a different experiments directory -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh /path/to/experiments +# Full details for one job (Job ID comes from the listing above) +llmb-run jobs show -# Output in CSV format -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --format=csv +# Open the training log; --follow tails it, --dir prints the experiment directory +llmb-run jobs log +``` -# Output in JSON format -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --format=json +Example `llmb-run jobs` output (illustrative values): -# Show full filenames instead of shortened versions -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --full-names +```text + Workload DType Scale Job ID Profile Submit Time Slurm Status Elapsed s/iter TFLOPS/GPU + pretrain_example_8b bf16 128 1234567 No 2026-04-17 13:42 COMPLETED 00:12:34 4.21 1234.56 + pretrain_example_70b fp8 256 1234589 No 2026-04-17 14:05 RUNNING 00:03:11 ``` -Example output: +Blank `s/iter` or `TFLOPS/GPU` means the job has not finished yet, or the log did not contain enough completed iterations. See the [llmb-run README](../cli/llmb-run/README.md#jobs-command) for the full command reference. 
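As a worked illustration of how the `s/iter` and `TFLOPS/GPU` columns feed further calculations, the sketch below computes tokens per second, days to train, and MFU with `awk`. The function names are hypothetical helpers and are not part of `llmb-run`. The inputs (sequence length 8192, global batch size 64, 5.74 s/iter, 1636.8 TFLOPS/GPU, and a 4900 TFLOPS peak) are the illustrative figures from the worked example this change removes; check the Peak Theoretical Throughput table in the main README before reusing the peak value.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of llmb-run) for the derived metrics.

# tokens/sec = sequence length * global batch size / (s/iter)
tokens_per_sec() {
    awk -v seq="$1" -v gbs="$2" -v s_iter="$3" \
        'BEGIN { printf "%.0f\n", seq * gbs / s_iter }'
}

# days = total tokens / (tokens/sec) / 86400
days_to_train() {
    awk -v total="$1" -v tps="$2" \
        'BEGIN { printf "%.1f\n", total / tps / 86400 }'
}

# MFU (%) = achieved TFLOPS/GPU / peak TFLOPS * 100
mfu_percent() {
    awk -v achieved="$1" -v peak="$2" \
        'BEGIN { printf "%.1f\n", achieved / peak * 100 }'
}

# Illustrative inputs only; substitute the values from your own job listing.
tps=$(tokens_per_sec 8192 64 5.74)
echo "tokens/sec:    ${tps}"
echo "days for 1e12: $(days_to_train 1e12 "${tps}")"
echo "MFU (%):       $(mfu_percent 1636.8 4900)"
```

Each helper takes its inputs as positional arguments, so the same functions can be reused for any workload by passing that workload's sequence length, global batch size, and measured `s/iter`.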
-```shell -Elapsed Time (ms) and TFLOPS/GPU Analysis (iterations 35-44) -================================================================================ -Experiment Status Time Mean (ms) Time Std (ms) TFLOPS_per_GPU Mean TFLOPS_per_GPU Std ------------------------------------------------------------------------------------------- -------- ------------- ------------ ------------------- ------------------ -pretrain_llama31_405b_fp8_cs_gpus128_tp2_pp1_cp1_vpNone_ep1_mbs1_gbs64_658572 Success 5741.470 68.670 1636.80 20.89 -``` +## Derived metrics -To obtain throughput as a tokens per second measurement, follow this formula: +To convert step time into tokens per second: -```shell -(sequence length) * (global batch size) / (training step time in seconds) = (throughput in tokens per second) +```text +(throughput in tokens/sec) = (sequence length) * (global batch size) / (s/iter) ``` -E.g. 8192 * 64 / 5.74 = 91339 +To estimate time-to-train for a target token budget: -To calculate time to train with 1T tokens estimate: - -```shell -(total tokens) / (throughput in tokens per second) / (number of seconds in a day) = (time to train in days) +```text +(time to train in days) = (total tokens) / (throughput in tokens/sec) / 86400 ``` -E.g. 1e12 / 91339 / 86400 = 126.72 days +To compute model FLOPs utilization (MFU): -To calculate the model flops utilization (MFU). - -```shell -MFU = avg(TFLOPS_GPU) / (peak GPU FLOPS) +```text +MFU = TFLOPS/GPU / (peak GPU FLOPS) ``` -**Peak theoretical FP8 throughput across GPUs (in TFLOPS)** - For peak theoretical throughput values used in MFU calculations, see the [Peak Theoretical Throughput](../README.md#peak-theoretical-throughput) section in the main README. -E.g. 
Llama3.1 405b FP8 on 128x GB200 GPUs that has an average of 1636.8 TFLOPs per GPU for steps 34-44 - -```shell -peak FLOPS for GB200 = 4900 TFLOPS -avg(TFLOPS_GPU) = 1636.8 -MFU = 1636.8 / 4900 = 33.40% -``` - # Prerequisites A HuggingFace account is required and you will need to [create a HuggingFace access token](https://huggingface.co/settings/tokens). Add the generated token to your environment via `export HF_TOKEN=`. @@ -167,7 +161,9 @@ Requires Python 3.12.x, or conda. ## Request Access -Access to the Llama 3.1 models must be requested through [Meta's website](https://www.llama.com/llama-downloads/) then requested on the [HuggingFace Llama 3.1](https://huggingface.co/meta-llama/Llama-3.1-405B) page. The approval process is not automatic and could take a day or more. +Access requirements depend on the model size. The 405B configuration requires Llama 3.1 access, which must be requested through [Meta's website](https://www.llama.com/llama-downloads/) and then on the [HuggingFace Llama 3.1 405B](https://huggingface.co/meta-llama/Llama-3.1-405B) page. + +The 8B and 70B configurations intentionally reuse Megatron-Bridge `llama3` configs, so they require Llama 3 family access instead. Request access through [Meta's website](https://www.llama.com/llama-downloads/) and then request access to either [HuggingFace Llama 3 70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B) or [HuggingFace Llama 3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B). Either approval grants access to the Llama 3 family. The approval process is not automatic and could take a day or more. ## Slurm @@ -220,31 +216,33 @@ llmb-run submit -w pretrain_llama3.1 -s 8b --dtype fp8 --scale 16 ### Additional SLURM Parameters -Use a SLURM reservation: +For `llmb-run submit`, use the built-in Slurm flags instead of `ADDITIONAL_SLURM_PARAMS`. 
+ +Use a Slurm reservation: ```bash -ADDITIONAL_SLURM_PARAMS="reservation=my_reservation" llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 128 +llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 128 --reservation my_reservation ``` Run on specific nodes: ```bash -ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002" llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 128 +llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 128 --nodelist node001,node002 ``` Exclude specific nodes: ```bash -ADDITIONAL_SLURM_PARAMS="exclude=node003,node004" llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 128 +llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 128 --exclude node003,node004 ``` -Combine multiple parameters (semicolon-separated): +Combine multiple parameters: ```bash -ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002;reservation=my_reservation;exclusive" llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 128 +llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 128 --nodelist node001,node002 --reservation my_reservation --slurm-arg exclusive ``` -For more details on llmb-run usage, see the [llmb-run documentation](../cli/llmb-run/README.md). +For more details on `llmb-run` usage, see the [llmb-run documentation](../cli/llmb-run/README.md). 
## Direct Method @@ -334,7 +332,7 @@ The `` typically follows these patterns: **Key files:** -- `log-.out` - Contains training step timing and performance metrics analyzed by `parse_train_timing_mbridge.sh` +- `log-.out` - Contains training step timing and performance metrics parsed by `llmb-run jobs` - `nsys_profile/` - Contains profiling traces when using the `-p` flag with `llmb-run` or when `ENABLE_PROFILE=true` # Profiling @@ -404,7 +402,7 @@ PyTorch Profiling is intended for rare, advanced debugging scenarios such as NCC ENABLE_PYTORCH_PROFILE=true llmb-run submit -w pretrain_llama3.1 -s 405b --dtype fp8 --scale 128 ``` -For details on the PyTorch Profiler and how to view resulting traces, see the [PyTorch Profiler documentation](https://docs.pytorch.org/tutorials/recipes/recipes/profiler_recipe.html). +Trace files are saved to `torch_profile/rank-N.json.gz` in the job output directory, where `N` is the rank number. For details on the PyTorch Profiler and how to view resulting traces, see the [PyTorch Profiler documentation](https://docs.pytorch.org/tutorials/recipes/recipes/profiler_recipe.html). # Run With Checkpoints diff --git a/llama3.1/launch.sh b/llama3.1/launch.sh index 0c231e1..9e754fd 100755 --- a/llama3.1/launch.sh +++ b/llama3.1/launch.sh @@ -30,7 +30,9 @@ set -eu -o pipefail export WORKLOAD_TYPE=pretrain export MODEL_NAME=llama3.1 -export FW_VERSION=26.02.01 +export FW_VERSION=26.04.00 + +export IMAGE=${RUN_CONF_IMAGE:-$LLMB_INSTALL/images/nvidia+nemo+$FW_VERSION.sqsh} export OPENBLAS_NUM_THREADS=1 # Required for login nodes with tight memory restrictions. Do not remove. 
@@ -62,12 +64,6 @@ GPU_TYPE=${GPU_TYPE:?GPU_TYPE is a required variable.} GPU_TYPE=${GPU_TYPE,,} JOB_TOTAL_GPUS=${JOB_TOTAL_GPUS:?JOB_TOTAL_GPUS is a required variable.} -if [[ $GPU_TYPE == "b200" ]]; then - FW_VERSION=26.02.00 -fi - -export IMAGE=${RUN_CONF_IMAGE:-$LLMB_INSTALL/images/nvidia+nemo+$FW_VERSION.sqsh} - # Handle additional SLURM parameters from environment variable ADDITIONAL_SLURM_PARAMS=${ADDITIONAL_SLURM_PARAMS:-""} @@ -87,7 +83,10 @@ if [[ -n ${RUN_CONF_MOUNTS:-""} ]]; then CONTAINER_MOUNTS+="${RUN_CONF_MOUNTS}" fi -CONFIG_OVERRIDES="" +CONFIG_OVERRIDES="${CONFIG_OVERRIDES:-}" +if [[ -n ${CONFIG_OVERRIDES} ]]; then + CONFIG_OVERRIDES+=" " +fi # Time limit: 30 min for 8B/70B, 2 hr for 405B (override with TIME_LIMIT env) if [[ -z ${TIME_LIMIT:-} ]]; then @@ -103,43 +102,18 @@ ENABLE_CHECKPOINT=${ENABLE_CHECKPOINT:-false} ENABLE_CHECKPOINT=${ENABLE_CHECKPOINT,,} CHECKPOINT_INTERVAL=${CHECKPOINT_INTERVAL:-$MAX_STEPS} # Default: save checkpoint at end of training -if [[ $GPU_TYPE == "gb200" ]] && [[ $MODEL_SIZE == "70b" ]] && [[ $DTYPE == "nvfp4" ]]; then - echo "Error: NVFP4 is not supported for Llama3.1 70B on GB200." 
>&2 - exit 1 -fi - if { [[ $GPU_TYPE == "b300" ]] || [[ $GPU_TYPE == "b200" ]]; } && [[ $MODEL_SIZE == "405b" ]]; then GBS=$((JOB_TOTAL_GPUS * 6)) fi -if [[ $GPU_TYPE == "b300" ]] && [[ $MODEL_SIZE == "405b" ]] && [[ $DTYPE == "fp8" ]]; then - FP8_RECIPE=cs - TP=2 - PP=8 - CP=2 - VP=4 - MBS=1 -fi - if [[ $GPU_TYPE == "b300" ]] && { [[ $MODEL_SIZE == "70b" ]] || [[ $MODEL_SIZE == "405b" ]]; } && [[ $JOB_TOTAL_GPUS -ge 512 ]]; then export NCCL_IB_QPS_PER_CONNECTION=${NCCL_IB_QPS_PER_CONNECTION:-4} fi -if [[ $GPU_TYPE == "b300" ]] && [[ $MODEL_SIZE == "70b" ]] && [[ $DTYPE == "fp8" ]]; then - if [[ $JOB_TOTAL_GPUS -le 128 ]]; then - FP8_RECIPE=cs - elif [[ $JOB_TOTAL_GPUS -ge 256 ]]; then - FP8_RECIPE=mx - fi -fi - -if [[ $GPU_TYPE == "b200" ]] && [[ $MODEL_SIZE == "70b" ]] && [[ $DTYPE == "fp8" ]] && [[ $JOB_TOTAL_GPUS -ge 256 ]]; then +if { [[ $GPU_TYPE == "b300" ]] || [[ $GPU_TYPE == "b200" ]]; } \ + && [[ $MODEL_SIZE == "70b" ]] && [[ $DTYPE == "fp8" ]] \ + && [[ $JOB_TOTAL_GPUS -ge 128 ]]; then FP8_RECIPE=mx - TP=2 - PP=4 - CP=1 - VP=5 - MBS=1 fi if [[ -n ${TP-} ]]; then @@ -283,6 +257,7 @@ python scripts/performance/setup_experiment.py \ --partition $SBATCH_PARTITION \ --log_dir $NEMORUN_HOME \ --time_limit $TIME_LIMIT \ + --packager none \ $SLURM_ARGS popd diff --git a/llama3.1/metadata.yaml b/llama3.1/metadata.yaml index b7af028..9f1e01b 100644 --- a/llama3.1/metadata.yaml +++ b/llama3.1/metadata.yaml @@ -26,29 +26,16 @@ general: framework: megatron_bridge container: - images: - by_gpu: - default: - - 'nvcr.io#nvidia/nemo:26.02.01' - b200: - - 'nvcr.io#nvidia/nemo:26.02.00' + images: + - 'nvcr.io#nvidia/nemo:26.04.00' -repositories: - by_gpu: - default: - megatron_bridge: - url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" - commit: "aeead1ae667d795ebe725cb4a608581a21f402cc" - nemo_run: - url: "https://github.com/NVIDIA-NeMo/Run.git" - commit: "525d68bfce2d6baed86ed3d7d0edbae07833ea0d" - b200: - megatron_bridge: - url: 
"https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" - commit: "6b3b5ba7ef64182caba23faa8d7abc0125aa3807" - nemo_run: - url: "https://github.com/NVIDIA-NeMo/Run.git" - commit: "ab0c4328275c4c731f1bdea3ceb0e68a9a17a6a2" +repositories: + megatron_bridge: + url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" + commit: "fab68031197b64934027e188c0cb417fdf1e1d7a" + nemo_run: + url: "https://github.com/NVIDIA-NeMo/Run.git" + commit: "64b91e0187b93475ea0d54028317e349ced7ac1b" downloads: huggingface: @@ -97,7 +84,7 @@ run: dtypes: ['fp8', 'nvfp4'] scales: [256, 512] - model_size: '70b' - dtypes: ['fp8'] + dtypes: ['fp8', 'nvfp4'] scales: [64,128,256,512] - model_size: '8b' dtypes: ['fp8', 'nvfp4'] @@ -105,19 +92,22 @@ run: b300: model_configs: - model_size: '405b' - dtypes: ['fp8'] + dtypes: ['fp8', 'nvfp4'] scales: [256, 512] - model_size: '70b' - dtypes: ['fp8'] + dtypes: ['fp8', 'nvfp4'] scales: [64,128,256,512] + - model_size: '8b' + dtypes: ['fp8', 'nvfp4'] + scales: [8,16,32,64,128] b200: model_configs: - model_size: '405b' dtypes: ['fp8', 'nvfp4'] - scales: [256, 512, 1024] + scales: [256, 512] - model_size: '70b' dtypes: ['fp8', 'nvfp4'] - scales: [64,128,256,512,1024] + scales: [64,128,256,512] - model_size: '8b' dtypes: ['fp8', 'nvfp4'] scales: [8,16,32,64,128] @@ -128,7 +118,7 @@ run: scales: [1024] - model_size: '70b' dtypes: ['bf16', 'fp8'] - scales: [64,128,256,512,1024] + scales: [64,128,256,512] - model_size: '8b' dtypes: ['bf16', 'fp8'] scales: [8,16,32,64,128] diff --git a/llama3/finetune/README.md b/llama3/finetune/README.md index e2216cc..3aa227a 100644 --- a/llama3/finetune/README.md +++ b/llama3/finetune/README.md @@ -45,57 +45,57 @@ This recipe contains information and scripts to produce performance results for # Performance Measurement and Analysis -Performance for LLAMA3 70B LoRa finetuning is measured by the achieved GPU FLOPS via the `TFLOPS_per_GPU` metric, which indicates computational throughput efficiency. 
Additionally, finetuning step timing (seconds per iteration) is captured and logged for every finetuning step in the main finetuning log file [see Output Locations](#output-locations). +Performance is reported as: -Since the early finetuning steps typically take much longer time (with input prefetch, activation memory allocation, and JIT compilation), we use the `parse_train_timing_mbridge.sh` script to analyze iterations 35-44 and calculate mean and standard deviation for reliable performance metrics for both TFLOPS per GPU and timing measurements. +- `s/iter` — wall-clock seconds per training step +- `TFLOPS/GPU` — sustained FLOPS achieved per GPU -> **Note:** The `MODEL_TFLOP/s/GPU` value reported by Megatron-Bridge in the training log is incorrect for LoRA finetuning in this release. Use `parse_train_timing_mbridge.sh` to obtain accurate TFLOPS per GPU, which computes the correct value using the LoRA-specific FLOPs formula accounting for the FLOPs breakdown across frozen and trainable parameters. +Each benchmark runs 50 steps; iterations 35–44 are averaged to skip warmup (input prefetch, activation allocation, JIT compilation). -### Running the parse_train_timing_mbridge.sh script +## Viewing results with `llmb-run jobs` -To analyze training timing from your experiment results, run the script from the workload directory. In an installed environment, recipe files are available under `$LLMB_INSTALL/llmb_repo` (a copy created by the installer). +Each `llmb-run jobs` command refreshes Slurm state and parses the training log for any job that has finished (succeeded, failed, or cancelled) — there is no background updater. 
Run from `$LLMB_INSTALL`: ```bash -# Basic usage - parses results in the directory named 'experiments' in the current folder -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh +# List all jobs you've submitted, with parsed metrics +llmb-run jobs -# Specify a different experiments directory -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh /path/to/experiments +# Full details for one job (Job ID comes from the listing above) +llmb-run jobs show - -# Show full filenames instead of shortened versions -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --full-names +# Open the training log; --follow tails it, --dir prints the experiment directory +llmb-run jobs log ``` -Example output: +Example `llmb-run jobs` output (illustrative values): -```shell -Elapsed Time (ms) (iterations 35-44) -================================================================================ -Experiment Status Time Mean (ms) Time Std (ms) MODEL_TFLOPS_per_GPU Mean MODEL_TFLOPS_per_GPU Std ------------------------------------------------------------------------------------------- -------- ------------- ------------ ------------------- ------------------ -lora_llama3_70b_bf16_gpus8_tp1_pp1_cp1_vpNone_ep1_etpNone_mbs1_gbs32_1097062 Success 3314.210 6.861 1382.35 2.86 +```text + Workload DType Scale Job ID Profile Submit Time Slurm Status Elapsed s/iter TFLOPS/GPU + pretrain_example_8b bf16 128 1234567 No 2026-04-17 13:42 COMPLETED 00:12:34 4.21 1234.56 + pretrain_example_70b fp8 256 1234589 No 2026-04-17 14:05 RUNNING 00:03:11 ``` -To obtain throughput as a tokens per second measurement, follow this formula: +Blank `s/iter` or `TFLOPS/GPU` means the job has not finished yet, or the log did not contain enough completed iterations. See the [llmb-run README](../../cli/llmb-run/README.md#jobs-command) for the full command reference. 
-```shell -(throughput in tokens per second) = (sequence length) * (global batch size) / training_step_timing -``` +## Derived metrics -To calculate time to train estimate: +To convert step time into tokens per second: -```shell -(time to train in days) = (total tokens) / (throughput in tokens per second) / (number of seconds in a day) +```text +(throughput in tokens/sec) = (sequence length) * (global batch size) / (s/iter) ``` -To calculate the model flops utilization (MFU): +To estimate time-to-train for a target token budget: -```shell -MFU = (achieved TFLOPS_per_GPU) / (peak GPU FLOPS) +```text +(time to train in days) = (total tokens) / (throughput in tokens/sec) / 86400 ``` -**Peak theoretical throughput across GPUs and Data Types (in TFLOPS)** +To compute model FLOPs utilization (MFU): + +```text +MFU = TFLOPS/GPU / (peak GPU FLOPS) +``` For peak theoretical throughput values used in MFU calculations, see the [Peak Theoretical Throughput](../../README.md#peak-theoretical-throughput) section in the main README. @@ -169,31 +169,33 @@ llmb-run submit -w finetune_llama3 --dtype bf16 --scale 8 ### Additional SLURM Parameters -Use a SLURM reservation: +For `llmb-run submit`, use the built-in Slurm flags instead of `ADDITIONAL_SLURM_PARAMS`. 
+ +Use a Slurm reservation: ```bash -ADDITIONAL_SLURM_PARAMS="reservation=my_reservation" llmb-run submit -w finetune_llama3 --dtype fp8 --scale 8 +llmb-run submit -w finetune_llama3 --dtype fp8 --scale 8 --reservation my_reservation ``` Run on specific nodes: ```bash -ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002" llmb-run submit -w finetune_llama3 --dtype bf16 --scale 8 +llmb-run submit -w finetune_llama3 --dtype fp8 --scale 8 --nodelist node001,node002 ``` Exclude specific nodes: ```bash -ADDITIONAL_SLURM_PARAMS="exclude=node003,node004" llmb-run submit -w finetune_llama3 --dtype bf16 --scale 8 +llmb-run submit -w finetune_llama3 --dtype fp8 --scale 8 --exclude node003,node004 ``` -Combine multiple parameters (semicolon-separated): +Combine multiple parameters: ```bash -ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002;reservation=my_reservation;exclusive" llmb-run submit -w finetune_llama3 --dtype bf16 --scale 8 +llmb-run submit -w finetune_llama3 --dtype fp8 --scale 8 --nodelist node001,node002 --reservation my_reservation --slurm-arg exclusive ``` -For more details on llmb-run usage, see the [llmb-run documentation](../../cli/llmb-run/README.md). +For more details on `llmb-run` usage, see the [llmb-run documentation](../../cli/llmb-run/README.md). 
## Direct Method @@ -284,7 +286,7 @@ The `` typically follows the pattern: `lora_llama3_70b__ **Key files:** -- `log-.out` - Contains training step timing and performance metrics analyzed by `parse_train_timing.sh` +- `log-.out` - Contains training step timing and performance metrics parsed by `llmb-run jobs` - `nsys_profile/` - Contains profiling traces when using the `-p` flag with `llmb-run` or when `ENABLE_PROFILE=true` # Profiling diff --git a/llama3/finetune/launch.sh b/llama3/finetune/launch.sh index e65e711..502e2b1 100755 --- a/llama3/finetune/launch.sh +++ b/llama3/finetune/launch.sh @@ -89,7 +89,10 @@ if [[ -n ${RUN_CONF_MOUNTS:-""} ]]; then CONTAINER_MOUNTS+="${RUN_CONF_MOUNTS}" fi -CONFIG_OVERRIDES="" +CONFIG_OVERRIDES="${CONFIG_OVERRIDES:-}" +if [[ -n ${CONFIG_OVERRIDES} ]]; then + CONFIG_OVERRIDES+=" " +fi if [[ -n ${CONTAINER_MOUNTS} ]]; then CONFIG_OVERRIDES+=" --custom_mounts $CONTAINER_MOUNTS" fi diff --git a/llama3/finetune/metadata.yaml b/llama3/finetune/metadata.yaml index b80ff19..56c3f30 100644 --- a/llama3/finetune/metadata.yaml +++ b/llama3/finetune/metadata.yaml @@ -32,10 +32,10 @@ container: repositories: megatron_bridge: url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" - commit: "aeead1ae667d795ebe725cb4a608581a21f402cc" + commit: "f07871e23f637f4ae87d92256babc51fdbb12f39" nemo_run: url: "https://github.com/NVIDIA-NeMo/Run.git" - commit: "525d68bfce2d6baed86ed3d7d0edbae07833ea0d" + commit: "01a9a8ba360f7b2908728ad0516e0ad9d936966d" setup: venv_req: true @@ -95,3 +95,4 @@ run: dtypes: ['bf16', 'fp8'] scales: [8, 16] exact_scales: true + diff --git a/microbenchmarks/cpu_overhead/README.md b/microbenchmarks/cpu_overhead/README.md index bdfa5ce..c89300d 100755 --- a/microbenchmarks/cpu_overhead/README.md +++ b/microbenchmarks/cpu_overhead/README.md @@ -65,8 +65,8 @@ Results for the workload are stored at `$LLMB_INSTALL/workloads/microbenchmark_c You should expect to see separate logs for each use case: ``` -├── _overhead_%j.err # 
Error logs -├── _overhead_%j.out # Benchmarking output +├── _overhead_%N_%j.err # Error logs +├── _overhead_%N_%j.out # Benchmarking output ``` The `*.out` file provides key performance metrics: diff --git a/microbenchmarks/cpu_overhead/download_dataset.sh b/microbenchmarks/cpu_overhead/download_dataset.sh index 85baa72..bd88c1f 100755 --- a/microbenchmarks/cpu_overhead/download_dataset.sh +++ b/microbenchmarks/cpu_overhead/download_dataset.sh @@ -1,5 +1,5 @@ #!/bin/bash -# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. # SPDX-License-Identifier: MIT # # Permission is hereby granted, free of charge, to any person obtaining a @@ -52,8 +52,5 @@ HF_CACHE_DIR=$LLMB_INSTALL/.cache/huggingface mkdir -p $HF_CACHE_DIR pushd $LLMB_WORKLOAD -if [ ! -d "${MODEL_WEIGHTS_DIR}" ]; then - hf download openai/gpt-oss-120b --cache-dir $HF_CACHE_DIR --local-dir $MODEL_PATH -fi - +hf download openai/gpt-oss-120b --cache-dir $HF_CACHE_DIR --local-dir $MODEL_PATH popd diff --git a/microbenchmarks/cpu_overhead/launch.sh b/microbenchmarks/cpu_overhead/launch.sh index 1a2c085..e1e31aa 100755 --- a/microbenchmarks/cpu_overhead/launch.sh +++ b/microbenchmarks/cpu_overhead/launch.sh @@ -90,8 +90,8 @@ for value in $USE_CASES; do fi export SLURM_MPI_TYPE="pmix" - export SRUN_OUTPUT=${RESULT_DIR}/${LOG_NAME}_%j.out - export SRUN_ERROR=${RESULT_DIR}/${LOG_NAME}_%j.err + export SRUN_OUTPUT=${RESULT_DIR}/${LOG_NAME}_%N_%j.out + export SRUN_ERROR=${RESULT_DIR}/${LOG_NAME}_%N_%j.err srun --container-image "$IMAGE" \ --container-mounts "$MOUNT_DIR" \ diff --git a/microbenchmarks/system_info/README.md b/microbenchmarks/system_info/README.md index d856a02..69479c9 100644 --- a/microbenchmarks/system_info/README.md +++ b/microbenchmarks/system_info/README.md @@ -8,6 +8,7 @@ model metadata so it fits the current `llmb-run` recipe schema. 
# Commands Collected 01. `lscpu` + - `cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor` 02. `lspci -v` 03. `numactl -H` 04. `cat /proc/cmdline` @@ -15,12 +16,23 @@ model metadata so it fits the current `llmb-run` recipe schema. 06. `getconf PAGE_SIZE` 07. `dmesg | grep -i smmu` 08. `nvidia-smi -q` + - `nvidia-smi topo -m` 09. `sysctl -n kernel.numa_balancing` 10. `ibv_devinfo` - InfiniBand HCA device names and attributes -11. enroot config +11. Slurm topology and MPI + - `scontrol show config` - prints `TopologyPlugin`/`TopologyParam` for context + - `scontrol show topology` - flags failure if topology output is empty + - `srun --mpi=list` - flags failure if the `pmix` plugin is not listed + (LLMB recipes invoke Slurm with `--mpi=pmix`; Slurm must be built with `--with-pmix`) +12. enroot config - checks `enroot.conf` for recommended settings (`ENROOT_ROOTFS_WRITABLE`, `ENROOT_REMAP_ROOT`) - dumps `environ.d/` contents and flags bare defaults or missing `NCCL_IB_HCA` -12. `srun nvidia-smi` inside a container - validates pyxis/enroot and GPU visibility +13. `srun` inside a container - validates pyxis/enroot, GPU visibility, and hook configuration + - `nvidia-smi` - GPU visibility inside the container + - `env | grep MASTER_ADDR` - confirms the Enroot + [`50-slurm-pytorch.sh`](https://github.com/NVIDIA/enroot/tree/main/conf/hooks/extra) hook + is active inside the container (the hook populates `MASTER_ADDR`/`MASTER_PORT`/`WORLD_SIZE` + from Slurm env vars and is required for PyTorch distributed bootstrap) All commands are non-fatal; failures are reported in output and execution continues. 
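The non-fatal behavior noted above can be pictured with a small wrapper. This is a hypothetical sketch only: the name `run_step_sketch` and its use of `bash -c` are assumptions made for illustration, and the actual `run_step` helper in `launch.sh` may be implemented differently.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a non-fatal step runner (not the real run_step).
# Arguments: step id, human-readable description, command string.
run_step_sketch() {
    local id="$1" desc="$2" cmd="$3"
    local rc=0

    echo "=== [${id}] ${desc} ==="

    # Run the command in a child shell and capture its exit code
    # without letting a failure abort the calling script.
    bash -c "${cmd}" || rc=$?

    if [[ ${rc} -ne 0 ]]; then
        echo "[FAILED] step ${id} exited with code ${rc}"
    fi

    # Always report success to the caller so later steps still run.
    return 0
}

run_step_sketch "1" "a step that succeeds" "echo hello"
run_step_sketch "2" "a step that fails" "exit 3"
run_step_sketch "3" "still runs after the failure" "echo done"
```

Returning 0 unconditionally is the design choice that keeps collection going; the failure is still visible because it is printed alongside the step's own output.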
diff --git a/microbenchmarks/system_info/launch.sh b/microbenchmarks/system_info/launch.sh index 234dd40..bfc382a 100755 --- a/microbenchmarks/system_info/launch.sh +++ b/microbenchmarks/system_info/launch.sh @@ -37,7 +37,7 @@ fi export WORKLOAD_TYPE=microbenchmark export WORKLOAD=system_info -export FW_VERSION=26.02.00 +export FW_VERSION=26.04.00 export LLMB_INSTALL=${LLMB_INSTALL:?Please set LLMB_INSTALL to the path of the installation directory for all workloads} export IMAGE=${RUN_CONF_IMAGE:-$LLMB_INSTALL/images/nvidia+nemo+$FW_VERSION.sqsh} @@ -76,7 +76,30 @@ echo "SLURM_JOB_ID: ${SLURM_JOB_ID:-unknown}" echo "LLMB_EXPERIMENT_DIR: ${LLMB_EXPERIMENT_DIR:-unset}" echo "Timestamp: $(date -Iseconds)" -run_step "1" "lscpu - CPU details, SKU, core count" "lscpu" +run_step "1a" "lscpu - CPU details, SKU, core count" "lscpu" + +# shellcheck disable=SC2317 # invoked indirectly via declare -f +check_cpu_governor() { + if ! compgen -G '/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor' > /dev/null; then + echo "cpufreq not available on this system (no scaling_governor files)" + return + fi + + local govs + govs=$(cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | sort -u) + echo "${govs}" + echo + + if [ "${govs}" = "performance" ]; then + echo "[OK] All cores using performance governor." + else + echo "[WARNING] One or more cores not using performance governor." + echo " 'performance' is recommended for benchmarking to avoid frequency scaling noise." 
+ fi +} + +run_step "1b" "cpufreq scaling_governor - CPU frequency governor per core" \ + "$(declare -f check_cpu_governor); check_cpu_governor" run_step "2" "lspci -v - networking hardware and software layers" "lspci -v" run_step "3" "numactl -H - NUMA node assignments" "numactl -H" run_step "4" "cat /proc/cmdline - Linux kernel arguments" "cat /proc/cmdline" @@ -85,12 +108,94 @@ run_step "5" "systemd-detect-virt - virtualization check" \ run_step "6" "getconf PAGE_SIZE - memory page size" "getconf PAGE_SIZE" run_step "7" "dmesg | grep -i smmu - SMMU CMDQV feature status" \ "dmesg | grep -i smmu || echo 'No SMMU lines found or dmesg access is restricted.'" -run_step "8" "nvidia-smi -q - GPU hardware, firmware, software details" "nvidia-smi -q" +run_step "8a" "nvidia-smi -q - GPU hardware, firmware, software details" "nvidia-smi -q" +run_step "8b" "nvidia-smi topo -m - GPU/NIC topology matrix (NVLink, PCIe, NUMA affinity)" \ + "nvidia-smi topo -m | sed 's/\x1b\[[0-9;]*m//g'" run_step "9" "sysctl kernel.numa_balancing - automatic NUMA balancing" \ "val=\$(sysctl -n kernel.numa_balancing) && printf 'kernel.numa_balancing = %s (%s)\n' \"\${val}\" \"\$([ \"\${val}\" = 0 ] && echo disabled || echo enabled)\"" run_step "10" "ibv_devinfo - InfiniBand HCA device info" "ibv_devinfo" +# shellcheck disable=SC2317 # invoked indirectly via declare -f +check_slurm_topology_config() { + if ! command -v scontrol > /dev/null 2>&1; then + echo "scontrol command not found" + return 127 + fi + + local config + config=$(scontrol show config) + local rc=$? + if [[ ${rc} -ne 0 ]]; then + echo "[FAILED] scontrol show config returned exit code ${rc}." + return "${rc}" + fi + + echo "--- Slurm topology configuration ---" + printf "%s\n" "${config}" | grep -Ei "^[[:space:]]*(TopologyPlugin|TopologyParam)[[:space:]]*=" \ + || echo "(no TopologyPlugin or TopologyParam entries found)" +} + +# shellcheck disable=SC2317 # invoked indirectly via declare -f +check_slurm_topology() { + if ! 
command -v scontrol > /dev/null 2>&1; then + echo "scontrol command not found" + return 127 + fi + + local topology + topology=$(scontrol show topology) + local rc=$? + if [[ -n ${topology} ]]; then + printf "%s\n" "${topology}" + fi + + echo + if [[ ${rc} -ne 0 ]]; then + echo "[FAILED] scontrol show topology returned exit code ${rc}." + return "${rc}" + fi + + if [[ -z ${topology//[[:space:]]/} ]]; then + echo "[FAILED] scontrol show topology returned empty output." + echo " Slurm topology should be configured for benchmark clusters that" + echo " span more than one rack or network switch." + return 1 + fi + + echo "[OK] scontrol show topology returned topology data." +} + +run_step "11a" "scontrol show config - Slurm topology plugin configuration" \ + "$(declare -f check_slurm_topology_config); check_slurm_topology_config" +run_step "11b" "scontrol show topology - Slurm topology configuration is present" \ + "$(declare -f check_slurm_topology); check_slurm_topology" + +# shellcheck disable=SC2317 # invoked indirectly via declare -f +check_slurm_mpi_pmix() { + local output rc + output=$(srun --mpi=list 2>&1) + rc=$? + printf '%s\n' "${output}" + echo + + if [[ ${rc} -ne 0 ]]; then + echo "[FAILED] srun --mpi=list returned exit code ${rc}." + return "${rc}" + fi + + if printf '%s\n' "${output}" | grep -qiE "(^|[[:space:]])pmix"; then + echo "[OK] PMIx MPI plugin is available." + else + echo "[FAILED] PMIx MPI plugin not listed by 'srun --mpi=list'." + echo " LLMB recipes invoke Slurm with --mpi=pmix; Slurm must be built with --with-pmix." 
+ return 1 + fi +} + +run_step "11c" "srun --mpi=list - Slurm PMIx MPI plugin available" \ + "$(declare -f check_slurm_mpi_pmix); check_slurm_mpi_pmix" + # shellcheck disable=SC2317 # invoked indirectly via declare -f check_enroot_conf() { local conf="/etc/enroot/enroot.conf" @@ -169,14 +274,56 @@ check_enroot_environ() { fi } -run_step "11a" "enroot config - enroot.conf" \ +run_step "12a" "enroot config - enroot.conf" \ "$(declare -f check_enroot_conf); check_enroot_conf" -run_step "11b" "enroot config - environ.d" \ +run_step "12b" "enroot config - environ.d" \ "$(declare -f enroot_environ_has_assignment); $(declare -f check_enroot_environ); check_enroot_environ" -run_step "12" "srun container nvidia-smi - GPU visibility inside container (${IMAGE})" \ +run_step "13a" "srun container nvidia-smi - GPU visibility inside container (${IMAGE})" \ "srun --nodes=1 --ntasks=1 --container-image '${IMAGE}' --container-writable --no-container-mount-home nvidia-smi" +# shellcheck disable=SC2317 # invoked indirectly via declare -f +check_container_slurm_pytorch_hook() { + local image="$1" + local output rc + + output=$(srun --nodes=1 --ntasks=1 \ + --container-image "${image}" \ + --container-writable --no-container-mount-home \ + bash -c 'env | grep -E "^(MASTER_ADDR|MASTER_PORT|WORLD_SIZE|LOCAL_RANK|RANK)=" | sort || true' 2>&1) + rc=$? + printf '%s\n' "${output}" + echo + + if [[ ${rc} -ne 0 ]]; then + echo "[FAILED] srun bash inside container returned exit code ${rc}." + return "${rc}" + fi + + local missing=() + local var + for var in MASTER_ADDR MASTER_PORT WORLD_SIZE; do + if ! printf '%s\n' "${output}" | grep -q "^${var}=[^[:space:]]"; then + missing+=("${var}") + fi + done + + if [[ ${#missing[@]} -eq 0 ]]; then + echo "[OK] MASTER_ADDR, MASTER_PORT, and WORLD_SIZE are set inside the container." + echo " Enroot extra hook '50-slurm-pytorch.sh' appears to be installed and configured." 
+ else + echo "[FAILED] Missing PyTorch hook vars inside the container: ${missing[*]}" + echo " Enroot extra hook '50-slurm-pytorch.sh' is missing or not configured." + echo " LLMB recipes rely on this hook to populate MASTER_ADDR/MASTER_PORT/WORLD_SIZE" + echo " from Slurm env vars at container start." + echo " See https://github.com/NVIDIA/enroot/tree/main/conf/hooks/extra" + return 1 + fi +} + +run_step "13b" "srun container env - 50-slurm-pytorch.sh hook sets MASTER_ADDR, MASTER_PORT, WORLD_SIZE (${IMAGE})" \ + "$(declare -f check_container_slurm_pytorch_hook); check_container_slurm_pytorch_hook '${IMAGE}'" + print_banner "System Info Collection Summary" echo "Failed non-fatal steps: ${FAILED_STEPS}" echo "Completed at: $(date -Iseconds)" diff --git a/microbenchmarks/system_info/metadata.yaml b/microbenchmarks/system_info/metadata.yaml index c427ceb..a06f172 100644 --- a/microbenchmarks/system_info/metadata.yaml +++ b/microbenchmarks/system_info/metadata.yaml @@ -26,10 +26,10 @@ general: framework: host-debug # Host-debug steps run directly on the node; the container image is only used -# for the srun smoke-test (step 11) that validates pyxis/enroot + GPU visibility. +# for the final srun smoke-test that validates pyxis/enroot + GPU visibility. 
container: images: - - 'nvcr.io#nvidia/nemo:26.02.00' + - 'nvcr.io#nvidia/nemo:26.04.00' # Run run: diff --git a/nccl/metadata.yaml b/nccl/metadata.yaml index f4e0a3d..b82244e 100644 --- a/nccl/metadata.yaml +++ b/nccl/metadata.yaml @@ -27,7 +27,7 @@ general: container: images: - - 'nvcr.io#nvidia/nemo:26.02.00' + - 'nvcr.io#nvidia/nemo:26.04.00' repositories: comms-perf: diff --git a/nemotron-h/README.md b/nemotron-h/README.md index 47ca2a7..e1ab916 100644 --- a/nemotron-h/README.md +++ b/nemotron-h/README.md @@ -35,76 +35,58 @@ This recipe contains information and scripts to produce performance results for # Performance Measurement and Analysis -Performance for Nemotron-H training is measured by the achieved GPU FLOPS via the `TFLOPS_per_GPU` metric, which indicates computational throughput efficiency. Additionally, training step timing (seconds per iteration) is captured and logged for every training step in the main training log file [see Output Locations](#output-locations). +Performance is reported as: -Since the early training steps typically take much longer time (with input prefetch, activation memory allocation, and JIT compilation), we use the `parse_train_timing.sh` script to analyze iterations 35-44 and calculate mean and standard deviation for reliable performance metrics for both TFLOPS per GPU and timing measurements. +- `s/iter` — wall-clock seconds per training step +- `TFLOPS/GPU` — sustained FLOPS achieved per GPU -### Running the parse_train_timing.sh script +Each benchmark runs 50 steps; iterations 35–44 are averaged to skip warmup (input prefetch, activation allocation, JIT compilation). -To analyze training timing from your experiment results, run the script from the workload directory. In an installed environment, recipe files are available under `$LLMB_INSTALL/llmb_repo` (a copy created by the installer). 
+## Viewing results with `llmb-run jobs` -```bash -# Basic usage - parses results in the directory named 'experiments' in the current folder -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh - -# Specify a different experiments directory -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh /path/to/experiments +Each `llmb-run jobs` command refreshes Slurm state and parses the training log for any job that has finished (succeeded, failed, or cancelled) — there is no background updater. Run from `$LLMB_INSTALL`: -# Output in CSV format -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --format=csv +```bash +# List all jobs you've submitted, with parsed metrics +llmb-run jobs -# Output in JSON format -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --format=json +# Full details for one job (Job ID comes from the listing above) +llmb-run jobs show -# Show full filenames instead of shortened versions -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --full-names +# Open the training log; --follow tails it, --dir prints the experiment directory +llmb-run jobs log ``` -Example output: +Example `llmb-run jobs` output (illustrative values): -```shell -Train Step Timing and TFLOPS Analysis (iterations 35-44) -================================================================================ -Experiment Status Time Mean (s) Time Std (s) TFLOPS_per_GPU Mean TFLOPS_per_GPU Std ------------------------------------------------------------------------------------------- -------- ------------- ------------ ------------------- ------------------ -pretrain_nemotronh_56b_fp8_gpus128_tp4_pp1_cp1_vpNone_mbs2_gbs384_1063626 Success 5.310 0.007 1961.50 2.80 +```text + Workload DType Scale Job ID Profile Submit Time Slurm Status Elapsed s/iter TFLOPS/GPU + pretrain_example_8b bf16 128 1234567 No 2026-04-17 13:42 COMPLETED 00:12:34 4.21 1234.56 + pretrain_example_70b fp8 256 1234589 No 2026-04-17 14:05 RUNNING 00:03:11 ``` -To obtain 
throughput as a tokens per second measurement, follow this formula: +Blank `s/iter` or `TFLOPS/GPU` means the job has not finished yet, or the log did not contain enough completed iterations. See the [llmb-run README](../cli/llmb-run/README.md#jobs-command) for the full command reference. -```shell -(throughput in tokens per second) = (sequence length) * (global batch size) / training_step_timing -``` +## Derived metrics -E.g. 8192 * 384 / 5.310 = 592415 +To convert step time into tokens per second: -To calculate time to train estimate: - -```shell -(time to train in days) = (total tokens) / (throughput in tokens per second) / (number of seconds in a day) +```text +(throughput in tokens/sec) = (sequence length) * (global batch size) / (s/iter) ``` -E.g. 1e12 / 592415 / 86400 = 19.54 days +To estimate time-to-train for a target token budget: -To calculate the model flops utilization (MFU): - -```shell -MFU = (achieved TFLOPS_per_GPU) / (peak GPU FLOPS) +```text +(time to train in days) = (total tokens) / (throughput in tokens/sec) / 86400 ``` -The peak theoretical throughput for GB200 FP8 is **4.9** PFLOPS. +To compute model FLOPs utilization (MFU): -E.g. Nemotron-H 56B FP8 on 128x GB200 GPUs - -```shell -peak FLOPS for GB200 FP8 = 4.9 PFLOPS -achieved TFLOPS_per_GPU = 1961.50 TFLOPS - -MFU = 1961.50e+12 / 4.9e+15 = 40.03% +```text +MFU = TFLOPS/GPU / (peak GPU FLOPS) ``` -**Peak theoretical throughput across GPUs and Data Types (in TFLOPS)** - For peak theoretical throughput values used in MFU calculations, see the [Peak Theoretical Throughput](../README.md#peak-theoretical-throughput) section in the main README. # Prerequisites @@ -163,31 +145,33 @@ llmb-run submit -w pretrain_nemotron-h --dtype fp8 --scale 1024 ### Additional SLURM Parameters -Use a SLURM reservation: +For `llmb-run submit`, use the built-in Slurm flags instead of `ADDITIONAL_SLURM_PARAMS`. 
+ +Use a Slurm reservation: ```bash -ADDITIONAL_SLURM_PARAMS="reservation=my_reservation" llmb-run submit -w pretrain_nemotron-h --dtype fp8 --scale 128 +llmb-run submit -w pretrain_nemotron-h --dtype fp8 --scale 128 --reservation my_reservation ``` Run on specific nodes: ```bash -ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002" llmb-run submit -w pretrain_nemotron-h --dtype fp8 --scale 128 +llmb-run submit -w pretrain_nemotron-h --dtype fp8 --scale 128 --nodelist node001,node002 ``` Exclude specific nodes: ```bash -ADDITIONAL_SLURM_PARAMS="exclude=node003,node004" llmb-run submit -w pretrain_nemotron-h --dtype fp8 --scale 128 +llmb-run submit -w pretrain_nemotron-h --dtype fp8 --scale 128 --exclude node003,node004 ``` -Combine multiple parameters (semicolon-separated): +Combine multiple parameters: ```bash -ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002;reservation=my_reservation;exclusive" llmb-run submit -w pretrain_nemotron-h --dtype fp8 --scale 128 +llmb-run submit -w pretrain_nemotron-h --dtype fp8 --scale 128 --nodelist node001,node002 --reservation my_reservation --slurm-arg exclusive ``` -For more details on llmb-run usage, see the [llmb-run documentation](../cli/llmb-run/README.md). +For more details on `llmb-run` usage, see the [llmb-run documentation](../cli/llmb-run/README.md). 
## Direct Method @@ -271,7 +255,7 @@ experiments/ │ └── _/ │ ├── / │ │ ├── log-.out # Main training log with performance data -│ │ ├── sbatch_.out # Batch script output +│ │ ├── sbatch_.out # Batch script output │ │ └── nsys_profile/ # Profiling output (when enabled) │ │ └── *.nsys-rep files │ └── [batch scripts and other files] @@ -281,7 +265,7 @@ The `` typically follows the pattern: `pretrain_nemotron-h_56b_ **Key files:** -- `log-.out` - Contains training step timing and performance metrics analyzed by `parse_train_timing.sh` +- `log-.out` - Contains training step timing and performance metrics parsed by `llmb-run jobs` - `nsys_profile/` - Contains profiling traces when using the `-p` flag with `llmb-run` or when `ENABLE_PROFILE=true` # Profiling @@ -299,7 +283,7 @@ In order to view the resulting profiles, ensure you have the latest version of N - **MPI Ranks:** all ranks - **Job Steps:** 45-50 - **Output Location:** Profiling output saved alongside training results ([see Output Locations](#output-locations)) -- **Filename format:** `profile_{SLURM_JOBID}_{SLURM_NODEID}_{SLURM_PROCID}.nsys-rep` +- **Filename format:** `profile_${SLURM_JOB_ID}_${SLURM_NODEID}_${SLURM_LOCALID}.nsys-rep` **Example command:** diff --git a/nemotron-h/launch.sh b/nemotron-h/launch.sh index c15ee9e..d6a6647 100755 --- a/nemotron-h/launch.sh +++ b/nemotron-h/launch.sh @@ -86,7 +86,10 @@ if [[ -n ${RUN_CONF_MOUNTS:-""} ]]; then CONTAINER_MOUNTS+="${RUN_CONF_MOUNTS}" fi -CONFIG_OVERRIDES="" +CONFIG_OVERRIDES="${CONFIG_OVERRIDES:-}" +if [[ -n ${CONFIG_OVERRIDES} ]]; then + CONFIG_OVERRIDES+=" " +fi if [[ -n ${CONTAINER_MOUNTS} ]]; then CONFIG_OVERRIDES+=" --custom_mounts $CONTAINER_MOUNTS" fi diff --git a/nemotron-h/metadata.yaml b/nemotron-h/metadata.yaml index 8cdc88d..69c11f2 100644 --- a/nemotron-h/metadata.yaml +++ b/nemotron-h/metadata.yaml @@ -32,11 +32,11 @@ container: repositories: megatron_bridge: url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" - commit: 
"aeead1ae667d795ebe725cb4a608581a21f402cc" + commit: "f07871e23f637f4ae87d92256babc51fdbb12f39" nemo_run: url: "https://github.com/NVIDIA-NeMo/Run.git" - commit: "525d68bfce2d6baed86ed3d7d0edbae07833ea0d" + commit: "ab0c4328275c4c731f1bdea3ceb0e68a9a17a6a2" setup: venv_req: true @@ -84,4 +84,4 @@ run: model_configs: - model_size: '56b' dtypes: 'fp8' - scales: [32, 64, 128, 256, 512, 1024] + scales: [32, 64, 128, 256, 512] diff --git a/nemotron3/README.md b/nemotron3/README.md new file mode 100644 index 0000000..5ecb9a7 --- /dev/null +++ b/nemotron3/README.md @@ -0,0 +1,354 @@ +# Overview + +This recipe contains information and scripts to produce performance results for Nemotron 3 pre-training workloads (**30b** (usually referenced as nano) and **120b** (usually referenced as super)). The scripts help perform environment setup and launch benchmark jobs. Configurations use weak scaling methodology (global batch size scales proportionally with GPU count). + +## GB300 Nemotron 3 30B + +| Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA | +| --------- | :--: | :----: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: | +| FP8/BF16 | 8 | 8192 | 52 | 1 | 1 | 1 | 8 | 8 | | 4 | 512 | 16 | +| FP8/BF16 | 16 | 8192 | 52 | 1 | 1 | 1 | 8 | 16 | | 4 | 1024 | 16 | +| FP8/BF16 | 32 | 8192 | 52 | 1 | 1 | 1 | 8 | 32 | | 4 | 2048 | 16 | +| FP8/BF16 | 64 | 8192 | 52 | 1 | 1 | 1 | 8 | 64 | | 4 | 4096 | 16 | + +## GB200 Nemotron 3 30B + +| Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA | +| --------- | :--: | :----: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: | +| BF16 | 8 | 8192 | 52 | 1 | 1 | 1 | 8 | 8 | | 2 | 512 | 32 | +| BF16 | 16 | 8192 | 52 | 1 | 1 | 1 | 8 | 16 | | 2 | 1024 | 32 | +| BF16 | 32 | 8192 | 52 | 1 | 1 | 1 | 8 | 32 | | 2 | 2048 | 32 | +| BF16 | 64 | 8192 | 52 | 1 | 1 | 1 | 8 | 64 | | 2 | 4096 | 32 | + +## B300 Nemotron 3 30B + +| Precision | GPUs | SeqLen | Layers | TP | 
PP | CP | EP | DP | VP | MBS | GBS | GA | +| --------- | :--: | :----: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: | +| FP8/BF16 | 8 | 8192 | 52 | 1 | 1 | 1 | 8 | 8 | | 4 | 512 | 16 | +| FP8/BF16 | 16 | 8192 | 52 | 1 | 1 | 1 | 8 | 16 | | 4 | 1024 | 16 | +| FP8/BF16 | 32 | 8192 | 52 | 1 | 1 | 1 | 8 | 32 | | 4 | 2048 | 16 | +| FP8/BF16 | 64 | 8192 | 52 | 1 | 1 | 1 | 8 | 64 | | 4 | 4096 | 16 | + +## B200 Nemotron 3 30B + +| Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA | +| --------- | :--: | :----: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: | +| FP8/BF16 | 8 | 8192 | 52 | 1 | 1 | 1 | 8 | 8 | | 2 | 512 | 32 | +| FP8/BF16 | 16 | 8192 | 52 | 1 | 1 | 1 | 8 | 16 | | 2 | 1024 | 32 | +| FP8/BF16 | 32 | 8192 | 52 | 1 | 1 | 1 | 8 | 32 | | 2 | 2048 | 32 | +| FP8/BF16 | 64 | 8192 | 52 | 1 | 1 | 1 | 8 | 64 | | 2 | 4096 | 32 | + +## H100 Nemotron 3 30B + +| Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA | +| --------- | :--: | :----: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: | +| FP8/BF16 | 16 | 8192 | 52 | 1 | 1 | 1 | 8 | 16 | | 1 | 1024 | 64 | +| FP8/BF16 | 32 | 8192 | 52 | 1 | 1 | 1 | 8 | 32 | | 1 | 2048 | 64 | +| FP8/BF16 | 64 | 8192 | 52 | 1 | 1 | 1 | 8 | 64 | | 1 | 4096 | 64 | + +## GB300 Nemotron 3 120B + +| Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA | +| -------------- | :--: | :----: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: | +| NVFP4/FP8/BF16 | 64 | 8192 | 88 | 1 | 1 | 1 | 64 | 64 | | 1 | 512 | 8 | +| NVFP4/FP8/BF16 | 128 | 8192 | 88 | 1 | 1 | 1 | 64 | 128 | | 1 | 1024 | 8 | +| NVFP4/FP8/BF16 | 256 | 8192 | 88 | 1 | 1 | 1 | 64 | 256 | | 1 | 2048 | 8 | +| NVFP4/FP8/BF16 | 512 | 8192 | 88 | 1 | 1 | 1 | 64 | 512 | | 1 | 4096 | 8 | + +## B300 Nemotron 3 120B + +| Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA | +| --------- | :--: | :----: | 
:----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: | +| BF16 | 64 | 8192 | 88 | 1 | 1 | 1 | 8 | 64 | | 1 | 512 | 8 | +| BF16 | 128 | 8192 | 88 | 1 | 1 | 1 | 8 | 128 | | 1 | 1024 | 8 | +| BF16 | 256 | 8192 | 88 | 1 | 1 | 1 | 8 | 256 | | 1 | 2048 | 8 | +| BF16 | 512 | 8192 | 88 | 1 | 1 | 1 | 8 | 512 | | 1 | 4096 | 8 | + +## B200 Nemotron 3 120B + +| Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA | +| --------- | :--: | :----: | :----: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :--: | :-: | +| FP8/BF16 | 64 | 8192 | 88 | 1 | 1 | 1 | 64 | 64 | | 1 | 512 | 8 | +| FP8/BF16 | 128 | 8192 | 88 | 1 | 1 | 1 | 64 | 128 | | 1 | 1024 | 8 | +| FP8/BF16 | 256 | 8192 | 88 | 1 | 1 | 1 | 64 | 256 | | 1 | 2048 | 8 | +| FP8/BF16 | 512 | 8192 | 88 | 1 | 1 | 1 | 64 | 512 | | 1 | 4096 | 8 | + +# Performance Measurement and Analysis + +Performance is reported as: + +- `s/iter` — wall-clock seconds per training step +- `TFLOPS/GPU` — sustained FLOPS achieved per GPU + +Each benchmark runs 50 steps; iterations 35–44 are averaged to skip warmup (input prefetch, activation allocation, JIT compilation). + +## Viewing results with `llmb-run jobs` + +Each `llmb-run jobs` command refreshes Slurm state and parses the training log for any job that has finished (succeeded, failed, or cancelled) — there is no background updater. 
Run from `$LLMB_INSTALL`: + +```bash +# List all jobs you've submitted, with parsed metrics +llmb-run jobs + +# Full details for one job (Job ID comes from the listing above) +llmb-run jobs show + +# Open the training log; --follow tails it, --dir prints the experiment directory +llmb-run jobs log +``` + +Example `llmb-run jobs` output (illustrative values): + +```text + Workload DType Scale Job ID Profile Submit Time Slurm Status Elapsed s/iter TFLOPS/GPU + pretrain_example_8b bf16 128 1234567 No 2026-04-17 13:42 COMPLETED 00:12:34 4.21 1234.56 + pretrain_example_70b fp8 256 1234589 No 2026-04-17 14:05 RUNNING 00:03:11 +``` + +Blank `s/iter` or `TFLOPS/GPU` means the job has not finished yet, or the log did not contain enough completed iterations. See the [llmb-run README](../cli/llmb-run/README.md#jobs-command) for the full command reference. + +## Derived metrics + +To convert step time into tokens per second: + +```text +(throughput in tokens/sec) = (sequence length) * (global batch size) / (s/iter) +``` + +To estimate time-to-train for a target token budget: + +```text +(time to train in days) = (total tokens) / (throughput in tokens/sec) / 86400 +``` + +To compute model FLOPs utilization (MFU): + +```text +MFU = TFLOPS/GPU / (peak GPU FLOPS) +``` + +For peak theoretical throughput values used in MFU calculations, see the [Peak Theoretical Throughput](../README.md#peak-theoretical-throughput) section in the main README. + +# Prerequisites + +Requires Python 3.12.x, or conda. + +## Request Access + +No special access required to run this benchmark. + +## Slurm + +We reference a number of Slurm commands and parameters in this document. A brief summary is included below. It's important to note these are a guide and might not be applicable to all environments. Please consult with your system administrator for the parameters that are specific to your system. + +**Common parameters:** + +- `SBATCH_PARTITION` or `-p` - Partition (or queue) to use. 
+- `SBATCH_ACCOUNT` or `-A` - Slurm account to associate with your job, different from your user. Meant for accounting purposes. +- `SBATCH_GPUS_PER_NODE` or `--gres=gpu:` - If your cluster is configured with GRES this should be set to all GPUs in a node. Ignore if not configured. + - Encountering errors such as 'GPUs not found' or 'Cannot submit to this partition without GPU resources' means this setting is required. + +These parameters can be set either by exporting the environment variable or using the corresponding `sbatch` flag. + +## Prepare environment + +Use the **installer** referenced in the [main README](../README.md) to prepare the recipe environment. + +The following directory layout and key variables are used in the recipe: + +- `LLMB_INSTALL`: Top-level directory for all benchmarking artifacts (images, datasets, venvs, workloads, etc.). +- `LLMB_WORKLOAD`: Workload-specific directory, e.g. `${LLMB_INSTALL}/workloads/pretrain_nemotron_3`. +- Results, logs, and checkpoints are stored under subfolders of `LLMB_WORKLOAD` (see below). + +# Prepare Dataset + +Since Nemotron 3 training only uses synthetic datasets, this step is omitted. + +# Run Training + +Once the environment has been prepared, it is time to train a model. Training runs for 50 steps by default (`MAX_STEPS`, overridable) and then stops. Log files and results are stored under the `${LLMB_WORKLOAD}/experiments/` folder ([see Output Locations](#output-locations) for details). + +## Using llmb-run (Recommended) + +The easiest way to run benchmarks is using the llmb-run launcher tool. This method handles configuration automatically and provides a streamlined interface.
+ +```bash +# Navigate to your installation directory +cd $LLMB_INSTALL + +# Example: Nemotron 3 nano, BF16, 8 GPUs +llmb-run submit -w pretrain_nemotron_3 -s 30b --dtype bf16 --scale 8 + +# Example: Nemotron 3 super, BF16, 64 GPUs +llmb-run submit -w pretrain_nemotron_3 -s 120b --dtype bf16 --scale 64 +``` + +### Additional SLURM Parameters + +Use a SLURM reservation: + +```bash +ADDITIONAL_SLURM_PARAMS="reservation=my_reservation" llmb-run submit -w pretrain_nemotron_3 -s 120b --dtype bf16 --scale 64 +``` + +Run on specific nodes: + +```bash +ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002" llmb-run submit -w pretrain_nemotron_3 -s 120b --dtype bf16 --scale 64 +``` + +Exclude specific nodes: + +```bash +ADDITIONAL_SLURM_PARAMS="exclude=node003,node004" llmb-run submit -w pretrain_nemotron_3 -s 120b --dtype bf16 --scale 64 +``` + +Combine multiple parameters (semicolon-separated): + +```bash +ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002;reservation=my_reservation;exclusive" llmb-run submit -w pretrain_nemotron_3 -s 120b --dtype bf16 --scale 64 +``` + +For more details on llmb-run usage, see the [llmb-run documentation](../cli/llmb-run/README.md). + +## Direct Method + +Alternatively, you can run training directly using the launch script. This method provides more control over individual parameters and environment variables. + +**Important**: + +- Ensure your virtual environment is activated before running the training commands below. If you used the installer with conda, run `conda activate $LLMB_INSTALL/venvs/`. If you used the installer with python venv, run `source $LLMB_INSTALL/venvs//bin/activate`. 
+- Run the launch script from the installed recipe directory: `cd $LLMB_INSTALL/llmb_repo/nemotron3/` + +### Command Template + +```shell +JOB_TOTAL_GPUS= GPU_TYPE= [DTYPE=] [FP8_RECIPE=] [MODEL_SIZE=] [ADDITIONAL_SLURM_PARAMS=] ./launch.sh +``` + +### Environment Variables + +**Required:** + +- `JOB_TOTAL_GPUS`: Number of GPUs to use +- `GPU_TYPE`: Type of GPU hardware + - `gb300` - NVIDIA GB300 GPUs + - `gb200` - NVIDIA GB200 GPUs + - `b300` - NVIDIA B300 GPUs + - `b200` - NVIDIA B200 GPUs + - `h100` - NVIDIA H100 GPUs + +**Optional:** + +- `DTYPE`: Precision format (default: `bf16`). Supported: `bf16`, `fp8`, `nvfp4`. + +- `FP8_RECIPE`: FP8 recipe when `DTYPE=fp8` (default: `mx`). + +- `MODEL_SIZE`: Model variant (default: `30b`) + + - `30b` - Nemotron 3 Nano recipe + - `120b` - Nemotron 3 Super recipe + +- `ADDITIONAL_SLURM_PARAMS`: Extra `sbatch` flags (e.g. `--nodelist`, `--reservation`), semicolon-separated + + - Example: `"nodelist=node001,node002;reservation=my_reservation;exclusive"` + +### Example Commands + +Train Nemotron 3 Nano with BF16 precision on 8 GB300 GPUs: + +```shell +JOB_TOTAL_GPUS=8 GPU_TYPE=GB300 DTYPE=bf16 MODEL_SIZE=30b ./launch.sh +``` + +# Output Locations + +All benchmark results are saved under `$LLMB_WORKLOAD/experiments/` with the following structure: + +``` +experiments/ +├── / +│ └── _/ +│ ├── / +│ │ ├── log-.out # Main training log with performance data +│ │ ├── sbatch_.out # Batch script output +│ │ └── nsys_profile/ # Profiling output (when enabled) +│ │ └── *.nsys-rep files +│ └── [batch scripts and other files] +``` + +The `` typically follows the pattern: `pretrain_nemotron_3_nano___` + +**Key files:** + +- `log-.out` - Contains training step timing and performance metrics parsed by `llmb-run jobs` +- `nsys_profile/` - Contains profiling traces when using the `-p` flag with `llmb-run` or when `ENABLE_PROFILE=true` + +# Profiling + +Profiling is supported with Nsight Systems or PyTorch Profiler. 
+ +## Run Nsight Profiling + +To enable profiling with Nsight Systems, use the `-p` flag with `llmb-run` or set `ENABLE_PROFILE=true` when submitting your job. The job will run for a total of 50 steps where steps 45-50 will be profiled. + +In order to view the resulting profiles, ensure you have the latest version of Nsight Systems installed. For more information visit: [Nsight Systems](https://docs.nvidia.com/nsight-systems/) + +### Profiling job details: + +- **MPI Ranks:** all ranks +- **Job Steps:** 45-50 +- **Output Location:** Profiling output saved alongside training results ([see Output Locations](#output-locations)) +- **Filename format:** `profile_${SLURM_JOB_ID}_${SLURM_NODEID}_${SLURM_LOCALID}.nsys-rep` + +**Example command:** + +```shell +llmb-run submit -w pretrain_nemotron_3 -s 30b --dtype fp8 --scale 8 -p +``` + +### Customizing profiling behavior: + +- Specify job steps to profile: + - `PROFILE_START_STEP`: start profiling on this job step. + - Default: 45 + - `PROFILE_STOP_STEP`: stop profiling on this job step. + - Default: 50 +- Enable GPU metrics collection: + - `ENABLE_GPU_METRICS`: Enable GPU metrics collection during Nsight profiling (default: false) + - When set to `true` along with `ENABLE_PROFILE=true`, captures detailed GPU performance metrics + - Provides additional GPU utilization, memory usage, and compute efficiency data + - May require additional system configuration for GPU device metrics to work properly + +**Example command with GPU metrics:** + +```shell +ENABLE_GPU_METRICS=true llmb-run submit -w pretrain_nemotron_3 -s 30b --dtype bf16 --scale 8 -p +``` + +### Viewing results + +In order to view the profile traces (\*.nsys-rep files) interactively: + +- Install the latest [Nsight Systems client](https://developer.nvidia.com/nsight-systems/get-started) on your preferred system +- Copy the generated .nsys-rep files to a folder on your preferred system.
E.g., /home/nsight-traces/ +- Open Nsight Systems client, then click "File | Open" and select one or more .nsys-rep files from that folder. For more details, see [Reading Your Report in GUI guide](https://docs.nvidia.com/nsight-systems/UserGuide/index.html#opening-an-existing-report). +- Once loaded you can analyze the workload behavior to learn about any performance bottlenecks associated with the model or the job run. + +Since most of the benchmarking jobs run on multiple GPUs, there will be multiple .nsys-rep files generated for each run. [Multi-Report Analysis Guide](https://docs.nvidia.com/nsight-systems/UserGuide/index.html#multi-report-analysis) will be very helpful to automate the analysis and get to results quicker by using Nsight recipes. + +**See** these [tutorials](https://developer.nvidia.com/nsight-systems/get-started#tutorials) to get a quick start if you are new to Nsight profiling. + +## PyTorch Profiling + +PyTorch Profiling is intended for rare, advanced debugging scenarios such as NCCL correlation analysis. To enable it, set `ENABLE_PYTORCH_PROFILE=true` when submitting your job. + +> **Note:** This option is mutually exclusive with Nsight profiling (`ENABLE_PROFILE`). Both cannot be enabled at the same time. + +**Example command:** + +```shell +ENABLE_PYTORCH_PROFILE=true llmb-run submit -w pretrain_nemotron_3 -s 30b --dtype bf16 --scale 8 +``` + +Trace files are saved to `torch_profile/rank-N.json.gz` in the job output directory, where `N` is the rank number. For details on the PyTorch Profiler and how to view resulting traces, see the [PyTorch Profiler documentation](https://docs.pytorch.org/tutorials/recipes/recipes/profiler_recipe.html). diff --git a/nemotron3/launch.sh b/nemotron3/launch.sh new file mode 100755 index 0000000..1652684 --- /dev/null +++ b/nemotron3/launch.sh @@ -0,0 +1,156 @@ +#!/bin/bash +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: MIT +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +# DEALINGS IN THE SOFTWARE. + +if [ ${BASH_VERSION:0:1} -lt 4 ] || { [ ${BASH_VERSION:0:1} -eq 4 ] && [ ${BASH_VERSION:2:1} -lt 2 ]; }; then + printf "Unsupported %s version: %s\n" "${BASH}" "${BASH_VERSION}" >&2 + echo "Requires Bash 4.2 or greater." >&2 + exit 1 +fi + +set -eu -o pipefail + +export WORKLOAD_TYPE=pretrain +export MODEL_NAME=nemotron_3 +export FW_VERSION=26.04.00 + +export OPENBLAS_NUM_THREADS=1 # Required for login nodes with tight memory restrictions. Do not remove.
+ +export LLMB_WORKLOAD=$LLMB_INSTALL/workloads/${WORKLOAD_TYPE}_${MODEL_NAME} +export NEMORUN_HOME=$LLMB_WORKLOAD +export LLMB_REPO=$PWD + +export IMAGE=${RUN_CONF_IMAGE:-$LLMB_INSTALL/images/nvidia+nemo+$FW_VERSION.sqsh} + +DTYPE=${DTYPE:-bf16} +DTYPE=${DTYPE,,} +FP8_RECIPE=${FP8_RECIPE:-mx} +FP8_RECIPE=${FP8_RECIPE,,} +GPU_TYPE=${GPU_TYPE:?GPU_TYPE is a required variable.} +GPU_TYPE=${GPU_TYPE,,} +JOB_TOTAL_GPUS=${JOB_TOTAL_GPUS:?JOB_TOTAL_GPUS is a required variable.} +MODEL_SIZE=${MODEL_SIZE:-30b} +MODEL_SIZE=${MODEL_SIZE,,} + +if [[ $MODEL_SIZE == "30b" ]]; then + MODEL_RECIPE_NAME="nemotron_3_nano" +elif [[ $MODEL_SIZE == "120b" ]]; then + MODEL_RECIPE_NAME="nemotron_3_super" +else + echo "Error: Invalid MODEL_SIZE '$MODEL_SIZE'. Supported values: 30b, 120b" >&2 + exit 1 +fi + +PROFILE_ENABLED=${ENABLE_PROFILE:-false} +PROFILE_ENABLED=${PROFILE_ENABLED,,} +PYTORCH_PROFILE_ENABLED=${ENABLE_PYTORCH_PROFILE:-false} +PYTORCH_PROFILE_ENABLED=${PYTORCH_PROFILE_ENABLED,,} +PROFILE_START_STEP=${PROFILE_START_STEP:-45} +PROFILE_STOP_STEP=${PROFILE_STOP_STEP:-50} +GPU_METRICS_ENABLED=${ENABLE_GPU_METRICS:-false} +GPU_METRICS_ENABLED=${GPU_METRICS_ENABLED,,} +ENABLE_VBOOST=${ENABLE_VBOOST:-false} +ENABLE_VBOOST=${ENABLE_VBOOST,,} +TIME_LIMIT=${TIME_LIMIT:-"00:45:00"} +MAX_STEPS=${MAX_STEPS:-50} + +if [[ $DTYPE == "fp8" ]]; then + if [[ $GPU_TYPE == "h100" ]]; then + FP8_RECIPE="cs" + fi + COMPUTE_TYPE=${DTYPE}_${FP8_RECIPE} +else + COMPUTE_TYPE=${DTYPE} +fi + +# Handle additional SLURM parameters from environment variable +ADDITIONAL_SLURM_PARAMS=${ADDITIONAL_SLURM_PARAMS:-""} + +# Add additional SLURM parameters if provided +SLURM_ARGS="" +if [ -n "$ADDITIONAL_SLURM_PARAMS" ]; then + SLURM_ARGS="--additional_slurm_params ${ADDITIONAL_SLURM_PARAMS}" +fi + +export HF_HOME="$LLMB_INSTALL/.cache/huggingface" +CONTAINER_MOUNTS="$HF_HOME" + +if [[ -n ${RUN_CONF_MOUNTS:-""} ]]; then + if [[ -n ${CONTAINER_MOUNTS} ]]; then + CONTAINER_MOUNTS+="," + fi + 
CONTAINER_MOUNTS+="${RUN_CONF_MOUNTS}" +fi + +CONFIG_OVERRIDES="" +if [[ -n ${CONTAINER_MOUNTS} ]]; then + CONFIG_OVERRIDES+=" --custom_mounts $CONTAINER_MOUNTS" +fi + +if [[ $PROFILE_ENABLED == "true" ]]; then + CONFIG_OVERRIDES+=" --enable_nsys " + CONFIG_OVERRIDES+=" --profiling_start_step=$PROFILE_START_STEP " + CONFIG_OVERRIDES+=" --profiling_stop_step=$PROFILE_STOP_STEP " + PROFILE_RANKS=$(seq -s, 0 $((JOB_TOTAL_GPUS - 1))) + CONFIG_OVERRIDES+=" --profiling_ranks=$PROFILE_RANKS" + CONFIG_OVERRIDES+=" --nsys_trace=cuda " + CONFIG_OVERRIDES+=" --nsys_extra_args=--nvtx-domain-include=NCCL " + if [[ $GPU_METRICS_ENABLED == true ]]; then + CONFIG_OVERRIDES+=" --profiling_gpu_metrics " + fi +fi + +if [[ $PYTORCH_PROFILE_ENABLED == "true" ]]; then + CONFIG_OVERRIDES+=" --pytorch_profiler true " +fi + +if [[ $ENABLE_VBOOST == true ]]; then + CONFIG_OVERRIDES+=" --enable_vboost true " +fi + +if [[ $GPU_TYPE == "gb300" ]] || [[ $GPU_TYPE == "gb200" ]]; then + GPUS_PER_NODE=4 +elif [[ $GPU_TYPE == "b300" ]] || [[ $GPU_TYPE == "b200" ]] || [[ $GPU_TYPE == "h100" ]]; then + GPUS_PER_NODE=8 +fi + +# run command +pushd $LLMB_WORKLOAD/Megatron-Bridge + +python3 scripts/performance/setup_experiment.py \ + --container_image $IMAGE \ + --compute_dtype $COMPUTE_TYPE \ + --gpu $GPU_TYPE \ + --num_gpus $JOB_TOTAL_GPUS \ + --gpus_per_node $GPUS_PER_NODE \ + --offline \ + --model_family_name nemotronh \ + --model_recipe_name ${MODEL_RECIPE_NAME} \ + ${CONFIG_OVERRIDES} \ + --account $SBATCH_ACCOUNT \ + --partition $SBATCH_PARTITION \ + --log_dir $NEMORUN_HOME \ + --time_limit $TIME_LIMIT \ + --max_steps $MAX_STEPS \ + --packager none \ + $SLURM_ARGS + +popd diff --git a/nemotron3/metadata.yaml b/nemotron3/metadata.yaml new file mode 100644 index 0000000..da81446 --- /dev/null +++ b/nemotron3/metadata.yaml @@ -0,0 +1,108 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: MIT +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +# DEALINGS IN THE SOFTWARE. 
+ +# Setup +general: + workload: nemotron_3 + workload_type: pretrain + framework: megatron_bridge + +container: + images: + - 'nvcr.io#nvidia/nemo:26.04.00' + +downloads: + huggingface: + - repo_id: 'nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16' + - repo_id: 'nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16' + +repositories: + megatron_bridge: + url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" + commit: "fab68031197b64934027e188c0cb417fdf1e1d7a" + + nemo_run: + url: "https://github.com/NVIDIA-NeMo/Run.git" + commit: "64b91e0187b93475ea0d54028317e349ced7ac1b" + +setup: + venv_req: true + dependencies: + git: + megatron_bridge: + repo_key: megatron_bridge + install_method: + type: clone + pip: + - package: nemo_run + repo_key: nemo_run + +tools: + nsys: + by_gpu: + gb300: "2025.5.1.121-3638078" + gb200: "2025.5.1.121-3638078" + b300: "2025.5.1.121-3638078" + b200: "2025.5.1.121-3638078" + +run: + launcher_type: 'megatron_bridge' + launch_script: 'launch.sh' + + gpu_configs: + gb300: + model_configs: + - model_size: '30b' + dtypes: ['bf16', 'fp8'] + scales: [8, 16, 32, 64] + - model_size: '120b' + dtypes: ['bf16', 'fp8', 'nvfp4'] + scales: [64, 128, 256, 512] + + gb200: + model_configs: + - model_size: '30b' + dtypes: ['bf16'] + scales: [8, 16, 32, 64] + + b300: + model_configs: + - model_size: '30b' + dtypes: ['bf16', 'fp8'] + scales: [8, 16, 32, 64] + - model_size: '120b' + dtypes: ['bf16'] + scales: [64, 128, 256, 512] + + b200: + model_configs: + - model_size: '30b' + dtypes: ['bf16', 'fp8'] + scales: [8, 16, 32, 64] + - model_size: '120b' + dtypes: ['bf16', 'fp8'] + scales: [64, 128, 256, 512] + + h100: + model_configs: + - model_size: '30b' + dtypes: ['bf16', 'fp8'] + scales: [16, 32, 64] diff --git a/qwen3/pretrain/README.md b/qwen3/pretrain/README.md index b32d7e9..8bcc743 100644 --- a/qwen3/pretrain/README.md +++ b/qwen3/pretrain/README.md @@ -24,8 +24,8 @@ Configurations use weak scaling methodology (global batch size scales proportion | Precision | 
GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA | | :-------- | :--: | :----: | :----: | --: | --: | --: | --: | --: | --: | --: | :---: | --: | -| BF16 | 256 | 4096 | 94 | 1 | 4 | 1 | 16 | 64 | 12 | 2 | 8192 | 64 | -| BF16 | 512 | 4096 | 94 | 1 | 4 | 1 | 16 | 128 | 12 | 2 | 16384 | 64 | +| BF16 | 256 | 4096 | 94 | 1 | 4 | 1 | 32 | 64 | N/A | 2 | 8192 | 64 | +| BF16 | 512 | 4096 | 94 | 1 | 4 | 1 | 32 | 128 | N/A | 2 | 16384 | 64 | ## GB200 @@ -60,8 +60,8 @@ Configurations use weak scaling methodology (global batch size scales proportion | Precision | GPUs | SeqLen | Layers | TP | PP | CP | EP | DP | VP | MBS | GBS | GA | | :-------- | :--: | :----: | :----: | --: | --: | --: | --: | --: | --: | --: | :---: | --: | -| BF16 | 256 | 4096 | 94 | 1 | 8 | 1 | 8 | 32 | N/A | 1 | 8192 | 256 | -| BF16 | 512 | 4096 | 94 | 1 | 8 | 1 | 8 | 64 | N/A | 1 | 16384 | 256 | +| BF16 | 256 | 4096 | 94 | 1 | 8 | 1 | 8 | 32 | N/A | 2 | 8192 | 128 | +| BF16 | 512 | 4096 | 94 | 1 | 8 | 1 | 8 | 64 | N/A | 2 | 16384 | 128 | ## B200 @@ -100,65 +100,58 @@ Configurations use weak scaling methodology (global batch size scales proportion # Performance Measurement and Analysis -Performance for Qwen3 training is measured by the achieved GPU FLOPS via the `TFLOPS_per_GPU` metric, which indicates computational throughput efficiency. Additionally, training step timing (seconds per iteration) is captured and logged for every training step in the main training log file [see Output Locations](#output-locations). +Performance is reported as: + +- `s/iter` — wall-clock seconds per training step +- `TFLOPS/GPU` — sustained FLOPS achieved per GPU -Since the early training steps typically take much longer time (with input prefetch, activation memory allocation, and JIT compilation), we use the `parse_train_timing_mbridge.sh` script to analyze iterations 35-44 and calculate mean and standard deviation for reliable performance metrics for both TFLOPS per GPU and timing measurements. 
+Each benchmark runs 50 steps; iterations 35–44 are averaged to skip warmup (input prefetch, activation allocation, JIT compilation). -### Running the parse_train_timing_mbridge.sh script +## Viewing results with `llmb-run jobs` -To analyze training timing from your experiment results, run the script from the workload directory. In an installed environment, recipe files are available under `$LLMB_INSTALL/llmb_repo` (a copy created by the installer). +Each `llmb-run jobs` command refreshes Slurm state and parses the training log for any job that has finished (succeeded, failed, or cancelled) — there is no background updater. Run from `$LLMB_INSTALL`: ```bash -# Basic usage - parses results in the directory named 'experiments' in the current folder -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh +# List all jobs you've submitted, with parsed metrics +llmb-run jobs -# Specify a different experiments directory -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh /path/to/experiments +# Full details for one job (Job ID comes from the listing above) +llmb-run jobs show -# Output in CSV format -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --format=csv +# Open the training log; --follow tails it, --dir prints the experiment directory +llmb-run jobs log +``` -# Output in JSON format -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --format=json +Example `llmb-run jobs` output (illustrative values): -# Show full filenames instead of shortened versions -$LLMB_INSTALL/llmb_repo/common/parse_train_timing_mbridge.sh --full-names +```text + Workload DType Scale Job ID Profile Submit Time Slurm Status Elapsed s/iter TFLOPS/GPU + pretrain_example_8b bf16 128 1234567 No 2026-04-17 13:42 COMPLETED 00:12:34 4.21 1234.56 + pretrain_example_70b fp8 256 1234589 No 2026-04-17 14:05 RUNNING 00:03:11 ``` -Example output: +Blank `s/iter` or `TFLOPS/GPU` means the job has not finished yet, or the log did not contain enough completed 
iterations. See the [llmb-run README](../../cli/llmb-run/README.md#jobs-command) for the full command reference. -```shell -Elapsed Time (ms) and TFLOPS/GPU Analysis (iterations 35-44) -================================================================================ -Experiment Status Time Mean (ms) Time Std (ms) TFLOPS_per_GPU Mean TFLOPS_per_GPU Std ------------------------------------------------------------------------------------------- -------- ------------- ------------ ------------------- ------------------ -pretrain_qwen3_235b_a22b_bf16_gpus256_tp2_pp8_cp1_vp4_ep32_mbs1_gbs2048_4006524 Success 25532.270 13.911 190.00 0.11 -``` +## Derived metrics -To obtain throughput as a tokens per second measurement, follow this formula: +To convert step time into tokens per second: -```shell -(throughput in tokens per second) = (sequence length) * (global batch size) / training_step_timing +```text +(throughput in tokens/sec) = (sequence length) * (global batch size) / (s/iter) ``` -E.g. +To estimate time-to-train for a target token budget: -To calculate time to train estimate: - -```shell -(time to train in days) = (total tokens) / (throughput in tokens per second) / (number of seconds in a day) +```text +(time to train in days) = (total tokens) / (throughput in tokens/sec) / 86400 ``` -E.g. - -To calculate the model flops utilization (MFU): +To compute model FLOPs utilization (MFU): -```shell -MFU = (achieved TFLOPS_per_GPU) / (peak GPU FLOPS) +```text +MFU = (achieved TFLOPS/GPU) / (peak TFLOPS per GPU) ``` -**Peak theoretical throughput across GPUs and Data Types (in TFLOPS)** - For peak theoretical throughput values used in MFU calculations, see the [Peak Theoretical Throughput](../../README.md#peak-theoretical-throughput) section in the main README.
# Prerequisites @@ -223,31 +216,33 @@ llmb-run submit -w pretrain_qwen3 -s 235b --dtype bf16 --scale 256 ### Additional SLURM Parameters -Use a SLURM reservation: +For `llmb-run submit`, use the built-in Slurm flags instead of `ADDITIONAL_SLURM_PARAMS`. + +Use a Slurm reservation: ```bash -ADDITIONAL_SLURM_PARAMS="reservation=my_reservation" llmb-run submit -w pretrain_qwen3 -s 235b --dtype bf16 --scale 256 +llmb-run submit -w pretrain_qwen3 -s 235b --dtype bf16 --scale 256 --reservation my_reservation ``` Run on specific nodes: ```bash -ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002" llmb-run submit -w pretrain_qwen3 -s 235b --dtype bf16 --scale 256 +llmb-run submit -w pretrain_qwen3 -s 235b --dtype bf16 --scale 256 --nodelist node001,node002 ``` Exclude specific nodes: ```bash -ADDITIONAL_SLURM_PARAMS="exclude=node003,node004" llmb-run submit -w pretrain_qwen3 -s 235b --dtype bf16 --scale 256 +llmb-run submit -w pretrain_qwen3 -s 235b --dtype bf16 --scale 256 --exclude node003,node004 ``` -Combine multiple parameters (semicolon-separated): +Combine multiple parameters: ```bash -ADDITIONAL_SLURM_PARAMS="nodelist=node001,node002;reservation=my_reservation;exclusive" llmb-run submit -w pretrain_qwen3 -s 235b --dtype bf16 --scale 256 +llmb-run submit -w pretrain_qwen3 -s 235b --dtype bf16 --scale 256 --nodelist node001,node002 --reservation my_reservation --slurm-arg exclusive ``` -For more details on llmb-run usage, see the [llmb-run documentation](../../cli/llmb-run/README.md). +For more details on `llmb-run` usage, see the [llmb-run documentation](../../cli/llmb-run/README.md). 
## Direct Method @@ -320,7 +315,7 @@ The `` typically follows the pattern: `pretrain_qwen3_.out` - Contains training step timing and performance metrics analyzed by `parse_train_timing_mbridge.sh` +- `log-.out` - Contains training step timing and performance metrics parsed by `llmb-run jobs` - `nsys_profile/` - Contains profiling traces when using the `-p` flag with `llmb-run` or when `ENABLE_PROFILE=true` # Profiling @@ -390,4 +385,4 @@ PyTorch Profiling is intended for rare, advanced debugging scenarios such as NCC ENABLE_PYTORCH_PROFILE=true llmb-run submit -w pretrain_qwen3 -s 235b --dtype bf16 --scale 256 ``` -For details on the PyTorch Profiler and how to view resulting traces, see the [PyTorch Profiler documentation](https://docs.pytorch.org/tutorials/recipes/recipes/profiler_recipe.html). +Trace files are saved to `torch_profile/rank-N.json.gz` in the job output directory, where `N` is the rank number. For details on the PyTorch Profiler and how to view resulting traces, see the [PyTorch Profiler documentation](https://docs.pytorch.org/tutorials/recipes/recipes/profiler_recipe.html). 
diff --git a/qwen3/pretrain/launch.sh b/qwen3/pretrain/launch.sh index 8545573..2c37c33 100755 --- a/qwen3/pretrain/launch.sh +++ b/qwen3/pretrain/launch.sh @@ -60,11 +60,7 @@ export GPU_TYPE=${GPU_TYPE:?GPU_TYPE is a required variable.} export GPU_TYPE=${GPU_TYPE,,} export JOB_TOTAL_GPUS=${JOB_TOTAL_GPUS:?JOB_TOTAL_GPUS is a required variable.} -if [ "$GPU_TYPE" = "gb300" ]; then - FW_VERSION=26.02.00 -else - FW_VERSION=26.02.01 -fi +FW_VERSION=26.04.00 export IMAGE=${RUN_CONF_IMAGE:-$LLMB_INSTALL/images/nvidia+nemo+$FW_VERSION.sqsh} if [ "$MODEL_SIZE" = "235b" ]; then @@ -104,7 +100,10 @@ if [[ -n ${RUN_CONF_MOUNTS:-""} ]]; then CONTAINER_MOUNTS+="${RUN_CONF_MOUNTS}" fi -CONFIG_OVERRIDES="" +CONFIG_OVERRIDES="${CONFIG_OVERRIDES:-}" +if [[ -n ${CONFIG_OVERRIDES} ]]; then + CONFIG_OVERRIDES+=" " +fi if [[ -n ${CONTAINER_MOUNTS} ]]; then CONFIG_OVERRIDES+=" --custom_mounts $CONTAINER_MOUNTS" fi @@ -213,6 +212,7 @@ python3 scripts/performance/setup_experiment.py \ --log_dir $NEMORUN_HOME \ --time_limit $TIME_LIMIT \ --max_steps $MAX_STEPS \ + --packager none \ $SLURM_ARGS popd diff --git a/qwen3/pretrain/metadata.yaml b/qwen3/pretrain/metadata.yaml index 5c12792..2a6183d 100644 --- a/qwen3/pretrain/metadata.yaml +++ b/qwen3/pretrain/metadata.yaml @@ -27,11 +27,7 @@ general: container: images: - by_gpu: - default: - - 'nvcr.io#nvidia/nemo:26.02.01' - gb300: - - 'nvcr.io#nvidia/nemo:26.02.00' + - 'nvcr.io#nvidia/nemo:26.04.00' downloads: huggingface: @@ -39,21 +35,12 @@ downloads: - repo_id: 'Qwen/Qwen3-235B-A22B' repositories: - by_gpu: - default: - megatron_bridge: - url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" - commit: "aeead1ae667d795ebe725cb4a608581a21f402cc" - nemo_run: - url: "https://github.com/NVIDIA-NeMo/Run.git" - commit: "525d68bfce2d6baed86ed3d7d0edbae07833ea0d" - gb300: - megatron_bridge: - url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" - commit: "6b3b5ba7ef64182caba23faa8d7abc0125aa3807" - nemo_run: - url: 
"https://github.com/NVIDIA-NeMo/Run.git" - commit: "ab0c4328275c4c731f1bdea3ceb0e68a9a17a6a2" + megatron_bridge: + url: "https://github.com/NVIDIA-NeMo/Megatron-Bridge.git" + commit: "fab68031197b64934027e188c0cb417fdf1e1d7a" + nemo_run: + url: "https://github.com/NVIDIA-NeMo/Run.git" + commit: "64b91e0187b93475ea0d54028317e349ced7ac1b" setup: venv_req: true @@ -67,6 +54,13 @@ setup: - package: nemo_run repo_key: nemo_run +tools: + nsys: + by_gpu: + gb300: "2025.5.1.121-3638078" + gb200: "2025.5.1.121-3638078" + b200: "2025.5.1.121-3638078" + # Run run: diff --git a/release.yaml b/release.yaml index 5cc0130..e66c905 100644 --- a/release.yaml +++ b/release.yaml @@ -22,5 +22,5 @@ # Top-level release metadata for the LLM Benchmarking Collection # This file centralizes version management across all recipes -llmb_version: '26.02.01' +llmb_version: '26.05.00'