dev_cluster: Add --cpuset-stride flag #29843

Merged
StephanDollberg merged 2 commits into dev from stephan/cpuset-skip on Mar 17, 2026

@StephanDollberg
Member

Make dev_cluster cpuset generation optionally use a stride.

This is useful on SMT systems or for SMT-specific testing:

  • --cpuset-stride=2 on a 32 core system will generate 0,2, 4,6 and
    8,10
  • --cpuset-stride=16 on a 32 core system will generate 0,16, 1,17 and
    2,18

Both are valid scenarios, depending on how the system assigns core ids
to SMT siblings.

Note that flag validation is not very detailed, as it's generally hard
to validate the different valid scenarios (e.g. the two above).
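
The two examples are consistent with a simple mapping, sketched below. This is a reconstruction from the examples, not the merged code: it assumes three nodes with two cores each, steps through the CPU ids by `stride`, and shifts by one id after each full pass over the CPUs. The wrap-around rule and the `node_cpuset` helper are assumptions for illustration; the parameter names follow the `cpuset_cpu` signature quoted in the review.

```python
def cpuset_cpu(
    cpu_count: int, stride: int, cores: int, node_index: int, core_index: int
) -> int:
    """Map (node, core) to a CPU id, stepping by `stride` (assumed logic)."""
    # Position of this core in the flat list of all cores across all nodes.
    linear = node_index * cores + core_index
    position = linear * stride
    # Assumed wrap-around: after each full pass over the CPUs, shift by one.
    return position % cpu_count + position // cpu_count


def node_cpuset(cpu_count: int, stride: int, cores: int, node_index: int) -> str:
    """Hypothetical helper: the unrolled cpuset string for one node."""
    return ",".join(
        str(cpuset_cpu(cpu_count, stride, cores, node_index, core))
        for core in range(cores)
    )


if __name__ == "__main__":
    for stride in (2, 16):
        print(stride, [node_cpuset(32, stride, 2, node) for node in range(3)])
```

Like the flag itself, the sketch does no validation. It is only meaningful when the stride divides the CPU count and the nodes do not request more cores than exist.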

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.3.x
  • v25.2.x
  • v25.1.x

Release Notes

  • none

Always pass unrolled cpusets, i.e. 0,1,2,3 instead of 0-3.

This is preparation for the "smarter" cpuset assignment logic.
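
For illustration, an unrolled cpuset can be built as follows. `unrolled_cpuset` is a hypothetical helper, not a function taken from tools/dev_cluster.py.

```python
def unrolled_cpuset(first_cpu: int, count: int) -> str:
    """Return an explicit comma-separated CPU list instead of a range."""
    return ",".join(str(cpu) for cpu in range(first_cpu, first_cpu + count))


print(unrolled_cpuset(0, 4))  # prints 0,1,2,3
```

The explicit form matters once the ids stop being contiguous: a set such as 0,16 cannot be written as a single range.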
Contributor

Copilot AI left a comment

Pull request overview

Adds an optional CPU interleaving strategy to tools/dev_cluster.py so generated --cpuset values can be spaced by a user-provided stride, enabling SMT-aware placements for dev clusters.

Changes:

  • Introduces --cpuset-stride CLI flag (default 1) to control spacing between allocated CPU IDs.
  • Adds cpuset_cpu(...) and updates cpuset generation to emit an explicit CPU list rather than a contiguous range.
  • Threads the new stride setting through Redpanda node startup.
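
A minimal sketch of how such a flag could be declared, assuming the script parses its options with `argparse`. Only the flag name and its default of 1 come from the overview above; the help text and parser description are made up for illustration.

```python
import argparse

parser = argparse.ArgumentParser(description="dev cluster launcher (sketch)")
parser.add_argument(
    "--cpuset-stride",
    type=int,
    default=1,  # default from the overview: neighbouring CPU ids
    help="spacing between allocated CPU ids",
)

args = parser.parse_args(["--cpuset-stride", "2"])
print(args.cpuset_stride)  # prints 2
```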

Comment thread tools/dev_cluster.py Outdated


def cpuset_cpu(
    cpu_count: int, stride: int, cores: int, node_index: int, core_index: int

Member

we use both "cpu" and "core" here, is there a difference I should be aware of?

Member Author

One ("cpu") is the actual hardware CPU count, the other ("core") is --smp. Let me clarify.

Member Author

Improved

travisdowns previously approved these changes Mar 16, 2026
@StephanDollberg
Member Author

Argh, actually this is more complicated. psutil cpu thing also excludes offline cpus.

@StephanDollberg
Member Author

Using nproc --all now to get the count.
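
A minimal sketch of reading that count from Python, assuming the tool shells out to `nproc`. `installed_cpu_count` is a hypothetical name, and the `os.cpu_count()` fallback is an addition for systems without coreutils, not something the PR describes.

```python
import os
import shutil
import subprocess


def installed_cpu_count() -> int:
    """Count installed CPUs via `nproc --all`, which includes offline ones."""
    if shutil.which("nproc") is None:
        # Fallback when nproc is missing; may differ from `nproc --all`.
        return os.cpu_count() or 1
    result = subprocess.run(
        ["nproc", "--all"], capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print(installed_cpu_count())
```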

@travisdowns
Member

> Argh, actually this is more complicated. psutil cpu thing also excludes offline cpus.

How should stride work with offline CPUs? Should it include them (treat them as if online, effectively)?

@StephanDollberg
Member Author

> How should stride work with offline CPUs? Should it include them (treat them as if online, effectively)?

I think (this is what's implemented) it should just ignore them and leave all responsibility with the user as otherwise the logic would get even more complex.
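
Since that responsibility sits with the user, a quick check before picking a stride might look like the sketch below. It is not part of the PR: `parse_cpu_list` and `offline_cpus` are hypothetical helpers, and the online list is assumed to come from the Linux sysfs file /sys/devices/system/cpu/online, which uses the kernel's list format (e.g. 0-15,32-47).

```python
def parse_cpu_list(text: str) -> set[int]:
    """Parse a kernel-style CPU list such as '0-3,8' into a set of ids."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty list
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def offline_cpus(cpuset: str, online: str) -> list[int]:
    """Return the ids in `cpuset` that are missing from the `online` list."""
    return sorted(parse_cpu_list(cpuset) - parse_cpu_list(online))


if __name__ == "__main__":
    # On Linux the second argument would be read from
    # /sys/devices/system/cpu/online; a literal is used here.
    print(offline_cpus("0,16", "0-15"))  # prints [16]
```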

@StephanDollberg StephanDollberg merged commit 198ad25 into dev Mar 17, 2026
13 checks passed
@StephanDollberg StephanDollberg deleted the stephan/cpuset-skip branch March 17, 2026 16:24