Set command to spawn with lower priority in GenericScript Run method #438

Draft
neddp wants to merge 1 commit into main from 337-nice-child-scripts
neddp (Member) commented Jun 1, 2026

Lifecycle scripts (drain, pre-start, post-start, post-deploy, etc.) are executed by the agent and inherit its scheduling priority. When a script is CPU-intensive (e.g. cloning large amounts of data from a database cluster), it can starve the agent's own event loop, causing the director to time out with an agent-unreachable error, even though the agent is technically alive and the script is making progress. The deployment then fails and the script has to be rerun from scratch.

The stemcell already runs the agent at nice -15 (see bosh-linux-stemcell-builder@00054bd) to give it priority over BOSH-managed jobs (nice 0). However, lifecycle scripts spawned by the agent inherit that -15 priority, defeating the purpose.

This PR sets SpawnWithLowerPriority = true on the Command struct for all lifecycle scripts, so they run at a lower priority than the agent itself:

  • Linux: child process nice = parent nice + 5 (capped at 19). With the agent at -15, scripts run at nice -10, which is still a higher priority than normal jobs (nice 0) but lower than the agent.
  • Windows: child process priority class is set to BelowNormal.
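
For illustration, the Linux rule above can be sketched in a few lines of Go. This is only a sketch of the arithmetic described in this PR, not the code in bosh-utils: the names `niceIncrement`, `maxNice`, and `loweredNice` are hypothetical and chosen here for readability.

```go
package main

import "fmt"

const (
	niceIncrement = 5  // how much "nicer" the child is than its parent (per this PR description)
	maxNice       = 19 // highest (least favorable) nice value on Linux
)

// loweredNice returns the nice value a child script would get, given the
// nice value of the parent process. Hypothetical helper for illustration.
func loweredNice(parentNice int) int {
	child := parentNice + niceIncrement
	if child > maxNice {
		return maxNice
	}
	return child
}

func main() {
	// Agent runs at nice -15 on the stemcell, so scripts land at -10.
	fmt.Println(loweredNice(-15)) // prints -10
	// A parent already near the cap is clamped to 19.
	fmt.Println(loweredNice(17)) // prints 19
}
```

With the agent at -15 this yields -10, matching the value quoted above; a parent at nice 0 would yield 5.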

The priority logic itself lives entirely in bosh-utils (#142), which adds the SpawnWithLowerPriority field to boshsys.Command with platform-specific priority logic inlined directly (no external dependency, as suggested by @rkoster).

Before we merge

References

coderabbitai Bot commented Jun 1, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


Development

Successfully merging this pull request may close these issues.

Lifecycle hooks can make the agent unresponsive
