OTA-869: implement separate deployments with multi-layer autoscaling #1061

Open
fao89 wants to merge 1 commit into openshift:master from fao89:feature/separate-deployments-autoscaling-resilience

fao89 commented Feb 13, 2026

  • Split Cincinnati into independent graph-builder and policy-engine pods
  • Fix KEDA incident vulnerability by using base metrics instead of recording rules
  • Add HPA fallback autoscaling for resilience when KEDA unavailable
  • Enable 10-15x faster recovery with optimized startup probes (5s vs 300s)
  • Switch from localhost to Kubernetes DNS service communication
  • Add comprehensive incident prevention alerts and monitoring

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Fabricio Aguiar <fabricio.aguiar@gmail.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED

Summary by CodeRabbit

  • Tests
    • Enhanced Cincinnati service readiness verification to ensure core components reach operational status before testing
    • Improved inter-service connectivity validation with comprehensive internal health checks
    • Increased reliability of service health verification during test execution

coderabbitai Bot commented Feb 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f576a44e-d960-4118-88b4-ddf5d9ece77d

📥 Commits

Reviewing files that changed from the base of the PR and between 6fb76df and 3b39243.

⛔ Files ignored due to path filters (2)
  • dist/openshift/cincinnati-deployment.yaml is excluded by !**/dist/**
  • dist/openshift/readme.md is excluded by !**/dist/**
📒 Files selected for processing (1)
  • Justfile
🚧 Files skipped from review as they are similar to previous changes (1)
  • Justfile

Walkthrough

The Justfile test flow now waits for two pods, cincinnati-graph-builder and cincinnati-policy-engine, to become Ready and captures their pod names. It then runs internal HTTP checks from the policy-engine pod (curl localhost:8081 and curl cincinnati-graph-builder:8080) before continuing with the existing external route test.
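The flow described in this walkthrough could look roughly like the shell sketch below. It is not the PR's actual Justfile recipe: the function names, the 300s timeout, the use of `kubectl` (the recipe may use `oc`), and the reliance on the current namespace are all assumptions made for illustration. Only the label values, ports, and curl targets come from the walkthrough.

```shell
#!/usr/bin/env bash
# Sketch only. Function names, the timeout, and the namespace handling are
# hypothetical; labels, ports, and URLs are taken from the walkthrough.

# Wait for a pod with the given app label to become Ready, then print its name.
wait_and_get_pod() {
  local app="$1"
  # Send the wait status message to stderr so stdout carries only the pod name.
  kubectl wait --for=condition=Ready pod -l "app=${app}" --timeout=300s >&2 || return 1
  kubectl get pod -l "app=${app}" -o jsonpath='{.items[0].metadata.name}'
}

# Run curl inside a pod; --fail makes curl exit non-zero on HTTP error codes.
curl_from_pod() {
  local pod="$1" url="$2"
  kubectl exec "${pod}" -- curl --silent --fail "${url}"
}

# Readiness for both pods, then the two internal checks from policy-engine.
run_internal_checks() {
  local gb_pod pe_pod
  gb_pod="$(wait_and_get_pod cincinnati-graph-builder)" || return 1
  pe_pod="$(wait_and_get_pod cincinnati-policy-engine)" || return 1
  echo "graph-builder pod: ${gb_pod}; policy-engine pod: ${pe_pod}" >&2

  # 1. The policy-engine endpoint answers on its own port.
  curl_from_pod "${pe_pod}" 'localhost:8081?channel=a' || return 1
  # 2. The graph-builder Service resolves through cluster DNS and is reachable.
  curl_from_pod "${pe_pod}" 'http://cincinnati-graph-builder:8080' || return 1
}
```

Calling `run_internal_checks` before the external route test would make the run fail at the first unmet condition, which helps separate an internal DNS or readiness problem from a route problem.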

Changes

Multi-service readiness & connectivity

| Layer / File(s) | Summary |
|---|---|
| Wait for individual service pods & derive names (Justfile) | Replace single app=cincinnati wait with separate waits for app=cincinnati-graph-builder and app=cincinnati-policy-engine, and store each pod name for later exec use. |
| Internal HTTP checks from policy-engine pod (Justfile) | Exec into the policy-engine pod and run curl localhost:8081?channel=a to validate the policy-engine HTTP endpoint. |
| Inter-service DNS/connectivity check (Justfile) | From the policy-engine pod, curl http://cincinnati-graph-builder:8080 to verify Kubernetes DNS and service reachability between graph-builder and policy-engine. |
| Preserve external route test (Justfile) | Keep the existing external route/access verification after internal checks. |

sequenceDiagram
  participant Tester as rgba(0,128,255,0.5) Tester
  participant KubeAPI as rgba(0,200,0,0.5) Kube API
  participant PolicyPod as rgba(255,165,0,0.5) policy-engine Pod
  participant GraphSvc as rgba(128,0,128,0.5) cincinnati-graph-builder Service
  participant External as rgba(200,0,0,0.5) External Route

  Tester->>KubeAPI: wait for policy-engine Ready
  Tester->>KubeAPI: wait for graph-builder Ready
  KubeAPI-->>Tester: pod names
  Tester->>PolicyPod: kubectl exec curl localhost:8081?channel=a
  PolicyPod-->>Tester: HTTP 200/response
  Tester->>PolicyPod: kubectl exec curl cincinnati-graph-builder:8080
  PolicyPod->>GraphSvc: HTTP request
  GraphSvc-->>PolicyPod: HTTP 200/response
  Tester->>External: perform external route test
  External-->>Tester: external response

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title references OTA-869 and describes implementing separate deployments with multi-layer autoscaling. The raw summary confirms the Justfile changes involve updating test logic for separate cincinnati-graph-builder and cincinnati-policy-engine pods, aligning with the deployment separation objective. However, the title focuses on the high-level feature (separate deployments, autoscaling) rather than the specific test changes in this changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | Repository uses Rust with test_case macros, not Ginkgo. All test names are static and deterministic with no dynamic information. Check not applicable. |
| Test Structure And Quality | ✅ Passed | Custom check requests Ginkgo test code review. PR is for a Rust project with no Go/Ginkgo tests. Only shell script Justfile modified. Check not applicable. |
| Microshift Test Compatibility | ✅ Passed | This PR does not add any Ginkgo e2e tests. The repository is a Rust project using standard Rust tests. The custom check is not applicable to this PR. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added. Cincinnati is a Rust project with no Go tests. The Justfile changes add a shell script test, not Ginkgo tests. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | Pod anti-affinity uses PREFERRED (not required), replicas default to 1 (works on SNO), and PDBs use maxUnavailable=1. No control-plane node selectors or problematic constraints. Topology-compatible. |
| Ote Binary Stdout Contract | ✅ Passed | OTE check not applicable. Cincinnati is Rust, not Go. No OTE integration present. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No Ginkgo tests added. PR modifies Justfile recipe and Rust e2e tests only. Custom check targets Ginkgo tests specifically. Not applicable. |

openshift-ci Bot commented Feb 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fao89

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot added the approved label on Feb 13, 2026
fao89 force-pushed the feature/separate-deployments-autoscaling-resilience branch from b811fac to 6fb76df on February 13, 2026 18:13
fao89 changed the title from "feat: implement separate deployments with multi-layer autoscaling" to "OTA-1863: implement separate deployments with multi-layer autoscaling" on Feb 18, 2026
openshift-ci-robot added the jira/valid-reference label on Feb 18, 2026

openshift-ci-robot commented Feb 18, 2026

@fao89: This pull request references OTA-1863 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

fao89 changed the title from "OTA-1863: implement separate deployments with multi-layer autoscaling" to "OTA-869: implement separate deployments with multi-layer autoscaling" on Mar 4, 2026

openshift-ci-robot commented Mar 4, 2026

@fao89: This pull request references OTA-869 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the spike to target the "4.22.0" version, but no target version was set.

fao89 force-pushed the feature/separate-deployments-autoscaling-resilience branch from 6fb76df to 3b39243 on May 14, 2026 15:51

openshift-ci-robot commented May 14, 2026

@fao89: This pull request references OTA-869 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the spike to target the "5.0.0" version, but no target version was set.

fao89 commented May 14, 2026

/test cargo-test

openshift-ci Bot commented May 14, 2026

@fao89: all tests passed!


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
