
refactor: extract dereference/validate pipeline from reconcile_hive #707

Open
adwk67 wants to merge 17 commits into main from feature/validated-config-types-2

@adwk67
Member

@adwk67 adwk67 commented May 12, 2026

Summary

  • Derive Clone, Debug, Eq, Hash, Ord, PartialEq, PartialOrd on HiveRole so it can be used as a BTreeMap key and in validated structs
  • Extract external resource resolution (product image, S3 connection, metadata database, OPA config) into controller::dereference module with its own Snafu error enum
  • Extract product-config validation and config merging into validate_cluster(), producing a ValidatedHiveCluster struct that proves all validation succeeded before any Kubernetes resources are created
  • ValidatedHiveCluster owns the resolved product image and per-role/per-rolegroup merged configs; existing build functions are unchanged and receive parameters from the validated structs
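The shape described in this summary can be sketched with the standard library alone. This is a simplified illustration, not the code in this PR: `RawCluster`, the `replicas` field, and the `String` error type are hypothetical stand-ins, and the real `HiveRole` and `ValidatedHiveCluster` carry different fields. It shows why the extra derives matter (`Ord` is what lets `HiveRole` be a `BTreeMap` key) and how a validated struct acts as proof that validation ran.

```rust
use std::collections::BTreeMap;

// Stand-in for the operator's role enum; the real HiveRole may have other variants.
// Ord (plus Eq/PartialOrd/PartialEq) is required for use as a BTreeMap key.
#[derive(Clone, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub enum HiveRole {
    MetaStore,
}

// Hypothetical raw input: values are still unparsed strings.
pub struct RawCluster {
    pub image: String,
    pub replicas: BTreeMap<HiveRole, String>,
}

// Simplified stand-in for ValidatedHiveCluster: holding one means validation passed.
#[derive(Debug)]
pub struct ValidatedHiveCluster {
    pub image: String,
    pub replicas: BTreeMap<HiveRole, u16>,
}

// Fails fast on the first invalid value, before anything else is built.
pub fn validate_cluster(raw: &RawCluster) -> Result<ValidatedHiveCluster, String> {
    if raw.image.is_empty() {
        return Err("product image must not be empty".to_string());
    }
    let mut replicas = BTreeMap::new();
    for (role, value) in &raw.replicas {
        let parsed: u16 = value
            .parse()
            .map_err(|e| format!("invalid replicas for {role:?}: {e}"))?;
        replicas.insert(role.clone(), parsed);
    }
    Ok(ValidatedHiveCluster {
        image: raw.image.clone(),
        replicas,
    })
}
```

Build functions would then take fields of the validated struct as parameters, so they never see unvalidated input.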

Reviewer notes

  • dereference() and validate_cluster() contain no new logic — they are pure extractions of code that was previously inline in reconcile_hive()
  • The ValidatedHiveCluster struct intentionally has fewer fields than a full ownership model would (no name/namespace/uid/metadata validated types). This is a "construct but decompose" fail-fast gate: built early in reconcile to prove validation passes, then its fields feed the existing unchanged build functions
  • The controller/ directory coexists with controller.rs — Rust treats controller.rs as the module root and looks for submodules in controller/. Currently just dereference, with validate and further stages to follow in later PRs
  • The Dereference wrapper variant in the controller's Error enum replaces 3 individual error variants (ResolveProductImage, InvalidOpaConfig, InvalidMetadataDatabaseConnection) that moved into the new module's own error enum. ConfigureS3Connection and ObjectHasNoNamespace remain in the controller because they are still used by build functions
  • Mirrors the pattern established in the airflow-operator (airflow-operator#795, "refactor: extract dereference/validate pipeline from reconcile_airflow")
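The wrapper-variant arrangement from the notes above might look roughly like the following. This is a minimal sketch under stated assumptions: the PR uses Snafu, whereas this version hand-writes `Display` and `From` so it compiles with the standard library only; the function signatures, the error messages, and the `reconcile` helper are hypothetical. Only the variant names come from the PR description.

```rust
use std::fmt;

pub mod dereference {
    use std::fmt;

    // Errors that moved out of the controller into the module's own enum.
    #[derive(Debug)]
    pub enum Error {
        InvalidOpaConfig,
        InvalidMetadataDatabaseConnection,
    }

    impl fmt::Display for Error {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                Error::InvalidOpaConfig => write!(f, "invalid OPA configuration"),
                Error::InvalidMetadataDatabaseConnection => {
                    write!(f, "invalid metadata database connection")
                }
            }
        }
    }

    // Hypothetical stand-in for the real lookup, which would talk to the API server.
    pub fn dereference(opa_config_map: Option<&str>) -> Result<String, Error> {
        opa_config_map
            .map(str::to_owned)
            .ok_or(Error::InvalidOpaConfig)
    }
}

// Controller-level enum: one wrapper variant instead of several individual ones.
#[derive(Debug)]
pub enum Error {
    Dereference { source: dereference::Error },
    ObjectHasNoNamespace,
}

impl From<dereference::Error> for Error {
    fn from(source: dereference::Error) -> Self {
        Error::Dereference { source }
    }
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Dereference { source } => {
                write!(f, "failed to dereference resources: {source}")
            }
            Error::ObjectHasNoNamespace => write!(f, "object has no namespace"),
        }
    }
}

// Hypothetical reconcile step: `?` converts the module error into the wrapper variant.
pub fn reconcile(opa: Option<&str>, namespace: Option<&str>) -> Result<String, Error> {
    let opa = dereference::dereference(opa)?;
    let ns = namespace.ok_or(Error::ObjectHasNoNamespace)?;
    Ok(format!("{ns}/{opa}"))
}
```

With Snafu, the `From` and `Display` boilerplate would typically be generated by the derive and a `.context(...)` call instead.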

Test plan

  • Existing unit tests pass
  • Clean compile with no warnings
  • No behavioural changes — pure refactoring

🤖 Generated with Claude Code

Move external resource resolution (product image, S3 connection, metadata
database, OPA config) into controller::dereference module with its own
error enum. Extract config validation and merging into validate_cluster(),
which produces a ValidatedHiveCluster proving all product-config validation
succeeded before any Kubernetes resources are created.

The validated struct owns the resolved product image and per-role/
per-rolegroup merged configs. Existing build functions are unchanged
and receive their parameters from the validated structs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@adwk67 adwk67 self-assigned this May 12, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@adwk67
Member Author

adwk67 commented May 13, 2026

Rename FailedToResolveResourceConfig to FailedToResolveConfig and
fix OPA error display string to match the convention used across
all three dereference/validate extraction PRs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@adwk67 adwk67 marked this pull request as ready for review May 13, 2026 10:12
@siegfriedweber siegfriedweber self-requested a review May 13, 2026 14:09
@siegfriedweber siegfriedweber moved this to Development: In Review in Stackable Engineering May 13, 2026
adwk67 and others added 3 commits May 13, 2026 16:58
Image resolution is a pure computation, not an I/O dereference, so it
belongs in validate_cluster alongside the other config validation. This
aligns with the pattern used by the trino and airflow operators.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
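
To illustrate the distinction drawn in the commit message above: a pure image resolution needs only values already present in the spec, so it can sit with validation rather than with I/O. The sketch below is an assumption-laden illustration, not the stackable-operator API. `ImageInputs`, `resolve_image`, and the tag layout are hypothetical, with the layout inferred only from the image string that appears in the smoke-test diff in this thread.

```rust
// Hypothetical, simplified inputs; the real operator reads these from the CRD.
pub struct ImageInputs<'a> {
    pub registry: &'a str,
    pub product: &'a str,
    pub product_version: &'a str,
    pub stackable_version: &'a str,
}

// Pure function: no network or API-server access, same inputs give the same output.
pub fn resolve_image(inputs: &ImageInputs<'_>) -> Result<String, String> {
    if inputs.product_version.is_empty() {
        return Err("product version must not be empty".to_string());
    }
    // Assumed tag layout: <registry>/<product>:<product version>-stackable<stackable version>
    Ok(format!(
        "{}/{}:{}-stackable{}",
        inputs.registry, inputs.product, inputs.product_version, inputs.stackable_version
    ))
}
```

Because nothing here can fail for environmental reasons, its errors are validation errors, which is the argument for placing it in validate_cluster.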
Comment thread rust/operator-binary/src/controller/dereference.rs Outdated
Comment thread rust/operator-binary/src/controller/validate.rs Outdated
Comment thread rust/operator-binary/src/controller/validate.rs Outdated
Comment thread rust/operator-binary/src/controller/validate.rs
Comment thread rust/operator-binary/src/controller/validate.rs Outdated
Comment thread rust/operator-binary/src/controller/validate.rs Outdated
Comment thread rust/operator-binary/src/controller/dereference.rs
Comment thread rust/operator-binary/src/controller.rs
Comment thread rust/operator-binary/src/controller.rs Outdated
Comment on lines +53 to +65
/// Per-role configuration extracted during validation.
#[derive(Clone, Debug)]
pub struct ValidatedRoleConfig {
    pub pdb: stackable_operator::commons::pdb::PdbConfig,
    pub listener_class: String,
}

/// Per-rolegroup configuration: the merged CRD config plus the product-config properties.
#[derive(Clone, Debug)]
pub struct ValidatedRoleGroupConfig {
    pub merged_config: MetaStoreConfig,
    pub product_config_properties: HashMap<PropertyNameKind, BTreeMap<String, String>>,
}
Member

These validated structs could be moved to controller.rs next to ValidatedCluster. They belong together.

Comment thread rust/operator-binary/src/controller/validate.rs Outdated
metadata:
  annotations:
    listeners.stackable.tech/listener-name: hive-metastore
  creationTimestamp: null
Member

The smoke tests fail:

    case.go:401: failed in step 60-install-hive
    case.go:403:
        --- StatefulSet:test/hive-metastore-default
        +++ StatefulSet:test/hive-metastore-default
        @@ -7,8 +7,10 @@
             app.kubernetes.io/managed-by: hive.stackable.tech_hivecluster
             app.kubernetes.io/name: hive
             app.kubernetes.io/role-group: default
        +    app.kubernetes.io/version: 4.2.0-stackable0.0.0-dev
             restarter.stackable.tech/enabled: "true"
             stackable.tech/vendor: Stackable
        +  managedFields: '[... elided field over 10 lines long ...]'
           name: hive-metastore-default
           namespace: test
           ownerReferences:
        @@ -16,9 +18,14 @@
             controller: true
             kind: HiveCluster
             name: hive
        +    uid: c7430cef-8300-4798-a3b0-8a18a5d46c14
         spec:
        +  persistentVolumeClaimRetentionPolicy:
        +    whenDeleted: Retain
        +    whenScaled: Retain
           podManagementPolicy: Parallel
           replicas: 1
        +  revisionHistoryLimit: 10
           selector:
             matchLabels:
               app.kubernetes.io/component: metastore
        @@ -28,12 +35,16 @@
           serviceName: hive-metastore-default-headless
           template:
             metadata:
        +      annotations:
        +        configmap.restarter.stackable.tech/hive-metastore-default: 77b8df79-2b13-41a9-942d-e49ea5232832/5052
        +        secret.restarter.stackable.tech/hive-credentials: a1f650d3-3145-4617-88da-896d84f46a88/5044
               labels:
                 app.kubernetes.io/component: metastore
                 app.kubernetes.io/instance: hive
                 app.kubernetes.io/managed-by: hive.stackable.tech_hivecluster
                 app.kubernetes.io/name: hive
                 app.kubernetes.io/role-group: default
        +        app.kubernetes.io/version: 4.2.0-stackable0.0.0-dev
                 stackable.tech/vendor: Stackable
             spec:
               affinity:
        @@ -130,6 +141,7 @@
                   value: group-value
                 - name: ROLE_VAR
                   value: role-value
        +        image: oci.stackable.tech/sdp/hive:4.2.0-stackable0.0.0-dev
                 imagePullPolicy: IfNotPresent
                 livenessProbe:
                   failureThreshold: 3
        @@ -162,6 +174,8 @@
                   requests:
                     cpu: 250m
                     memory: 768Mi
        +        terminationMessagePath: /dev/termination-log
        +        terminationMessagePolicy: File
                 volumeMounts:
                 - mountPath: /stackable/secrets/test-hive-s3-secret-class
                   name: test-hive-s3-secret-class-s3-credentials
        @@ -181,6 +195,8 @@
               enableServiceLinks: false
               restartPolicy: Always
               schedulerName: default-scheduler
        +      securityContext:
        +        fsGroup: 1000
               serviceAccount: hive-serviceaccount
               serviceAccountName: hive-serviceaccount
               terminationGracePeriodSeconds: 300
        @@ -204,6 +220,7 @@
                   volumeClaimTemplate:
                     metadata:
                       annotations:
        +                secrets.stackable.tech/class: opa-tls-test
                         secrets.stackable.tech/provision-parts: public
                     spec:
                       accessModes:
        @@ -228,13 +245,16 @@
                   defaultMode: 420
                   name: hive-metastore-default
                 name: log-config-mount
        +  updateStrategy:
        +    rollingUpdate:
        +      partition: 0
        +    type: RollingUpdate
           volumeClaimTemplates:
           - apiVersion: v1
             kind: PersistentVolumeClaim
             metadata:
               annotations:
                 listeners.stackable.tech/listener-name: hive-metastore
        -      creationTimestamp: null
               labels:
                 app.kubernetes.io/component: metastore
                 app.kubernetes.io/instance: hive
        @@ -252,7 +272,16 @@
                   storage: "1"
               storageClassName: listeners.stackable.tech
               volumeMode: Filesystem
        +    status:
        +      phase: Pending
         status:
        +  availableReplicas: 1
        +  collisionCount: 0
        +  currentReplicas: 1
        +  currentRevision: hive-metastore-default-57c74d49fd
        +  observedGeneration: 1
           readyReplicas: 1
           replicas: 1
        -
        +  updateRevision: hive-metastore-default-57c74d49fd
        +  updatedReplicas: 1
        +

    case.go:403: resource StatefulSet:test/hive-metastore-default: .spec.volumeClaimTemplates.metadata.creationTimestamp: key is missing from map
    logger.go:42: 14:10:59 | smoke_postgres-12.5.6_hive-4.2.0_opa-latest-1.12.3_openshift-false_s3-use-tls-false_opa-use-tls-true | skipping kubernetes event logging
=== NAME  kuttl
    harness.go:407: run tests finished
    harness.go:515: cleaning up
    harness.go:572: removing temp folder: ""
--- FAIL: kuttl (972.33s)
    --- FAIL: kuttl/harness (0.00s)
        --- FAIL: kuttl/harness/smoke_postgres-12.5.6_hive-4.2.0_opa-latest-1.12.3_openshift-false_s3-use-tls-false_opa-use-tls-true (972.32s)

Member

The creationTimestamp is not set in my k3s cluster.

Client Version: v1.34.3
Kustomize Version: v5.7.1
Server Version: v1.34.5+k3s1

Can we remove it?
