37 changes: 37 additions & 0 deletions docs/reference/cluster_manifest.md
@@ -486,6 +486,25 @@ Note that `s3_wal_path` and `gs_wal_path` are mutually exclusive.
from a remote primary. See the Patroni documentation
[here](https://patroni.readthedocs.io/en/latest/standby_cluster.html) for more details. Optional.

## Lifecycle configuration

Parameters grouped under the `lifecycle` top-level key that control cluster
hibernate/wake-up behavior.

* **phase**
  Set to `"stopped"` to hibernate the cluster. When this field is set on a
  running cluster, the operator will:
  * Store the current number of instances in the status
  * Scale down the StatefulSet to 0 replicas
  * Scale down the connection pooler to 0 replicas
  * Set the cluster status to "Stopping", then "Stopped"

  When this field is removed from a stopped cluster (or set to an empty
  value), the operator will:
  * Restore the number of instances from the stored value
  * Scale up the StatefulSet and connection pooler
  * Set the cluster status to "Updating", then "Running"

  This field is optional. When not set, the cluster operates normally.
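
As a rough illustration of the rules above, the choice between hibernating,
waking up and doing nothing can be sketched as a pure function of the requested
phase and the current status. The names `lifecycleAction` and `decideAction`
are hypothetical and are not part of the operator. Only the `"stopped"` phase
value and the status names come from this document, and the treatment of
statuses other than "Running" and "Stopped" is an assumption.

```go
package main

import "fmt"

// lifecycleAction is a hypothetical label for what a reconcile pass should do.
type lifecycleAction string

const (
	actionNone      lifecycleAction = "none"
	actionHibernate lifecycleAction = "hibernate"
	actionWakeUp    lifecycleAction = "wake-up"
)

// decideAction sketches the documented rule: phase "stopped" on a cluster that
// is not already stopping or stopped triggers hibernation, and an unset phase
// on a stopped cluster triggers wake-up. Everything else is left alone
// (assumption: no action is taken while the cluster is still Stopping).
func decideAction(phase, status string) lifecycleAction {
	hibernated := status == "Stopping" || status == "Stopped"
	switch {
	case phase == "stopped" && !hibernated:
		return actionHibernate
	case phase == "" && status == "Stopped":
		return actionWakeUp
	default:
		return actionNone
	}
}

func main() {
	fmt.Println(decideAction("stopped", "Running"))
	fmt.Println(decideAction("", "Stopped"))
	fmt.Println(decideAction("", "Running"))
}
```

Keeping the decision separate from the scaling calls makes the rule easy to
test without a Kubernetes API server.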

## Volume properties

Those parameters are grouped under the `volume` top-level key and define the
@@ -714,3 +733,21 @@ can have the following properties:

* **memory**
memory requests to be set as an annotation on the stream resource. Optional.

## Status fields

The operator reports the cluster state through the `status` sub-resource. These
fields are managed by the operator and should not be set manually.

* **PostgresClusterStatus**
  Current state of the cluster. One of: Creating, Updating, Running,
  UpdateFailed, SyncFailed, CreateFailed, Invalid, Stopping, Stopped.

* **previousNumberOfInstances**
  The number of instances the cluster had before hibernation. Used to restore
  the cluster to its previous size when waking up. Cleared after wake-up.

* **previousPoolerInstances**
  A map of connection pooler role to its replica count before hibernation.
  The keys are "master" and "replica". Used to restore the pooler when waking
  up. Cleared after wake-up.
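
For readers who consume the status from their own tooling, the following
minimal sketch decodes these fields with `encoding/json`. The JSON keys match
the field names listed above. The `clusterStatus` type, the `parseStatus`
helper and the sample payload are illustrative assumptions, not operator code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clusterStatus mirrors the documented status fields for illustration only.
type clusterStatus struct {
	PostgresClusterStatus     string           `json:"PostgresClusterStatus"`
	PreviousNumberOfInstances int32            `json:"previousNumberOfInstances,omitempty"`
	PreviousPoolerInstances   map[string]int32 `json:"previousPoolerInstances,omitempty"`
}

// parseStatus decodes a JSON status document into clusterStatus.
func parseStatus(raw []byte) (clusterStatus, error) {
	var s clusterStatus
	err := json.Unmarshal(raw, &s)
	return s, err
}

func main() {
	// Hypothetical status payload of a hibernated cluster.
	raw := []byte(`{
		"PostgresClusterStatus": "Stopped",
		"previousNumberOfInstances": 3,
		"previousPoolerInstances": {"master": 2, "replica": 1}
	}`)
	s, err := parseStatus(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(s.PostgresClusterStatus, s.PreviousNumberOfInstances, s.PreviousPoolerInstances)
}
```

Because the two `previous*` fields are optional, a cluster that has never been
hibernated decodes to a zero count and a nil map.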
89 changes: 89 additions & 0 deletions docs/user.md
@@ -930,6 +930,95 @@ When you apply this manifest, the operator will:

The process is asynchronous. You can monitor the operator logs and the state of the `postgresql` resource to follow the progress. Once the new cluster is up and running, your applications can reconnect.

## Hibernate and Wake Up a Cluster

The operator supports hibernating a PostgreSQL cluster to save resources when it's
not needed, and waking it up again when required. This feature:

* Scales down the PostgreSQL StatefulSet to 0 replicas (stops all pods)
* Scales down the connection pooler to 0 replicas
* Preserves the cluster configuration and data (PVCs are retained)
* Stores the previous replica counts for automatic restoration

### Hibernating a Cluster

To hibernate a running cluster, set the `lifecycle.phase` field to `"stopped"`:

```yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-test-cluster
spec:
  teamId: "test-team"
  # ... other cluster parameters
  numberOfInstances: 3
  lifecycle:
    phase: "stopped"
```

When you apply this manifest, the operator will:

* Store the current `numberOfInstances` in `status.previousNumberOfInstances`
* Store the connection pooler replica counts in `status.previousPoolerInstances`
* Set `spec.numberOfInstances` to 0
* Scale down the StatefulSet to 0 replicas
* Scale down the connection pooler deployments to 0 replicas
* Set `status.PostgresClusterStatus` to "Stopping", then "Stopped"
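
The instance-count part of these steps can be sketched as follows. This is a
simplified model under stated assumptions, not the operator's implementation:
the `cluster` type and the `hibernate` and `markStopped` helpers are
hypothetical, the connection pooler is left out for brevity, and the idea that
"Stopped" is reached once no pods remain is inferred from the status table in
this section.

```go
package main

import "fmt"

// cluster is a hypothetical, simplified stand-in for the postgresql resource.
type cluster struct {
	NumberOfInstances         int32  // spec.numberOfInstances
	Status                    string // status.PostgresClusterStatus
	PreviousNumberOfInstances int32  // status.previousNumberOfInstances
}

// hibernate remembers the current size, scales to zero and marks the
// cluster as Stopping.
func hibernate(c *cluster) {
	c.PreviousNumberOfInstances = c.NumberOfInstances
	c.NumberOfInstances = 0
	c.Status = "Stopping"
}

// markStopped completes the transition once no pods are left
// (assumed trigger; the real condition may differ).
func markStopped(c *cluster, runningPods int) {
	if c.Status == "Stopping" && runningPods == 0 {
		c.Status = "Stopped"
	}
}

func main() {
	c := &cluster{NumberOfInstances: 3, Status: "Running"}
	hibernate(c)
	markStopped(c, 0)
	fmt.Println(c.Status, c.NumberOfInstances, c.PreviousNumberOfInstances)
}
```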

### Waking up a Cluster

To wake up a hibernated cluster, remove the `lifecycle.phase` field or set it to
an empty value:

```yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-test-cluster
spec:
  teamId: "test-team"
  # ... other cluster parameters
  # lifecycle.phase is not set or is removed
```

When you apply this manifest, the operator will:

* Restore `numberOfInstances` from `status.previousNumberOfInstances`
* Restore the connection pooler replica counts from `status.previousPoolerInstances`
* Scale up the StatefulSet to the previous replica count
* Scale up the connection pooler deployments to the previous replica counts
* Set `status.PostgresClusterStatus` to "Updating", then "Running"
* Clear `status.previousNumberOfInstances` and `status.previousPoolerInstances`
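
A minimal sketch of the wake-up sequence, assuming the stored values are
cleared only after the cluster reports "Running", as the order of the steps
above suggests. The `hibernatedCluster` type and the `wakeUp` and `markRunning`
helpers are hypothetical names used for illustration.

```go
package main

import "fmt"

// hibernatedCluster is a hypothetical, simplified view of a stopped cluster.
type hibernatedCluster struct {
	NumberOfInstances         int32            // spec.numberOfInstances
	Status                    string           // status.PostgresClusterStatus
	PreviousNumberOfInstances int32            // status.previousNumberOfInstances
	PreviousPoolerInstances   map[string]int32 // status.previousPoolerInstances
}

// wakeUp restores the size saved at hibernation and marks the cluster
// as Updating. It does nothing unless the cluster is Stopped.
func wakeUp(c *hibernatedCluster) {
	if c.Status != "Stopped" {
		return
	}
	c.NumberOfInstances = c.PreviousNumberOfInstances
	c.Status = "Updating"
}

// markRunning finishes the wake-up: the status becomes Running and the
// stored values are cleared.
func markRunning(c *hibernatedCluster) {
	if c.Status != "Updating" {
		return
	}
	c.Status = "Running"
	c.PreviousNumberOfInstances = 0
	c.PreviousPoolerInstances = nil
}

func main() {
	c := &hibernatedCluster{
		Status:                    "Stopped",
		PreviousNumberOfInstances: 3,
		PreviousPoolerInstances:   map[string]int32{"master": 2},
	}
	wakeUp(c)
	markRunning(c)
	fmt.Println(c.Status, c.NumberOfInstances, c.PreviousNumberOfInstances)
}
```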

### Cluster Status During Lifecycle Transitions

| Status | Meaning |
|--------|---------|
| Running | Cluster is running normally |
| Stopping | Cluster is transitioning to stopped state (pods terminating) |
| Stopped | All pods have been terminated, cluster is hibernated |
| Updating | Cluster is waking up and scaling back to its previous size |

### Restrictions During Hibernation

* **While Stopping**: All spec changes are blocked. You must wait for the cluster
  to reach the Stopped state before making changes.

* **While Stopped**: Spec changes are blocked unless you remove `lifecycle.phase`
  to wake up the cluster. This prevents accidental modifications to a hibernated
  cluster.
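
The two restrictions can be expressed as a small check, sketched here under
the assumption that it runs only for updates that modify the spec. The
`checkSpecChange` function and the `errSpecChangeBlocked` error are
hypothetical. How the operator actually rejects such updates is not described
in this document.

```go
package main

import (
	"errors"
	"fmt"
)

// errSpecChangeBlocked is a hypothetical error for rejected updates.
var errSpecChangeBlocked = errors.New("spec changes are blocked while the cluster is stopping or stopped")

// checkSpecChange takes the current cluster status and the lifecycle.phase
// value of the updated spec. It rejects every change while Stopping, and
// every change while Stopped unless the phase has been removed.
func checkSpecChange(status, newPhase string) error {
	switch status {
	case "Stopping":
		return errSpecChangeBlocked
	case "Stopped":
		if newPhase != "" {
			return errSpecChangeBlocked
		}
	}
	return nil
}

func main() {
	fmt.Println(checkSpecChange("Stopping", ""))
	fmt.Println(checkSpecChange("Stopped", "stopped"))
	fmt.Println(checkSpecChange("Stopped", ""))
}
```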

### Connection Pooler Behavior

The connection pooler is automatically scaled alongside the cluster:

* When the cluster hibernates, the pooler is scaled to 0 replicas
* When the cluster wakes up, the pooler is restored to its previous replica count
* The previous replica counts are stored in `status.previousPoolerInstances`

Note: If the connection pooler was already at 0 replicas before hibernation, it
will remain at 0 after wake-up.
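
The round trip for the pooler, including the zero-replica case from the note,
might look like the sketch below. The `scaleDownPooler` and `restorePooler`
helpers are hypothetical. The map keys "master" and "replica" follow the
description of `previousPoolerInstances` in the manifest reference.

```go
package main

import "fmt"

// scaleDownPooler sets every pooler role to 0 replicas and returns the
// counts that were in effect before, keyed by role.
func scaleDownPooler(current map[string]int32) map[string]int32 {
	previous := make(map[string]int32, len(current))
	for role, replicas := range current {
		previous[role] = replicas
		current[role] = 0
	}
	return previous
}

// restorePooler copies the remembered counts back onto the pooler roles.
// A role that was at 0 before hibernation is restored to 0.
func restorePooler(current, previous map[string]int32) {
	for role, replicas := range previous {
		current[role] = replicas
	}
}

func main() {
	pooler := map[string]int32{"master": 2, "replica": 0}
	previous := scaleDownPooler(pooler)
	restorePooler(pooler, previous)
	fmt.Println(pooler["master"], pooler["replica"])
}
```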

## Setting up a standby cluster

Standby cluster is a [Patroni feature](https://github.com/zalando/patroni/blob/master/docs/replica_bootstrap.rst#standby-cluster)
15 changes: 15 additions & 0 deletions manifests/postgresql.crd.yaml
@@ -3246,6 +3246,13 @@ spec:
- name
type: object
type: array
lifecycle:
  description: LifecycleSpec describes the lifecycle state of a Postgres
    cluster.
  properties:
    phase:
      type: string
  type: object
logicalBackupRetention:
  type: string
logicalBackupSchedule:
@@ -4197,6 +4204,14 @@ spec:
properties:
  PostgresClusterStatus:
    type: string
  previousNumberOfInstances:
    format: int32
    type: integer
  previousPoolerInstances:
    type: object
    additionalProperties:
      format: int32
      type: integer
required:
- PostgresClusterStatus
type: object
2 changes: 2 additions & 0 deletions pkg/apis/acid.zalan.do/v1/const.go
@@ -9,6 +9,8 @@ const (
	ClusterStatusSyncFailed = "SyncFailed"
	ClusterStatusAddFailed  = "CreateFailed"
	ClusterStatusRunning    = "Running"
	ClusterStatusStopping   = "Stopping"
	ClusterStatusStopped    = "Stopped"
	ClusterStatusInvalid    = "Invalid"
)

15 changes: 15 additions & 0 deletions pkg/apis/acid.zalan.do/v1/postgresql.crd.yaml
@@ -3246,6 +3246,13 @@ spec:
- name
type: object
type: array
lifecycle:
  description: LifecycleSpec describes the lifecycle state of a Postgres
    cluster.
  properties:
    phase:
      type: string
  type: object
logicalBackupRetention:
  type: string
logicalBackupSchedule:
@@ -4197,6 +4204,14 @@ spec:
properties:
  PostgresClusterStatus:
    type: string
  previousNumberOfInstances:
    format: int32
    type: integer
  previousPoolerInstances:
    type: object
    additionalProperties:
      format: int32
      type: integer
required:
- PostgresClusterStatus
type: object
10 changes: 9 additions & 1 deletion pkg/apis/acid.zalan.do/v1/postgresql_type.go
@@ -115,6 +115,7 @@ type PostgresSpec struct {
	TLS               *TLSDescription    `json:"tls,omitempty"`
	AdditionalVolumes []AdditionalVolume `json:"additionalVolumes,omitempty"`
	Streams           []Stream           `json:"streams,omitempty"`
	Lifecycle         *LifecycleSpec     `json:"lifecycle,omitempty"`
	Env               []v1.EnvVar        `json:"env,omitempty"`

// deprecated
@@ -257,6 +258,11 @@ type StandbyDescription struct {
	StandbyPrimarySlotName string `json:"standby_primary_slot_name,omitempty"`
}

// LifecycleSpec describes the lifecycle state of a Postgres cluster.
type LifecycleSpec struct {
	Phase string `json:"phase,omitempty"`
}

// TLSDescription specs TLS properties
type TLSDescription struct {
// +required
@@ -302,7 +308,9 @@ type UserFlags []string

// PostgresStatus contains status of the PostgreSQL cluster (running, creation failed etc.)
type PostgresStatus struct {
	PostgresClusterStatus     string           `json:"PostgresClusterStatus"`
	PreviousNumberOfInstances int32            `json:"previousNumberOfInstances,omitempty"`
	PreviousPoolerInstances   map[string]int32 `json:"previousPoolerInstances,omitempty"`
}

// ConnectionPooler Options for connection pooler
10 changes: 10 additions & 0 deletions pkg/apis/acid.zalan.do/v1/util.go
@@ -101,6 +101,16 @@ func (postgresStatus PostgresStatus) Creating() bool {
	return postgresStatus.PostgresClusterStatus == ClusterStatusCreating
}

// Stopping status of cluster
func (postgresStatus PostgresStatus) Stopping() bool {
	return postgresStatus.PostgresClusterStatus == ClusterStatusStopping
}

// Stopped status of cluster
func (postgresStatus PostgresStatus) Stopped() bool {
	return postgresStatus.PostgresClusterStatus == ClusterStatusStopped
}

func (postgresStatus PostgresStatus) String() string {
	return postgresStatus.PostgresClusterStatus
}
21 changes: 21 additions & 0 deletions pkg/apis/acid.zalan.do/v1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
