57 changes: 44 additions & 13 deletions docs/pause.md
@@ -1,31 +1,62 @@
# Pause/resume and standby mode for a PostgreSQL cluster

## Pause and resume
## Pause and resume the cluster

Sometimes you may need to temporarily shut down (pause) your cluster and restart it later, such as during maintenance.
You can temporarily shut down your PostgreSQL cluster and bring it back later without losing data or configuration. You may want to pause the cluster for maintenance tasks, emergency manual intervention or debugging.

The `deploy/cr.yaml` file contains a special `spec.pause` key for this.
Setting it to `true` gracefully stops the cluster:
When paused, all changes to the cluster's current state are suspended and no statuses other than the "Progressing" condition are updated until you resume the reconciliation.

### How to pause

Set the `spec.pause` option to `true` in your `deploy/cr.yaml` Custom Resource:

```yaml
spec:
  # .......
  pause: true
  # ... rest of your spec
```

To start the cluster after it was paused, revert the `spec.pause`
key to `false`.
Apply the change:

**Troubleshooting tip**
```bash
kubectl apply -f deploy/cr.yaml
```

If you try to pause the cluster while a backup is running, the Operator does not pause it and prints a warning about the running backup. In this case, delete the running backup job and retry.
The Operator will gracefully stop the cluster (primary, replicas, pgBackRest, and related jobs).

## Put in standby mode
### How to resume

You can also put the cluster into a [standby :octicons-link-external-16:](https://www.postgresql.org/docs/current/warm-standby.html) (read-only) mode instead of completely shutting it down. This is done by a special `spec.standby` key. Set it to `true` for read-only state. To resume the normal cluster operation, set it to `false`.
Set `spec.pause` back to `false` in the same Custom Resource and apply:

```yaml
spec:
  # .......
  standby: false
  pause: false
  # ... rest of your spec
```

```bash
kubectl apply -f deploy/cr.yaml
```

The Operator will start the cluster again using the existing data volumes.
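
As an alternative to editing `deploy/cr.yaml`, the same `spec.pause` toggle can be sketched as a JSON merge patch against the live Custom Resource. This is a minimal sketch, not the documented procedure: the `set_pause` helper is hypothetical, and it assumes that the cluster resource is reachable under the `pg` short name, that the cluster is called `cluster1`, and that it lives in the namespace you pass in.

```bash
# Hypothetical helper: toggle spec.pause with a JSON merge patch.
# Assumptions: the Custom Resource is reachable as "pg" and the
# cluster name and namespace match your deployment.
# Usage: set_pause <cluster-name> <true|false> [namespace]
set_pause() {
  local cluster="$1" value="$2" namespace="${3:-default}"

  # Reject anything other than a literal boolean before calling kubectl.
  case "$value" in
    true|false) ;;
    *)
      echo "pause value must be true or false, got: $value" >&2
      return 1
      ;;
  esac

  kubectl patch pg "$cluster" -n "$namespace" --type merge \
    -p "{\"spec\":{\"pause\":$value}}"
}
```

For example, `set_pause cluster1 true my-namespace` pauses the cluster and `set_pause cluster1 false my-namespace` resumes it. A patch changes only the live object, so keep `deploy/cr.yaml` in sync; otherwise the next `kubectl apply -f deploy/cr.yaml` restores whatever value the file contains.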

### Troubleshooting

**The Operator does not pause the cluster**

If a backup job is running, the Operator does not pause the cluster and logs a warning. Delete the running backup job so that you can pause the cluster:

```bash
kubectl delete job -l postgres-operator.crunchydata.com/pgbackrest-backup -n <namespace>
```

Then retry pausing the cluster.
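
Before deleting anything, it may help to see which backup jobs exist, or to let a running backup finish instead of removing it. The sketch below reuses the label selector from the command above; the helper names, the `default` namespace fallback, and the `600s` timeout are assumptions made for illustration.

```bash
# Label key taken from the delete command above.
BACKUP_JOB_LABEL="postgres-operator.crunchydata.com/pgbackrest-backup"

# Hypothetical helper: list backup jobs in a namespace.
# Usage: list_backup_jobs [namespace]
list_backup_jobs() {
  kubectl get jobs -l "$BACKUP_JOB_LABEL" -n "${1:-default}"
}

# Hypothetical helper: block until the matching jobs report the
# Complete condition, or until the timeout expires.
# Usage: wait_for_backup_jobs [namespace] [timeout]
wait_for_backup_jobs() {
  kubectl wait --for=condition=complete job \
    -l "$BACKUP_JOB_LABEL" -n "${1:-default}" --timeout="${2:-600s}"
}
```

Running `list_backup_jobs <namespace>` shows the jobs the selector matches, including finished ones. `wait_for_backup_jobs <namespace> 300s` returns once they have completed; `kubectl wait` reports an error if no job matches the selector or if the timeout expires.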

## Standby mode

Standby PostgreSQL clusters provide a continuously replicated copy of your primary cluster, forming the backbone of high‑availability and disaster‑recovery strategies. They stay in sync through streaming replication, enabling you to quickly promote a standby if the primary becomes unavailable. Standby clusters can also run in separate regions or environments, helping you maintain business continuity during outages.

The standby mode for a cluster is controlled with the `spec.standby.enabled` option plus the `spec.standby.repoName` and/or `spec.standby.host` and `spec.standby.port` options in the Custom Resource. Which options you specify depends on the standby cluster type.

Read more about the supported types of standby clusters and their setup in the [Deploy a standby cluster for Disaster Recovery](standby.md) documentation.
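
To illustrate how the options named above map to the standby types, here is a small sketch that prints a `spec.standby` fragment for each type. The `standby_fragment` helper is hypothetical, the nesting is inferred from the dotted option names, and values such as `repo1`, `primary.example.com`, and `5432` are placeholders, not values from this documentation.

```bash
# Hypothetical helper: print a spec.standby fragment for a standby type.
# Field names come from the option names above; all values are placeholders.
# Usage: standby_fragment repo <repoName>
#        standby_fragment streaming <host> <port>
#        standby_fragment both <repoName> <host> <port>
standby_fragment() {
  local kind="${1:-}" body

  # Pick the type-specific keys first so an unknown type prints nothing.
  case "$kind" in
    repo)
      body="$(printf '    repoName: %s' "$2")" ;;
    streaming)
      body="$(printf '    host: %s\n    port: %s' "$2" "$3")" ;;
    both)
      body="$(printf '    repoName: %s\n    host: %s\n    port: %s' "$2" "$3" "$4")" ;;
    *)
      echo "unknown standby type: $kind" >&2
      return 1 ;;
  esac

  printf 'spec:\n  standby:\n    enabled: true\n%s\n' "$body"
}
```

For example, `standby_fragment repo repo1` prints a fragment with `enabled: true` and `repoName: repo1`, while `standby_fragment both repo1 primary.example.com 5432` adds the host and port as well. Treat the output as a starting point to compare against the linked setup tutorials, not as a complete configuration.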

11 changes: 6 additions & 5 deletions docs/standby.md
@@ -1,11 +1,12 @@
# How to deploy a standby cluster for Disaster Recovery
# Deploy a standby cluster for Disaster Recovery

Disaster recovery is not optional for businesses operating in the digital age. With the ever-increasing reliance on data, system outages or data loss can be catastrophic, causing significant business disruptions and financial losses.

With multi-cloud or multi-regional PostgreSQL deployments, the complexity of managing disaster recovery only increases. This is where the Percona Operators come in, providing a solution to streamline disaster recovery for PostgreSQL clusters running on Kubernetes. With the Percona Operators, businesses can manage multi-cloud or hybrid-cloud PostgreSQL deployments with ease, ensuring that critical data is always available and secure, no matter what happens.

Operators automate routine tasks and remove toil. For standby, the [Percona Operator for PostgreSQL version 2](index.md) provides the following options:
Operators automate routine tasks and remove toil. Percona Operator for PostgreSQL supports the following types of standby clusters:

1. A repo-based standby recovers WAL files from a `pgBackRest` repo stored in external storage. For this setup, you reference the `pgBackRest` repo name and the cloud-based backup configuration that matches the one from the primary site. Refer to the [Standby cluster deployment based on pgBackRest](standby-backup.md) tutorial for the setup steps.
2. A streaming standby receives WAL records by connecting to the primary over the network. The primary site must be reachable over the network, and the standby cluster must authenticate to it securely with TLS. For this reason, both sites must have the same custom TLS certificates. For the setup, you provide the host and port of the primary cluster and the certificates. Learn more about the setup in the [Standby cluster deployment based on streaming replication](standby-streaming.md) tutorial.
3. A streaming standby with an external repository combines the two previous types and is configured with the options from both. In this setup, the standby cluster streams WAL records from the primary. If streaming replication falls behind, the cluster recovers WAL from the backup repo.

1. [pgBackRest repo-based standby](standby-backup.md). The standby cluster will be connected to a pgBackRest cloud repo, so it will receive WAL files from the repo and apply them to the database.
2. [Streaming replication](standby-streaming.md). The standby cluster will use an authenticated network connection to the primary cluster to receive WAL records directly.
3. Combination of (1) and (2). The standby cluster is configured for both repo-based standby and streaming replication. It bootstraps from the pgBackRest repo and continues to receive WAL files as they are pushed to the repo, and can also receive them directly from the primary. This approach ensures the cluster stays up to date with the pgBackRest repo if streaming falls behind.