The Klio Plugin v0.0.15

The Klio plugin for CloudNativePG lets you use the backup and WAL streaming capabilities of Klio for your PostgreSQL clusters managed by CloudNativePG. It adds two sidecar containers to each PostgreSQL instance pod:

  • A klio-plugin container that handles backup creation and management
  • A klio-wal container that streams WAL files to the Klio server in real time

Configuration

The Klio plugin integrates with CloudNativePG through the CNPG-I (CloudNativePG Interface) specification, enabling Klio to manage backups and WAL streaming for your PostgreSQL clusters. To use Klio with a CloudNativePG cluster, you need to:

  1. Create a PluginConfiguration resource that defines how to connect to the Klio server
  2. Reference the plugin in your Cluster resource specification

Prerequisites

Before configuring a cluster to use the Klio plugin, ensure you have:

  • A running Klio Server resource deployed in your namespace
  • Client credentials stored in a Kubernetes Secret of type kubernetes.io/tls (see Client credentials secret)
  • The server's TLS certificate available in a Secret

Creating a PluginConfiguration resource

The PluginConfiguration custom resource defines how the Klio plugin connects to and communicates with the Klio server. This resource contains connection details, authentication credentials, and optional configuration for metrics, profiling, and backup retention policies.

Basic example

Here's a minimal PluginConfiguration example:

apiVersion: klio.enterprisedb.io/v1alpha1
kind: PluginConfiguration
metadata:
  name: klio-plugin-config
  namespace: default
spec:
  serverAddress: klio-server.default
  clientSecretName: client-sample-tls
  serverSecretName: klio-server-tls
  # mode: standard  # Optional: standard (default) or read-only

Client credentials secret

The client credentials must be stored in a Kubernetes Secret of type kubernetes.io/tls, containing the certificate and private key that the plugin presents to the Klio server.

This secret can be generated with cert-manager by following the documentation in the Klio server page, or be provided by the user.
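
For a user-provided secret, the manifest could look like the sketch below. It assumes the name client-sample-tls from the basic example. The tls.crt and tls.key keys are the ones Kubernetes requires for any Secret of type kubernetes.io/tls; the values are placeholders to replace with your own base64-encoded PEM data. Whether Klio expects additional keys is not covered here, so check the Klio server page.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: client-sample-tls   # referenced by spec.clientSecretName
  namespace: default
type: kubernetes.io/tls
data:
  # Placeholders: replace with base64-encoded PEM content
  tls.crt: <base64-encoded client certificate>
  tls.key: <base64-encoded client private key>
```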

Server Address

The serverAddress field specifies where the Klio server can be reached. This can be:

  • A Kubernetes service name: klio-server.default (within the same namespace)
  • A fully qualified domain name: klio-server.default.svc.cluster.local
  • An external address: klio.example.com

The plugin connects on the default ports of the Klio base and WAL servers, 51515 and 52000 respectively.

TLS configuration

The serverSecretName field references a Secret containing the TLS certificate used to secure communication with the Klio server. This is the same certificate configured on the Server resource.

Configuring a Cluster to use the Klio plugin

Once you have created a PluginConfiguration, reference it in your CloudNativePG Cluster resource:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-postgres-cluster
  namespace: default
spec:
  instances: 3

  postgresql:
    pg_hba:
      - local replication all peer map=local # Allow replication connections locally

  plugins:
    - name: klio.enterprisedb.io
      enabled: true # Activate the Klio plugin (default)
      parameters:
        pluginConfigurationRef: klio-plugin-config

  storage:
    size: 10Gi

To be able to stream WAL files, ensure that your PostgreSQL configuration allows local replication connections. You can do this by adding an entry to the pg_hba section, as shown in the example above.

Plugin parameters

The Klio entry in the plugins section of the Cluster specification uses the following fields:

  • name: Must be set to klio.enterprisedb.io to identify the Klio plugin
  • enabled: Set to true to activate the plugin. This is the default value.
  • parameters.pluginConfigurationRef: The name of your PluginConfiguration resource

Note

Although the Klio plugin archives WAL files on the Klio server, it does not use PostgreSQL's archive_command parameter, because WAL files are streamed directly to the Klio server. You must therefore not set isWALArchiver: true on the plugin entry in the Cluster resource.

Important

If you add the Klio plugin to an existing cluster, you must restart the cluster to inject the sidecar containers. Use kubectl cnpg restart or set the kubectl.kubernetes.io/restartedAt annotation on the cluster.
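
As a sketch of the annotation approach, the fragment below shows the annotation on the example cluster from this page. It is meant to be merged into the existing Cluster manifest, not applied on its own, and the timestamp is illustrative: the value is conventionally the time at which the restart is requested.

```yaml
# Fragment of the existing Cluster manifest; only metadata is shown.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-postgres-cluster
  namespace: default
  annotations:
    # Illustrative timestamp; use the current time when requesting a restart
    kubectl.kubernetes.io/restartedAt: "2025-01-01T00:00:00Z"
```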

Warning

The PluginConfiguration resource referenced in the Cluster should exist before creating or updating the cluster. If it doesn't exist, the Klio plugin uses the PreReconcile hook to gate reconciliation, causing the cluster to wait at the start of the reconciliation loop (before any object creation or status changes) until the PluginConfiguration is created. This avoids unrecoverable error states, but the cluster will not progress until the dependency is satisfied. Ensure the PluginConfiguration is created with the correct name.

Applying configuration changes

Most changes to the PluginConfiguration resource are applied automatically. When you update a PluginConfiguration, the Klio operator detects the change and updates the corresponding klio-config Secret. The Klio sidecar containers monitor their configuration file for changes and restart automatically to apply the new configuration. This restart is intentional and expected.

What to expect during configuration updates:

  • The klio-plugin, klio-wal, and klio-restore sidecar containers will restart to pick up the new configuration
  • Container restart counts will increment; this is normal behavior and indicates the configuration was successfully applied
  • Restarts are graceful and brief (typically 5-10 seconds)
  • No data loss occurs: PostgreSQL continues running and WAL streaming resumes automatically after restart
  • You can verify the configuration was applied by checking the ConfigurationApplied condition in the PluginConfiguration status

This automatic propagation applies to all configuration fields stored in the config file, including:

  • Server address
  • Tier 1 and Tier 2 settings (backup, recovery, retention)
  • Operation mode (standard or read-only)
  • Cluster name override
  • WAL prefetch configuration

Note

Changes to container customizations (such as image, resources, or securityContext) and the pprof setting are not applied automatically. These fields affect the pod spec, which is managed by CloudNativePG. To apply these changes, use kubectl cnpg restart to roll the cluster pods.

Advanced configuration options

The PluginConfiguration resource supports several advanced options to customize the plugin's behavior.

Retention policies

Define how long backups should be retained by configuring retention policies for Tier 1 and Tier 2 storage. Retention policies can be configured independently for each tier:

apiVersion: klio.enterprisedb.io/v1alpha1
kind: PluginConfiguration
metadata:
  name: klio-plugin-config
spec:
  serverAddress: klio-server.default
  clientSecretName: klio-client-credentials
  serverSecretName: klio-server-tls
  tier1:
    retention:
      keepLatest: 5
      keepHourly: 12
      keepDaily: 7
      keepWeekly: 4
      keepMonthly: 6
      keepAnnual: 2
  tier2:
    enableBackup: true
    enableRecovery: true
    retention:
      keepLatest: 10
      keepDaily: 30
      keepMonthly: 12
      keepAnnual: 5

Except for keepLatest, each option defines how many backups to retain for the specified time period. For example, keepDaily: 7 means that Klio retains at most one backup for each of the past 7 days. keepLatest instead sets how many of the most recent backups to retain, regardless of their age.

If multiple backups exist within the same time bucket, the most recent one is kept, unless preserved by a different keep rule. Backups that are not retained by any rule are deleted. Rule evaluation is done when a new backup is taken.

The Klio server will automatically delete WAL files that are no longer needed for recovery by any retained backup.

All retention settings are optional. For each unspecified retention level, the default Kopia value is applied:

keepLatest: 10
keepHourly: 48
keepDaily: 7
keepWeekly: 4
keepMonthly: 24
keepAnnual: 1

Set a rule to 0 to disable that retention level.
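
As an illustration, the fragment below combines the three cases described above for Tier 1: an explicit value, a disabled level, and omitted levels that fall back to the defaults listed earlier. The specific numbers are arbitrary and only serve the example.

```yaml
spec:
  tier1:
    retention:
      keepLatest: 5    # explicit value
      keepHourly: 0    # hourly retention level disabled
      keepDaily: 7     # explicit value (same as the default)
      # keepWeekly, keepMonthly and keepAnnual are omitted,
      # so the defaults apply (4, 24 and 1 respectively)
```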

Operation Mode

The mode field controls whether the plugin can perform both backup and restore operations (standard) or only restore operations (read-only).

spec:
  mode: standard  # or read-only

Standard Mode (default)

In standard mode, the plugin can:

  • Create backups to Tier 1 (and optionally Tier 2)
  • Stream WAL files to the server
  • Restore from both Tier 1 and Tier 2

This is the default mode and suitable for most deployments.

Read-Only Mode

Use read-only mode when connecting to a read-only Klio server for recovery operations only. This is useful for disaster recovery scenarios where you want to restore from a read-only server in a secondary region or datacenter.

apiVersion: klio.enterprisedb.io/v1alpha1
kind: PluginConfiguration
metadata:
  name: dr-restore-config
spec:
  serverAddress: dr-server.default
  clientSecretName: dr-client-credentials
  serverSecretName: dr-server-tls
  mode: read-only

  # Read-only mode requires:
  # - tier2 configuration with enableRecovery: true
  # - NO tier1 configuration
  tier2:
    enableRecovery: true
    # enableBackup must be false or omitted

Restrictions in read-only mode:

  • Tier 2 must be configured with enableRecovery: true
  • Tier 2 enableBackup must be false or omitted
  • Tier 1 configuration is not allowed
  • No backup or WAL streaming operations are performed

See the Read-Only Server Mode documentation for details on setting up a read-only Klio server.

Cluster name override

By default, the plugin identifies the cluster on the Klio server by the name of the CloudNativePG Cluster resource. You can override this if needed:

spec:
  clusterName: my-custom-cluster-name

This is useful when working with backups from a different cluster, for example when restoring a cluster or configuring replica clusters.
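
A minimal sketch of the restore case follows, assuming a new cluster needs to read the backups that were taken by my-postgres-cluster. The resource name restore-source-config is hypothetical, and the exact semantics of clusterName during restore should be confirmed against the recovery documentation.

```yaml
apiVersion: klio.enterprisedb.io/v1alpha1
kind: PluginConfiguration
metadata:
  name: restore-source-config   # hypothetical name
spec:
  serverAddress: klio-server.default
  clientSecretName: client-sample-tls
  serverSecretName: klio-server-tls
  # Assumption: the name under which the original cluster stored its backups
  clusterName: my-postgres-cluster
```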

Tier 2 configuration

Tier 2 provides secondary storage (typically object storage like S3) for long-term backup retention and disaster recovery. Configure Tier 2 using the tier2 section:

spec:
  tier2:
    enableBackup: true
    enableRecovery: true
    retention:
      keepDaily: 30
      keepMonthly: 12

Options

  • enableBackup: When set to true, backups and WAL files are automatically synchronized to Tier 2 storage after being stored in Tier 1. This ensures your backups are available in long-term storage.

  • enableRecovery: When set to true, Klio looks for backups and WAL files in both Tier 1 and Tier 2 during restore operations. If a backup is available in both tiers, Tier 1 takes precedence because restoring from it is faster.

  • retention: Configure a separate retention policy for Tier 2. Typically, you would configure longer retention periods for Tier 2 since object storage is more cost-effective for long-term storage.

See the Architecture documentation for more details on Tier 2 storage.

WAL Prefetch Configuration

During recovery operations, Klio can prefetch WAL files ahead of PostgreSQL's requests to speed up recovery. Configure prefetching using the walPrefetch section:

spec:
  walPrefetch:
    count: 8
    maxConcurrentDownloads: 16

Options

  • count: The number of WAL files to prefetch ahead during recovery. Set to 0 to disable prefetching. Default is 2.

  • maxConcurrentDownloads: The maximum number of concurrent WAL downloads. Higher values can improve recovery speed on high-bandwidth connections but use more resources. Default is 4.
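
For comparison with the tuned example above, this fragment turns prefetching off. It assumes that an omitted field falls back to its documented default.

```yaml
spec:
  walPrefetch:
    count: 0   # 0 disables prefetching
    # maxConcurrentDownloads is omitted, so the default of 4 applies
```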

Observability

See the OpenTelemetry observability section for more details on how to monitor the Klio plugin using OpenTelemetry.

Performance profiling

Enable the pprof HTTP endpoint for performance profiling and troubleshooting:

spec:
  pprof: true

When enabled, the pprof endpoint is exposed and can be used with Go's profiling tools to analyze CPU usage, memory allocation, goroutines, and other runtime metrics.

Warning

Only enable pprof in development or testing environments, or when actively troubleshooting performance issues. It should not be enabled in production unless necessary.

Container customization

The PluginConfiguration resource allows you to customize the Klio sidecar containers by providing base container specifications that are used as the foundation for the sidecars. This feature enables you to add custom environment variables, volume mounts, resource limits, and other container settings without modifying the PostgreSQL container environment.

Basic example

apiVersion: klio.enterprisedb.io/v1alpha1
kind: PluginConfiguration
metadata:
  name: klio-plugin-config
spec:
  serverAddress: klio-server.default
  clientSecretName: klio-client-credentials
  serverSecretName: klio-server-tls
  containers:
    - name: klio-plugin
      env:
        - name: CUSTOM_ENV_VAR
          value: "my-value"
        - name: DEBUG_LEVEL
          value: "info"
    - name: klio-wal
      env:
        - name: WAL_BUFFER_SIZE
          value: "8192"

How container merging works

The containers you define serve as the base for the Klio sidecars, with the following merge behavior:

  1. Your container is the base: When you define a container (e.g., klio-plugin), your specification serves as the starting point
  2. Klio enforces required values: Klio sets its essential configuration:
    • Container name (klio-plugin, klio-wal, or klio-restore)
    • Container args (the command arguments needed for operation)
    • CONTAINER_NAME environment variable
  3. Your customizations are preserved: All other fields you define remain intact
  4. Template defaults fill gaps: For fields you don't specify, Klio applies sensible defaults (image, security context, standard volume mounts, etc.)

Important

Klio's required values (name, args, CONTAINER_NAME env var) will always override any conflicting values you set. All other customizations are respected.
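
The fragment below sketches how these rules would play out for a single container. The flag --my-custom-flag and the environment values are hypothetical and exist only to mark which fields are overridden and which are preserved.

```yaml
spec:
  containers:
    - name: klio-wal
      # Overridden: Klio sets the args required for operation
      args: ["--my-custom-flag"]   # hypothetical flag
      env:
        # Overridden: Klio always sets CONTAINER_NAME itself
        - name: CONTAINER_NAME
          value: "something-else"
        # Preserved: not one of Klio's required values
        - name: CUSTOM_ENV_VAR
          value: "my-value"
      # Preserved: resources are not among the enforced fields
      resources:
        requests:
          memory: "128Mi"
```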

Plugin configuration selection in replica clusters

When a CloudNativePG Cluster references multiple PluginConfiguration resources (one for archiving and one or more for external clusters), Klio selects which configuration to apply to the sidecar containers using the following logic:

  1. Archive plugin takes precedence: if the Cluster has a Klio plugin declared in .spec.plugins (the archive plugin), its PluginConfiguration is used for the sidecar containers on every pod.
  2. Fallback to replica source: if no archive plugin is configured and the cluster is a replica (.spec.replica is set), the designated primary pod uses the PluginConfiguration referenced by the external cluster defined as the replica source (.spec.replica.source). Non-primary pods do not receive a Klio sidecar in this case.

For example, consider a replica cluster that does not archive locally but restores WALs from an external source managed by Klio:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-dc-b
spec:
  instances: 3
  # No archive plugin in .spec.plugins

  replica:
    self: cluster-dc-b
    primary: cluster-dc-a
    source: cluster-dc-a

  externalClusters:
    - name: cluster-dc-a
      plugin:
        name: klio.enterprisedb.io
        parameters:
          pluginConfigurationRef: source-plugin-config

In this setup, only the designated primary of cluster-dc-b gets a Klio sidecar, configured from the source-plugin-config PluginConfiguration. The container customizations defined in that PluginConfiguration (image, resources, environment variables, etc.) are applied to the sidecar following the same merging rules described above.
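
For contrast, the sketch below illustrates the first rule: the same replica cluster with an archive plugin added to .spec.plugins. The name local-plugin-config is hypothetical. Under the selection logic described above, that configuration would be the one used for the sidecars on every pod.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-dc-b
spec:
  instances: 3

  plugins:
    - name: klio.enterprisedb.io
      parameters:
        # Hypothetical name; selected for the sidecars on every pod
        pluginConfigurationRef: local-plugin-config

  replica:
    self: cluster-dc-b
    primary: cluster-dc-a
    source: cluster-dc-a

  externalClusters:
    - name: cluster-dc-a
      plugin:
        name: klio.enterprisedb.io
        parameters:
          # Not selected for the sidecars while the archive plugin is present
          pluginConfigurationRef: source-plugin-config
```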

Available sidecar containers

The following containers can be customized:

  • klio-plugin: Handles backup creation and management in PostgreSQL pods
  • klio-wal: Streams WAL files to the Klio server in PostgreSQL pods
  • klio-restore: Restores backups during recovery jobs

Example: Resource limits and environment variables

apiVersion: klio.enterprisedb.io/v1alpha1
kind: PluginConfiguration
metadata:
  name: klio-plugin-config
spec:
  serverAddress: klio-server.default
  clientSecretName: klio-client-credentials
  serverSecretName: klio-server-tls
  containers:
    - name: klio-plugin
      env:
        - name: LOG_LEVEL
          value: "debug"
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: "http://otel-collector:4317"
      resources:
        limits:
          memory: "512Mi"
          cpu: "1"
        requests:
          memory: "256Mi"
          cpu: "500m"
    - name: klio-wal
      env:
        - name: WAL_STREAM_TIMEOUT
          value: "30s"
      resources:
        limits:
          memory: "256Mi"
          cpu: "500m"
        requests:
          memory: "128Mi"
          cpu: "250m"
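
The klio-restore container can be customized in the same way. The fragment below is a sketch with arbitrary resource values and an illustrative environment variable; size the container to your own recovery workloads.

```yaml
spec:
  containers:
    - name: klio-restore
      env:
        - name: CUSTOM_ENV_VAR   # illustrative variable
          value: "my-value"
      resources:
        requests:
          memory: "256Mi"
          cpu: "250m"
        limits:
          memory: "1Gi"
          cpu: "1"
```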