One of the most exciting storage-related features in Kubernetes is volume snapshot and clone. It allows you to take a snapshot of a data volume and later clone it into a new volume, which opens up a variety of possibilities like instant backups or testing upgrades. This feature also brings Kubernetes deployments closer to cloud providers, which let you take volume snapshots with one click.
Word of caution: for the database, it still might be required to apply fsfreeze and FLUSH TABLES WITH READ LOCK or LOCK BINLOG FOR BACKUP.
It is much easier in MySQL 8 now: with atomic DDL, MySQL 8 should provide crash-safe, consistent snapshots without additional locking.
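For older setups where locking is still needed, a minimal quiescing sequence might look like the sketch below; the datadir mount point /var/lib/mysql is an assumption and depends on your volume layout.

# Sketch only: quiesce MySQL and freeze the filesystem before snapshotting.
# In a MySQL session (keep the session open so the lock is held):
#     FLUSH TABLES WITH READ LOCK;
# Then, on the node that mounts the data volume:
fsfreeze --freeze /var/lib/mysql
# ... take the volume snapshot here ...
fsfreeze --unfreeze /var/lib/mysql
# Finally, back in the MySQL session:
#     UNLOCK TABLES;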
Let’s review how we can use this feature with Google Kubernetes Engine (GKE) and the Percona Kubernetes Operator for Percona XtraDB Cluster.
First, the snapshot feature is still in beta, so it is not enabled by default. You need GKE version 1.14 or later, and you need to enable the “Compute Engine persistent disk CSI Driver” in your cluster: https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/gce-pd-csi-driver#enabling_on_a_new_cluster.
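For example, with the gcloud CLI the driver can be enabled roughly like this (the cluster name cluster-1 and the zone are placeholders for your own values):

# Enable the CSI driver when creating a new cluster
gcloud container clusters create cluster-1 \
    --addons=GcePersistentDiskCsiDriver \
    --zone=us-central1-a

# Or enable it on an existing cluster
gcloud container clusters update cluster-1 \
    --update-addons=GcePersistentDiskCsiDriver=ENABLED \
    --zone=us-central1-a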
Now we need to create a cluster that uses storageClassName: standard-rwo for its PersistentVolumeClaims. The relevant part of the resource definition looks like this:
persistentVolumeClaim:
  storageClassName: standard-rwo
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 11Gi
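Before applying the cluster resource, it may be worth confirming that the standard-rwo storage class provided by the CSI driver actually exists; a quick check (the deploy/cr.yaml path below assumes you are working from the operator repository):

# Check that the CSI-backed storage class is available
kubectl get storageclass standard-rwo

# Apply the modified cluster definition
kubectl apply -f deploy/cr.yaml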
Let’s assume we have cluster1 up and running (kubectl get pods output):
NAME                                               READY   STATUS    RESTARTS   AGE
cluster1-haproxy-0                                 2/2     Running   0          49m
cluster1-haproxy-1                                 2/2     Running   0          48m
cluster1-haproxy-2                                 2/2     Running   0          48m
cluster1-pxc-0                                     1/1     Running   0          50m
cluster1-pxc-1                                     1/1     Running   0          48m
cluster1-pxc-2                                     1/1     Running   0          47m
percona-xtradb-cluster-operator-79d786dcfb-btkw2   1/1     Running   0          5h34m
Now we want to clone this cluster into a new cluster, provisioned with the same dataset. Of course, this could also be done by restoring a backup into a new volume, but snapshot and clone make it much easier. There are still a few additional steps required; I will list them as a cheat sheet.
1. Create a VolumeSnapshotClass (I am not sure why this one is not present by default)
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: onesc
driver: pd.csi.storage.gke.io
deletionPolicy: Delete
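Assuming the manifest above is saved to a file (the name volumesnapshotclass.yaml is arbitrary), it can be applied and verified like this:

kubectl apply -f volumesnapshotclass.yaml
kubectl get volumesnapshotclass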
2. Create a snapshot
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: snapshot-for-newcluster
spec:
  volumeSnapshotClassName: onesc
  source:
    persistentVolumeClaimName: datadir-cluster1-pxc-0
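After applying this manifest (again saved to an arbitrarily named file), wait for the snapshot to become ready before cloning from it; in the v1beta1 API the listing shows a READYTOUSE column:

kubectl apply -f snapshot-for-newcluster.yaml
# READYTOUSE should be true before we clone from the snapshot
kubectl get volumesnapshot snapshot-for-newcluster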
3. Clone into a new volume
Here I should note that we need to follow the volume name convention used by the Percona XtraDB Cluster Operator, which is:
datadir-<CLUSTERNAME>-pxc-0
Where CLUSTERNAME is the name used when we create the cluster. So now we can clone the snapshot into the volume:
datadir-newcluster-pxc-0
Where newcluster is the name of the new cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: datadir-newcluster-pxc-0
spec:
  dataSource:
    name: snapshot-for-newcluster
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  storageClassName: standard-rwo
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 11Gi
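Once this PersistentVolumeClaim is applied, you can check that it gets bound to a freshly provisioned clone (the file name is again arbitrary):

kubectl apply -f datadir-newcluster-pxc-0.yaml
# STATUS should become Bound once the clone is provisioned
kubectl get pvc datadir-newcluster-pxc-0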
Important: the storageClassName, accessModes, and storage size in the volume spec should match the original volume.
After the volume claim is created, we can start newcluster. However, there is still a caveat: we need to set:
forceUnsafeBootstrap: true
Otherwise, Percona XtraDB Cluster will think the data from the snapshot did not come from a clean shutdown (which is true) and will refuse to start.
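For reference, this option goes into the custom resource for the new cluster; a sketch of where it sits (the exact apiVersion, the size value, and the availability of this option depend on the operator version you run):

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
  name: newcluster
spec:
  pxc:
    size: 3
    forceUnsafeBootstrap: true
    # ... the rest of the pxc and cluster configuration ...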
There is still one limitation to this approach which you may find inconvenient: a volume can be cloned only within the same namespace, so it can’t be easily transferred from the PRODUCTION namespace into the QA namespace.
It can still be done, but it requires some extra steps and Kubernetes admin privileges; I will show how in the following blog posts.