Kubernetes operators are meant to simplify the deployment and management of applications. Our Percona Kubernetes Operator for Percona XtraDB Cluster serves this purpose, and also gives users the flexibility to fine-tune the configuration of their MySQL and proxy services.
The document Changing MySQL Options describes how to provide a custom my.cnf configuration to the Operator. But what would happen if you made a mistake and specified a wrong parameter in the configuration?
Apply Configuration
I already deployed my Percona XtraDB Cluster and deliberately submitted a wrong my.cnf configuration in cr.yaml:
spec:
  ...
  pxc:
    configuration: |
      [mysqld]
      wrong_param=123
…
Apply the configuration:
$ kubectl apply -f deploy/cr.yaml
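Before or during the rollout, you can check what the Operator rendered from the spec.pxc.configuration section. A quick sketch, assuming the cluster is named test (so the generated ConfigMap is test-pxc):

```shell
# Inspect the ConfigMap the Operator generates from spec.pxc.configuration;
# its data should contain the [mysqld] section with wrong_param=123
kubectl get configmap test-pxc -o yaml
```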
Once you do this, the Operator applies the new MySQL configuration to one of the Pods. Within a few minutes you will see that the Pod is stuck in CrashLoopBackOff status:
$ kubectl get pods
NAME                                               READY   STATUS             RESTARTS   AGE
percona-xtradb-cluster-operator-79d786dcfb-lzv4b   1/1     Running            0          5h
test-haproxy-0                                     2/2     Running            0          5m27s
test-haproxy-1                                     2/2     Running            0          4m40s
test-haproxy-2                                     2/2     Running            0          4m24s
test-pxc-0                                         1/1     Running            0          5m27s
test-pxc-1                                         1/1     Running            0          4m41s
test-pxc-2                                         0/1     CrashLoopBackOff   1          59s
The logs clearly state that this parameter is not supported and the mysqld process cannot start:
2020-11-19T13:30:30.141829Z 0 [ERROR] [MY-000067] [Server] unknown variable 'wrong_param=123'.
2020-11-19T13:30:30.142355Z 0 [ERROR] [MY-010119] [Server] Aborting
2020-11-19T13:30:31.835199Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.20-11.1)  Percona XtraDB Cluster (GPL), Release rel11, Revision 683b26a, WSREP version 26.4.3.
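These log lines can be pulled from the failing Pod with kubectl (pod name taken from the listing above; --previous is needed once the crashed container has already been restarted):

```shell
# Logs of the currently crashing container
kubectl logs test-pxc-2

# Logs of the previous attempt, if the container has already restarted
kubectl logs test-pxc-2 --previous
```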
It is worth noting that your Percona XtraDB Cluster is still operational and serving requests.
Recovery
Let’s try commenting out the configuration section and reapplying cr.yaml:
spec:
  ...
  pxc:
#    configuration: |
#      [mysqld]
#      wrong_param=123
…

$ kubectl apply -f deploy/cr.yaml
And it won’t work (as of v1.6): the Pod remains in CrashLoopBackOff state, because the Operator does not apply any changes unless all Pods are up and running. We do this to ensure data safety.
Fortunately, there is an easy way to recover from such a mistake: you can either delete or modify the corresponding ConfigMap resource in Kubernetes. Usually its name is {your_cluster_name}-pxc:
$ kubectl delete configmap test-pxc
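If you would rather modify the ConfigMap than delete it, you can open it in an editor and remove only the offending line. A sketch (the exact data key holding the my.cnf content may vary between operator versions, so inspect it first):

```shell
# See which data key carries the MySQL configuration
kubectl get configmap test-pxc -o yaml

# Edit the ConfigMap in place and remove the wrong_param line
kubectl edit configmap test-pxc
```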
And delete the Pod which is failing:
$ kubectl delete pod test-pxc-2
Kubernetes will restart all Percona XtraDB Cluster Pods one by one, and after some time you will see:
test-pxc-0   1/1   Running   0   2m28s
test-pxc-1   1/1   Running   0   3m23s
test-pxc-2   1/1   Running   0   4m36s
You can now apply the correct MySQL configuration, either through the ConfigMap or via cr.yaml again. We are assessing other recovery options for such cases, as well as configuration validation, so stay tuned for upcoming releases.
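For the ConfigMap route, one option is to recreate the resource from a corrected file. A sketch, assuming the {your_cluster_name}-pxc naming from above; the max_connections value is just an illustrative parameter, and the expected file/key name may differ between operator versions, so verify it against your operator's documentation:

```shell
# A corrected MySQL configuration (illustrative parameter)
cat > my.cnf <<'EOF'
[mysqld]
max_connections=250
EOF

# Recreate the ConfigMap for a cluster named "test"
kubectl create configmap test-pxc --from-file=my.cnf
```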