Kubernetes operators are meant to simplify the deployment and management of applications. Our Percona Kubernetes Operator for Percona XtraDB Cluster serves this purpose, and also gives users the flexibility to fine-tune the configuration of their MySQL and proxy services.
The document Changing MySQL Options describes how to provide a custom my.cnf configuration to the Operator. But what would happen if you made a mistake and specified a wrong parameter in the configuration?
Apply Configuration
I already deployed my Percona XtraDB Cluster and deliberately submitted a wrong my.cnf configuration in cr.yaml:
spec:
  ...
  pxc:
    configuration: |
      [mysqld]
      wrong_param=123
…
Apply the configuration:
$ kubectl apply -f deploy/cr.yaml
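Before or during the rollout, you can check what the Operator rendered from the spec.pxc.configuration section. A quick sketch, assuming the cluster is named test (so the generated ConfigMap is test-pxc):

```shell
# Inspect the ConfigMap the Operator generates from spec.pxc.configuration;
# its data should contain the [mysqld] section with wrong_param=123
kubectl get configmap test-pxc -o yaml
```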
Once you do this, the Operator applies the new MySQL configuration to one of the Pods. Within a few minutes you will see that the Pod is stuck in CrashLoopBackOff status:
$ kubectl get pods
NAME                                               READY   STATUS             RESTARTS   AGE
percona-xtradb-cluster-operator-79d786dcfb-lzv4b   1/1     Running            0          5h
test-haproxy-0                                     2/2     Running            0          5m27s
test-haproxy-1                                     2/2     Running            0          4m40s
test-haproxy-2                                     2/2     Running            0          4m24s
test-pxc-0                                         1/1     Running            0          5m27s
test-pxc-1                                         1/1     Running            0          4m41s
test-pxc-2                                         0/1     CrashLoopBackOff   1          59s
The logs clearly state that this parameter is not supported and the mysqld process cannot start:
2020-11-19T13:30:30.141829Z 0 [ERROR] [MY-000067] [Server] unknown variable 'wrong_param=123'.
2020-11-19T13:30:30.142355Z 0 [ERROR] [MY-010119] [Server] Aborting
2020-11-19T13:30:31.835199Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.20-11.1)  Percona XtraDB Cluster (GPL), Release rel11, Revision 683b26a, WSREP version 26.4.3.
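These log lines can be pulled from the failing Pod with kubectl (pod name taken from the listing above; --previous is needed once the crashed container has already been restarted):

```shell
# Logs of the currently crashing container
kubectl logs test-pxc-2

# Logs of the previous attempt, if the container has already restarted
kubectl logs test-pxc-2 --previous
```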
It is worth noting that your Percona XtraDB Cluster is still operational and serving requests.
Recovery
Let’s try commenting out the configuration section and reapplying cr.yaml:
spec:
  ...
  pxc:
#    configuration: |
#      [mysqld]
#      wrong_param=123
…

$ kubectl apply -f deploy/cr.yaml
And it won’t work (as of v1.6): the Pod remains in CrashLoopBackOff state, because the Operator does not apply any changes unless all Pods are up and running. We do this to ensure data safety.
Fortunately, there is an easy way to recover from such a mistake: you can either delete or modify the corresponding ConfigMap resource in Kubernetes. Usually its name is {your_cluster_name}-pxc:
$ kubectl delete configmap test-pxc
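If you would rather modify the ConfigMap than delete it, you can open it in an editor and remove only the offending line. A sketch (the exact data key holding the my.cnf content may vary between operator versions, so inspect it first):

```shell
# See which data key carries the MySQL configuration
kubectl get configmap test-pxc -o yaml

# Edit the ConfigMap in place and remove the wrong_param line
kubectl edit configmap test-pxc
```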
And delete the Pod which is failing:
$ kubectl delete pod test-pxc-2
Kubernetes will restart all Percona XtraDB Cluster Pods one by one, and after some time you will see:
test-pxc-0   1/1   Running   0   2m28s
test-pxc-1   1/1   Running   0   3m23s
test-pxc-2   1/1   Running   0   4m36s
You can now apply the correct MySQL configuration, either through the ConfigMap or via cr.yaml again. We are assessing other recovery options for such cases, as well as configuration validation, so stay tuned for upcoming releases.
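For the ConfigMap route, one option is to recreate the resource from a corrected file. A sketch, assuming the {your_cluster_name}-pxc naming from above; the max_connections value is just an illustrative parameter, and the expected file/key name may differ between operator versions, so verify it against your operator's documentation:

```shell
# A corrected MySQL configuration (illustrative parameter)
cat > my.cnf <<'EOF'
[mysqld]
max_connections=250
EOF

# Recreate the ConfigMap for a cluster named "test"
kubectl create configmap test-pxc --from-file=my.cnf
```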