So you thought your database is in good shape, but suddenly your CPU usage spiked and reached 100% at 2AM. How would you go about tracking the root cause and fix it?
1. Locate the root cause
If you are using CloudWatch or any infrastructure monitoring solution, you can start your investigation there.
To understand what happened in your database, you should enable the slow query log feature on your database. This will capture all the slow log queries into a log, based on a threshold that you define.
If you are not sure how to enable the slow query log, check out this step-by-step guide
Once you enabled it, you can use slowquerylog.com to visualize the logs on a timeline and zoom-in on the specific issue.
Now, after you have some logs, you can start the investigation:
- When did it start?
- Did you have a major traffic increase ?
- Did you release a new feature / major release?
- Did you change anything? (database configuration parameters? did you enable persistent connection? Upgrade? Moved to a new infra?)
2. Fix the performance issue
There can be many reasons for the sudden increase in CPU usage. Here are few tips on how to fix it:
- Check out which MySQL users are logged in. Maybe there is a specific new user that’s hammering the system. You can do that using the following query:
SELECT SUBSTRING_INDEX(host, ':', 1) AS host_short, GROUP_CONCAT(DISTINCT user) AS users, COUNT(*) AS threads FROM information_schema.processlist GROUP BY host_short ORDER BY COUNT(*), host_short;
- Turn off persistent connections if you are not 100% sure that you need them.
- Check out your slow queries. You might have a new query or an old query that’s suddenly running slow. You can use free tools that will help you find the top slow queries, like EverSQL.
With such automatic MySQL optimization solutions will free you to focus on what matters the most, which is your business, rather than investing your time on manually optimizing your queries and database.
- List all the database processes and check the inno db engine status to check if you have a deadlock. You can do it by running the following command:
SHOW ENGINE INNODB STATUS
SHOW FULL PROCESSLIST; (in case your queries are very long)
- Review database configuration parameters, like Innodb_buffer_pool_size,buffer size, table cache, query cache.
SHOW SERVER STATUSG
- Check out Innodb status, as it might be that a new process is eating up CPU:
SHOW ENGINE INNODB STATUSG
3. Make sure it will not happen again
You don’t want to be caught off guard again at 2AM when your CPU utilization is 100%, hence you should install some type of a database monitoring solution to make sure you get alerts as soon as possible.
There are several Monitoring tools like NewRelic, AppDynamics and others, but for MySQL you can use EverSQL Sensor, which automatically collects dozens of signals and provides you AI-powered MySQL insights, such as:
- Detection of new slow queries.
- Prioritization of which queries to optimize.
- Detection of missing and redundant indexes.
- Automatic query optimization.
- and much more.
Production incidents generated by 100% CPU usage on your database can be avoided. You can get notifications as soon as it starts climbing, and use online solutions to fix it within a few minutes, instead of hiring expensive database consultants.