Most people love mystery movies; there is something thrilling about them. We love mystery movies too, but there is something we love even more: Joomla mysteries. And yesterday, we were fortunate enough to be assigned to investigate one of these mysteries…
At around eleven hundred hours yesterday morning, a regular client of ours told us that they were seeing the following error on their Joomla site:
Error displaying the error page: Application Instantiation Error: Could not connect to MySQL.
Of course, the above error is probably the most common fatal error on Joomla sites, and we have written about it before. Yet this time, it was different, because a few minutes later, the client called back and told us that the site was now working. Hmmm… It was the first time we had seen this issue resolve itself. That was weird, we thought. We asked the client if they wanted to investigate further, but they told us that they were OK; it was just a hiccup…
A couple of hours later, they emailed us and told us that they were seeing the same error message that they saw in the morning. By the time we received the email and checked the site again, the error was gone. So, we emailed the client back and recommended that they authorize an investigation into this issue, because instability issues such as these are never a good sign. The client immediately authorized the investigation, and the first thing that we checked was the MySQL error log, which is located under the /var/lib/mysql folder (the file name typically ends with .err). Here’s what we saw:
2017-11-07 11:03:29 82870 [Note] Plugin 'FEDERATED' is disabled.
2017-11-07 11:03:29 82870 [Note] InnoDB: Using atomics to ref count buffer pool pages
2017-11-07 11:03:29 82870 [Note] InnoDB: The InnoDB memory heap is disabled
2017-11-07 11:03:29 82870 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-11-07 11:03:29 82870 [Note] InnoDB: Memory barrier is not used
2017-11-07 11:03:29 82870 [Note] InnoDB: Compressed tables use zlib 1.2.3
2017-11-07 11:03:29 82870 [Note] InnoDB: Using Linux native AIO
2017-11-07 11:03:29 82870 [Note] InnoDB: Using CPU crc32 instructions
2017-11-07 11:03:29 82870 [Note] InnoDB: Initializing buffer pool, size = 64.0M
2017-11-07 11:03:29 82870 [Note] InnoDB: Completed initialization of buffer pool
2017-11-07 11:03:29 82870 [Note] InnoDB: Highest supported file format is Barracuda.
2017-11-07 11:03:29 82870 [Note] InnoDB: Log scan progressed past the checkpoint lsn 65178299172
2017-11-07 11:03:29 82870 [Note] InnoDB: Database was not shutdown normally!
2017-11-07 11:03:29 82870 [Note] InnoDB: Starting crash recovery.
2017-11-07 11:03:29 82870 [Note] InnoDB: Reading tablespace information from the .ibd files...
2017-11-07 11:03:29 82870 [Note] InnoDB: Restoring possible half-written data pages
2017-11-07 11:03:29 82870 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Doing recovery: scanned up to log sequence number 65178300463
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 1 row operations to undo
InnoDB: Trx id counter is 379714816
2017-11-07 11:03:29 82870 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Starting in background the rollback of uncommitted transactions
2017-11-07 11:03:30 82870 [Note] InnoDB: 128 rollback segment(s) are active.
2017-11-07 11:03:30 7f6cf1e16700 InnoDB: Rolling back trx with id 379714508, 1 rows to undo
2017-11-07 11:03:30 82870 [Note] InnoDB: Waiting for purge to start
2017-11-07 11:03:30 82870 [Note] InnoDB: Rollback of trx with id 379714508 completed
2017-11-07 11:03:30 7f6cf1e16700 InnoDB: Rollback of non-prepared transactions completed
2017-11-07 11:03:30 82870 [Note] InnoDB: 5.6.38 started; log sequence number 65178300463
2017-11-07 11:03:30 82870 [Note] Server hostname (bind-address): '*'; port: 3306
2017-11-07 11:03:30 82870 [Note] IPv6 is available.
2017-11-07 11:03:30 82870 [Note] - '::' resolves to '::';
2017-11-07 11:03:30 82870 [Note] Server socket created on IP: '::'.
2017-11-07 11:03:30 82870 [Note] Event Scheduler: Loaded 0 events
2017-11-07 11:03:30 82870 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.38' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL)
2017-11-07 13:11:21 11675 [Note] Plugin 'FEDERATED' is disabled.
2017-11-07 13:11:22 11675 [Note] InnoDB: Using atomics to ref count buffer pool pages
2017-11-07 13:11:22 11675 [Note] InnoDB: The InnoDB memory heap is disabled
2017-11-07 13:11:22 11675 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-11-07 13:11:22 11675 [Note] InnoDB: Memory barrier is not used
2017-11-07 13:11:22 11675 [Note] InnoDB: Compressed tables use zlib 1.2.3
2017-11-07 13:11:22 11675 [Note] InnoDB: Using Linux native AIO
2017-11-07 13:11:22 11675 [Note] InnoDB: Using CPU crc32 instructions
2017-11-07 13:11:23 11675 [Note] InnoDB: Initializing buffer pool, size = 64.0M
2017-11-07 13:11:23 11675 [Note] InnoDB: Completed initialization of buffer pool
2017-11-07 13:11:23 11675 [Note] InnoDB: Highest supported file format is Barracuda.
2017-11-07 13:11:23 11675 [Note] InnoDB: Log scan progressed past the checkpoint lsn 65220975371
2017-11-07 13:11:24 11675 [Note] InnoDB: Database was not shutdown normally!
2017-11-07 13:11:24 11675 [Note] InnoDB: Starting crash recovery.
2017-11-07 13:11:24 11675 [Note] InnoDB: Reading tablespace information from the .ibd files...
2017-11-07 13:11:27 11675 [Note] InnoDB: Restoring possible half-written data pages
2017-11-07 13:11:27 11675 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Doing recovery: scanned up to log sequence number 65221104018
2017-11-07 13:11:30 11675 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
2017-11-07 13:11:33 11675 [Note] InnoDB: 128 rollback segment(s) are active.
2017-11-07 13:11:33 11675 [Note] InnoDB: Waiting for purge to start
2017-11-07 13:11:33 11675 [Note] InnoDB: 5.6.38 started; log sequence number 65221104018
2017-11-07 13:11:33 11675 [Note] Server hostname (bind-address): '*'; port: 3306
2017-11-07 13:11:34 11675 [Note] IPv6 is available.
2017-11-07 13:11:34 11675 [Note] - '::' resolves to '::';
2017-11-07 13:11:34 11675 [Note] Server socket created on IP: '::'.
2017-11-07 13:11:35 11675 [Note] Event Scheduler: Loaded 0 events
2017-11-07 13:11:35 11675 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.38' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL)
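(By the way, if you want to check this log on your own server, the two commands below are a minimal sketch of how to locate and read it; the tail command assumes there is only one .err file in that folder.)

ls /var/lib/mysql/*.err           # find the MySQL error log file(s)
tail -n 100 /var/lib/mysql/*.err  # show the last 100 lines of the error log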
You can clearly see that within 2 hours, MySQL restarted twice, by itself! But why? This is where this whole thing started to spin into a fun Joomla mystery! We looked at all the logs for a sign that could lead us to the real cause of the problem, and after many hours of investigation, we found a clue. It was the following lines in the messages file, which is located under the /var/log folder:
Nov 7 11:03:21 host kernel: [42008818.160677] Out of memory: Kill process 39364 (mysqld) score 27 or sacrifice child
Nov 7 11:03:21 host kernel: [42008818.161356] Killed process 39364, UID 498, (mysqld) total-vm:1402564kB, anon-rss:16576kB, file-rss:252kB
Nov 7 11:03:22 host kernel: [42008818.858715] php invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
If you carefully read the above lines, you will understand what was happening. The system ran out of memory, and the kernel’s OOM (Out Of Memory) killer had to make a decision: either kill the mysqld process or sacrifice its child(ren). In the next line, you can see that the kernel decided to kill the mysqld process and immediately went ahead with this decision, which resulted in the MySQL server restarting. (Notice also the third line: one second later, a php process invoked the oom-killer as well, which means that the memory pressure on this server was not limited to MySQL.)
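If you suspect that the same thing is happening on your server, the commands below are a simple way to check for OOM killer activity (this assumes a Red Hat style system where the kernel log goes to /var/log/messages; on Debian/Ubuntu systems, check /var/log/syslog instead):

grep -i "out of memory" /var/log/messages   # moments where the kernel ran out of memory
grep -i "oom-killer" /var/log/messages      # which process triggered the OOM killer
dmesg | grep -i "killed process"            # processes killed by the OOM killer since boot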
OK, so now we know it’s a memory issue. But why? Doesn’t their server have enough memory? We issued the following command in the shell to find out how much memory the server has…
free -m
… and it returned the following:
             total       used       free     shared    buffers     cached
Mem:          1815       1545        270          0        130       1004
-/+ buffers/cache:        409       1405
Swap:         1999        159       1840
Aha! So this server has less than 2 GB of RAM (it is obviously a VPS, and not a real server), and most of that RAM was in use. Of course, Linux tends to allocate unused memory to cache in order to speed things up, but still, 2 GB is obviously too small for this website, since the server was, at times, running out of memory.
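As a side note, if you run into a similar situation and want to see which processes are actually consuming the memory, a quick way to check (this is a general diagnostic sketch, not something specific to this client’s server) is to sort the process list by resident memory:

ps aux --sort=-rss | head -n 10   # the 10 processes using the most resident memory (RSS)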
We offered the client two solutions: 1) we optimize the queries on the site in order to reduce the memory usage on the database server, or 2) they upgrade to a real dedicated server with at least 16 GB of RAM.
The client opted for the latter solution, which is, in our opinion, the ideal solution.
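For those who would rather go with the former solution (optimizing the queries), a reasonable starting point, assuming you have MySQL root access, is to enable MySQL’s slow query log and see which queries are doing the heavy lifting. The one second threshold and the log file path below are just example values:

mysql -u root -p -e "SET GLOBAL slow_query_log_file = '/var/lib/mysql/slow-queries.log'; SET GLOBAL long_query_time = 1; SET GLOBAL slow_query_log = 'ON';"

After letting the site run for a while, the slow query log will list the queries that take the longest, which is usually a good indicator of where the database resources are being spent.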
So there you have it: if you have intermittent downtime on your MySQL server, then check your logs (especially the /var/log/messages file) to see if your server is running out of memory. If that’s the case, then the best route that you can take is to add resources to your server (or migrate to a more powerful server that has more RAM). If you want to take an alternative route, then just contact us. We are always ready to help, our fees are right, and our work is ultra professional!
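One last tip: if you would like an early warning before things ever get to the OOM killer stage, below is a minimal sketch of a script that can be run from cron. It assumes the older free output format used on this server (the one with the -/+ buffers/cache line); the 200 MB threshold, the mail command, and the e-mail address are all placeholders that you should adapt to your own environment.

#!/bin/bash
# Send a warning e-mail when the available memory drops below a threshold.
THRESHOLD_MB=200                                           # example threshold, adjust as needed
FREE_MB=$(free -m | awk '/buffers\/cache/ {print $4}')     # "free" column of the -/+ buffers/cache line
if [ "$FREE_MB" -lt "$THRESHOLD_MB" ]; then
    echo "Low memory on $(hostname): ${FREE_MB} MB available" | mail -s "Low memory warning" admin@example.com
fi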