Understanding Linux IOWait

I have seen many Linux Performance engineers looking at the “IOWait” portion of CPU usage as something to indicate whenever the system is I/O-bound. In this blog post, I will explain why this approach is unreliable and what better indicators you can use.

Let’s start by running a little experiment – generating heavy I/O usage on the system:

sysbench  --threads=8 --time=0 --max-requests=0  fileio --file-num=1 --file-total-size=10G --file-io-mode=sync --file-extra-flags=direct --file-test-mode=rndrd run

CPU Usage in Percona Monitoring and Management (PMM):

CPU Usage in Percona Monitoring and Management

root@iotest:~# vmstat 10
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  6      0 7137152  26452 762972    0    0 40500  1714 2519 4693  1  6 55 35  3
 2  8      0 7138100  26476 762964    0    0 344971    17 20059 37865  3 13  7 73  5
 0  8      0 7139160  26500 763016    0    0 347448    37 20599 37935  4 17  5 72  3
 2  7      0 7139736  26524 762968    0    0 334730    14 19190 36256  3 15  4 71  6
 4  4      0 7139484  26536 762900    0    0 253995     6 15230 27934  2 11  6 77  4
 0  7      0 7139484  26536 762900    0    0 350854     6 20777 38345  2 13  3 77  5

So far, so good, and — we see I/O intensive workload clearly corresponds to high IOWait (“wa” column in vmstat).

Let’s continue running our I/O-bound workload and add a heavy CPU-bound load:

sysbench --threads=8 --time=0 cpu run

heavy CPU usage

root@iotest:~# vmstat 10
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
12  4      0 7121640  26832 763476    0    0 48034  1460 2895 5443  6  7 47 37  3
13  3      0 7120416  26856 763464    0    0 256464    14 12404 25937 69 15  0  0 16
 8  8      0 7121020  26880 763496    0    0 325789    16 15788 33383 85 15  0  0  0
10  6      0 7121464  26904 763460    0    0 322954    33 16025 33461 83 15  0  0  1
 9  7      0 7123592  26928 763524    0    0 336794    14 16772 34907 85 15  0  0  1
13  3      0 7124132  26940 763556    0    0 386384    10 17704 38679 84 16  0  0  0
 9  7      0 7128252  26964 763604    0    0 356198    13 16303 35275 84 15  0  0  0
 9  7      0 7128052  26988 763584    0    0 324723    14 13905 30898 80 15  0  0  5
10  6      0 7122020  27012 763584    0    0 380429    16 16770 37079 81 18  0  0  1

What happened? IOWait is completely gone and now this system does not look I/O-bound at all!

In reality, though, of course, nothing changed for our first workload — it continues to be I/O-bound; it just became invisible when we look at “IOWait”!

To understand what is happening, we really need to understand what “IOWait” is and how it is computed.

There is a good article that goes into more detail on the subject, but basically, “IOWait” is kind of idle CPU time. If the CPU core gets idle because there is no work to do, the time is accounted as “idle.” If, however, it got idle because a process is waiting on disk, I/O time is counted towards “IOWait.”

However, if a process is waiting on disk I/O but other processes on the system can use the CPU, the time will be counted towards their CPU usage as user/system time instead.

Because of this accounting, other interesting behaviors are possible. Now instead of running eight I/O-bound threads, let’s just run one I/O-bound process on four core VM:

sysbench  --threads=1 --time=0 --max-requests=0  fileio --file-num=1 --file-total-size=10G --file-io-mode=sync --file-extra-flags=direct --file-test-mode=rndrd run

four core VM CPU usage

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  1      0 7130308  27704 763592    0    0 62000    12 4503 8577  3  5 69 20  3
 2  1      0 7127144  27728 763592    0    0 67098    14 4810 9253  2  5 70 20  2
 2  1      0 7128448  27752 763592    0    0 72760    15 5179 9946  2  5 72 20  1
 4  0      0 7133068  27776 763588    0    0 69566    29 4953 9562  2  5 72 21  1
 2  1      0 7131328  27800 763576    0    0 67501    15 4793 9276  2  5 72 20  1
 2  0      0 7128136  27824 763592    0    0 59461    15 4316 8272  2  5 71 20  3
 3  1      0 7129712  27848 763592    0    0 64139    13 4628 8854  2  5 70 20  3
 2  0      0 7128984  27872 763592    0    0 71027    18 5068 9718  2  6 71 20  1
 1  0      0 7128232  27884 763592    0    0 69779    12 4967 9549  2  5 71 20  1
 5  0      0 7128504  27908 763592    0    0 66419    18 4767 9139  2  5 71 20  1

Even though this process is completely I/O-bound, we can see IOWait (wa) is not particularly high, less than 25%. On larger systems with 32, 64, or more cores, such completely IO-bottlenecked processes will be all but invisible, generating single-digit IOWait percentages.

As such, high IOWait shows many processes in the system waiting on disk I/O, but even with low IOWait, the disk I/O may be bottlenecked for some processes on the system.

If IOWait is unreliable, what can you use instead to give you better visibility?

First, look at application-specific observability. The application, if it is well instrumented, tends to know best whenever it is bound by the disk and what particular tasks are I/O-bound.

If you only have access to Linux metrics, look at the “b” column in vmstat, which corresponds to processes blocked on disk I/O. This will show such processes, even of concurrent CPU-intensive loads, will mask IOWait:

CPU intensive load will mask IOWait

Finally, you can look at per-process statistics to see which processes are waiting for disk I/O. For Percona Monitoring and Management, you can install a plugin as described in the blog post Understanding Processes Running on Linux Host with Percona Monitoring and Management.

With this extension, we can clearly see which processes are runnable (running or blocked on CPU availability) and which are waiting on disk I/O!

Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.

Download Percona Monitoring and Management Today