Friday, November 30, 2012

Linux performance troubleshooting - hard drive

In this article we will discuss how to tell whether the hard drive is the performance bottleneck of your server. After reading it you will be able to check the current HDD load and find out what exactly causes that load.

Usually, on modern servers, the hard drive is the slowest part of the entire server. You can easily add more RAM, increase the number of CPUs, or use server-grade network cards, but if your system runs HDD-bound software and your HDDs are slow, the entire system will be slow. So, for a Support Engineer, HDD performance troubleshooting is a very important skill.

When I first log in to a server, I run the top command to understand the general server load and the possible reasons for overload.

From this picture we can see that:
  1. The server load has increased during the last minute. I can tell this from the load average values: 4.68, 2.20, 1.33 (1, 5 and 15 minutes). During the last 15 minutes LA was about 1, and now it's almost 5.
  2. id is only 39% - this means the CPU is idle only 39% of the time.
  3. wa is 49.1% - this is the share of time the CPU has been waiting for I/O. What is I/O? In 99% of cases it's the hard drive (the remaining 1% I leave for networking).
  4. Some processes are in the 'D' state. D means that the process is in 'uninterruptible sleep': it is waiting for I/O and you can't do anything about it. Even sudo kill -9 will not help :). There was an interesting case when we killed a process with signal 11 (SIGSEGV) to get its core dump in order to analyze a memory leak. At that moment the process was using about 11 GB of RAM. After receiving kill -11 it started to write the contents of its RAM to a core file on disk... The system was almost unusable due to the very high load, and the engineers were helpless because it wasn't possible to kill the process until it had dumped all its RAM to the core file. The only options were to reboot the server or to wait..!
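
The same two signals (wa and the 'D' state) can also be checked from a script. Below is a minimal sketch, not what top itself does internally: it assumes you take two snapshots of the aggregate "cpu" line from /proc/stat and the text output of `ps -eo pid,stat,comm`. The function names and the sample numbers are made up for illustration.

```python
def iowait_percent(prev_line: str, cur_line: str) -> float:
    """Percentage of CPU time spent in iowait between two 'cpu' lines of /proc/stat."""
    def fields(line: str) -> list:
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
        return [int(x) for x in parts[1:]]

    prev, cur = fields(prev_line), fields(cur_line)
    deltas = [c - p for p, c in zip(prev, cur)]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # Guest time is already included in user/nice, so only the first 8 are summed.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def d_state_processes(ps_output: str) -> list:
    """Return (pid, command) pairs in state 'D' from `ps -eo pid,stat,comm` output."""
    found = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[1].startswith("D"):
            found.append((int(parts[0]), parts[2].strip()))
    return found


# Hypothetical sample data, shaped like the real files and output.
before = "cpu 100 0 50 800 50 0 0 0 0 0"
after = "cpu 108 0 54 839 99 0 0 0 0 0"
ps_text = """  PID STAT COMMAND
    1 Ss   systemd
 2345 D    kio_ftp
 2400 D+   cp
 2500 R+   ps
"""
print(iowait_percent(before, after))
print(d_state_processes(ps_text))
```

On a live system the two /proc/stat lines would be read a second or so apart; the sample strings here only stand in for them.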
So, from 2, 3 and 4 we can conclude that there is a high HDD load. How do we check the current HDD load? iostat will help!

> iostat -dx /dev/sda 1
where -d shows the device utilization report, -x displays extended statistics, /dev/sda is my hard drive, and 1 is the interval in seconds between reports. The first report generated by iostat (and by vmstat too) provides statistics for the time since the system was booted, so usually it should be ignored.


The columns that we are interested in are:
  • r/s - The number (after merges) of read requests completed per second for the device
  • w/s - The number (after merges) of write requests completed per second for the device
  • rkB/s - The number of kilobytes read from the device per second
  • wkB/s - The number of kilobytes written to the device per second
  • %util - Percentage of elapsed time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%
In my case we can see that my HDD was loaded up to 100%. Mostly it was loaded by write requests (from 100 to 144 requests per second), and from 43 MB to 65 MB were written to the HDD per second. Typical 5400 rpm HDD performance :).
If you run iostat without specifying a device, it will show the load of all hard drives, with device-mapper devices (if any) listed by number. If you specify the -N option, it will display the registered device mapper names for any device mapper devices, which is useful for viewing LVM statistics.
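
To make the meaning of these columns concrete, here is a rough sketch of how similar numbers can be derived from two snapshots of /proc/diskstats. It is an illustration of the arithmetic, not iostat's actual code; the names DiskSample, parse_diskstats and rates are hypothetical, and the sample lines are invented.

```python
from typing import Dict, NamedTuple

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


class DiskSample(NamedTuple):
    reads: int            # reads completed
    sectors_read: int
    writes: int           # writes completed
    sectors_written: int
    io_ms: int            # milliseconds spent doing I/O


def parse_diskstats(text: str, device: str) -> DiskSample:
    """Pick one device out of the contents of /proc/diskstats."""
    for line in text.splitlines():
        f = line.split()
        if len(f) >= 14 and f[2] == device:
            return DiskSample(int(f[3]), int(f[5]), int(f[7]), int(f[9]), int(f[12]))
    raise KeyError(device)


def rates(prev: DiskSample, cur: DiskSample, interval_s: float) -> Dict[str, float]:
    """Per-second rates between two samples taken interval_s seconds apart."""
    kb_per_sector = SECTOR_BYTES / 1024
    busy = 100.0 * (cur.io_ms - prev.io_ms) / (interval_s * 1000.0)
    return {
        "r/s": (cur.reads - prev.reads) / interval_s,
        "w/s": (cur.writes - prev.writes) / interval_s,
        "rkB/s": (cur.sectors_read - prev.sectors_read) * kb_per_sector / interval_s,
        "wkB/s": (cur.sectors_written - prev.sectors_written) * kb_per_sector / interval_s,
        "%util": min(100.0, busy),
    }


# Invented snapshots one second apart for a device named sda.
snap1 = "   8       0 sda 1000 10 20000 500 5000 20 400000 9000 0 3000 9500"
snap2 = "   8       0 sda 1000 10 20000 500 5120 25 502400 9900 3 3990 10400"
print(rates(parse_diskstats(snap1, "sda"), parse_diskstats(snap2, "sda"), 1.0))
```

The %util line shows why the value is about elapsed time: it is simply the time the device was busy divided by the length of the interval.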

OK. At this point it's clear that the server has an HDD performance issue. But how do we find the process(es) causing the problem? I prefer the following tools: htop, iotop and dstat. Let's check them one by one:
  • htop
This program can be used as a complete substitute for top, and it's really cool, with a lot of options. One of the most valuable is the ability to sort the output by per-process I/O values. How to use it:
$ htop --> F2 --> choose 'Columns' in the Setup column --> Available Columns --> there are 3 I/O-related columns: IO_READ_RATE, IO_WRITE_RATE and IO_RATE (IO_WRITE_RATE + IO_READ_RATE). You can add them all or just one of them (for example IO_RATE) by pressing F5 (they will appear under Active Columns) --> press F10 to finish.
We can see that the needed columns are now in the htop output. The next step is to sort processes by the needed value, in our case IO_RATE: F6 --> IO --> Enter. That's all: the most I/O-consuming process is at the top of the output:


In my case it's kio_ftp, and that's right: I was downloading files from an FTP server. We can also see that there were only I/O write requests and 0 reads.
In the same way you can sort programs by READ and WRITE rate.
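
The same kind of ranking can be sketched by hand from the per-process counters in /proc/<pid>/io. This is only an illustration of the idea and is not claimed to be how htop computes its columns; parse_proc_io and io_rates are hypothetical helper names, and reading /proc/<pid>/io for other users' processes normally requires root.

```python
def parse_proc_io(text: str) -> dict:
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip():
            counters[key.strip()] = int(value)
    return counters


def io_rates(prev: dict, cur: dict, interval_s: float) -> list:
    """Rank processes by disk I/O between two samples.

    prev and cur map pid -> parsed counters. Returns a list of
    (pid, read_bytes_per_s, write_bytes_per_s, total_bytes_per_s),
    sorted with the busiest process first.
    """
    rows = []
    for pid, now in cur.items():
        before = prev.get(pid)
        if before is None:
            continue  # process appeared between the two samples
        read_rate = (now["read_bytes"] - before["read_bytes"]) / interval_s
        write_rate = (now["write_bytes"] - before["write_bytes"]) / interval_s
        rows.append((pid, read_rate, write_rate, read_rate + write_rate))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Invented counters for two processes, sampled two seconds apart.
sample_text = "rchar: 10\nwchar: 20\nsyscr: 1\nsyscw: 2\nread_bytes: 4096\nwrite_bytes: 8192\ncancelled_write_bytes: 0\n"
first = {101: parse_proc_io(sample_text), 202: {"read_bytes": 0, "write_bytes": 0}}
second = {101: {"read_bytes": 4096, "write_bytes": 8192 + 2048},
          202: {"read_bytes": 0, "write_bytes": 100_000_000}}
for row in io_rates(first, second, 2.0):
    print(row)
```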
  • iotop
Unlike htop, iotop can be used only for I/O monitoring. The usage is very simple:
$ iotop
But I prefer to run it with the -o option (show only processes or threads actually doing I/O, instead of showing all processes or threads):
$ iotop -o


We can see that at that very moment 3 processes were actively using the disk. jbd2 is the journaling process for ext4. In dm-6-8, dm stands for device mapper (used by LVM in my case) and 6 is the device number. The lsblk output can help to understand which device is loaded now:


From the picture it is clear that dm-6 is the LVM volume 'system-data', which is mounted on /data. And /data is exactly the folder I was copying files to at that moment.
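
If lsblk is not at hand, the dm-N to name mapping can also be read from sysfs, where each device-mapper device exposes its name in /sys/block/dm-N/dm/name. A small sketch of that lookup follows; the function name dm_names is hypothetical, and the sys_block parameter exists only so the function can be pointed at a different directory.

```python
import os


def dm_names(sys_block: str = "/sys/block") -> dict:
    """Map kernel names such as 'dm-6' to device mapper names such as 'system-data'."""
    names = {}
    for entry in sorted(os.listdir(sys_block)):
        name_file = os.path.join(sys_block, entry, "dm", "name")
        # Only device-mapper devices have a dm/name attribute.
        if entry.startswith("dm-") and os.path.isfile(name_file):
            with open(name_file) as fh:
                names[entry] = fh.read().strip()
    return names
```

Calling dm_names() on a Linux host with LVM should return one entry per logical volume; on a host without device-mapper devices it returns an empty dict.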
  • dstat
dstat is a versatile tool for generating system resource statistics. This tool can show you everything! :). But we will focus only on I/O stats:

We can see that most of the time the most I/O-consuming process was kdeinit: kio_ftp.
What distinguishes dstat from other monitoring tools is that you can combine several stats in the same output. For example, the next output shows I/O, network and CPU stats together!


Cool, huh? And the output combination is limited only by your imagination :). For example:
> dstat --top-io --bw -n -c -m -p -l
will add memory usage, process and load average information.

And this one will show combined information about disk utilization, the most I/O-consuming process, and disk read and write operations per second (--bw switches to colors optimized for a white background):


If you want to understand the general performance of your disk subsystem, you can use the sysbench utility. sysbench can be used for testing:
  • file I/O performance
  • scheduler performance
  • memory allocation and transfer speed
  • POSIX threads implementation performance
  • database server performance

If you decide to buy a server for I/O-bound applications (for example, OLTP database applications), seriously consider buying a good RAID controller with a write-back cache. I/O performance will improve drastically.
An average HDD delivers about 125 IOPS; with a RAID controller it is roughly 5-10K IOPS. IOPS means input/output operations per second. A RAID controller with a write-back cache is especially important for random write operations.
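
A figure in the region of 125 IOPS for a single HDD can be sanity-checked with a common back-of-the-envelope estimate: one random request costs roughly an average seek plus half a disk revolution. The sketch below applies that estimate; the seek times are assumed example values rather than measurements, and the model ignores transfer time, queuing and caching.

```python
def hdd_random_iops(rpm: int, avg_seek_ms: float) -> float:
    """Rough random IOPS of one spinning disk: 1 / (seek time + rotational latency)."""
    rotational_latency_ms = 0.5 * 60_000 / rpm  # half a revolution on average
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)


# Assumed average seek times, for illustration only.
print(round(hdd_random_iops(7200, 4.0)))
print(round(hdd_random_iops(5400, 9.0)))
```

A write-back cache helps because the controller can acknowledge a write before the disk has physically seeked to the right place, so the per-request mechanical delay no longer limits the rate seen by the application.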

Next time we will talk about network performance troubleshooting.

1 comment:

  1. Sometimes there are several processes, some doing reads and some doing writes. Reads are usually slower, but who knows which type of reads a program does, how these operations are buffered in the cache, etc.

    I'd recommend taking a look at atop.
    It provides a way to obtain kernel-level process accounting, which makes finding a greedy process easier. You can see the impact of the process, not only the number of I/O operations.
