Friday, January 20, 2017

Checking disk performance using iostat

How do we know whether the disk is the problem, i.e. that so many reads and writes are happening that the disks cannot keep up? We can check this with the top and iostat commands.

First, run top and check whether the CPU is waiting on I/O devices (the %wa field).

#top -c

top - 11:58:53 up 16 days, 18:29,  1 user,  load average: 0.08, 0.13, 0.09
Tasks: 124 total,   1 running, 123 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.2%us,    0.2%sy,    0.0%ni, 98.2%id,  1.4%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   3711248k total,  3562888k used,   148360k free,   183332k buffers
Swap:  3850236k total,   817960k used,  3032276k free,   146796k cached

Here 1.4% I/O wait is fine. But if the I/O wait is high, we need to find out what is causing it.

Now run the iostat command with extended statistics, refreshing every second.

[root@main ~]# iostat  -x 1
Linux 2.6.32-642.3.1.el6.x86_64 (main)     01/16/2017     _x86_64_    (6 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.39    0.00    0.54    0.80    0.00   98.27

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               1.69     6.66    0.28    1.06    16.99    61.73    58.92     0.03   19.46    0.82   24.29   9.81   1.31
sdb               0.00   142.17    4.73   16.41  1208.15  1268.67   117.15     0.02    0.85    0.20    1.03   1.35   2.85
dm-0              0.00     0.00    0.03    5.49     1.51    43.92     8.23     0.15   26.35    6.36   26.46   2.28   1.26
dm-1              0.00     0.00    1.94    2.23    15.49    17.80     8.00     0.07   17.13    0.38   31.70   0.12   0.05

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.17    0.00    0.33    0.00    0.00   99.50

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00    19.00    0.00    3.00     0.00   176.00    58.67     0.06   20.67    0.00   20.67  13.33   4.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00   22.00     0.00   176.00     8.00     0.39   17.77    0.00   17.77   1.82   4.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
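
A high %util together with large await values indicates that the device is saturated. To find out which process is actually generating the I/O, tools such as pidstat (part of the same sysstat package that provides iostat) or iotop can be used. The commands below are a general suggestion and were not run as part of this example:

# per-process disk read/write statistics, refreshed every second
pidstat -d 1

# show only the processes currently doing I/O (needs the iotop package)
iotop -o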

Thursday, January 19, 2017

Installation, usage and understanding of the nicstat command

A lot of times we need to check whether it is the network that is causing the bottleneck. There are several tools, like iftop, to check this; nicstat is one of the nicer tools for checking network performance. In this post we will see how to install and use the nicstat command. This example was executed on a CentOS Linux system.

Download the nicstat tar file.
wget https://downloads.sourceforge.net/project/nicstat/nicstat-1.95.tar.gz

Untar the downloaded file.
tar xvzf nicstat-1.95.tar.gz

Go inside the extracted nicstat directory.
cd nicstat-1.95

Check the Readme file for the instructions.
cat README.txt

Compile
mv Makefile.Linux Makefile
make

If your system is 64-bit, you may get the following error, because the Makefile builds a 32-bit binary (-m32) and the 32-bit glibc headers are not installed.
[root@sysadmin nicstat-1.95]# make
gcc -O3 -m32    nicstat.c   -o nicstat
In file included from /usr/include/features.h:385,
                 from /usr/include/stdio.h:28,
                 from nicstat.c:33:
/usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or directory
make: *** [nicstat] Error 1
[root@sysadmin nicstat-1.95]#

To fix this, install the 32-bit glibc development packages.
yum -y install glibc-devel.i686 glibc-devel

On some systems, even after installing glibc-devel, the following linker error may appear.

 [root@main nicstat-1.95]# make
gcc -O3 -m32    nicstat.c   -o nicstat
/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/4.4.7/libgcc_s.so when searching for -lgcc_s
/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/4.4.7/libgcc_s.so when searching for -lgcc_s
/usr/bin/ld: cannot find -lgcc_s
collect2: ld returned 1 exit status
make: *** [nicstat] Error 1
[root@main nicstat-1.95]

Install the following package to resolve this.
yum install libstdc++-devel.i686
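
Alternatively, if you prefer to avoid the 32-bit dependencies altogether, you can build a native 64-bit binary by stripping the -m32 flag out of the Makefile. This is a workaround I am assuming works from the generic Makefile layout, not a step documented in the README, so keep a copy of the original file:

cp Makefile Makefile.orig      # keep a backup of the shipped Makefile
sed -i 's/-m32//g' Makefile    # remove the 32-bit flag wherever it appears
make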

Once all the required packages are installed, compile and install.
[root@sysadmin nicstat-1.95]# make
gcc -O3 -m32    nicstat.c   -o nicstat
mv nicstat `./nicstat.sh --bin-name`
[root@sysadmin nicstat-1.95]# make install
gcc -O3 -m32    nicstat.c   -o nicstat
sudo install -o root -g root -m 4511 `./nicstat.sh --bin-name` /usr/local/bin/nicstat
sudo install -o bin -g bin -m 555 enicstat /usr/local/bin
sudo install -o bin -g bin -m 444 nicstat.1 /usr/local/share/man/man1/nicstat.1


Now we can run the command.
[root@sysadmin nicstat-1.95]# nicstat
    Time      Int   rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs %Util    Sat
14:46:58       lo    0.00    0.00    0.00    0.00   80.58   80.58  0.00   0.00
14:46:58     eth0   13.27    0.83   13.95    8.32   974.1   102.3  0.12   0.00

To see the stats of a particular Ethernet interface, use -i.
[root@sysadmin nicstat-1.95]# nicstat -i eth0
    Time      Int   rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs %Util    Sat
14:47:05     eth0   13.26    0.83   13.94    8.31   973.9   102.2  0.12   0.00
[root@sysadmin nicstat-1.95]#


Use -x for extended output.
[root@main nicstat-1.95]# nicstat -i eth2 -x
15:00:28      RdKB    WrKB   RdPkt   WrPkt   IErr  OErr  Coll  NoCP Defer  %Util
eth2        2078.2  2289.4  2413.0  1868.8   0.00  0.00  0.00  0.00  0.00   2.54

To get a summary of read and write throughput, use -s.
[root@main nicstat-1.95]# nicstat -i eth2 -s
    Time      Int          rKB/s          wKB/s
15:01:13     eth2       2078.175       2289.413

To repeat the summary every 2 seconds, 5 times.
[root@main nicstat-1.95]# nicstat -i eth2 -s 2 5
    Time      Int          rKB/s          wKB/s
15:01:36     eth2       2078.173       2289.411
15:01:38     eth2       1088.889       1641.253
15:01:41     eth2       1065.228       1353.393
15:01:43     eth2        943.999        952.306
15:01:44     eth2        855.518       1361.279



To show TCP statistics

[root@main nicstat-1.95]# nicstat -t
15:09:12    InKB   OutKB   InSeg  OutSeg Reset  AttF %ReTX InConn OutCon Drops
TCP         0.00    0.00  1252.8   980.8  0.05  0.28 0.000   12.4   0.52  0.00

To show UDP statistics
[root@main nicstat-1.95]# nicstat -u
15:09:48                    InDG   OutDG     InErr  OutErr
UDP                         0.59    0.59      0.00    0.00

Sunday, January 15, 2017

Distributing network controller interrupts across all CPUs

When we use a PCI network card, for example a 10G fiber NIC, it will have more than one channel (queue). Our server will also have many CPUs. If all the channels of the NIC interrupt only one CPU, there will be heavy load on that single CPU and it will degrade the performance of the network and the server as a whole.

The best way to handle this is to distribute the interrupts across all the CPUs available in the system. In this post we will see how to do it.

I have one PCI Fiber NIC, eth2.

[root@main scripts]# ifconfig
eth2      Link encap:Ethernet  HWaddr 14:02:EC:6C:5A:XX
          inet addr:172.16.0.30  Bcast:172.16.0.127  Mask:255.255.255.128
          inet6 addr: fe80::1602:ecff:fe6c:5a68/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:23985923006 errors:13425 dropped:0 overruns:0 frame:13425
          TX packets:18582827575 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:21135762796818 (19.2 TiB)  TX bytes:23298305448313 (21.1 TiB)

Initially, we can see that all the channels are interrupting a single CPU, CPU0.
[root@main ~]# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
 75: 1937793764          0          0          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-0
 76: 1033638701          0          0          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-1
 77:  242534932          0          0          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-2
 78:  100181128          0          0          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-3
 79:   64797499          0          0          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-4
 80:   50919167          0          0          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-5
 81:      15920          0          0          0          0          0  IR-PCI-MSI-edge      eth2

Now we run the set_irq_affinity script with eth2 as the argument. You can get this script from Intel's website; it is shipped with their NIC driver source packages.
[root@main scripts]# ./set_irq_affinity eth2
IFACE CORE MASK -> FILE
=======================
eth2 0 1 -> /proc/irq/75/smp_affinity
eth2 1 2 -> /proc/irq/76/smp_affinity
eth2 2 4 -> /proc/irq/77/smp_affinity
eth2 3 8 -> /proc/irq/78/smp_affinity
eth2 4 10 -> /proc/irq/79/smp_affinity
eth2 5 20 -> /proc/irq/80/smp_affinity
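
The MASK column is a hexadecimal CPU bitmask that the script writes into each queue's /proc/irq/<IRQ>/smp_affinity file, so 1 = CPU0, 2 = CPU1, 4 = CPU2, 8 = CPU3, 10 = CPU4 and 20 = CPU5. If you do not have the script available, the same result can be achieved manually; the following is a sketch using the IRQ numbers of this particular system:

echo 1 > /proc/irq/75/smp_affinity    # eth2-TxRx-0 -> CPU0
echo 2 > /proc/irq/76/smp_affinity    # eth2-TxRx-1 -> CPU1
echo 4 > /proc/irq/77/smp_affinity    # eth2-TxRx-2 -> CPU2
echo 8 > /proc/irq/78/smp_affinity    # eth2-TxRx-3 -> CPU3
echo 10 > /proc/irq/79/smp_affinity   # eth2-TxRx-4 -> CPU4
echo 20 > /proc/irq/80/smp_affinity   # eth2-TxRx-5 -> CPU5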

Now we can see that new interrupts are being distributed across the CPUs.
[root@main scripts]# cat /proc/interrupts | grep eth2
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
 75: 1938401156          0          0          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-0
 76: 1033645576        416          0          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-1
 77:  242536923          0         45          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-2
 78:  100182754          0          0         36          0          0  IR-PCI-MSI-edge      eth2-TxRx-3
 79:   64799086          0          0          0         38          0  IR-PCI-MSI-edge      eth2-TxRx-4
 80:   50920844          0          0          0          0         41  IR-PCI-MSI-edge      eth2-TxRx-5
 81:      15920          0          0          0          0          0  IR-PCI-MSI-edge      eth2
[root@main scripts]# cat /proc/interrupts | grep eth2
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
 75: 1938413752          0          0          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-0
 76: 1033645576        597          0          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-1
 77:  242536923          0       1358          0          0          0  IR-PCI-MSI-edge      eth2-TxRx-2
 78:  100182754          0          0         78          0          0  IR-PCI-MSI-edge      eth2-TxRx-3
 79:   64799086          0          0          0         70          0  IR-PCI-MSI-edge      eth2-TxRx-4
 80:   50920844          0          0          0          0        115  IR-PCI-MSI-edge      eth2-TxRx-5
 81:      15920          0          0          0          0          0  IR-PCI-MSI-edge      eth2
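
Two caveats worth noting (general observations, not something covered above): the irqbalance daemon can overwrite manually set affinities, so it is usually stopped when queues are pinned by hand, and smp_affinity settings do not survive a reboot, so the script has to be re-run at boot time, for example from /etc/rc.local:

service irqbalance stop            # stop the daemon on the running system
chkconfig irqbalance off           # keep it from starting at boot (CentOS 6 style)
echo '/path/to/set_irq_affinity eth2' >> /etc/rc.local   # adjust the path to wherever you keep the script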