Saturday, 16 May 2020

How to test your system infrastructure - Part 1

In this post I want to document various ways to test parts of an application infrastructure: how to check CPU usage, how fast disk I/O is, and how fast the network is between two nodes. All tests assume a Linux-based infrastructure.


CPU:

On Linux the easiest way to check how much CPU is being used is the top command.
top is interactive: pressing 1 while it is running toggles a per-core breakdown of CPU usage.
top can also be run in non-interactive (batch) mode with -b, where -n 1 limits it to a single iteration:

sherif@fingolfin:~$ top -b -n 1 |head
top - 14:22:23 up 52 min,  1 user,  load average: 0,29, 0,12, 0,06
Tasks: 175 total,   1 running, 129 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0,5 us,  0,3 sy,  0,0 ni, 98,9 id,  0,2 wa,  0,0 hi,  0,1 si,  0,0 st
KiB Mem :  6072348 total,  4880976 free,   434896 used,   756476 buff/cache
KiB Swap:  2097148 total,  2097148 free,        0 used.  5398192 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2063 root      20   0  420504  94464  33488 S   2,3  1,6   0:25.45 Xorg
    1 root      20   0  225232   9024   6748 S   0,0  0,1   0:02.70 systemd
    2 root      20   0       0      0      0 S   0,0  0,0   0:00.00 kthreadd
sherif@fingolfin:~$
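
The batch output is easy to post-process in a script. Below is a minimal sketch in Python, assuming the %Cpu(s) line has the layout shown above. The function name parse_top_cpu is made up for this example, and the decimal comma comes from the locale of this machine, so the parser accepts both "," and ".".

```python
import re

def parse_top_cpu(line):
    """Turn a top '%Cpu(s):' summary line into {field: percent}."""
    # Accept both "98,9" (comma locale, as above) and "98.9".
    pairs = re.findall(r"(\d+[.,]\d+)\s+([a-z]{2})", line)
    return {name: float(value.replace(",", ".")) for value, name in pairs}

# Summary line copied from the top output above.
sample = "%Cpu(s):  0,5 us,  0,3 sy,  0,0 ni, 98,9 id,  0,2 wa,  0,0 hi,  0,1 si,  0,0 st"
cpu = parse_top_cpu(sample)
busy = round(100.0 - cpu["id"], 1)  # everything that is not idle
print(cpu["id"], busy)
```

Other top versions may format this line differently, so the regular expression should be treated as a starting point rather than a general parser.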

Another way to report CPU usage is the iostat command (part of the sysstat package). Run without an interval, it prints averages since boot:

[root@feanor ~]# iostat -c
Linux 3.10.0-862.el7.x86_64 (feanor)    05/16/2020      _x86_64_        (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.23    0.00    0.28    0.07    0.00   99.42

[root@feanor ~]#
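
Both top and iostat derive these percentages from the cumulative tick counters on the cpu line of /proc/stat. The sketch below shows the calculation between two snapshots; the function name cpu_percentages is hypothetical and the two sample lines are invented for illustration, not measured on the hosts above.

```python
def cpu_percentages(before, after):
    """Percent of time per state between two 'cpu' lines of /proc/stat."""
    names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]
    old = [int(x) for x in before.split()[1:9]]
    new = [int(x) for x in after.split()[1:9]]
    deltas = [n - o for n, o in zip(new, old)]
    total = sum(deltas)
    return {name: 100.0 * d / total for name, d in zip(names, deltas)}

# Hypothetical snapshots taken a few seconds apart.
before = "cpu  1000 0 500 98000 100 0 50 0 0 0"
after = "cpu  1050 0 520 98900 110 0 70 0 0 0"
usage = cpu_percentages(before, after)
print(round(usage["idle"], 1), round(usage["user"], 1))
```

On a live system the two strings would come from reading the first line of /proc/stat twice with a short sleep in between.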

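Another way to benchmark CPU performance is the sysbench package, as below. Newer sysbench versions take the test name directly (sysbench cpu --threads=6 run); the --test form used here still works but prints a deprecation warning: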

sherif@fingolfin:~$ time sysbench --test=cpu --threads=6 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
sysbench 1.0.11 (using system LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 6
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
    events per second:  3886.37
General statistics:
    total time:                          10.0011s
    total number of events:              38872
Latency (ms):
         min:                                  0.62
         avg:                                  1.54
         max:                                 29.13
         95th percentile:                      8.74
         sum:                              59820.54
Threads fairness:
    events (avg/stddev):           6478.6667/112.54
    execution time (avg/stddev):   9.9701/0.03
real    0m10,014s
user    0m29,940s
sys    0m0,012s
sherif@fingolfin:~$

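In the cpu test each thread repeatedly computes the prime numbers up to the limit (10000 here), and sysbench reports the event rate and the latency per event, so the output shows what latency to expect when running multiple threads on this system.
More information about the sysbench tool can be found on this page: https://linuxconfig.org/how-to-benchmark-your-linux-system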
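
The numbers in the report can be cross-checked against each other. The small Python sketch below uses only values printed in the run above; the interpretation in the comments is my reading of the output, not something sysbench states.

```python
events = 38872                    # total number of events
latency_sum_ms = 59820.54         # "sum" under Latency (ms)
threads = 6                       # --threads=6
user_s, real_s = 29.940, 10.014   # from the `time` output

avg_latency_ms = latency_sum_ms / events   # should match the reported avg
events_per_thread = events / threads       # should match "events (avg/stddev)"
busy_cores = user_s / real_s               # CPU seconds consumed per wall-clock second

print(round(avg_latency_ms, 2))
print(round(events_per_thread, 4))
print(round(busy_cores, 2))
```

The first two values reproduce the 1.54 ms average latency and the 6478.6667 events per thread in the report. The third comes out at about 3, which suggests the six threads shared roughly three cores' worth of CPU time; that would also explain why the 95th percentile latency is so much higher than the minimum.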

Memory:


To measure how fast our system memory works, we can use the small tool mbw from: https://github.com/raas/mbw.
The tool measures memory bandwidth from user space, similar to what a standard application would see.
To compile the code on CentOS we follow the steps below:

[root@feanor ~]# git clone https://github.com/raas/mbw
Cloning into 'mbw'...
remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 89 (delta 0), reused 1 (delta 0), pack-reused 85
Unpacking objects: 100% (89/89), done.
[root@feanor ~]# cd mbw
[root@feanor mbw]# ls -ltr
total 28
-rw-r--r--. 1 root root  423 May 16 16:12 README
-rw-r--r--. 1 root root  232 May 16 16:12 Makefile
-rw-r--r--. 1 root root 1255 May 16 16:12 mbw.1
-rw-r--r--. 1 root root 1640 May 16 16:12 mbw.spec
-rw-r--r--. 1 root root 8538 May 16 16:12 mbw.c
[root@feanor mbw]# make
cc     mbw.c   -o mbw
[root@feanor mbw]# ./mbw 512
Long uses 8 bytes. Allocating 2*67108864 elements = 1073741824 bytes of memory.
Using 262144 bytes as blocks for memcpy block copy test.
Getting down to business... Doing 10 runs per test.
0       Method: MEMCPY  Elapsed: 0.08889        MiB: 512.00000  Copy: 5760.122 MiB/s
1       Method: MEMCPY  Elapsed: 0.09538        MiB: 512.00000  Copy: 5368.283 MiB/s
2       Method: MEMCPY  Elapsed: 0.09289        MiB: 512.00000  Copy: 5512.133 MiB/s
3       Method: MEMCPY  Elapsed: 0.09756        MiB: 512.00000  Copy: 5247.891 MiB/s
4       Method: MEMCPY  Elapsed: 0.09414        MiB: 512.00000  Copy: 5438.593 MiB/s
5       Method: MEMCPY  Elapsed: 0.08911        MiB: 512.00000  Copy: 5745.450 MiB/s
6       Method: MEMCPY  Elapsed: 0.08720        MiB: 512.00000  Copy: 5871.627 MiB/s
7       Method: MEMCPY  Elapsed: 0.09688        MiB: 512.00000  Copy: 5284.616 MiB/s
8       Method: MEMCPY  Elapsed: 0.09409        MiB: 512.00000  Copy: 5441.598 MiB/s
9       Method: MEMCPY  Elapsed: 0.09243        MiB: 512.00000  Copy: 5539.087 MiB/s
AVG     Method: MEMCPY  Elapsed: 0.09286        MiB: 512.00000  Copy: 5513.825 MiB/s
0       Method: DUMB    Elapsed: 0.25512        MiB: 512.00000  Copy: 2006.875 MiB/s
1       Method: DUMB    Elapsed: 0.23047        MiB: 512.00000  Copy: 2221.528 MiB/s
2       Method: DUMB    Elapsed: 0.22259        MiB: 512.00000  Copy: 2300.245 MiB/s
3       Method: DUMB    Elapsed: 0.23621        MiB: 512.00000  Copy: 2167.544 MiB/s
4       Method: DUMB    Elapsed: 0.21707        MiB: 512.00000  Copy: 2358.697 MiB/s
5       Method: DUMB    Elapsed: 0.22799        MiB: 512.00000  Copy: 2245.742 MiB/s
6       Method: DUMB    Elapsed: 0.22476        MiB: 512.00000  Copy: 2277.965 MiB/s
7       Method: DUMB    Elapsed: 0.22205        MiB: 512.00000  Copy: 2305.777 MiB/s
8       Method: DUMB    Elapsed: 0.22730        MiB: 512.00000  Copy: 2252.490 MiB/s
9       Method: DUMB    Elapsed: 0.22879        MiB: 512.00000  Copy: 2237.899 MiB/s
AVG     Method: DUMB    Elapsed: 0.22924        MiB: 512.00000  Copy: 2233.515 MiB/s
0       Method: MCBLOCK Elapsed: 0.09570        MiB: 512.00000  Copy: 5350.052 MiB/s
1       Method: MCBLOCK Elapsed: 0.10106        MiB: 512.00000  Copy: 5066.197 MiB/s
2       Method: MCBLOCK Elapsed: 0.09312        MiB: 512.00000  Copy: 5498.459 MiB/s
3       Method: MCBLOCK Elapsed: 0.09769        MiB: 512.00000  Copy: 5240.961 MiB/s
4       Method: MCBLOCK Elapsed: 0.09894        MiB: 512.00000  Copy: 5174.958 MiB/s
5       Method: MCBLOCK Elapsed: 0.09634        MiB: 512.00000  Copy: 5314.456 MiB/s
6       Method: MCBLOCK Elapsed: 0.09780        MiB: 512.00000  Copy: 5235.388 MiB/s
7       Method: MCBLOCK Elapsed: 0.09487        MiB: 512.00000  Copy: 5397.086 MiB/s
8       Method: MCBLOCK Elapsed: 0.09828        MiB: 512.00000  Copy: 5209.446 MiB/s
9       Method: MCBLOCK Elapsed: 0.09942        MiB: 512.00000  Copy: 5149.973 MiB/s
AVG     Method: MCBLOCK Elapsed: 0.09732        MiB: 512.00000  Copy: 5260.924 MiB/s
[root@feanor mbw]#
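
For a rough idea of what such a user-space copy test does, here is a sketch in Python. It is not how mbw is implemented; it only times a bulk buffer-to-buffer copy inside the interpreter, and the function name copy_bandwidth_mib_s and the buffer size are arbitrary choices for this example. The resulting number is not directly comparable to the mbw figures above.

```python
import time

def copy_bandwidth_mib_s(size_mib, runs=3):
    """Time a bulk buffer copy and return the average MiB/s over all runs."""
    size = size_mib * 1024 * 1024
    src = bytearray(size)   # source buffer, zero-filled
    dst = bytearray(size)   # destination buffer of the same size
    total_elapsed = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        dst[:] = src        # one bulk copy of the whole buffer
        total_elapsed += time.perf_counter() - start
    return size_mib * runs / total_elapsed

print(f"{copy_bandwidth_mib_s(32):.0f} MiB/s")
```

As with mbw, two buffers of the requested size are allocated, so the memory needed is twice the size passed in.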

mbw is also available as a Debian package.
One interesting test is to ask mbw for more memory than the machine has. The system above has only 4 GB of memory, and ./mbw 2048 allocates two 2 GiB arrays (4 GiB in total), which forces part of the process out to swap. We can see this in several ways. First, the bandwidth is roughly two orders of magnitude lower:

[root@feanor mbw]# ./mbw 2048
Long uses 8 bytes. Allocating 2*268435456 elements = 4294967296 bytes of memory.
Using 262144 bytes as blocks for memcpy block copy test.
Getting down to business... Doing 10 runs per test.
0       Method: MEMCPY  Elapsed: 26.14064       MiB: 2048.00000 Copy: 78.345 MiB/s
1       Method: MEMCPY  Elapsed: 42.49331       MiB: 2048.00000 Copy: 48.196 MiB/s
2       Method: MEMCPY  Elapsed: 18.70199       MiB: 2048.00000 Copy: 109.507 MiB/s
3       Method: MEMCPY  Elapsed: 55.37665       MiB: 2048.00000 Copy: 36.983 MiB/s
4       Method: MEMCPY  Elapsed: 35.01051       MiB: 2048.00000 Copy: 58.497 MiB/s
5       Method: MEMCPY  Elapsed: 20.52362       MiB: 2048.00000 Copy: 99.787 MiB/s
6       Method: MEMCPY  Elapsed: 21.93620       MiB: 2048.00000 Copy: 93.362 MiB/s
7       Method: MEMCPY  Elapsed: 37.51056       MiB: 2048.00000 Copy: 54.598 MiB/s
8       Method: MEMCPY  Elapsed: 28.07473       MiB: 2048.00000 Copy: 72.948 MiB/s
9       Method: MEMCPY  Elapsed: 14.76706       MiB: 2048.00000 Copy: 138.687 MiB/s
AVG     Method: MEMCPY  Elapsed: 30.05353       MiB: 2048.00000 Copy: 68.145 MiB/s
0       Method: DUMB    Elapsed: 11.23370       MiB: 2048.00000 Copy: 182.309 MiB/s
1       Method: DUMB    Elapsed: 10.76112       MiB: 2048.00000 Copy: 190.315 MiB/s
2       Method: DUMB    Elapsed: 15.99955       MiB: 2048.00000 Copy: 128.004 MiB/s
3       Method: DUMB    Elapsed: 23.18597       MiB: 2048.00000 Copy: 88.329 MiB/s
4       Method: DUMB    Elapsed: 28.14035       MiB: 2048.00000 Copy: 72.778 MiB/s
5       Method: DUMB    Elapsed: 31.18035       MiB: 2048.00000 Copy: 65.682 MiB/s
6       Method: DUMB    Elapsed: 31.02135       MiB: 2048.00000 Copy: 66.019 MiB/s
7       Method: DUMB    Elapsed: 36.10925       MiB: 2048.00000 Copy: 56.717 MiB/s
8       Method: DUMB    Elapsed: 51.37134       MiB: 2048.00000 Copy: 39.867 MiB/s
9       Method: DUMB    Elapsed: 60.84004       MiB: 2048.00000 Copy: 33.662 MiB/s
AVG     Method: DUMB    Elapsed: 29.98430       MiB: 2048.00000 Copy: 68.302 MiB/s
0       Method: MCBLOCK Elapsed: 67.50246       MiB: 2048.00000 Copy: 30.340 MiB/s
1       Method: MCBLOCK Elapsed: 74.09162       MiB: 2048.00000 Copy: 27.641 MiB/s
2       Method: MCBLOCK Elapsed: 77.48624       MiB: 2048.00000 Copy: 26.430 MiB/s
3       Method: MCBLOCK Elapsed: 75.32009       MiB: 2048.00000 Copy: 27.191 MiB/s
4       Method: MCBLOCK Elapsed: 94.43207       MiB: 2048.00000 Copy: 21.688 MiB/s
5       Method: MCBLOCK Elapsed: 96.87246       MiB: 2048.00000 Copy: 21.141 MiB/s
6       Method: MCBLOCK Elapsed: 102.09089      MiB: 2048.00000 Copy: 20.061 MiB/s
7       Method: MCBLOCK Elapsed: 95.71384       MiB: 2048.00000 Copy: 21.397 MiB/s
8       Method: MCBLOCK Elapsed: 89.24437       MiB: 2048.00000 Copy: 22.948 MiB/s
9       Method: MCBLOCK Elapsed: 103.73286      MiB: 2048.00000 Copy: 19.743 MiB/s
AVG     Method: MCBLOCK Elapsed: 87.64869       MiB: 2048.00000 Copy: 23.366 MiB/s
[root@feanor mbw]#
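
The arithmetic behind this can be checked quickly. The sketch below assumes, based on the "Allocating 2*268435456 elements" line, that mbw allocates two arrays of the requested size, and it uses the MemTotal value that /proc/meminfo reports for this host (4043552 kB). The helper name mbw_allocation_bytes is made up for this example.

```python
MIB = 1024 * 1024

def mbw_allocation_bytes(array_mib):
    """Two arrays of the requested size: one source, one destination."""
    return 2 * array_mib * MIB

mem_total_bytes = 4043552 * 1024        # MemTotal in kB, as reported on feanor
needed = mbw_allocation_bytes(2048)     # ./mbw 2048
shortfall_mib = (needed - mem_total_bytes) / MIB

print(needed)                    # bytes requested by mbw
print(round(shortfall_mib, 1))   # MiB that cannot fit in RAM at all
```

The request matches the 4294967296 bytes printed by mbw and exceeds the total RAM by about 147 MiB, before counting the memory already used by the kernel and other processes, so swapping is unavoidable.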

Using the iotop tool we can see the swap I/O caused by mbw, and using the smem tool we can see how much of mbw has been swapped out:

[root@feanor mbw]# smem |head -1; smem|grep mbw
  PID User     Command                         Swap      USS      PSS      RSS
 4144 root     grep --color=auto mbw              0      140      326      704
 4005 root     ./mbw 2048                    584004  3610400  3610400  3610408
[root@feanor mbw]#
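
If smem is not installed, the kernel also exposes the swapped size of a process as the VmSwap field in /proc/<pid>/status. A minimal sketch for reading it in Python; the function name swapped_kib is hypothetical, and the sample text is a made-up excerpt that reuses the numbers from the smem row above.

```python
def swapped_kib(status_text):
    """Return the VmSwap value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            return int(line.split()[1])
    return None  # no VmSwap line, e.g. for kernel threads

# Hypothetical excerpt of a status file, for illustration only.
sample = "Name:\tmbw\nVmRSS:\t 3610408 kB\nVmSwap:\t  584004 kB\n"
print(swapped_kib(sample))
```

On a live system the text would come from open(f"/proc/{pid}/status").read() for the process ID of interest.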


To collect system-wide memory information, we can use the top command or vmstat:

[root@feanor mbw]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  2 2046444 112560      0 125508 8596 5262 13055  5449 2268  827  1  9 76 14  0

[root@feanor mbw]# vmstat -s
      4043552 K total memory
      3824396 K used memory
      2936656 K active memory
       782444 K inactive memory
       107972 K free memory
            0 K buffer memory
       111184 K swap cache
      4063228 K total swap
      2026724 K used swap
      2036504 K free swap
        16716 non-nice user cpu ticks
           20 nice user cpu ticks
        89870 system cpu ticks
       926212 idle cpu ticks
       171799 IO-wait cpu ticks
            0 IRQ cpu ticks
        21806 softirq cpu ticks
            0 stolen cpu ticks
    160271685 pages paged in
     66888866 pages paged out
     26373088 pages swapped in
     16147227 pages swapped out
     27846320 interrupts
     10151325 CPU context switches
   1589638134 boot time
         4248 forks
[root@feanor mbw]#
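
The vmstat -s format is simple enough to parse when these counters need to be tracked over time. A sketch in Python, assuming the line layout shown above; parse_vmstat_s is a name invented for this example and the sample is a subset of the output above.

```python
def parse_vmstat_s(text):
    """Map each label of `vmstat -s` output to its integer value."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        # Memory lines carry a "K" unit before the label; counters do not.
        label = parts[2:] if parts[1] == "K" else parts[1:]
        stats[" ".join(label)] = int(parts[0])
    return stats

sample = """\
      4063228 K total swap
      2026724 K used swap
     26373088 pages swapped in
     16147227 pages swapped out
"""
stats = parse_vmstat_s(sample)
swap_used_pct = round(100.0 * stats["used swap"] / stats["total swap"], 1)
print(swap_used_pct)
```

For the run above this gives about half of the swap space in use while mbw was running.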

The vmstat tool reads its memory figures from the kernel file /proc/meminfo, which contains more detail about system memory usage, as can be seen below:

[root@feanor mbw]# cat /proc/meminfo
MemTotal:        4043552 kB
MemFree:         3622752 kB
MemAvailable:    3570904 kB
Buffers:               0 kB
Cached:           110172 kB
SwapCached:        52480 kB
Active:            66004 kB
Inactive:         146504 kB
Active(anon):      49768 kB
Inactive(anon):    61024 kB
Active(file):      16236 kB
Inactive(file):    85480 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       4063228 kB
SwapFree:        3716148 kB
Dirty:                20 kB
Writeback:             0 kB
AnonPages:         64504 kB
Mapped:            24988 kB
Shmem:              8356 kB
Slab:              79168 kB
SReclaimable:      32396 kB
SUnreclaim:        46772 kB
KernelStack:        6752 kB
PageTables:        32416 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     6085004 kB
Committed_AS:    2550352 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      105404 kB
VmallocChunk:   34359537660 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      131008 kB
DirectMap2M:     4063232 kB
[root@feanor mbw]#
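
Since /proc/meminfo is plain text, it can also be read directly from a script. A minimal sketch in Python; the function name parse_meminfo is made up for this example and the sample is a subset of the output above.

```python
def parse_meminfo(text):
    """Return /proc/meminfo as {field: value}; most values are in kB."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if rest.strip():
            info[name.strip()] = int(rest.split()[0])
    return info

sample = """\
MemTotal:        4043552 kB
MemAvailable:    3570904 kB
SwapTotal:       4063228 kB
SwapFree:        3716148 kB
HugePages_Total:       0
"""
info = parse_meminfo(sample)
swap_used_mib = (info["SwapTotal"] - info["SwapFree"]) / 1024
print(round(swap_used_mib))
```

On a live system the input would be open("/proc/meminfo").read(). Note that the HugePages_* counters are page counts, not kB.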

Another nice small tool for reporting memory usage on a Linux system is free:

[root@feanor mbw]# free -h
              total        used        free      shared  buff/cache   available
Mem:           3.9G        235M        3.4G        8.7M        230M        3.4G
Swap:          3.9G        335M        3.5G
[root@feanor mbw]#


Here the output is similar to the summary lines of top: we can see how much swap is used, how much memory Linux is using for buffers and disk caching, and how much memory is available to applications.


