Sunday, 13 September 2020

Using Docker registry

One of the easiest ways to deploy an application is to ship it as a Docker container.
The application build process can include a step that builds a Docker image for the target app.
The image then needs to be pushed to a Docker registry so that the deployment step can pull it and deploy the application.
For the pipeline to work, one needs a Docker registry to host the images, either local or public.
Docker Hub offers a cloud-hosted registry, which is very flexible and always available, but requires your systems to have internet access.
I tested both pushing to the Docker Hub registry and building a simple local registry.

Running a local Docker registry is simple; it's just another Docker container.
sherif@Luthien:~$ cat docker_registry.sh 
 docker run -d \
  -p 5100:5000 \
  --restart=always \
  --name luthien_registry \
  -v /registry:/var/lib/registry \
  registry:2
In the above script, I am mapping the default registry port 5000 on the container to port 5100 on the host machine.
To verify if the registry is running, we use the docker ps or docker container ps command to check:
sherif@Luthien:~$ docker container ps
CONTAINER ID        IMAGE                        COMMAND                  CREATED              STATUS              PORTS                         NAMES
26009ca9e4f4        registry:2                   "/entrypoint.sh /etc…"   About a minute ago   Up About a minute   0.0.0.0:5100->5000/tcp        luthien_registry
sherif@Luthien:~$ 
Next we need to build a test image to test the new registry. To do this we use a Dockerfile:
sherif@Luthien:~$ cat Dockerfile 
FROM busybox
CMD echo "Hello world! This is my first Docker image."
sherif@Luthien:~$
Then we use docker build to create the image locally:
sherif@Luthien:~$ docker build -f ./Dockerfile -t busybox_hello:1.0 ./ 
Sending build context to Docker daemon  3.432GB
Step 1/2 : FROM busybox
 ---> 6858809bf669
Step 2/2 : CMD echo "Hello world! This is my first Docker image."
 ---> Running in 3a7a187ab5ba
Removing intermediate container 3a7a187ab5ba
 ---> 63213a968c8e
Successfully built 63213a968c8e
Successfully tagged busybox_hello:1.0

sherif@Luthien:~$ docker image ls -a
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
busybox_hello       1.0                 7415dea3e476        22 seconds ago      1.23MB
busybox             latest              6858809bf669        3 days ago          1.23MB
sherif@Luthien:~$
Then we tag the local image with the name the registry expects, host:port/repository:tag, so that it can be pushed to the registry.
sherif@Luthien:~$ docker tag busybox_hello:1.0 luthien:5100/busybox_hello:1.0 

sherif@Luthien:~$ docker image ls -a
REPOSITORY                   TAG                 IMAGE ID            CREATED             SIZE
busybox_hello                1.0                 7415dea3e476        23 minutes ago      1.23MB
luthien:5100/busybox_hello   1.0                 7415dea3e476        23 minutes ago      1.23MB
busybox                      latest              6858809bf669        3 days ago          1.23MB
sherif@Luthien:~$
Then we push the locally tagged image to the registry using docker push:
sherif@Luthien:~$ docker push luthien:5100/busybox_hello:1.0
The push refers to repository [luthien:5100/busybox_hello]
be8b8b42328a: Pushed 
1.0: digest: sha256:d7c348330e06aa13c1dc766590aebe0d75e95291993dd26710b6bbdd671b30d1 size: 527
sherif@Luthien:~$
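The tag and push steps can be wrapped in a small helper. This is only a sketch: the function names image_ref and publish_cmds are made up for illustration, the helper assumes the local image name carries an explicit tag, and it only prints the docker commands instead of running them.

```shell
#!/bin/sh
# Compose the reference Docker expects for a private registry:
#   <registry-host>:<registry-port>/<repository>:<tag>
image_ref() {
    # $1 = registry host, $2 = registry port, $3 = repository, $4 = tag
    printf '%s:%s/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Print the tag and push commands for a locally built image.
# Nothing is executed here; pipe the output to sh to run it.
publish_cmds() {
    # $1 = local image (name:tag), $2 = registry host, $3 = registry port
    repo=${1%%:*}
    tag=${1##*:}
    ref=$(image_ref "$2" "$3" "$repo" "$tag")
    printf 'docker tag %s %s\n' "$1" "$ref"
    printf 'docker push %s\n' "$ref"
}

publish_cmds busybox_hello:1.0 luthien 5100
```

Printing the commands first makes it easy to review the final image name before anything is pushed.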
To confirm that the image was pushed, we use the Docker registry REST API to query for our new image:
sherif@Luthien:~$ curl -LX GET http://luthien:5100/v2/_catalog
{"repositories":["busybox_hello"]}
sherif@Luthien:~$ curl -LX GET http://luthien:5100/v2/busybox_hello/tags/list
{"name":"busybox_hello","tags":["1.0"]}
sherif@Luthien:~$
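In a build pipeline the same check could be scripted. The sketch below uses a hypothetical helper, has_tag, that does plain text matching on the tags/list response; it is not a JSON parser and assumes the compact single-line body shown above.

```shell
#!/bin/sh
# Succeeds when the tags/list JSON body on stdin contains the given tag.
# Plain text matching only: assumes the compact one-line body shown above.
has_tag() {
    # $1 = tag to look for
    sed -n 's/.*"tags":\[\([^]]*\)].*/\1/p' | tr ',' '\n' | grep -qxF "\"$1\""
}

# Against the live registry from this post (not run here):
#   curl -s http://luthien:5100/v2/busybox_hello/tags/list | has_tag 1.0

sample='{"name":"busybox_hello","tags":["1.0"]}'
if printf '%s\n' "$sample" | has_tag 1.0; then
    echo "tag 1.0 is present"
fi
```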

If we want to push to an existing Docker Hub repository, we follow a simpler process.
We just need to build the image with the repository tag, do a docker login, then do a docker push.
This should push the image as a version tag into the existing repository.
sherif@Luthien:~$ docker build -t sfattah/sfattah_r:1.0 ./

sherif@Luthien:~$ docker login 
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username: sfattah
Password: 
Login Succeeded
sherif@Luthien:~$ docker push  sfattah/sfattah_r:1.0
The push refers to repository [docker.io/sfattah/sfattah_r]
be8b8b42328a: Mounted from library/busybox 
1.0: digest: sha256:d7c348330e06aa13c1dc766590aebe0d75e95291993dd26710b6bbdd671b30d1 size: 527
sherif@Luthien:~$

Sunday, 23 August 2020

Linux LVS with docker containers

I came across Linux LVS (Linux Virtual Server) while reading about load balancing and realized it's an old Linux concept that I had never tested.
LVS is a mechanism implemented by the Linux kernel to do layer-4 (transport layer) switching, mainly to achieve load distribution and high availability.
LVS is managed using a command-line tool: ipvsadm
I used Docker containers running Apache 2.4 to test the concept on a VirtualBox VM.
First, I needed to install Docker and create Docker volumes to store the Apache configuration and the content outside the containers:

root@fingolfin:~# docker volume create httpdcfg1
httpdcfg1
root@fingolfin:~# docker volume create httpdcfg2
httpdcfg2
root@fingolfin:~# docker volume create www2
www2
root@fingolfin:~# docker volume create www1
www1
root@fingolfin:~#

Once the volumes are created, we can start the containers:

root@fingolfin:~# docker run -d --name httpd1 -p 8180:80 -v www1:/usr/local/apache2/htdocs/ -v httpdcfg1:/usr/local/apache2/conf/ httpd:2.4
a6e11431a228498b8fc412dfcee6b0fc682ce241e79527fdf33e7ceb1945e54a
root@fingolfin:~#
root@fingolfin:~# docker run -d --name httpd2 -p 8280:80 -v www2:/usr/local/apache2/htdocs/ -v httpdcfg2:/usr/local/apache2/conf/ httpd:2.4
b40e29e187b0841d81b345ca975cd867bcce587be8b3f79e43a2ec0d1087aba8
root@fingolfin:~#

root@fingolfin:~# docker container ps
CONTAINER ID        IMAGE               COMMAND              CREATED             STATUS              PORTS                  NAMES
a665073770ec        httpd:2.4           "httpd-foreground"   41 minutes ago      Up 41 minutes       0.0.0.0:8280->80/tcp   httpd2
d1dc596f68a6        httpd:2.4           "httpd-foreground"   54 minutes ago      Up 45 minutes       0.0.0.0:8180->80/tcp   httpd1
root@fingolfin:~#
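With more than two nodes, typing these commands by hand gets tedious. Below is a sketch that prints them for N nodes, following the naming used above (volumes httpdcfgN and wwwN, container httpdN, host port 8N80). The function name node_cmds is made up, and nothing is executed until the output is piped to a shell.

```shell
#!/bin/sh
# Print the docker commands for N Apache nodes following the naming used
# above: volumes httpdcfgN and wwwN, container httpdN, host port 8N80.
# Nothing is executed; pipe the output to sh on the Docker host to run it.
node_cmds() {
    # $1 = number of nodes (1-9, so the port pattern stays valid)
    i=1
    while [ "$i" -le "$1" ]; do
        printf 'docker volume create httpdcfg%s\n' "$i"
        printf 'docker volume create www%s\n' "$i"
        printf 'docker run -d --name httpd%s -p 8%s80:80 -v www%s:/usr/local/apache2/htdocs/ -v httpdcfg%s:/usr/local/apache2/conf/ httpd:2.4\n' "$i" "$i" "$i" "$i"
        i=$((i + 1))
    done
}

node_cmds 2
```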

Then we need to change the index.html on the volumes www1 and www2 to show different nodes.
This can be done by directly accessing the file from the docker volume mount point:

root@fingolfin:~# docker volume inspect www1
[
    {
        "CreatedAt": "2020-08-23T16:59:50+02:00",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/www1/_data",
        "Name": "www1",
        "Options": {},
        "Scope": "local"
    }
]
root@fingolfin:~# cd /var/lib/docker/volumes/www1/_data/
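Writing the page can also be scripted. This is a minimal sketch with a hypothetical helper, write_node_page; the page text matches what curl returns later in this post, and the demo writes to a temporary directory rather than the real volume.

```shell
#!/bin/sh
# Write a one-line index.html that identifies the node.
write_node_page() {
    # $1 = directory backing the volume, $2 = node number
    printf 'It works! node %s\n' "$2" > "$1/index.html"
}

# On the Docker host the directory would come from the volume itself, e.g.
#   write_node_page "$(docker volume inspect -f '{{.Mountpoint}}' www1)" 1
# The demo below uses a temporary directory instead.
demo_dir=$(mktemp -d)
write_node_page "$demo_dir" 1
cat "$demo_dir/index.html"
rm -rf "$demo_dir"
```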

Next step is to obtain the docker container IP address using docker container inspect:

root@fingolfin:~# docker container inspect httpd1|grep '"IPAddress":'
            "IPAddress": "172.17.0.2",
                    "IPAddress": "172.17.0.2",
root@fingolfin:~#
root@fingolfin:~# docker container inspect httpd2|grep '"IPAddress":'
            "IPAddress": "172.17.0.3",
                    "IPAddress": "172.17.0.3",
root@fingolfin:~#
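The grep prints the address twice because it appears in two places in the inspect output. A small filter can reduce that to the distinct value; unique_ips is a made-up name, and the commented docker lines show how it would be used on the host, including the --format alternative that skips grep entirely.

```shell
#!/bin/sh
# Reduce the grep output shown above to the distinct address(es).
unique_ips() {
    sed -n 's/.*"IPAddress": *"\([0-9.][0-9.]*\)".*/\1/p' | sort -u
}

# On the Docker host (not run here) either of these would apply:
#   docker container inspect httpd1 | unique_ips
#   docker container inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' httpd1

printf '%s\n' \
    '            "IPAddress": "172.17.0.2",' \
    '                    "IPAddress": "172.17.0.2",' | unique_ips
```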

One more thing, we need to install ping inside the containers, just to make sure that we can troubleshoot network in case things didn't work.
To do this we use docker exec:
root@fingolfin:~# docker exec -it httpd1 bash
root@a6e11431a228:/usr/local/apache2#
root@a6e11431a228:/usr/local/apache2# apt update; apt install iputils-ping -y

Next we create a sub-interface IP address, often called a VIP (virtual IP), on the main Ethernet device of the Docker host. I will use an address on the 10.0.2.0 network, but any other IP address should work equally well.

root@fingolfin:~# ifconfig enp0s3:1 10.0.2.200 netmask 255.255.255.0 broadcast 10.0.2.255
root@fingolfin:~# ip addr show dev enp0s3
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:7e:6e:97 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
       valid_lft 74725sec preferred_lft 74725sec
    inet 172.17.60.200/16 brd 172.17.255.255 scope global enp0s3:0
       valid_lft forever preferred_lft forever
    inet 10.0.2.200/24 brd 10.0.2.255 scope global secondary enp0s3:1
       valid_lft forever preferred_lft forever
    inet6 fe80::a790:c580:9be5:55ef/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
root@fingolfin:~#
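On systems that ship without ifconfig, the same address can be added with the ip tool from iproute2. The sketch below converts the dotted netmask to the prefix length ip expects and prints the command; mask_to_prefix and vip_cmd are hypothetical helper names, and the converter does not check that the mask bits are contiguous.

```shell
#!/bin/sh
# Convert a dotted netmask such as 255.255.255.0 to a prefix length.
# Sketch only: does not verify that the mask bits are contiguous.
mask_to_prefix() {
    bits=0
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    for octet in "$@"; do
        case $octet in
            255) bits=$((bits + 8)) ;;
            254) bits=$((bits + 7)) ;;
            252) bits=$((bits + 6)) ;;
            248) bits=$((bits + 5)) ;;
            240) bits=$((bits + 4)) ;;
            224) bits=$((bits + 3)) ;;
            192) bits=$((bits + 2)) ;;
            128) bits=$((bits + 1)) ;;
            0) ;;
            *) echo "invalid netmask octet: $octet" >&2; return 1 ;;
        esac
    done
    echo "$bits"
}

# Print the iproute2 command that adds a labelled secondary address.
# "brd +" asks ip to derive the broadcast address from the prefix.
vip_cmd() {
    # $1 = address, $2 = dotted netmask, $3 = device, $4 = label
    prefix=$(mask_to_prefix "$2") || return 1
    printf 'ip addr add %s/%s brd + dev %s label %s\n' "$1" "$prefix" "$3" "$4"
}

vip_cmd 10.0.2.200 255.255.255.0 enp0s3 enp0s3:1
```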

We can then ping the IP of the host, 10.0.2.15, from within the Docker container to make sure the container can reach it:

root@a6e11431a228:/usr/local/apache2# ping 10.0.2.15
PING 10.0.2.15 (10.0.2.15) 56(84) bytes of data.
64 bytes from 10.0.2.15: icmp_seq=1 ttl=64 time=0.114 ms
64 bytes from 10.0.2.15: icmp_seq=2 ttl=64 time=0.127 ms
^C
--- 10.0.2.15 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 17ms
rtt min/avg/max/mdev = 0.114/0.120/0.127/0.012 ms
root@a6e11431a228:/usr/local/apache2#

Then set up the Linux LVS from the host prompt:

root@fingolfin:~# ipvsadm -A -t 10.0.2.200:80 -s rr
root@fingolfin:~# ipvsadm -a -t 10.0.2.200:80 -r 172.17.0.3:80 -m
root@fingolfin:~# ipvsadm -a -t 10.0.2.200:80 -r 172.17.0.2:80 -m
root@fingolfin:~# ipvsadm -L -n 
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn   
TCP  10.0.2.200:80 rr
  -> 172.17.0.2:80                Masq    1      0          0         
  -> 172.17.0.3:80                Masq    1      0          0    
root@fingolfin:~#
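For repeatable setups, the virtual service definition can be generated from a list of real servers. This is a sketch that only prints the ipvsadm commands for the VIP 10.0.2.200 shown in the listing; the function name lvs_cmds is made up for illustration.

```shell
#!/bin/sh
# Print the ipvsadm commands for one virtual service and its real servers.
# Nothing is executed; pipe the output to sh as root to apply it.
lvs_cmds() {
    # $1 = VIP:port, $2 = scheduler (rr, wrr, lc, ...), rest = real servers
    vip=$1
    sched=$2
    shift 2
    printf 'ipvsadm -A -t %s -s %s\n' "$vip" "$sched"
    for rs in "$@"; do
        # -m selects masquerading (NAT), shown as "Masq" in the listing above
        printf 'ipvsadm -a -t %s -r %s -m\n' "$vip" "$rs"
    done
}

lvs_cmds 10.0.2.200:80 rr 172.17.0.2:80 172.17.0.3:80
```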

Then to test the setup, we use curl from the host command line:

root@fingolfin:~# curl http://10.0.2.200
It works! node 1
root@fingolfin:~# curl http://10.0.2.200
It works! node 2
root@fingolfin:~# curl http://10.0.2.200
It works! node 1
root@fingolfin:~# curl http://10.0.2.200
It works! node 2
root@fingolfin:~# 

root@fingolfin:~# ipvsadm -L -n --stats
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port               Conns   InPkts  OutPkts  InBytes OutBytes
  -> RemoteAddress:Port
TCP  10.0.2.200:80                       4       24       16     1576     1972
  -> 172.17.0.2:80                       2       12        8      788      986
  -> 172.17.0.3:80                       2       12        8      788      986
root@fingolfin:~#

As can be seen, the requests are equally distributed between the 2 Apache containers.
Other load-balancing algorithms can be selected to suit the use case with the -s (scheduler) option.
More information can be found in the Linux ipvsadm man page: https://linux.die.net/man/8/ipvsadm
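When comparing schedulers, it helps to turn the --stats output into per-server shares. The sketch below does this with awk; conn_share is a hypothetical helper, and it assumes the column layout shown in the stats listing above.

```shell
#!/bin/sh
# Read "ipvsadm -L -n --stats" output on stdin and print, per real server,
# its connection count and its share of all connections.
conn_share() {
    awk '
        $1 == "->" && $2 ~ /^[0-9]/ { conns[$2] = $3; total += $3; order[++n] = $2 }
        END {
            for (i = 1; i <= n; i++)
                printf "%s %d %.0f%%\n", order[i], conns[order[i]], (total ? 100 * conns[order[i]] / total : 0)
        }'
}

# On the LVS host (not run here): ipvsadm -L -n --stats | conn_share
conn_share <<'EOF'
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port               Conns   InPkts  OutPkts  InBytes OutBytes
  -> RemoteAddress:Port
TCP  10.0.2.200:80                       4       24       16     1576     1972
  -> 172.17.0.2:80                       2       12        8      788      986
  -> 172.17.0.3:80                       2       12        8      788      986
EOF
```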

References:
http://www.ultramonkey.org/papers/lvs_tutorial/html/

Sunday, 16 August 2020

Tomcat Session Replication Clustering + Apache Load Balancer

I wanted to start putting in some test configurations for a future project based on Tomcat 9.

I needed to set up a test environment that runs on a single VM, just to test my Apache and Tomcat clustering configuration. The test setup looks like the below:

The setup has 2 Tomcat 9 nodes with clustering enabled, which should allow sessions to be replicated between the 2 nodes. Session persistence is not configured yet.

To use the cluster effectively, we also need a load balancer to proxy requests to the cluster nodes. I chose Apache for this project.

Tomcat 9 can be downloaded from the official Tomcat website: https://tomcat.apache.org/download-90.cgi.

Once downloaded, we just need to unzip the Tomcat installation into a directory named node01 and copy that into another directory named node02.

We then need to modify the Tomcat server.xml so that the 2 nodes use different ports, since they run on the same machine. This can be done by modifying the port of the Connector element and the shutdown port in the Server element:

<Server port="8105" shutdown="SHUTDOWN">
.....
    <Connector port="8180" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" />

Then we need to add the default cluster configuration below. It is essentially copied from the Tomcat documentation, though we still need to modify the receiver port since both nodes run on the same machine:

 <Host name="localhost"  appBase="webapps"
              unpackWARs="true" autoDeploy="true">

              <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
                 channelSendOptions="8">

                  <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>

                  <Channel className="org.apache.catalina.tribes.group.GroupChannel">
                        <Membership className="org.apache.catalina.tribes.membership.McastService"
                                address="228.0.0.4"
                                port="45564"
                                frequency="500"
                                dropTime="3000"/>
                        <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                                address="0.0.0.0"
                                port="4001"
                                autoBind="100"
                                selectorTimeout="5000"
                                maxThreads="6"/>

                        <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
                                <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
                        </Sender>
                        <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
                        <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
                </Channel>

                <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
                        filter=""/>
                <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>

                <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                    tempDir="/tmp/war-temp/"
                    deployDir="/tmp/war-deploy/"
                    watchDir="/tmp/war-listen/"
                    watchEnabled="false"/>

                <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
          </Cluster>

In this case I placed the cluster configuration under the Host element so that the deployer doesn't cause a warning at Tomcat startup, as deployment is specific to a host, not to the whole engine.

You can find the whole server.xml file in this URL: https://github.com/sherif-abdelfattah/ExSession_old/blob/master/server.xml.
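Since node02 is a copy of node01, its server.xml can be derived with a few sed substitutions. This is a sketch under stated assumptions: make_node02_xml is a made-up helper, the HTTP port 8280 and receiver port 4002 are the values that appear elsewhere in this post, and the shutdown port 8205 is an assumed value. Any other connectors enabled in your server.xml would need the same treatment.

```shell
#!/bin/sh
# Derive node02's server.xml from node01's by rewriting the three ports
# that would otherwise clash on one machine. 8280 and 4002 appear elsewhere
# in this post; the shutdown port 8205 is an assumed value.
make_node02_xml() {
    # $1 = node01 server.xml, $2 = output path for node02
    sed -e 's/<Server port="8105"/<Server port="8205"/' \
        -e 's/<Connector port="8180"/<Connector port="8280"/' \
        -e 's/port="4001"/port="4002"/' \
        "$1" > "$2"
}

# Demo on a cut-down file; the real input would be node01/conf/server.xml.
tmp=$(mktemp -d)
cat > "$tmp/node01.xml" <<'EOF'
<Server port="8105" shutdown="SHUTDOWN">
    <Connector port="8180" protocol="HTTP/1.1" redirectPort="8443" />
    <Receiver address="0.0.0.0" port="4001" autoBind="100" />
</Server>
EOF
make_node02_xml "$tmp/node01.xml" "$tmp/node02.xml"
cat "$tmp/node02.xml"
rm -rf "$tmp"
```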

Note that I included the JvmRouteBinderValve valve in the configuration. It is needed to recover from a failed session stickiness configuration: the cluster just rewrites the JSESSIONID cookie and continues to serve the same session from the node that the request landed on.

Once we configure the 2 nodes, we then start them up and we can see the below messages in the Tomcat catalina.out:

node01 output when it starts to join node02:

16-Aug-2020 19:37:34.751 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8180"]
16-Aug-2020 19:37:34.768 INFO [main] org.apache.catalina.startup.Catalina.load Server initialization in [431] milliseconds
16-Aug-2020 19:37:34.804 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
16-Aug-2020 19:37:34.804 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet engine: [Apache Tomcat/9.0.37]
16-Aug-2020 19:37:34.812 INFO [main] org.apache.catalina.ha.tcp.SimpleTcpCluster.startInternal Cluster is about to start
16-Aug-2020 19:37:34.818 INFO [main] org.apache.catalina.tribes.transport.ReceiverBase.bind Receiver Server Socket bound to:[/0.0.0.0:4001]
16-Aug-2020 19:37:34.829 INFO [main] org.apache.catalina.tribes.membership.McastServiceImpl.setupSocket Setting cluster mcast soTimeout to [500]
16-Aug-2020 19:37:34.830 INFO [main] org.apache.catalina.tribes.membership.McastServiceImpl.waitForMembers Sleeping for [1000] milliseconds to establish cluster membership, start level:[4]
16-Aug-2020 19:37:35.221 INFO [Membership-MemberAdded.] org.apache.catalina.ha.tcp.SimpleTcpCluster.memberAdded Replication member added:[org.apache.catalina.tribes.membership.MemberImpl[tcp://{0, 0, 0, 0}:4002,{0, 0, 0, 0},4002, alive=3358655, securePort=-1, UDP Port=-1, id={-56 72 63 -39 -27 123 70 86 -109 22 97 122 50 -99 0 24 }, payload={}, command={}, domain={}]]
16-Aug-2020 19:37:35.832 INFO [main] org.apache.catalina.tribes.membership.McastServiceImpl.waitForMembers Done sleeping, membership established, start level:[4]
16-Aug-2020 19:37:35.836 INFO [main] org.apache.catalina.tribes.membership.McastServiceImpl.waitForMembers Sleeping for [1000] milliseconds to establish cluster membership, start level:[8]
16-Aug-2020 19:37:35.842 INFO [Tribes-Task-Receiver[localhost-Channel]-1] org.apache.catalina.tribes.io.BufferPool.getBufferPool Created a buffer pool with max size:[104857600] bytes of type: [org.apache.catalina.tribes.io.BufferPool15Impl]
16-Aug-2020 19:37:36.837 INFO [main] org.apache.catalina.tribes.membership.McastServiceImpl.waitForMembers Done sleeping, membership established, start level:[8]
16-Aug-2020 19:37:36.842 INFO [main] org.apache.catalina.ha.deploy.FarmWarDeployer.start Cluster FarmWarDeployer started.
16-Aug-2020 19:37:36.849 INFO [main] org.apache.catalina.ha.session.JvmRouteBinderValve.startInternal JvmRouteBinderValve started
16-Aug-2020 19:37:36.855 INFO [main] org.apache.catalina.startup.HostConfig.deployWAR Deploying web application archive [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/ExSession.war]
16-Aug-2020 19:37:37.000 INFO [main] org.apache.catalina.ha.session.DeltaManager.startInternal Register manager [/ExSession] to cluster element [Host] with name [localhost]
16-Aug-2020 19:37:37.000 INFO [main] org.apache.catalina.ha.session.DeltaManager.startInternal Starting clustering manager at [/ExSession]
16-Aug-2020 19:37:37.005 INFO [main] org.apache.catalina.ha.session.DeltaManager.getAllClusterSessions Manager [/ExSession], requesting session state from [org.apache.catalina.tribes.membership.MemberImpl[tcp://{0, 0, 0, 0}:4002,{0, 0, 0, 0},4002, alive=3360160, securePort=-1, UDP Port=-1, id={-56 72 63 -39 -27 123 70 86 -109 22 97 122 50 -99 0 24 }, payload={}, command={}, domain={}]]. This operation will timeout if no session state has been received within [60] seconds.
16-Aug-2020 19:37:37.121 INFO [main] org.apache.catalina.ha.session.DeltaManager.waitForSendAllSessions Manager [/ExSession]; session state sent at [8/16/20, 7:37 PM] received in [105] ms.
16-Aug-2020 19:37:37.160 INFO [main] org.apache.catalina.startup.HostConfig.deployWAR Deployment of web application archive [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/ExSession.war] has finished in [305] ms
16-Aug-2020 19:37:37.161 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/ROOT]
16-Aug-2020 19:37:37.185 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/ROOT] has finished in [24] ms
16-Aug-2020 19:37:37.185 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/docs]
16-Aug-2020 19:37:37.199 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/docs] has finished in [14] ms
16-Aug-2020 19:37:37.200 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/examples]
16-Aug-2020 19:37:37.393 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/examples] has finished in [193] ms
16-Aug-2020 19:37:37.393 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/host-manager]
16-Aug-2020 19:37:37.409 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/host-manager] has finished in [16] ms
16-Aug-2020 19:37:37.409 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/manager]
16-Aug-2020 19:37:37.422 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/manager] has finished in [13] ms
16-Aug-2020 19:37:37.422 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/ExSession_exploded]
16-Aug-2020 19:37:37.435 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/opt/tomcat_cluster/apache-tomcat-9.0.37_node1/webapps/ExSession_exploded] has finished in [12] ms
16-Aug-2020 19:37:37.436 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8180"]
16-Aug-2020 19:37:37.445 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [2676] milliseconds

These logs show that the cluster has started and that our node01 joined node02, which is listening on receiver port 4002.

Next we need to configure Apache as a load balancer. Since we have a running cluster with session replication, we have 2 options: either set up session stickiness, or leave it out and let the Tomcat cluster handle sessions through session replication.

The Apache load balancer configuration in both cases would look like this:

# Apache Reverse proxy loadbalancer with session stickiness set to use JSESSIONID

<Proxy balancer://mycluster>
    BalancerMember http://127.0.0.1:8180  route=node01
    BalancerMember http://127.0.0.1:8280  route=node02
</Proxy>
ProxyPass /ExSession balancer://mycluster/ExSession stickysession=JSESSIONID|jsessionid
ProxyPassReverse /ExSession balancer://mycluster/ExSession

# Apache Reverse proxy loadbalancer with no session stickiness

#<Proxy "balancer://mycluster">
#    BalancerMember "http://127.0.0.1:8180"
#    BalancerMember "http://127.0.0.1:8280"
#</Proxy>
#ProxyPass "/ExSession" "balancer://mycluster/ExSession"
#ProxyPassReverse "/ExSession" "balancer://mycluster/ExSession"

I commented out the config without session stickiness and used the one with session stickiness, to test what happens when I deploy an application and shut down the node my session is stuck to: will the session be replicated or not?

For this test, I wrote a small servlet based on the Tomcat servlet examples. It can be found at this URL: https://github.com/sherif-abdelfattah/ExSession_old/blob/master/ExSession.war.

One thing about the ExSession servlet application: the objects it stores in its sessions must implement the java.io.Serializable interface so that the Tomcat session managers can replicate them, and the application must be marked with the <distributable/> tag in its web.xml file.

Those are preconditions for an application to work with a Tomcat cluster.
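The <distributable/> precondition is easy to check before deploying. The sketch below is a plain text match on web.xml, so it would also match the tag inside an XML comment; is_distributable is a made-up helper name.

```shell
#!/bin/sh
# Succeeds when the web.xml read from stdin declares <distributable/>.
# Text match only; it would also match the tag inside an XML comment.
is_distributable() {
    grep -Eq '<distributable[[:space:]]*/?>'
}

# Against the packaged application (not run here; requires unzip):
#   unzip -p ExSession.war WEB-INF/web.xml | is_distributable && echo ok

if printf '%s\n' '<web-app>' '  <distributable/>' '</web-app>' | is_distributable; then
    echo "application is marked distributable"
fi
```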

Once Apache is up and running we can see it is now able to display our servlet page directly from localhost on port 80:

Now take note of our JSESSIONID cookie value: we are served from node02, the ID ends in CF07, and the session was created on Aug. 16th at 20:19:55.

Now let's stop Tomcat node02 and refresh the browser page:

Now it can be seen that Tomcat failed over our session to node01: the session still has the same time stamp and the same ID, but it is served from node01.

This proves that session replication works as expected.
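The same comparison can be made on the cookie values themselves. The sketch below assumes each Tomcat Engine sets jvmRoute to node01 or node02, which is what the route= settings in the balancer rely on; Tomcat then appends the route to the session ID as <id>.<route>. The helper names and the sample cookie values are hypothetical.

```shell
#!/bin/sh
# Split a JSESSIONID of the form <id>.<route> into its two parts.
session_part() { printf '%s\n' "${1%.*}"; }
route_part() {
    case $1 in
        *.*) printf '%s\n' "${1##*.}" ;;
        *)   printf '\n' ;;
    esac
}

# Succeeds when two cookie values carry the same session ID but a
# different route, which is what a successful failover looks like.
failed_over() {
    # $1 = cookie value before the node was stopped, $2 = value after
    [ "$(session_part "$1")" = "$(session_part "$2")" ] &&
        [ "$(route_part "$1")" != "$(route_part "$2")" ]
}

# Hypothetical cookie values; only the CF07 ending mirrors this post.
before='AAAABBBBCCCCDDDDEEEEFFFF0000CF07.node02'
after='AAAABBBBCCCCDDDDEEEEFFFF0000CF07.node01'
if failed_over "$before" "$after"; then
    echo "session $(session_part "$before") moved from $(route_part "$before") to $(route_part "$after")"
fi
```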


 

Wednesday, 22 July 2020

Microsoft SQL Server cheat sheet

Since I have not worked much with SQL Server, it's always a challenge for me to remember how to manage roles and permissions, do a quick backup, etc.
I don't run into it every day, so I created this cheat sheet to help me remember how things are done the SQL Server way.

Logins / users / roles SQL Server permissions:

-- List principals defined on this database:
use jira881;
go
select * from sys.database_principals

-- List all the grants given to certain principal
use jira850;
go
SELECT pr.principal_id, pr.name as principal_name, pr.type_desc,
   pr.authentication_type_desc, pe.state_desc, pe.permission_name
FROM sys.database_principals AS pr
JOIN sys.database_permissions AS pe
   ON pe.grantee_principal_id = pr.principal_id;

-- List all users who own a database on the server
select suser_sname(owner_sid) as 'Owner', state_desc, *
from master.sys.databases

-- List all users defined server wide
select * from master.sys.server_principals

-- List all Server wide logins available and the login type:
select sp.name as login,
   sp.type_desc as login_type,
   sl.password_hash,
   sp.create_date,
   sp.modify_date,
   case when sp.is_disabled = 1 then 'Disabled'
     else 'Enabled' end as status
from sys.server_principals sp
left join sys.sql_logins sl
     on sp.principal_id = sl.principal_id
where sp.type not in ('G', 'R')
order by sp.name;

-- list users in db_owner role for a certain database:
USE MyOptionsTest;
GO
SELECT members.name as 'members_name', roles.name as 'roles_name',roles.type_desc as 'roles_desc',members.type_desc as 'members_desc'
FROM sys.database_role_members rolemem
INNER JOIN sys.database_principals roles
ON rolemem.role_principal_id = roles.principal_id
INNER JOIN sys.database_principals members
ON rolemem.member_principal_id = members.principal_id
where roles.name = 'db_owner'
ORDER BY members.name

-- List SQL server dbrole vs database user mapping
USE MyOptionsTest;
GO
SELECT DP1.name AS DatabaseRoleName,
   isnull (DP2.name, 'No members') AS DatabaseUserName
FROM sys.database_role_members AS DRM
RIGHT OUTER JOIN sys.database_principals AS DP1
   ON DRM.role_principal_id = DP1.principal_id
LEFT OUTER JOIN sys.database_principals AS DP2
   ON DRM.member_principal_id = DP2.principal_id
WHERE DP1.type = 'R'
ORDER BY DP1.name;

-- Create a server wide login
CREATE login koki
WITH password = 'Koki_123_123'
go

-- create a user mapped to a server wide login:
USE MyOptionsTest;
GO
CREATE USER koki_mot FOR LOGIN koki
go
USE MyOptionsTest;
GO
EXEC sp_addrolemember 'db_owner', 'koki_mot';
go

USE jira881;
GO
CREATE USER koki_881 FOR LOGIN koki;
go
EXEC sp_addrolemember 'db_owner', 'koki_881';
go



Collations:

use master;
go
create database testdb1
collate Latin1_General_CI_AS
go

use master;
go
create database testdb_bin
collate SQL_Latin1_General_CP850_BIN2
go

-- list all the binary collations available on the server:
SELECT Name, Description FROM fn_helpcollations() WHERE Name like '%bin2%'



Backup:

-- take a database backup
BACKUP DATABASE MyOptionsTest
   TO DISK = '/tmp/MyOptionsTest.bak'
     WITH FORMAT;
GO

-- restore the backup to a different db
RESTORE DATABASE MOTest
   FROM DISK = '/tmp/MyOptionsTest.bak'
   WITH
     MOVE 'MyOptionsTest' TO '/var/opt/mssql/data/MOTest.mdf',
     MOVE 'MyOptionsTest_log' TO '/var/opt/mssql/data/MOTest_log.ldf'

use MyOptionsTest
go
delete from dbo.t1
select * from dbo.t1

-- restore backup to same database
use master;
go
RESTORE DATABASE MyOptionsTest
   FROM DISK = '/tmp/MyOptionsTest.bak'
   WITH REPLACE
GO
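For repeated backups it can help to generate the BACKUP statement with a date-stamped file name so older backups are not overwritten. This is only a sketch: backup_sql is a made-up helper that prints the statement, and the commented sqlcmd line assumes the sqlcmd client is installed and reuses the koki login from above.

```shell
#!/bin/sh
# Print a BACKUP statement that writes to a date-stamped file.
backup_sql() {
    # $1 = database name, $2 = target directory, $3 = stamp (e.g. 20200722)
    printf "BACKUP DATABASE [%s] TO DISK = N'%s/%s_%s.bak' WITH FORMAT;\n" "$1" "$2" "$1" "$3"
}

# With the sqlcmd client installed it could be executed as (not run here):
#   sqlcmd -S localhost -U koki -Q "$(backup_sql MyOptionsTest /tmp "$(date +%Y%m%d)")"

backup_sql MyOptionsTest /tmp 20200722
```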



mssql-scripter:

pip install mssql-scripter
mssql-scripter -S localhost -d jira850 -U koki
mssql-scripter -S localhost -d jira881 -U koki --schema-and-data > ./jira881_mssql_scripter_out.sql
ls -ltr
vim jira881_mssql_scripter_out.sql

mssql-scripter is exceptionally useful. It allowed me to export the whole database in text format, which lets me manipulate the data with Unix filters and use tools like diff and the GUI tool Diffuse to compare exports taken after certain application operations.
Very useful tool.
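As a quick measure of how much an application operation changed, the two exports can be reduced to a count of differing lines. A minimal sketch, with a hypothetical helper changed_lines and made-up sample rows standing in for real exports:

```shell
#!/bin/sh
# Count the lines that differ between two exports.
changed_lines() {
    # $1 = export taken before the operation, $2 = export taken after
    # diff marks removed lines with "<" and added lines with ">".
    diff "$1" "$2" | grep -c '^[<>]'
}

# Demo with two tiny stand-in files instead of real exports.
tmp=$(mktemp -d)
printf '%s\n' 'INSERT t1 VALUES (1)' 'INSERT t1 VALUES (2)' > "$tmp/before.sql"
printf '%s\n' 'INSERT t1 VALUES (1)' 'INSERT t1 VALUES (3)' > "$tmp/after.sql"
changed_lines "$tmp/before.sql" "$tmp/after.sql"
rm -rf "$tmp"
```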

Sunday, 19 July 2020

Configure Apache to increase protection against XSS and ClickJacking attacks.

Many software companies now use automated security test tools and many of them do a regular set of scans to ensure their application is well protected and secure.

One good tip to help minimize the reported issues and avoid false positives is to configure the application's reverse proxy the right way.
In this post, I will focus on Apache and how to add a simple configuration that mitigates multiple issues reported by automated tools.

In this example, I am scanning a simple Java web form running behind Apache which acts as a reverse proxy in this case.
Below is the initial scan report:


As you can see, there are multiple findings in the scan report, mainly we have problems with the below:
  • X-Frame-Options Header Not Set
  • X-Content-Type-Options Header Missing
  • Absence of Anti-CSRF Tokens
  • Cookie Without SameSite Attribute
  • Charset Mismatch (Header Versus Meta Charset)  (informational finding)
We can mitigate most of those issues by properly setting headers in the Apache configuration.

Below is an example configuration for the domain http://172.17.0.1/:

# Set those headers to enable right CORS headers.
Header always set Access-Control-Allow-Origin "http://172.17.0.1"
Header always set Access-Control-Allow-Methods "POST, GET, OPTIONS, DELETE, PUT"
Header always set Access-Control-Max-Age "1000"
Header always set Access-Control-Allow-Headers "x-requested-with, Content-Type, origin, authorization, accept, client-security-token"
Header always set Access-Control-Expose-Headers "*"

# Added a rewrite to respond with a 200 SUCCESS on every OPTIONS request.
RewriteEngine On
RewriteCond %{REQUEST_METHOD} OPTIONS
RewriteRule ^(.*)$ $1 [R=200,L]

# Set x-frame-options to sameorigin to combat click-jacking.
Header always set X-Frame-Options  SAMEORIGIN

# Set SameSite to strict to counter potential session cookie hijacking; also helps protect
# from CSRF through the session cookie.
Header edit Set-Cookie ^(.*)$ $1;SameSite=strict

# Set X-Content-Type-Options to nosniff to avoid MIME sniffing.
Header always set X-Content-Type-Options nosniff


As you can see, the above rules address multiple issues reported in the scan, mainly to protect against CORS problems, click-jacking, cross-site attacks through cookies, and MIME sniffing.
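Before re-running a full scan, the headers can be spot-checked from the command line. The sketch below reads raw response headers and lists which of the two header names from the scan findings are still missing; missing_headers is a made-up helper, and the sample response is hypothetical.

```shell
#!/bin/sh
# Read raw response headers on stdin and print each expected security
# header that is absent. Header names are matched case-insensitively.
missing_headers() {
    headers=$(cat)
    for name in X-Frame-Options X-Content-Type-Options; do
        printf '%s\n' "$headers" | grep -qi "^$name:" || echo "$name"
    done
}

# Against the proxy from this post (not run here):
#   curl -sI http://172.17.0.1/ | missing_headers

# Hypothetical response that only carries one of the two headers.
printf '%s\r\n' 'HTTP/1.1 200 OK' 'X-Frame-Options: SAMEORIGIN' | missing_headers
```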

Once that configuration is in effect, the number of reported issues is much lower:


The last problem, related to CSRF, is an architectural problem in the application.
Modern web applications are encouraged to inject tokens into their HTML pages, forms, and dynamically generated content, and to validate them on the server side, so that the server can confirm the request is submitted by the same user in the same session and has not been forged by a malicious site.
Check https://en.wikipedia.org/wiki/Cross-site_request_forgery for more information.

Modsecurity does have rules that work by injecting content to help with CSRF, though, if applied without change, it could break the application functionality.





 

Sunday, 24 May 2020

How to test your system infrastructure - Part 2


In the second part of this post I document ways to test how fast disk IO is and how fast the network infrastructure between 2 nodes is when using standard TCP. All tests assume a Linux based infrastructure.

Disk IO:

There are multiple ways to measure disk IO performance, that is, how fast we can read from and write to disk.
To report on how many read and write operations are being executed, we use the tool iostat.
iostat with the -d option reports disk IO utilization per device, in terms of the number of transfers per second and the amount of data read and written.
More information can be displayed with the extended statistics option -x.
More details can be found in the iostat manual page: https://linux.die.net/man/1/iostat.

One way to list the disks and partitions connected to your system is to use the proc filesystem as below:

[root@feanor ~]# cat /proc/partitions
major minor  #blocks  name
   8        0   94753088 sda
   8        1    1048576 sda1
   8        2   93703168 sda2
  11        0      58360 sr0
 253        0   52428800 dm-0
 253        1    4063232 dm-1
 253        2   37203968 dm-2
[root@feanor ~]#
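
If the sizes are needed in a script, the listing can be parsed with a few lines of Python. This is only a sketch: parse_partitions is a hypothetical helper name, and the sample text is a shortened copy of the output above. The #blocks column is in units of 1 KiB, so dividing by 1024 * 1024 gives GiB.

```python
def parse_partitions(text):
    """Parse /proc/partitions content into a list of (name, size_in_gib) tuples."""
    devices = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly four fields and start with a numeric major number;
        # this skips the header row and blank lines.
        if len(fields) == 4 and fields[0].isdigit():
            _major, _minor, blocks, name = fields
            devices.append((name, int(blocks) / (1024 * 1024)))
    return devices


# Shortened sample copied from the listing above.
SAMPLE = """\
major minor  #blocks  name
   8        0   94753088 sda
   8        1    1048576 sda1
 253        0   52428800 dm-0
"""

for name, gib in parse_partitions(SAMPLE):
    print(f"{name}: {gib:.1f} GiB")
```

On a live system the same function can be fed the contents of /proc/partitions read with open().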

Knowing the device and partition names is useful when using the iostat command.
Another way to list them is to use the lsblk command:

[root@feanor ~]# lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0 90.4G  0 disk
├─sda1            8:1    0    1G  0 part /boot
└─sda2            8:2    0 89.4G  0 part
  ├─centos-root 253:0    0   50G  0 lvm  /
  ├─centos-swap 253:1    0  3.9G  0 lvm  [SWAP]
  └─centos-home 253:2    0 35.5G  0 lvm  /home
sr0              11:0    1   57M  0 rom 
[root@feanor ~]#


Another way to monitor IO on the system is the iotop command. iotop shows, interactively, which processes are reading from and writing to disk, and the percentage of time each one spends swapping in and waiting on IO.

To focus on the disk IO speed on a given disk or filesystem, we have multiple commands to help measure how fast we can read and write.

Using hdparm, we can measure how fast we can read from a disk: the -T option times cached reads and the -t option times buffered reads from the device itself:

[root@feanor man]# hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads:   23670 MB in  1.99 seconds = 11893.43 MB/sec
 Timing buffered disk reads: 1866 MB in  3.00 seconds = 621.44 MB/sec
[root@feanor man]#

Another way is to use the ioping command, which measures the disk latency and can also print raw statistics:

[root@feanor man]# ioping /dev/sda1
4 KiB <<< /dev/sda1 (block device 1 GiB): request=1 time=45.8 ms (warmup)
4 KiB <<< /dev/sda1 (block device 1 GiB): request=2 time=46.8 ms
4 KiB <<< /dev/sda1 (block device 1 GiB): request=3 time=810.5 us
4 KiB <<< /dev/sda1 (block device 1 GiB): request=4 time=948.4 us
4 KiB <<< /dev/sda1 (block device 1 GiB): request=5 time=780.7 us
4 KiB <<< /dev/sda1 (block device 1 GiB): request=6 time=713.7 us
4 KiB <<< /dev/sda1 (block device 1 GiB): request=7 time=47.6 ms (slow)
^C
--- /dev/sda1 (block device 1 GiB) ioping statistics ---
6 requests completed in 97.6 ms, 24 KiB read, 61 iops, 245.8 KiB/s
generated 7 requests in 6.08 s, 28 KiB, 1 iops, 4.60 KiB/s
min/avg/max/mdev = 713.7 us / 16.3 ms / 47.6 ms / 21.9 ms
[root@feanor man]#

Another, poor man's, way to test disk performance is to use the Linux dd command.
dd prints the average throughput, in bytes per second, that it saw while executing the request.
You might want to run it multiple times to get a more useful average.

The below is a write test:

[root@feanor ~]# dd if=/dev/zero of=./tempfile bs=10K count=409600 status=progress conv=fdatasync
3872133120 bytes (3.9 GB) copied, 6.004911 s, 645 MB/s
409600+0 records in
409600+0 records out
4194304000 bytes (4.2 GB) copied, 8.79669 s, 477 MB/s
[root@feanor ~]# dd if=/dev/zero of=./tempfile bs=10K count=409600 status=progress conv=fdatasync
3910645760 bytes (3.9 GB) copied, 7.023097 s, 557 MB/s
409600+0 records in
409600+0 records out
4194304000 bytes (4.2 GB) copied, 9.18262 s, 457 MB/s
[root@feanor ~]#
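
To average several runs without doing it by hand, the dd summaries can be parsed. The sketch below is an illustration only: the helper names dd_throughputs and average are hypothetical, and it assumes the output layout shown above, where the final summary line directly follows the "records out" line.

```python
import re

# Matches "<bytes> bytes (...) copied, <seconds> s" in a dd summary line.
SUMMARY = re.compile(r"(\d+) bytes .*copied, ([\d.]+) s")


def dd_throughputs(output):
    """Return the MB/s figure of every completed dd run found in the output.

    Only the summary line that follows "records out" is used, so the
    intermediate status=progress lines are ignored. MB means 10**6 bytes
    here, the unit dd itself prints.
    """
    rates = []
    previous = ""
    for line in output.splitlines():
        match = SUMMARY.search(line)
        if match and previous.strip().endswith("records out"):
            size_bytes, seconds = int(match.group(1)), float(match.group(2))
            rates.append(size_bytes / seconds / 1e6)
        previous = line
    return rates


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# The two write runs from the transcript above.
SAMPLE = """\
3872133120 bytes (3.9 GB) copied, 6.004911 s, 645 MB/s
409600+0 records in
409600+0 records out
4194304000 bytes (4.2 GB) copied, 8.79669 s, 477 MB/s
3910645760 bytes (3.9 GB) copied, 7.023097 s, 557 MB/s
409600+0 records in
409600+0 records out
4194304000 bytes (4.2 GB) copied, 9.18262 s, 457 MB/s
"""

rates = dd_throughputs(SAMPLE)
print([round(rate) for rate in rates], round(average(rates)))
```

Recomputing the rate from bytes and seconds, instead of reading the rounded MB/s column, keeps the average from inheriting dd's rounding.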

and this one is a read test:

[root@feanor ~]# dd if=./tempfile of=/dev/null  bs=10K count=409600  status=progress
4128276480 bytes (4.1 GB) copied, 3.000355 s, 1.4 GB/s
409600+0 records in
409600+0 records out
4194304000 bytes (4.2 GB) copied, 3.0407 s, 1.4 GB/s
[root@feanor ~]# dd if=./tempfile of=/dev/null  bs=10K count=409600  status=progress
3743528960 bytes (3.7 GB) copied, 3.002333 s, 1.2 GB/s
409600+0 records in
409600+0 records out
4194304000 bytes (4.2 GB) copied, 3.307 s, 1.3 GB/s
[root@feanor ~]#

The above read tests were affected by the Linux page cache. If we drop the caches before reading, we get results closer to the true disk performance:

[root@feanor ~]# echo 3 > /proc/sys/vm/drop_caches
[root@feanor ~]# dd if=./tempfile of=/dev/null  bs=10K count=409600  status=progress
3573360640 bytes (3.6 GB) copied, 3.003885 s, 1.2 GB/s
409600+0 records in
409600+0 records out
4194304000 bytes (4.2 GB) copied, 3.47044 s, 1.2 GB/s
[root@feanor ~]# dd if=./tempfile of=/dev/null  bs=10K count=409600  status=progress
3053260800 bytes (3.1 GB) copied, 3.004402 s, 1.0 GB/s
409600+0 records in
409600+0 records out
4194304000 bytes (4.2 GB) copied, 3.95025 s, 1.1 GB/s
[root@feanor ~]# dd if=./tempfile of=/dev/null  bs=10K count=409600  status=progress
3589294080 bytes (3.6 GB) copied, 3.000806 s, 1.2 GB/s
409600+0 records in
409600+0 records out
4194304000 bytes (4.2 GB) copied, 3.46599 s, 1.2 GB/s
[root@feanor ~]#

In my case it didn't seem to make a lot of difference whether we drop the kernel disk caches or not.
One last test is to use the bonnie++ package.
First we need to install it:

[root@feanor ~]# yum search bonnie
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 * epel: mirror.i3d.net
==================================================================================== N/S matched: bonnie =====================================================================================
bonnie++.x86_64 : Filesystem and disk benchmark & burn-in suite

  Name and summary matches only, use "search all" for everything.
[root@feanor ~]# yum install bonnie++.x86_64

Once bonnie++ is installed, we run the test as a non-root user and then check the output:

[sherif@feanor ~]$ bonnie++ -f -n 0 |tee bonnie.out
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
feanor           8G           864053  70 513975  69           1430897  86  7303 271
Latency                         469ms     549ms               500ms   47897us

1.97,1.97,feanor,1,1590328006,8G,,,,864053,70,513975,69,,,1430897,86,7303,271,,,,,,,,,,,,,,,,,,,469ms,549ms,,500ms,47897us,,,,,,
[sherif@feanor ~]$
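
The last line of the output is a CSV record that is easier to process than the table. The sketch below pulls out the fields that are populated in this run. The field positions are inferred by matching the CSV line against the table printed above (bonnie++ 1.97) and should be treated as an assumption for other versions; parse_bonnie_csv is a hypothetical helper name.

```python
# Column positions inferred from the 1.97 output above (an assumption, not a spec).
MACHINE, SIZE = 2, 5
BLOCK_WRITE, REWRITE, BLOCK_READ, SEEKS = 9, 11, 15, 17


def parse_bonnie_csv(line):
    """Extract the block IO results (in K/sec) and seeks/sec from a bonnie++ CSV line."""
    cols = line.strip().split(",")
    return {
        "machine": cols[MACHINE],
        "size": cols[SIZE],
        "block_write_kps": float(cols[BLOCK_WRITE]),
        "rewrite_kps": float(cols[REWRITE]),
        "block_read_kps": float(cols[BLOCK_READ]),
        "seeks_per_sec": float(cols[SEEKS]),
    }


# The CSV line from the run above.
CSV_LINE = (
    "1.97,1.97,feanor,1,1590328006,8G,,,,864053,70,513975,69,,,1430897,86,"
    "7303,271,,,,,,,,,,,,,,,,,,,469ms,549ms,,500ms,47897us,,,,,,"
)

print(parse_bonnie_csv(CSV_LINE))
```

The per-character columns are empty because the test was run with -f, which is why the sketch does not try to convert them.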

The text output is not the prettiest, so bonnie++ comes with a nice tool to convert the output to HTML:

[sherif@feanor ~]$ cat bonnie.out |bon_csv2html >/tmp/bonnie.out.html 2>/dev/null
[sherif@feanor sherif]# firefox /tmp/bonnie.out.html

The HTML report looks more readable:

Thus, it does seem that our disk is quite fast :)


Network:

To test network throughput and latency, we have a couple of tools to use.
For testing throughput, we can use the iperf tool.
To do the test, iperf needs to be installed on the 2 nodes involved in the test, running as a server on one node and as a client on the other.

To run iperf as server we use the -s option:

sherif@fingon:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------


Then on the other node we run iperf as a client with the -c option. We need to provide the name of the server to connect to and can optionally set the number of bytes to transmit in the test with the -n option:

[root@feanor ~]#  iperf -n 10240000 -c fingon
------------------------------------------------------------
Client connecting to fingon, TCP port 5001
TCP window size:  280 KByte (default)
------------------------------------------------------------
[  3] local 192.168.56.104 port 48608 connected with 192.168.56.106 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 0.1 sec  9.77 MBytes  1.25 Gbits/sec
[root@feanor ~]#
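
The numbers in the report can be cross-checked with a little arithmetic. The sketch below assumes that iperf's "MBytes" are 1024 * 1024 bytes and that its "Gbits/sec" are 10**9 bits per second; iperf_transfer_summary is a hypothetical helper name. Because the reported bandwidth is rounded, the computed duration is only an estimate.

```python
def iperf_transfer_summary(num_bytes, gbits_per_sec):
    """Return (mbytes, seconds) for sending num_bytes at the given bandwidth.

    Assumptions: 1 MByte = 1024 * 1024 bytes, 1 Gbit/sec = 10**9 bits per second.
    """
    mbytes = num_bytes / (1024 * 1024)
    seconds = num_bytes * 8 / (gbits_per_sec * 1e9)
    return mbytes, seconds


# Values from the run above: -n 10240000 and a reported 1.25 Gbits/sec.
mbytes, seconds = iperf_transfer_summary(10240000, 1.25)
print(f"{mbytes:.2f} MBytes in about {seconds * 1000:.0f} ms")
```

The 10240000 bytes requested with -n match the 9.77 MBytes in the report, and the whole transfer lasts well under a tenth of a second, so a larger -n value gives a steadier measurement.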



Another way to test network throughput is the tool nuttcp. It is very similar to iperf and also works in a client-server model.

On the server side we use the -S option:

[root@feanor ~]# nuttcp -S --nofork

Then on the client side, we run nuttcp with the server hostname or ip address:

sherif@fingon:~$ nuttcp -i1 feanor
  166.2500 MB /   1.00 sec = 1394.2422 Mbps     0 retrans
  204.1250 MB /   1.00 sec = 1712.4804 Mbps     0 retrans
  215.9375 MB /   1.00 sec = 1811.5092 Mbps     0 retrans
  190.5000 MB /   1.00 sec = 1597.7822 Mbps     0 retrans
   91.1875 MB /   1.00 sec =  764.9232 Mbps     0 retrans
  180.5625 MB /   1.00 sec = 1514.8680 Mbps     0 retrans
  209.0625 MB /   1.00 sec = 1753.7416 Mbps     0 retrans
  204.3750 MB /   1.00 sec = 1713.3612 Mbps     0 retrans
  206.1250 MB /   1.00 sec = 1730.2680 Mbps     0 retrans
  176.6875 MB /   1.00 sec = 1481.2068 Mbps     0 retrans

 1844.8750 MB /  10.43 sec = 1483.6188 Mbps 11 %TX 43 %RX 0 retrans 1.10 msRTT
sherif@fingon:~$


For testing network latency, we use old fashioned ping. ping reports latency statistics at the end of its run:

[root@feanor ~]# ping fingon
PING fingon (192.168.56.106) 56(84) bytes of data.
64 bytes from fingon (192.168.56.106): icmp_seq=1 ttl=64 time=0.628 ms
64 bytes from fingon (192.168.56.106): icmp_seq=2 ttl=64 time=1.26 ms
64 bytes from fingon (192.168.56.106): icmp_seq=3 ttl=64 time=1.39 ms
64 bytes from fingon (192.168.56.106): icmp_seq=4 ttl=64 time=1.02 ms
64 bytes from fingon (192.168.56.106): icmp_seq=5 ttl=64 time=1.12 ms
64 bytes from fingon (192.168.56.106): icmp_seq=6 ttl=64 time=1.16 ms
64 bytes from fingon (192.168.56.106): icmp_seq=7 ttl=64 time=1.15 ms
64 bytes from fingon (192.168.56.106): icmp_seq=8 ttl=64 time=1.21 ms
^C
--- fingon ping statistics ---
8 packets transmitted, 8 received, 0% packet loss, time 7029ms
rtt min/avg/max/mdev = 0.628/1.120/1.393/0.215 ms
[root@feanor ~]#
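
When latency needs to be tracked over time, the summary line is the easiest part to parse. A minimal sketch, assuming the "rtt min/avg/max/mdev" wording printed by the ping version shown above; parse_ping_rtt is a hypothetical helper name.

```python
import re

# Matches the last line of the ping statistics block shown above.
RTT = re.compile(r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms")


def parse_ping_rtt(output):
    """Return the round-trip statistics in milliseconds, or None if no summary is found."""
    match = RTT.search(output)
    if match is None:
        return None
    keys = ("min", "avg", "max", "mdev")
    return dict(zip(keys, map(float, match.groups())))


# Summary lines copied from the run above.
SAMPLE = """\
8 packets transmitted, 8 received, 0% packet loss, time 7029ms
rtt min/avg/max/mdev = 0.628/1.120/1.393/0.215 ms
"""

print(parse_ping_rtt(SAMPLE))
```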


Using sar:

The last part of this post is dedicated to the good old sar system reporting tool.
sar offers a comprehensive set of statistics about the system CPU, memory and IO operations.
sar reports the data collected at various points in time and can provide very useful information about usage patterns for system resources.
Below are a couple of examples:

[root@feanor ~]# sar -n DEV
Linux 3.10.0-862.el7.x86_64 (feanor)    05/24/2020      _x86_64_        (4 CPU)

08:15:02 AM       LINUX RESTART

08:20:02 AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
08:30:01 AM    enp0s3      0.08      0.09      0.01      0.01      0.00      0.00      0.00
08:30:01 AM    enp0s8      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:30:01 AM    enp0s9      0.01      0.01      0.00      0.00      0.00      0.00      0.00
08:30:01 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:30:01 AM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:30:01 AM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:40:02 AM    enp0s3      0.06      0.06      0.00      0.01      0.00      0.00      0.00
08:40:02 AM    enp0s8      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:40:02 AM    enp0s9      0.13      0.01      0.01      0.00      0.00      0.00      0.07
08:40:02 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:40:02 AM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:40:02 AM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:50:01 AM    enp0s3      0.04      0.04      0.00      0.00      0.00      0.00      0.00
08:50:01 AM    enp0s8      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:50:01 AM    enp0s9      0.02      0.01      0.00      0.00      0.00      0.00      0.01
08:50:01 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:50:01 AM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:50:01 AM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
09:00:01 AM    enp0s3      0.02      0.03      0.00      0.00      0.00      0.00      0.00
09:00:01 AM    enp0s8      0.00      0.00      0.00      0.00      0.00      0.00      0.00
09:00:01 AM    enp0s9      0.01      0.01      0.00      0.00      0.00      0.00      0.00
09:00:01 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
09:00:01 AM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
09:00:01 AM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
09:10:01 AM    enp0s3      0.02      0.02      0.00      0.00      0.00      0.00      0.00
09:10:01 AM    enp0s8      0.00      0.00      0.00      0.00      0.00      0.00      0.00
09:10:01 AM    enp0s9      0.04      0.01      0.01      0.00      0.00      0.00      0.01
09:10:01 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
09:10:01 AM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
09:10:01 AM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
09:20:01 AM    enp0s3     10.32      1.67     14.53      0.11      0.00      0.00      0.00
09:20:01 AM    enp0s8      0.00      0.00      0.00      0.00      0.00      0.00      0.00
09:20:01 AM    enp0s9      0.01      0.01      0.00      0.00      0.00      0.00      0.01
.....


05:40:01 PM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:       enp0s3      0.12      0.12      0.03      0.01      0.00      0.00      0.00
Average:       enp0s8      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:       enp0s9    451.53    225.65    658.82    589.50      0.00      0.00      0.01
Average:           lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:    virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:       virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00



[root@feanor ~]# sar -u
Linux 3.10.0-862.el7.x86_64 (feanor)    05/24/2020      _x86_64_        (4 CPU)

08:15:02 AM       LINUX RESTART

08:20:02 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
08:30:01 AM     all      0.29      0.00      0.27      0.05      0.00     99.39
08:40:02 AM     all      0.09      0.00      0.11      0.02      0.00     99.78
08:50:01 AM     all      1.51      0.00      0.51      0.05      0.00     97.93
09:00:01 AM     all      1.67      0.00      1.29      0.75      0.00     96.29
09:10:01 AM     all      1.13      0.00      0.43      0.05      0.00     98.39
09:20:01 AM     all      0.40      0.00      0.60      0.19      0.00     98.81
09:30:01 AM     all      0.21      0.00      0.20      0.05      0.00     99.53
09:40:01 AM     all      0.39      0.00      1.60      0.17      0.00     97.85
Average:        all      0.71      0.00      0.62      0.17      0.00     98.50

09:44:57 AM       LINUX RESTART


For a complete set of data, one could use the sar -A command, which prints a huge amount of information collected about the server during the current day.
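
Since sar prints one row per sampling interval, spotting the busiest period is a matter of scanning the %idle column. The sketch below does that for sar -u output; it assumes the 12-hour layout shown above (time, AM/PM, CPU, then six percentage columns ending with %idle), and busiest_interval is a hypothetical helper name.

```python
def busiest_interval(sar_output):
    """Return (timestamp, idle_percent) of the sar -u sample with the lowest %idle."""
    samples = []
    for line in sar_output.splitlines():
        fields = line.split()
        # Sample rows have 9 fields with "all" in the CPU column; this skips the
        # banner, the column header, the "Average:" row and "LINUX RESTART" markers.
        if len(fields) == 9 and fields[2] == "all":
            samples.append((f"{fields[0]} {fields[1]}", float(fields[8])))
    return min(samples, key=lambda sample: sample[1]) if samples else None


# A few rows copied from the sar -u output above.
SAMPLE = """\
08:20:02 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
08:30:01 AM     all      0.29      0.00      0.27      0.05      0.00     99.39
09:00:01 AM     all      1.67      0.00      1.29      0.75      0.00     96.29
09:40:01 AM     all      0.39      0.00      1.60      0.17      0.00     97.85
Average:        all      0.71      0.00      0.62      0.17      0.00     98.50
"""

print(busiest_interval(SAMPLE))
```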






References & good reads:

https://linuxhint.com/disk_activity_web_server/
https://wiki.archlinux.org/index.php/Benchmarking
https://www.cyberciti.biz/faq/howto-linux-unix-test-disk-performance-with-dd-command/
https://www.opsdash.com/blog/disk-monitoring-linux.html
https://haydenjames.io/linux-server-performance-disk-io-slowing-application/
https://www.unixmen.com/how-to-measure-disk-performance-with-fio-and-ioping/
https://fio.readthedocs.io/en/latest/index.html
https://dotlayer.com/how-to-use-fio-to-measure-disk-performance-in-linux/
https://linux-mm.org/Drop_Caches
https://books.google.nl/books?id=1nc5DwAAQBAJ&printsec=frontcover&hl=nl&source=gbs_ge_summary_r&cad=0#v=onepage&q=bonnie&f=false
https://www.cyberciti.biz/faq/ping-test-a-specific-port-of-machine-ip-address-using-linux-unix/
https://linoxide.com/monitoring-2/10-tools-monitor-cpu-performance-usage-linux-command-line/

Saturday, 16 May 2020

How to test your system infrastructure - Part 1

In this post I document various ways to test parts of an application infrastructure: mainly how to check CPU usage, how fast disk IO is and how fast the network infrastructure between 2 nodes is. All tests assume a Linux based infrastructure.


CPU:

On Linux the easiest way to check how much CPU is being used is the top command.
top is an interactive command; pressing 1 while top is running makes it print the CPU usage per core.
top can also be run in non-interactive (batch) mode as needed:

sherif@fingolfin:~$ top -b -n 1 |head
top - 14:22:23 up 52 min,  1 user,  load average: 0,29, 0,12, 0,06
Tasks: 175 total,   1 running, 129 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0,5 us,  0,3 sy,  0,0 ni, 98,9 id,  0,2 wa,  0,0 hi,  0,1 si,  0,0 st
KiB Mem :  6072348 total,  4880976 free,   434896 used,   756476 buff/cache
KiB Swap:  2097148 total,  2097148 free,        0 used.  5398192 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2063 root      20   0  420504  94464  33488 S   2,3  1,6   0:25.45 Xorg
    1 root      20   0  225232   9024   6748 S   0,0  0,1   0:02.70 systemd
    2 root      20   0       0      0      0 S   0,0  0,0   0:00.00 kthreadd
sherif@fingolfin:~$
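
Batch mode makes the output easy to consume from a script. The sketch below extracts the values of the %Cpu(s) line; it accepts both "." and "," as decimal separator, since the output above comes from a locale that uses commas. parse_top_cpu is a hypothetical helper name and the two-letter labels are taken from the output shown above.

```python
import re

# A number with either decimal separator, followed by a two-letter label such as "us" or "id".
PAIR = re.compile(r"(\d+[.,]\d+)\s+([a-z]{2})\b")


def parse_top_cpu(output):
    """Return the %Cpu(s) values from top batch output as a dict of floats."""
    for line in output.splitlines():
        if line.startswith("%Cpu"):
            return {
                label: float(value.replace(",", "."))
                for value, label in PAIR.findall(line)
            }
    return {}


# The CPU summary line from the run above.
SAMPLE = "%Cpu(s):  0,5 us,  0,3 sy,  0,0 ni, 98,9 id,  0,2 wa,  0,0 hi,  0,1 si,  0,0 st"

cpu = parse_top_cpu(SAMPLE)
print(f"idle: {cpu['id']}%")
```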

Another way to report on CPU usage is using the iostat command:

[root@feanor ~]# iostat -c
Linux 3.10.0-862.el7.x86_64 (feanor)    05/16/2020      _x86_64_        (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.23    0.00    0.28    0.07    0.00   99.42

[root@feanor ~]#

Another way to benchmark CPU execution on the system is to use the sysbench package as below:

sherif@fingolfin:~$ time sysbench --test=cpu --threads=6 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
sysbench 1.0.11 (using system LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 6
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
    events per second:  3886.37
General statistics:
    total time:                          10.0011s
    total number of events:              38872
Latency (ms):
         min:                                  0.62
         avg:                                  1.54
         max:                                 29.13
         95th percentile:                      8.74
         sum:                              59820.54
Threads fairness:
    events (avg/stddev):           6478.6667/112.54
    execution time (avg/stddev):   9.9701/0.03
real    0m10,014s
user    0m29,940s
sys    0m0,012s
sherif@fingolfin:~$

The above test shows how much latency could be expected when running multiple threads on the system.
More info about the sysbench tool can be found on this page: https://linuxconfig.org/how-to-benchmark-your-linux-system

Memory:


To measure how fast our system memory works, we can use the small tool mbw from: https://github.com/raas/mbw.
The tool measures the memory bandwidth from user space, similar to what standard applications would see.
To compile the code on CentOS we follow the steps below:

[root@feanor ~]# git clone https://github.com/raas/mbw
Cloning into 'mbw'...
remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 89 (delta 0), reused 1 (delta 0), pack-reused 85
Unpacking objects: 100% (89/89), done.
[root@feanor ~]# cd mbw
[root@feanor mbw]# ls -ltr
total 28
-rw-r--r--. 1 root root  423 May 16 16:12 README
-rw-r--r--. 1 root root  232 May 16 16:12 Makefile
-rw-r--r--. 1 root root 1255 May 16 16:12 mbw.1
-rw-r--r--. 1 root root 1640 May 16 16:12 mbw.spec
-rw-r--r--. 1 root root 8538 May 16 16:12 mbw.c
[root@feanor mbw]# make
cc     mbw.c   -o mbw
[root@feanor mbw]# ./mbw 512
Long uses 8 bytes. Allocating 2*67108864 elements = 1073741824 bytes of memory.
Using 262144 bytes as blocks for memcpy block copy test.
Getting down to business... Doing 10 runs per test.
0       Method: MEMCPY  Elapsed: 0.08889        MiB: 512.00000  Copy: 5760.122 MiB/s
1       Method: MEMCPY  Elapsed: 0.09538        MiB: 512.00000  Copy: 5368.283 MiB/s
2       Method: MEMCPY  Elapsed: 0.09289        MiB: 512.00000  Copy: 5512.133 MiB/s
3       Method: MEMCPY  Elapsed: 0.09756        MiB: 512.00000  Copy: 5247.891 MiB/s
4       Method: MEMCPY  Elapsed: 0.09414        MiB: 512.00000  Copy: 5438.593 MiB/s
5       Method: MEMCPY  Elapsed: 0.08911        MiB: 512.00000  Copy: 5745.450 MiB/s
6       Method: MEMCPY  Elapsed: 0.08720        MiB: 512.00000  Copy: 5871.627 MiB/s
7       Method: MEMCPY  Elapsed: 0.09688        MiB: 512.00000  Copy: 5284.616 MiB/s
8       Method: MEMCPY  Elapsed: 0.09409        MiB: 512.00000  Copy: 5441.598 MiB/s
9       Method: MEMCPY  Elapsed: 0.09243        MiB: 512.00000  Copy: 5539.087 MiB/s
AVG     Method: MEMCPY  Elapsed: 0.09286        MiB: 512.00000  Copy: 5513.825 MiB/s
0       Method: DUMB    Elapsed: 0.25512        MiB: 512.00000  Copy: 2006.875 MiB/s
1       Method: DUMB    Elapsed: 0.23047        MiB: 512.00000  Copy: 2221.528 MiB/s
2       Method: DUMB    Elapsed: 0.22259        MiB: 512.00000  Copy: 2300.245 MiB/s
3       Method: DUMB    Elapsed: 0.23621        MiB: 512.00000  Copy: 2167.544 MiB/s
4       Method: DUMB    Elapsed: 0.21707        MiB: 512.00000  Copy: 2358.697 MiB/s
5       Method: DUMB    Elapsed: 0.22799        MiB: 512.00000  Copy: 2245.742 MiB/s
6       Method: DUMB    Elapsed: 0.22476        MiB: 512.00000  Copy: 2277.965 MiB/s
7       Method: DUMB    Elapsed: 0.22205        MiB: 512.00000  Copy: 2305.777 MiB/s
8       Method: DUMB    Elapsed: 0.22730        MiB: 512.00000  Copy: 2252.490 MiB/s
9       Method: DUMB    Elapsed: 0.22879        MiB: 512.00000  Copy: 2237.899 MiB/s
AVG     Method: DUMB    Elapsed: 0.22924        MiB: 512.00000  Copy: 2233.515 MiB/s
0       Method: MCBLOCK Elapsed: 0.09570        MiB: 512.00000  Copy: 5350.052 MiB/s
1       Method: MCBLOCK Elapsed: 0.10106        MiB: 512.00000  Copy: 5066.197 MiB/s
2       Method: MCBLOCK Elapsed: 0.09312        MiB: 512.00000  Copy: 5498.459 MiB/s
3       Method: MCBLOCK Elapsed: 0.09769        MiB: 512.00000  Copy: 5240.961 MiB/s
4       Method: MCBLOCK Elapsed: 0.09894        MiB: 512.00000  Copy: 5174.958 MiB/s
5       Method: MCBLOCK Elapsed: 0.09634        MiB: 512.00000  Copy: 5314.456 MiB/s
6       Method: MCBLOCK Elapsed: 0.09780        MiB: 512.00000  Copy: 5235.388 MiB/s
7       Method: MCBLOCK Elapsed: 0.09487        MiB: 512.00000  Copy: 5397.086 MiB/s
8       Method: MCBLOCK Elapsed: 0.09828        MiB: 512.00000  Copy: 5209.446 MiB/s
9       Method: MCBLOCK Elapsed: 0.09942        MiB: 512.00000  Copy: 5149.973 MiB/s
AVG     Method: MCBLOCK Elapsed: 0.09732        MiB: 512.00000  Copy: 5260.924 MiB/s
[root@feanor mbw]#

The tool is also available as a Debian package.
One cool test is to see what happens when the tool tries to allocate 4GB on the above system. That machine has only 4GB of memory, so allocating that much forces part of the mbw process to be swapped out. We can see that in multiple ways; first, the bandwidth is orders of magnitude lower:

[root@feanor mbw]# ./mbw 2048
Long uses 8 bytes. Allocating 2*268435456 elements = 4294967296 bytes of memory.
Using 262144 bytes as blocks for memcpy block copy test.
Getting down to business... Doing 10 runs per test.
0       Method: MEMCPY  Elapsed: 26.14064       MiB: 2048.00000 Copy: 78.345 MiB/s
1       Method: MEMCPY  Elapsed: 42.49331       MiB: 2048.00000 Copy: 48.196 MiB/s
2       Method: MEMCPY  Elapsed: 18.70199       MiB: 2048.00000 Copy: 109.507 MiB/s
3       Method: MEMCPY  Elapsed: 55.37665       MiB: 2048.00000 Copy: 36.983 MiB/s
4       Method: MEMCPY  Elapsed: 35.01051       MiB: 2048.00000 Copy: 58.497 MiB/s
5       Method: MEMCPY  Elapsed: 20.52362       MiB: 2048.00000 Copy: 99.787 MiB/s
6       Method: MEMCPY  Elapsed: 21.93620       MiB: 2048.00000 Copy: 93.362 MiB/s
7       Method: MEMCPY  Elapsed: 37.51056       MiB: 2048.00000 Copy: 54.598 MiB/s
8       Method: MEMCPY  Elapsed: 28.07473       MiB: 2048.00000 Copy: 72.948 MiB/s
9       Method: MEMCPY  Elapsed: 14.76706       MiB: 2048.00000 Copy: 138.687 MiB/s
AVG     Method: MEMCPY  Elapsed: 30.05353       MiB: 2048.00000 Copy: 68.145 MiB/s
0       Method: DUMB    Elapsed: 11.23370       MiB: 2048.00000 Copy: 182.309 MiB/s
1       Method: DUMB    Elapsed: 10.76112       MiB: 2048.00000 Copy: 190.315 MiB/s
2       Method: DUMB    Elapsed: 15.99955       MiB: 2048.00000 Copy: 128.004 MiB/s
3       Method: DUMB    Elapsed: 23.18597       MiB: 2048.00000 Copy: 88.329 MiB/s
4       Method: DUMB    Elapsed: 28.14035       MiB: 2048.00000 Copy: 72.778 MiB/s
5       Method: DUMB    Elapsed: 31.18035       MiB: 2048.00000 Copy: 65.682 MiB/s
6       Method: DUMB    Elapsed: 31.02135       MiB: 2048.00000 Copy: 66.019 MiB/s
7       Method: DUMB    Elapsed: 36.10925       MiB: 2048.00000 Copy: 56.717 MiB/s
8       Method: DUMB    Elapsed: 51.37134       MiB: 2048.00000 Copy: 39.867 MiB/s
9       Method: DUMB    Elapsed: 60.84004       MiB: 2048.00000 Copy: 33.662 MiB/s
AVG     Method: DUMB    Elapsed: 29.98430       MiB: 2048.00000 Copy: 68.302 MiB/s
0       Method: MCBLOCK Elapsed: 67.50246       MiB: 2048.00000 Copy: 30.340 MiB/s
1       Method: MCBLOCK Elapsed: 74.09162       MiB: 2048.00000 Copy: 27.641 MiB/s
2       Method: MCBLOCK Elapsed: 77.48624       MiB: 2048.00000 Copy: 26.430 MiB/s
3       Method: MCBLOCK Elapsed: 75.32009       MiB: 2048.00000 Copy: 27.191 MiB/s
4       Method: MCBLOCK Elapsed: 94.43207       MiB: 2048.00000 Copy: 21.688 MiB/s
5       Method: MCBLOCK Elapsed: 96.87246       MiB: 2048.00000 Copy: 21.141 MiB/s
6       Method: MCBLOCK Elapsed: 102.09089      MiB: 2048.00000 Copy: 20.061 MiB/s
7       Method: MCBLOCK Elapsed: 95.71384       MiB: 2048.00000 Copy: 21.397 MiB/s
8       Method: MCBLOCK Elapsed: 89.24437       MiB: 2048.00000 Copy: 22.948 MiB/s
9       Method: MCBLOCK Elapsed: 103.73286      MiB: 2048.00000 Copy: 19.743 MiB/s
AVG     Method: MCBLOCK Elapsed: 87.64869       MiB: 2048.00000 Copy: 23.366 MiB/s
[root@feanor mbw]#
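
The allocation sizes printed in the first lines of both runs follow directly from the argument: mbw allocates two arrays of the requested size, one as the copy source and one as the destination. A small worked example, where mbw_allocation is a hypothetical helper name and the 8 byte element size is the one mbw reports on this system:

```python
LONG_BYTES = 8  # size of a C long on this x86_64 system, as reported by mbw


def mbw_allocation(array_mib):
    """Return (elements_per_array, total_bytes) for an `mbw <array_mib>` run."""
    elements = array_mib * 1024 * 1024 // LONG_BYTES
    # Two arrays are allocated: the copy source and the copy destination.
    return elements, 2 * elements * LONG_BYTES


for size in (512, 2048):
    elements, total = mbw_allocation(size)
    print(f"mbw {size}: 2*{elements} elements = {total} bytes ({total / 2**30:.0f} GiB)")
```

For mbw 2048 the total is 4294967296 bytes (4 GiB), which is more than the memory of this machine, so the run cannot fit in RAM.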

Using the iotop tool we can see that the mbw tool is being swapped out, and using the smem tool we can see how much of mbw is in swap:

[root@feanor mbw]# smem |head -1; smem|grep mbw
  PID User     Command                         Swap      USS      PSS      RSS
 4144 root     grep --color=auto mbw              0      140      326      704
 4005 root     ./mbw 2048                    584004  3610400  3610400  3610408
[root@feanor mbw]#


To collect system wide memory information, we can use the top command or vmstat:

[root@feanor mbw]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  2 2046444 112560      0 125508 8596 5262 13055  5449 2268  827  1  9 76 14  0

[root@feanor mbw]# vmstat -s
      4043552 K total memory
      3824396 K used memory
      2936656 K active memory
       782444 K inactive memory
       107972 K free memory
            0 K buffer memory
       111184 K swap cache
      4063228 K total swap
      2026724 K used swap
      2036504 K free swap
        16716 non-nice user cpu ticks
           20 nice user cpu ticks
        89870 system cpu ticks
       926212 idle cpu ticks
       171799 IO-wait cpu ticks
            0 IRQ cpu ticks
        21806 softirq cpu ticks
            0 stolen cpu ticks
    160271685 pages paged in
     66888866 pages paged out
     26373088 pages swapped in
     16147227 pages swapped out
     27846320 interrupts
     10151325 CPU context switches
   1589638134 boot time
         4248 forks
[root@feanor mbw]#

The vmstat tool reads the kernel file /proc/meminfo, which contains more information about system memory usage, as can be seen below:

[root@feanor mbw]# cat /proc/meminfo
MemTotal:        4043552 kB
MemFree:         3622752 kB
MemAvailable:    3570904 kB
Buffers:               0 kB
Cached:           110172 kB
SwapCached:        52480 kB
Active:            66004 kB
Inactive:         146504 kB
Active(anon):      49768 kB
Inactive(anon):    61024 kB
Active(file):      16236 kB
Inactive(file):    85480 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       4063228 kB
SwapFree:        3716148 kB
Dirty:                20 kB
Writeback:             0 kB
AnonPages:         64504 kB
Mapped:            24988 kB
Shmem:              8356 kB
Slab:              79168 kB
SReclaimable:      32396 kB
SUnreclaim:        46772 kB
KernelStack:        6752 kB
PageTables:        32416 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     6085004 kB
Committed_AS:    2550352 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      105404 kB
VmallocChunk:   34359537660 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      131008 kB
DirectMap2M:     4063232 kB
[root@feanor mbw]#
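
Because /proc/meminfo is plain "key: value" text, it is simple to read from a script. The sketch below parses it into a dictionary and derives the swap in use; parse_meminfo and swap_used_kb are hypothetical helper names, and the sample is a shortened copy of the output above.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of integers (kB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only "key: number [unit]" rows; the unit, when present, is dropped.
        if sep and fields:
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kb(meminfo):
    """Swap currently in use, derived from the total and free counters."""
    return meminfo["SwapTotal"] - meminfo["SwapFree"]


# Shortened sample copied from the output above.
SAMPLE = """\
MemTotal:        4043552 kB
MemFree:         3622752 kB
SwapTotal:       4063228 kB
SwapFree:        3716148 kB
HugePages_Total:       0
"""

info = parse_meminfo(SAMPLE)
print(f"swap used: {swap_used_kb(info)} kB")
```

On a live system the same functions can be fed the contents of /proc/meminfo read with open().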

Another nice small tool to report memory usage on a Linux system is free:

[root@feanor mbw]# free -h
              total        used        free      shared  buff/cache   available
Mem:           3.9G        235M        3.4G        8.7M        230M        3.4G
Swap:          3.9G        335M        3.5G
[root@feanor mbw]#


Here the output is similar to what we get from top: we can see how much swap is used, how much memory Linux uses for buffers and disk caching, and how much memory is available to the system.