
Thursday, 14 June 2018

Using HAproxy to send the Proxy-Authorization header to an upstream authenticating proxy

Lately I have been seeing a lot of challenges getting certain Java libraries, such as Eclipse EGit, to work with an http/https proxy that requires basic authentication.
Java will not accept the http(s) proxy user and password out of the box; code needs to be written to handle them, which is not the case in older versions of EGit.

To work around this, I installed HAproxy and used it as an intermediate layer between my scripts and the backend proxy.
To test this I installed HAproxy 1.6.3 on Linux Mint along with Squid acting as a forward proxy with basic authentication enabled.
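For reference, the Squid side only needs a handful of lines to require basic authentication. Below is a minimal sketch; the basic_ncsa_auth helper path and the password file location are assumptions that vary per distribution, so adjust them to your install:

auth_param basic program /usr/lib/squid/basic_ncsa_auth /etc/squid/passwords
auth_param basic realm proxy
acl authenticated proxy_auth REQUIRED
http_access allow authenticated
http_port 3128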

HAproxy was able to send the Proxy-Authorization header for me and hide the complexity of worrying about how to make EGit do this.

Below is the HAproxy configuration:

global
        #log /dev/log   local0
        #log /dev/log   local1 notice
        chroot /var/lib/haproxy
        #stats socket /run/haproxy/admin.sock mode 660 level admin
        #stats timeout 30s
        user haproxy
        group haproxy
        daemon
        maxconn 1024

        # Default SSL material locations
        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private

        # Default ciphers to use on SSL-enabled listening sockets.
        # For more information, see ciphers(1SSL). This list is from:
        #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
        ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
        ssl-default-bind-options no-sslv3

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        timeout connect 5000ms
        timeout client  50000ms
        timeout server  50000ms
        errorfile 400 /etc/haproxy/errors/400.http
        errorfile 403 /etc/haproxy/errors/403.http
        errorfile 408 /etc/haproxy/errors/408.http
        errorfile 500 /etc/haproxy/errors/500.http
        errorfile 502 /etc/haproxy/errors/502.http
        errorfile 503 /etc/haproxy/errors/503.http
        errorfile 504 /etc/haproxy/errors/504.http


frontend hideproxy
        bind *:3003
        default_backend authproxy
        option http_proxy
        option http-use-proxy-header

backend authproxy
        server proxyserver localhost:3128
        reqadd Proxy-Authorization:\ Basic\ dXNlcjp1c2Vy

listen stats
    bind        :1936
    stats enable
    stats uri /


To obtain the base64 encoding of the user and password, you can use the below command:

echo -n user:password | openssl enc -a
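The same value can also be produced with base64 from coreutils; for the user:user credentials assumed in the config above, either command prints the dXNlcjp1c2Vy string used in the reqadd line:

echo -n user:user | base64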


Once HAproxy is up, any request to port 3003 will be mapped to the Squid proxy on port 3128 with the Proxy-Authorization header added to it:

Turin haproxy # curl -I -x http://localhost:3003 https://www.google.com
HTTP/1.1 200 Connection established

HTTP/1.1 200 OK
Date: Thu, 14 Jun 2018 16:10:34 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2018-06-14-16; expires=Sat, 14-Jul-2018 16:10:34 GMT; path=/; domain=.google.com
Set-Cookie: NID=132=kWdwiPFfWHqMQ-X_H9_W08F60_x1eTSIM26K9GEH6TYj--ipq6veJnTM8cZHg9yKYUQWHikVKcBfVg87utujazE3MKhi6q13QoanH_Q8BXaVPpbT8X7URICo4ZlcRvSG; expires=Fri, 14-Dec-2018 16:10:34 GMT; path=/; domain=.google.com; HttpOnly
Transfer-Encoding: chunked
Alt-Svc: quic=":443"; ma=2592000; v="43,42,41,39,35"
Accept-Ranges: none
Vary: Accept-Encoding

Turin haproxy #









Sunday, 25 September 2016

Haproxy CSV stats

HAproxy exposes a very useful statistics page that can be accessed from a web browser or from the command line using a tool like curl.
HAproxy can also expose the stats in CSV format, which is exceedingly useful if you are going to build a script around them.

To access the CSV stats from the HTTP interface, use the below:

curl  -u statsuser:statspassword 'http://haproxyhost:8001/stats/?stats;csv;' 2>/dev/null |egrep -iv "up|^#|open" |cut -d"," -f1,2,18

Here port 8001 is the statistics port as defined in the HAproxy config.
The config is simple and was already shown in older posts; it should look something like this:

listen stats *:8001
    stats enable
    stats uri /stats
    stats hide-version
    stats auth statsuser:statspassword



The above mini script will print only the apps and backends that show as down; working ones will not show up.
The CSV header is also stripped: since the header always starts with a "#", the grep -v style filter above makes it possible to process only the data rows.
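To turn this into a monitoring hook, a small wrapper like the sketch below (host, port and credentials are the same placeholders used above) exits non-zero whenever something is down, so it can be dropped into cron or another check script:

#!/bin/bash
# Sketch: print proxy name, server name and status (CSV fields 1, 2 and 18)
# for anything not reported as UP, and fail if that list is not empty.
DOWN=$(curl -s -u statsuser:statspassword 'http://haproxyhost:8001/stats/?stats;csv;' \
        | egrep -iv "up|^#|open" | cut -d"," -f1,2,18)
if [ -n "$DOWN" ]; then
    echo "$DOWN"
    exit 1
fi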

The CSV result contains a large amount of info that can be used for load management and automation.
The details of the HAproxy CSV stats layout can be found at:
https://cbonte.github.io/haproxy-dconv/1.5/configuration.html#9  

Thursday, 15 September 2016

Apache / haproxy large Cookie size issues

Although the HTTP protocol doesn't put a limit on cookie size, cookies being in turn part of the request/response headers, Apache does impose a limit to protect the web server from denial of service attacks.

Apache controls this using the "LimitRequestFieldSize" directive, which defaults to 8192 bytes as the maximum size of a header field.

If larger cookies are needed, this value has to be bumped up to a bigger number, e.g.:

LimitRequestFieldSize 18192

In order to test this, I set up a simple web server with the below config:

RequestHeader unset If-Modified-Since
RequestHeader unset If-None-Match

LimitRequestFieldSize 18192

Header set Set-Cookie "apachenode=node1; path=/;"
Header add Set-Cookie "env=test; path=/;"
Header add Set-Cookie "TimeS=NONE; path=/;"


LimitRequestFieldSize only applies to requests, not to responses, so Apache can send Set-Cookie headers with large values without any problem; it is the job of the browser to validate them.

To test this from Firefox, I used the Cookie Manager plugin https://addons.mozilla.org/en-US/firefox/addon/cookies-manager-plus/ to set a large cookie, 24k in size, with a random string generated from http://textmechanic.com/text-tools/randomization-tools/random-string-generator/

I was able to confirm that the compiled-in limit on CentOS 7 is indeed 8192, and after expanding it to ~18k as seen above it worked.

When Apache fails to accept the request it responds with an HTTP 400 error, as shown in the curl output further below.


I tried to do the same test with command-line curl, but it seems curl truncates the large cookie when using the -b option and passing a file, as below:

[root@feanor conf]# curl -b ./cookies.txt  http://localhost/new.html
this is a new page :)
[root@feanor conf]# ls -ltrh ./cookies.txt
-rw-r--r-- 1 root root 25K Sep 15 12:28 ./cookies.txt
[root@feanor conf]#


Although the cookie file contains a single cookie that is 24k in size, curl seems to have truncated it and the request went through.
Lowering LimitRequestFieldSize to something like 4600 made curl reproduce the same behavior as the browser:

[root@feanor conf]# curl -b ./cookies.txt  http://localhost/new.html
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not understand.<br />
Size of a request header field exceeds server limit.<br />
<pre>

</pre>
</p>
</body></html>
[root@feanor conf]#

Thus, curl could be tricking you if you are debugging an issue like this from the command line.
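For reference, a throwaway cookie file in the Netscape format that the -b option reads can be generated from the shell to reproduce this test. This is only a sketch; the cookie name and the ~24k size are arbitrary:

printf 'localhost\tFALSE\t/\tFALSE\t0\tbigcookie\t%s\n' "$(head -c 24000 /dev/zero | tr '\0' 'x')" > cookies.txt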

One more note about large cookie sizes: if HAproxy is used in the setup, e.g. as a balancer in front of Apache or for SSL offloading, tune.bufsize may need to be increased so that HAproxy can accept the larger requests and larger cookie sizes.

HAproxy checks the size against (tune.bufsize - tune.maxrewrite), where maxrewrite is adjusted to half the buffer size if that half is larger than 1024.
Given the above, tune.bufsize should be set to roughly double the Apache LimitRequestFieldSize.
Watch out for memory and ensure HAproxy has enough resources to work without issues.
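As a sketch, and assuming the LimitRequestFieldSize 18192 used above on the Apache side, the relevant HAproxy global settings would look roughly like this:

global
    tune.bufsize    36384   # double the Apache LimitRequestFieldSize above
    tune.maxrewrite 1024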



Monday, 8 August 2016

RabbitMQ Clustering configurations

In this post we will see how to cluster RabbitMQ in 2 configurations:
one using the same host, called a vertical cluster, and the other using 2 different hosts, called a horizontal cluster.


Creating a vertical cluster


1- Start RabbitMQ with different node names and port numbers, similar to the below:

RABBITMQ_NODE_PORT=5672 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15672}]" RABBITMQ_NODENAME=rabbit ./rabbitmq-server -detached
RABBITMQ_NODE_PORT=5673 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15673}]" RABBITMQ_NODENAME=rabbit2 ./rabbitmq-server -detached

 
Using the above environment-variable form of starting RabbitMQ, rather than a config file, lets us give each instance its own ports and node name.
The config file should be empty in this case.
The above will start 2 RabbitMQ nodes on the same VM, each with its own unique name and its own unique set of ports.
2- To join the two nodes into a single cluster, with one node as DISC and one as RAM, do the below:

./rabbitmqctl -n rabbit2@beren stop_app -> stop the RabbitMQ application; the node itself is still running on host beren
./rabbitmqctl -n rabbit2@beren join_cluster --ram rabbit@beren -> add rabbit2 as a RAM node to the rabbit cluster (the default is DISC)
./rabbitmqctl -n rabbit2@beren start_app -> start the RabbitMQ application
./rabbitmqctl -n rabbit@beren cluster_status -> check the cluster status; it should show something like this:

Cluster status of node rabbit@beren ...
[{nodes,[{disc,[rabbit@beren]},{ram,[rabbit2@beren]}]},
{running_nodes,[rabbit2@beren,rabbit@beren]},
{partitions,[]}]
...done.

 
3- To change a node's type from RAM to DISC do:

./rabbitmqctl -n rabbit2@beren stop_app
./rabbitmqctl -n rabbit2@beren change_cluster_node_type disc
./rabbitmqctl -n rabbit2@beren start_app

 
To change it back to RAM do:

./rabbitmqctl -n rabbit2@beren stop_app
./rabbitmqctl -n rabbit2@beren change_cluster_node_type ram
./rabbitmqctl -n rabbit2@beren start_app

Note that the cluster should have at least one disc node. RAM nodes don't offer faster queue performance; they only offer better speed on queue creation. It is recommended to have at least one disc node on every VM to allow high availability, and to add RAM nodes in a vertical cluster as described above IF needed.

Creating a horizontal cluster


To do this we need a second VM running RabbitMQ.
On the new second node do the following:
1- Copy the Erlang cookie file from the RabbitMQ user's home directory on the old VM to the same user's home directory on the new VM.
The cookie file name is .erlang.cookie.

2- Go to RABBITMQ_HOME/sbin and start RabbitMQ:

./rabbitmq-server -detached

3- Run the below list of commands:

./rabbitmqctl -n rabbit@vardamir stop_app
./rabbitmqctl -n rabbit@vardamir join_cluster rabbit@beren
./rabbitmqctl -n rabbit@vardamir start_app


All the queues and topics created will automatically be transferred to the newly joining node.
This gives a two-node, two-VM cluster that offers true HA for RabbitMQ. Queues can be configured to work in HA mode by creating a policy from the RabbitMQ admin console; the policy should contain ha-mode:all so that the queues are mirrored across the whole cluster.
Replication can be enabled in the same manner.
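The same HA policy can also be created from the command line instead of the admin console. Below is a sketch using the node name from the examples above and an arbitrary policy name; it mirrors every queue across all nodes in the cluster:

./rabbitmqctl -n rabbit@beren set_policy ha-all ".*" '{"ha-mode":"all"}'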

HAproxy config


HAproxy can be used to offer true load balancing for a RabbitMQ cluster.
The cluster doesn't offer a common gateway by default, so to benefit from it we need a balancer sitting in front of it.
Here HAproxy can help, as it is very lightweight and performs well.
The config for balancing the two nodes should look like this:

# Simple TCP proxy configuration listening on port 8080 on all
# interfaces and balancing connections across the two RabbitMQ
# nodes defined above.

global
    log 127.0.0.1 local0 info
    daemon

defaults
    log     global
    mode    tcp
    option  tcplog
    option  dontlognull
    retries 3
    option  redispatch
    maxconn 2000
    timeout connect 10s
    timeout client  120s
    timeout server  120s

listen rabbitmq_local_cluster 0.0.0.0:8080
    balance roundrobin
    server rabbitnode1 beren:5672    check inter 5000 rise 2 fall 3 #Disk Node
    server rabbitnode2 vardamir:5672 check inter 5000 rise 2 fall 3 #Disk Node

listen private_monitoring :8101
    mode http
    option httplog
    stats enable
    stats uri /stats
    stats refresh 30s


The above uses HAproxy's TCP mode, since RabbitMQ speaks AMQP, which is not HTTP-compatible.
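As a quick sanity check (only a sketch; beren is assumed here to be the host running HAproxy), a TCP connect to the balancer port should succeed, after which AMQP clients can be pointed at port 8080 instead of an individual node's 5672:

nc -vz beren 8080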


Monday, 11 July 2016

HAproxy configuration as an HTTP router / balancer

HAproxy is a very flexible and reliable TCP proxy and balancer middleware.
It can be used as a generic TCP proxy / port mapper or as a TCP load balancer.
It also supports an HTTP protocol mode in which it works as an http proxy server and load balancer.

In this post I will focus on the http mode; it is mostly used for implementing web proxies and providing high availability for web applications. Below is a sample config for HAproxy as an http proxy and router:


#---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
# full configuration options online.
#   http://haproxy.1wt.eu/download/1.3/doc/configuration.txt
#---------------------------------------------------------------------
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log         127.0.0.1 
    maxconn     400000
    user        sherif
    group       sherif
    daemon
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode        http
    log         global
    option      dontlognull
    option      httpclose
    option      httplog
    option      forwardfor
    option      redispatch
    timeout connect 100000 # connect timeout when a backend is not reachable (100s)
    timeout client 600000
    timeout server 600000
    maxconn     600000
    retries     3
    errorfile 503 /etc/haproxy/errors/503.http
#---------------------------------------------------------------------
# main frontend which proxies to the backends
#---------------------------------------------------------------------
frontend main vardamire:8080
    acl app1_url    path_beg    /app1
    acl test_url    path_beg    /tst
    acl app2_url    path_dir    app2
###################################################################
    use_backend app1_8080       if app1_url
    use_backend app2_8081       if test_url app2_url
    use_backend default_services
###################################################################
backend app1_8080
    balance     roundrobin
    option httpchk GET /app1/alive
    cookie JSESSIONID prefix nocache
    server      app1    server1:8080 check cookie a1
    server      app2    server2:8080 check cookie a2
    server      app3    server3:8080 check cookie a3 


backend app2_8081
    balance     roundrobin
    option httpchk GET /tst/app2/alive
    server      app1     server2:8081 check
    server      app2     server3:8081 check
 

backend default_services
   server app1 web.toplevelbalancer:8080 check

listen stats *:1936
    stats enable
    stats uri /
    stats hide-version
    stats auth admin:admin


The above config is an example of doing http routing and balancing based on URL patterns.
It also shows how HAproxy can handle session stickiness.

The acl keyword defines a URL pattern using either path_beg (path begins with) or path_dir (path directory part).
The backend keyword then defines a couple of application backends to HAproxy; those do the actual serving of the content.
The balance keyword tells HAproxy to do round-robin load balancing between the defined servers, and option httpchk makes it perform an http check on the given URI against each defined server to determine whether it is up or not.
The cookie keyword prefixes a part to the JSESSIONID cookie so that HAproxy can maintain session stickiness: HAproxy prepends the cookie value defined for each server to JSESSIONID and is thus able to keep track of which session goes to which server.
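A quick way to see the stickiness in action is to look at the Set-Cookie header coming back through HAproxy. The below is only a sketch: vardamire:8080 is the frontend bound above, and it assumes the backend application itself sets a JSESSIONID, to which HAproxy then prepends the server cookie value ("a1", "a2" or "a3"):

curl -s -D - -o /dev/null http://vardamire:8080/app1/ | grep -i set-cookie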

Lastly, we enable HAproxy statistics so that we can monitor the status of our backends as well as the stats of the requests coming to them.

This config was used successfully to route and balance 40+ services for a big project, and it worked fairly smoothly even under load testing.

HAproxy is very lightweight and can handle tens of thousands of connections without issues.