Apache Archiva behind HAProxy with SSL termination

So maybe you’ve followed our post on how to compile HAProxy or maybe you even read the one on how to configure internal company services to use SSL. And maybe you haven’t and just really want to make Apache Archiva work behind your SSL-terminating proxy.

As soon as you place Archiva behind an SSL-terminating proxy you’ll get errors like these from Jetty (the web server powering Archiva):

This is because Jetty is protecting you from cross-origin attacks: the browser sends an Origin header that starts with https, while the backend expects http, so the request is rejected. You could probably alter something on the Jetty side to make it aware that you’re accessing the service through an HTTPS proxy, but we’re a little more familiar with HAProxy, so we’re just going to rewrite the requests there.

The configuration below will probably not differ much from what you have:
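As a minimal sketch of what such a setup might look like (the section names, the certificate path, the timeouts and the Archiva address 127.0.0.1:8080 are assumptions for illustration; only the directives discussed in this post come from our setup):

```haproxy
defaults
    mode http
    timeout connect 5s
    timeout client  50s
    timeout server  50s

frontend https_in
    # SSL is terminated here; the certificate path is a placeholder
    bind :443 ssl crt /etc/haproxy/certs/example.pem
    # Pass the client address on in X-Forwarded-For
    option forwardfor
    default_backend archiva

backend archiva
    # Rewrite Origin from https to http, keeping the same host
    http-request set-header Origin http://%[hdr(host)]/
    # Deliberately low maxconn; extra requests wait in HAProxy's queue
    server archiva1 127.0.0.1:8080 maxconn 5
```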

But there are a couple of very important details!

option forwardfor makes HAProxy send the X-Forwarded-For header to your web server. For a lot of services this is all you need to make them work behind a proxy, so it’s good practice to include it.

http-request set-header Origin http://%[hdr(host)]/ is where the magic happens. We overwrite the Origin header, replacing https with http while keeping the same host, so that the header matches what Jetty/Apache Archiva is expecting.

That’s all! You’ll have a working Apache Archiva behind HAProxy with SSL termination.

What about performance?

Notice that we’re using a very low value for maxconn? You can profile your setup for the best throughput, but as a rule of thumb it’s ideal to have the backend servers respond very fast to very few queries. Imagine your setup as an ice cream shop: the Archiva server is the guy piling scoops. If you start overloading him with requests, he won’t respond any faster; instead, the incoming orders will take a toll on his performance and, overall, everything will run slower.

This is more or less what happens with servers. Your server can respond to far fewer requests than HAProxy can handle, so it’s more efficient to let HAProxy deal with the pile and have the servers work at maximum speed.

The guys at Lucidchart wrote a post on why turning http2 on was a mistake for them. It’s a good read and one you should probably be aware of. You see, even if you limit the number of connections to the backend server(s), http2 multiplexes many requests over a single connection, so they will pile up very fast.

At Lucidchart the performance plummeted. Backend servers were overwhelmed, probably started swapping, and just spiraled out of control. Mind you, they were not using a load balancer with the amount of precision HAProxy gives you.

What could you do?

Set tune.h2.max-concurrent-streams 5 in the global section, simple as that. Now every server will take at most 5 connections, and each of those 5 connections can have up to 5 parallel streams, which makes at most 5*5=25 simultaneous requests per server.
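A rough sketch of where these two limits live, assuming a backend named archiva and a server at 127.0.0.1:8080 (both placeholders, not values from our setup):

```haproxy
global
    # Cap the number of parallel streams on each HTTP/2 connection
    tune.h2.max-concurrent-streams 5

backend archiva
    # Cap the number of connections HAProxy opens to this server
    server archiva1 127.0.0.1:8080 maxconn 5
```

The value 5 is only the starting point used in this post; profile your own setup before settling on a number.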

Maybe the ideal number for you is even lower. Don’t be shy about tuning parallelism down; what matters is throughput.
