We have a web service behind an HAProxy server running in a caching reverse proxy configuration. The backend servers send correct Cache-Control headers for all responses, so HAProxy can cache every response according to the HTTP spec.

However, when the end user does a hard reload (Shift+Reload) in e.g. Google Chrome, the client (Chrome) sends Pragma: no-cache and Cache-Control: no-cache, which forces HAProxy to always fetch the response from the backend server. Obviously, DDoS attacks can use this same trick to easily put more load on the backend servers.

Since we know that the cache headers are correct, how can we configure HAProxy to ignore the client-submitted Pragma: no-cache and Cache-Control: no-cache headers and avoid calling the backend when the request could be fulfilled directly from the HAProxy cache?

I know that ignoring these headers would not be okay for a generic proxy, but in this case we control both the reverse proxy and the backend, so we know this is fine.

Here's an example of a response from the backend server that is fetched from the backend again whenever the client sends cache-control: no-cache and pragma: no-cache:

    cache-control: public, max-age=31536000, s-maxage=31536000
    content-length: 463
    content-type: image/svg+xml
    date: Thu, 24 Jun 2021 14:14:19 GMT
    etag: "338"
    expires: Fri, 24 Jun 2022 14:14:19 GMT
    server: Apache
    x-content-type-options: nosniff

It's obviously pointless to fetch this from the backend servers again because it's valid for one year for any user requesting the given URL. Also worth noting is that NGINX does not honor the [client] Pragma header by default.

  • In reality we have multiple redundant frontends running in parallel and multiple redundant backends, but that doesn't change anything about the problem, so I wrote the simplified question above as if there were only one frontend and one backend server. Commented Jun 24, 2021 at 14:32
  • haproxy.com/documentation/aloha/latest/traffic-management/… says that "Objects are cached only if all the following are true: [...] Response does not have a "Cache-Control: no-cache" header", which suggests that this is not currently configurable. That is technically about the ALOHA component, but I'd guess it applies to HAProxy as a whole, too. Commented Jun 24, 2021 at 15:02
  • Note that nowadays (at least with HTTP/2) Google Chrome sends cache-control: max-age=0 instead of Pragma: no-cache, but the HAProxy behavior is still the same. Commented Jan 31, 2023 at 14:30

1 Answer

Web browsers send Cache-Control and Pragma request headers that interfere with HAProxy's cache and make it virtually unusable: HAProxy will not answer from the cache when the request's Cache-Control or Pragma header is "no-cache". To bypass this, delete the Cache-Control header and then the Pragma header with http-request del-header before you attempt to use or store anything in the cache:

    http-request del-header Cache-Control
    http-request del-header Pragma
    http-request cache-use mycache
    http-response cache-store mycache
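For context, the four rules above only work once a cache named mycache has been declared. The following is a minimal sketch of where they might sit in a full configuration, not a tested production setup. The names fe_web, be_app and app1, the address 192.0.2.10:8080, the timeouts and all size values are assumptions chosen for illustration; only mycache and the four rules come from the answer itself.

```haproxy
# Minimal sketch. fe_web, be_app, app1, the address and all sizes are
# hypothetical values, not taken from the question.
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

cache mycache
    total-max-size 64          # total cache size, in megabytes
    max-object-size 1000000    # largest object to cache, in bytes
    max-age 3600               # upper bound on cache lifetime, in seconds

frontend fe_web
    bind :80
    # Strip the client's cache directives before the cache lookup,
    # so a forced reload cannot bypass the cache.
    http-request del-header Cache-Control
    http-request del-header Pragma
    http-request cache-use mycache
    http-response cache-store mycache
    default_backend be_app

backend be_app
    server app1 192.0.2.10:8080 check
```

Two things to keep in mind with a setup like this: the deleted request headers are also absent from the request forwarded to the backend, and the cache section's max-age caps how long an object is kept, so a response with max-age=31536000 would still expire from the cache after the configured limit.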
  • Great finding, and welcome to Serverfault! I verified that this is indeed the correct way to configure HAProxy to allow caching in all cases. Nitpick: in the future, I would recommend including only the meaningful details in the answer. For example, you can edit the answer to remove the first paragraph about how hard this info was to find. You should assume that somebody is reading this answer 10 years into the future and write the answer accordingly. Commented Mar 8, 2023 at 11:51
