
Drupal and Varnish: a short introduction

Varnish is an HTTP accelerator (a caching reverse proxy). It is capable of processing 100,000 requests per second, which is most likely quicker than Drupal, even with page caching on.

Want to see how it works?

 

  • Cache hit
    1. User requests a URL
    2. Varnish checks its cache
    3. Varnish retrieves the data from the cache
    4. Varnish delivers the data to the user.
  • Cache miss
    1. User requests a URL
    2. Varnish checks its cache but the data isn’t cached
    3. Varnish requests the URL from the backend
    4. Drupal processes the request and delivers a response to Varnish
    5. Varnish caches the response
    6. Varnish forwards the response to the user

Varnish’s magical tricks

Varnish can do more than plain page caching. It can be used for:

  • Load balancing between a series of Drupal servers
  • Serving assets (images/css/swfs…) from a light-weight backend whilst serving content from Drupal
  • ESI – Edge Side Includes – allowing personalised pages to be cached
  • Maintenance mode, where Varnish can serve a “Site is being updated” page without traffic hitting the backend
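
As a rough illustration of the first two uses, the sketch below load-balances Drupal requests across two servers and sends asset requests to a separate lightweight backend. It assumes Varnish 3.x-style VCL (directors are configured differently in other versions), and the backend names and IP addresses are placeholders rather than values from a real setup.

```vcl
# Hypothetical backends: names and addresses are placeholders.
backend drupal1 { .host = "192.168.0.11"; .port = "8080"; }
backend drupal2 { .host = "192.168.0.12"; .port = "8080"; }
backend assets  { .host = "192.168.0.20"; .port = "8080"; }

# Spread Drupal requests across both servers in turn.
director drupal round-robin {
  { .backend = drupal1; }
  { .backend = drupal2; }
}

sub vcl_recv {
  if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
    # Static files go to the lightweight backend.
    set req.backend = assets;
  } else {
    # Everything else goes to the Drupal pool.
    set req.backend = drupal;
  }
}
```

The file-extension pattern is the same one used for assets later in this article, so the two rules stay consistent.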

Cons of Varnish?

Like any solution, Varnish brings its own set of issues.

  • Statistics – content served by Varnish won’t hit the backend, so traditional stats (log files, the statistics module) won’t show the correct results. Use a client-side solution such as Google Analytics instead.
  • Personalisation – personalised pages are hard to cache. You will need things like ESI and custom VCL logic. Not for the faint-hearted.
  • Caching-rules complexity – choosing what to cache, how long to cache it for, and how to purge the cache when content is updated is complex, and needs more modules and more VCL rules. Choose a balance between complexity, performance, and investment in hardware.
  • Decreased performance – What? I hear you cry… I thought Varnish was meant to improve performance! Well, if something’s cached, it’s quicker. But cache misses are generally slower via Varnish than a direct request, because every additional system in the request route adds latency. In reality, most websites would benefit (if only to cache images and CSS), but it might not be suitable for a web service.
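
One common starting point for the personalisation problem is to let logged-in users bypass the cache entirely and cache only anonymous traffic. The sketch below assumes Varnish 3.x-style VCL and Drupal's default session cookie naming (names beginning with SESS, or SSESS over HTTPS). If your site relies on other cookies for anonymous visitors, the blanket unset would need refining.

```vcl
sub vcl_recv {
  # Assumption: a Drupal session cookie means the user is logged in,
  # so send the request straight to the backend, uncached.
  if (req.http.Cookie ~ "S?SESS[a-z0-9]+=") {
    return (pass);
  }

  # Anonymous visitor: drop any remaining cookies so the page
  # can be served from (and stored in) the cache.
  unset req.http.Cookie;
}
```

This trades away caching for logged-in users in exchange for much simpler rules. ESI is the heavier tool for when those pages need caching too.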

Getting set up: how to add Varnish to your existing website

This how-to presumes that you already have your Drupal site up and running, using Apache on port 80.

  1. Install Varnish (apt-get install varnish / yum install varnish)
  2. Change the Varnish config to listen on port 80
    Edit /etc/default/varnish
    Change the VARNISH_LISTEN_PORT to 80
    NB: this behaviour has changed somewhat across Varnish versions – check the docs for your version!
  3. Change the Apache config to listen on port 8080 (or another suitable port, if something’s already running on 8080)
  4. Edit the VCL to forward backend requests to Apache
    Edit /etc/varnish/default.vcl

    backend default {
      .host = "127.0.0.1";
      .port = "8080"; 
    }
  5. Restart Apache
    I usually use “apache2ctl graceful” rather than “/etc/init.d/apache2 restart”, because this sanity-checks the configs before restarting!
  6. Restart Varnish: “/etc/init.d/varnish restart”
    NB: each time Varnish is restarted, its cache is cleared.

This will give you a very basic implementation – to see the real benefit of Varnish, you’ll want to do more.

Optimising Varnish and Drupal to work together

  • Varnish module: The Varnish module provides an admin dashboard to show you the status of the Varnish server(s), and provides integration points which will clear the Varnish cache when a node is edited (when integrated with the Expire module).
  • Expire module: The Expire module has been separated out from Boost, and provides a generic cache-management tool which runs at page level. Expire will purge the appropriate pages from the Varnish cache when the content changes.
  • Memcache et al: use all of the normal performance tools as well: memcache, APC, cache sets/gets in your custom modules, etc. Varnish is an addition to the performance tools, not a replacement.
  • ESI module: Edge-side includes are difficult, but the ESI module makes them a little easier.
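
On the Varnish side, ESI processing has to be switched on for the responses that contain ESI tags. A minimal sketch, assuming Varnish 3.x-style VCL (older releases enable ESI differently) and assuming the tags only appear in HTML pages:

```vcl
sub vcl_fetch {
  # Only parse HTML responses for <esi:include> tags;
  # images, CSS and other assets are left alone.
  if (beresp.http.Content-Type ~ "text/html") {
    set beresp.do_esi = true;
  }
}
```

Each included fragment is then fetched and cached as its own object, which is what allows the shared page shell to stay in the cache while the personalised parts are handled separately.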

Some custom VCL logic

Long live assets!

Assets should live a long time. CSS, images, JS and SWFs should all hang around in the cache.

sub vcl_recv {
  # Assets are pulled from the cache, even if we have a NO_CACHE cookie.
  if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
    return (lookup);
  }
}
sub vcl_fetch {
  # Only apply the long TTL to assets, not to Drupal pages.
  if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
    # Don't cache cookies
    remove beresp.http.Set-Cookie;
    # Set a long TTL (1 day)
    set beresp.ttl = 86400s;
    return (deliver);
  }
}

Is it working?

To give you some reassurance that Varnish is doing its job, it’s nice to see whether you’re getting hits or misses. This code will add an HTTP header to report back.

sub vcl_deliver {
  if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
  } else {
    set resp.http.X-Cache = "MISS";
  }
}

How to clear the Varnish cache

In case you follow all these instructions, but end up with some really private data in your cache, here’s a quick clear-cache how-to!

  • command-line: SSH into the server, and run:
    /etc/init.d/varnish restart
  • Telnet: Telnet to the admin port of Varnish (by default port 6082) and use the purge command
    telnet 192.168.0.10 6082
    url.purge /

    url.purge takes a regular expression. The command name differs between Varnish versions, so see the Varnish docs for your version for more info.
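
If you would rather purge over HTTP than via telnet, a sketch along the following lines might work. It follows the pattern from the Varnish 3.x documentation, so it assumes 3.x-style VCL; the ACL name purgers and the allowed address are placeholders.

```vcl
# Hypothetical ACL: only these clients may send PURGE requests.
acl purgers {
  "127.0.0.1";
}

sub vcl_recv {
  if (req.request == "PURGE") {
    if (!client.ip ~ purgers) {
      error 405 "Not allowed.";
    }
    # Look the object up so vcl_hit / vcl_miss can remove it.
    return (lookup);
  }
}

sub vcl_hit {
  if (req.request == "PURGE") {
    purge;
    error 200 "Purged.";
  }
}

sub vcl_miss {
  if (req.request == "PURGE") {
    purge;
    error 200 "Purged.";
  }
}
```

With this in place, a request using the PURGE method for a given URL (for example, sent with curl -X PURGE from an allowed address) removes just that page instead of emptying the whole cache.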

About Sallah Ud Din Sarwar

I am a multi-skilled Software Engineer with over 4 years’ industry experience in designing and developing object-oriented solutions in multi-layered and n-tier architectures. I have substantial success to my credit. I have consistently handled complex problems to the satisfaction of my clients by working in stressful situations. I enjoy learning and applying new technologies in a dynamic and forward-thinking professional environment.
