Roblog


Command-line purging of Varnish caches

9 December 2014

I’m a big fan of Varnish, the caching reverse proxy. It’s straightforwardly designed, blisteringly fast, and has a powerful configuration language; with Edge Side Includes (ESI), it also lets you combine the speed of statically cached content with the flexibility of dynamic pages.

The one thing I find myself having to do a lot, though, is purging. Not just purging a single URL, which is easily dealt with using the varnishadm tool, but more complex tasks: purging a page and the resources within it, purging a whole domain, or purging an entire site and then quickly spidering all of its pages so that the cache stays warm.

At first, I performed these purges manually, but it quickly became clear that this wasn’t tenable. Mainly that was because it meant SSHing into the proxy server and trying to remember the correct invocation of the ban command. It was also simply too slow: it’s often important to purge a page immediately after a deploy, for example, and yet a manual purge could take a few minutes.

Varnisher is a command-line application for purging your Varnish server, and my attempt at solving this problem. Fundamentally, it lets you purge from your local machine, but it also lets you perform more complex purges and has some features for keeping your cache primed after purges. Let’s take a look at what it can do.

Installation

Varnisher can be installed from RubyGems:

$ gem install varnisher

It requires Ruby 1.9.3 or above.

Necessary VCL changes

Varnisher assumes that you have the following elements within your Varnish configuration file. These instruct Varnish to respond to PURGE requests by purging the requested URL, and to DOMAINPURGE requests by purging the requested host name.

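# Note: this is Varnish 3 syntax (vcl_fetch, req.request, error).

# Clients that are allowed to issue purge requests.
acl auth {
	"localhost";
	"127.0.0.1";
}

# Record the request's URL and host on each cached object,
# so that bans can be matched against them later.
sub vcl_fetch {
	set beresp.http.x-url  = req.url;
	set beresp.http.x-host = req.http.host;
}

sub vcl_recv {
	# PURGE: ban the single URL on the requested host.
	if (req.request == "PURGE") {
		if ( client.ip ~ auth ) {
			ban("obj.http.x-url == " + req.url + " && obj.http.x-host == " + req.http.host);
			error 200 "Purged.";
		}
	}

	# DOMAINPURGE: ban every object cached for the requested host.
	if ( req.request == "DOMAINPURGE" ) {
		if ( client.ip ~ auth ) {
			ban("obj.http.x-host == " + req.http.host);
			error 200 "Purged.";
		}
	}
}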

The first block is an access control list that tells Varnish which IPs or hostnames may issue purge requests; since it lists only localhost here, you’ll likely need to add the address of any machine you intend to run Varnisher from. The second block copies the hostname and URL of the original request onto the object as it’s stored in the cache, so that bans can be matched against those values later. Finally, the vcl_recv block handles the purge requests themselves, creating a ban for the requested URL (PURGE) or for the whole host (DOMAINPURGE).
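
To make the client side of this concrete, here is a minimal sketch of how a PURGE or DOMAINPURGE request could be built with Ruby’s standard Net::HTTP library. It is not Varnisher’s actual implementation: the function names and the proxy_host and proxy_port parameters are assumptions made for illustration.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) a request that uses a custom HTTP verb.
# build_purge_request is a hypothetical helper, not part of Varnisher.
def build_purge_request(url, verb = 'PURGE')
  uri = URI.parse(url)
  # Arguments: verb, request has a body?, response has a body?, path.
  request = Net::HTTPGenericRequest.new(verb, false, true, uri.request_uri)
  # The VCL compares the Host header against the stored x-host value.
  request['Host'] = uri.host
  request
end

# Send the request to the Varnish server. proxy_host and proxy_port are
# assumptions: by default the request goes to the URL's own host on port 80.
def send_purge(url, verb = 'PURGE', proxy_host = nil, proxy_port = 80)
  uri = URI.parse(url)
  Net::HTTP.start(proxy_host || uri.host, proxy_port) do |http|
    http.request(build_purge_request(url, verb))
  end
end
```

Calling send_purge('http://www.example.com/path/to/page') would hit the first branch of vcl_recv, while send_purge('http://www.example.com/', 'DOMAINPURGE') would hit the second. Either call is only acted on if the client’s IP is in the auth ACL.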

Purging a URL and all of its resources

When we change a web page, purging the page alone isn’t always enough. For visitors to see the changes properly, rather than a half-broken page, we often need to purge the resources within it too: images, CSS files, scripts, and so on.

With Varnisher, we can do this simply by passing a URL to the purge command:

$ varnisher purge http://www.example.com/path/to/page

Varnisher will check for images, CSS files, and JavaScript files, and issue individual purge commands for each one it finds on the page.
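
As a rough illustration of that resource discovery, the sketch below pulls image, script, and stylesheet references out of a page’s HTML and resolves them against the page URL. It is an assumption about the general approach, not Varnisher’s code; resource_urls is a hypothetical name, and a real tool would more likely use a proper HTML parser than regular expressions.

```ruby
require 'uri'

# Return the absolute URLs of the images, scripts and stylesheets
# referenced in html, resolved relative to page_url.
# (Hypothetical helper; regex matching is a simplification.)
def resource_urls(html, page_url)
  refs = []
  refs.concat(html.scan(/<img\b[^>]*?\bsrc=["']([^"']+)["']/i).flatten)
  refs.concat(html.scan(/<script\b[^>]*?\bsrc=["']([^"']+)["']/i).flatten)

  # Only <link> tags that declare rel="stylesheet" count as CSS files.
  html.scan(/<link\b[^>]*>/i).each do |tag|
    next unless tag =~ /\brel=["']stylesheet["']/i
    href = tag[/\bhref=["']([^"']+)["']/i, 1]
    refs << href if href
  end

  # Resolve relative references and drop duplicates.
  refs.map { |ref| URI.join(page_url, ref).to_s }.uniq
end
```

Each URL returned could then be purged individually, in the same way as the page itself.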

Purging a domain

If we pass the purge command a hostname instead of a full URL, Varnisher will instead issue a domain-wide purge:

$ varnisher purge www.example.com

This will create a ban for all pages on the www.example.com domain, ensuring that the next request to each and every page on the site will hit the backend and not come from the cache. This comes in handy when you’ve made a change to a global element within a site — such as the primary navigation or the footer — and want that change to be reflected everywhere consistently, rather than waiting for each individual page to expire.
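
The difference between a URL ban and a domain ban can be modelled in a few lines. The sketch below is a deliberately simplified picture, with hypothetical names: it removes matching objects immediately, whereas Varnish itself keeps a list of bans and checks cached objects against it when they are next looked up (or in the background, via the ban lurker).

```ruby
# A cached object, tagged with the x-host and x-url values that the
# vcl_fetch block stores. (Illustrative model only.)
CachedObject = Struct.new(:host, :url)

# Return the objects that survive a ban. With only a host this models
# DOMAINPURGE; with a host and a URL it models PURGE.
def apply_ban(cache, host, url = nil)
  cache.reject do |obj|
    obj.host == host && (url.nil? || obj.url == url)
  end
end

cache = [
  CachedObject.new('www.example.com', '/'),
  CachedObject.new('www.example.com', '/about'),
  CachedObject.new('blog.example.com', '/')
]

after_url_ban    = apply_ban(cache, 'www.example.com', '/about')
after_domain_ban = apply_ban(cache, 'www.example.com')
```

Here the URL ban leaves two objects in the cache, while the domain ban leaves only the object belonging to blog.example.com.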

Spidering a domain

If you’d like to prime a cache or make sure it’s kept warm, you can use the spider command:

$ varnisher spider http://www.example.com/

You can use any URL as the starting point; Varnisher will only follow URLs that are on the same domain as the URL that you originally pass to it, and naturally it will only visit pages once.

The spidering runs in parallel, so it should be fairly quick.
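
The rules above (stay on the starting domain, visit each page once, fetch in parallel) can be sketched as follows. This is an assumption about the general shape of such a crawler rather than Varnisher’s implementation: the spider function and its fetch_links argument are hypothetical, and using one thread per page in each round is just one simple way to get parallelism.

```ruby
require 'set'
require 'uri'

# Crawl outwards from start_url, returning every URL visited.
# fetch_links is any callable that takes a URL and returns the hrefs
# found on that page; injecting it keeps the sketch free of network code.
def spider(start_url, fetch_links)
  host     = URI.parse(start_url).host
  visited  = Set.new([start_url])
  frontier = [start_url]

  until frontier.empty?
    # Fetch every page in the current round in parallel, one thread each,
    # and resolve the links found relative to the page they came from.
    threads = frontier.map { |url|
      Thread.new { fetch_links.call(url).map { |href| URI.join(url, href).to_s } }
    }
    found = threads.flat_map(&:value)

    # Keep only same-host URLs that have not been seen before.
    # Set#add? returns nil when the URL is already in the set.
    frontier = found.uniq.select { |url|
      URI.parse(url).host == host && visited.add?(url)
    }
  end

  visited.to_a
end
```

In practice, fetch_links would request the page through Varnish (which is what warms the cache) and extract the links from the response body.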

If you’d like to purge and spider a domain in one step, you can pass the --reindex flag to purge:

$ varnisher purge --reindex www.example.com

This is equivalent to calling:

$ varnisher purge www.example.com
$ varnisher spider www.example.com

Summing up

That’s about it; if you find yourself wanting to do complex purging, or just want to be able to purge from your own machine, Varnisher might just come in handy.