Caching and Compression for Apache and Mod_rails

Optimizing your web server configuration is an important step for any production web application. Compression and caching are two complementary techniques that can greatly improve the performance of your site. We won’t go into much detail on the rationale for these changes; most of it is covered thoroughly by the Yahoo Performance Team, who produce the excellent YSlow plugin for Firefox.

The code in this post is used for a soon-to-be production Ruby on Rails application using this stack:

Reliable, Performant Pre-Compression

Compressing text files in your application can lower bandwidth usage by as much as a factor of 10 and cut the time needed to retrieve a resource by a similar amount. In Apache, mod_deflate is the easiest way to enable compression. mod_deflate compresses content on each request. For dynamic content this is expected, but for infrequently changing static content such as CSS or JavaScript files it is redundant and can increase CPU load significantly. To get a little more control over this, we choose to pre-compress the static files on our site and serve them to compatible browsers when appropriate.
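
For comparison, on-the-fly compression of dynamic responses with mod_deflate usually takes only a line or two. This is a minimal sketch, assuming mod_deflate is loaded; the list of MIME types is illustrative and is not part of the configuration used in this post:

[sourcecode language=’xml’]
# Compress dynamic text responses on each request (requires mod_deflate).
# The MIME types listed here are an assumption; adjust to what your app serves.
AddOutputFilterByType DEFLATE text/html text/plain text/xml
[/sourcecode]

CSS and JavaScript are deliberately left out of that list, since those are the files we pre-compress instead.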

In the configuration below, we use Apache’s mod_rewrite to handle this outside of our application:

IF the request is for a CSS or JavaScript file, AND
  the browser can handle gzip compression, AND
  the browser is not Safari, AND
  there is a file with the same name plus an additional .gz extension
THEN
  serve this compressed file instead of the originally requested one

Evidently, some versions of Safari can get tripped up by this particular use of compression, so we leave them out of the fun for now. It would be great to re-enable compression for Safari once we can verify that the issue has been resolved in the latest version. TODO: Verify this assumption.

There are a variety of approaches to pre-compressing files. For the current Rails application I’m working on, we’ve integrated AssetPackager, which can optimize, combine and compress these files as part of a build or deployment process. It’s an excellent addition to the toolbox.

The section below enhances the configuration suggested by The If Works folks.

[sourcecode language=’xml’]

# USE PRE-COMPRESSED GZ FILES IF THEY EXIST - WE DON'T WANT TO COMPRESS ON EVERY REQUEST
RewriteCond %{REQUEST_FILENAME} \.(js|css)$
RewriteCond %{HTTP:Accept-encoding} gzip
RewriteCond %{HTTP_USER_AGENT} !Safari
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME}.gz -f
RewriteRule ^(.*)$ $1.gz [QSA,L]

[/sourcecode]

Practical Caching Strategy

The next important step is to enable a reasonable caching strategy for our site. Caching is critical to your web application for several reasons:
  • Users can navigate your site with fewer requests, improving perceived responsiveness
  • It enables the use of a high-performance acceleration or CDN layer

For high-volume web sites, proper attention to caching rules and application design should enable you to achieve caching rates in the 75-90% range. Some sites have more dynamic content than others, of course, but every site has a variety of images and static CSS and JavaScript files that can benefit from a caching strategy.

In our current configuration, we identify a set of cacheable file extensions and let proxies and browsers cache them for up to one week (604800 seconds). It is easy to expand this configuration to set different lifetimes for different file types.

[sourcecode language=’xml’]
Header set Cache-Control "max-age=604800, public"
ExpiresDefault A604800
Header unset Last-Modified
Header unset Pragma
FileETag None
Header unset ETag
[/sourcecode]

A couple of important points about this configuration:
  1. We disable ETags for these types, since they can be unreliable in clustered applications, where the same file can end up with a different ETag on each server
  2. We set both Expires and Cache-Control, since different browsers may treat either one as the definitive rule (Cache-Control is the newer standard, introduced with HTTP/1.1)
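
Directives like the ones above are normally scoped to the cacheable file types with a container such as FilesMatch, and the same mechanism gives different file types different lifetimes. The following is a sketch under assumptions: the extension lists and the four-hour lifetime are illustrative rather than the values from our configuration, and it requires mod_headers and mod_expires.

[sourcecode language=’xml’]
# Sketch only: the extension lists and lifetimes below are assumptions.

# Images and packaged assets: cache for one week (604800 seconds)
<FilesMatch "\.(gif|jpe?g|png|ico|js|css)$">
  ExpiresActive On
  ExpiresDefault A604800
  Header set Cache-Control "max-age=604800, public"
  FileETag None
  Header unset ETag
</FilesMatch>

# Files that change more often: cache for four hours (14400 seconds)
<FilesMatch "\.(html|xml)$">
  ExpiresActive On
  ExpiresDefault A14400
  Header set Cache-Control "max-age=14400, public"
  FileETag None
  Header unset ETag
</FilesMatch>
[/sourcecode]

ExpiresDefault only has an effect when ExpiresActive is On, so it is worth checking that it is enabled somewhere in your server configuration.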

With week-long caching in place, deploying new JS or CSS files in our app could cause problems. In our case, because we are using AssetPackager, we get uniquely keyed filenames for these resources, which change each time there are updates. For example, AssetPackager merges 3 JavaScript files into a single resource called base_timestamp.js, where timestamp is updated whenever any of the source files changes. This allows us to avoid the stale cache issues we might otherwise encounter after site updates. If you change the content of one of these cached file types without also changing its name, some users will continue to reference the older file until their local cache expires. An alternative remedy for frequently updated files is to set the cache timeout to a much lower value, such as 4 hours or 1 day, so that stale files won’t live as long.

While this is certainly not the end-all-be-all of configurations, it is working well for us. The Charles Proxy was very helpful in verifying that the configuration works as intended.

Have more best practices that we should incorporate into this configuration? I’d love to hear them. We will update this config with improvements as we find them.

Complete Config:

[sourcecode language=’xml’]
# BASIC SERVER CONFIG
ServerName www.yourserver.com
ServerAlias yourserver.com
DocumentRoot /srv/www/myapp/public
ServerAdmin [email protected]
ErrorLog /var/log/httpd/yourserver.com/apache_error_log
CustomLog /var/log/httpd/yourserver.com/apache_access_log combined

# ENSURE WE ARE IN PRODUCTION MODE
RailsEnv production

RewriteEngine On
AddEncoding gzip .gz

# IF YOU NEED TO DEBUG REWRITES
#RewriteLog "/tmp/rewrite.log"
#RewriteLogLevel 9

# USE PRE-COMPRESSED GZ FILES IF THEY EXIST - WE DON'T WANT TO COMPRESS ON EVERY REQUEST
RewriteCond %{REQUEST_FILENAME} \.(js|css)$
RewriteCond %{HTTP:Accept-encoding} gzip
RewriteCond %{HTTP_USER_AGENT} !Safari
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME}.gz -f
RewriteRule ^(.*)$ $1.gz [QSA,L]

# MAKE SURE THE BROWSER UNDERSTANDS WHAT TYPE OF DATA IT IS RECEIVING
ForceType text/javascript
Header set Content-Encoding: gzip
ForceType text/css
Header set Content-Encoding: gzip

# CACHE FOR ONE WEEK
Header set Cache-Control "max-age=604800, public"
ExpiresDefault A604800
Header unset Last-Modified
Header unset Pragma
FileETag None
Header unset ETag
[/sourcecode]
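
The ForceType and Content-Encoding pairs only make sense when each pair is limited to the matching compressed files. One way to scope them is shown below as a sketch; the FilesMatch patterns are assumptions added for illustration, based on the .gz naming used by the rewrite rule, and are not taken from the listing above.

[sourcecode language=’xml’]
# Sketch only: the FilesMatch patterns are assumptions.

# Pre-compressed JavaScript: report the real content type and the gzip encoding
<FilesMatch "\.js\.gz$">
  ForceType text/javascript
  Header set Content-Encoding: gzip
</FilesMatch>

# Pre-compressed CSS: same idea with the CSS content type
<FilesMatch "\.css\.gz$">
  ForceType text/css
  Header set Content-Encoding: gzip
</FilesMatch>
[/sourcecode]

Because the rewrite happens before Apache maps the request to a file, these sections match the rewritten .gz filename rather than the URL the browser originally asked for.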