Prevent Browser Caching of Dynamic Content

Rob Gravelle
Just when you thought that Ajax was paving the way to a richer and more interactive browsing experience, it has run into some resistance from proponents of the Web 2.0 paradigm. In a nutshell, Web 2.0 is about making websites machine readable so that content can be shared seamlessly between unrelated sites. The rich-client goal of Ajax-based frameworks runs contrary to the Web 2.0 ideals in many ways, and it is Web developers who now find themselves in the crossfire. So what's a developer to do? Stick with straight HTML, CSS, and a bit of JavaScript to enable easy data sharing, or use Flash to emulate a desktop application and thus foster a richer client experience? Personally, I am all for ease of data sharing, but I believe that the Web should be about people first.

Today we'll be discussing the use of Ajax in client-side caching. It's really a two-pronged topic: first, it is beneficial for performance to cache data once it has been downloaded, because doing so minimizes server calls; second, you have to be able to force the browser to refresh some content when a server call is made. We tackled the first issue in Building a Client-Side Ajax Cache. This article concentrates on the second: content refreshing.

Before we get started, you may want to brush up on the basics of Ajax, since this article assumes that you already have a working knowledge of it.

Browser Caching of Dynamic Content

A typical use of Ajax is to refresh a portion of the page at a set interval. For example, imagine a page that shows the current weather conditions, which should be updated every so often, say every five minutes. The Prototype framework has just the thing: the PeriodicalUpdater. All we need to do is supply it with the ID of the page element that we want to update, the URL of the server resource, and a JavaScript object literal containing some optional parameters. We need to supply the frequency, in seconds, because five minutes differs from the default of two seconds. We'll attach the updater to the window's onload event using the Event.observe() method, which takes the object, the event name (without the 'on'), and the function to bind to it:

    Event.observe( window, 'load', function() {
      new Ajax.PeriodicalUpdater(
        'weather_panel',
        'weatherService.php',
        { frequency: 5 * 60 } // five minutes, in seconds
      );
    } );
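To make the idea of a periodical update concrete, here is a minimal sketch in plain JavaScript. It is not Prototype's source code: the function name startPeriodicalUpdates and its parameters are hypothetical, and the request and scheduler are passed in so the logic can be exercised without a browser.

```javascript
// Hypothetical stand-in for a periodical updater; not Prototype's own code.
// request(done): issues one server call and passes the response text to done().
// render(text):  puts the response into the page.
// schedule:      optional timer function; defaults to setTimeout.
function startPeriodicalUpdates(request, render, frequencySeconds, schedule) {
  var later = schedule || setTimeout;
  var stopped = false;

  function tick() {
    if (stopped) { return; }
    request(function (responseText) {
      if (stopped) { return; }
      render(responseText);
      // Schedule the next call only after this response has arrived,
      // so slow responses never overlap.
      later(tick, frequencySeconds * 1000);
    });
  }

  tick();
  return { stop: function () { stopped = true; } };
}
```

In a page, request would wrap whatever Ajax call you use and render would write into the weather_panel element; calling stop() on the returned object ends the cycle.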

You have to admit that was pretty easy, so far. The trouble is that different browsers implement caching in different ways. Internet Explorer, always the black sheep of browsers, caches dynamic pages in addition to static ones. This is a problem because, although the URL of the page may not change, the content will. This is especially true in the case of partial page updates. One workaround is to add a unique time stamp and/or random number to the end of the URL, using the Date object and/or Math.random(). Here is our PeriodicalUpdater with a random number appended as a query parameter called sid:

    new Ajax.PeriodicalUpdater(
      'weather_panel',
      // Evaluated once, when the updater is created: every request made by
      // this updater shares the same sid, so the value changes per page load.
      'weatherService.asmx/getWeather?sid=' + Math.random(),
      { frequency: 10 } // ten seconds
    );

This keeps the browser from reusing a cached copy, but there is a downside: each response is still stored in the cache under its own unique URL, adding useless, never-reused content and taking up space that could be better used for more static pages. Worse, more useful data will be purged from the cache to make way for these one-time responses.
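The time stamp variant of this workaround could look something like the sketch below. The helper name addCacheBuster is hypothetical and not part of Prototype; it has to be called each time a request is issued, because a URL built once is reused as-is. URLs containing a fragment (#) are not handled.

```javascript
// Hypothetical helper (not part of Prototype): returns the URL with a unique
// query parameter so the browser cannot match it to a cached entry.
function addCacheBuster(url, paramName) {
  var name = paramName || 'sid';
  // Use '&' when the URL already carries a query string, '?' otherwise.
  var separator = url.indexOf('?') === -1 ? '?' : '&';
  // Time stamp plus a random suffix, in case two calls share a millisecond.
  var token = new Date().getTime() + '-' + Math.floor(Math.random() * 1e9);
  return url + separator + name + '=' + token;
}
```

A call such as addCacheBuster('weatherService.php') yields the original URL followed by ?sid= and the generated token.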

A less obtrusive solution is to set the server call's request headers in such a way as to prevent caching of the dynamic content. The idea is basically to trick the browser into thinking that the content has expired so that a trip to the server is necessary. Most browsers implement some sort of caching, but they differ in how and when the cached data is revalidated. For instance, Firefox revalidates the cached response every time the page is refreshed, issuing an If-Modified-Since header set to the value of the Last-Modified header of the cached response. Internet Explorer, on the other hand, does so only if the cached response has expired (i.e., after the date of the received Expires header). The following set of headers should work in all major browsers:

    new Ajax.PeriodicalUpdater(
      'weather_panel',
      'weatherService.asmx/getWeather',
      {
        frequency: 10,
        requestHeaders: {
          "Pragma": "no-cache",
          "Cache-Control": "no-store, no-cache, must-revalidate, post-check=0, pre-check=0",
          "Expires": "0",
          // toUTCString() yields an HTTP date: Thu, 01 Jan 1970 00:00:00 GMT
          "Last-Modified": new Date(0).toUTCString(),
          "If-Modified-Since": new Date(0).toUTCString()
        }
      }
    );

In the above example, we modified the previous Ajax call to include the relevant request headers. Prototype allows you to include custom HTTP request headers using the requestHeaders option. It accepts name-value pairs as a hash (as in our example) or as a flattened array, like: ['X-Custom-1', 'value', 'X-Custom-2', 'other value'].
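To show how the two forms relate, here is a small sketch that turns the hash form into the flattened array form. The helper toFlatHeaderArray is hypothetical; Prototype accepts either form directly, so nothing like it is required in practice.

```javascript
// Hypothetical helper: converts { name: value } pairs into the flattened
// [name, value, name, value, ...] form.
function toFlatHeaderArray(headers) {
  var flat = [];
  for (var name in headers) {
    if (Object.prototype.hasOwnProperty.call(headers, name)) {
      flat.push(name, String(headers[name]));
    }
  }
  return flat;
}
```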

A Brief Lesson on Request Header Fields

As per the HTTP protocol, a request message may contain one or more header fields, formatted one per line in the form Header-Name: value, with each line ending in a carriage return and a line feed (CRLF). HTTP 1.1 defines forty-six headers, but only one, Host, is required in requests. Browsers usually send User-Agent as well; From is optional and rarely sent by browsers. The headers that interest us are those that deal with caching.
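As an illustration of that wire format, the sketch below assembles the head of a request as text. The function name buildRequestHead is hypothetical, and the host name used with it is only an example; browsers do this work for you.

```javascript
// Builds the head of an HTTP/1.1 request: the request line, then one header
// per line, each terminated by CRLF, then an empty line.
function buildRequestHead(method, path, headers) {
  var CRLF = '\r\n';
  var lines = [method + ' ' + path + ' HTTP/1.1'];
  for (var name in headers) {
    if (Object.prototype.hasOwnProperty.call(headers, name)) {
      lines.push(name + ': ' + headers[name]);
    }
  }
  return lines.join(CRLF) + CRLF + CRLF;
}
```

Called with 'GET', '/weatherService.php', and a hash containing Host and Pragma, it produces a request line followed by "Host: ..." and "Pragma: ..." on their own lines.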

Typically, a request header field consists of the name of the field and its directive(s), or instruction(s), separated by a colon. Each header has its own set of possible directives. Multiple directives are separated by commas (,). Some headers take name=value pair tokens instead of directives. Here is the format of the Pragma header:

    Pragma            = "Pragma" ":" 1#pragma-directive
    pragma-directive  = "no-cache" | extension-pragma
    extension-pragma  = token [ "=" ( token | quoted-string ) ]
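The comma-separated structure is easy to take apart in code. The following is a simplified sketch with a hypothetical function name, parseDirectives; it does not handle quoted strings that themselves contain commas.

```javascript
// Splits a header value (for example a Cache-Control or Pragma value) into
// its directives. Plain directives map to true; name=value tokens map to the
// value as a string.
function parseDirectives(headerValue) {
  var result = {};
  var parts = headerValue.split(',');
  for (var i = 0; i < parts.length; i++) {
    var part = parts[i].replace(/^\s+|\s+$/g, '');
    if (part === '') { continue; }
    var eq = part.indexOf('=');
    if (eq === -1) {
      result[part.toLowerCase()] = true;
    } else {
      // Strip surrounding double quotes from a quoted-string value.
      var value = part.slice(eq + 1).replace(/^"|"$/g, '');
      result[part.slice(0, eq).toLowerCase()] = value;
    }
  }
  return result;
}
```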

The easiest way to attach caching headers to pages that you never want cached is to use META tags. These provide information about the HTML document and are typically used to specify the page description, keywords, author, last-modified date, and other metadata. Metadata is not displayed on the page, but is machine parsable. A META tag's HTTP-EQUIV attribute names the header, and its CONTENT attribute supplies the value.

The first request header field in the example above is the Pragma field. It's used to include implementation-specific directives that might apply to any recipient along the request/response chain. All pragma directives specify optional behavior from the viewpoint of the protocol. Pragma is the HTTP 1.0 predecessor of the Cache-Control header introduced in HTTP 1.1, and it accepts one standard directive, no-cache. The next request header field is the Cache-Control header. It is used to specify directives that must be obeyed by all caching mechanisms along the request/response chain. The directives are intended to prevent caches from adversely interfering with the request or response, so they typically override the default caching algorithms. It is by far the most common header for caching and accepts many directives (for a complete list, there are a few sources in the references section at the end of this article). The Cache-Control header above combines five of the most common directives and tokens; the no-cache directive is repeated in the Pragma header in order to maintain backwards compatibility with the HTTP 1.0 protocol.

The Expires header field gives the date and time after which the response is considered stale. A stale cache entry may not normally be returned by a cache unless it is first validated with the origin server. The format is an absolute date and time as defined by the HTTP-date standard, such as: Expires: Thu, 01 Dec 1994 16:00:00 GMT. Interestingly, HTTP/1.1 clients must treat invalid date formats, especially including the value "0", as in the past (i.e., "already expired").
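The rule about invalid dates can be sketched as follows. The function isExpired is hypothetical, and as a simplification it accepts only the preferred date format shown above; anything else, including "0", counts as already expired.

```javascript
// Preferred HTTP date format, e.g. "Thu, 01 Dec 1994 16:00:00 GMT".
var HTTP_DATE = /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun), \d{2} (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{4} \d{2}:\d{2}:\d{2} GMT$/;

// Returns true when a response with this Expires value is stale at nowMs
// (milliseconds since the epoch).
function isExpired(expiresValue, nowMs) {
  var text = String(expiresValue);
  if (!HTTP_DATE.test(text)) { return true; }   // invalid, e.g. "0"
  var expiresMs = Date.parse(text);
  if (isNaN(expiresMs)) { return true; }        // matched the pattern but not a real date
  return expiresMs <= nowMs;
}
```

The pattern check comes first because some JavaScript engines will happily parse a bare "0" as a date.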

The Last-Modified and If-Modified-Since dates are compared against the server's copy of the document in order to determine whether or not to refetch it. The strategy here is to create a new Date with a parameter of zero. The JavaScript engine then creates a date at the very start of Unix time, or POSIX time, which is counted from midnight UTC on January first, 1970 (JavaScript's Date counts milliseconds rather than seconds). Using a date so far in the past all but guarantees that the server's copy will be newer.
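Here is a short sketch of why the epoch date works. The comparison function isModifiedSince is hypothetical and only imitates the kind of check a server performs; real servers differ in the details.

```javascript
// new Date(0) is the start of Unix time; toUTCString() renders it as an HTTP date.
var epochHeader = new Date(0).toUTCString();

// Hypothetical server-side check: has the resource changed since the date the
// client sent in If-Modified-Since? HTTP dates have one-second resolution, so
// the resource time is truncated to whole seconds before comparing.
function isModifiedSince(ifModifiedSince, lastModifiedMs) {
  var sinceMs = Date.parse(ifModifiedSince);
  if (isNaN(sinceMs)) { return true; }   // unreadable date: send the full response
  return Math.floor(lastModifiedMs / 1000) * 1000 > sinceMs;
}
```

Any resource modified after 1970 compares as newer than epochHeader, so the server has no grounds to answer that nothing has changed.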

Server Solutions

Another approach is to tackle the problem from the server side. Here, the goal is to modify the response headers. Just as the request contains one or more request headers, the server reply contains at least one response header. Typical ones include the Content-Type of the resource, the Content-Length of the reply, and the Server name and version information. To see the response headers supplied by my Abyss test server, shown in Figure 1, all I had to do was add an onSuccess callback to the options hash:

    new Ajax.PeriodicalUpdater(
      'weather_panel',
      'weatherService.asmx/getWeather',
      {
        frequency: 10,
        requestHeaders: {
          "Pragma": "no-cache",
          "Cache-Control": "no-store, no-cache, must-revalidate, post-check=0, pre-check=0",
          "Expires": "0",
          "Last-Modified": new Date(0).toUTCString(), // January 1, 1970
          "If-Modified-Since": new Date(0).toUTCString()
        },
        // new code: show the response headers
        onSuccess: function(transport) {
          alert( transport.getAllResponseHeaders() );
        }
      }
    );

Figure 1
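The string returned by getAllResponseHeaders() holds one header per line, so it can be turned into a lookup object with a few lines of code. The helper below is a hypothetical sketch, not a Prototype feature.

```javascript
// Parses the text returned by getAllResponseHeaders() into an object keyed
// by lower-cased header name.
function parseResponseHeaders(rawHeaders) {
  var headers = {};
  var lines = rawHeaders.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var colon = lines[i].indexOf(':');
    if (colon === -1) { continue; }   // skip blank or malformed lines
    var name = lines[i].slice(0, colon).toLowerCase();
    var value = lines[i].slice(colon + 1).replace(/^\s+|\s+$/g, '');
    // A header that appears more than once is combined with commas.
    if (Object.prototype.hasOwnProperty.call(headers, name)) {
      headers[name] = headers[name] + ', ' + value;
    } else {
      headers[name] = value;
    }
  }
  return headers;
}
```

Inside the onSuccess callback, parseResponseHeaders(transport.getAllResponseHeaders())['cache-control'] would then give the caching directives the server sent, if any.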

How you set the response headers depends heavily on what language you are working in, but all Web languages include some variation of setResponseHeader(), such as setHeader(), or even just header(). While it would be highly impractical to demonstrate how to set the response headers in every language out there, we can look at a few popular ones to give you a foundation as to which headers to include and how to go about setting them.

Here is some Java servlet code setting the relevant fields:

    protected void before(ActionInvocation actionInvocation) throws Exception {
        HttpServletResponse response = ServletActionContext.getResponse();
        response.setHeader("Pragma", "no-cache");
        // setHeader() replaces any earlier value for the same name, so all
        // Cache-Control directives go into a single call.
        response.setHeader("Cache-Control", "no-store, no-cache, must-revalidate");
        response.setDateHeader("Expires", 0);
    }

Here is some PHP, embedded in the page:

Finally, here are two ways to disable browser caching using ASP.NET:

Using code:

By including the following declaration in your page:

    <%@ OutputCache Location="None" VaryByParam="None" %>

Here is what came back when I switched the weatherService with an ASP.NET page and included the C# snippet above:

Figure 2

Aside from the Cache-Control, Pragma, and Expires cache-related lines, you'll notice that there is one extra header called Test. As you can see, it is possible to create any header you want using the Response.AppendHeader() method.
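For comparison, the same response headers can be set from server-side JavaScript. The sketch below is our own addition and assumes a response object that offers setHeader(name, value), as the response object of Node.js's built-in http module does; the function name applyNoCacheHeaders is hypothetical.

```javascript
// Assumption: `response` has a setHeader(name, value) method, like the
// response object passed to a Node.js http server callback.
function applyNoCacheHeaders(response) {
  response.setHeader('Pragma', 'no-cache');
  // One call carrying every directive; a second setHeader() call for the
  // same name would replace this value rather than add to it.
  response.setHeader('Cache-Control', 'no-store, no-cache, must-revalidate');
  response.setHeader('Expires', '0');
  return response;
}
```

In a Node.js server, this would be called on the response before the body is written.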

Having the ability to manipulate request and response headers opens up a world of possibilities. Once you can prevent browsers from caching your dynamic page content, it's only a stone's throw to controlling when and for how long page elements are cached. ASP.NET's Cache object is a class that supports advanced caching mechanisms. Caching can be done on either the client or the server and can be dictated by a variety of factors, including query string parameters, form controls, and even the browser type. As the Web becomes increasingly dynamic, an efficient caching strategy will become more integral to the performance and success of websites.

Published by Rob Gravelle

Rob Gravelle resides in Ottawa, Canada, and is the founder of GravelleConsulting.com. Rob has built systems for Intelligence-related organizations such as Canada Border Services, CSIS as well as for numerous...
