Application Cache is a Douchebag

by Jake Archibald, May 08, 2012

Published in Application Development, HTML, JavaScript

Good morning! Over in “castle Lanyrd” we recently launched our mobile site, which caches data on events you’re attending for viewing offline. I’ve boiled the offline bits down to a simple demo and posted all the code on GitHub. But before we delve into the code, let me tell you a true story. Totally true.


I was at a party, one where the guests were mostly strangers to one another. I was part of a little huddle that was awkwardly trying to make introductions. A rather pretty lady turned to one of the shyer members of the group, introduced herself as “Dev,” and asked “So, what do you do then?”

“Oh, I’m the LocalStorage” he replied, shuffling uncomfortably. “I provide a scripting interface for text storage maintained across pages and browser sessions.”

“Yeah, he’s basically a shelf!” interrupted another. The group tittered at LocalStorage’s expense. I stayed silent, as I’d come to know this guy pretty well.

Another member of the group piped up. “Hi, I’m ApplicationCache,” he said, as he reached over to shake Dev’s hand. “I turn your offline experience from sucks-ass, to success. Just one extra file, and bosh! It works. No fuss, no ‘scripting’ necessary.” Yes, he did finger-quotes while saying ‘scripting.’ I’m gritting my teeth at this point, because I know he’s greatly exaggerating his abilities and the others don’t see it. However, if I call “bullshit” on it I’ll seem like the jerk.

I felt bad for not doing anything on that completely real evening. It’s painful to see articles that praise ApplicationCache’s ease of use, written by people who’ve clearly only met him in passing. I must set the record straight: I’m here to tell you ApplicationCache is a douchebag.

Now, I don’t mean he’s useless or should be avoided, you just have to be very careful when and how you work with him. If you get it wrong, the douchebaggery oozes through onto the end user. By reading through my own painful experiences with AppCache, you’ll know what to expect from AppCache and how to deal with it.

When is offline access useful?

We’re better connected than we’ve ever been, but we’re not always connected. For instance, I’m writing this on a train hurtling through the data-barren plains of West Sussex. Alternatively, you may choose to be offline. When I use data abroad I can almost hear the champagne corks popping in the offices of my network provider. I know I’m haemorrhaging money when I data-roam, but the internet has the data I need, and it won’t let me at most of it without a working connection.

Sites that are useful offline generally fall into two categories: ones that let you do stuff and ones that let you look stuff up.

“Look stuff up” sites include Wikipedia, YouTube, and Twitter. The heavy lifting tends to be on the server and there’s a large amount of data available but users only use a fraction of it.

“Do stuff” sites include Cut the Rope, CSS Lint, and Google Docs. With these sites the heavy lifting tends to be done on the client. They offer a limited amount of data, but you can use that data in multiple ways, or create your own data. This is the case Application Cache was designed for, so we’ll look at that first for a nice easy introduction.

Offlining a “do stuff” site

Yes, that’s right, I verbified “offline.” Yes, I verbified “verb.” Feel free to inbox me grammar complaints that I’ll trashinate.

Sprite Cow fits into the “do stuff” box. CSS sprites are great for performance, but finding out the size and position of an item in the sprite sheet can be fiddly. Sprite Cow loads sprite sheets, and spits out the CSS to display a particular portion of it. It’s one html file, a few assets, and all the processing is done on the client; the server does nothing except serve files.

It would be nice to be able to use Sprite Cow on the train, as we can with native apps. To do this, we create a manifest file listing all the assets the site needs:

CACHE MANIFEST
assets/6/script/mainmin.js
assets/6/style/mainmin.css
assets/6/style/fonts/pro.ttf
assets/6/style/imgs/sprites1.png

…then link that manifest to the html page via an attribute:

<html manifest="offline.appcache">

The HTML page itself isn’t listed in the manifest. Pages that associate with a manifest become part of it.

In practice, if you visit Sprite Cow with a data connection, you’ll be able to visit it subsequently without one.

The resource tab in Chrome’s Web Inspector will show you the files picked up by the manifest, and which page pointed to it. If you need to clean these caches, see chrome://appcache-internals/.

At first, this can seem like a magic bullet to the problem. Unfortunately I was lying when I said this would be an easy introduction. The ApplicationCache spec is like an onion: it has many many layers, and as you peel through them you’ll be reduced to tears.

Gotcha #1: Files always come from the ApplicationCache, even if you’re online

When you visit Sprite Cow, you’ll instantly get the version from your cache. Once the page has finished rendering, the browser will look for updates to the manifest and cached files.

This sounds like a bizarre way of doing things, but it means the browser doesn’t have to wait for connections to time out before deciding you’re offline.

The ApplicationCache fires an updateready event to let us know there’s updated content, but we can’t simply refresh the page at this point, because the user may have already interacted with the version they have.

This isn’t a big deal here, because the old version is probably good enough. If need be, we can display a little “An update is available. Refresh to update” message. You may have seen this on Google apps such as Reader and Gmail.
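A minimal sketch of that “update available” prompt, assuming the page can reach window.applicationCache. The helper name watchForUpdate and the banner markup are made up for illustration:

```javascript
// Hypothetical helper: runs notify() once a fresh version has been downloaded
function watchForUpdate(appCache, notify) {
  // The download may have finished before this script ran
  if (appCache.status === appCache.UPDATEREADY) {
    notify();
    return;
  }
  appCache.addEventListener('updateready', function () {
    notify();
  });
}

// Browser wiring, skipped anywhere ApplicationCache is missing
if (typeof window !== 'undefined' && window.applicationCache) {
  watchForUpdate(window.applicationCache, function () {
    var banner = document.createElement('p');
    banner.textContent = 'An update is available. Refresh to update.';
    document.body.appendChild(banner);
  });
}
```

The user stays on the old version until they choose to refresh, so nothing they are in the middle of gets thrown away.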

Oh, remember four paragraphs ago when I said that ApplicationCache looks for updated content after rendering the page? I lied.

Gotcha #2: The ApplicationCache only updates if the content of the manifest itself has changed

HTTP already has a caching model. Each file can define how it should be cached. Even at a basic level, individual files can say “never cache me,” or “check with the server, it’ll tell you if there’s an update,” or “assume I’m good until 1st April 2022.”

However, imagine you had 50 html pages in your manifest. Each time you visit any of them while online, the browser would have to make 50 http requests to see if they need to be updated.

As a slightly unusual workaround, the browser will only look for updates to files listed in the manifest if the manifest file itself has changed since the browser last checked. Any change that makes the manifest byte-for-byte different will do.

This works pretty transparently for static assets, which are ideally served via a content delivery network and never change. When the CSS, JavaScript, or other asset changes, it’s served under a different url, which means the content of the manifest changes. If you’re unfamiliar with far-future caching and CDNs, check out Yahoo!’s performance best-practice guide.

Some resources, such as our HTML pages, cannot simply change their url like this. ApplicationCache won’t look for updates to these files without a friendly nudge. The simplest way to do this is to add a comment to your manifest and change that when necessary.

CACHE MANIFEST
# v1
whatever.html

Comments start with # in manifests. If I update whatever.html I’d change my comment to # v2, triggering an update. You could automate this by having a build script output something similar to ETags for each file in the manifest as a comment, so each file change is certain to change the content of the manifest.
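As a rough sketch of that build step, the function below writes a fingerprint comment above each entry. The input shape (a list of url and content pairs) and both function names are assumptions for illustration, and the tiny FNV-1a hash is only a dependency-free stand-in for whatever checksum your build already produces:

```javascript
// Tiny FNV-1a hash: a stand-in for a real ETag or checksum
function fingerprint(text) {
  var hash = 2166136261;
  for (var i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 16777619);
  }
  return (hash >>> 0).toString(16);
}

// files: [{ url: 'whatever.html', content: '...' }], a made-up input shape
function buildManifest(files) {
  var lines = ['CACHE MANIFEST'];
  files.forEach(function (file) {
    // Any edit to the file changes this comment, and so the manifest's bytes
    lines.push('# ' + fingerprint(file.content));
    lines.push(file.url);
  });
  return lines.join('\n') + '\n';
}
```

Because the comment is derived from the file content, nobody has to remember to bump a version number by hand.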

However, updating the text in the manifest doesn’t guarantee the resources within will be updated from the server.

Gotcha #3: The ApplicationCache is an additional cache, not an alternative one

When the browser updates the ApplicationCache, it requests urls as it usually would. It obeys regular caching instructions: If an item’s header says “assume I’m good until 1st April 2022” the browser will assume that resource is indeed good until 1st April 2022, and won’t trouble the server for an update.

This is a good thing, because you can use it to cut down the number of requests the browser needs to make when the manifest changes.

This can catch people out while they’re experimenting if their servers don’t serve cache headers. Without specifics, the browser will take a guess at the caching. You could update whatever.html and the manifest, but the browser won’t update the file because it’ll “guess” that it doesn’t need updating.

All files you serve should have cache headers, and this is especially important for everything in your manifest and the manifest itself. If a file is very likely to update, it should be served with no-cache. If the file changes infrequently, must-revalidate is a better bet. For example, must-revalidate is a good choice for the manifest file itself. Oh, while we’re on the subject…

Gotcha #4: Never ever ever far-future cache the manifest

You might think you can treat your manifest as a static file, as in “assume I’m good until 1st April 2022,” then change the url to the manifest when you need to make updates.

No! Don’t do that! *slap*

Remember Gotcha #1: When the user visits a page a second time they’ll get the ApplicationCached version. If you’ve changed the url to the manifest, bad luck, the user’s cached version still points at the old manifest, which is obviously byte-for-byte the same so nothing updates, ever.
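To make Gotchas #3 and #4 concrete, here is one possible policy for choosing a Cache-Control value per file. It is only a sketch: the function name, the .appcache extension check, and the assumption that versioned assets are named like script-v1.js are mine, and the exact header values are a starting point rather than a recommendation from any spec:

```javascript
// Hypothetical policy: returns a Cache-Control value for a url path
function cacheControlFor(path) {
  // The manifest must always be checked with the server (Gotcha #4)
  if (/\.appcache$/.test(path)) {
    return 'max-age=0, must-revalidate';
  }
  // Versioned assets get a new url when they change, so far-future is safe
  if (/-v\d+\.(js|css|png|ttf)$/.test(path)) {
    return 'max-age=31536000';
  }
  // Everything else, such as HTML pages, is revalidated before reuse
  return 'no-cache';
}
```

For example, cacheControlFor('offline.appcache') never hands out a far-future lifetime, while cacheControlFor('js/script-v1.js') does.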

Gotcha #5: Non-cached resources will not load on a cached page

If you cache index.html but not cat.jpg, that image will not display on index.html even if you’re online. No, really, that’s intended behaviour, see for yourself.

To disable this behaviour, use the NETWORK section of the manifest:

CACHE MANIFEST
# v1
index.html

NETWORK:
*

The * indicates that the browser should allow all connections to non-cached resources from a cached page. Here, you can see it applied to the previous example. Obviously, these connections will still fail while offline.

Well done; you made it through the simple playing-to-its-strengths ApplicationCache example. Yes, really, that’s the simple case. Sorry. Let’s try something a bit tougher.

Offlinerifying a “look up stuff” site

As I mentioned at the start of this article (Remember how much happier you were back then?), Lanyrd recently launched a mobile website so people could look up conference schedules, locations, attendees, etc. Offline access to this data is important while you’re travelling and faced with data roaming charges.

There’s too much content to offlinificate everything, but a single user is generally only interested in events they’re participating in.

The late Dive into HTML5 gives us an example of how you could offlinerize Wikipedia, another “look up stuff” site. It works by having an almost-empty manifest which every page links to, so as users navigate around the site, those pages implicitly become part of their cache. While offline, they’ll be able to visit any of the pages they previously visited.

That solution is brilliantly simple, but thanks to a few “lumpy bits” in the specification, it’s completely disastrous. For starters, the user isn’t given any indication of which content is available while they’re offline, and there isn’t a JavaScript API we could use to get at that information. We could download and parse the manifest with JavaScript, but all these Wikipedia pages are implicitly cached, so they aren’t listed.

Furthermore, remember Gotcha #1: the cached version will be shown rather than a version from the server. The page will be frozen for the user when they first look at it, but as we found in Gotcha #2, we can trigger the browser to look for updates by changing the text in the manifest file. However, when do we change the manifest file? Whenever a Wikipedia entry is updated? That would be way too frequent. In fact, if a manifest changes between starting and ending an update, the browser will consider this an update failure (step 24).

The frequency of these updates is a problem, but it’s the weight of these updates that’s the real killer. Consider the number of Wikipedia pages you’ve browsed—hundreds? thousands? An AppCache update would involve downloading every single one of those pages. AppCache doesn’t give us a way to remove implicitly cached items, so that number is going to keep growing and growing until it hits some kind of browser cache limit and the world explodes. That’s not great.

What do we want from an offlinable reference site?

My requirements for a reference site with offline capabilities are thus. It must:

show up-to-date data while online, as it would without ApplicationCache,
allow us (the developers) to control which content is cached, when it’s cached, and how it’s cached,
allow us to defer some of that control to the users, perhaps in the form of a “Save offline” or “Read later” button, and
give the browser what it needs to show content offline after a single visit to any page.

ApplicationCache, for all its bragging, doesn’t make this easy. If a page has a manifest, it becomes cached. You can’t have a page tell the browser about offline capabilities without that page being cached itself.

Limiting the reach of the ApplicationCache

The easiest way to make a page behave as it would without ApplicationCache is to serve it without a manifest. We can tell the browser about the offline stuff by injecting a hidden iframe pointing to a page that does have a manifest.
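A minimal sketch of that injection, assuming the script runs once document.body exists. The helper name and the url of the manifest-carrying page are hypothetical:

```javascript
// Hypothetical helper: adds a hidden iframe pointing at a page with a manifest
function injectManifestFrame(doc, src) {
  var iframe = doc.createElement('iframe');
  iframe.style.display = 'none';
  iframe.src = src;
  doc.body.appendChild(iframe);
  return iframe;
}

// Browser wiring; '/offline/manifest-holder.html' is a made-up url
if (typeof document !== 'undefined') {
  injectManifestFrame(document, '/offline/manifest-holder.html');
}
```

The outer page carries no manifest attribute of its own, so it keeps coming from the server as usual.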

Let’s give that a spin. Visit this page while online. Once you’ve done that you’ll be able to visit this page while offline. The first page isn’t part of the cache, so the user will always get the most up-to-date information from it.

That’s not very impressive though. If you visit the first page while offline, you’ll get nothing.

Falling back

The ApplicationCache manifest lets us specify a fallback resource to use when another request fails.

CACHE MANIFEST
FALLBACK:
/ fallback.html
/assets/imgs/avatars/ assets/imgs/avatars/default-v1.png

This tells the ApplicationCache to display fallback.html whenever a request fails, unless the request fails within /assets/imgs/avatars/, in which case a fallback image will be used.
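The lookup can be pictured as a prefix match where the longest matching prefix wins. The function below is a simplified model of that idea, not the browser’s actual algorithm, and the rule objects are a made-up representation of the FALLBACK lines:

```javascript
// rules: [{ prefix: '/', fallback: 'fallback.html' }, ...], a made-up shape
function pickFallback(path, rules) {
  var best = null;
  rules.forEach(function (rule) {
    var matches = path.indexOf(rule.prefix) === 0;
    // Prefer the most specific (longest) matching prefix
    if (matches && (!best || rule.prefix.length > best.prefix.length)) {
      best = rule;
    }
  });
  return best ? best.fallback : null;
}

var exampleRules = [
  { prefix: '/', fallback: 'fallback.html' },
  { prefix: '/assets/imgs/avatars/', fallback: 'assets/imgs/avatars/default-v1.png' }
];
```

With exampleRules, a failed request for an avatar image picks the default avatar, and any other failed request under / picks fallback.html.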

To see how this works in practice, visit this page. Visit it again without a network connection, and a fallback page will be shown instead. Notice how it isn’t a hard redirect? The url for the original page remains, and this will be useful.

Incidentally, having a fallback relaxes the network blocking rules we encountered in Gotcha #5. Now connections are allowed within the same domain, but you’ll still need to use the network wildcard for connections to other domains.

Now we’re getting somewhere—we’re showing a cached page only if the regular page didn’t succeed.

Using ApplicationCache for static content only

By “static content,” I mean content that never changes. Images, scripts, styles, and our fallback page.

CACHE MANIFEST
js/script-v1.js
css/style-v1.css
img/logo-v1.png
FALLBACK:
/ fallback/v1.html
/imgs/avatars/ imgs/avatars/default-v1.png

If we needed to make a change to the JavaScript, we’d upload a new file at a new url, script-v2.js, for instance.

This works around Gotchas #1 and #2: The user will never be served an out-of-date script or style because it’ll have a fresh url when it changes. We don’t have to deal with comment-based version numbers in the manifest because the text change in the url is enough to trigger an update. All resources will have a far-future cache, so only changed files will need an http request to update.

Gotcha #6: Bye bye conditional downloads

Oh come on, you didn’t think you’d get through a whole article without a reference to responsive design did you?

Do you have two sets of design images? Is one of them much smaller and lighter for people viewing on mobile devices? Do you use media queries to decide which of these to display? Well, ApplicationCache hates you and your family.

All images required to render your site go in the manifest and the browser downloads all of these. In the case of responsive images, the user ends up downloading both versions of the same asset. This defeats the point. Just use desktop-resolution imagery and resize it down on the client using CSS background size.

If the mobile version has a completely different design, at least put them into a sprite sheet along with the “high res” graphics so they can benefit from palette-based png compression together.

The same rule is true of fonts. I saw some idiot recommending using lots of font formats, which is all well and good for regular sites, but we can’t have all those in the manifest. For offline use, go with True Type Fonts (TTF) only. “Hey, isn’t Web Open Font Format (WOFF) the future?” Yeah, probably, but only for legal reasons. There’s no technical benefit to WOFF over TTF. Ok, WOFF has built-in compression but it’s no better than gzipping a TTF. Also, WOFF isn’t supported by the older versions of many browsers, whereas TTF support extends much further.

Anyway, back to Application Cache.

Using LocalStorage for dynamic offlinerification

We can’t offlinerify all our content because there’s too much. We want the user to pick what they want to have available offline. We can use LocalStorage to store that data.

Yes, LocalStorage is just a shelf, but it’s an extremely useful shelf that’s very simple to use. You can put whatever text data you want in there and get it back later, from any page on the same domain.

LocalStorage is stored on disk, so using it is cheap but not free. Because of this, we should keep the number of reads and writes down, and keep each read and write as small as we can.

We’re going to use an entry for each page we store, and an additional entry to keep track of what we’ve saved offline, along with their page titles. This means we can list all the pages we have cached with one read, and display a particular page with two.

So, to save the page articles/1.html for offline use, we do this:

// Get the page content
var pageHtml = document.body.innerHTML;
var urlPath = location.pathname;
// Save it in local storage
localStorage.setItem( urlPath, pageHtml );
// Get our index, starting a fresh one if nothing has been saved yet
var index = JSON.parse( localStorage.getItem( 'index' ) ) || {};
// Set the title and save it
index[ urlPath ] = document.title;
localStorage.setItem( 'index', JSON.stringify( index ) );

Then, if the user visits articles/1.html without a connection, they’ll get fallback.html, which does the following:

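// The fallback keeps the original url, so use it to find the page
var urlPath = location.pathname;
var pageHtml = localStorage.getItem( urlPath );
if ( !pageHtml ) {
 document.body.innerHTML = 'Page not available';
}
else {
 document.body.innerHTML = pageHtml;
 // The index is stored as JSON, so parse it before reading the title
 document.title = JSON.parse( localStorage.getItem( 'index' ) )[ urlPath ];
}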

We can also parse the index with JSON.parse( localStorage.getItem( 'index' ) ) and iterate over it to get details on all the pages the user has cached.
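Listing the cached pages might look something like this. It assumes the index entry holds a JSON object mapping url paths to page titles, as in the saving code above; the function name is made up, and storage can be window.localStorage or anything else with a getItem method:

```javascript
// Hypothetical helper: returns [{ url, title }] for every page saved offline
function listCachedPages(storage) {
  // The index is stored as JSON text; treat a missing index as empty
  var index = JSON.parse(storage.getItem('index') || '{}');
  return Object.keys(index).map(function (urlPath) {
    return { url: urlPath, title: index[urlPath] };
  });
}
```

That is the single read mentioned earlier: one getItem call is enough to build an “available offline” list.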

Putting it all together

Here’s a demo of the above in action. The article pages can be cached via a button in the top-right of the page, and the index page will indicate which pages are available offline.

All of the code is available on GitHub. Any page can be cached with a call to offliner.cacheCurrentPage(). This is called on every visit to the index page, and on every visit to a page the user wishes to be cached.

If the user ends up on the fallback page, offliner.renderCurrentPage() is called, which renders the intended page. If we don’t have a page to show, an error message is displayed. Oh, that reminds me…

Gotcha #7: We don’t know how we got to the fallback page

When we can’t display a particular page, the error we show is rather vague. According to the spec, the fallback page is shown if the original request results in “a redirect to a resource with another origin (indicative of a captive portal), or a 4xx or 5xx status code or equivalent, or if there were network errors (but not if the user cancelled the download).”

This is good in some ways. If the user is online but our site goes down, their browser will just show cached data and the user might not even notice! Unfortunately, we’re not given access to the reason for the fallback. It could be because the user has no connection, it could be that they’ve followed a broken url or mistyped it, or we could have a server fault. We just don’t know.

Oh, did you spot that little bit about redirects?

Gotcha #8: Redirects to other domains are treated as failures

Yes, that’s right, another gotcha. I’m surprised you’re still reading. If you want to go and lock yourself in a toilet cubicle and refuse to come out until the internet has gone away, I’d totally understand.

If one of your urls decides it needs to redirect to Twitter or Facebook to do some authentication, our friendly Application Cache will decide that’s NOT ALLOWED and show our fallback page instead.

This rule has good intentions. If the user tries to visit your site and the wifi they’re using redirects them to http://rubbish-network/pay-for-wifi-access, showing our site’s fallback page instead is great.

In the case of intended spontaneous auth redirects, white-listing these in the NETWORK section doesn’t work. Instead, you’ll have to use a JavaScript redirect or a meta-redirect. Urgh.
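One way to picture the workaround: instead of answering with an HTTP redirect, the server answers with a small ordinary page that moves the browser along itself. The sketch below builds such a page as a string. The function name and the example url are hypothetical, and how you serve the result depends entirely on your server:

```javascript
// Hypothetical helper: builds a page that sends the browser on to authUrl
function buildRedirectPage(authUrl) {
  // Escape the url for use inside an HTML attribute
  var attr = authUrl.replace(/&/g, '&amp;').replace(/"/g, '&quot;')
    .replace(/</g, '&lt;').replace(/>/g, '&gt;');
  // JSON.stringify gives a quoted JavaScript string; escaping "<" stops the
  // url from closing the script block early
  var js = JSON.stringify(authUrl).replace(/</g, '\\u003c');
  return '<!DOCTYPE html>\n' +
    '<meta http-equiv="refresh" content="0; url=' + attr + '">\n' +
    '<script>location.replace(' + js + ');</script>\n';
}
```

The meta tag covers browsers with JavaScript switched off, and the script covers everyone else.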

Downsides to the LocalStorage approach

“Oh, we’re onto the downsides now? And what exactly was the rest of the article?” Yes, I know, please don’t hit me. There are some disadvantages over the plain ApplicationCache solution.

JavaScript is required, whereas Sprite Cow’s use of ApplicationCache isn’t JavaScript-dependent (although the rest of the site is). I’m going to stick my neck out and say there’ll be very few users with ApplicationCache support but no JavaScript support. That’s not to say no-JavaScript support isn’t important. Lanyrd’s mobile site works fine without JavaScript. In fact, we avoid parsing JavaScript on older devices to keep things simple and quick.

The experience on a temperamental connection isn’t great. The problem with FALLBACK is that the original connection needs to fail before any falling-back can happen, which could take a while if the connection comes and goes. In this case, Gotcha #1 (files always come from the ApplicationCache) is pretty useful.

It doesn’t work in Opera. Opera doesn’t support FALLBACK sections in manifests properly. Hopefully they’ll fix this soon.

What does m.lanyrd.com do differently?

The demo I showed earlier is a simplification of what we do at Lanyrd. Instead of storing the page’s HTML for offline use, we store JSON data in LocalStorage and templates for that data in ApplicationCache. This means we can update the templates independently of the data, and use one set of data in multiple templates.

Our JSON data is versioned per conference. These version numbers are checked when you navigate around the site. If they don’t match what you have stored, an update is downloaded. This means you only make a request for updated data when there’s an update.
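Lanyrd’s own code isn’t shown here, so the following is only a guess at the shape of such a check. The key format, the stored { version, data } object, and both function names are assumptions:

```javascript
// Hypothetical layout: one entry per conference, holding { version, data }
function needsUpdate(storage, conferenceId, serverVersion) {
  var stored = storage.getItem('conference:' + conferenceId);
  // Nothing saved yet, so we definitely need to download
  if (!stored) { return true; }
  return JSON.parse(stored).version !== serverVersion;
}

function saveConference(storage, conferenceId, version, data) {
  storage.setItem(
    'conference:' + conferenceId,
    JSON.stringify({ version: version, data: data })
  );
}
```

The page only needs to carry the current version number; the heavier JSON is requested when needsUpdate says the stored copy is stale.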

Rather than provide a button to let the user store a particular page offline, we cache data for events the user is tracking or attending. As a result, the server knows what the user wants offline, so if they change devices or lose their cache somehow, we can quickly repopulate it.

Switching pages is done via XMLHttpRequest and pushState. This is much faster on mobile devices as it doesn’t have to reparse the JavaScript on each page load, and it makes it feel a bit more like an app than a site.

Oh, go on, for old times’ sake…

Gotcha #9: An extra hoop to jump through for XHR

You can make XHR requests to cached resources while offline. Unfortunately, older versions of WebKit finish the request with a status of 0, which popular libraries interpret as a failure.

// In jQuery…
$.ajax( url ).done( function(response) {
 // Hooray, it worked!
}).fail( function(response) {
 // Unfortunately, some requests
 // that worked end up here
});

In the wild, you’ll see this on the Blackberry Playbook, and devices that use iOS3 and Android 3/4. Android 2’s browser doesn’t have this bug. Oddly, it seems to run a newer version of WebKit. It also supports history.pushState whereas the browsers on later versions of Android do not. FANKS ANDROID. Here’s how you work around this issue:

$.ajax( url ).always( function(response) {
 // Exit if this request was deliberately aborted
 if (response.statusText === 'abort') { return; }
 // Does this smell like an error?
 if (response.responseText !== undefined) {
  if (response.responseText && response.status < 400) {
   // Not a real error, recover the content
   response = response.responseText;
  }
  else {
   // This is a proper error, deal with it
   return;
  }
 }
 // do something with 'response'
});

And here’s a demo where you can test that out.

ApplicationCache: your friendly douchebag

I’m not saying that ApplicationCache should be avoided, it’s extremely useful. We all know someone who talks themselves up, or needs “observing” more than others in case they do something really stupid. ApplicationCache is one of those people.

ApplicationCache can, under careful instruction, do stuff no one else can. But when he says “You don’t need to hire a plumber! I’ll fit your bathroom for you! I did all the bathrooms in Buckingham Palace, y’see…” turn him down gently.

If you’re creating anything more complicated than a self-contained client-side “app,” you’ll have happier times using it to an absolute minimum and getting LocalStorage to do the rest.

48 Reader Comments

Seannachie says:

May 8, 2012 at 10:56 am

Well, I didn’t read this article simply because of the title. Regardless of author credentials, knowledge, or even just a desire to grab reader attention, the article title lacks any sort of professionalism, not to mention class. And people wonder why this industry is often viewed with disrespect and populated by the generally immature and anti-social.
Jake Archibald says:

May 8, 2012 at 11:27 am

Despite the title of your comment being a sarcastic dig, I decided to give you the benefit of the doubt & read your full comment. In terms of professionalism and complaining about immaturity and anti-social behaviour, I think you set off on the wrong foot there.

If you _had_ read the article you would have found its tone to be playful rather than aggressive or anti-social, or that was certainly my intention. Characterising the specification as a douchebag was simply a storytelling device, trying to make the article less bland than a specification. If my intentions don’t come across in the article as a whole, that’s my fault. If you judge my article after sampling the first 5 words, that’s your problem I’m afraid.

If you’re interested in my thoughts on offensiveness, I rambled a bit over at Hacker News: http://news.ycombinator.com/item?id=3807640
designbradford says:

May 8, 2012 at 11:48 am

I fought a very similar, or nearly identical battle 6 month ago, but gave up after Gotcha #5! Wish I would have had this resource then.
stephband says:

May 8, 2012 at 12:44 pm

It’s FULL of insight and specific detail, and is going to help we who are treading in your footsteps enormously. And yes, the tone hits the mark perfectly. Thanks!
DyanaRose says:

May 8, 2012 at 12:49 pm

I wish people would use the term “anti-social” properly.

Annoys the hell out of me.

That being said, ta for the article. Just getting into offline storage myself and quite useful for thinking about what it is I need the storage to do vs using what it is I’ve seen used before.
Seannachie says:

May 8, 2012 at 12:50 pm

I’m all for adding jokes and humor to articles and presentations, they make the content more interesting and memorable for viewers/listeners, and more enjoyable for the author/presenter. And I also agree that it is impossible to please everyone and that you are bound to offend someone regardless of how hard you try not to.

However, my point here is that the use of humor, metaphors, and analogies, even critically, can be done using terms and references that aren’t blatantly derogatory or that are likely to offend a large number of people. I would hope your ultimate goal would be for your article to reach the largest audience possible, but I feel your title limits that potential.

As an author you can write on any topic you desire and entitle it however you wish, and I respect that. I just politely disagree with your choice of title and subsequent analogy and feel you could have conveyed the same ideas using more “user-friendly” terminology.
grayghostvisuals says:

May 8, 2012 at 12:50 pm

Wow! what an unprofessional, classless writer you are with no taste and humor whatsoever! :)p

I was excited to see that Sprite Cow was used as an example of whats possible with this App Cache thingy and native Web apps.

First and foremost I had the pleasure of catching a Lazy Web Request on Github by Paul Irish which lead to this project on Github https://github.com/h5bp/mothereffinganimatedgif. When I saw the request I immediately thought about Sprite Cow and proceeded to jump in on the chance to implement the app cache for this project. Without going into every detail I will say Jonathan Stark was a huge help for me with his research and suggestions.

Charles is a great proxy sniffer and will certainly help with debugging the app cache. Also I wrote about the app cache with WordPress and Typekit and I will say it is very tricky especially with Typekit (see my article and Gist I included).

Article
http://blog.grayghostvisuals.com/wordpress/cache-manifest-for-wordpress

Github Gist from Article Above
https://gist.github.com/1629154

The App Cache can def be a tricky, unrelenting, finicky D-Bag for sure 😉
Seannachie says:

May 8, 2012 at 12:59 pm

Apparently I’ve used the term “anti-social” in a manner that conflicts with your own personal definition. Might I recommend you review what I wrote and the context in which the term was used, particularly what I was referring to as being “anti-social”.

I also suggest you edit the last line of your own post for grammar and sentence structure.
PlasticSturgeon says:

May 8, 2012 at 1:40 pm

I 100% Agree with @Seannachie – including not reading the article. Seriously, ALA, this is way below par for good journalism. Do you think you’re Cracked.com?
Know what would have made this article better? A different headline.
Mykola says:

May 8, 2012 at 2:43 pm

D-Bag would have been quite enough. Yet, it’s free and I learned a lot, so no complaints, “anti-social” is way over the top.
jsebrech says:

May 8, 2012 at 2:49 pm

I’ve also just built an offline app, and found it became much easier to work with appcache when I switched my mental model to that of a native app.

The manifest is just the description of the app bundle that you upload. When you upload a new version, you have to put a new version in the app bundle, just like native apps. You have to explicitly market everything that the bundle contains, just like native apps. Upgrading the app is done out of band, and the app always loads from the offline copy, just like native apps.

Reasoning from there it’s obvious to me that (A) the app should be a single page js app, (B) FALLBACK is a legacy mechanism not to be used for new apps, and (C) all dynamic content belongs in localstorage.

The problems described with bad connectivity occur on native apps as well. The smoothest native apps solve this by doing all online communication out-of-band (so the user can keep interacting), and by letting the user trigger when to download content (only the user can know when their network connectivity is good enough).
fritz from london says:

May 8, 2012 at 6:35 pm

Maybe I just had a long day, but I found the comedy intro a bit off putting. It’s a long article already. I don’t mind the title, although classy it isn’t.

Anyway, thanks for sharing the info.
superted says:

May 8, 2012 at 7:07 pm

So I was a little bit amazed that anyone could be offended by the word douchebag.

I thought maybe there were some cultural differences at play here and that on the other side of the pond (Jake is British) the D word is more akin to the C word. See, over here douchebag is almost comedic in its lack of offensiveness, it’s the kind of word you might hear in elementary school before kids learn to cuss properly.

So I thought I’d do some research and it didn’t take me long to find this article:
http://www.nytimes.com/2009/11/14/business/media/14vulgar.html

It states in the article that the word douche ‘has surfaced at least 76 times already this year on 26 prime-time network series’.

Now you may not think this is right but what it does tell you is that the word you have taken offence to is now broadly deemed inoffensive enough to broadcast during primetime.

I’d say if a word is inoffensive enough for prime time it’s definitely inoffensive enough for a tech blog post dealing with advanced javascript topics.

On a final note I’d like to say that you taking offense to an inoffensive word has nothing to do with Jake’s professionalism.
einarlove says:

May 8, 2012 at 9:14 pm

Truly loved the wit and humor. It made the read enjoyable and was greatly appreciated!
mnot says:

May 8, 2012 at 11:14 pm

Nice writeup. AppCache does indeed oversell itself.

Personally, I think we need something better. FWIW I’ve written more here:

* “Fixing AppCache”: http://www.mnot.net/blog/2011/06/19/offline_web

* “Better Browser Caching”: http://www.mnot.net/blog/2011/08/28/better_browser_caching

but really, they only scratch the surface.
Jake Archibald says:

May 9, 2012 at 8:34 am

Agreed, things get a lot easier when you act like you’re building an app, that’s why making Spritecow work offline was _relatively_ simple compared to Lanyrd.

Unfortunately, building sites in that way makes usual best practices such as progressive enhancement pretty tough.
BrainSurgeon says:

May 9, 2012 at 8:40 pm

I’m surprised people couldn’t get over the title enough to appreciate the content. This was some of the most comprehensive AppCache guidance I’ve seen yet.

Thanks for compiling it and making me laugh at the same time. To the other d-bags on here, enjoy your myopic lives.
BrainSurgeon says:

May 9, 2012 at 8:42 pm

http://www.quora.com/English-language/When-did-the-term-douchebag-enter-the-popular-parlanc
JoeSnellPDX says:

May 10, 2012 at 7:55 pm

I totally dug your article. The information is on point, easy to follow and has already answered some of my questions regarding this topic. That being said, the wit, humor and creativity you employed was quite enjoyable. I’ll be looking forward to reading another article from you in the future.

Thanks!
davidzuch says:

May 16, 2012 at 3:21 pm

Thank you for the really informative article, it really helped me understand a lot of the problems I encountered with the Application Cache. However, you kind of lost me here:

// Get the page content
var pageHtml = document.body.innerHTML;
var urlPath = location.pathname;
// Save it in local storage
localStorage.setItem( urlPath, pageHtml );
// Get our index
var index = JSON.parse( localStorage.getItem( index’ ) );
// Set the title and save it
index[ urlPath ] = document.title;
localStorage.setItem( ‘index’, JSON.stringify( index ) );

I tried running that in the console, but this line here returns null (after adding the missing quote, of course):

var index = JSON.parse( localStorage.getItem( index’ ) );

I have to assume you left out a step? Otherwise the code beyond that point just doesn’t do anything (besides returning errors).
Jake Archibald says:

May 21, 2012 at 1:42 pm

That particular code example fails if ‘index’ isn’t an object, which would be done as part of an init script. It’s not far from pseudo-code. See the full code at https://github.com/jakearchibald/appcache-demo/blob/master/www/localstorage-cache/js/offliner-v1.js#L62 for an end-to-end example.
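
As a rough illustration of the init step mentioned above, here is a minimal sketch that falls back to an empty object when no index has been saved yet. It is not the code from the linked demo: `loadIndex`, `savePage`, and `createMemoryStorage` are hypothetical names, and the storage object is passed in so the sketch also runs outside a browser. In a page you would pass `localStorage` instead.

```javascript
// Hypothetical helper: read the saved index, defaulting to {} when it is
// missing, unparseable, or not a plain object.
function loadIndex(storage) {
  var raw = storage.getItem('index');
  if (raw === null) return {};
  try {
    var parsed = JSON.parse(raw);
    var isPlainObject = parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed);
    return isPlainObject ? parsed : {};
  } catch (e) {
    return {};
  }
}

// Hypothetical helper: store a page's HTML under its path and record its title.
function savePage(storage, urlPath, title, pageHtml) {
  storage.setItem(urlPath, pageHtml);
  var index = loadIndex(storage);
  index[urlPath] = title;
  storage.setItem('index', JSON.stringify(index));
  return index;
}

// In-memory stand-in exposing the two localStorage methods used above.
function createMemoryStorage() {
  var data = {};
  return {
    getItem: function (key) {
      return Object.prototype.hasOwnProperty.call(data, key) ? data[key] : null;
    },
    setItem: function (key, value) {
      data[key] = String(value);
    }
  };
}

// First save works even though no 'index' entry exists yet.
var demoStorage = createMemoryStorage();
savePage(demoStorage, '/events/example/', 'Example event', '<h1>Example event</h1>');
```

With that fallback in place, `JSON.parse` returning `null` on a fresh profile no longer breaks the `index[ urlPath ]` assignment.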
xmlblog says:

May 27, 2012 at 1:30 pm

My generation has a penchant for foul language. On our watch bitch and son-of-a-bitch have made it past the censors on the public airwaves. So you understand where I’m coming from, I’m no puritan; I use “colorful” language myself (perhaps too often I’m ashamed to say). But there are still places where I don’t. I don’t use expletives in meetings, or when toasting at weddings, in churches or temples, at the dinner table, and a few other places where they clearly seem inappropriate. On the other hand, great writers do often include distasteful words to make a sharp point. But they do so judiciously. If I understand you correctly, you made a conscious decision to employ the term, “douchebag.” Fine, but I think it was unwarranted and counterproductive in this case.
Kris Meister says:

June 5, 2012 at 9:29 am

Nice to see a technical article with sass.

I finished a project recently which created a mobile Web-App for distribution on iOS and Android. Web-Apps are an especially convenient use of AppCache. The user bookmarks the app to their homescreen and upon opening the app runs in full-screen outside the browser and with AppCache will run without web connectivity.

In my case I was only building a prototype and didn’t run into many of the update issues you faced. But the client was very relieved that they didn’t need to integrate the whole iOS provisioning profile system.

I agree with your article that bookmarking individual content articles needs a solution and your technique of using LocalStorage fits the bill.

However, I think the future of AppCache is in creating Mobile Web-Apps that circumvent the proprietary vendor marketplaces.
jpatokal says:

July 4, 2012 at 9:50 pm

Well done, sir, both for the technical content and writing style. Having hit most of those gotchas during the last year, I can only wish this had existed earlier.

@Seannachie: “Blatantly derogatory” to whom, exactly? Our nation’s poor, oppressed douchebag minority? The manufacturers of vaginal nozzle syringes? Or perhaps you’re afraid he’s hurting ApplicationCache’s feelings? But go ahead, please suggest an equally snappy, fetching and accurate alternative title then.
duppyconquer says:

July 5, 2012 at 7:35 pm

I had to create this account to say…

@SEANNACHIE is a Douchebag! LMAO.
Ganesh says:

January 31, 2013 at 9:00 am

It’s a great article 🙂
バイオの買物.com 代表加々美直史 says:

February 3, 2013 at 10:44 pm

Great article. Doing something similar myself, the use of FALLBACK was especially informative.
Eric Brown 1 says:

February 14, 2013 at 1:46 pm

Good article. I’m just laughing at some of the cry baby comments. @SEANNACHIE… what an AppCache!!
Jason Sebring says:

March 2, 2013 at 4:00 pm

Thank you for the article. I am similar in my approach to development in terms of attitude. We don’t have to take ourselves too seriously and I find the ones that do have much less to contribute.
Zero Point Surfer says:

March 13, 2013 at 9:35 am

That is a great article, thank you very much. I learned a lot from it. Gotcha #6 could easily be avoided by providing different .manifest files for different platforms though.
blessdyb says:

April 2, 2013 at 2:52 am

Thanks very much for your great job. But I find that the “iframe hack” does not work in Chrome (26). Each time I refresh the page (http://appcache-demo.s3-website-us-east-1.amazonaws.com/localstorage-cache/), static resources in fallback.html load from cache, but the same resources in index.html are still downloaded from the server.
Jempey says:

April 19, 2013 at 5:04 pm

First of all, a great article and a great introduction into the story of going offline. The application cache is indeed not that easy to implement and there are a lot of pitfalls. But when it works, it works, and it works perfectly! That is to say, for the latest popular desktop and mobile browsers.

For a newspaper project I had to develop an Android HTML5/JavaScript application which needed to work offline. Not a big problem at all, but when migrating everything to an Android WebView, almost everything went wrong. Making the application cache work in an Android WebView seems to be extremely difficult. We haven’t found a solution yet; for instance, the demo page from A List Apart doesn’t work either.

More on this issue @ https://github.com/kurti-vdb/AppcacheDemo
Tom Love says:

May 2, 2013 at 6:02 pm

Caching in general is not easy to get right, and caches on top of caches are always going to be a barrel of fun to debug, especially when they’re incredibly sensitive to all kinds of implicit assumptions and state.

For most purposes, AppCache does nothing more than the Expires HTTP header has done for years.

Content sent with future-dated Expires even works offline. (Until the user hits refresh, anyway — which is really a sign of what browser vendors have had to do to accommodate bad caching implementations. I’m interested to see if they will hold out any better against bad AppCache implementations.)

But AppCache is also a pre-loader, which is an entirely separate thing from caching, and I’m not sure they should be conflated. For a given resource you might reasonably want to cache but not pre-load, and for another pre-load but not cache.

For any non-trivial application, which is to say any application for which caching and preloading might be useful, the work of cache invalidation and decisions about what to pre-load, end up being done in application code. Letting the browser make some of the decisions independently of the app code leads to coupling, which leads to complexity, which leads to bugs.
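
For comparison, the future-dated Expires approach mentioned above amounts to one response header. The following is a minimal sketch with a hypothetical `farFutureExpires` helper; how the header is attached to a response depends on the server and is not shown. It uses `Date.prototype.toUTCString()`, which produces the HTTP date format.

```javascript
// Hypothetical helper: build an Expires header `seconds` after `now`.
// `now` is optional and defaults to the current time.
function farFutureExpires(seconds, now) {
  var base = now instanceof Date ? now.getTime() : Date.now();
  return {
    'Expires': new Date(base + seconds * 1000).toUTCString()
  };
}

// One year (365 days) after a fixed example date.
var ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;
var expiresHeaders = farFutureExpires(ONE_YEAR_SECONDS, new Date(Date.UTC(2013, 4, 2, 18, 2, 0)));
```

Unlike a manifest, this says nothing about pre-loading: the resource is only stored once something has requested it.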
David Boden says:

May 31, 2013 at 4:09 am

Great article and writing style.

Based on:
>> AppCache doesn’t give us a way to remove implicitly cached items

I’m going to experiment to see if returning an HTTP status of 410 (Gone forever) from my server for one of the cached items results in it being removed from the appcache. Would make sense to me if it did…
Petr Kunc says:

June 20, 2013 at 11:04 am

Well, I cited this article in my diploma thesis as it is a great resource of valuable information but the title does not look very good in cited sources 🙂
If anybody is interested, I designed a framework to overcome some known issues of application cache available on my GitHub. The thesis is also there.
Vishal Chandra says:

June 24, 2013 at 5:04 am

Great article, very helpful. You made it an easy read. Thanks!

Couldn’t figure out the fuss over the title though. And you only called some software functionality “db” not even a real person or organization… wow!
Daniel Nordstrom says:

July 1, 2013 at 9:50 pm

I simply had to comment after reading the douchebags commenting on the word “douchebag,” and on the introduction being whatever and whatnot. I shouldn’t read comments at this time of the day.

It’s 3 AM and I’ve had a really long day myself—won’t be done for a few more hours either—but I still read the whole thing, and I found it really helpful to what I’m building. Something which will hopefully do good and provoke positive change in the world.

Do you know why I read the whole thing? Because I don’t give a shit about the headline or the writing style, so long as the article gives me more value than 99.9% of the misleading material you find via Google these days. I usually skim all articles: this time I read.

I’m a big proponent of proper writing—but also of freedom of speech and expression in your individual writing style. If it doesn’t suit me, I would just skim or not read it—but I definitely won’t complain (in the first comment anyone will see) about subjective trivialities of an article that gives a ton of value to the majority of its readers. That would be a waste of my time, it would be self-centered, and it wouldn’t serve me or the Internet society any good.

Luckily, for you as a well-grounded writer, such a comment feels comparable to a mosquito sting for a second, before you welcome and accept the criticism, learning what you can from it.

This helped me a lot—after countless hours researching, and years in the industry. In fact, you made a good choice of headline. It drew my attention, which made me click the link. Otherwise I wouldn’t have bothered in the middle of the night.

Had you not posted this here, I hope I would have gotten to read it elsewhere. Thank you!

To cite the first paragraph of the ALA style guide—which I assume all articles are peer reviewed against before publication:

“. . . use an informal, conversational tone, though not at the cost of clarity or correctness. Experts require neither excessive formality nor excessive casualness to express their authority. If you write with ALA’s readership in mind and sound like yourself, you’re most of the way there already.”

You’re apparently all the way there. Sorry for the rant, but sometimes people just frustrate me enough to motivate it.
David de Manbey says:

July 17, 2013 at 2:35 pm

One of the best articles I have read on the application cache. Went back to it many times as I struggled to get things working. He nails it on the head – practical, not a cakewalk, and not for the faint of heart.
Stacey Reiman says:

July 18, 2013 at 8:26 pm

The title was why I clicked on this article. Opinions in programming are how us mere mortals figure all this stuff out. Thanks very much for the info.
sacah says:

September 29, 2013 at 10:10 pm

My issue is with the ‘true’ story at the very beginning. I’m a pretty smart guy, and I have a feeling it didn’t really happen! How do I know any of this article is real if you start off with a dubious sounding ‘true’ story?

(-:
Great article, after reading it I understand the title.
Joel Worrall says:

October 14, 2013 at 10:13 am

A great evaluation of the AppCache in a jovial, readable tone. Thanks for writing this and calling out the highlights (good and bad) @jaffathecake
Brandon Lockaby says:

October 29, 2013 at 3:41 pm

I was able to get rid of my app’s cache by 404’ing the cache manifest. My app usually doesn’t use 404, so I was frustrated by this. Good luck!
Joe Mordetsky says:

November 20, 2013 at 10:02 pm

awesome article. best.title.ever.
Марија Костадиновска says:

November 21, 2013 at 12:35 pm

This article truly depicts the nature of application cache… It’s a “complete douche-bag”. 64 thumbs up.
Dieter Donnert says:

January 19, 2014 at 8:24 am

I admit that I was led astray by the title. I expected an anti-AppCache rant. Instead, I found the best article on the AppCache I have read.
sebastianauer says:

March 7, 2014 at 6:23 pm

Just read this and must admit that I did not know of many of the caveats. Not only was this article very helpful in educating me, but it did so in an entertaining manner. I read enough dry material on a daily basis. A big thank you.

Since the article is a little older, has anything changed with regards to the AppCache recently? Have browsers adopted more options or better support and handling of scenarios like these? How can one better leverage content updates with UX?
Brian Smith 1 says:

March 31, 2014 at 12:43 pm

+1 to @DieterDonnert and @SebastianAuer as their comments & question are exactly what I wanted to express to the author. Thanks!
Heather Salem says:

April 25, 2014 at 4:28 am

I tested Gotcha #5 on Safari 7.0.3 and the non-cached page always displayed the images. Chrome worked as you described (no images on reload). Is this something that’s changed in Safari?