Request with Intent: Caching Strategies in the Age of PWAs

Once upon a time, we relied on browsers to handle caching for us; as developers in those days, we had very little control. But then came Progressive Web Apps (PWAs), Service Workers, and the Cache API—and suddenly we have expansive power over what gets put in the cache and how it gets put there. We can now cache everything we want to… and therein lies a potential problem.


Media files—especially images—make up the bulk of average page weight these days, and it’s getting worse. In order to improve performance, it’s tempting to cache as much of this content as possible, but should we? In most cases, no. Even with all this newfangled technology at our fingertips, great performance still hinges on a simple rule: request only what you need and make each request as small as possible.

To provide the best possible experience for our users without abusing their network connection or their hard drive, it’s time to put a spin on some classic best practices, experiment with media caching strategies, and play around with a few Cache API tricks that Service Workers have hidden up their sleeves.

Best intentions

All those lessons we learned optimizing web pages for dial-up became super-useful again when mobile took off, and they continue to be applicable in the work we do for a global audience today. Unreliable or high-latency network connections are still the norm in many parts of the world, reminding us that it’s never safe to assume a technical baseline lifts evenly or in sync with its corresponding cutting edge. And that’s the thing about performance best practices: history has borne out that approaches that are good for performance now will continue being good for performance in the future.

Before the advent of Service Workers, we could provide some instructions to browsers with respect to how long they should cache a particular resource, but that was about it. Documents and assets downloaded to a user’s machine would be dropped into a directory on their hard drive. When the browser assembled a request for a particular document or asset, it would peek in the cache first to see if it already had what it needed, which could save it a trip to the network.

We have considerably more control over network requests and the cache these days, but that doesn’t excuse us from being thoughtful about the resources on our web pages.

Request only what you need

As I mentioned, the web today is lousy with media. Images and videos have become a dominant means of communication. They may convert well when it comes to sales and marketing, but they are hardly performant when it comes to download and rendering speed. With this in mind, each and every image (and video, etc.) should have to fight for its place on the page.

A few years back, a recipe of mine was included in a newspaper story on cooking with spirits (alcohol, not ghosts). I don’t subscribe to the print version of that paper, so when the article came out I went to the site to take a look at how it turned out. During a recent redesign, the site had decided to load all articles into a nearly full-screen modal viewbox layered on top of their homepage. This meant requesting the article required requests for all of the assets associated with the article page plus all the contents and assets for the homepage. Oh, and the homepage had video ads—plural. And, yes, they auto-played.

I popped open DevTools and discovered the page had blown past 15 MB in page weight. Tim Kadlec had recently launched What Does My Site Cost?, so I decided to check out the damage. Turns out that the actual cost to view that page for the average US-based user was more than the cost of the print version of that day’s newspaper. That’s just messed up.

Sure, I could blame the folks who built the site for doing their readers such a disservice, but the reality is that none of us go to work with the goal of worsening our users’ experiences. This could happen to any of us. We could spend days scrutinizing the performance of a page only to have some committee decide to set that carefully crafted page atop a Times Square of auto-playing video ads. Imagine how much worse things would be if we were stacking two abysmally-performing pages on top of each other!

Media can be great for drawing attention when competition is high (e.g., on the homepage of a newspaper), but when you want readers to focus on a single task (e.g., reading the actual article), its value can drop from important to “nice to have.” Yes, studies have shown that images excel at drawing eyeballs, but once a visitor is on the article page, no one cares; we’re just making it take longer to download and more expensive to access. The situation only gets worse as we shove more media into the page.

We must do everything in our power to reduce the weight of our pages, so avoid requests for things that don’t add value. For starters, if you’re writing an article about a data breach, resist the urge to include that ridiculous stock photo of some random dude in a hoodie typing on a computer in a very dark room.

Request the smallest file you can

Now that we’ve taken stock of what we do need to include, we must ask ourselves a critical question: How can we deliver it in the fastest way possible? This can be as simple as choosing the most appropriate image format for the content presented (and optimizing the heck out of it) or as complex as recreating assets entirely (for example, if switching from raster to vector imagery would be more efficient).

Offer alternate formats

When it comes to image formats, we don’t have to choose between performance and reach anymore. We can provide multiple options and let the browser decide which one to use, based on what it can handle.

You can accomplish this by offering multiple sources within a picture or video element. Start by creating multiple formats of the media asset. For example, with WebP and JPG, it’s likely that the WebP will have a smaller file size than the JPG (but check to make sure). With those alternate sources, you can drop them into a picture like this:

<picture>
  <source srcset="my.webp" type="image/webp">
  <img src="my.jpg" alt="Descriptive text about the picture.">
</picture>

Browsers that recognize the picture element will check the source element before making a decision about which image to request. If the browser supports the MIME type “image/webp,” it will kick off a request for the WebP format image. If not (or if the browser doesn’t recognize picture), it will request the JPG.

The nice thing about this approach is that you’re serving the smallest image possible to the user without having to resort to any sort of JavaScript hackery.

You can take the same approach with video files:

<video controls>
  <source src="my.webm" type="video/webm">
  <source src="my.mp4" type="video/mp4">
  <p>Your browser doesn’t support native video playback,
    but you can <a href="my.mp4" download>download</a>
    this video instead.</p>
</video>

Browsers that support WebM will request the first source, whereas browsers that don’t—but do understand MP4 videos—will request the second one. Browsers that don’t support the video element will fall back to the paragraph about downloading the file.

The order of your source elements matters. Browsers will choose the first usable source, so if you specify an optimized alternative format after a more widely compatible one, the alternative format may never get picked up.

Depending on your situation, you might consider bypassing this markup-based approach and handle things on the server instead. For example, if a JPG is being requested and the browser supports WebP (which is indicated in the Accept header), there’s nothing stopping you from replying with a WebP version of the resource. In fact, some CDN services—Cloudinary, for instance—come with this sort of functionality right out of the box.
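
As a rough illustration of that server-side swap, here is a minimal sketch written as a plain function so it could sit behind whatever server or edge worker you use. The negotiateImagePath() name and the convention that every JPG has a WebP sibling with the same base name are assumptions made for this example, not part of any CDN’s API.

```javascript
// Hypothetical helper: decide which file to serve for an image request.
// Assumes a WebP sibling exists next to every JPG (my.jpg -> my.webp).
function negotiateImagePath( path, acceptHeader ) {
  // Does the Accept header list image/webp (with or without a q-value)?
  const accepts_webp = ( acceptHeader || "" )
    .split( "," )
    .some( type => type.trim().toLowerCase().startsWith( "image/webp" ) );
  const is_jpg = /\.jpe?g$/i.test( path );
  if ( accepts_webp && is_jpg ) {
    // Swap the extension and serve the smaller format
    return path.replace( /\.jpe?g$/i, ".webp" );
  }
  // Otherwise, serve exactly what was asked for
  return path;
}
```

If you go this route, send a Vary: Accept header with the response so shared caches keep the WebP and JPG variants apart.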

Offer different sizes

Formats aside, you may want to deliver alternate image sizes optimized for the current size of the browser’s viewport. After all, there’s no point loading an image that’s 3–4 times larger than the screen rendering it; that’s just wasting bandwidth. This is where responsive images come in.

Here’s an example:

<img src="medium.jpg"
  srcset="small.jpg 256w,
    medium.jpg 512w,
    large.jpg 1024w"
  sizes="(min-width: 30em) 30em, 100vw"
  alt="Descriptive text about the picture.">

There’s a lot going on in this super-charged img element, so I’ll break it down:

This img offers three size options for a given JPG: 256 px wide (small.jpg), 512 px wide (medium.jpg), and 1024 px wide (large.jpg). These are provided in the srcset attribute with corresponding width descriptors.
The src defines a default image source, which acts as a fallback for browsers that don’t support srcset. Your choice for the default image will likely depend on the context and general usage patterns. Often I’d recommend the smallest image be the default, but if the majority of your traffic is on older desktop browsers, you might want to go with the medium-sized image.
The sizes attribute is a presentational hint that informs the browser how the image will be rendered in different scenarios (its extrinsic size) once CSS has been applied. This particular example says that the image will be the full width of the viewport (100vw) until the viewport reaches 30 em in width (min-width: 30em), at which point the image will be 30 em wide. You can make the sizes value as complicated or as simple as you want; omitting it causes browsers to use the default value of 100vw.
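
To make the interplay between srcset and sizes a little more concrete, here is a rough model of the choice in JavaScript. Browsers are free to pick differently (based on what is already cached, for instance), so treat the pickCandidate() function and its “smallest candidate that covers the slot” rule as an illustrative assumption, not the actual selection algorithm.

```javascript
// Candidates mirror the srcset above: file name plus width descriptor
const candidates = [
  { url: "small.jpg", width: 256 },
  { url: "medium.jpg", width: 512 },
  { url: "large.jpg", width: 1024 }
];

// slot_width: the CSS pixel width the sizes attribute resolves to
// dpr: the device pixel ratio (2 on many phones)
function pickCandidate( candidates, slot_width, dpr = 1 ) {
  const needed = slot_width * dpr;
  // sort a copy from narrowest to widest
  const sorted = [ ...candidates ].sort( ( a, b ) => a.width - b.width );
  // smallest image that still covers the slot, else the largest on offer
  return sorted.find( c => c.width >= needed ) || sorted[ sorted.length - 1 ];
}
```

With a 16 px root font size, 30em resolves to 480 px, so pickCandidate( candidates, 480 ) lands on medium.jpg, while the same slot on a 2x display needs 960 px and lands on large.jpg.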

You can even combine this approach with alternate formats and crops within a single picture. 🤯

All of this is to say that you have a number of tools at your disposal for delivering fast-loading media, so use them!

Defer requests (when possible)

Years ago, Internet Explorer 11 introduced a new attribute that enabled developers to de-prioritize specific img elements to speed up page rendering: lazyload. That attribute never went anywhere, standards-wise, but it was a solid attempt to defer image loading until images are in view (or close to it) without having to involve JavaScript.

There have been countless JavaScript-based implementations of lazy loading images since then, but recently Google also took a stab at a more declarative approach, using a different attribute: loading.

The loading attribute supports three values (“auto,” “lazy,” and “eager”) to define how a resource should be brought in. For our purposes, the “lazy” value is the most interesting because it defers loading the resource until it reaches a calculated distance from the viewport.

Adding that into the mix…

<img src="medium.jpg"
  srcset="small.jpg 256w,
    medium.jpg 512w,
    large.jpg 1024w"
  sizes="(min-width: 30em) 30em, 100vw"
  loading="lazy"
  alt="Descriptive text about the picture.">

This attribute offers a bit of a performance boost in Chromium-based browsers. Hopefully it will become a standard and get picked up by other browsers in the future, but in the meantime there’s no harm in including it because browsers that don’t understand the attribute will simply ignore it.

This approach complements a media prioritization strategy really well, but before I get to that, I want to take a closer look at Service Workers.

Manipulate requests in a Service Worker

Service Workers are a special type of Web Worker with the ability to intercept, modify, and respond to network requests from the pages they control via the Fetch API. They also have access to the Cache API, as well as other asynchronous client-side data stores like IndexedDB for resource storage.

When a Service Worker is installed, you can hook into that event and prime the cache with resources you want to use later. Many folks use this opportunity to squirrel away copies of global assets, including styles, scripts, logos, and the like, but you can also use it to cache images for use when network requests fail.
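
Here is a minimal sketch of priming the cache at install time. The cache name, the asset list, and the primeCache() helper are all placeholders I’ve assumed for illustration; swap in your own. The cache storage is passed in as an argument only so the function can be exercised outside a Service Worker.

```javascript
// Hypothetical cache name and asset list
const static_cache = "v1:static";
const offline_assets = [
  "/i/fallbacks/offline.svg"
];

// Open the named cache and store a copy of every URL in it
function primeCache( cache_storage, cache_name, urls ) {
  return cache_storage.open( cache_name )
    .then( cache => cache.addAll( urls ) );
}

// Only wire up the listener when actually running as a Service Worker
if ( typeof ServiceWorkerGlobalScope !== "undefined" ) {
  self.addEventListener( "install", event => {
    // waitUntil() holds the worker in the installing phase
    // until the caching promise settles
    event.waitUntil( primeCache( caches, static_cache, offline_assets ) );
  });
}
```

Keep in mind that addAll() rejects if any one of the requests fails, so it’s best to keep the install list short and dependable.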

Keep a fallback image in your back pocket

Assuming you want to use a fallback in more than one networking recipe, you can set up a named function that will respond with that resource:

function respondWithFallbackImage() {
  return caches.match( "/i/fallbacks/offline.svg" );
}

Then, within a fetch event handler, you can use that function to provide that fallback image when requests for images fail at the network:

self.addEventListener( "fetch", event => {
  const request = event.request;
  // only handle requests for images
  if ( request.headers.get("Accept").includes("image") ) {
    event.respondWith(
      // try the network first
      fetch( request, { mode: 'no-cors' } )
        .then( response => {
          return response;
        })
        // the network failed, so use the cached fallback
        .catch(
          respondWithFallbackImage
        )
    );
  }
});

When the network is available, users get the expected behavior:

Social media avatars are rendered as expected when the network is available.

But when the network is interrupted, images will be swapped automatically for a fallback, and the user experience is still acceptable:

A generic fallback avatar is rendered when the network is unavailable.

On the surface, this approach may not seem all that helpful in terms of performance since you’ve essentially added an additional image download into the mix. With this system in place, however, some pretty amazing opportunities open up to you.

Respect a user’s choice to save data

Some users reduce their data consumption by entering a “lite” mode or turning on a “data saver” feature. When this happens, browsers will often send a Save-Data header with their network requests.

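Within your Service Worker, you can look for this preference and adjust your responses accordingly. First, you check whether the user has opted in, here via the saveData property of the Network Information API, which reflects the same setting as the header: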

let save_data = false;
if ( 'connection' in navigator ) {
  save_data = navigator.connection.saveData;
}
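
The snippet above reads the preference from the Network Information API, which not every browser exposes. As a complementary check, you could inspect the Save-Data request header itself inside the fetch handler. This is a minimal sketch; the wantsToSaveData() helper is a name I’ve made up for illustration, not part of any API.

```javascript
// Returns true when the request carries a "Save-Data: on" header.
// Works with any object whose headers expose a get() method,
// such as a Fetch API Request.
function wantsToSaveData( request ) {
  const value = request.headers.get( "Save-Data" );
  return value !== null && value.trim().toLowerCase() === "on";
}
```

Inside a fetch handler, a condition like save_data || wantsToSaveData( event.request ) would cover both signals.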

Then, within your fetch handler for images, you might choose to preemptively respond with the fallback image instead of going to the network at all:

self.addEventListener( "fetch", event => {
  const request = event.request;
  if ( request.headers.get("Accept").includes("image") ) {
    // saving data? skip the network and use the cached fallback
    if ( save_data ) {
      event.respondWith( respondWithFallbackImage() );
      return;
    }
    event.respondWith(
      // code you saw previously
    );
  }
});

You could even take this a step further and tune respondWithFallbackImage() to provide alternate images based on what the original request was for. To do that you’d define several fallbacks globally in the Service Worker:

const fallback_avatar = "/i/fallbacks/avatar.svg",
      fallback_image = "/i/fallbacks/image.svg";

Both of those files should then be cached during the Service Worker install event:

return cache.addAll( [
  fallback_avatar,
  fallback_image
]);

Finally, within respondWithFallbackImage() you could serve up the appropriate image based on the URL being fetched. In my site, the avatars are pulled from Webmention.io, so I test for that.

function respondWithFallbackImage( url ) {
  // avatars come from Webmention.io; everything else is a regular image
  const image = /webmention\.io/.test( url ) ? fallback_avatar
                                             : fallback_image;
  return caches.match( image );
}

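With that change, I’ll need to update the fetch handler to pass in request.url as an argument to respondWithFallbackImage(). In the .catch(), that means swapping the bare function reference for an arrow function such as () => respondWithFallbackImage( request.url ); otherwise the function would receive the error object instead of a URL. Once that’s done, when the network gets interrupted I end up seeing something like this: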

A webmention that contains both an avatar and an embedded image will render with two different fallbacks when the Save-Data header is present.

Next, we need to establish some general guidelines for handling media assets—based on the situation, of course.

The caching strategy: prioritize certain media

In my experience, media—especially images—on the web tend to fall into three categories of necessity. At one end of the spectrum are elements that don’t add meaningful value. At the other end of the spectrum are critical assets that do add value, such as charts and graphs that are essential to understanding the surrounding content. Somewhere in the middle are what I would call “nice-to-have” media. They do add value to the core experience of a page but are not critical to understanding the content.

If you consider your media with this division in mind, you can establish some general guidelines for handling each, based on the situation. In other words, a caching strategy.

Media loading strategy, broken down by how critical an asset is to understanding an interface
Media category	Fast connection	Save-Data	No network
Critical	Load media	Load media	Replace with placeholder
Nice-to-have	Load media	Replace with placeholder	Replace with placeholder
Non-critical	Remove from content entirely	Remove from content entirely	Remove from content entirely

When it comes to disambiguating the critical from the nice-to-have, it’s helpful to have those resources organized into separate directories (or similar). That way we can add some logic into the Service Worker that can help it decide which is which. For example, on my own personal site, critical images are either self-hosted or come from the website for my book. Knowing that, I can write regular expressions that match those domains:

const high_priority = [
    /aaron\-gustafson\.com/,
    /adaptivewebdesign\.info/
  ];

With that high_priority variable defined, I can create a function that will let me know if a given image request (for example) is a high priority request or not:

function isHighPriority( url ) {
  // how many high priority links are we dealing with?
  let i = high_priority.length;
  // loop through each
  while ( i-- ) {
    // does the request URL match this regular expression?
    if ( high_priority[i].test( url ) ) {
      // yes, it’s a high priority request
      return true;
    }
  }
  // no matches, not high priority
  return false;
}

Adding support for prioritizing media requests only requires adding a new conditional into the fetch event handler, like we did with Save-Data. Your specific recipe for network and cache handling will likely differ, but here was how I chose to mix in this logic within image requests:

// Check the cache first
  // Return the cached image if we have one
  // If the image is not in the cache, continue

// Is this image high priority?
if ( isHighPriority( url ) ) {

  // Fetch the image
    // If the fetch succeeds, save a copy in the cache
    // If not, respond with an "offline" placeholder

// Not high priority
} else {

  // Should I save data?
  if ( save_data ) {

    // Respond with a "saving data" placeholder

  // Not saving data
  } else {

    // Fetch the image
      // If the fetch succeeds, save a copy in the cache
      // If not, respond with an "offline" placeholder
  }
}
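
One way that outline could translate into working code is sketched below. The handleImageRequest() name, the deps object, and the two placeholder paths are assumptions made for illustration. The isHighPriority() function and the save_data flag are the pieces defined earlier, passed in here so the logic can be read (and tested) in isolation.

```javascript
// Hypothetical placeholder paths, assumed to be cached at install time
const offline_placeholder = "/i/fallbacks/offline.svg",
      saving_placeholder = "/i/fallbacks/saving-data.svg";

// deps = { cache_storage, fetch, isHighPriority, save_data, cache_name }
async function handleImageRequest( request, deps ) {
  // Check the cache first and return the cached image if we have one
  const cached = await deps.cache_storage.match( request );
  if ( cached ) {
    return cached;
  }
  // Not high priority and the user wants to save data? Skip the network
  if ( ! deps.isHighPriority( request.url ) && deps.save_data ) {
    return deps.cache_storage.match( saving_placeholder );
  }
  // Fetch the image
  let response;
  try {
    response = await deps.fetch( request );
  } catch ( error ) {
    // The fetch failed: respond with the "offline" placeholder
    return deps.cache_storage.match( offline_placeholder );
  }
  // The fetch succeeded: save a copy in the cache
  const cache = await deps.cache_storage.open( deps.cache_name );
  await cache.put( request, response.clone() );
  return response;
}
```

In a Service Worker, this could be handed to event.respondWith() with the global caches object as cache_storage, the global fetch as fetch, and the name of your image cache as cache_name.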

We can apply this prioritized approach to many kinds of assets. We could even use it to control which pages are served cache-first vs. network-first.

Keep the cache tidy

The ability to control which resources are cached to disk is a huge opportunity, but it also carries with it an equally huge responsibility not to abuse it.

Every caching strategy is likely to differ, at least a little bit. If we’re publishing a book online, for instance, it might make sense to cache all of the chapters, images, etc. for offline viewing. There’s a fixed amount of content and—assuming there aren’t a ton of heavy images and videos—users will benefit from not having to download each chapter separately.

On a news site, however, caching every article and photo will quickly fill up our users’ hard drives. If a site offers an indeterminate number of pages and assets, it’s critical to have a caching strategy that puts hard limits on how many resources we’re caching to disk.

One way to do this is to create several different blocks associated with caching different forms of content. The more ephemeral content caches can have strict limits around how many items can be stored. Sure, we’ll still be bound to the storage limits of the device, but do we really want our website to take up 2 GB of someone’s hard drive?

Here’s an example, again from my own site:

const sw_caches = {
  static: {
    name: `${version}static`
  },
  images: {
    name: `${version}images`,
    limit: 75
  },
  pages: {
    name: `${version}pages`,
    limit: 5
  },
  other: {
    name: `${version}other`,
    limit: 50
  }
}

Here I’ve defined several caches, each with a name used for addressing it in the Cache API and a version prefix. The version is defined elsewhere in the Service Worker, and allows me to purge all caches at once if necessary.
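
Since the version is defined elsewhere, here is one hedged sketch of how a version prefix can drive that purge. The version value and the staleCacheNames() helper are assumptions for illustration, not the code actually running on the site described here.

```javascript
// Hypothetical version string; bumping it orphans every older cache
const version = "v2:";

// Given every cache name on this origin, return the ones that
// don't start with the current version prefix
function staleCacheNames( names, current_version ) {
  return names.filter( name => ! name.startsWith( current_version ) );
}

// Only wire up the listener when actually running as a Service Worker
if ( typeof ServiceWorkerGlobalScope !== "undefined" ) {
  self.addEventListener( "activate", event => {
    event.waitUntil(
      // list all caches, then delete the ones from older versions
      caches.keys()
        .then( names => Promise.all(
          staleCacheNames( names, version ).map( name => caches.delete( name ) )
        ) )
    );
  });
}
```

Because every cache name starts with the same prefix, changing one string is enough to mark all of the old caches for deletion the next time the Service Worker activates.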

With the exception of the static cache, which is used for static assets, every cache has a limit to the number of items that may be stored. I only cache the most recent 5 pages someone has visited, for instance. Images are limited to the most recent 75, and so on. This is an approach that Jeremy Keith outlines in his fantastic book Going Offline (which you should really read if you haven’t already—here’s a sample).

With these cache definitions in place, I can clean up my caches periodically and prune the oldest items. Here’s Jeremy’s recommended code for this approach:

function trimCache(cacheName, maxItems) {
  // Open the cache
  caches.open(cacheName)
  .then( cache => {
    // Get the keys and count them
    cache.keys()
    .then(keys => {
      // Do we have more than we should?
      if (keys.length > maxItems) {
        // Delete the oldest item and run trim again
        cache.delete(keys[0])
        .then( () => {
          trimCache(cacheName, maxItems)
        });
      }
    });
  });
}

We can trigger this code to run whenever a new page loads. By running it in the Service Worker, it runs in a separate thread and won’t drag down the site’s responsiveness. We trigger it by posting a message (using postMessage()) to the Service Worker from the main JavaScript thread:

// First check to see if you have an active service worker
if ( "serviceWorker" in navigator && navigator.serviceWorker.controller ) {
  // Then add an event listener
  window.addEventListener( "load", function(){
    // Tell the service worker to clean up
    navigator.serviceWorker.controller.postMessage( "clean up" );
  });
}

The final step in wiring it all up is setting up the Service Worker to receive the message:

addEventListener("message", messageEvent => {
  if (messageEvent.data == "clean up") {
    // loop through the caches
    for ( let key in sw_caches ) {
      // if the cache has a limit
      if ( sw_caches[key].limit !== undefined ) {
        // trim it to that limit
        trimCache( sw_caches[key].name, sw_caches[key].limit );
      }
    }
  }
});

Here, the Service Worker listens for inbound messages and responds to the “clean up” request by running trimCache() on each of the cache buckets with a defined limit.

This approach is by no means elegant, but it works. It would be far better to make decisions about purging cached responses based on how frequently each item is accessed and/or how much room it takes up on disk. (Removing cached items based purely on when they were cached isn’t nearly as useful.) Sadly, we don’t have that level of detail when it comes to inspecting the caches…yet. I’m actually working to address this limitation in the Cache API right now.

Your users always come first

The technologies underlying Progressive Web Apps are continuing to mature, but even if you aren’t interested in turning your site into a PWA, there’s so much you can do today to improve your users’ experiences when it comes to media. And, as with every other form of inclusive design, it starts with centering on your users who are most at risk of having an awful experience.

Draw distinctions between critical, nice-to-have, and superfluous media. Remove the cruft, then optimize the bejeezus out of each remaining asset. Serve your media in multiple formats and sizes, prioritizing the smallest versions first to make the most of high-latency and slow connections. If your users say they want to save data, respect that and have a fallback plan in place. Cache wisely and with the utmost respect for your users’ disk space. And, finally, audit your caching strategies regularly, especially when it comes to large media files.

Follow these guidelines, and every one of your users (from folks rocking a JioPhone on a rural mobile network in India to people on a high-end gaming laptop wired to a 10 Gbps fiber line in Silicon Valley) will thank you.
