Once upon a time, we relied on browsers to handle caching for us; as developers in those days, we had very little control. But then came Progressive Web Apps (PWAs), Service Workers, and the Cache API—and suddenly we have expansive power over what gets put in the cache and how it gets put there. We can now cache everything we want to… and therein lies a potential problem.
Media files—especially images—make up the bulk of average page weight these days, and it’s getting worse. In order to improve performance, it’s tempting to cache as much of this content as possible, but should we? In most cases, no. Even with all this newfangled technology at our fingertips, great performance still hinges on a simple rule: request only what you need and make each request as small as possible.
To provide the best possible experience for our users without abusing their network connection or their hard drive, it’s time to put a spin on some classic best practices, experiment with media caching strategies, and play around with a few Cache API tricks that Service Workers have hidden up their sleeves.
All those lessons we learned optimizing web pages for dial-up became super-useful again when mobile took off, and they continue to be applicable in the work we do for a global audience today. Unreliable or high latency network connections are still the norm in many parts of the world, reminding us that it’s never safe to assume a technical baseline lifts evenly or in sync with its corresponding cutting edge. And that’s the thing about performance best practices: history has borne out that approaches that are good for performance now will continue being good for performance in the future.
Before the advent of Service Workers, we could provide some instructions to browsers with respect to how long they should cache a particular resource, but that was about it. Documents and assets downloaded to a user’s machine would be dropped into a directory on their hard drive. When the browser assembled a request for a particular document or asset, it would peek in the cache first to see if it already had what it needed to possibly avoid hitting the network.
We have considerably more control over network requests and the cache these days, but that doesn’t excuse us from being thoughtful about the resources on our web pages.
Request only what you need#section3
As I mentioned, the web today is lousy with media. Images and videos have become a dominant means of communication. They may convert well when it comes to sales and marketing, but they are hardly performant when it comes to download and rendering speed. With this in mind, each and every image (and video, etc.) should have to fight for its place on the page.
A few years back, a recipe of mine was included in a newspaper story on cooking with spirits (alcohol, not ghosts). I don’t subscribe to the print version of that paper, so when the article came out I went to the site to take a look at how it turned out. During a recent redesign, the site had decided to load all articles into a nearly full-screen modal viewbox layered on top of their homepage. This meant requesting the article required requests for all of the assets associated with the article page plus all the contents and assets for the homepage. Oh, and the homepage had video ads—plural. And, yes, they auto-played.
I popped open DevTools and discovered the page had blown past 15 MB in page weight. Tim Kadlec had recently launched What Does My Site Cost?, so I decided to check out the damage. Turns out that the actual cost to view that page for the average US-based user was more than the cost of the print version of that day’s newspaper. That’s just messed up.
Sure, I could blame the folks who built the site for doing their readers such a disservice, but the reality is that none of us go to work with the goal of worsening our users’ experiences. This could happen to any of us. We could spend days scrutinizing the performance of a page only to have some committee decide to set that carefully crafted page atop a Times Square of auto-playing video ads. Imagine how much worse things would be if we were stacking two abysmally-performing pages on top of each other!
Media can be great for drawing attention when competition is high (e.g., on the homepage of a newspaper), but when you want readers to focus on a single task (e.g., reading the actual article), its value can drop from important to “nice to have.” Yes, studies have shown that images excel at drawing eyeballs, but once a visitor is on the article page, no one cares; we’re just making it take longer to download and more expensive to access. The situation only gets worse as we shove more media into the page.
We must do everything in our power to reduce the weight of our pages, so avoid requests for things that don’t add value. For starters, if you’re writing an article about a data breach, resist the urge to include that ridiculous stock photo of some random dude in a hoodie typing on a computer in a very dark room.
Request the smallest file you can#section4
Now that we’ve taken stock of what we do need to include, we must ask ourselves a critical question: How can we deliver it in the fastest way possible? This can be as simple as choosing the most appropriate image format for the content presented (and optimizing the heck out of it) or as complex as recreating assets entirely (for example, if switching from raster to vector imagery would be more efficient).
Offer alternate formats#section5
When it comes to image formats, we don’t have to choose between performance and reach anymore. We can provide multiple options and let the browser decide which one to use, based on what it can handle.
You can accomplish this by offering multiple
sources within a
video element. Start by creating multiple formats of the media asset. For example, with WebP and JPG, it’s likely that the WebP will have a smaller file size than the JPG (but check to make sure). With those alternate sources, you can drop them into a
picture like this:
Browsers that recognize the
picture element will check the
source element before making a decision about which image to request. If the browser supports the MIME type “image/webp,” it will kick off a request for the WebP format image. If not (or if the browser doesn’t recognize
picture), it will request the JPG.
You can take the same approach with video files:
Browsers that support WebM will request the first
source, whereas browsers that don’t—but do understand MP4 videos—will request the second one. Browsers that don’t support the
video element will fall back to the paragraph about downloading the file.
The order of your
source elements matters. Browsers will choose the first usable
source, so if you specify an optimized alternative format after a more widely compatible one, the alternative format may never get picked up.
Depending on your situation, you might consider bypassing this markup-based approach and handle things on the server instead. For example, if a JPG is being requested and the browser supports WebP (which is indicated in the
Accept header), there’s nothing stopping you from replying with a WebP version of the resource. In fact, some CDN services—Cloudinary, for instance—come with this sort of functionality right out of the box.
Offer different sizes#section6
Formats aside, you may want to deliver alternate image sizes optimized for the current size of the browser’s viewport. After all, there’s no point loading an image that’s 3–4 times larger than the screen rendering it; that’s just wasting bandwidth. This is where responsive images come in.
Here’s an example:
There’s a lot going on in this super-charged
img element, so I’ll break it down:
imgoffers three size options for a given JPG: 256 px wide (
small.jpg), 512 px wide (
medium.jpg), and 1024 px wide (
large.jpg). These are provided in the
srcsetattribute with corresponding width descriptors.
srcdefines a default image source, which acts as a fallback for browsers that don’t support
srcset. Your choice for the default image will likely depend on the context and general usage patterns. Often I’d recommend the smallest image be the default, but if the majority of your traffic is on older desktop browsers, you might want to go with the medium-sized image.
sizesattribute is a presentational hint that informs the browser how the image will be rendered in different scenarios (its extrinsic size) once CSS has been applied. This particular example says that the image will be the full width of the viewport (
100vw) until the viewport reaches 30 em in width (
min-width: 30em), at which point the image will be 30 em wide. You can make the
sizesvalue as complicated or as simple as you want; omitting it causes browsers to use the default value of
You can even combine this approach with alternate formats and crops within a single
All of this is to say that you have a number of tools at your disposal for delivering fast-loading media, so use them!
Defer requests (when possible)#section7
Years ago, Internet Explorer 11 introduced a new attribute that enabled developers to de-prioritize specific
img elements to speed up page rendering:
loading attribute supports three values (“auto,” “lazy,” and “eager”) to define how a resource should be brought in. For our purposes, the “lazy” value is the most interesting because it defers loading the resource until it reaches a calculated distance from the viewport.
Adding that into the mix…
This attribute offers a bit of a performance boost in Chromium-based browsers. Hopefully it will become a standard and get picked up by other browsers in the future, but in the meantime there’s no harm in including it because browsers that don’t understand the attribute will simply ignore it.
This approach complements a media prioritization strategy really well, but before I get to that, I want to take a closer look at Service Workers.
Manipulate requests in a Service Worker#section8
Service Workers are a special type of Web Worker with the ability to intercept, modify, and respond to all network requests via the Fetch API. They also have access to the Cache API, as well as other asynchronous client-side data stores like IndexedDB for resource storage.
When a Service Worker is installed, you can hook into that event and prime the cache with resources you want to use later. Many folks use this opportunity to squirrel away copies of global assets, including styles, scripts, logos, and the like, but you can also use it to cache images for use when network requests fail.
Keep a fallback image in your back pocket#section9
Assuming you want to use a fallback in more than one networking recipe, you can set up a named function that will respond with that resource:
Then, within a
fetch event handler, you can use that function to provide that fallback image when requests for images fail at the network:
When the network is available, users get the expected behavior:
But when the network is interrupted, images will be swapped automatically for a fallback, and the user experience is still acceptable:
On the surface, this approach may not seem all that helpful in terms of performance since you’ve essentially added an additional image download into the mix. With this system in place, however, some pretty amazing opportunities open up to you.
Respect a user’s choice to save data#section10
Some users reduce their data consumption by entering a “lite” mode or turning on a “data saver” feature. When this happens, browsers will often send a
Save-Data header with their network requests.
Within your Service Worker, you can look for this header and adjust your responses accordingly. First, you look for the header:
Then, within your
fetch handler for images, you might choose to preemptively respond with the fallback image instead of going to the network at all:
You could even take this a step further and tune
respondWithFallbackImage() to provide alternate images based on what the original request was for. To do that you’d define several fallbacks globally in the Service Worker:
Both of those files should then be cached during the Service Worker install event:
respondWithFallbackImage() you could serve up the appropriate image based on the URL being fetched. In my site, the avatars are pulled from Webmention.io, so I test for that.
With that change, I’ll need to update the
fetch handler to pass in
request.url as an argument to
respondWithFallbackImage(). Once that’s done, when the network gets interrupted I end up seeing something like this:
Next, we need to establish some general guidelines for handling media assets—based on the situation, of course.
The caching strategy: prioritize certain media#section11
In my experience, media—especially images—on the web tend to fall into three categories of necessity. At one end of the spectrum are elements that don’t add meaningful value. At the other end of the spectrum are critical assets that do add value, such as charts and graphs that are essential to understanding the surrounding content. Somewhere in the middle are what I would call “nice-to-have” media. They do add value to the core experience of a page but are not critical to understanding the content.
If you consider your media with this division in mind, you can establish some general guidelines for handling each, based on the situation. In other words, a caching strategy.
|Media category||Fast connection||
||Slow connection||No network|
|Critical||Load media||Replace with placeholder|
|Nice-to-have||Load media||Replace with placeholder|
|Non-critical||Remove from content entirely|
When it comes to disambiguating the critical from the nice-to-have, it’s helpful to have those resources organized into separate directories (or similar). That way we can add some logic into the Service Worker that can help it decide which is which. For example, on my own personal site, critical images are either self-hosted or come from the website for my book. Knowing that, I can write regular expressions that match those domains:
high_priority variable defined, I can create a function that will let me know if a given image request (for example) is a high priority request or not:
Adding support for prioritizing media requests only requires adding a new conditional into the
fetch event handler, like we did with
Save-Data. Your specific recipe for network and cache handling will likely differ, but here was how I chose to mix in this logic within image requests:
We can apply this prioritized approach to many kinds of assets. We could even use it to control which pages are served cache-first vs. network-first.
Keep the cache tidy#section12
The ability to control which resources are cached to disk is a huge opportunity, but it also carries with it an equally huge responsibility not to abuse it.
Every caching strategy is likely to differ, at least a little bit. If we’re publishing a book online, for instance, it might make sense to cache all of the chapters, images, etc. for offline viewing. There’s a fixed amount of content and—assuming there aren’t a ton of heavy images and videos—users will benefit from not having to download each chapter separately.
On a news site, however, caching every article and photo will quickly fill up our users’ hard drives. If a site offers an indeterminate number of pages and assets, it’s critical to have a caching strategy that puts hard limits on how many resources we’re caching to disk.
One way to do this is to create several different blocks associated with caching different forms of content. The more ephemeral content caches can have strict limits around how many items can be stored. Sure, we’ll still be bound to the storage limits of the device, but do we really want our website to take up 2 GB of someone’s hard drive?
Here’s an example, again from my own site:
Here I’ve defined several caches, each with a
name used for addressing it in the Cache API and a
version prefix. The
version is defined elsewhere in the Service Worker, and allows me to purge all caches at once if necessary.
With the exception of the
static cache, which is used for static assets, every cache has a
limit to the number of items that may be stored. I only cache the most recent 5 pages someone has visited, for instance. Images are limited to the most recent 75, and so on. This is an approach that Jeremy Keith outlines in his fantastic book Going Offline (which you should really read if you haven’t already—here’s a sample).
With these cache definitions in place, I can clean up my caches periodically and prune the oldest items. Here’s Jeremy’s recommended code for this approach:
We can trigger this code to run whenever a new page loads. By running it in the Service Worker, it runs in a separate thread and won’t drag down the site’s responsiveness. We trigger it by posting a message (using
The final step in wiring it all up is setting up the Service Worker to receive the message:
Here, the Service Worker listens for inbound messages and responds to the “clean up” request by running
trimCache() on each of the cache buckets with a defined
This approach is by no means elegant, but it works. It would be far better to make decisions about purging cached responses based on how frequently each item is accessed and/or how much room it takes up on disk. (Removing cached items based purely on when they were cached isn’t nearly as useful.) Sadly, we don’t have that level of detail when it comes to inspecting the caches…yet. I’m actually working to address this limitation in the Cache API right now.
Your users always come first#section13
The technologies underlying Progressive Web Apps are continuing to mature, but even if you aren’t interested in turning your site into a PWA, there’s so much you can do today to improve your users’ experiences when it comes to media. And, as with every other form of inclusive design, it starts with centering on your users who are most at risk of having an awful experience.
Draw distinctions between critical, nice-to-have, and superfluous media. Remove the cruft, then optimize the bejeezus out of each remaining asset. Serve your media in multiple formats and sizes, prioritizing the smallest versions first to make the most of high latency and slow connections. If your users say they want to save data, respect that and have a fallback plan in place. Cache wisely and with the utmost respect for your users’ disk space. And, finally, audit your caching strategies regularly—especially when it comes to large media files.Follow these guidelines, and every one of your users—from folks rocking a JioPhone on a rural mobile network in India to people on a high-end gaming laptop wired to a 10 Gbps fiber line in Silicon Valley—will thank you.
Got something to say?
We have turned off comments, but you can see what folks had to say before we did so.
More from ALA
Personalization Pyramid: A Framework for Designing with User Data
Mobile-First CSS: Is It Time for a Rethink?
Designers, (Re)define Success First
Breaking Out of the Box
How to Sell UX Research with Two Simple Questions