Now THAT’S What I Call Service Worker!

The Service Worker API is the Dremel of the web platform. It offers incredibly broad utility while also yielding resiliency and better performance. If you’ve not used Service Worker yet—and you couldn’t be blamed if so, as it hasn’t seen wide adoption as of 2020—it goes something like this:

Article Continues Below

On the initial visit to a website, the browser registers what amounts to a client-side proxy powered by a comparably paltry amount of JavaScript that—like a Web Worker—runs on its own thread.
After the Service Worker’s registration, you can intercept requests and decide how to respond to them in the Service Worker’s fetch() event.

What you decide to do with requests you intercept is a) your call and b) depends on your website. You can rewrite requests, precache static assets during install, provide offline functionality, and—as will be our eventual focus—deliver smaller HTML payloads and better performance for repeat visitors.

Getting out of the woods#section2

Weekly Timber is a client of mine that provides logging services in central Wisconsin. For them, a fast website is vital. Their business is located in Waushara County, and like many rural stretches in the United States, network quality and reliability isn’t great.

A screenshot of a wireless coverage map for Waushara County, Wisconsin with a color overlay. Most of the overlay is colored tan, which represents areas of the county which have downlink speeds between 3 and 9.99 megabits per second. There are sparse light blue and dark blue areas which indicate faster service, but are far from being the majority of the county. — **Figure 1.** A wireless coverage map of Waushara County, Wisconsin. The tan areas of the map indicate downlink speeds between 3 and 9.99 Mbps. Red areas are even slower, while the pale and dark blue areas are faster.

Wisconsin has farmland for days, but it also has plenty of forests. When you need a company that cuts logs, Google is probably your first stop. How fast a given logging company’s website is might be enough to get you looking elsewhere if you’re left waiting too long on a crappy network connection.

I initially didn’t believe a Service Worker was necessary for Weekly Timber’s website. After all, if things were plenty fast to start with, why complicate things? On the other hand, knowing that my client services not just Waushara County, but much of central Wisconsin, even a barebones Service Worker could be the kind of progressive enhancement that adds resilience in the places it might be needed most.

The first Service Worker I wrote for my client’s website—which I’ll refer to henceforth as the “standard” Service Worker—used three well-documented caching strategies:

Precache CSS and JavaScript assets for all pages when the Service Worker is installed when the window’s load event fires.
Serve static assets out of CacheStorage if available. If a static asset isn’t in CacheStorage, retrieve it from the network, then cache it for future visits.
For HTML assets, hit the network first and place the HTML response into CacheStorage. If the network is unavailable the next time the visitor arrives, serve the cached markup from CacheStorage.

These are neither new nor special strategies, but they provide two benefits:

Offline capability, which is handy when network conditions are spotty.
A performance boost for loading static assets.

That performance boost translated to a 42% and 48% decrease in the median time to First Contentful Paint (FCP) and Largest Contentful Paint (LCP), respectively. Better yet, these insights are based on Real User Monitoring (RUM). That means these gains aren’t just theoretical, but a real improvement for real people.

A screenshot of request/response timings in Chrome's developer tools. It depicts a service worker on a page serving a static asset from CacheStorage in roughly 23 milliseconds. — **Figure 2.** A breakdown of request/response timings depicted in Chrome’s developer tools. The request is for a static asset from `CacheStorage`. Because the Service Worker doesn’t need to access the network, it takes about 23 milliseconds to “download” the asset from `CacheStorage`.

This performance boost is from bypassing the network entirely for static assets already in CacheStorage—particularly render-blocking stylesheets. A similar benefit is realized when we rely on the HTTP cache, only the FCP and LCP improvements I just described are in comparison to pages with a primed HTTP cache without an installed Service Worker.

If you’re wondering why CacheStorage and the HTTP cache aren’t equal, it’s because the HTTP cache—at least in some cases—may still involve a trip to the server to verify asset freshness. Cache-Control’s immutable flag gets around this, but immutable doesn’t have great support yet. A long max-age value works, too, but the combination of Service Worker API and CacheStorage gives you a lot more flexibility.

Details aside, the takeaway is that the simplest and most well-established Service Worker caching practices can improve performance. Potentially more than what well-configured Cache-Control headers can provide. Even so, Service Worker is an incredible technology with far more possibilities. It’s possible to go farther, and I’ll show you how.

A better, faster Service Worker#section3

The web loves itself some “innovation,” which is a word we equally love to throw around. To me, true innovation isn’t when we create new frameworks or patterns solely for the benefit of developers, but whether those inventions benefit people who end up using whatever it is we slap up on the web. The priority of constituencies is a thing we ought to respect. Users above all else, always.

The Service Worker API’s innovation space is considerable. How you work within that space can have a big effect on how the web is experienced. Things like navigation preload and ReadableStream have taken Service Worker from great to killer. We can do the following with these new capabilities, respectively:

Reduce Service Worker latency by parallelizing Service Worker startup time and navigation requests.
Stream content in from CacheStorage and the network.

Moreover, we’re going to combine these capabilities and pull out one more trick: precache header and footer partials, then combine them with content partials from the network. This not only reduces how much data we download from the network, but it also improves perceptual performance for repeat visits. That’s innovation that helps everyone.

Grizzled, I turn to you and say “let’s do this.”

Laying the groundwork#section4

If the idea of combining precached header and footer partials with network content on the fly seems like a Single Page Application (SPA), you’re not far off. Like an SPA, you’ll need to apply the “app shell” model to your website. Only instead of a client-side router plowing content into one piece of minimal markup, you have to think of your website as three separate parts:

The header.
The content.
The footer.

For my client’s website, that looks like this:

A screenshot of the Weekly Timber website color coded to delineate each partial that makes up the page. The header is color coded as blue, the footer as red, and the main content in between as yellow. — **Figure 3.** A color coding of the Weekly Timber website’s different partials. The Footer and Header partials are stored in `CacheStorage`, while the Content partial is retrieved from the network unless the user is offline.

The thing to remember here is that the individual partials don’t have to be valid markup in the sense that all tags need to be closed within each partial. The only thing that counts in the final sense is that the combination of these partials must be valid markup.

To start, you’ll need to precache separate header and footer partials when the Service Worker is installed. For my client’s website, these partials are served from the /partial-header and /partial-footer pathnames:

self.addEventListener("install", event => {
  const cacheName = "fancy_cache_name_here";
  const precachedAssets = [
    "/partial-header",  // The header partial
    "/partial-footer",  // The footer partial
    // Other assets worth precaching
  ];

  event.waitUntil(caches.open(cacheName).then(cache => {
    return cache.addAll(precachedAssets);
  }).then(() => {
    return self.skipWaiting();
  }));
});

Every page must be fetchable as a content partial minus the header and footer, as well as a full page with the header and footer. This is key because the initial visit to a page won’t be controlled by a Service Worker. Once the Service Worker takes over, then you serve content partials and assemble them into complete responses with the header and footer partials from CacheStorage.

If your site is static, this means generating a whole other mess of markup partials that you can rewrite requests to in the Service Worker’s fetch() event. If your website has a back end—as is the case with my client—you can use an HTTP request header to instruct the server to deliver full pages or content partials.

The hard part is putting all the pieces together—but we’ll do just that.

Putting it all together#section5

Writing even a basic Service Worker can be challenging, but things get real complicated real fast when assembling multiple responses into one. One reason for this is that in order to avoid the Service Worker startup penalty, we’ll need to set up navigation preload.

Implementing navigation preload#section6

Navigation preload addresses the problem of Service Worker startup time, which delays navigation requests to the network. The last thing you want to do with a Service Worker is hold up the show.

Navigation preload must be explicitly enabled. Once enabled, the Service Worker won’t hold up navigation requests during startup. Navigation preload is enabled in the Service Worker’s activate event:

self.addEventListener("activate", event => {
  const cacheName = "fancy_cache_name_here";
  const preloadAvailable = "navigationPreload" in self.registration;

  event.waitUntil(caches.keys().then(keys => {
    return Promise.all([
      keys.filter(key => {
        return key !== cacheName;
      }).map(key => {
        return caches.delete(key);
      }),
      self.clients.claim(),
      preloadAvailable ? self.registration.navigationPreload.enable() : true
    ]);
  }));
});

Because navigation preload isn’t supported everywhere, we have to do the usual feature check, which we store in the above example in the preloadAvailable variable.

Additionally, we need to use Promise.all() to resolve multiple asynchronous operations before the Service Worker activates. This includes pruning those old caches, as well as waiting for both clients.claim() (which tells the Service Worker to assert control immediately rather than waiting until the next navigation) and navigation preload to be enabled.

A ternary operator is used to enable navigation preload in supporting browsers and avoid throwing errors in browsers that don’t. If preloadAvailable is true, we enable navigation preload. If it isn’t, we pass a Boolean that won’t affect how Promise.all() resolves.

With navigation preload enabled, we need to write code in our Service Worker’s fetch() event handler to make use of the preloaded response:

self.addEventListener("fetch", event => {
  const { request } = event;

  // Static asset handling code omitted for brevity
  // ...

  // Check if this is a request for a document
  if (request.mode === "navigate") {
    const networkContent = Promise.resolve(event.preloadResponse).then(response => {
      if (response) {
        addResponseToCache(request, response.clone());

        return response;
      }

      return fetch(request.url, {
        headers: {
          "X-Content-Mode": "partial"
        }
      }).then(response => {
        addResponseToCache(request, response.clone());

        return response;
      });
    }).catch(() => {
      return caches.match(request.url);
    });

    // More to come...
  }
});

Though this isn’t the entirety of the Service Worker’s fetch() event code, there’s a lot that needs explaining:

The preloaded response is available in event.preloadResponse. However, as Jake Archibald notes, the value of event.preloadResponse will be undefined in browsers that don’t support navigation preload. Therefore, we must pass event.preloadResponse to Promise.resolve() to avoid compatibility issues.
We adapt in the resulting then callback. If event.preloadResponse is supported, we use the preloaded response and add it to CacheStorage via an addResponseToCache() helper function. If not, we send a fetch() request to the network to get the content partial using a custom X-Content-Mode header with a value of partial.
Should the network be unavailable, we fall back to the most recently accessed content partial in CacheStorage.
The response—regardless of where it was procured from—is then returned to a variable named networkContent that we use later.

How the content partial is retrieved is tricky. With navigation preload enabled, a special Service-Worker-Navigation-Preload header with a value of true is added to navigation requests. We then work with that header on the back end to ensure the response is a content partial rather than the full page markup.

However, because navigation preload isn’t available in all browsers, we send a different header in those scenarios. In Weekly Timber’s case, we fall back to a custom X-Content-Mode header. In my client’s PHP back end, I’ve created some handy constants:

<?php

// Is this a navigation preload request?
define("NAVIGATION_PRELOAD", isset($_SERVER["HTTP_SERVICE_WORKER_NAVIGATION_PRELOAD"]) && stristr($_SERVER["HTTP_SERVICE_WORKER_NAVIGATION_PRELOAD"], "true") !== false);

// Is this an explicit request for a content partial?
define("PARTIAL_MODE", isset($_SERVER["HTTP_X_CONTENT_MODE"]) && stristr($_SERVER["HTTP_X_CONTENT_MODE"], "partial") !== false);

// If either is true, this is a request for a content partial
define("USE_PARTIAL", NAVIGATION_PRELOAD === true || PARTIAL_MODE === true);

?>

From there, the USE_PARTIAL constant is used to adapt the response:

<?php

if (USE_PARTIAL === false) {
  require_once("partial-header.php");
}

require_once("includes/home.php");

if (USE_PARTIAL === false) {
  require_once("partial-footer.php");
}

?>

The thing to be hip to here is that you should specify a Vary header for HTML responses to take the Service-Worker-Navigation-Preload (and in this case, the X-Content-Mode header) into account for HTTP caching purposes—assuming you’re caching HTML at all, which may not be the case for you.

With our handling of navigation preloads complete, we can then move onto the work of streaming content partials from the network and stitching them together with the header and footer partials from CacheStorage into a single response that the Service Worker will provide.

Streaming partial content and stitching together responses#section7

While the header and footer partials will be available almost instantaneously because they’ve been in CacheStorage since the Service Worker’s installation, it’s the content partial we retrieve from the network that will be the bottleneck. It’s therefore vital that we stream responses so we can start pushing markup to the browser as quickly as possible. ReadableStream can do this for us.

This ReadableStream business is a mind-bender. Anyone who tells you it’s “easy” is whispering sweet nothings to you. It’s hard. After I wrote my own function to merge streamed responses and messed up a critical step—which ended up not improving page performance, mind you—I modified Jake Archibald’s mergeResponses() function to suit my needs:

async function mergeResponses (responsePromises) {
  const readers = responsePromises.map(responsePromise => {
    return Promise.resolve(responsePromise).then(response => {
      return response.body.getReader();
    });
  });

  let doneResolve,
      doneReject;

  const done = new Promise((resolve, reject) => {
    doneResolve = resolve;
    doneReject = reject;
  });

  const readable = new ReadableStream({
    async pull (controller) {
      const reader = await readers[0];

      try {
        const { done, value } = await reader.read();

        if (done) {
          readers.shift();

          if (!readers[0]) {
            controller.close();
            doneResolve();

            return;
          }

          return this.pull(controller);
        }

        controller.enqueue(value);
      } catch (err) {
        doneReject(err);
        throw err;
      }
    },
    cancel () {
      doneResolve();
    }
  });

  const headers = new Headers();
  headers.append("Content-Type", "text/html");

  return {
    done,
    response: new Response(readable, {
      headers
    })
  };
}

As usual, there’s a lot going on:

mergeResponses() accepts an argument named responsePromises, which is an array of Response objects returned from either a navigation preload, fetch(), or caches.match(). Assuming the network is available, this will always contain three responses: two from caches.match() and (hopefully) one from the network.
Before we can stream the responses in the responsePromises array, we must map responsePromises to an array containing one reader for each response. Each reader is used later in a ReadableStream() constructor to stream each response’s contents.
A promise named done is created. In it, we assign the promise’s resolve() and reject() functions to the external variables doneResolve and doneReject, respectively. These will be used in the ReadableStream() to signal whether the stream is finished or has hit a snag.
The new ReadableStream() instance is created with a name of readable. As responses stream in from CacheStorage and the network, their contents will be appended to readable.
The stream’s pull() method streams the contents of the first response in the array. If the stream isn’t canceled somehow, the reader for each response is discarded by calling the readers array’s shift() method when the response is fully streamed. This repeats until there are no more readers to process.
When all is done, the merged stream of responses is returned as a single response, and we return it with a Content-Type header value of text/html.

This is much simpler if you use TransformStream, but depending on when you read this, that may not be an option for every browser. For now, we’ll have to stick with this approach.

Now let’s revisit the Service Worker’s fetch() event from earlier, and apply the mergeResponses() function:

self.addEventListener("fetch", event => {
  const { request } = event;

  // Static asset handling code omitted for brevity
  // ...

  // Check if this is a request for a document
  if (request.mode === "navigate") {
    // Navigation preload/fetch() fallback code omitted.
    // ...

    const { done, response } = await mergeResponses([
      caches.match("/partial-header"),
      networkContent,
      caches.match("/partial-footer")
    ]);

    event.waitUntil(done);
    event.respondWith(response);
  }
});

At the end of the fetch() event handler, we pass the header and footer partials from CacheStorage to the mergeResponses() function, and pass the result to the fetch() event’s respondWith() method, which serves the merged response on behalf of the Service Worker.

Are the results worth the hassle?#section8

This is a lot of stuff to do, and it’s complicated! You might mess something up, or maybe your website’s architecture isn’t well-suited to this exact approach. So it’s important to ask: are the performance benefits worth the work? In my view? Yes! The synthetic performance gains aren’t bad at all:

A bar graph comparing First Contentful Paint and Largest Contentful Paint performance for the Weekly Timber website for scenarios in which there is no service worker, a "standard" service worker, and a streaming service worker that stitches together content partials from CacheStorage and the network. The first two scenarios are basically the same, while the streaming service worker delivers measurably better performance for both FCP and LCP—especially for FCP! — **Figure 4.** A bar chart of median FCP and LCP synthetic performance data across various Service Worker types for the Weekly Timber website.

Synthetic tests don’t measure performance for anything except the specific device and internet connection they’re performed on. Even so, these tests were conducted on a staging version of my client’s website with a low-end Nokia 2 Android phone on a throttled “Fast 3G” connection in Chrome’s developer tools. Each category was tested ten times on the homepage. The takeaways here are:

No Service Worker at all is slightly faster than the “standard” Service Worker with simpler caching patterns than the streaming variant. Like, ever so slightly faster. This may be due to the delay introduced by Service Worker startup, however, the RUM data I’ll go over shows a different case.
Both LCP and FCP are tightly coupled in scenarios where there’s no Service Worker or when the “standard” Service Worker is used. This is because the content of the page is pretty simple and the CSS is fairly small. The Largest Contentful Paint is usually the opening paragraph on a page.
However, the streaming Service Worker decouples FCP and LCP because the header content partial streams in right away from CacheStorage.
Both FCP and LCP are lower in the streaming Service Worker than in other cases.

A bar chart comparing the RUM median FCP and LCP performance of no service worker, a "standard" service worker, and a streaming service worker. Both the "standard" and streaming service worker offer better FCP and LCP performance over no service worker, but the streaming service worker excels at FCP performance, while only being slightly slower at LCP than the "standard" service worker. — **Figure 5.** A bar chart of median FCP and LCP RUM performance data across various Service Worker types for the Weekly Timber website.

The benefits of the streaming Service Worker for real users is pronounced. For FCP, we receive an 79% improvement over no Service Worker at all, and a 63% improvement over the “standard” Service Worker. The benefits for LCP are more subtle. Compared to no Service Worker at all, we realize a 41% improvement in LCP—which is incredible! However, compared to the “standard” Service Worker, LCP is a touch slower.

Because the long tail of performance is important, let’s look at the 95th percentile of FCP and LCP performance:

The 95th percentile of RUM data is a great place to assess the slowest experiences. In this case, we see that the streaming Service Worker confers a 40% and 51% improvement in FCP and LCP, respectively, over no Service Worker at all. Compared to the “standard” Service Worker, we see a reduction in FCP and LCP by 19% and 43%, respectively. If these results seem a bit squirrely compared to synthetic metrics, remember: that’s RUM data for you! You never know who’s going to visit your website on which device on what network.

While both FCP and LCP are boosted by the myriad benefits of streaming, navigation preload (in Chrome’s case), and sending less markup by stitching together partials from both CacheStorage and the network, FCP is the clear winner. Perceptually speaking, the benefit is pronounced, as this video would suggest:

Figure 7. Three WebPageTest videos of a repeat view of the Weekly Timber homepage. On the left is the page not controlled by a Service Worker, with a primed HTTP cache. On the right is the same page controlled by a streaming Service Worker, with CacheStorage primed.

Now ask yourself this: If this is the kind of improvement we can expect on such a small and simple website, what might we expect on a website with larger header and footer markup payloads?

Caveats and conclusions#section9

Are there trade-offs with this on the development side? Oh yeah.

As Philip Walton has noted, a cached header partial means the document title must be updated in JavaScript on each navigation by changing the value of document.title. It also means you’ll need to update the navigation state in JavaScript to reflect the current page if that’s something you do on your website. Note that this shouldn’t cause indexing issues, as Googlebot crawls pages with an unprimed cache.

There may also be some challenges on sites with authentication. For example, if your site’s header displays the current authenticated user on log in, you may have to update the header partial markup provided by CacheStorage in JavaScript on each navigation to reflect who is authenticated. You may be able to do this by storing basic user data in localStorage and updating the UI from there.

There are certainly other challenges, but it’ll be up to you to weigh the user-facing benefits versus the development costs. In my opinion, this approach has broad applicability in applications such as blogs, marketing websites, news websites, ecommerce, and other typical use cases.

All in all, though, it’s akin to the performance improvements and efficiency gains that you’d get from an SPA. Only the difference is that you’re not replacing time-tested navigation mechanisms and grappling with all the messiness that entails, but enhancing them. That’s the part I think is really important to consider in a world where client-side routing is all the rage.

“What about Workbox?,” you might ask—and you’d be right to. Workbox simplifies a lot when it comes to using the Service Worker API, and you’re not wrong to reach for it. Personally, I prefer to work as close to the metal as I can so I can gain a better understanding of what lies beneath abstractions like Workbox. Even so, Service Worker is hard. Use Workbox if it suits you. As far as frameworks go, its abstraction cost is very low.

Regardless of this approach, I think there’s incredible utility and power in using the Service Worker API to reduce the amount of markup you send. It benefits my client and all the people that use their website. Because of Service Worker and the innovation around its use, my client’s website is faster in the far-flung parts of Wisconsin. That’s something I feel good about.

Special thanks to Jake Archibald for his valuable editorial advice, which, to put it mildly, considerably improved the quality of this article.

No Comments

Got something to say?

We have turned off comments, but you can see what folks had to say before we did so.

More from ALA

Design for Amiability: Lessons from Vienna

by Mark Bernstein

Computing was born in a Viennese café. Between 1928 and 1934, while Hitler plotted and Europe crumbled, a motley crew of mathematicians, philosophers, architects, and economists gathered weekly to puzzle out the limits of reason—and invented Computer Science in the process. What made their collaboration possible wasn't just brilliance (though they had plenty). It was amiability: the careful design of a social space where difficult people could disagree without destroying each other. Longtime A List Apart contributing author Mark Bernstein mines this forgotten history for lessons that might just save today's embattled web from its worst impulses. Spoiler: it involves better coffee service and the looming threat of public humiliation.

Design Dialects: Breaking the Rules, Not the System

by Michel Ferreira

Design systems aren't component libraries—they’re living languages. Rigid adherence to visual rules creates brittle systems that break under contextual pressure. Fluent systems bend without breaking.

An Holistic Framework for Shared Design Leadership

Having both a Design Manager and a Lead Designer on the same team is beautiful, but can be messy. To make it work without creating confusion, overlap, or “too many cooks,” check Michel Ferreira’s Holistic Framework for Shared Design Leadership.

From Beta to Bedrock: Build Products that Stick.

by Liam Nugent

Building towards bedrock means sacrificing some short-term growth potential in favour of long-term stability. But the payoff is worth it: products built with a focus on bedrock will outlast and outperform their competitors, and deliver sustained value to users over time. Liam Nugent shows us how.

User Research Is Storytelling

by Gerry Duffy

At a time when budgets for user experience research seem to have reached an all-time low, how do we get stakeholders and executives alike invested in this crucial discipline? Gerry Duffy walks us through how the research we conduct is much like telling a compelling story, complete with a three-act narrative structure, character development, and conflict resolution—with a happy ending for researchers and stakeholders alike.