A List Apart

Issue № 164

Retooling Slashdot with Web Standards

Published in Browsers, CSS, HTML, Accessibility

{Part I of a two-part series.}

Ask an IT person if they know what Slashdot’s tagline is and they’ll reply, “News for Nerds. Stuff that Matters.” Slashdot is a very prominent site, but underneath the hood you will find an old jalopy that could benefit from a web standards mechanic.

In this article we will show how an engine overhaul could take place by converting a single Slashdot page from their current HTML 3.2 code, nested tables, and invalid, nonsemantic markup, to a finely tuned web standards racing engine. The goal is not to change Slashdot, but to rebuild it with web standards and show the benefits of the transition.

Before you panic because I’m picking on Slashdot, let me inform you that I asked Rob “CmdrTaco” Malda, the guru behind Slashdot, for permission to post this information, and he stated in his reply email:

Have fun. Feel free to submit patches back to us if you come up with anything useful. Slashdot’s source code is open source and available at www.slashcode.com.

The breakdown

We started by freezing a copy of Slashdot on Tuesday July 22, 2003. Once we had a copy of the page, the first step was to remove all non-essential tags. The only tags left were anchors, lists, forms, images, scripts and header information. From the stripped down version, the code was converted to XHTML 1.0 Transitional, and validated. At this point, the page looks like a minefield of links in a sea of information. It’s valid, but not pretty to look at, so on to the next step.
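As a point of reference, a stripped-down page of the kind described above would hang off a skeleton like the following. This is a minimal sketch, not the frozen Slashdot copy: the title, the character encoding, and the link are placeholders, and only the DOCTYPE and namespace are fixed by XHTML 1.0 Transitional.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <!-- Encoding is an assumption; use whatever the real page declares -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Placeholder title</title>
</head>
<body>
  <!-- After stripping, only anchors, lists, forms, images, and scripts remain -->
  <a href="#">Placeholder link</a>
</body>
</html>
```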

The semantic conversion

While viewing the now-valid markup, it became apparent that most of the information would be better described as lists. For example, the images for the categories of Slashdot were now just sitting next to each other. Essentially, any group of more than two items was put into a list: login, sections, help, stories, about, services, and so on. Lists can be described and positioned in any way that we want, and by stating that elements are part of a list, we are describing their relationship to each other.
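To illustrate, a group of section links might be marked up as an unordered list along these lines. The id, the section names, and the URLs are hypothetical stand-ins, not taken from the converted page.

```html
<!-- Hypothetical id, labels, and URLs, for illustration only -->
<ul id="sections">
  <li><a href="/apache/">Apache</a></li>
  <li><a href="/books/">Books</a></li>
  <li><a href="/developers/">Developers</a></li>
  <li><a href="/science/">Science</a></li>
</ul>
```

Because the items are declared as one list, a stylesheet can later lay them out vertically, horizontally, or without bullets, and the markup stays the same.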

The next step was to use header tags. The page has lots of titles and information, but none of it was described appropriately, and nothing explained how one piece related to another. So we gave the title of an article an <h1>, the author information an <h2>, the department an <h3>, and the “read more” area an <h4>. This uniquely identified each part of an article while describing the relationship between those parts. Then came the simple part: identify paragraphs, and clean up the code.
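Under that scheme, a single story might be marked up roughly as follows. The heading levels follow the assignment described above; all of the text and links are placeholders.

```html
<!-- Placeholder text and links; only the heading levels come from the article -->
<h1>Story title goes here</h1>
<h2>Posted by <a href="#">author</a> on Tuesday July 22</h2>
<h3>from the placeholder-department dept.</h3>
<p>One or more paragraphs summarizing the story.</p>
<h4><a href="#">Read More...</a></h4>
```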

The benefit of the semantic conversion is that we are using tags for what they were meant for. It is clear that a list of objects belongs together, and there is a title hierarchy. Semantic markup also helps with search engine optimization.

Boxes everywhere

What we have now is a jumbled mess of well-described information. The information needs to be bound together, with relationships to other information. To begin with, each article is placed into a <div class="node">. Now all information about an article belongs together and all articles are equal, with hierarchy established by the physical order they are placed in.

Next, we uniquely identify each remaining information group, and encapsulate them in their own <div>. For instance:

  • <div id="advertisement">
  • <div id="header">
  • <div id="leftcolumn">
  • <div id="centercolumn">
  • <div id="footer">

The purpose of boxing the information into <div> elements is to group it logically, which makes shaping it easier. The CSS can now address each information group and assign attributes to it, such as layout and design. It’s not necessarily semantic, but it is necessary for the presentation. Here’s the semantically organized example. There’s no CSS layout yet; it’s just structured markup.
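As a rough sketch of that grouping (not the authors' actual markup), the body of the page might be boxed like this. The ids and the node class are the ones named above; the nesting order and the contents are assumptions.

```html
<body>
  <div id="advertisement"><!-- banner --></div>
  <div id="header"><!-- masthead and tagline --></div>
  <div id="leftcolumn">
    <ul><!-- section links --></ul>
  </div>
  <div id="centercolumn">
    <!-- One node per story; source order sets the hierarchy -->
    <div class="node"><!-- h1, h2, h3, paragraphs, h4 --></div>
    <div class="node"><!-- next story --></div>
  </div>
  <div id="footer"><!-- footer links --></div>
</body>
```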

The reconstruction of the skeleton

Now that each information group is identified by a <div>, the page is shaped with CSS so that the design matches the old look and feel. This is a matter of time, patience, and practice. The first goal is not to mimic the old site, but to get things to position themselves correctly in the three-column format with an overall header / masthead and footer. This becomes the first CSS file: layout.css.

The benefit of positioning a page with a single CSS file is simple: you know where to look if there is a positioning problem, and positioning is where most problems turn up. In this step, we were mindful of the page’s behavior in a variety of browsers, so we chose to use the @import feature, as any browsers that don’t support that directive will not get the layout. This includes web-enabled cell phones, PDA devices, old browsers, and other Internet devices. Here’s the page with the positioning CSS applied.
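One way to get a three-column arrangement with a masthead and footer is to float the side columns and give the center column matching margins. The page below is a minimal sketch under that assumption; it is not the contents of the real layout.css, the widths are arbitrary, and the rightcolumn id is hypothetical. The rules are embedded in a style element only to keep the example self-contained.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Layout sketch</title>
  <style type="text/css">
    /* In the article's setup these rules would live in styles/layout.css */
    #leftcolumn  { float: left;  width: 10em; }
    #rightcolumn { float: right; width: 12em; } /* hypothetical id */
    /* Margins keep the center text clear of both floated columns */
    #centercolumn { margin: 0 13em 0 11em; }
    /* The footer starts below whichever column is tallest */
    #footer { clear: both; }
  </style>
</head>
<body>
  <div id="header">Masthead</div>
  <!-- Both floats must come before the center column in the source -->
  <div id="leftcolumn">Sections</div>
  <div id="rightcolumn">Side boxes</div>
  <div id="centercolumn">Stories</div>
  <div id="footer">Footer</div>
</body>
</html>
```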

Applying the skin

Now we have the page displaying the correct layout, but it still doesn’t look like Slashdot. The second CSS file attached is markup.css, which contains information about fonts, colors, background images, and the way lists are displayed. Here’s the final example.
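A skin of this kind touches only appearance, never position. The rules below sketch the four areas just mentioned; every value and the image path are placeholders rather than Slashdot's real ones, and the rules are shown in a style element only for self-containment.

```html
<style type="text/css">
  /* In the article's setup these rules would live in styles/markup.css.
     All values are placeholders, not the real skin. */

  /* Fonts and base colors */
  body { font-family: Arial, Helvetica, sans-serif; color: #000; background: #fff; }

  /* Background image (hypothetical path) */
  #header { background: #066 url(images/masthead.gif) no-repeat; }

  /* Story titles rendered as colored bars */
  .node h1 { font-size: 1.2em; color: #fff; background: #066; }

  /* Lists displayed without bullets or indentation */
  #leftcolumn ul { list-style: none; margin: 0; padding: 0; }
</style>
```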

We also have the ability to add a second skin if we want to give the user an option on how they want to view the page. The second skin doesn’t have to duplicate all of the layout information, which should already be cached from the layout.css file.
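The article does not say how a second skin would be offered. One standard mechanism is HTML's alternate style sheets, sketched below; the markup-alt.css filename and both title values are hypothetical. Note that this links the skin rather than importing it, so it gives up the hiding from older browsers that the imported version provides.

```html
<!-- Persistent sheet (no title): always applied -->
<link rel="stylesheet" type="text/css"
      href="styles/layout.css" media="screen" />
<!-- Preferred sheet (titled): applied by default -->
<link rel="stylesheet" type="text/css"
      href="styles/markup.css" title="Classic" />
<!-- Alternate sheet: applied only when the user selects it -->
<link rel="alternate stylesheet" type="text/css"
      href="styles/markup-alt.css" title="Alternate" />
```

Browsers that implement alternate style sheets let the user switch between the titled sheets, while the untitled layout sheet stays in force for both.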

The CSS link

We link the CSS files in the header to complete the transition.

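<!-- Layout rules: linked, and limited to screen media -->
<link rel="stylesheet" type="text/css"
href="styles/layout.css" media="screen" />
<!-- Skin: imported, so browsers without @import support skip it -->
<style type="text/css">
@import "styles/markup.css";
</style>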

In this example, the layout.css file is linked with a media type of screen. This is intentional. The information there is only important for display on a screen; it has no benefit for the print media type, or for any other (aural, tv, braille, etc.). The markup.css file, which is the “skin” of the page, is imported, and thus hidden from noncompliant web devices because some of its features could be harmful or interpreted incorrectly.

Benefits!

The page will now correctly render in standards-compliant browsers, just as it did before, and will fail gracefully for non-standard browsers. So, while the design is not as pretty in very old browsers, the content is still available to their users. It is also much cleaner and more predictable with screen readers. By having the CSS fail gracefully, content is even available to PDAs and web phones. Plus, there are no horizontal scroll bars! Finally, there is also a printer-friendly version using only CSS (no separate “printer-friendly” page). Perhaps the biggest benefit of this particular example is the bandwidth savings:
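The print rules themselves are not shown in this part. A common approach, sketched here as an assumption rather than as the authors' stylesheet, is a small set of rules scoped to the print media type that hides whatever is useless on paper.

```html
<style type="text/css" media="print">
  /* Assumed choice: drop advertising and navigation when printing */
  #advertisement, #leftcolumn { display: none; }
</style>
```

Because layout.css is linked for screen only, the printed page already falls back to a single column, so very few print-specific rules should be needed.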

  • Savings per page without caching the CSS file: ~2KB per request
  • Savings per page with caching the CSS file: ~9KB per request

Though a few KB doesn’t sound like a lot of bandwidth, let’s add it up. Slashdot’s FAQ, last updated 13 June 2000, states that they serve 50 million pages in a month. When you break down the figures, that’s ~1,612,900 pages per day or ~18 pages per second. Bandwidth savings are as follows:

  • Savings per day without caching the CSS files: ~3.15 GB bandwidth
  • Savings per day with caching the CSS files: ~14 GB bandwidth

Most Slashdot visitors would have the CSS file cached, so we could ballpark the daily savings at ~10 GB of bandwidth. High-volume bandwidth from an ISP could cost anywhere from $1 to $5 per GB of transfer, but let’s calculate it at $1 per GB for an entire year. For this example, the total yearly savings for Slashdot would be $3,650 USD!
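The arithmetic behind these figures can be written out as follows, assuming a 31-day month and decimal units (1 GB = 1,000,000 KB). Under those assumptions the daily totals come out slightly above the ~3.15 GB and ~14 GB quoted above, which is consistent with a different KB-to-GB conversion having been used, and the per-second rate is about 18.7 rather than a flat 18.

```latex
\begin{align*}
\text{pages per day} &\approx \frac{50{,}000{,}000}{31} \approx 1{,}612{,}900 \\
\text{pages per second} &\approx \frac{1{,}612{,}900}{86{,}400} \approx 18.7 \\
\text{daily savings, CSS not cached} &\approx 1{,}612{,}900 \times 2\ \text{KB}
  = 3{,}225{,}800\ \text{KB} \approx 3.2\ \text{GB} \\
\text{daily savings, CSS cached} &\approx 1{,}612{,}900 \times 9\ \text{KB}
  = 14{,}516{,}100\ \text{KB} \approx 14.5\ \text{GB} \\
\text{yearly savings} &\approx 10\ \text{GB/day} \times 365\ \text{days}
  \times \$1/\text{GB} = \$3{,}650
\end{align*}
```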

Remember: this calculation is based on the number of pages served as of 13 June 2000. I believe that Slashdot’s traffic is much heavier now, but even using this three-year-old figure, the money saved is impressive.

Everything explained so far is discussed in more detail at the University of Wisconsin – Platteville’s Slashdot Web Standards example site.

The challenge

I now challenge the ALA community. We need a good web standards mechanic (or team of mechanics) to dig through Slashdot’s engine, Slashcode, and make it web-standards-compliant. CmdrTaco has encouraged us to submit patches, and I know we can show the benefits! The challenge is there. Any takers?

Next time: printer-friendly and handheld-friendly Slashdot with a few simple additions.
