Current browsers are very forgiving; they quietly correct or gloss over many common HTML errors. This makes it easy for people to experience the joy of creating their own web pages with a minimum of frustration – if a page displays correctly, then it’s “right.”
Unfortunately, by hiding the need for structure that the web will require as it moves towards XHTML and XML, these forgiving browsers have helped create a world of structural HTML illiterates. As long as browsers continue to parse and display HTML that isn’t well-formed or valid, we will never learn the right ways, and we will never get to a structural web.
Teaching the next generation to the next generation
I was recently invited to be a guest speaker for the graphic design and desktop publishing classes at Santa Teresa High School in San José, California. It’s a pleasure to teach in their wonderful computer lab, which has two rooms with thirty iMacs in each room. Over the course of four days, I teach the basics of HTML; on the first and third days I present material, and the students get to practice on the second and fourth days. Needless to say, shoehorning any useful amount of material about HTML into a fifty-minute class period is a challenge, but it’s great fun. Some of the students in the class will have already done some work with HTML, so I can deputize them to assist other students.
This year, I decided to get ahead of the curve and teach XHTML – that is, HTML that conforms to the rules of XML – and thereby hangs a tale.
The mechanics of XHTML
On Tuesday, the first day I taught, I told the class what HTML means (one student told me he’d heard that it stands for “How To Make Love,” and was terribly disappointed when I told him otherwise). I told them that we’d be marking up our documents using commands inside of < and > signs rather than using a red pen the way their teachers do on their printouts.
I then launched into the basic elements, including <body>. I had them write several lines of text in the body of a page and display it in the browser to show them that the text gets converted into one word-wrapped line. Then I introduced tags that would make their text look a bit better.
I had just gotten through the <i> tags (the students type tags along with me, so it’s not just a dry lecture format), and was prepared to launch into my standard explanation of <font> when a large neon sign reading DEPRECATED lit up in my brain. I stood, frozen, realizing that my well-rehearsed train of thought was now derailed. I had to find a way to introduce style sheets in the context of the tags they’d learned, and I knew I couldn’t do it on the spot. [I did solve it later. See how I did it.]
And thus I made my escape, dignity somewhat intact, to talk about <br />. By the end of the period I’d also covered the bgcolor attribute on the <body> element (it’s deprecated too, but I didn’t catch myself in time), <ul>, and enough of <img> to give them a good start. Along the way, I emphasized proper nesting of elements and the “new way” to write empty elements like <hr />.
The class had gone well, and the students were excited about using what they’d learned. On Wednesday, I presented the same material to the other group of thirty students while the first group worked on web pages devoted to any topic that they found interesting. And then…
On Thursday, I got my first look at the efforts of Tuesday’s class. I hadn’t talked much about page layout and design, so I saw numerous examples of:
- Bad color choices (maroon on black, purple on black, black on red). The main offenders in this area were males, by the way.
- Use of <h1> to make text far too large, or <h6> to make it far too small.
- Use of <blink>, obviously learned from the students who already knew some HTML; I never mention its existence.
As an unapologetic Nielsenite, I was appalled, but I couldn’t say much about it. These were, after all, personal pages, so I held my opinions in check and concentrated on page content. Most of the males had pages about sports, cars, or rock bands with fascinating names like “Slipknot.” Most of the females had pages about their families or their boyfriends. There’s a sociology paper in here somewhere.
That was all minor, though, in contrast to the HTML itself. I was confronted with thirty or forty pages with code like this:
<html><head><title>My First Page</title>
All about Cars<body bgcolor=red>
I really like cars.<body>The new Fords are da bomb. Here’s theone I want.
When I was helping answer students’ questions about how to achieve specific effects, I would casually point out that it would be preferable to have properly nested tags and quotes around the attributes. The response would be a confused look. After all, once they loaded the file in Netscape 4 or Internet Explorer 5, it looked great.
I had achieved one of my main goals – letting people know what goes on behind the scenes of a web page. A few students enjoyed their new sense of control enough to want to abandon their favorite WYSIWYG page editors; the others at least had some insight that they hadn’t had before. My other goal of teaching well-formed HTML to ensure compatibility with future technology had been…
Sabotaged by the browser
The expansive forgiveness of the current browsers had defeated my efforts to teach the next generation of HTML to the next generation of designers. We need an unforgiving browser that adheres strictly to the letter of the XHTML law in order to move forward to the future. Let’s take a philosophical turn and examine this in greater detail.
The Hangman effect
Forgiving browsers work like most computer versions of the word game “Hangman.” In this game, you guess letters to figure out a secret word. Every wrong letter adds a body part to a picture of a person; if you run out of guesses, the person is hanged. If you guess the word right, you get a congratulatory message.
In short, whether you are right or wrong, you get some sort of visual benefit. Some versions actually give you a grand son et lumière presentation when the hanging occurs, which rewards you for being wrong!
Forgiving browsers work this way, too: you’re rewarded even if your HTML doesn’t follow the rules. But while Hangman rewards wrong answers differently from the way it rewards right answers, forgiving browsers reward correct and incorrect markup exactly the same way: by parsing it. Only if you’re egregiously wrong do you fail to be rewarded.
Raise the Flags
Some versions of Hangman have been done correctly; the first good version I ever saw was created by the Children’s Television Workshop. It was called “Raise the Flags.” When you guessed a letter correctly, a Sesame Street character would raise a flag with that letter on it. If you got the word right, all the flags would start waving, and music would play. If you guessed wrong, you got no reward. No flags, no sound, no animation.
(Another, more widely known version of Hangman done correctly was created by Merv Griffin. It’s called “Wheel of Fortune,” and it also rewards people for doing things right.)
To educate authors about proper structure, and to pave the way for a well-formed web, browsers should behave like Raise the Flags or Wheel of Fortune: rewarding people for creating valid, well-formed HTML, and displaying nothing at all when fed invalid pages.
The ultimate solution to this problem is a browser that requires XHTML to be both well-formed and valid. Such a browser will be more frustrating for beginners, but that is not an impossible obstacle. My students were frustrated when they mismatched < and > signs or opening and closing quote marks, but they realized that matching them up was part of the game, and they adapted to it quickly because they got no reward for doing it wrong; their pages didn’t work until they followed the rules.
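The “no reward” behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real browser: the function name `strict_render` is hypothetical, the standard library’s XML parser stands in for the rendering engine, and it checks well-formedness only (validity against a DTD is not checked here).

```python
import xml.etree.ElementTree as ET


def strict_render(markup: str) -> str:
    """Return the page's text if the markup is well-formed XML, else nothing.

    Hypothetical stand-in for an unforgiving browser: it checks
    well-formedness only, not validity against the XHTML DTD.
    """
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return ""  # the "no flags, no sound" case: nothing is displayed
    # Collect the text content, skipping whitespace-only fragments.
    return " ".join(t.strip() for t in root.itertext() if t.strip())


well_formed = (
    "<html><head><title>My First Page</title></head>"
    "<body><p>I really like cars.</p></body></html>"
)
# Unquoted attribute value and unclosed elements, as in the students' pages.
sloppy = "<html><head><title>My First Page</title><body bgcolor=red>I really like cars.<body>"

print(strict_render(well_formed))   # My First Page I really like cars.
print(repr(strict_render(sloppy)))  # ''
```

The sloppy page is rejected at the first unquoted attribute value, so the author sees nothing until the markup follows the rules.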
Would insistence upon strict XHTML have slowed the learning process? Yes, definitely, given that we were using SimpleText as our editor. Entry into a structured XHTML world will be greatly eased by what I call a “tag-aware” editor.
A tag-aware editor
Current HTML editors will surround text with <b> and </b> when you click a B icon, or use auto-completion to give you a </p> when you type an initial <p>. Once those tags are inserted, though, they lose their identity as tags. If you delete the <p>, the </p> remains, and vice versa.
A tag-aware editor would also perform auto-completion of tags and insertion by icons, but would remember them as tags. If you changed the initial <b> to <i>, the end tag would change too. When you deleted the initial <p>, the ending tag would vanish as well. Likewise, an attribute inserted within a tag by choosing from a pop-up menu would continue to be recognized as a unit rather than an anonymous group of characters. Additionally, a tag-aware editor would be able to perform validation, either as a menu choice or on the fly.
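One way to picture the difference is a document model in which each tag pair is a single object rather than two unrelated strings. The sketch below is an assumption about how such an editor might store its data, not a description of any existing product; the class name `Element` and its methods `rename`, `unwrap`, and `serialize` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One tag pair held as a single object (hypothetical editor model)."""
    name: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # each child is a str or an Element

    def rename(self, new_name: str) -> None:
        # There is only one name, so start and end tags cannot drift apart.
        self.name = new_name

    def unwrap(self, child: "Element") -> None:
        # Deleting a tag removes both halves and leaves its content in place.
        index = next(i for i, c in enumerate(self.children) if c is child)
        self.children[index:index + 1] = child.children

    def serialize(self) -> str:
        # Attribute values are always quoted; escaping is omitted for brevity.
        attrs = "".join(f' {key}="{value}"' for key, value in self.attributes.items())
        inner = "".join(
            c if isinstance(c, str) else c.serialize() for c in self.children
        )
        return f"<{self.name}{attrs}>{inner}</{self.name}>"


bold = Element("b", children=["really"])
para = Element("p", children=["I ", bold, " like cars."])
print(para.serialize())  # <p>I <b>really</b> like cars.</p>
bold.rename("i")
print(para.serialize())  # <p>I <i>really</i> like cars.</p>
para.unwrap(bold)
print(para.serialize())  # <p>I really like cars.</p>
```

Because the text on screen is generated from the model, a stray end tag or an unquoted attribute has nowhere to come from.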
Such an editor would serve the purposes of everyone from talented amateurs to professional web authors. It would have to be inexpensive (or open source), cross-platform, and have a very light footprint. Its level of sophistication would be one step up from Notepad/SimpleText. Even at this level, however, there’s one other group of authors for whom this approach would be too daunting:
Aunt Alicia and Cousin Frank
Make no mistake; a strict browser that requires web developers to know about and use XHTML structure raises the bar for entry to the world of page design. What happens, then, to your Aunt Alicia’s current web page about senior citizen community issues? How will your Cousin Frank, who barely knows how to turn on his computer, create his web shrine to limited edition beer cans?
The issue of existing web pages must be addressed: there are hundreds of thousands of badly authored pages out there, and, left to themselves, they would die a horrible death at the hands of an unforgiving browser. But badly written doesn’t mean useless; many of these pages have valuable content and should continue to be usable. These pages have probably been written in one of three ways:
- by using a WYSIWYG editor like DreamWeaver or GoLive;
- by filling in a set of page templates such as those provided by GeoCities; or
- by hand, with the author learning just enough HTML tags to get by.
The solution for the first two is easy: an updated version of the WYSIWYG software creates correct XHTML, and the ISP hosting the template-created pages runs the pages through a filter like HTML Tidy to clean them up, then updates the template software to create proper XHTML for all subsequent pages.
The third case (hand-authored pages) requires user knowledge, and could be greatly helped by grassroots educational outreach programs, as well as for-profit page conversion services.
As for Cousin Frank, who’s about to create his first web pages, I suggest a stand-alone program that includes templates like the ones you’d get from Earthlink, etc. A web-based interface is not always the best, and web page creation is one task that is much better performed in a fast, responsive, client-side program than across a 28.8K connection. What I am proposing is a CD that contains the program, tons of graphics and templates, and a built-in FTP client to upload finished pages. It is not as sophisticated as DreamWeaver, GoLive, or FrontPage; it’s strictly “color by numbers” for the beginner.
The web is moving towards the structure of XHTML and XML. And, if you ask me, it’s time to encourage authors to develop good habits through a combination of strict browsers, tag-aware editors for advanced and professional designers, and modernized authoring tools for beginners.
Apologies to Prof. Dr. Edsger W. Dijkstra for the paraphrase of the caption of his famous letter.