XML (Extensible Markup Language) is the Eurodollar of web development. Both XML and the Euro bring order to chaos; both offer undeniable, wide–ranging
benefits; both are poised, in 2002, to change the way we do things. Frankly,
both scare the crap out of people.
For web developers, 2002 is a time to conquer fears and take their first hands–on
approach to XML. It’s time to examine XML and realize the practical benefits
that it can provide to web projects today.
The bankers can fend for themselves.
XML, HTML & Databases
If you need a good analogy to describe XML to other people, don’t mention HTML. Although XML looks a lot like HTML, creating a good XML file is more like designing a database than creating a web page.
Databases and XML documents are both used
as a means to organize data. As a result, they share a lot of similarities.
A database table design for a table containing news stories would
look something like this:
- Table Name:
- Table Columns:
A basic XML document containing the same information might look like this:
<?xml version="1.0"?> <Headline></Headline> <Body>Pending</Body>
In addition to these similarities, both databases and XML represent a huge step forward in the ability to publish and manage web content.
At any scale above that of the small, personal site, database–driven websites are indisputably better at managing, updating, and maintaining content than HTML–only sites. What everyone will discover in 2002 is that XML–driven database sites will prove to be indisputably better than database–driven sites. XML is going to be everywhere.
And as a web developer, you are going to love it.
XML is poised to eliminate more headaches than a bottle of Ibuprofen, improve
productivity more than cans of Red Bull, and increase profitability more than
we’ll want our clients to know about.
How? Two words: Content management.
Content management & migration
Before projects are initiated by a client, a website usually reaches a stage
of obsolescence, immediacy, or embarrassment. Web projects are big projects
with short time lines. It’s not surprising, then, that one of the biggest factors
influencing the profitability and success of web projects is the ability to
effectively manage content.
Separation of style, programming, and content
The ability to store a site’s content, programming, and design separately and
mix them together transparently, on demand, is the art of our craft. Each moment
eliminating rework and duplication is a dollar in our pocket. It’s time spent
adding new features to a site rather than rewriting, reworking, and “searching
We’ve solved much of the problem with databases, templates, style sheets and
server–side includes. Much that remains, XML can address. It’s the best tool for managing content – the content itself, not the way text appears on screen.
XML is used to structure, store and send information in a platform–neutral,
object–oriented, plain text format.
The power of XML is unleashed when it’s placed in the hands of content providers.
However, since copywriters and clients aren’t accustomed to writing in platform–neutral, object–oriented, plain text formats, it means helping them do it unknowingly. Guerrilla content management tactics, such as MS Word–to–XML migration, can be wildly successful.
The basic model for XML migration is to start with a document written in a word processor, such as MS Word,
that can be converted to XML, either directly or via RTF, using third–party tools.
After conversion to XML, the documents can be used by an XML–aware server, or
converted to HTML using another third-party tool.
Successful migration requires providing content creators with a Microsoft
Word template and a set of basic instructions prior to Web development. The
template must include custom style tags based on the organization of the
When using the template, content developers need to avoid
using MS Word formatting options that are not defined within the custom style
tags. If custom tags are insufficient, new tags must be added that reflect
the type of content being addressed.
While the process seems cumbersome, with enough practice, it takes significantly
less time to update site content than using processes without XML – particularly once you harness the power of XML validation.
Websites either evolve or suffer the slow, painful death of neglect. New content
needs to be added. Old content needs to be removed. Missing content needs to
be found. Clients are frustrated by their inability to maintain and manage their
web content. Web developers are frustrated by the aftermath. XML can help.
XML–based documents make it easy to find outdated and
missing content at a glance. This is achieved by using XML Document Type Definitions
(DTDs) to identify the timeliness of information and determine what information
“nuggets” must be present within the content.
Like databases, XML documents allow you to validate information, before you
use it, to make sure the content is timely, appropriate, and complete. Since
we’re used to talking about validation as it relates to databases, let’s take
a more detailed look at the database table we created to hold news stories.
In reality, a database table must include definitions for each column:
|varchar||Yes||Max of 50 characters|
|Varchar||Yes||Selected from drop-down list|
|date/time||Yes||Date added to table|
|Abstract||varchar||Yes||250 character intro.|
|Body||text||Yes||Allows text formatting in field|
pending – No distribution
public – Public distribution
private – Internal distribution
By validating fields, the data table ensures that each news story contains
all of the required information. So, with the proper integration and a web–based
interface, the data table could be an efficient tool for publishing news on the
The XML document with simple DTD validation used for the same information might
look like this:
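As a minimal sketch of such a document: the `Headline`, `Abstract`, and `Body` names come from the examples above, while the `News` and `Story` wrappers, the `DateAdded` element, and the `status` attribute are assumed names chosen for illustration. The DTD is embedded in the document's internal subset.

```xml
<?xml version="1.0"?>
<!DOCTYPE News [
  <!-- A News document holds any number of stories. -->
  <!ELEMENT News (Story*)>
  <!-- Every story must contain these four children, in this order. -->
  <!ELEMENT Story (Headline, DateAdded, Abstract, Body)>
  <!ELEMENT Headline (#PCDATA)>
  <!ELEMENT DateAdded (#PCDATA)>
  <!ELEMENT Abstract (#PCDATA)>
  <!ELEMENT Body (#PCDATA)>
  <!-- status (hypothetical attribute) is limited to the three
       distribution values and defaults to "pending". -->
  <!ATTLIST Story status (pending | public | private) "pending">
]>
<News>
  <Story status="public">
    <Headline>XML arrives</Headline>
    <DateAdded>2002-01-15</DateAdded>
    <Abstract>A short introduction to the story.</Abstract>
    <Body>The full text of the story.</Body>
  </Story>
</News>
```

A validating parser would reject a `Story` that is missing any of the four child elements, or whose `status` is not one of the three listed values. A DTD alone cannot enforce the 50-character limit or confirm that `DateAdded` holds a real date, so checks like those would have to happen elsewhere.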
The XML document makes significant contributions to web publishing when compared to the database alone. XML allows data to be validated based on the embedded DTDs, XML tags and attributes. This means that appropriate content can be extracted directly from the XML document based on selection criteria without requiring an interim database,
without requiring a database query, and without being separated from the source
Using DTDs, XML documents suddenly become self–aware.
Substance & Style
XML finds advocates on both sides of the ongoing “content” versus
XSL (the eXtensible Stylesheet Language), the style sheet language of XML, packs
a wallop. It’s much more robust than Cascading Style Sheets (CSS). Instead
of using rules (as CSS does) to format content, XSL uses (.xsl) templates to describe
how to transform XML into other types of documents. When you implement an XML–based site, XML doesn’t replace HTML. If it sounds a bit confusing, here’s why. When you deal with XSL files, all is not as it appears:
- The .XSL file embeds HTML with XML tags and logic that define how information
should be displayed at run time.
- At run–time, the .XML file is displayed in the web browser on the fly.
- Although HTML formatting included in the .XSL file is applied, it won’t
appear in the source for the .XML document being displayed in the browser.
- The appearance in HTML is based on the combination of XML tags and logic
within the .XSL file.
- Because the .XSL file can transform XML in the browser, the document that
appears in the browser may only be a subset of the content in the actual
The ability to transform the XML conditionally in a web browser means that content
can be centralized. Parts of the document are displayed or ignored on an as–needed basis.
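A minimal sketch of such a stylesheet, assuming an XSLT 1.0 processor and a news document whose `News`, `Story`, and `status` names are hypothetical (only `Headline` and `Abstract` appear in the earlier examples). It emits HTML for stories marked public and ignores the rest.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Produce HTML rather than XML as the result document. -->
  <xsl:output method="html"/>

  <!-- Start at the document root and build the page shell. -->
  <xsl:template match="/">
    <html>
      <body>
        <!-- Only stories whose (assumed) status attribute is "public"
             are transformed; all others are left out of the page. -->
        <xsl:for-each select="News/Story[@status='public']">
          <h2><xsl:value-of select="Headline"/></h2>
          <p><xsl:value-of select="Abstract"/></p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

To have a browser apply it, the XML document would carry a processing instruction such as `<?xml-stylesheet type="text/xsl" href="news.xsl"?>` right after its XML declaration, where `news.xsl` is an assumed file name. Viewing source in the browser would still show the XML, not the generated HTML.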
Now is the Time
Web developers have been telling others that they are waiting to dabble in
XML until it becomes widely available. The truth is, it’s been widely available for months:
- Internet Explorer 5 contains an XML engine that fully supports XML 1.0,
as defined by the World Wide Web Consortium (W3C). This is a huge improvement
over the engine in IE4.
- Netscape 6.0/Mozilla includes full XML support.
- Flash 5 ActionScript supports XML–based data transfer to and from a server.
- Director has offered an XML Parser Xtra since Director 7.0 that allows
Shockwave movies to read, parse, and make use of the contents of XML documents.
(Ed. Note: Director’s somewhat buggy XML parser has put off many developers. Reader Hussein Boon recommends Andy White’s user–extensible Lingo scripts instead. Boon also recommends a DOM–Lingo binding that binds Director’s Lingo scripting language to the W3C DOM Level 2.)
- IIS servers offer XML integration
via the Microsoft XML Parser. Version 4 of the parser supports XML 1.0.
- SQL Server 2000 provides integrated XML support. It’s the first release
to do so.
- XML technology preview runs under any SQL Server release. Although
the output is slightly different in a few cases, it’s a solid XML environment
for the pre–SQL Server 2000 crowd.
- Version 2 of Apache Cocoon, a powerful framework for XML web publishing,
has been released.
- XML-RPC is a platform–neutral protocol for executing programs remotely, “designed to be as simple as possible, while allowing complex data structures to be transmitted, processed and returned.”
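For a sense of what a remote call of this kind looks like on the wire, here is a sketch of an XML-RPC request body. The method name `news.getHeadlines` and its single string parameter are hypothetical; the `methodCall`, `methodName`, `params`, `param`, and `value` elements are the ones XML-RPC defines.

```xml
<?xml version="1.0"?>
<!-- Ask the server to run a (hypothetical) procedure by name. -->
<methodCall>
  <methodName>news.getHeadlines</methodName>
  <params>
    <!-- One argument: the distribution status to filter on. -->
    <param>
      <value><string>public</string></value>
    </param>
  </params>
</methodCall>
```

The request travels to the server as the body of an HTTP POST, and the server answers with a `methodResponse` document written in the same style.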
This means we’ve all run out of excuses for putting off XML. Today, the benefits of developing web projects in XML aren’t merely imaginable. They are achievable.