Using XML

by J. David Eisenberg

69 Reader Comments

  1. Is anyone else having trouble getting to the Relax NG spec? The link http://www.oasis-open.org/committees/relax-ng/ returns a “Not found” page.

  2. It works for me.

    I suppose you could be the victim of poorly configured server-side content negotiation.

    Using a tool [DelorieResponse] on the OASIS link, I got the following headers:

    HTTP/1.1 200 OK
    Server: Netscape-Enterprise/4.1
    Date: Tue, 23 Jul 2002 20:09:29 GMT
    Content-type: text/html
    Connection: close

    [DelorieResponse]
    http://www.delorie.com/web/headers.html

    [DelorieRequest]
    http://www.delorie.com:81/some/url.html

  3. The key concept to understand is that an XML document actually performs the role of a database (or data source) which can be queried (using XPath) to find whatever data is desired, and then quickly output and formatted (using XSL). XML is always presented as a document format, which is wrong. XML has nothing to do with documents except that it happens to be easy to create an XML-encoded file with a text editor.

    XML is very good at representing recursion, which makes it unique when compared to other database formats or data representations. But recursion does not have much to offer designers. It is the darling of programmer types like me.

    In the real world it is much more difficult to create useful webapps out of XML/XSLT than it might appear from the article, which glosses over many very problematic issues, the main one being where you will get your XML-encoded data from. If from a database, you might as well work directly with JDBC/JSP. XML only becomes practical as a middle layer in a very sophisticated project where the participants understand how to plan extensively, something that almost never happens in the real world! Don’t get bogged down with XML unless you know what you are doing and exactly what benefits your project is supposed to derive from the extra effort involved in supporting an additional layer between a SQL database and your presentation layer.
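
    A minimal sketch of the "XML as a queryable data source" idea, in Python. The sample document and its element names are made up for illustration and are not taken from the article; note that Python's standard ElementTree module supports only a limited subset of XPath, so the numeric filter is done in ordinary code.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element and attribute names are illustrative only.
XML = """
<foods>
  <food><name>Oatmeal</name><calories total="150"/></food>
  <food><name>Granola</name><calories total="230"/></food>
  <food><name>Yogurt</name><calories total="120"/></food>
</foods>
"""


def names_under(root, limit):
    """Return the names of foods whose total calories are below `limit`."""
    result = []
    # "./food" is a path expression selecting the direct <food> children.
    for food in root.findall("./food"):
        total = int(food.find("calories").get("total"))
        if total < limit:
            result.append(food.findtext("name"))
    return result


root = ET.fromstring(XML)
low_cal = names_under(root, 200)
print(low_cal)
```

    The same selection could be written as a SQL WHERE clause over a table, which is the comparison being made above.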

  4. «Yes, I’m serious. I currently use a MySQL database for data storage, and PHP to access that data and “transform” it into whatever I need. I can’t see using XML as any better – In fact, it seems to be much more difficult, to me. So why should I switch?»

    That’s a good question, and one I’ve thought about at great length. As has been mentioned a little further up the page, the main benefits are in cross-platform compatibility, ease of data transfer, and the benefits of the data entry style.

    Why not compromise? Have you read up on the PHP functions relating to XML files? If not, take a look at http://www.php.net/manual/en/ref.xml.php. I’ve begun experimenting, just out of interest, with a CMS using XML for data storage and PHP for parsing and displaying the data. It’s quite nifty.

  5. I’ve been programming in PHP for the past three years and haven’t used XML in that time. A colleague of mine gave me the URL of this article because a friend had an idea for something that had to use XML, so I started reading.

    I think XML is simple to use (it sounds simple to my ears) and I think it’s easy to do if you know HTML. It’s much the same as HTML, but with the advantage of using your own defined tags. It was a great help using your examples so I could see how it worked and what it did (I’m a visual person, I like to see things happening).

    Thanks for offering this great article to us!

    Bas (Netherlands)

  6. seems like the right place to ask for help…

    I am trying to send an XML formatted job to Jobserve.co.uk without success.

    I have the details of how to do it but it doesn’t explain exactly the syntax required for ASP to connect to their server.

    The manual says that you build up the XML string and then says:


    “Using an HTTP POST program add a HTTP header called ‘SOAPMETHODNAME’ with value ‘PostAdvert_IT’ Post the string representing the SOAP XML to the specified URL.”?

    So far I have got:
    Set xmlHTTP = server.Createobject("MSXML2.ServerXMLHTTP")
    xmlHTTP.open "POST", strAddress, False
    xmlHTTP.setRequestHeader "SOAPMethodName", strMethod
    xmlHTTP.send strXML
    POST = xmlHTTP.responseText
    Set xmlHTTP = Nothing

    when I run the code I get:

    Error Type:
    msxml3.dll (0x80072EE6)
    The URL does not use a recognized protocol
    /xmltest/process.asp, line 84

    Any Ideas anyone ???? Please help
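
    One possible cause, offered as a guess rather than a confirmed diagnosis: the error text suggests the address passed to open may not begin with http:// or https://. The sketch below shows the same kind of request in Python, built but not sent, with that check made explicit. The endpoint URL, the SOAP body and the Content-Type header are placeholders and assumptions; only the SOAPMethodName header and its value come from the quoted manual.

```python
from urllib.parse import urlparse
from urllib.request import Request


def build_soap_request(url, method_name, soap_xml):
    """Build (but do not send) an HTTP POST carrying a SOAP XML body."""
    scheme = urlparse(url).scheme
    if scheme not in ("http", "https"):
        # An address such as "www.example.com/soap" has no protocol part.
        raise ValueError(f"URL must start with http:// or https://, got {url!r}")
    return Request(
        url,
        data=soap_xml.encode("utf-8"),
        headers={
            "SOAPMethodName": method_name,
            # Assumption: the service expects XML; check the real manual.
            "Content-Type": "text/xml; charset=utf-8",
        },
        method="POST",
    )


# Placeholder endpoint and body, for illustration only.
req = build_soap_request("http://example.com/soap", "PostAdvert_IT", "<Envelope/>")
print(req.get_method(), req.full_url)
```

    In the ASP code, the equivalent check would be to print strAddress just before the open call and confirm it is a complete URL including the protocol.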

  7. I have been involved with web design, to some varying degree, since its inception. I have always been “shy” of XML due to its lack of real “community” acceptance. After walking through the examples and using the technologies that Mr. Eisenberg presented, I feel LIBERATED!

    Thank you for showing me an example of XML implementation in terms that even someone such as myself could understand. I now share the exuberance that so many of my colleagues have felt for some time: XML is the way!

    I have plans to redesign several of the “content management” systems that I have written in ASP, Perl and Java to reflect my new-found wisdom and revelations about this wonderful technology.

    William Dodson

  8. In the article, Mr. Eisenberg says the following:

    “Once you have created the entire stylesheet in the same directory as the XML file, you can open the XML file in a modern browser such as Mozilla, and it will display the information.”

    What other browsers besides Mozilla support this? I found that Opera 6.0 was the only browser aside from Mozilla that was able to support this option. Explorer 6.0 did display SOME of it, but nothing usable.

  9. This is a really helpful article—thanks for writing it!
    One thing, however—You’ve included links to XML Tools for Linux & Windows (no surprise there), but what about XML Tools for use on Mac OS X?

    Thanks.

    Ethan

  10. Well, I fully stand behind the idea of separating presentation and content, and this can be seen on my website (which you can get to by going to the posted URL), but instead of using someone else’s programs to transform the content into the presentation, I do it myself on my website…or at least I am in the process of doing so on my site…
    I parse the corresponding file for the page, and depending on what content is in between which tags, I display it somewhere, somehow on the page. This approach is time-consuming because it requires the content and the presentation to be separate, with the programming tying them together (correctly!). Nonetheless, this is my preferred approach…until I can learn to use XSL/XSLT to do the CSS and the PHP/Perl’s work for me!!!

    please feel free to e-mail me

  11. Quote:
    Why not compromise? Have you read up on the PHP functions relating to XML files? If not, take a look at http://www.php.net/manual/en/ref.xml.php. I’ve begun experimenting, just out of interest, with a CMS using XML for data storage and PHP for parsing and displaying the data. It’s quite nifty.
    ————————————

    hmm. See, it’s not the implementation I’m having problems with, it’s the whole concept. I don’t need to use the XML functions – I already use the MySQL functions. I can’t see any kind of real-world usage for this. If I need XML, I generate it from the DB. If I need to customize the display for a particular browser, I do so with PHP.

    I really can’t see a reason to change what I already know quite well, that works very well for every instance that I can come up with, in order to use XML.

    And if you say “You don’t always have access to PHP and a DB”.. Well, if you don’t have DB access on your webhost… what are the chances of getting them to add the PHP extensions? It’s not installed by default for PHP..

    (shrug)

  12. The decision to use XML or a relational DB should ideally be based on the nature of the data. Some data fits better into a relational DB, some fits better into XML, some can be represented using either pretty much equally.

    Data that can be represented using just a single table can generally be represented using either technology with no compelling wins either way. (More speed with a DB, more portability with XML, but nothing much beyond that.)

    Data that would be represented in a relational DB using two or more tables that get joined together is probably best kept in a DB. The DB will handle all the join operations, referential integrity, etc., more easily than will be possible using XML. (You could do it with XML, but you’d probably end up writing a lot of code yourself.)

    Data that consists largely of text with markup is best represented using XML or some other form of markup language. If you’ve got a document that could have an arbitrary number of sections, or styles appearing at arbitrary points in the text, you really need to use a markup language of some sort. There isn’t any way to represent that information using rows and columns. That goes double for recursive data structures such as subsections.

    Relational DBs and markup languages represent two different philosophies about the structure of data. Neither of them is suitable for all data.

    Apart from that, I personally like text files that I can hack by hand. I get worried when I have important data that requires a particular program to access.

  13. As a follow-up to my previous post [Post], I’d like to point out that XHTML 2 is incompatible with XHTML 1.0 [XHTML2], and is certainly incompatible with HTML. Further, CSS 2.1 is not backwards compatible with even CSS 2.0 [CSS2.1].

    Further, here’s evidence that even people that arguably “Get it” don’t understand MIME types. [DiveInto XHTML2] “My fresh IE 5.5 install asks to download the page…”. The Save As dialog in IE is popped up for any unknown MIME Type (after IE’s sniffing algorithm fails). [IEMIME]

    (As a side note, it appears the URL Mark references returns text/html now. I am pretty positive that he was getting the dialog because at the time he tested, it was (properly) returning application/xhtml+xml, and it has since been changed to return text/html.)

    It is not my intention to harm anyone in these statements. I am simply trying to call attention to the need for content negotiation, and to the fact that “forward compatibility” can’t be strictly counted on.

    A mechanism for negotiating representations based on client capabilities is necessary. In fact, Mark’s closing (sarcastic) note ” Looks great in Opera and Mozilla, though. That does it. I’m converting all my pages to XHTML 2.0. Accessibility be damned. Backward compatibility be damned. IE 5 be damned.” points to this fact, though he may not realize it.

    Please… think about it.

    -Jeremy

    [Post]
    http://www.alistapart.com/stories/usingxml/discuss/2/#ala-731

    [XHTML2]
    http://www.w3.org/TR/xhtml2/
    (Sorry, I can’t point out specific examples of non-conformance here.. They’ve not included a change summary, and I can’t do the research needed to gather evidence just now)

    [CSS2.1]
    http://www.w3.org/TR/2002/WD-CSS21-20020802/about.html#q1

    [DiveInto XHTML2]
    http://diveintomark.org/archives/2002/08/06.html#changes_in_xhtml_20

    [IEMIME]
    http://msdn.microsoft.com/library/default.asp?url=/workshop/networking/moniker/overview/appendix_a.asp

  14. Hi
    I get a Microsoft Jscript runtime error saying Null is not a null object, while I try to run Nutrition.svg. Help!

  15. This XML article was recommended to me as I am in a hurry to do R&D on it for our web division. I was given a 500-page book on the subject, very good nonetheless, but J. David’s article does what that book does in much less time and without any unnecessary jargon or hype. Now I can say I really get what XML is about! I just hope I can understand XML’s role in Flash as well, in the future.

  16. I’ve fixed the XSLT transformation to SVG and the SVG file.
    Until I can get everything uploaded, the new version is at

    http://catcode.com/nutrition.zip
  17. I went through the process of creating all of the parsed files. I can be a goon on the computer, but I found it to be a breeze. Very cool application of the technologies. Well-written article too!

  18. Do you know of an engine that would take an xslt style sheet and parse the data into a word doc too?

  19. This was a good overview. An example with DTD and XML Schema could also throw some light on those grammars.

  20. Hi,
    Good article – much food for thought (pardon, no pun intended).
    However, the msvalidate reported a JRE clash problem. I’ve got version 1.4.1_01 of the J2sdk & JRE on my machine. Running the msvalidate.bat reported that I needed the JRE1.3.
    I resolved the issue by going into regedit & changing the JRE current version from 1.4 to 1.3. Obviously not ideal but it works.
    Thanks for your hard work,
    Eddie

  21. So that’s how you use XML. I keep hearing how great it is but, up until this point, had no idea why creating your own markup was a good thing. Great article. Thanks.

    Even so, I question the usefulness of it. If you ask me XML seems to be a fancy way of managing data in text files. For someone who uses only text files that may be a good thing. But as Twyst says “I don’t need to use the XML functions – I already use the MySQL functions. I can’t see any kind of real-world usage for this. If I need XML, I generate it from the DB. If I need to customize the display for a particular browser, I do so with PHP.”

    I read colin_zr’s response with interest. He said: “There isn’t any way to represent that information using rows and columns.” OK, can someone provide an example of this? Any examples I’ve seen could all easily be represented in a DB. In fact, I seem to remember reading a tutorial somewhere that described how to use XML to display data in an HTML table (using php, I think). Kind of pointless, if you ask me.

    Now, I’m not saying all XML is pointless. This article showed the value of XML for those who may not have access to a DB for whatever reason, or those who do not want to go beyond markup (in other words: those who shy away from scripting languages such as php, asp, …). I just don’t see its value for those who do, such as myself. Maybe, someday, somewhere, someone will provide an example that simply cannot be implemented in a relational DB. Until then, I will set XML aside.

  22. I would like to give a nod to Jeremy Dunck for coming to the same conclusion I did about this article. It is a perfect primer for an explanation of the use of content negotiation. Based on a user agent’s (i.e. browser’s) capabilities you can serve the document as any one of the types listed.

    These capabilities include SUPPORTED MIME TYPES and supported languages. Therefore, if a browser says it supports ‘en-us’ (United States English) and your site has THE SAME content in two languages, say en (English) and fr (French), you can serve the appropriate one to the user (in this case the English one). No need for a new URI or to ask the user which version they prefer.

    In terms of the article, you can also serve documents by MIME type based not only on whether a type is supported but also on the quality of that support.

    In a real-world example, IE supports text/html (HTML) and text/plain (Text), and Mozilla supports text/html, application/xhtml+xml (XHTML) and text/plain (Text). Mozilla supports XHTML with a quality of 1 (best) and HTML with a quality of 0.9. Therefore in IE your only options are to transform the XML document into HTML or text based on stated support, but in Mozilla you have more options. You could send the document as XHTML, HTML or text. Since Mozilla gives XHTML a higher quality than HTML, you would probably want to transform the document to XHTML and send it as such. Since (X)HTML is usually preferred over raw text, we won’t send the text version to either user agent.

    To further extend this example, if a user agent supports image/svg+xml (SVG) you can send it the SVG document instead, or application/x-pdf (PDF) for the PDF document.

    I’m sure much of this post is somewhat short-sighted, but the bottom line is that one URI can serve multiple versions of a RESOURCE based on what’s available and what the user agent’s capabilities are. For the most part this goes unused, but this is how HTTP is DESIGNED to be used. And to be honest, this can all be done today and is supported; unfortunately, most servers make it difficult because they ARE NOT designed to work this way, but like most things, there are ways around it.
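
    The selection logic described above can be sketched in a few lines of Python. This is a simplified illustration, not a full implementation of HTTP content negotiation: it ignores wildcards such as */* and any parameter other than q, and the two Accept header strings are invented to mirror the example in this comment rather than copied from real browsers.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_type, q) pairs."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # a type listed without a q parameter counts as quality 1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((media_type, q))
    return prefs


def choose(header, available):
    """Pick the available type with the highest q, or None if none match.

    Ties go to whichever type comes first in `available`.
    """
    prefs = dict(parse_accept(header))
    best, best_q = None, 0.0
    for media_type in available:
        q = prefs.get(media_type, 0.0)
        if q > best_q:
            best, best_q = media_type, q
    return best


# Representations the server can produce from the one XML source.
AVAILABLE = ["application/xhtml+xml", "text/html", "text/plain"]

# Illustrative headers only.
mozilla_like = "application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8"
ie_like = "text/html,text/plain;q=0.8"

print(choose(mozilla_like, AVAILABLE))
print(choose(ie_like, AVAILABLE))
```

    A server would then run the matching transformation (XHTML, HTML, SVG, PDF and so on) and send the result with the chosen Content-Type.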

  23. :)

    Thank you

  24. Is a DTD file always needed?

    From the xml file I saw something like:

    <!--
    <food>
    <name></name>
    <mfr></mfr>
    <serving units="g"></serving>
    <calories total="" fat=""/>
    <total-fat></total-fat>
    <saturated-fat></saturated-fat>
    <cholesterol></cholesterol>
    <sodium></sodium>
    <carb></carb>
    <fiber></fiber>
    <protein></protein>
    <vitamins>
    <a></a>
    <c></c>
    </vitamins>
    <minerals>
    <ca></ca>
    <fe></fe>
    </minerals>
    </food>
    -->

    which seems like to be a template.

    Can we automate the “record” generation process by having a definition file?
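
    On the first question: a DTD is not required for an XML document to be well-formed; it is only needed if you want to validate the document against it. On the second, one way to automate the blank record is to generate it from a small definition. The Python sketch below is an illustration under assumptions: the definition format (a list of element name, attributes, child names) is invented here, not something the article or any XML standard prescribes.

```python
import xml.etree.ElementTree as ET

# Hypothetical "definition file" expressed as Python data:
# (element name, attributes, names of empty child elements).
FOOD_DEFINITION = [
    ("name", {}, []),
    ("mfr", {}, []),
    ("serving", {"units": "g"}, []),
    ("calories", {"total": "", "fat": ""}, []),
    ("total-fat", {}, []),
    ("saturated-fat", {}, []),
    ("cholesterol", {}, []),
    ("sodium", {}, []),
    ("carb", {}, []),
    ("fiber", {}, []),
    ("protein", {}, []),
    ("vitamins", {}, ["a", "c"]),
    ("minerals", {}, ["ca", "fe"]),
]


def blank_record(root_name, definition):
    """Build an empty record element from the definition list."""
    root = ET.Element(root_name)
    for name, attrs, children in definition:
        elem = ET.SubElement(root, name, attrs)
        for child in children:
            ET.SubElement(elem, child)
    return root


record = blank_record("food", FOOD_DEFINITION)
record.find("name").text = "Oatmeal"  # fill in one field as a demonstration
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

    A DTD or schema could serve as the definition instead, but reading one programmatically takes more machinery than this sketch shows.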


  25. “fop -xml nutrition.xml -xsl nutrition_fo.xslt -pdf nutrition.pdf

    The result is a PDF file; it produces pages that are approximately 8 centimeters wide and 9 centimeters high, which fits comfortably into a shirt pocket.”
    (from the article)

    Now how can I use this with, say, ASP or ASP.Net or Java to generate a PDF file on the fly… say, for example, a customer orders some item from an e-commerce site… I want to be able to generate a PDF version of a pre-designed invoice template with their details filled in dynamically.

    Is this possible? If so, how?

  26. I’ve only just stumbled across this article, and it has been great help. The software the author linked to is very useful on PC and *nix, but I’m using a Mac. Does anyone know of any comparable software for me? (I suspect the Linux stuff can be made to work in OS X, but I don’t know how). Please email me if you have any ideas.

  27. Grebmil: A response a few months late…

    Ok, fair enough. Most of the examples you see in these introductory articles involve record-oriented data. But that’s not the only kind of data.

    Here’s an example of some data that really needs to be stored in markup rather than in a relational model:


    <paragraph>
    Hello, my name is <name>colin</name>. I like <abbreviation>XML</abbreviation>.
    </paragraph>

    You can’t take data like that, make a field for names and a field for abbreviations, and force it into third normal form. That’s just not the structure of the data.

    To give you another illustration, think about how you’d take an HTML file and represent it in a database. Would you have a table of div elements, a table of h1 elements, a table of p elements? What would the records in those tables look like? How would you indicate all the p elements that belonged within a specific div? And those are the easy bits. Just wait till we get to inline elements…

    Obviously that’s silly.

    What you might well do is take the contents of the HTML file, or perhaps a fragment of it, and put it into a field of a database. But then you’ve still got all the HTML markup within that field. Markup just happens to be the best way to represent that data.
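
    To make the "mixed content" point concrete, here is a small Python sketch that parses the paragraph example from this comment and lists its pieces in document order. The helper name runs is made up for illustration. The thing to notice is that plain text and tagged text interleave, and the order matters, which is what a set of rows and columns does not capture naturally.

```python
import xml.etree.ElementTree as ET

paragraph = ET.fromstring(
    "<paragraph>Hello, my name is <name>colin</name>. "
    "I like <abbreviation>XML</abbreviation>.</paragraph>"
)


def runs(elem):
    """Return (tag, text) pairs in document order.

    The tag is None for plain text sitting directly inside `elem`.
    """
    out = []
    if elem.text:
        out.append((None, elem.text))
    for child in elem:
        out.append((child.tag, "".join(child.itertext())))
        if child.tail:  # text that follows the child's closing tag
            out.append((None, child.tail))
    return out


pieces = runs(paragraph)
plain = "".join(text for _, text in pieces)
for tag, text in pieces:
    print(tag, repr(text))
```

    Stripping the tags gives back the readable sentence, while the tags themselves mark which words are a name and which are an abbreviation.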


  28. <paragraph>
    Hello, my name is <name>colin</name>. I like <abbreviation>XML</abbreviation>.
    </paragraph>

    Is that the data itself, or is it really a list of people that like abbreviations?

    I personally do not see xml as a way to deal with millions of records over hundreds of tables. I will more than happily export a subset of data to someone else in xml in any way they want it. However, that does not make my application any more a user of xml than if the two of us had agreed to use pig-latin or parenthesis delimited text files with a header row in rot13.

    I do think xml has its place, it just doesn’t overlap with my domain except as another export format. I haven’t really had to deal with importing xml because everybody else uses databases which means they’re just as happy to give me a few csv files or a direct tap into their database.

    For database dumps, a CSV file with a header row is far more space/bandwidth conscientious than XML.
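
    The size difference is easy to check for a given data set. The Python sketch below writes the same rows as CSV with a header row and as element-per-field XML, then compares byte counts. The rows and the XML layout are invented for illustration; the actual ratio depends on the data and on how the XML is structured, and compression would typically narrow the gap.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample rows standing in for a database dump.
rows = [
    {"name": "Oatmeal", "calories": "150", "sodium": "0"},
    {"name": "Granola", "calories": "230", "sodium": "15"},
    {"name": "Yogurt", "calories": "120", "sodium": "85"},
]


def to_csv(rows):
    """Serialize rows as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_xml(rows):
    """Serialize rows as XML, one element per field."""
    root = ET.Element("rows")
    for row in rows:
        rec = ET.SubElement(root, "row")
        for key, value in row.items():
            ET.SubElement(rec, key).text = value
    return ET.tostring(root, encoding="unicode")


csv_size = len(to_csv(rows).encode("utf-8"))
xml_size = len(to_xml(rows).encode("utf-8"))
print(csv_size, xml_size)
```

    In this layout the XML repeats every field name twice per record, which is where most of the extra bytes come from.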

    To people like me, who use SQL, XML seems very clunky and broken.
    To people who like XML, I think SQL and RDBs look big and overpowered for their needs.

    IMarv

  29. Twyst(e) is right, I’d say,

    I certainly don’t want to rely on client side functions
    (ever heard of browser quirks? do you really think there’ll be no more in times to come?)
    when I can access reliable server side functions (PHP, MySQL).

    >You don’t always have access to PHP and a DB…
    I guess angelfire/lycos account holders sharing their pastry recipes with the world
    are not the target audience here.
    (no offense: private homepages/pastries are OK)

    Marek
