Printing a Book with CSS: Boom!

by Bert Bos, Håkon Wium Lie

68 Reader Comments

  1. The people who are slagging this approach are completely missing the point, IMO. For me, the good part is not so much the use of CSS to format the book, but the use of XHTML to mark it up. The printing back-end can be ripped out and replaced with whatever works for you — FO (yuck), groff, LaTeX — or load up the HTML in M$ Word or OpenOffice, apply a stylesheet, and print.

    We can talk about DocBook until we’re blue in the face, but it’s such an incredibly complex DTD that most writers would give up before finishing the first chapter of the first document. DITA is a step in the right direction, but it’s probably still too complex for non-gearheads without a fair amount of motivation. Just about everyone knows enough XHTML to write a document, and there are plenty of tools — Free and commercial — that provide a pretty GUI for people who need it.

    As long as writers have to associate XML with complex large-scale publishing systems with six-figure deployment costs and five-figure support costs, it will be “eXcellent, Maybe Later” outside of Fortune 100 companies. HTML brought on-line publishing to the masses through a simple syntax; now it can bring single-source on-line/paper/PDF publishing to the masses as well.

  2. Others have pointed out the advertising. I would also point out that Håkon Wium Lie has been long known to be an XSL foe (pun intended), so his opinion on XSL should be taken with a grain of salt.

    But now to Prince itself—

    It seems like a good way of bringing web pages to print. Someone mentioned its applicability to the blog-streaming world. Prince can fill this niche well. But going from web to print has, up till now, been carried out by printing according to a print stylesheet, not downloading a PDF. What Prince has over, for instance, Firefox is that the latter doesn’t support the CSS page model properly. When browsers come up to that functionality, Prince will be out of a job.

    Mr Lie might answer that the niche isn’t web to casual print, it’s XHTML to books, with the XHTML not necessarily ever being hosted on a web server, and books like the kind we get from Framemaker or Quark. However, that’s a niche Prince, or more accurately an XHTML to PDF tool, can’t fill either. Maybe CSS is already up to the task of heavy formatting (and I doubt that), but XHTML isn’t up to the task of rich markup. XHTML is a limited tagset. You know it when you have to use span tags where in general XML you’d use an element. You know it because the new OpenDocument format for office applications didn’t duplicate XHTML, it was formulated with its own tagset, which is much bigger than that of XHTML. XHTML is suitable for the simplest books, but anything beyond that, like any random book you pick up in the college library, requires a more feature-rich markup language. XHTML for books could only be hobbyists’ fare, and maybe not even that, since hobbyists are far more likely to opt for WYSIWYG tools than textual stuff.

    In short, I don’t see Boom finding its niche among any of the possibilities. It’s overkill for simple web to print, and underpowered for professional typesetting.

  3. Great idea. I will have to read the book to give a better review, but excellent job on making it via this method.

  4. “Håkon Wium Lie has been long known to be an XSL foe (pun intended)”

    :) It’s the «FO» part I have a problem with. Formatting objects don’t have any semantics and should therefore not be represented in XML. It’s just a bunch of font tags. Which is why I once wrote (http://www.xml.com/pub/a/1999/05/xsl/xslconsidered_1.html?page=4):

    “I can understand why overworked undergraduates think FONT is cool, but I’m very disappointed when a group of highly skilled adults tell kids to stop playing, form a committee – and then come out with a set of supercharged FONT tags”

    Anyway, your main argument is not CSS vs. XSL-FO, it’s against the use of HTML as the basis for our markup. You write:

    “XHTML isn’t up to the task of rich markup. XHTML is a limited tagset.”

    Indeed, the tagset is limited, but HTML has a wonderful extension mechanism: the «class» attribute. Using the class attribute, you can convert any XML document into HTML and back—without losing information.
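
    As a rough illustration of the class mechanism described above, the sketch below styles the same content in two forms: a source XML vocabulary and its HTML translation, where each XML element name is carried over as a class value. The element and class names (chapter, term, note) are invented for this example and are not taken from the Boom sources.

    ```css
    /* Hypothetical mapping, for illustration only:
         <chapter>  becomes  <div class="chapter">
         <term>     becomes  <span class="term">
         <note>     becomes  <div class="note">
       Each rule lists both forms, so one style sheet can format either
       the XML original or the HTML translation. */

    chapter, div.chapter {
      display: block;
      page-break-before: always;   /* start every chapter on a new page */
    }

    term, span.term {
      font-style: italic;          /* defined terms set in italics */
    }

    note, div.note {
      display: block;
      margin: 1em 2em;
      font-size: 0.9em;
    }
    ```

    Because the class value holds the original element name, a converter could in principle map each div or span back to the XML element it came from.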

    “You know it because the new OpenDocument format for office applications didn’t duplicate XHTML, it was formulated with its own tagset, which is much bigger than that of XHTML.”

    I think this was a big mistake. By basing OpenDocument on HTML (much the same way Bert and I did Boom on HTML), the format would have had a huge installed base from the beginning: 1 billion browsers.

  5. “Formatting objects don’t have any semantics and should therefore not be represented in XML.”

    Why not? In the letters that make up the initialism XML, I don’t see anything that stands for Semantics. XML is just a toolchest for building any markup language you wish, and one of those happens to be the page layout language called XSL-FO. And it isn’t “just a bunch of font tags” any more than CSS is. I think we both know XSL-FO is to be generated from XSLT rather than written by hand, and when generated from an XSLT script it’s equivalent to a CSS stylesheet in separating content and presentation.

    “Indeed, the tagset is limited, but HTML has a wonderful extension mechanism: the «class» attribute. Using the class attribute, you can convert any XML document into HTML and back—without losing information.”

    Doesn’t the use of a kluge indicate the inappropriateness of the format? And, um, talking about semantics, are you aware that class attributes are style directives, containing no more semantics than I or B or TT tags? It’s like proposing the use of such constructs as divs with style attributes instead of H1/H2/H3 tags, which Ian Hickson (Hixie) complained about on his blog, but on steroids!

    This isn’t the right tool for the job. Anyone who so much prefers CSS to XSL can use CSS to style XML, and that would be better. I shudder at the thought of using HTML, with kluges and all, for preparing a college grammar book. But even CSS isn’t wholly satisfactory: you’ve had to write an external script to generate the TOC, while you can do it with XSLT, and then style with FO in the same gulp. Looking at it that way, the XSL approach could be said to be more deskilling than the HTML/CSS one.

    “I think this was a big mistake. By basing OpenDocument on HTML (much the same way Bert and I did Boom on HTML), the format would have had a huge installed base from the beginning: 1 billion browsers.”

    But OpenDocument is for office applications, not for browsers. You seem to be very Web-centric. Additionally, even if ODF were based on HTML, the type of HTML that office applications generate approaches the elegance of Frontpage’s output.

  6. Personally I am more familiar with CSS than with Microsoft Word. Now I can type my essays in Dreamweaver :)

  7. How novel, using a language based on SGML (Standard Generalized Markup Language) to make a printer paint a page instead of a browser.

    Ah yes – my first program documentation (1983) was generated using IBM DCF on a 3090 mainframe. And our documents had strange markup like <h1>, etc etc – and it had ‘stylesheets’ to add ornamentation (other than bold or underline) etc etc when IBM released its first advanced function printers (3800-3 and 3820s).

    IIRC it was a superset (or was that a subset?) of SGML. It even generated TOCs, indexes etc etc

    Amazing it’s still around … http://www.printers.ibm.com/internet/wwsites.nsf/vwwebpublished/dcfhome_z_ww

    Kim Mihaly

  8. Just give me the needed CSS print functionality, and a web browser that supports it.  Then all I’ll need to do is File -> Print to Postscript/PDF.  Even better:

    % firefox --print http://www.alistapart.com/print/me --output ala.pdf

  9. TeX is a different thing. I wrote a number of papers, and even a whole book, using it. These days I use XSL and produce PDF. This is XML-based and more flexible. Still, I personally like TeX much more. But PDF and TeX are about actually printing content in high quality.

    The point of this article seems to be that XHTML/CSS can be used to publish a real book. It’s a proof of concept by the people who created CSS, and that’s nice.

    Printing web pages is always a pain, and I hope that’s what this is about. Printer-friendly pages are never really what they claim to be. XHTML/CSS cannot compete with PDF, but it can complement it and make it easier to import web pages into publishing systems.

  10. Very nice paper, thanks. And Prince may be just the tool I need for a “print-on-demand” adjunct to my (free) ebooks site (http://etext.library.adelaide.edu.au)

    I’ve been tinkering for a long time with ebooks—mostly public domain novels and essays (which it is true to say are quite simple compared to technical works), using HTML. My main interest has been in formatting books for the web rather than print, but there’s always that lingering, “wouldn’t it be nice” feeling that it would be great to be able to print them too, if desired. And I have had some limited success in that direction using rudimentary CSS (see the FAQ), which produces a nice result if you don’t mind A4 and don’t much care about page numbering etc.

    But it is very pleasing to see someone pushing the envelope to see what can be done with CSS. Now, if only my browsers supported all those features, I’d be very happy.

    Of course, with or without Prince, there’s no reason I should not use the CSS3 features, even if they are not currently supported. They will be one day, and then my ebooks will be ready and waiting!

    (And I’ve heard all the “wrong tool” arguments from the ebook crowd already, thanks! LaTeX, DocBook, XSL, yadda yadda. Most of them are still producing ugly results whatever the tool.)

  11. I opened up this article mainly because I’ve been looking for a means to create invoices and proposals quickly, easily, and with some customization.

    I’ve always hated opening up InDesign or Microsoft Word just to fudge a couple of variables and print. I’m on a slow laptop, and it can seem to take forever to load up these bloated apps, only to close them after seconds of use.

    I love the possibility that I can create a printed page template, and only have to open a simple text editor to edit and then send it to a browser (which is always on!) and hit print.

    I know everyone’s been knocking the book format application, but I’m very excited about other possible applications.

  12. I am writing a DocBook “book”. Some of the fonts do not look good and I would like to enhance them. I learned that I could use CSS to improve the font output in HTML. I would like to hear from anyone who has gone through this and has used a good CSS style sheet file. I would like to get a copy of that style sheet if possible, or a pointer to a good location where I can get a good CSS file.

    Thanks,
    Raj

  13. “I am writing a DocBook “book”. Some of the fonts do not look good and I would like to enhance them. I learned that I could use CSS to improve the font output in HTML. I would like to hear from anyone who has gone through this and has used a good CSS style sheet file. I would like to get a copy of that style sheet if possible, or a pointer to a good location where I can get a good CSS file.”

    Prince ships with a CSS style sheet that does rudimentary styling of DocBook files. It’s called “docbook.css” and it should be a reasonable starting point.

     

  14. thanks

  15. While the article is great, I don’t get the point.
    I tried printing your HTML file via the latest versions of Firefox and IE, and the footers (page numbers) do not appear.
    If we need another program to convert to PDF before we can print, what is the use? Why can’t CSS just work with browsers for printing?
    I am trying to create documents that are visible on the web and that, when someone wants to print them, get a page number and a footer. Your solution does not work for that, and the question is why. Since it is CSS and XHTML, there should not be a problem.
    I have to fall back on making two style sheets, one for print and one for screen, which is also a bad solution.
    Unless I missed something.
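
    For reference, here is a minimal sketch of how the screen and print concerns raised above can share one style sheet. The media blocks are standard CSS 2; the page-number footer uses a page-margin box from the CSS paged media drafts, which Prince implements and which the browsers tested in this comment evidently did not. The .navigation selector and all sizes are assumptions made up for the example.

    ```css
    /* One style sheet, two media: no need for separate files. */
    @media screen {
      body { font: 100%/1.5 sans-serif; max-width: 40em; }
    }

    @media print {
      body { font: 10pt/1.3 serif; }
      .navigation { display: none; }   /* hypothetical class for on-screen menus */
    }

    /* Page geometry and a running footer with the page number.
       A formatter without page-margin box support will ignore the nested rule,
       which would explain footers that never appear on paper. */
    @page {
      margin: 25mm 20mm;

      @bottom-center {
        content: counter(page);
      }
    }
    ```

    The markup itself stays the same in both cases; whether the footer shows up depends only on whether the printing software supports page-margin boxes.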

  16. I don’t think this is ready for any serious use. Consider for example
    <ul>
    <li>That you can’t have a ul inside a p,</li>
    <li>or any block level element</li>
    </ul>
    which really breaks orphan control code. There is simply no way for a client to figure out that a sentence like this belongs to the paragraph above and isn’t a paragraph of its own.
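
    For context, the CSS side of orphan control looks roughly like the sketch below. It is only a workaround under stated assumptions: the class names list-intro and continued are hypothetical, and an author would have to add them by hand, which is precisely the manual step this comment objects to.

    ```css
    /* Require at least two lines of a paragraph at the bottom (orphans)
       and at the top (widows) of a page. */
    p {
      orphans: 2;
      widows: 2;
    }

    /* Hypothetical class for a paragraph that introduces a list:
       ask the formatter not to break the page between it and the list. */
    p.list-intro {
      page-break-after: avoid;
    }

    /* Hypothetical class for the text that resumes after the list:
       no indent and no extra space, and keep it attached to the list. */
    p.continued {
      text-indent: 0;
      margin-top: 0;
      page-break-before: avoid;
    }
    ```

    This keeps the pieces together on the page, but the formatter still treats them as three separate blocks rather than one paragraph.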

    (La)TeX has rather advanced algorithms to do this right, since it seriously breaks text flow. It may not be the kind of thing that people point their fingers at, but if you ask them, they tell you that your text was “heavy”. Any typographer worth his salt, and a serious book publisher will give this high priority.

    Don’t get me wrong: I would really like to see a LaTeX replacement, as the HTML tools are much more widespread than LaTeX, and LaTeX is often a pain to write. However, it is important to realise that there are many good reasons why people use it for high-quality work, and that is not going to change before certain flaws in the original design of HTML are corrected. I know it breaks your heart, Håkon, but that means backwards-incompatible changes must be made to HTML.

    Also, it means that we have to put some effort into high-quality printing in the UAs, and I don’t see us doing that…?

  17. I like the idea, since we already have the API documentation of our software generated as HTML. It need not be perfect for printing (I would prefer LaTeX over XSL-FO/DocBook, but that is not important here).

    What I am really missing is the possibility to create a reference to a numbered element: E.g. images are numbered with a chapter prefix and a counter for the image, i.e. a caption like “Fig. 2.3” for the third image in the second chapter.

    &&/%%$$
    &&&%%&%
    %%&&&&&&&
    Fig. 1.1


    This could be easily done using counters. But now I would like to create a reference to this image in the text, e.g. “see Fig. 2.3”, where the “Fig. 2.3” is generated automatically. Is this possible?

    bla bla bla (see Fig 1.1) bla bla bla
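
    One possible approach, sketched under assumptions: number the figures with CSS counters, then pull the same numbers into the running text with the target-counter() function from the CSS generated content for paged media draft. Formatters such as Prince document support for that function, while mainstream browsers generally have not implemented it, so treat this as a print-only technique. The class names (chapter, figure, caption, figref) and the id in the comment are hypothetical.

    ```css
    /* Assumed markup:
         <div class="chapter"> ...
           <div class="figure" id="fig-layout">
             <img src="..." alt="...">
             <p class="caption">Page layout</p>
           </div>
           <p>bla bla bla (see <a class="figref" href="#fig-layout"></a>) bla</p>
         </div>
    */

    body        { counter-reset: chapter; }
    div.chapter { counter-increment: chapter; counter-reset: figure; }
    div.figure  { counter-increment: figure; }

    /* Caption prefix such as "Fig. 2.3: " */
    div.figure p.caption {
      text-align: center;
    }
    div.figure p.caption::before {
      content: "Fig. " counter(chapter) "." counter(figure) ": ";
    }

    /* Cross-reference: fetch the counter values at the element the link
       points to, so the text reads "see Fig. 2.3". */
    a.figref::after {
      content: "Fig. " target-counter(attr(href), chapter)
               "." target-counter(attr(href), figure);
    }
    ```

    Since the numbers are generated at formatting time, inserting or moving a figure renumbers both the captions and every reference to them.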

  18. I’m involved in a project that requires a “clean” print option, but none of the developers have specific expertise in printing (nicely) from XHTML. Frankly, we have been dreading the day when we would have to buckle down and learn an unfamiliar print technology. After stumbling on this thread I picked up the free Prince demo today. Within an hour, I was outputting reasonably complex PDF layouts with table borders, backgrounds, and images (oooooooh… ahhhhhhh…) and without touching the original content. The CSS2 implementation is refreshingly solid (and this in the week that IE7 and FF2 were released). Now, my team is genuinely excited about printing. Offset press publishing might be a stretch, but Prince proves that more modest goals are achievable with a fraction of the effort.
