Publication Standards Part 2: A Standard Future
Issue № 352

A note from the editors: This is the second part of a two-part essay about digital publishing. Part 1 is “The Fragmented Present.”

It’s never been a better time to be a reader. We’re partly defined by the things we read, so it’s good to have an embarrassment of enriching, insightful writing on our hands. We hold hundreds of books in lightweight, portable e-readers. We talk to authors over Facebook and Twitter. Thousands of public domain works are available for free, and blogs afford us a staggering array of high-quality writing. Long-form journalism and analysis are experiencing a minor renaissance, and we’re finding new ways to discuss the things we read.

It’s never been a better time to be a writer. Anybody can publish their thoughts. Anybody can write a book and publish it on demand. Authors can reach out to readers, and enriching, fulfilling conversations can blossom around the connections we develop out of the things we make.

But we have considerable work ahead. Our ebook reading and creation tools are primitive, nascent, born of necessity, and driven by fear. We have one-click ePub to Kindle conversion, but it’s buried in a clumsy, bloated, cross-platform application that screams for improvement. We have page layout software, but it saves natively in a proprietary format, and it exports ePub files almost entirely as a set of <span> tags, rather than proper, semantic HTML. (Think <span class="header"> instead of <h1>.) An ePub is packaged as a zip file, but Mac OS X’s default zip archiver doesn’t handle ePub’s mimetype file correctly (it has to be the first entry in the archive, stored uncompressed), so building one requires a separate application. And there is still, as of this writing, no native reader for Mac OS X that’s up to both iBooks’ design standards and ePub’s native spec. (When creating ePub, my workflow involves uploading a new ePub to Dropbox, opening the Dropbox app on my iPad, and sending the ePub to iBooks, every time I want to view a change.) There is so much work yet to be done to make publishing easier. Farming out ePub development, overwhelmingly the current accepted solution, isn’t the answer.
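
To make the mimetype requirement concrete, here is a minimal sketch in Python of packaging an ePub container by hand. It assumes only what the container format asks for: the mimetype entry comes first, holds the string application/epub+zip, and is stored without compression. The function name write_epub and the file names in the usage are hypothetical, and the result is only a readable book if the files passed in also include a META-INF/container.xml and a package document.

```python
import io
import zipfile

MIMETYPE = "application/epub+zip"


def write_epub(target, files):
    """Write files (a dict of archive path -> bytes) to target as an ePub container."""
    with zipfile.ZipFile(target, "w") as archive:
        # The mimetype entry must come first and must not be compressed.
        archive.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        for name, data in files.items():
            if name == "mimetype":
                continue  # already written above
            # Everything else can be compressed as usual.
            archive.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)


# Usage with placeholder content, written to memory instead of disk.
book = io.BytesIO()
write_epub(book, {"OEBPS/chapter1.xhtml": b"<h1>Chapter 1</h1>"})
with zipfile.ZipFile(book) as archive:
    print(archive.namelist())
```

A general-purpose archiver makes no promise about entry order or compression, which is why a dozen lines of script can be more dependable here than a desktop zip utility.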

Does this madness look familiar?

Even if writers know HTML, they face many more hurdles. Writing now generates less income for people, but it costs the same to produce. The publishing landscape of 2012 looks similar to the music landscape of 1998, crossed with the web designs of 1996: it’s encumbered by DRM and proprietary formats, it treats customers as criminals, it’s fragmented across platforms, and it’s hostile to authors who want to distribute their work through independent channels. Libraries are almost ignored wholesale with every new development around DRM and pricing. Publishers take DRM on faith, in the face of considerable evidence that DRM hurts both readers and sales.

At the beginning of the aughts, major record labels weren’t behaving any differently than publishers are now. And for almost a decade, one browser maker held back technical progress in web development by not fully, reliably supporting web standards: this is no different than the Kindle entirely ignoring the recommendations of the International Digital Publishing Forum, despite Amazon being a paying member.

In 1998, the Web Standards Project was founded to encourage browser makers and web developers to embrace open standards. We need a similar advocacy organization for publishers, e-reader manufacturers, and readers. We need a Web Standards Project for electronic publishing.

Self-publishing and its discontents

Before the web, self-publishers bankrolled their own operations. Edward Tufte took out a second mortgage on his house to self-publish The Visual Display of Quantitative Information in 1982, because no printer could meet his quality standards. But even after getting the copies made, there was no way for self-publishers like Tufte to effectively market their work. You could take out an advertisement in a newspaper or magazine, or you could call bookstores and see if they might be interested, but in 1982, you didn’t have the luxury of a website or email account.

These days, anybody can publish their own work. You can go through a print-on-demand service like Lulu. You can connect with readers via Kickstarter to handle upfront printing costs. You can set up a website and sell copies of your work through Fetch, Shopify, and Digital Delivery. You can print postage with Endicia. You can sell in person with Square. You can establish subscriptions with Memberly.

In large part, we—those who help craft the web—have embraced such a model. The Manual is a beautifully crafted independent journal of the “whys” around design. 8faces is a semi-annual typography magazine. Five Simple Steps covers all manner of design techniques. I run a quarterly journal for long essays called Distance, and I set up the whole thing to function over the internet. Finally, partly because of this very site, A Book Apart connects with like-minded readers. And more publications come by the day. Codex. Bracket. Kern & Burn. As web designers learn the details of print production, there will only be more publications like this. (I can’t speak for others, but I find a tremendous amount of pride in making physical goods, after so many days of crafting intangible things for the web.)

Many self-published projects receive less marketing and advertising, but they foster a greater intimacy with audiences, provide better customer service, and require more self-promotion. Self-publishers hustle.

And while I’ve focused mostly on design-related projects, the notion of connecting with readers over the internet is genre-agnostic. Anybody can do this—and writers are becoming empowered to take publishing into their own hands.

How must publishers evolve?

There is still a role for publishers, though, if they adapt to the new landscape. What are the problems and trade-offs? Ebook cover design would probably change, for instance, now that it doesn’t need to stand out on a bookstore shelf. And while editorial increases in importance, frequently differentiating quality writing from stuff that’s written alone, marketing would change substantially. An ebook author probably doesn’t need to do a book tour. Print and display ads will be less frequent. Mailing lists will be more frequent. Publishers that understand the trade-offs and shifts in their work will be able to nimbly respond to the internet before the internet does their work for them.

Where are the standards?

When it comes to ebooks, we’ve abandoned the standards we claim to embrace. In numerous conversations that I had while researching this article, many self-publishers said that it simply isn’t worth publishing their work in ePub—and the only people who were excited about ePub hadn’t tried to publish in ePub yet. It doesn’t have the same reach as other formats, and its features are implemented piecemeal, meaning it’s hard to ensure a consistent typographic standard from device to device.

Often, publishers start by producing a PDF in a tool like InDesign, and there simply isn’t an effective way to translate PDF layout and typography into HTML for ePub. As editor Allen Tan told me, “our workflow is pretty digital, it’s just that our output isn’t digital.” Lack of typeface support and robust layout tools are major pain points, and often publishers simply export their proof PDF and call it the digital edition. When people do make ePub files, they usually farm the task out, saying it’s too painful to create in-house. These are hacks, and they indicate a deeper problem.

But as web workers, we’re used to responding to such concerns better than other industries, and we’re uniquely equipped to discuss publishing issues from an outsider’s perspective. The web is typography; books are typography; ePub, the prevailing standard in books, is HTML, CSS, and XML, saved as a Zip file. Allen Tan added: “the advantage of having ePub as a standard is that any improvements with ePub can be pulled back into the web, because they use a common base.” We can provide deep, meaningful, constructive change in both ePub and the publishing industry if we apply what we’ve learned in our struggles with HTML and CSS.
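
Because an ePub is an ordinary zip archive of XML and HTML, a language’s standard library is usually enough to look inside one. The sketch below, in Python, reads META-INF/container.xml, the file the container format uses to point at the book’s package document. The function name package_document_path is hypothetical, and error handling is kept to a minimum.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by META-INF/container.xml in the ePub container format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def package_document_path(epub):
    """Return the archive path of the package document named in container.xml."""
    with zipfile.ZipFile(epub) as archive:
        container = ET.fromstring(archive.read("META-INF/container.xml"))
    # The first rootfile element carries the path in its full-path attribute.
    rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml does not declare a rootfile")
    return rootfile.get("full-path")
```

Given a path to an .epub file, or any file-like object holding one, the function returns the location of the package document, which in turn lists every HTML, CSS, and image file in the book. None of this needs tooling that web workers don’t already have.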

So what have we learned?

Standards and disruption

In 1998, competition between Netscape and Internet Explorer drove a handful of trailblazing web workers to found the Web Standards Project, which called for browsers to adopt the open standards of HTML and CSS. These days, we debate the fragmentation of the landscape, calling for cross-platform solutions and expressing worry when browser makers independently develop their own capabilities.

Meanwhile, the IDPF—essentially the W3C equivalent for books—has developed and released the ePub specification. But tools to create ePub efficiently haven’t kept up, there’s no way to semantically develop a book in page layout software, the largest e-reader company doesn’t follow the ePub spec at all, and no e-reader on the market fully supports the latest published spec, ePub 3.0. The IDPF has moved out of sync with the realities of the e-reading market—not unlike when the W3C released XHTML, which was out of step with the realities of the browser market. Publishers have taken to painstakingly developing digital bundles with many different formats, one for each potential e-reader—not unlike when websites were “best viewed in Netscape 4.0 at 800×600 resolution.” What did we learn fourteen years ago, and why are we letting this happen again?

Because the largest publishers make us. Even though DRM has been proven, time and time again, to be hostile to both consumer and publisher, almost all ebooks sold today are encumbered by it. But media distribution usually runs this course: DRM is enforced out of fear, and in the face of flagging sales and rampant piracy, the industry moves toward open standards. On the iTunes Store, FairPlay-protected music gave way to DRM-free AAC files. With movies, DivX died; DRM-cracked DVDs flourished. Creators will only gain control of their industry if they stand up for themselves.

Likewise with web standards, and now publishing standards: we can only kill off Amazon’s DRM if we become fierce advocates for open standards and vote with our wallets until things are made right. And proprietary tweaks to ePub, like Apple’s iBooks Author spec, are not unlike the approach that WebKit takes to its own proprietary -webkit styles; such an approach can be refined and reformed if we approach it with the same perspective. As the W3C had WaSP in the late nineties and early aughts, the IDPF needs its own advocacy watchdog, too. We’re used to the web disrupting many industries, and it’s time to embrace the turbulence around publishing, for better and worse. The only alternative is to abolish the internet, and we’d rather not see that happen.

The Publication Standards Project

That’s why, along with this issue of A List Apart, and with the support of many great people, I’ve launched the Publication Standards Project, which has the following long-term goals:

  • Fully featured, native support of the most modern ePub standard in all ebook reader software. You can support your own proprietary format in tandem, but if your reader does not fully support ePub 3.0, we will continue to advocate for you to do so.
  • Support of the most modern ePub standard in creation tools. Same as above: your book-making software should write semantically correct markup, even if it also exports to other publishing formats.
  • Improving the existing ePub standard. It is not perfect, and it needs to be improved. In the long run this might result in a fork of the specification—essentially a WHATWG equivalent—but for now we’ll begin working with the IDPF.
  • Page layout software that exports semantically correct, standards-compliant HTML and CSS code. Software developers, take heed: after speaking with many publishers and independent writers, I’ve concluded that this market is wide open right now. If you build a better mousetrap, it will do very well by you.
  • Abolishing DRM in all published writing. DRM has provably aided piracy, and it works against the customer by assuming they’re a thief. Removing DRM, on the other hand, has been proven to increase sales in many situations.
  • An end to gatekeeper standards. As power consolidates in the hands of a few booksellers, they have a decreasing motivation to accept radical viewpoints or contentious, “banned” books. Rejecting a book because it contains a third-party link may fulfill the letter of Apple’s law, but it violates the spirit of open access, sharing, and healthy competition—and it could arguably be interpreted as an act of censorship.
  • Simpler, more humane library lending policies. We desperately need libraries to support under-served communities without pervasive broadband. Refusal to simplify pricing models, and refusal to inter-operate among e-readers and lending systems, means that libraries will simply opt out of ebook adoption entirely—something they can’t afford to do if they’re going to stay relevant in the future.
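
The markup goals above lend themselves to automated checks. As a rough sketch, and not a substitute for a real validator, the Python below scans exported markup for elements such as <span class="header"> that are styled to look like headings instead of using <h1> through <h6>. The class names in HEADING_CLASSES and the function name find_fake_headings are assumptions made for illustration; real export tools use their own naming.

```python
from html.parser import HTMLParser

# Class names that hint at a heading. This list is an assumption for
# illustration, not something defined by ePub or by any export tool.
HEADING_CLASSES = {"header", "heading", "title", "chapter-title"}


class FakeHeadingFinder(HTMLParser):
    """Collect generic elements whose class suggests they stand in for a heading."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("span", "div", "p"):
            return
        # A class attribute may be missing or empty, so fall back to "".
        classes = set((dict(attrs).get("class") or "").split())
        hits = classes & HEADING_CLASSES
        if hits:
            line, _column = self.getpos()
            self.findings.append((line, tag, sorted(hits)))


def find_fake_headings(markup):
    """Return (line, tag, classes) for each element that looks like a styled heading."""
    finder = FakeHeadingFinder()
    finder.feed(markup)
    finder.close()
    return finder.findings


sample = '<h1>Title</h1>\n<span class="header">Chapter 1</span>'
for line, tag, classes in find_fake_headings(sample):
    print(f"line {line}: <{tag}> with class {classes} should probably be a heading element")
```

A check like this could run against a page layout tool’s export before a book ships, giving publishers a concrete number to bring to software vendors.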

At least in the short term, we’ll accomplish it in these ways:

  • Education. Many people don’t know everything about the issues, and how they parallel our prior technological progress in other areas. Publishers don’t understand the new mindset that readers are in, readers don’t understand why publishers won’t join the 21st century, writers don’t understand why readers won’t pay anymore, and writers don’t want publishers to have full editorial control. Very few people have a clear sense of all the competing publishing formats and why such fragmentation is a bad thing. And we still don’t have the right tools to build the best writing that we can, share it with others, and constructively discuss it. At the Publication Standards Project, we’re ridiculously passionate about these issues, and we’d love you to join the conversation.
  • Outreach. Part of the Publication Standards Project is a call to action: sign on to our goals as a reader, writer, and publisher, and resolve to work to improve the way that we communicate with one another. Another part is lobbying: we need to collectively advocate for e-reader manufacturers, publishing software developers, booksellers, and publishers to adopt better practices in the way they work. Everyone who reads this article is capable of action, and nobody else will stand up in your place.
  • Your ideas. We know we haven’t thought of everything, and the best thing you can do to help is to volunteer. Get in touch with us and tell us about your vision for this; it cannot happen in a vacuum, and it cannot come from a handful of people. We will adapt to new developments in the publishing landscape, and there’s no way to tell exactly how things will play out.

Concluding thoughts

We discussed ePub’s promise in 2010, but we’ve only regressed since then. Our prospects look dim these days, but this is an opportunity for all of us to assert control over the way these standards are adopted. Right now, the landscape is dismal. But we solved these sorts of problems once. We can do it again. You can help. We set up a site at http://pubstandards.org. Our Twitter name is @pubstn. We’d like you to sign up for our mailing list so we can begin to take action with your help.

The internet was built on the foundation of free and open access to information, and it’s time for us to act like it. This isn’t going to happen by going to the companies and reforming them from the inside. It’s going to happen by building a new movement that can reform the older model. If we’re going to follow standards and openness in the things we publish, it starts with every single one of us, on every side of the table. Otherwise, we may end up with the walled gardens that we deserve.

13 Reader Comments

  1. Nick –

    Great overview of the challenges ahead for ebooks. I’ve always wondered why ebooks are not just straight HTML in the first place. Publication standards project looks like a great thing, will definitely subscribe to your newsletter and try to contribute ideas…

    I may have one piece of the puzzle already. You call for:

    “Page layout software that exports semantically correct, standards-compliant HTML and CSS code.”

    I’ve built something quite like that. I call it Edit Room, and it’s a focused tool for semantic, standards-based, responsive web design. It would not take too much to make it useful for ebook authoring as well. It’s available at http://www.edit-room.com

    I’m just getting started bringing this to market, and any and all critique and feedback from anyone is welcome…

  2. Amidst all the brouhaha around ebooks, ePub et al, there’s an underdog project by Roger Black called Treesaver (http://treesaverjs.com/) which looks really interesting.

    Sure, it’s not specifically targeted to writers—hell, it’s not even about ebooks—but I wonder how far a solution built on a barebone HTML framework can go in comparison with all the closed standards that you mentioned.

    There’s an ebook-targeted framework as well that Craig Mod introduced a while ago on A List Apart (http://www.alistapart.com/articles/a-simpler-page/), but it hasn’t evolved that much for months, unfortunately.

  3. oh geez.

    i didn’t know there was a “part 2”…

    after “part 1”, i asked you to please
    just go away, and not “help” e-books.

    now i see that you’ve already mounted
    an effort. but really, just go away.

    you do _not_ know what to do to “help”
    — you’ve merely regurgitated the stuff
    that didn’t make the web any better —
    so your effort will just confuse people
    and fragment the work of the _real_ fix.

    i’m still serious. shoo!

    -bowerbird

  4. @splatcollision: I’ve had the chance to look at Edit Room lately, and it looks really cool! I’m excited to see where it goes.

    @Régis Kuckaertz: Treesaver, and frameworks like it, are really encouraging to see. Often they are marked up more simply and elegantly than your typical ePub document. One could speculate a handful of conclusions: that ePub is too hard to mark up and build; that it may be a chicken-and-egg problem where the lack of tools and readers forces people to build something easier; or that ePub’s competition with HTML/CSS makes people punt back to the established platform.

  5. Does anyone remember “Annotated Alice” by the Voyager Company? The Sony Bookman…?
    We need standards to be free and openness to grow.

    If every bookreader met a standard like ePub3 fairly and squarely I’d be out in the sun today, my work would be done ;). I’ve been here before, a time when “electronic books” went out like a damp squib… I don’t believe it will happen like that this time. There will be winners, and they will share their toys and play fair.

  6. @Nick (comment 5): Yes! I do hope that ePub 3 being a blend of open web standards (from what I understand) is a good sign that one day we will be able to create truly cross-platform ebooks without the hassle of dealing with incompatibilities.

    I’ve signed up to become a member of the Publication Standards Project and can’t wait to see how I’d be able to help.

  7. I am a fan of XML and semantic tagging, but new to ePub.

    I do not get why someone would care whether the final formatted result is a <span> or an <h1>.

    It’s not like you can use the semantics for search/classification anymore. You are looking at the final formatted result. And if you were hoping to implement search/classification, the classes would potentially be of more use than an h1 or an h2.

    What am I missing here?

  8. @france.baril: Keeping headers as H1 (for instance) ensures consistent rendering from e-reader to e-reader; it’s easier to preserve consistently; and it’s easier to port to other formats if need be (Mobi doesn’t support CSS, for example, so styled tags don’t play well).

  9. There is a world of commercial publishing for which ePub, KF8, PDF, etc. were created to make ebooks possible. However, because HTML has been so successful for browsers on computer screens, the aspect of the device has created huge confusion over how to deal with publishing. Steve Jobs created the NeXT computer with full WYSIWYG screens and printing using PostScript. Equating the two distinct media, CRT and 8.5″x11″ paper, was a recognition of the power of verisimilitude. Obviously, new viewing devices and new methods of physically interacting with data and documents have increased the importance of serving content in a multitude of ways.

    For example, looking at legislative documents, the issues of authenticity and verisimilitude are much more significant and trump the issues that are driving the ebook concerns mentioned in the Publication Standards article. And for these reasons, there is no good answer on formats for legal documents. One aspect is that line and page numbering are artifacts of publishing on paper with typography. And so for legal, academic, and other crucial document-creating communities, the discussion is a lot of noise that often misses their issues.

    I would say that a single XML file cannot meet the needs of publishing. Nor can a simple PDF or any other single-file approach. Essentially, there are layers that are often out of sync with each other. Line numbers that correspond with a printed document are hard to determine outside of processing the document for publishing. And XML cannot handle the resulting data without breaking nesting. And PDF cannot handle much without including an XML version.

    In reality, a document is best described with overlapping onion skins.
    • Simple text, illustrations, and typography.
    • Publishing artifacts on paper, B&W ereaders, color screens at different sizes, audio/accessible versions.
    • Interactive features, dimensional breaks/non-serial narrative breaks.
    • Versioning within a document.
    • Implicit and explicit metadata about the document.
    • Legal aspects, licensing, authenticity, watermarking, branding, and the interactive aspects (limiting views, copying, time restrictions, online …)
    • Business model issues, including purchasing information, ISBN, UPC, etc.
    • And, oh yes, content, which may be more than simple text (for law there may be cites, a metadata level for hierarchical/jurisdictional/version/authorship/etc), often creating a layer of XML that is XSLTed into ePub or HTML or PDF with or without representation in the presentation layer’s code.

    Daniel Bennett
    CTO, eCitizen Foundation
