The Look That Says Book

by Richard Fink

46 Reader Comments

  1. Second, justified text — with word-spacing on successive lines varying only within tight limits — is the best and most relaxing text to read

    Bill, although I agree that going the JavaScript route to do H&J is not A Good Thing, I tend to question the very principle of doing H&J on a systematic basis. Do you have any pointers (I mean, peer-reviewed papers) that demonstrate that justified text is the “best and most relaxing”?

    To me this seems counter-intuitive, except perhaps for specific widths, because I find it harder to bring my gaze back to the correct line after an EOL when the text is justified. It is also the opinion of Edward Tufte, among others, if I am not mistaken. When the text is justified, it becomes a block of indistinguishable lines, and it takes more time to find which one you were reading, and which is the next, if your attention drops for an instant.

  2. H&J without carefully considered H&J rules (like those found in InDesign) seems a pointless endeavor. I shudder to think of the less typographically savvy web designers who will interpret “We can hyphenate!” as “We can justify!”, producing web pages full of rivers.

    Let’s not forget: Ragged text columns don’t have to be hard-set. They can hyphenate as well, to create a more even edge. This is where the focus should be right now.

  3. Hi

    Thanks to Richard Fink and ALA for this article and for the pointer to my Hyphenator.js project. As its developer I’d like to share my thoughts, too.

    And special thanks to all those critical voices, too. I think those are the most important! I’m currently getting lots of mails and requests about my project.
    There are many good ideas about how to improve Hyphenator.js and I will implement most of them in the coming weeks. Hyphenator.js is a growing project that I’m maintaining in my spare time (@Bill Hill: There’s currently support for over 30 languages, more to come!).
    Just one appeal: RTFM! You’ll save me time, and for many things there’s already a solution.

    @those being doubtful about readability of hyphenated text
    IMO this highly depends on the literacy of the reader. Reading lots of books means getting used to hyphenated words, since most books are set with H&J. (Want proof? Go to the library!) Personally I’m a fast, experienced reader, and capturing a hyphenated word isn’t that difficult for me. Together with the context, the first syllables are enough to guess the whole word. So hyphenation gives me a valuable hint about being on the right line when my eyeballs jump to the next line (not literally speaking ;-)
    True, hyphenation may not be that easy to read for less experienced or handicapped readers, and reading webpages is generally not the same as reading printed text. But what about eBooks? What about optional hyphenation?
    This article does not say you should do hyphenation in every text. It says: “it can be part of your work”. I.e. you can use it now, if you feel like it or your customers make you use it.

    @those saying unhyphenated flush-left text is a good alternative
    This may be true for English texts with short words on average. It isn’t true for other languages with longer words. In German there are lots of very long compound words:
    adjudication of the federal administrative court (en)
    Bundesverwaltungsgerichtsentscheid (de)
    I recently read this word in an article on the front page of; it didn’t fit the layout!

    @those complaining about the size of the hyphenator script
    I completely agree. It’s too large (the largest part is the hyphenation patterns). But let’s take this ALA article as an example. The JPEG on top of it (the one with the sliced carrots) is about 45KB. It looks nice but it is not relevant to the content, either…
    The script and the patterns are cached if you set it up correctly and can be reused on every page of your website.

    @those who don’t trust automatic hyphenation
    I’m using a quite old and sophisticated algorithm originating in TeX. To compute the patterns, a list of hyphenated words is used. The better this list, the better the patterns and the better the results of hyphenation. There’s ongoing work on a list for German patterns ( but afaik not for English…

    @those who say “this should be done by the browser”
    Definitely. I hope that one day a browser will fix its text layout engine and do hyphenation (according to CSS 3). Until then Hyphenator.js is just a crutch.

    Finally: I was quite surprised by some very categorical declarations in the discussion above. Nobody forces you to use Hyphenator.js and pile up scripts on your website. We’re not as free in all our decisions as we may think, but in this case we are very free. Hyphenator.js is just an option, and it may be valuable for some cases and completely inappropriate for others. But I think it’s great to have this option.

    Kindly, Mathias
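
The pattern-based approach from TeX mentioned in this comment can be sketched roughly as follows. This is a minimal illustration in Python, not the code of Hyphenator.js: the function names are hypothetical, and the pattern list is a tiny hand-picked set that happens to cover the word "hyphenation", whereas real pattern files hold thousands of entries per language. Digits in a pattern are scores for the gap they sit in; after all matching patterns are overlaid, a gap with an odd score is a permitted break.

```python
# Tiny illustrative pattern set (an assumption for this sketch);
# a real hyphenator loads a full pattern file for each language.
PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]


def parse_pattern(pattern):
    """Split 'hy3ph' into letters 'hyph' and gap scores [0, 0, 3, 0, 0]."""
    letters = ""
    scores = [0]
    for ch in pattern:
        if ch.isdigit():
            scores[-1] = int(ch)   # score for the gap before the next letter
        else:
            letters += ch
            scores.append(0)       # open a new gap after this letter
    return letters, scores


def break_points(word, patterns=PATTERNS, left_min=2, right_min=3):
    """Return the indexes in `word` before which a hyphen may be inserted."""
    work = "." + word.lower() + "."        # dots mark the word boundaries
    scores = [0] * (len(work) + 1)         # scores[j] is the gap before work[j]
    for pattern in patterns:
        letters, pat_scores = parse_pattern(pattern)
        start = work.find(letters)
        while start != -1:
            # Overlay the pattern, keeping the highest score per gap.
            for offset, value in enumerate(pat_scores):
                scores[start + offset] = max(scores[start + offset], value)
            start = work.find(letters, start + 1)
    points = []
    for j, score in enumerate(scores):
        i = j - 1                          # gap before work[j] is before word[i]
        if score % 2 == 1 and left_min <= i <= len(word) - right_min:
            points.append(i)
    return points


def hyphenate(word, hyphen="\u00ad"):
    """Insert `hyphen` (a soft hyphen by default) at every permitted break."""
    pieces, last = [], 0
    for point in break_points(word):
        pieces.append(word[last:point])
        last = point
    pieces.append(word[last:])
    return hyphen.join(pieces)


print(hyphenate("hyphenation", hyphen="-"))  # hy-phen-ation
```

The quality of the result depends entirely on the pattern list, which is the point made above about the word lists the patterns are computed from.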

  4. I think we all can agree that when you have text automatically justified you should also have it hyphenated, even if it’s automatic and less than perfect. (That’s where the current rule comes from: “don’t justify text on the Web!”) If you use (automatic) hyphenation you don’t necessarily need justification. The benefits are still there, especially in languages with a higher average word length than English. People also seem to forget or ignore that you can configure client- and server-side hyphenators to your liking, i.e. minimum characters to keep on a line and push to the next, maximum number of consecutive hyphenated lines, exception words etc.

    Javascript solutions, of course, are not solutions but hacks. They’re fine as user-side workarounds (bookmarklet, plugin/addon), but, in general, shouldn’t be provided by site owners, at least not enabled by default.

    Some commenter suggested proposing additions to CSS. This has, unsurprisingly, been done long ago. It was even mentioned in the article, but barely (re Prince), and currently resides in “Generated Content and Paged Media” but will be moved to a “more appropriate module”.
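
The configuration options listed at the start of this comment can be illustrated with a small sketch. It is a hypothetical greedy line filler in Python, not the behaviour of any particular hyphenator: the names `wrap`, `left_min`, `right_min` and `max_consecutive` are assumptions made for the example, words are expected to carry soft hyphens already, and exception words are left out for brevity.

```python
SHY = "\u00ad"  # soft hyphen marking a permitted break inside a word


def split_options(word, left_min, right_min):
    """List (head, tail) splits at soft hyphens that respect both minimums."""
    parts = word.split(SHY)
    plain_len = sum(len(part) for part in parts)
    options = []
    for i in range(1, len(parts)):
        head = "".join(parts[:i])
        tail = SHY.join(parts[i:])
        if len(head) >= left_min and plain_len - len(head) >= right_min:
            options.append((head, tail))
    return options


def wrap(text, width, left_min=2, right_min=2, max_consecutive=2):
    """Fill lines greedily, hyphenating only while the run limit allows it."""
    lines, current, run = [], "", 0      # run = hyphenated lines in a row
    queue = text.split()[::-1]           # used as a stack, next word on top
    while queue:
        word = queue.pop()
        plain = word.replace(SHY, "")
        prefix = current + " " if current else ""
        if len(prefix) + len(plain) <= width:
            current = prefix + plain     # the whole word fits on this line
            continue
        placed = False
        if run < max_consecutive:
            # Try the longest head first so the line is filled as far as possible.
            for head, tail in reversed(split_options(word, left_min, right_min)):
                if len(prefix) + len(head) + 1 <= width:
                    lines.append(prefix + head + "-")
                    current, run = "", run + 1
                    queue.append(tail)   # the rest of the word starts the next line
                    placed = True
                    break
        if not placed:
            if current:
                lines.append(current)    # end the line without a hyphen
                current, run = "", 0
                queue.append(word)       # retry the word on a fresh line
            else:
                lines.append(plain)      # overlong word: let it overflow
                run = 0
    if current:
        lines.append(current)
    return lines


sample = "aa\u00adbbbb cc\u00addddd ee\u00adffff"
print(wrap(sample, 10, max_consecutive=2))  # ['aabbbb cc-', 'dddd ee-', 'ffff']
print(wrap(sample, 10, max_consecutive=1))  # ['aabbbb cc-', 'dddd', 'eeffff']
```

Lowering `max_consecutive` trades a slightly more ragged edge for fewer stacked hyphens, which is the kind of decision a site owner or reader could be allowed to make.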

  5. Just thought I’d mention that viewing the Heart of Darkness example in Opera on Windows, there’s an odd question-mark character that precedes every em dash.

  6. As an author of an article on “Antidisestablishmentarianism” may have discovered, text benefits from hyphenation and justification, whether justified or ragged. Like using <table> and float:, the solutions in the article are workarounds to fundamental layout problems better solved by browser developers working with good specifications.

    If reading on electronic devices is going to become as easy and elegant as print, browsers and e-readers need to have built-in sophisticated hyphenation and justification routines which are applied at the point of use, along with better handling of soft-hyphen and fixed codes so that they don’t pollute what the viewer receives.

    And once hyphenation is sorted, can we move on to good kerning and tracking and correct handling of ligatures, please?

    Reading on the web is like riding a unicycle or flying a biplane. We do it for the experience not the ride.

  7. For accessibility’s sake, there needs to be an “off” button from the site visitor’s end.

    I’ll also agree that too many Web pages suffer from JavaScript bloat.  On a Windows embedded system, I see pages that take minutes to load, or even sometimes to scroll.

  8. @joe clark
     “The article’s use of the U.S. Constitution as “neutral, generic” body copy is actively offensive to foreign nationals.”
    – Joe, as I wrote you in the email you referred to – did you not read my reply? – the “neutral text” quote is not from the US Constitution. It’s from the Declaration of Independence and it was written by Thomas Jefferson to apply to all peoples at all times. It belongs to the world, not to citizens of the United States. It’s yours, too. And it beats Lorem Ipsum.

     “My question is why are we trying to set type standards on the web based off of print principles?”
    – We shouldn’t and I didn’t write that we should. Screens are different than ink on paper. Even high-res screens like the Retina display.
    But – and I admit that I haven’t interviewed every literate human on the planet in reporting this as Joe Clark would have me do, it seems – the habits of readers today have been formed by print and its conventions. (This is changing, and changing rapidly, but it’s still the case.) We should examine the conventions of print. What doesn’t work should be tossed out. Traditions should be examined and analyzed. And then purging can be an informed decision, not a knee-jerk reaction to “old media”.

     “The first thing that came to my mind for practical use was not to hyphenate body text (jagged right is just fine for me, personally). Instead, I could see this being useful for really long names and titles for objects that appear in a tight grid.”
    – Thanks for thinking outside the box. This is not an either/or proposition. (Do those who object on principle to this technique also object to JavaScript being used to insert “smart quotes”? Smart quotes are OK, but hyphens not?)
    And your idea ties in, I think, with Mathias’s observation that word-length is not the same in all languages and that the problem of line wrapping has levels of urgency. With auto-translation becoming more and more prevalent, this is an issue that rates some thought. Hyphenation is one tool that can be in the toolbox.

     “Javascript solutions, of course, are not solutions but hacks.”
    – I’ve got no problem with, and I thank you for, the rest of your comment, but this broad-brush labeling of anything done with JavaScript as “hacks” is nonsense. And dismissing the work done in libraries like jQuery as “hacks” is offensive. (Although you most probably didn’t mean it that way.) JavaScript is the most widely used programming language in the world. It is embedded in many products besides browsers – Adobe Acrobat and InDesign, to name just two among the hundreds if not thousands of apps. JavaScript is and will remain an integral and irreplaceable part of the crafting of web pages now and into the future. There is nothing hacky about programmatic solutions to problems – they are perfectly appropriate. It is just another approach. Would it be nice if H&J were natively supported? Yes. But even then you would be dependent upon the included hyphenation dictionaries and you might have to turn to JavaScript to tweak the result. This stuff is hard to get right and there will never be a perfect world.

    @jon faulds
     “there’s an odd question-mark character that precedes every em dash.”
    Thanks for reporting this. It has a bearing on backward compatibility.
    Happens to me on XP, also. And Opera’s behavior is per-spec and technically correct. Here’s why: if the spacing character or the ZWS character that I’ve used to surround the em dash isn’t in the font, and Opera can’t find it in any of the fallback fonts, either, the browser should show a box, as Opera does. Other browsers will synthesize certain characters even if they’re not in the font. As of Vista, Microsoft began including these characters in the system fonts exactly because of complaints about “boxes”.

  9. One thing I wanted to add:
    Where hyphens are inserted is dependent upon hyphenation dictionaries. Whether or not the hyphens are inserted into the text using JavaScript or done with code built into the browser, this does not change.
    The result you see with Hyphenator.js is, theoretically, exactly the same result you would see with native support. For screen, the visual results would be indistinguishable.

  10. In no way does force justified hyphenated text make the text easier to read — in fact it is just the opposite.

    I regularly work with people who have visual difficulties, learning difficulties and who are reading text that is not in their first language. I can guarantee you that most of them find force-justified hyphenated text more difficult to read and understand. It’s rarely necessary in print and totally unnecessary on the web.

    Setting text this way is an anachronism and it would be a terrible shame to see it spread on the web.

  11. I don’t know why I have to keep reminding Richard that shoving U.S. legislative documents down our throats as “neutral, generic” examples offends people who aren’t American. We never declared independence, hence don’t have a Declaration of Independence.

    I guess this is one of those times when it’s pointless to argue with Americans about their view that everybody fundamentally is one.

  12. Your comment shows that you haven’t yet understood the web. Whereas in print the content and its presentation are often one big unit (the book, the newspaper), there’s a three-layered model on the web:

    1. content — well structured HTML (w/o any styling)
    2. presentation — default or user defined CSS that styles the HTML
    3. behaviour — how the user can interact with layer 1 and 2: JavaScript and server side languages

    If a web designer decides not to respect this model, it’s his fault and he hasn’t understood the web, either!

    When it comes to accessibility and reception of text, a website is well done (among many other important things) when it remains readable with CSS and JavaScript (layers 2 and 3) turned off.
    H&J belongs to layer 2 only (it’s done by JavaScript in the case of Hyphenator.js because layer 3 is the place to change layer 2, but in the case of native hyphenation support layer 3 isn’t involved any more).

    There are interfaces for every user to change layer 2 (user-defined stylesheets and extensions) and layer 3 (bookmarklets and extensions). So if one doesn’t like how the content is presented, he can change its presentation and its behaviour. (BTW: it’s exactly what I am doing with Hyphenator.js: it hyphenates every webpage for me, because I don’t like text layouts with ugly rags.)

    It’s not a shame that there’s H&J for the web. If that were the case, it would also be a shame that there’s color (color-blindness) and sound (deafness) and many other things.

    I think that you’re wrong.

    (But it’s a shame that people still don’t know about the model described above and still don’t know how to use and adapt the web for their needs!)

  13. Thank you for your article. I am going to start using it in all of my new website projects!

  14. Practical use of hyphens on the web is still at an early stage. For example, we designers have to avoid word-break hyphens on 3 consecutive lines within the same paragraph. To be able to avoid this, we need a sultriness adjustment.

    We can’t wait to control content on the web the same way we do in print, though.

  15. I have implemented Hyphenator before on a website using the JS library. But personally I do not believe that this is the best way to implement hyphenation for the Web.

    Loading the JS library with every new web page creates quite a heavy load and takes time. And this also means that hyphenation is only available on websites that actively offer it.

    Using it as a bookmarklet makes hyphenation available on any website where the user wants it to improve readability, by his own judgement. Much better, because it gives users a choice. But you still need to actively click that button for each new page, which then renders again. What a waste of time!

    My conclusion is that the browser should add such functionality – preferably as an add-on – and make it configurable. One option could be to hyphenate all web pages by default. Another option would be to only hyphenate the current page on demand by clicking a button. But then the add-on could ask if you want to remember that website and set its individual default to hyphenate every time. That would be choice plus ease of use.

    Now we would need to find someone able and willing to write such an add-on to make us happy.

  16. When discussing whether ragged or justified style is more readable, I would like to remind everybody that many languages other than English – especially German and Finnish – have extremely long words. Using narrow columns (often seen with image captions as well) without hyphenation, you often end up with just one word per line and large holes.

  17. There is also an alternative to using the &shy; entity: the <wbr> tag.

    More info can be found here:

  18. Interesting article. I haven’t finished it yet, but I think this one really helps a lot.

  19. When I started reading this article, I thought, “Cool!” By the time I finished it and read the comments, I was swayed that this is an interesting tech-demo, but not good practice.

    I can see how there would be special cases that might merit it, like Heribert Wettels mentioned. However, I think the right answer is to avoid creating such tight spots in the design phase, thinking globally long before you’re putting in content.

  20. The author takes umbrage at javascript workarounds being labelled “hacks”, but that’s exactly what they are.

    hyphenator.js is a hack, because it uses javascript to provide a feature that should be implemented natively in the browser. It’s a stop-gap measure, just like using javascript to fix poor CSS support in old browsers (max-width, fixed positioning, etc.).

    The point about labelling something “hacky” is to draw attention to the costs of using it. In the case of hyphenator.js, you’re adding javascript to make H+J work. The cost is additional complexity (maintenance), the nasty bug in find-on-page, and performance.

    Remember that javascript is a blocking download, so the cost of javascript in kB cannot be directly compared to an image. Anything that interferes with the display of text content should be subjected to a harsh performance assessment, because a delay in supplying the text greatly affects the perceived responsiveness of the page.

    hyphenator.js is an impressive project. The typophile in me longs to use it. But the pragmatic website owner in me says that the objective cost greatly outweighs the small, subjective benefit.

    In other words, it’s just too hacky for my taste. Brilliant, but hacky.

  21. Wonderful article. There is a lot of annoyingly repetitive stuff in this thread, and some complete falsehoods.

    @Mike Hopley: JavaScript does not have to be a “blocking download.” This is why tools like YSlow’s analyzer recommend putting it at the end of your code. There are even tools out there that will compress, cache, and reposition your JS automatically.

    @Everybody who wants the world to know the web is not print: We are in a post-web-is-not-print world now. The web is the new print. If you don’t want to come along for the ride, you don’t have to use tools like those mentioned in the article.

    I used to be one of those “The web is not print” people, until I realized it made me instantly recognizable as someone who designed “like a web designer.” We need to push the body of web design work forward, not coddle it.

    I applaud Mathias for his work in pushing the cutting edge forward a bit more. I work next to a print designer of 25 years, and knowing about this sort of tool helps me bring her visual language to the web. It’s worth the effort.

  22. Sure. But if you put hyphenator.js at the end of your code, you will get a jarring “flash” when the hyphenation kicks in.

    You can’t have it both ways. Either you take the performance hit, or you live with the FOUC-like effect. Or you could just stick to ragged-right.

    It’s much the same problem as using @font-face: either you delay the text, or you get a flash of restyling when the font file arrives. For custom fonts, perhaps it’s worthwhile (at least for some designs). But for justified text?

  23. If you leave the word-breaking and hyphenation entities in your copy for HTML-formatted email (and they may look like word spaces after you’ve cleaned the text), they will cause problems for inline conversion offered by services such as Premailer. Be sure to strip them!
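
One way to do the stripping suggested here is sketched below. It is a minimal Python example under stated assumptions: the function name `strip_break_hints` is made up for illustration, and the list of spellings it removes (the soft hyphen and zero-width space characters, their common entity forms, and `<wbr>` tags) may not cover everything a given editor or CMS emits.

```python
import re

# Soft hyphen and zero-width space as raw characters, their usual
# named/numeric entity spellings, and <wbr> tags in any common form.
_BREAK_HINTS = re.compile(
    "[\u00ad\u200b]"
    r"|&(?:shy|#173|#xad|#8203|#x200b);"
    r"|<wbr\s*/?>",
    re.IGNORECASE,
)


def strip_break_hints(html):
    """Remove invisible break opportunities before handing copy to email tooling."""
    return _BREAK_HINTS.sub("", html)


print(strip_break_hints("hy&shy;phen\u00adation<wbr>s"))  # hyphenations
```

Running this as the last step before the email conversion keeps the hyphenated copy intact for the web version.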

  24. To all:
    Thanks for the frank, sometimes passionate comments.
    I wrote this article in the spirit of “hey, take a look at this, what do you think?” and you’ve certainly let me and ALA’s readers know your mind. (And if bowerbird wants to let us all in on his secret sauce for better H&J, I’m all ears.)
    One thing in particular that I’d like to point out is that HTML is not only the future of ebooks, it’s the future of print, too. At least that’s what I see with my binoculars on, and pretty clearly, too.
    If I may make a suggestion: try viewing the quick’n’dirty desktop browser example from the article in Print Preview. (However you might feel about IE, it happens to have a good Print Preview mode.)
    It looks like a book. And if I were to include a print style sheet, I wouldn’t be locked into the pixel grid and I could make use of the high res environment of print just as easily as any PDF. Add a dash of web fonts and anything InDesign can do, I can do in the browser. (In fact, I’ve issued a private face-off challenge to an experienced book designer of my acquaintance and I’m hoping he sends me a few pages by year’s end so I can try my hand at duplicating them – with every typographical nuance intact – in browser-rendered HTML.)

    Simply put: as a web author, H&J is a design option I want. I don’t care if it’s unnecessary. I don’t care if some people like it or don’t like it. I want the option. I want my H&J.

    Personally, I happen to like H&J onscreen for long passages of text, especially narrative. I also like it on reading devices like the iPad where the viewing distance is more intimate.
    But that’s my druthers and it would take special circumstances for me to consider imposing H&J as the default.

    With regard to using the soft hyphen and JavaScript to get H&J today:
     Everything has its advantages and disadvantages.
    Let me say that again:
     Everything has its advantages and disadvantages.

    Hyphenator.js is what it is and I haven’t heard anybody argue that it isn’t the best we can do for H&J for now.
    Hacky, shmacky, whacky or not.
    I certainly do admire Mathias Nater’s effort and will be using it on occasion, absolutely. I have no doubt the implementation will improve. Getting some more eyeballs on it was a spur, for sure, and – the way I see it – a part of what ALA is all about.
    ‘Til later…

  25. I happen to like justified text, even without hyphenation. I’m not sure how it harms readability, for English text, if the width of the text is sufficient. For narrow columns: yeah, it produces weird spacing between words. But for wide blocks of text, it just looks so much nicer at a glance, and you don’t really notice the spacing issue (rarely is it more than a few extra pixels per space) when reading.

    That said, I still don’t think automatic hyphenation is something I’m going to implement on my own site.

    Also, the Declaration of Independence is clearly not a “U.S. legislative document”. Even if it was, what could possibly be offensive about it? The quasi-religious mention of a creator? Or just the fact that it was written by Americans?

  26. In a world of social networking, CMSs and other user-generated text, I am wondering how relevant this topic is.
    Interesting article, though.
