The Perfect 404

by Ian Lloyd

57 Reader Comments

  1. Ian Lloyd is out and about in Australia and will have intermittent Internet access for a few days after the publication of this article. Should you have any questions for him, please be patient. All will be answered in the fullness of time.
  2. The best way to deal with 404s is not to have them in the first place. Every time I move a page on my site I leave behind a file in the same place with the same filename that tells the user the page has been moved, and gives them the new link. The alternate method is to use .htaccess files to automatically redirect the user to the correct place. You can redirect based on wildcards, so whole directories can be catered for. No more 404s.
  3. Checking URL spelling might help reduce the number of 404s. Apache can do this: http://httpd.apache.org/docs/mod/mod_speling.html
  4. Arg! I’ve stated this before, but I’ll say it again, as it seems that people overlook this. “Expired links”, i.e. links that used to exist but now exist in a new location [vs. mis-typed links], should be redirected via an HTTP 301 “file permanently moved” or an HTTP 302 “file temporarily moved” status. These are seamless redirects and will automatically update personal bookmarks *and* allow search engine crawlers to update their entries. 301 and 302 redirects are obviously a better choice than presenting the visitor with a page telling them that something doesn’t exist, with the bonus of “fixing” old search engines without begging Google et al to change them for you. 404s should only be used in cases where the URLs are totally invalid due to human typographic errors, or the content doesn’t exist on the site anymore.
  5. Actually, if the URL used to be valid but isn’t anymore, and the content does not exist at any other location, the proper HTTP status code is 410, not 404. Also, Ian, the sample links to accessify.com attempt to download the page rather than displaying it.  I’m using the latest nightly build of Firebird.  It works properly in IE6.
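The rule of thumb in comments 4 and 5 (moved pages get a 301 or 302, deliberately removed pages get a 410, everything else gets a 404) can be sketched roughly as below. This is only an illustration in Python; the function name and the `moved`, `removed` and `temporary` collections are hypothetical and would come from your own site records.

```python
from http import HTTPStatus


def status_for(path, moved, removed, temporary=frozenset()):
    """Return (status, new_location) for a requested path.

    moved:     dict mapping old paths to their new locations (hypothetical data)
    removed:   set of paths that used to exist and were deleted on purpose
    temporary: subset of moved paths whose move is only temporary
    """
    if path in moved:
        # 302 for temporary moves, 301 for permanent ones.
        code = HTTPStatus.FOUND if path in temporary else HTTPStatus.MOVED_PERMANENTLY
        return code, moved[path]
    if path in removed:
        # The content is gone for good and has no replacement.
        return HTTPStatus.GONE, None
    # Unknown path: most likely a typo or a bad link.
    return HTTPStatus.NOT_FOUND, None
```

For example, `status_for("/old.html", {"/old.html": "/new.html"}, set())` yields a 301 together with the new location.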
  6. i didn’t think this would be a very informative article, since error pages are very easy and basic, but there were some pretty useful tidbits stashed. kudos.
  7. If your site is running on top of Apache and you’re allowed to use .htaccess files, create a file called ‘.htaccess’ and put the below (between the snips) in there: [snip]
    Redirect Permanent /the/old/page.html http://heythatsyoursite.com/the/new/page.html
    [snip] (all on one line, mind). Save this file in your root public_html directory (the same place you put, say, a robots.txt file). That way, the user will never see an error page, which is what a “404” is (an error, smarty). Yes, a visitor will never know that they need to “update their bookmarks”, but when’s the last time you did that? That’s what I thought… If you want to get really techy, you can use the “RedirectMatch” thingy, like this: [snip]
    RedirectMatch ^/the/old/directory http://heythatsyoursite.com/the/new/directory
    [snip] If you have some sort of odd fetish and know regexes like the back of your hand (hairy hands up!), you’ll know that the caret (^) means: “match the beginning of a line”. We follow that with a directory. That means any request to the directory will be redirected to a different directory. Finally, notice that you can use ANY URL for “http://heythatsyoursite.com”. For instance, if you move your entire site, or a portion of your site (say your knitting tips are REALLY taking off and deserve their own site), it’s really no problem now, innit? Cheers, Justin Simoni
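To see what the caret-anchored pattern does, here is a small Python sketch of the same matching logic. One assumption to flag: the `(.*)` capture that carries the rest of the path over to the new directory is an addition for illustration and is not part of the rule as posted. As far as I know, RedirectMatch only substitutes what you capture, so the posted rule sends every matching request to the same target URL.

```python
import re

# Old and new locations taken from the comment above.
OLD_PREFIX = re.compile(r"^/the/old/directory(.*)$")
NEW_BASE = "http://heythatsyoursite.com/the/new/directory"


def redirect_target(path):
    """Return the new URL for a request path, or None if no rule applies."""
    match = OLD_PREFIX.match(path)
    if match is None:
        return None
    # Append whatever followed the old directory so deep links survive the move.
    return NEW_BASE + match.group(1)
```

A request for `/the/old/directory/tips.html` would map to the same file name under the new directory, while unrelated paths return `None`.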
  8. That’s because the server returns Content-Type: application/octet-stream instead of text/html in the HTTP headers. Mozilla doesn’t know what to do with an octet-stream, so it pops up a Save to Disk box.
    I’ve sent a bug report to accessify.com.
  9. .htaccess: I have updated .htaccess when moving files and directories around, but I usually do it in the old subdirectory so as to reduce the load on the root .htaccess; perhaps this is overkill, and may be harder to maintain.
    410: Not sure how to make Apache spit this out.  I’d still probably want to show the same error page. Actually, we show the same error page for 403 as for 404, simply to keep the content simple for the end-user.
    Archive.org: If we can’t find a trace of the file, I wonder whether it would make sense to offer to search for the page at archive.org, with the warning that the page is presumably out of date.
  10. Dunstan at 1976design has a great <a title=“1976design 404” href=“http://www.1976design.com/blog/ndex”>example 404.
  11. Sorry, didn’t know what was kosher for link making in comments here on ALA. Perhaps a short description would be useful, much like a good 404.
  12. How about not using <img> at all as it’s presentational markup ;)
  13. << Sorry, didn’t know what was kosher for link making in comments here on ALA. Perhaps a short description would be useful, much like a good 404. >> At the top of every page of the forum is this short description: Discuss this article. HTML tags and entities display as source; they do not render. To create a live link, simply type the URL (including http://).
  14. Some links you should also add to your 404: http://www.theobvious.com/archive/1998/11/02.html
  15. I wonder how to do the exact same thing from server side (PHP,CGI,etc). I don’t prefer Javascript though.
  16. Scripting to enhance the error page is a good idea, especially when you have limited control over the server configuration and cannot implement custom error pages on the server. The examples seem to be vulnerable to cross-site scripting, though, with referrer information written out unescaped, which may have some undesirable side effects ...
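One way to address the concern in comment 16 is to escape the referrer before writing it into the page, and to link it only when it is a plain http or https URL. The following is a minimal server-side sketch in Python, not the article's script; the function name and the wording of the messages are made up for illustration.

```python
import html
from urllib.parse import urlsplit


def referrer_note(referrer):
    """Build a short HTML note about where the visitor came from."""
    try:
        scheme = urlsplit(referrer).scheme
    except ValueError:
        scheme = ""
    if scheme not in ("http", "https"):
        # Never turn javascript: or other odd schemes into a link.
        return "<p>We could not tell where you came from.</p>"
    # Escape <, >, & and quotes so the value cannot break out of the markup.
    safe = html.escape(referrer, quote=True)
    return '<p>You arrived here from <a href="{0}">{0}</a>.</p>'.format(safe)
```

With this in place, a referrer containing a script tag is shown as harmless text rather than being executed.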
  17. At Virgin Radio, we worked hard on our 404 errors. I liked this article, but think we’re ahead of the crowd here, and we’ve some more examples you might like. http://www.virginradio.co.uk/linktous
    - this page doesn’t exist, but we automatically search for possible results from the URL you gave us. Why not? It makes sense. But if you’re going to run searches automatically, you need to make sure this doesn’t happen on non-HTML requests - it’ll kill your server otherwise! A graphics file, like
    http://www.virginradio.co.uk/error.gif
    ...returns a much simpler error page.
    (As does a multimedia file, like http://www.virginradio.co.uk/error.asx ) We also use our 404 error code for simpler URLs for use in advertising.
    http://www.virginradio.co.uk/breakfast properly directs to the correct file at http://www.virginradio.co.uk/djsshows/shows/pgbreakfast/index.html Here’s hoping these ideas are useful. If you want to contact me further, I’m at http://www.virginradio.co.uk/thestation/contactus/?to=techies
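The point in comment 17 about not running an automatic search for image or media requests could be sketched like this. The extension list is an assumption for illustration, not Virgin Radio's actual configuration.

```python
import posixpath
from urllib.parse import urlsplit

# Extensions treated as "pages" worth searching for (hypothetical list).
PAGE_EXTENSIONS = {"", ".html", ".htm", ".php", ".asp"}


def wants_search_suggestions(request_uri):
    """Return True if the missing URL looks like a page rather than an asset."""
    path = urlsplit(request_uri).path
    extension = posixpath.splitext(path)[1].lower()
    return extension in PAGE_EXTENSIONS
```

Under these assumptions a request for `/linktous` would get the search-backed error page, while `/error.gif` and `/error.asx` would get the simple one.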
  18. James, thanks for sharing. That’s awesome.
  19. it’s so 1997 and of little practical purpose, but I can’t avoid chuckling in front of the 404 you haven’t covered: the smart-assed 404 (eg, http://www.gamewyrd.com/archives/wyrderrors.php )
  20. Another addition worthy of note is that if your site is running on Apache (version 1.3 or greater) then you can enable mod_speling. This module allows for capitalisation issues (only a problem on non-Windows servers) and also allows for one spelling error. As an example, if you typed in index.ht instead of index.htm, or statd instead of stats, Apache will attempt to track down the right file or directory before issuing you the appropriate error.
  21. Just to add to what Josh L posted above, mod_speling can be good for taking care of some of the typos, though obviously is not ideal for those worried about performance. It does, however, check both spelling and capitalisation. In addition, if you do use a server-side language to serve up your 404 pages (and other error docs), it can be extremely beneficial to have the page email you whenever an error doc is served. Ideally, include all request headers with it too, so you can track down the cause of the error.
  22. Thanks to everyone who has commented so far. It has caught me off guard a little though, and as editor Erin pointed out, it’s also happening while I’m on my travels and so I might not be able to respond in as timely a fashion as I’d like. Thanks to James Cridland for building on the ideas I had in the article. I knew there would be many other great ideas, so thanks for sharing. As many have pointed out, there is an irony that in a piece about 404 errors, there has been another error in one of the example 404 pages, that being that the 404 page being served up is coming through as an octet stream. In IE it renders fine but on Firebird it prompts to save the custom 404 page as a file. If anyone knows how to remedy this, please add to the discussion. Thanks also to Klaus for pointing out that cross-site scripting was a potential problem. I don’t know how much of a problem this is - I do not have the mind of a hacker! - but I am not married to the notion that the script must be client-side. If the server config allows, I would favour server-side scripting which should, as I understand it, stop (or at least reduce) the spectre of cross-site scripting. If I’m wrong, please do tell ASAP.
  23. Depending on your setup, it is helpful to add some server-side code into the custom 404 page to email the administrator information about the offending link. With PHP it is easy to create a mail message containing the URL (if they mis-typed it you can see what it was), the referrer (so you can see if Google has an outdated link or fix a link within your own site), and other information like date/time, etc. If you don’t have the ability to put this code in your custom 404 page, that’s OK. You can create a PHP file with just the mail function. Then in the HTML of the custom 404 page add something like this in the head tag: [removed][removed] This contains the PHP mail function. If done properly, it will be called whenever someone hits the 404 page. This will email you a heads-up that there is an error. If the offending link is from within the site, you can fix it. If the link comes from a search engine, you can temporarily re-create the link and point it somewhere else. You can even add a snippet of code to the top to tell the search engine the content has permanently moved:
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: correct_link/"); /* Redirect browser */
    exit;
    Having a well designed custom 404 page is great, but after one person gets there, you should be making every effort that no one else does. Without any information about what caused the 404, someone is doomed to repeat the error.
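The notification idea in comment 23 might look roughly like the sketch below, written in Python rather than PHP. It only builds the message; actually sending it (for example with `smtplib`) is left out, and the addresses and field layout are placeholders, not anything from the original post.

```python
from email.message import EmailMessage


def build_404_report(url, referrer, user_agent, to_addr="webmaster@example.com"):
    """Assemble a plain-text report about one 404 hit (addresses are placeholders)."""

    def clean(value):
        # Strip line breaks so request data cannot inject extra mail headers.
        return value.replace("\r", " ").replace("\n", " ")

    message = EmailMessage()
    message["Subject"] = "404: " + clean(url)
    message["From"] = "errors@example.com"
    message["To"] = to_addr
    message.set_content(
        "Requested URL: " + clean(url) + "\n"
        "Referrer: " + clean(referrer) + "\n"
        "User agent: " + clean(user_agent) + "\n"
    )
    return message
```

As later comments note, much 404 traffic comes from worms and scanners, so in practice some filtering or an opt-in form may be preferable to mailing on every hit.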
  24. I have built a custom 404 page with some of the ideas here and toyed with the idea of an automatic email but a glance at my log files put me off the idea - most of the 404 errors come from script kiddies and worms looking for vulnerable web servers. Instead I plan to add an optional ‘Tell the webmaster’ form. Clicking submit without doing anything else will send the URL but they will have the option to add a comment and a return email. Back to our current 404 page - it’s all server-side using VBScript on an MS box. I use a text file that holds possible matches for the error URL. These include names for our larger directories, URLs for pages that have been moved, and wrong URLs that were published elsewhere. If someone has a dud URL that points to somewhere in our ‘rules’ directory, they will be offered a link to the ‘Current rules and regulations’ contents page. If it’s a known problem, e.g. a magazine has published a URL with a typo, we can offer the exact page. The same idea on our search page gives ‘recommended links’ for our most common search terms. I’ll have to look at the idea of dealing separately with typos and wrong referrals. What happens when people just chop off the end of the URL looking for an index page? We get a few of those. If you have to deal with a Microsoft IIS server and would like the code, I’m happy to supply it.
    http://casa.gov.au
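The lookup described in comment 24 (a list of known URL fragments, each with a suggested destination) can be approximated as follows. This is a Python sketch, not the VBScript that was offered, and the rule entries are invented examples.

```python
# Each rule: (URL prefix to look for, suggested link, link text). Hypothetical data.
RULES = [
    ("/rules/", "/rules/index.htm", "Current rules and regulations"),
    ("/pubs/", "/publications/index.htm", "Publications"),
]


def suggest(path, rules=RULES):
    """Return (link, text) suggestions whose prefix matches the missing path."""
    lowered = path.lower()
    return [(target, text) for prefix, target, text in rules if lowered.startswith(prefix)]
```

A dud URL under `/rules/` would then be offered the rules contents page, and a path that matches nothing gets an empty list so the page can fall back to its generic advice.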
  25. If you are scripting a custom 404 page then it is critical to send appropriate headers. For example:
    <?php
    header("HTTP/1.0 404 Not Found");
    ?>
    This will inform user agents (such as Googlebot) that the page is a 404.
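The same principle applies to any server-side setup: the friendly page must still go out with a 404 status line. Here is a minimal sketch as a Python WSGI application; the page text is a placeholder.

```python
def not_found_app(environ, start_response):
    """WSGI app that serves a custom error page with a real 404 status."""
    body = b"<html><body><h1>Page not found</h1></body></html>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # The status line is what robots and link checkers act on, not the page text.
    start_response("404 Not Found", headers)
    return [body]
```

It can be tried locally with `wsgiref.simple_server.make_server("", 8000, not_found_app).serve_forever()` from the standard library.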
  26. Does anybody have a good way of going to a URL and specifying a false referrer? Not sure how to test drive all the different pieces of this without, for example, convincing a search engine to link to a nonexistent page in my site…
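For testing, most HTTP client libraries let you set the Referer header yourself. A small sketch with Python's standard library follows; both URLs in the usage note are examples only.

```python
import urllib.request


def request_with_referrer(url, referrer):
    """Prepare a request that claims to come from the given referrer."""
    return urllib.request.Request(url, headers={"Referer": referrer})
```

Passing the result to `urllib.request.urlopen` sends the request, for instance `request_with_referrer("http://example.com/nosuchpage", "http://www.google.com/search?q=test")`. Note that `urlopen` raises `urllib.error.HTTPError` for a 404 response; the error object can still be read like a normal response to inspect the custom page.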
  27. Thought your article was great. Linked to it in this week’s newsletter I just sent out :)
  28. One thing that is not mentioned in this otherwise good article is that you should have your system set up to send mails or other forms of data to your webmaster/system every time a 404 fault is registered. This ensures a proactive approach if something is wrong and needs to be fixed, e.g. someone in the marketing department has done a mailing with a wrong URL.
  29. No-one’s mentioned this site yet, the “404 Research Lab”: http://www.plinko.net/404/ . Tips and links to lots of custom 404 pages. Nice article btw, pity my ISP won’t let me write a suitable .htaccess file (sigh).
  30. Nate, you can just have a link to a non-existent page from another site.  Just post it in a forum or something and click on it.  That should work for testing purposes.  In response to the article, I think it has some good ideas.  I will not use Javascript to implement this, but I understand why the author did so.  Thanks for taking the time to write this article, good ideas!
  31. I quite like the 404 page at The Sisters of Mercy’s web site ..(http://www.thesistersofmercy.com/error404page.html)
  32. 1. The <!-- and //--> comment delimiters for Javascript have not really been needed since Netscape 1.0 died - they were there to prevent browsers that were not aware of the <script> element from displaying the script verbatim. 2. More importantly: The example error document does not validate as XML, which means that it will break if you attempt to serve it as application/xhtml+xml. While escaping this with a CDATA section could work in some cases:
    <script>
    <![CDATA[
    ... unescaped script content ...
    ]]>
    </script>
    .. there are also cases where this theoretically could fail (e.g. if you serve this to a client that doesn’t understand what a CDATA section is, it could throw script errors).  If you wanted to satisfy both the pseudo-xhtml-crowd, and the old-browser-crowd, you could use something insanely complex like
    <script><!--//--><![CDATA[//><!--
    ...unescaped script content
    //--><!]]></script>
    In the end - you’d really be better off using an external script that modifies the source of the document - which would also do away with the need for a <noscript> block.
  33. Hello… I’ve done a tentative conversion of this to php for those of you who may want it. It’s kind of tough to debug because I can only check it through www.wannabrowser.com so there may be a few slight errors, but I think it’s pretty close to done.  If anyone out there wants to do a double-check of my conversion please do!  Keep in mind that this code will still need to be modified a bit to suit your specific needs.  http://www.sitepoint.com/forums/showthread.php?p=1076762
  34. It’s also possible to tailor the content of a 404 page based on the file or file type requested (James Cridland hinted at this in an earlier post). For example I cooked up a little something because I wanted to offer mp3 downloads for a limited time period, then just delete the files and forget about them (yes, yes, I know, moving/changing urls is very bad practice but sometimes you just can’t help it. I can’t anyway). I did something like this in php:
    if(preg_match('/mp3$/' $_SERVER['REQUEST_URI']))
    {
      echo '<!-- mp3 match found! -->';
      // custom error message
    } else {
      // standard 404 error message
    }
    It at least allows you to offer a little explanation. Trivial demo: http://malaise.sorehead.org/nosuchpage.html
    http://malaise.sorehead.org/nosuchsong.mp3 There’s probably a dozen better ways to do the same thing, but that seemed to work for me.
  35. Or, if you prefer your php to parse correctly, substitute: if(preg_match('/mp3$/', $_SERVER['REQUEST_URI'])) * sigh *
  36. looksmart has changed their search param from ‘key’ to ‘qt’
  37. I screwed up my 404 page once; hopefully others might benefit from my mistake. My host offered “missing.html” as the 404 handler, and I had a basic 404 page done some time ago.  Later I converted my site to PHP, and added a directive with mod_rewrite to redirect *.html to *.php.  I noticed this stopped loading my 404 page, so I thought I’d just rename missing.html to missing.php. After a very long time I began to notice my custom 404 page was appearing in search engine results.  Turns out my PHP redirect was making errors return a 301 (moved permanently) message instead of 404.
  38. If you mistype the WAP url above, the server will send a 404 status followed by a nice little HTML page. But WAP phones can’t read HTML, so all the user will see, is “INVALID FILE TYPE”. Conclusion 1: The WAP gateway should intercept the 404 message, and supply its own WML error page. Conclusion 2: The WAP gateway doesn’t, so if your site has WAP pages, it should send WML error pages to WAP user agents. In general, we should always try to send a MIME type that the UA says it will accept. But IE doesn’t say that it accepts text/html, so forget it, see http://landsbank.fo/tools/http.headers/http.headers.html.cfm
  39. Bruce - I agree entirely, it seems like a good idea to include an auto email facility (if you can do this) to notify the webmaster, but it can get out of hand too quickly. Many of these ideas were used in the error 404 for Nationwide Building Society http://www.nationwide.co.uk/ (my place of employment when I’m not slacking around the world!) and the key thing was not to overload some poor admin person. Hence, the technique was an opt-in one - if the page they wanted was important, they’d probably report it otherwise it gets ignored. Or at least that’s the theory!
  40. Hello. I read and i didn’t see anyone posting the url to this site : http://www.404lounge.net/
    This is a nice gallery.
  41. An error is an error. An exception should not be replaced by anything! If I download a web-document using a spider or robot I *WANT* the 404 as an error, not as a page. The ‘invention’ you show us here also break link-checkers, since they simply will find: aha, there is a document on this link, so all is fine. Usability is to be enhanced by applications, not by data-files. If the data is missing it is missing. If the programmer of the web-server wants to analyze the user-agent-string and send a special document: ok. If the browser does offer a better option: ok. But if the server sends a search- and report-form to the robot: bad!
  42. Read the provided links.  This is not a hack, this is an encouraged standard by Apache. ErrorDocument is an Apache directive that only customizes a standard Apache response containing important HTTP header information (404/301/etc).  The URL http://www.mydomain.com/error/404.html is not going to be written to your browser’s address bar.  Apache will serve the appropriate header information, i.e. the 404, and then provide this page in place of the incorrectly requested one.  Google does not catalog pages that give the appropriate 404 header information, nor 301, etc.  You can deny direct access to those files by simply adding a ‘Deny From All’ htaccess file in the error directory itself. The page that is served in conjunction with ErrorDocument is exactly the same as what you *WANT* and *GET*, this is only Apache’s way of allowing you to customize its look, feel, & functionality. If you’re still not convinced, read this: http://httpd.apache.org/docs/mod/core.html#errordocument
    http://httpd.apache.org/docs/custom-error.html
    http://httpd.apache.org/docs/misc/custom_errordocs.html Excerpt :: If all attempts to locate the content fail, Apache returns an error page with HTTP status code 404 (file not found). The appearance of this page is controlled with the ErrorDocument directive and can be customized in a flexible manner as discussed in the Custom error responses and International Server Error Responses documents.    
  43. Better late than never.
    Ian and Andrew inspired me to write an ASP custom 404 page based on their ideas/design/style. Which is useful because I can actually use it for the health systems’ website that I manage. I believe simple and consistent options for all platforms could make 404 error pages useful across the board. eh - eNjoy…
    http://www.tastypopsicle.com/404.asp
  44. Okay, I added a search for an archival copy of pages that have been deleted from http://www.sfmuni.com. Of course, there is no guarantee that Archive.org will have archived the particular page if it indeed ever existed. Example of a page we recently deleted: http://www.sfmuni.com/rid/cac/ca000706.htm
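A link of the kind comment 44 describes could be generated along these lines. The Wayback Machine URL pattern used here is an assumption based on how archive.org calendar pages are commonly addressed, so check it against the live site before relying on it.

```python
from urllib.parse import quote

# Assumed pattern for "all captures of this URL" on the Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web/*/"


def archive_search_link(missing_url):
    """Return a link that looks up archived copies of the missing page."""
    # Keep URL punctuation intact, but encode spaces and other unsafe characters.
    return WAYBACK_PREFIX + quote(missing_url, safe=":/?&=%")
```

On the error page the link should carry the warning suggested in comment 9: any archived copy is presumably out of date.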
  45. from experience… if you use apache’s mod spelling then that’s it for server side (until apache 2). if your isp has mod spelling turned on, you can turn it off with .htaccess but your error page (php etc) will still not run as the page has already been sent to mod spelling.
  46. MSIE/PC is per default set to
    “Show friendly HTTP Error Messages”
    thus making it impossible for ca. 95% of all users to benefit from the article’s idea as they are presented with what MSIE thinks is a good 404 (or 401 or the like). The default setting can be overruled easily but most users don’t (and wouldn’t know about it anyway). Web developers tend not to use MSIE for private surfing (quite OK), so I guess that’s how the article came into being; good idea, but not fitting reality. By the way, I don’t see the point of relying on client-side scripting; server-side offers the same possibilities here and wouldn’t exclude ca. 10% of all users.
    (some stats state that number for JS-less users)
  47. octet-stream solution (re Ian’s page 3 post): “...there has been another error in one of the example 404 pages, that being that the 404 page being served up is coming through as an octet stream. In IE it renders fine but on Firebird it prompts to save the custom 404 page as a file. If anyone knows how to remedy this, please add to the discussion.” IE violates the W3C rule of using MIME-type to determine how to handle a file; it uses the dot-extension instead. So IE will work fine with application/octet-stream (a default MIME type for some servers) while standards-compliant browsers will attempt to download the file.  The solution is to send .asp files out with the proper MIME type. In Apache, you would put this line in the .htaccess file of the root folder of your Web site (if you are the server master, you would put it in httpd.conf in /www/conf instead):
    AddType text/html .asp
    Other Web servers may use other syntax and files.
    re: MSIE 404 default: The MSIE default page for 404’s and other server errors only shows if the site-supplied 404 page is shorter than a certain number of bytes, 512 I believe.  If you want to show a custom page, you need to make sure it is longer than that length, so don’t be *too* terse.
  48. http://www.juicystudio.com/tutorial/xhtml/mime.asp#asp Please ignore the octet-stream portion of the previous post.  Of course, addType wouldn’t work for server-side generated pages.
  49. >The MSIE default page for 404’s and other
    >server errors only shows if the site-
    >supplied 404 page is shorter than a certain
    >number of bytes, 512 I believe. I tried this with MSIE6/WinXP, works well, 512b is the exact threshold. Thanks for the info, calling the article “useless for 95% of all users” was rash, I apologize.
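If the error page is generated by a script, the size rule from comments 47 and 49 can be enforced mechanically. The sketch below pads a short page with an HTML comment; it takes the 512-byte figure from these comments as given, and the function name is made up.

```python
# One byte more than the threshold reported in the comments above.
MIN_BYTES = 513


def pad_error_page(html_text, min_bytes=MIN_BYTES):
    """Return the page as UTF-8 bytes, padded with a comment if it is too short."""
    data = html_text.encode("utf-8")
    missing = min_bytes - len(data)
    if missing <= 0:
        return data
    # "<!-- " and " -->" take 9 bytes; fill the rest with a harmless character.
    filler = "<!-- " + "x" * max(missing - 9, 0) + " -->"
    return data + filler.encode("utf-8")
```

A page that is already long enough is returned unchanged, so the function is safe to apply to every error response.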
  50. I read your article and then got to work on my site! Inspiration I tell you. Anyway, while researching the 404 custom error page thingy, I stumbled onto this site, http://www.plinko.net/404/ Someone has dedicated a section of their site to custom 404 pages. That’s amazing!
  51. One thing that annoys me almost every day: Some people redirect every 404 error to their main 404 page. So I lose the original URL in case I’d like to debug my incorrectly typed URL or to hack it in order to look at what I can find around. So, whatever 404 page you put up, please don’t redirect the user somewhere else.
  52. What might be a good feature to have on 404 pages is a link to Google’s cache, entering the address that creates the 404 error. It might help the users to get what they are looking for after all. http://www.google.com/search?q=cache:[pagethatisnotfound]
  53. Good solid article with useful info. Check out Apple’s 404 error page (http://www.apple.com/gobbledeegook) - they seem to have really hit the nail on the head here.
  54. There is no reason for 404 error pages. In the years I have been browsing the web, no single 404 page has helped me and your approach does not help either. The site is either searchable or not. If so, what does it cost to return to the top or the previous page and access a search box there? 404 is an error. An exception. It means: No document at given location. This is important! Robots, spiders and downloaders use this to compute results. And these programs are the ‘other’ users of the web. You simply don’t let them read. Whenever I mirror a site with ‘wget’ I am very angry about all those 404 error pages that mean nothing to me. Even worse, this way specialized programs are not able to determine “holes” in a web document. How an error is dealt with is solely up to the client’s programmer and user, not (!) to the web designer. The web would be much nicer if more web designers would realize that the internet is not a human-only territory.
  55. TEST
  56. TEST2
  57. Thank you very much for this great tutorial. It helped me very much.
    Good work.
    Greetings from Germany
    Dominik