Community Creators, Secure Your Code!

by Niklas Bivald

33 Reader Comments

  1. Or just use something like textile or markdown and move on with your life. What was that ajax sample showing exactly? Seemed like filler to me.
  2. Be very wary of all the nifty new things Web 2.0 brings us. Seems like an old message, but I suppose we could all use a reminder now and then. IE (Microsoft Internet Explorer) flops once again. Hopefully Microsoft will do a better job with IE7. :/
  3. The purpose of the Ajax sample will be clearer when part two is released. It serves as an introduction for those who don’t have a thorough knowledge of Ajax… Regards,
    Niklas
  4. http://iamcal.com/publish/articles/php/processing_html/
    http://iamcal.com/publish/articles/php/processing_html_part_2/
  5. Very nice to show the bad code, but somehow the way to protect yourself is always missing in these kinds of articles.
    Regular expressions are not exactly bread & butter for everyone, so if you want to get the world to notice your warning and act on it, some cut and paste examples would be very helpful.
  6. Worth checking if some of the strings listed on this site slip through your validation routines: http://ha.ckers.org/xss.html
  7. Martin said: “Very nice to show the bad code, but somehow the way to protect yourself is always missing in these kinds of articles.” But as this was announced as Part One of a two-part series, the comment might be reserved until Part Two has run.
  8. When will part two be published? In the ALA (A List Apart) publication? Not being well versed in this subject, it sounds to me like a site is less likely to be susceptible to XSS (cross-site scripting) when one avoids using eval and avoids muddling style, structure and behavior through the style attribute and inline JavaScript. Is this the case? I know this is easier said than done.
  9. It occurs to me that if someone just sat down and wrote a stack-based (not regex) parser based closely on stripped-down versions of the HTML, XHTML, XML and CSS specifications, we could have something that would deal quite nicely with attempted XSS attacks. Remember: these are all well-documented specifications, and the browsers, which trigger these XSS attacks, simply adhere to these specifications. The ad hoc “tricks” the article prescribes can fall victim to clever attackers. For instance, if you were to use str_replace(‘javascript’, ‘’, $html) your script would still be vulnerable to javasjavascriptcript (this is documented in the XSS cheatsheet posted above, excellent reading for anybody interested in HTML validation).
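The bypass described in comment 9 is easy to reproduce. What follows is a minimal sketch in Python, assuming `str.replace` as a stand-in for PHP's `str_replace` (both make a single left-to-right pass). The payload and the function names are illustrative only and do not come from the article.

```python
def strip_once(html: str) -> str:
    # Single pass, like str_replace('javascript', '', $html): text that
    # reassembles after the removal is never re-examined.
    return html.replace("javascript", "")


def strip_until_stable(html: str) -> str:
    # Repeat until nothing changes. This closes the nesting trick only;
    # it is still a blocklist and misses other obfuscations.
    previous = None
    while previous != html:
        previous = html
        html = html.replace("javascript", "")
    return html


payload = '<a href="javasjavascriptcript:alert(1)">click</a>'
print(strip_once(payload))          # the word "javascript" is back
print(strip_until_stable(payload))  # the word is gone
```

Even the repeated version only removes one spelling of one keyword, which is the commenter's argument for parsing the input instead of pattern-matching it.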
  10. I think the problem there is that the browser (specifically IE in this case) _doesn’t_ adhere to the specifications; or at least, that it goes beyond the spec by parsing - and executing - sloppy and/or malformed code.
  11. This can be partially mitigated by using a proper DTD for your documents, but you’re right. I suppose the idea about giving users a limited toolset would help prevent malformed code. However… if the parser makes the code *valid as well*, it shouldn’t be a problem.
  12. Using real HTML, CSS and URI parsers seems like the most secure solution, and it only has to be done when processing the input, not every time it’s displayed. In Java, there’s TagSoup (http://mercury.ccil.org/~cowan/XML/tagsoup/) for parsing just about any input HTML, which is a good idea anyway, a few CSS parsers (http://www.w3.org/Style/CSS/SAC/) and the provided URI parser, and presumably other languages have the same. If you include only the elements, attributes, CSS rules and URI methods you do understand, and correctly escape the output with the right character encoding (http://www.w3.org/International/O-charset.html), I don’t see how anything can slip through.
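The parse-then-allowlist approach in comment 12 can be sketched with Python's standard-library `html.parser` and `urllib.parse` in place of the Java tools the commenter names. The tag list, the scheme list and the `sanitize` helper are assumptions made for illustration. The sketch does not balance tags or handle CSS, so it is a starting point rather than a complete filter.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical allowlist: tag name -> attribute names permitted on it.
ALLOWED_TAGS = {"a": {"href"}, "b": set(), "i": set(), "p": set()}
ALLOWED_SCHEMES = {"http", "https"}


def safe_url(value: str) -> bool:
    # Accept only absolute URLs whose scheme is explicitly allowed.
    try:
        scheme = urlsplit(value.strip()).scheme.lower()
    except ValueError:
        return False
    return scheme in ALLOWED_SCHEMES


class AllowlistFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # unknown element: drop the tag itself
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_TAGS[tag] or value is None:
                continue  # unknown attribute (onclick, style, ...)
            if name == "href" and not safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is always re-escaped on the way out.
        self.out.append(escape(data))


def sanitize(html: str) -> str:
    parser = AllowlistFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


dirty = ('<p onclick="x()">hi <a href="javascript:alert(1)">x</a>'
         '<a href="http://example.com/">ok</a></p>')
print(sanitize(dirty))
```

Everything not on the list is dropped by default, so a new browser quirk has to get past the parser and the allowlist rather than past a pattern someone forgot to write.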
  13. Seems to me that if I were to summarise this article in a few words, capturing all the important information that is not already second nature to most web developers, it would be: “The word ‘javascript’ can have line breaks (and spaces, and other separating chars?) in it.” All the stuff about escaping special characters, white-listing HTML elements, and being careful about CSS input has been well known for years. It’s just that this new ‘ja-vas-cript’ IE trick has come into the limelight recently, because of the MySpace exploit that the author mentions.
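The point in comment 13 can be illustrated with a short Python sketch that compares a naive prefix check with one that first removes whitespace and control characters. The function names and the sample value are hypothetical. The set of characters a given browser ignores inside a scheme is an assumption here, so this shows the idea and is not a complete rule.

```python
import re


def naive_check(url: str) -> bool:
    # Looks only for the literal prefix.
    return url.lower().startswith("javascript:")


def normalized_check(url: str) -> bool:
    # Drop ASCII control characters and spaces before comparing, so a
    # line break inside the keyword no longer hides it.
    collapsed = re.sub(r"[\x00-\x20]+", "", url).lower()
    return collapsed.startswith("javascript:")


tricky = "java\nscript:alert(1)"
print(naive_check(tricky))       # False: the naive check is fooled
print(normalized_check(tricky))  # True
```

This is still a blocklist. Accepting only known-good schemes such as http and https avoids having to enumerate every separator a browser might tolerate.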
  14. For the web app I develop in Perl, I’ve found HTML::Scrubber to be a good way to help in cleaning up input from untrusted sources - nicely customizable in a very perlish way. Check it out - http://search.cpan.org/~podmaster/HTML-Scrubber-0.08/Scrubber.pm does well with javascript, although I do not know about css (you can strip out any tag you’d like), or js in css.
  15. It is fascinating to see this issue discussed. We work on a number of community-based sites and will now work on a solution for this. Up to now we have been using the HTML Tidy variations for .NET from http://tidy.sourceforge.net/ and I’m sure this will resolve most issues. We have found it fantastic! Also, for editing HTML, try http://tinymce.moxiecode.com/ as well. We find it brilliant, and it has a range of options for narrowing the HTML tags allowed. Hope this helps.
  16. Partially in response to Brian Lepore (comment #8), I’d like to help underscore the threat of XSS. *Any time* user input is accepted (even if that input comes directly from a GET or POST variable), it needs to be properly escaped on output or you are at risk of an attack. Here’s an example that brings it home for me. Imagine a site where you store a cookie for authorization. As you may know, cookie contents are accessible through JavaScript. Imagine if this site’s login page displayed error messages taken from the URL, like login.php?error=Incorrect password. Seems innocent and common enough, but if I am an attacker, and I IM one of the site’s users with a URL like login.php?error=Incorrect password [removed]alert([removed])[removed]
    (contrived example), you can see how I’d be able to manipulate the cookies and send their login cookie to my site (through a JavaScript redirect). Similarly, this can be used for phishing by sending the user to a fake login page via a link on the real site. With XMLHttpRequest, I’d even be able to force the user to perform actions (via HTTP POST/GET) on the site - such as the voting example given in this article. To make matters worse, many of the new community sites that are springing up encourage the user to enter HTML, and correctly differentiating valid HTML from invalid HTML is a difficult process. This means that you can’t really use stuff like PHP’s strip_tags() function. Hope this helps!
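Escaping on output, as comment 16 recommends, can be sketched in Python with the standard-library `html.escape`. The `render_error` helper and the sample payload are hypothetical. The payload is not the one stripped from the comment above.

```python
import html


def render_error(message: str) -> str:
    # Treat the query-string value as untrusted text: escape it at the
    # moment it is written into the page.
    return '<p class="error">' + html.escape(message, quote=True) + "</p>"


print(render_error("Incorrect password"))
print(render_error("<script>alert(1)</script>"))
```

The second call produces inert text (`&lt;script&gt;…`) instead of a script element, so the browser displays the attacker's string but does not run it.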
  17. At the moment we’ve got a half-empty glass here. I can’t judge the contents just yet, because if I really want to get to the taste I need the whole thing.
    I’m sorry to say it, but I find this half of the article useless. It might start making more sense when part two is out and about, but until then this seems like a very lengthy introduction.
  18. thomas: I’m sorry that my previous post did not state this, but I am aware of the idea of validating input to protect users. That said, I have never really understood why many sites tend to use GET data like in your example, rather than sticking to error code numbers. I know, it is quite annoying to have to look up the different numbers when you want to use something, but it saves the worry of someone injecting HTML into your site. In my opinion, the security benefit outweighs the simplicity in development. I like the check_tags function in the first link that *ban jax* posted. It looks like a beefed-up version of strip_tags that fits the needs of most developers who need to allow HTML.
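The error-code idea in comment 18 amounts to a lookup table: the URL carries only a number, and the page prints text the server already owns. A minimal Python sketch follows. The codes, messages and function name are invented for illustration.

```python
# Hypothetical table of messages owned by the server.
ERROR_MESSAGES = {
    1: "Incorrect password",
    2: "Unknown user",
}


def message_for(code_param: str) -> str:
    # Anything that is not a known integer code falls back to a fixed
    # string, so no user-supplied text ever reaches the page.
    try:
        code = int(code_param)
    except ValueError:
        return "Unknown error"
    return ERROR_MESSAGES.get(code, "Unknown error")


print(message_for("1"))
print(message_for("<script>alert(1)</script>"))
```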
  19. “I like the check_tags function in the first link that ban jax posted. It looks like a beefed up version of strip_tags that fits the needs of most developers that need to allow HTML.” The Iamcal code is quite interesting, but it doesn’t guarantee XHTML 1.0 valid code, since it doesn’t check the children of the elements (a tag within a tag). Also, it’s not easily extensible to environments that need a broader tag base: there’s a lot more to XSS in attributes than a few protocols. This is especially true if you decide to allow the style attribute (which, as the article points out, can execute JavaScript too! Fun.) Shoot me, but I’m not sure why anyone would need image tags for most applications either.
  20. Am I being totally short-sighted here, or can all these security holes be resolved in one simple stroke: don’t let your users personalise their space through real code!
    My other half asked me a few days ago to help her style her MySpace stuff (and as it’s the first time I’ve really bothered going there, I almost threw up when I saw how s**t it is code-wise), and I gave up after tearing my hair out for ages. For a start it clearly says “please don’t use CSS to remove any MySpace ads”, so immediately I set about doing just that - and succeeded. I then decided to play a little prank on her by adding “table {display:none}” to her style sheet and promptly destroyed the entire site in preview mode.
    IMO these sorts of places - MySpace in particular - are so shoddily written that it’ll take a CSS expert to successfully style the soup of nested tables, divs and junk into anything worth looking at. I doubt the majority of the user base will be CSS experts (I know I’m not), and certainly in my own experience anybody half web-dev savvy has their own blog on a crisp, clean blogging system, or has just written their bloody own!
    I’m all for personalisation and marking your cyber-territory, but surely it’s quicker and easier for the users and safer for the admins to allow personalisation through forms and options. Let the user click the settings they want and have the system generate the styles. Obviously you’ll still need to filter text input areas to avoid IE’s incompetence, but you’re already running a much tighter ship. Or, I say again, am I being short-sighted?
  21. MySpace didn’t get popular because it was well-written ;-) It got popular because it gave users control. However, I think that you do have a point in not letting users customize space through “real code.” Isn’t that what most forums and blogs do right now by not allowing HTML when they can avoid it? BBCode and Textile are much easier to secure than raw web input.
  22. Well of course you don’t have to let them use real code. You could let them pick between the options you give them. But you could also just not accept their content, or their photos, or their comments.. Like the previous poster said, MySpace sold for many millions of dollars, and it was only because “making profiles pretty” really appeals to teenaged girls and the boys who lust for them. People like to do their own thing.
  23. “But you could also just not accept their content, or their photos, or their comments..” I think that’s a totally different ball game - not accepting customisation through real code has nothing to do with censorship. I entirely agree that making “pretty profiles” is the attraction of MySpace and its ilk, but as has been mentioned, there are much more secure ways of going about it - allowing real-code submission is just asking for too much trouble.
  24. This is important stuff. Blogging communities allow other users to view their code, but for Web 2.0 and secure programs it’s important to know that the code can be hacker-proof.
  25. As well as allowing javascript URLs in CSS, IE also has a “feature” that lets properties be set using expressions, written in (of course) JavaScript. So you can use: <body [removed]alert(‘hi’));”>...</body> Just something else to watch out for…
  26. I think MySpace giving their users freedom to edit their templates’ CSS was a huge mistake. Sure, the default theme is ugly, but what the majority of people do to their MySpaces is much worse. They just destroy them beyond any level of readability or sanity. MySpace would be a nicer place without theme editing. Pure Volume is proof of this.
  27. Your article makes a good case for the security codes necessary to hold back abuse of the system. The system still needs to be refined so that legitimate users are not kept out in the same stroke we use to stop abusers. Thanks for raising the topic so well…
  28. I know the article is about XSS, but the example used points to another problem with a lot of these types of sites: not using a validation scheme for the ‘voting’ script, or for scripts that control other types of changes. A simple check for a valid random unique id in voteOnAuser.php would kill any chance of an XSS vulnerability such as this having any effect, because the ‘vote’ would automatically be rejected. And a big applause to #28, if you’re going to allow customization then by all means have complete control over the code yourself.
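The “valid random unique id” check in comment 28 can be sketched in Python as a one-time token stored in the user's session. The session dictionary, the function names and the token length are assumptions. How voteOnAuser.php actually stores state is not given in the thread.

```python
import hmac
import secrets


def issue_token(session: dict) -> str:
    # Generate an unguessable value, remember it server-side, and embed
    # the returned copy in the voting form.
    token = secrets.token_hex(16)
    session["vote_token"] = token
    return token


def accept_vote(session: dict, submitted: str) -> bool:
    # The token is single-use: pop it so a replayed request fails.
    expected = session.pop("vote_token", None)
    if expected is None:
        return False
    return hmac.compare_digest(expected.encode(), submitted.encode())


session = {}
form_token = issue_token(session)
print(accept_vote(session, form_token))  # first use is accepted
print(accept_vote(session, form_token))  # replay is rejected
```

A check like this rejects requests forged without knowledge of the token. A script already running inside the site's own pages could read the token from the form first, so output escaping is still needed alongside it.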
  29. I’m sure there are two clear strategies to prevent XSS attacks:
    1. Format using tidy and then remove anything unexpected - leave only a basic set of tags.
    2. Separate administering and displaying markup content onto different sites (like blogger.com does).
    Correct me if I’m wrong.
  30. Instead of
    one should use
    ;) Makes things valid as well :D
  31. All that customization of web applications reminds me of an article on network security. Pick your poison: do you want to restrict the user to a known and limited set of features, or chase all the hacks that people will find? With the first approach you define and design it once; the other approach is a never-ending race to ensure your application is safe from the known tricks and hacks. OK, it takes more time to develop the white list, and it most certainly feels more restrictive from a user standpoint, but I guess it depends on whether you prefer to appear on the front page because your application is great or because somebody took control of other people’s accounts. Maybe I’m paranoid, but I’d rather spend time expanding my applications than fixing my damaged reputation. Cheers!
  32. http://kevin.mesiab.com/wordpress/index.php/cragslist-vulnerability/ Apparently they need to heed the message. ;)  At the time of writing, this hole has not been patched.
  33. The ad hoc “tricks” the article prescribes can fall victim to clever attackers. For instance, if you were to use str_replace(‘javascript’, ‘’, $html) your script would still be vulnerable to javasjavascriptcript (this is documented in the XSS cheatsheet posted above, excellent reading for anybody interested in HTML validation).