Community Creators, Secure Your Code!
Issue № 215

Personalization is a great feature: it allows users to make their personal pages come to life by adding colors, pictures, and even sound. But as with any user input, it is a security threat if not properly sanitized. The creation of a secure online community is a balancing act: your users should be able to personalize their pages using pseudo-code or actual HTML, while remaining protected from vandals who might inject malicious JavaScript or otherwise cause harm.

One piece of the larger security puzzle is cross-site scripting (XSS). In part one of this two-article series, we will look at various XSS techniques you should be aware of, and at common methods of defending your community against them. In part two, we’ll use real-world examples to explore these techniques in greater detail.

The threat

Malicious JavaScript injections are a threat at many levels. Using a full-fledged injection, an attacker could:

  • Change the presentation of the attacker’s personal pages in a forbidden way (this is the lowest level of severity, but could produce a misleading or confusing experience for other users).
  • Execute an action whenever a user enters the attacker’s page, such as voting for the attacker in a poll or adding the attacker to a buddy or “trusted” list.
  • Infect the personal pages of users who visit the attacker’s page, creating a spreading virus that might, in turn, execute malicious code or propagate spyware or viruses that exploit security flaws in popular browsers.

These are just three examples of what an attacker might do, but two things are already clear:

  1. XSS is a real threat. MySpace and many other community sites have already been attacked or compromised.
  2. Webmasters should, therefore, make sure that their sites are properly protected.

A real-world example using eval() and AJAX

By using the eval() function, an attacker can execute long JavaScript commands and even self-made functions. The attacker could, for instance, use the XMLHTTP request object (the core component of AJAX) to send or retrieve a piece of information. An insertion that would force the victim to vote for the attacker could look something like this:

  // IE only (to shorten the example)
  http_request = new ActiveXObject("Microsoft.XMLHTTP");

  // The string to POST (taken from the community)
  send = "vote-id=123456789&vote=10";

  // We send the data to our function "nullfunction"
  http_request.onreadystatechange = nullfunction;

  // Send it to the right page
  http_request.open("POST", "voteOnAuser.php", true);

  // Sending as form data
  http_request.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
  http_request.setRequestHeader("Content-length", send.length);
  http_request.setRequestHeader("Connection", "close");

  // Send
  http_request.send(send);

  // In our case we don't want to use the data being returned.
  // If we'd wanted to (for example, to get our user id or any
  // other information) we could have used this function to
  // process the data.
  function nullfunction() {
  if (http_request

About the Author

Niklas Bivald

Niklas Bivald (LinkedIn, GitHub) is a tech guy at heart. He loves creative use of data. His passion is the belief that tech and creativity are like milk and cookies, not opposites. He's a longtime lecturer with a background at tech companies such as Spotify and various agencies.

33 Reader Comments

  1. Be very wary of all the nifty new things Web 2.0 brings us. Seems like an old message, but I suppose we could all use a reminder now and then.

    IE (Microsoft Internet Explorer) flops once again. Hopefully Microsoft will do a better job with IE7. :/

  2. The purpose of the AJAX sample will be clearer when part two is released. Like an introduction for those who haven’t got a thorough knowledge of AJAX…


  3. Very nice to show the bad code, but somehow the way to protect yourself is always missing in these kinds of articles.
    Regular expressions are not exactly bread & butter for everyone, so if you want to get the world to notice your warning and act on it, some cut and paste examples would be very helpful.

  4. Martin said:

    “Very nice to show the bad code, but somehow the way to protect yourself is always missing in these kinds of articles”

    But as this was announced as Part One of a two-part series, the comment might be reserved until Part Two has run.

  5. When will part two be published? In the ALA (A List Apart) publication?

    Not being well versed on this subject, it sounds like a site is less likely to be susceptible to XSS (cross-site scripting) when one avoids the usage of eval and avoids muddling style, structure and behavior through the use of the style attribute and inline JavaScript. Is this the case? I know this is easier said than done.

  6. It occurs to me that if someone just sat down and wrote a stack-based (not regex) parser based closely on stripped-down versions of the HTML, XHTML, XML and CSS specifications, we could have something that would deal quite nicely with attempted XSS attacks. Remember: these are all well documented specifications and the browsers, which trigger these XSS attacks, simply adhere to these specifications.

    The ad hoc “tricks” the article prescribes can fall victim to clever attackers. For instance, if you were to use str_replace('javascript', '', $html) your script would still be vulnerable to javasjavascriptcript (this is documented in the XSS cheatsheet posted above, excellent reading for anybody interested in HTML validation).
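The javasjavascriptcript case is easy to reproduce. What follows is a minimal sketch in JavaScript rather than PHP: stripOnce and stripUntilStable are hypothetical names, and String.prototype.replace stands in for str_replace.

```javascript
// Hypothetical helper: remove every occurrence of "javascript" in ONE pass,
// which is what a single str_replace-style call does.
function stripOnce(input) {
  return input.replace(/javascript/gi, "");
}

// Hypothetical helper: repeat the removal until the string stops changing.
function stripUntilStable(input) {
  let previous;
  let current = input;
  do {
    previous = current;
    current = stripOnce(current);
  } while (current !== previous);
  return current;
}

const payload = "javasjavascriptcript:alert(1)";

// Removing the inner keyword lets the two outer halves rejoin.
console.log(stripOnce(payload));        // → "javascript:alert(1)"
console.log(stripUntilStable(payload)); // → ":alert(1)"
```

Even the repeated version is still a blacklist: it removes one spelling of one keyword and would miss variants such as a line break inside the word, which is the argument for parsing and whitelisting instead.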

  7. I think the problem there is that the browser (specifically IE in this case) _doesn’t_ adhere to the specifications; or at least, that it goes beyond the spec by parsing – and executing – sloppy and/or malformed code.

  8. This can be partially mitigated by using a proper DTD for your documents, but you’re right. I suppose the idea about giving users a limited toolset would help prevent malformed code. However… if the parser makes the code *valid as well*, it shouldn’t be a problem.

  9. Using real HTML, CSS and URI parsers seems like the most secure solution, and it only has to be done when processing the input, not every time it’s displayed.

    In Java, there’s TagSoup for parsing just about any input HTML (which is a good idea anyway), a few CSS parsers, and the provided URI parser; presumably other languages have the same.

    If you include only the elements, attributes, CSS rules and URI methods you do understand, and correctly escape the output with the right character encoding, I don’t see how anything can slip through.
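For the URI part of that suggestion, a real parser can make the decision instead of a string search. The sketch below uses the standard URL class available in browsers and Node.js; the allowed schemes, the placeholder base address and the name isAllowedUrl are assumptions chosen for illustration.

```javascript
// Schemes this sketch chooses to accept; everything else is rejected.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:"]);

// Placeholder base so relative links such as "/img/cat.png" can be resolved.
const BASE_URL = "https://community.example/";

// Hypothetical helper: true only if the parsed URL uses an allowed scheme.
function isAllowedUrl(raw) {
  let parsed;
  try {
    parsed = new URL(raw, BASE_URL);
  } catch (err) {
    return false; // unparseable input is rejected, not repaired
  }
  return ALLOWED_PROTOCOLS.has(parsed.protocol);
}

console.log(isAllowedUrl("https://example.com/cat.png")); // absolute link
console.log(isAllowedUrl("/img/cat.png"));                // relative link
console.log(isAllowedUrl("java\nscript:alert(1)"));       // line break inside the scheme
```

Because the parser normalizes the input before the scheme is read, the check compares the scheme the browser would actually see rather than the raw characters the user typed.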

  10. Seems to me that if I was to summarise this article in a few words, capturing all the important information that is not already second-nature to most web developers, it would be: “The word ‘javascript’ can have line breaks (and spaces, and other separating chars?) in it.”

    All the stuff about escaping special characters, white-listing HTML elements, and being careful about CSS input has been well known for years. It’s just that this new ‘ja-vas-cript’ IE trick has come into the limelight recently, because of the MySpace exploit that the author mentions.

  11. It is fascinating to see this issue discussed. We work on a number of community-based sites and will now work on a solution for this. Up to now we have been using the HTML Tidy variations for .NET at…

    I’m sure this will resolve most issues. We have found it fantastic! Also for editing HTML try using

    We find this brilliant and has a range of options for narrowing the HTML tags allowed.

    Hope this helps

  12. Partially in response to Brian Lepore (comment #8), I’d like to help underscore the threat of XSS. *Any time* user input is accepted (even if that input comes directly from a GET or POST variable), it needs to be properly escaped on output or you are at risk of an attack.

    Here’s an example that brings it home for me. Imagine a site where you store a cookie for authorization. As you may know, cookie contents are accessible through JavaScript. Imagine if this site’s login page would display error messages in the URL, like login.php?error=Incorrect%20password.

    Seems innocent and common enough, but if I am an attacker, and I IM one of the site’s users with a URL like login.php?error=Incorrect%20password%20
    (contrived example), you can see how I’d be able to manipulate the cookies and send their login cookie to my site (through a JavaScript redirect). Similarly this can be used for phishing by sending the user to a fake login page via a link on the real site.

    With XMLHttpRequest, I’d even be able to force the user to perform actions (via HTTP POST/GET) on the site – such as the voting example given in this article.

    To make matters worse, many of the new community sites that are springing up encourage the user to enter HTML, and correctly differentiating valid HTML from invalid HTML is a difficult process. This means that you can’t really use stuff like PHP’s strip_tags() function.

    Hope this helps!
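Escaping on output, as described for the login.php?error=… case, could look roughly like this. It is a sketch under stated assumptions: escapeHtml and renderError are hypothetical names, the markup is invented, and the second query string is an illustrative payload, not the one from the comment.

```javascript
// The five characters that are significant in HTML text and attribute values.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Hypothetical helper: replace each significant character with its entity.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Hypothetical renderer for the "error" parameter of a login page.
function renderError(queryString) {
  const message = new URLSearchParams(queryString).get("error") || "";
  return `<p class="error">${escapeHtml(message)}</p>`;
}

console.log(renderError("error=Incorrect%20password"));
// → <p class="error">Incorrect password</p>
console.log(renderError("error=%3Cscript%3Ealert(1)%3C%2Fscript%3E"));
// → <p class="error">&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The injected markup is still displayed, but as inert text, so the browser never treats it as a script.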

  13. At the moment we’ve got a half-empty glass here. I can’t judge the contents just yet, because if I really want to get to the taste I need the whole thing.
    I’m sorry to say, but I find the half of this article useless. It might start making more sense when part two is out and about, but until then this seems like a very lengthy introduction.

  14. thomas:

    I’m sorry that my previous post did not state this, but I am aware of the idea of validating input to protect users.

    That said, I have never really understood why many sites have a tendency to use GET data like in your example, rather than keeping to the use of error code numbers. I know, it is quite annoying to have to look up the different numbers when you want to use something, but it saves the worry of someone injecting HTML into your site. In my opinion, the security benefit outweighs the simplicity in development.

    I like the check_tags function in the first link that *ban jax* posted. It looks like a beefed up version of strip_tags that fits the needs of most developers that need to allow HTML.
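The error-code idea can be sketched as a lookup table, so that only text the site itself wrote is ever displayed. The codes, messages and function name below are invented for illustration.

```javascript
// Hypothetical table of messages written by the site's own developers.
const LOGIN_ERRORS = new Map([
  ["1", "Incorrect password"],
  ["2", "Unknown user name"],
]);

// Hypothetical helper: map the "error" parameter to a known message.
// Anything that is not a known code yields an empty string.
function loginErrorMessage(queryString) {
  const code = new URLSearchParams(queryString).get("error");
  return LOGIN_ERRORS.get(code) || "";
}

console.log(loginErrorMessage("error=1"));                  // → "Incorrect password"
console.log(loginErrorMessage("error=%3Cscript%3E") === ""); // → true
```

The user-supplied value is used only as a key and is never echoed back, so there is nothing for an attacker to inject through this parameter.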

  15. “I like the check_tags function in the first link that ban jax posted. It looks like a beefed up version of strip_tags that fits the needs of most developers that need to allow HTML.”

    The Iamcal code is quite interesting, but it doesn’t guarantee XHTML 1.0 valid code, since it doesn’t check the children of the elements (a tag within a tag). Also, it’s not easily extensible to environments that need a broader tag base: there’s a lot more to XSS in attributes than a few protocols. Especially true if you decide to allow the style attribute (which, as the article points out, can execute JavaScript too! Fun.)

    Shoot me, but I’m not sure why anyone would need image tags for most applications either.

  16. Am I being totally short-sighted here, or can all these security holes be resolved in one simple stroke: don’t let your users personalise their space through real code!

    My other half asked me a few days ago to help her style her MySpace stuff (and as it’s the first time I’ve really bothered going there I almost threw up when I saw how s**t it is code-wise) and then gave up after tearing my hair out for ages.

    For a start it clearly says “please don’t use CSS to remove any MySpace ads” so immediately I set about doing just that – and succeeded. I then decided to play a little prank on her by adding “table {display:none}” to her style sheet and promptly destroyed the entire site in preview mode.

    IMO these sorts of places – MySpace in particular – are so shoddily written it’ll take a CSS expert to write code to successfully style the soup of nested tables, divs and junk to get anything worthwhile looking at – and I doubt the majority of the user base will be these CSS experts (I know I’m not) – and certainly from my own experience anybody half web-dev savvy have their own blogs with crisp, clean blogging systems or just written their bloody own!

    I’m all for personalisation and marking your cyber-territory, but surely it’s quicker and easier for the users and safer for the admins to allow personalisation through forms and options. Let the user click the settings they want and the system generates the styles.

    Obviously you’ll still need to filter text input areas to avoid IE’s incompetence, but you’re already running a lot tighter ship.

    Or, I say again, am I being short-sighted?
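Personalization through forms and options, with the system generating the styles, might be sketched as follows. The option names, the font list, the fallback colors and the .profile selector are all assumptions; the point is only that user input selects among values the site already trusts.

```javascript
// Assumed option lists; a real site would define its own.
const ALLOWED_FONTS = new Set(["serif", "sans-serif", "monospace"]);

// Accept only six-digit hex colors such as "#ffcc00".
const HEX_COLOR = /^#[0-9a-f]{6}$/i;

// Hypothetical helper: build a style rule from validated options,
// falling back to defaults for anything that does not validate.
function buildProfileCss(options) {
  const background = HEX_COLOR.test(options.background) ? options.background : "#ffffff";
  const text = HEX_COLOR.test(options.text) ? options.text : "#000000";
  const font = ALLOWED_FONTS.has(options.font) ? options.font : "sans-serif";
  return `.profile { background-color: ${background}; color: ${text}; font-family: ${font}; }`;
}

console.log(buildProfileCss({ background: "#FFCC00", text: "#333333", font: "monospace" }));
// → .profile { background-color: #FFCC00; color: #333333; font-family: monospace; }
```

No user-written CSS reaches the page, so tricks such as the table {display:none} prank or script-bearing property values have nowhere to go.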

  17. MySpace didn’t get popular because it was well-written 😉 It got popular because it gave users control.

    However, I think that you do have a point in not letting users customize space through “real code.” Isn’t that what most forums and blogs do right now by not allowing HTML when they can avoid it? BBCode and Textile are much easier to secure than raw web input.
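One reason BBCode-style markup is easier to secure is that the input can be escaped in full first, after which only a handful of known tags are turned into HTML. A minimal sketch, assuming just [b] and [i] are supported (escapeForHtml and bbcodeToHtml are hypothetical names):

```javascript
// Entities for the characters that are significant in HTML.
const BB_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Hypothetical helper: neutralize all HTML in the raw input.
function escapeForHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => BB_ESCAPES[ch]);
}

// Hypothetical converter: escape everything, then expand [b] and [i] pairs.
function bbcodeToHtml(input) {
  return escapeForHtml(input)
    .replace(/\[b\]([\s\S]*?)\[\/b\]/gi, "<strong>$1</strong>")
    .replace(/\[i\]([\s\S]*?)\[\/i\]/gi, "<em>$1</em>");
}

console.log(bbcodeToHtml("[b]hi[/b] <script>"));
// → <strong>hi</strong> &lt;script&gt;
```

The only HTML in the output is what the converter itself emits. This sketch does not repair mis-nested tags such as [b][i]x[/b][/i], which a fuller implementation would need to handle.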

  18. Well of course you don’t have to let them use real code. You could let them pick between the options you give them. But you could also just not accept their content, or their photos, or their comments..

    Like the previous poster said, MySpace sold for many millions of dollars, and it was only because “making profiles pretty” really appeals to teenaged girls and the boys who lust for them. People like to do their own thing.

  19. “But you could also just not accept their content, or their photos, or their comments..”

    I think that’s a totally different ball game – not accepting customisation through real code has nothing to do with censorship.

    I entirely agree that making “pretty profiles” is the attraction of MySpace and its ilk, but as it’s been mentioned, there are much more secure ways of going about it – allowing real-code submission is just asking for too much trouble.

  20. As well as allowing javascript URLs in CSS, IE also has a “feature” that lets properties be set using expressions, written in (of course) JavaScript. So you can use:

    Just something else to watch out for…

  21. I think MySpace giving their users freedom to edit their template’s CSS was a huge mistake. Sure, the default theme is ugly, but the things that the majority of people do to their MySpaces are much worse. They just destroy them beyond any level of readability or sanity.

    MySpace would be a nicer place without theme editing. Pure Volume is proof of this.

  22. Your article makes a good case for the security codes necessary to hold back abuse of the system. The system still needs to be refined so that legitimate users are not kept out in the same stroke we use to stop abusers. Thanks for raising the topic so well…

  23. I know the article is about XSS, but the example used points to another problem with a lot of these types of sites: not using a validation scheme for the ‘voting’ script, or for scripts that control other types of changes.

    A simple check for a valid random unique id in voteOnAuser.php would kill any chance of an XSS vulnerability such as this from having any effect because the ‘vote’ would automatically be rejected.

    And a big applause to #28, if you’re going to allow customization then by all means have complete control over the code yourself.
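A check for a random unique id of the kind described for the voting script could be sketched like this in Node.js. The in-memory Map, the session ids and both function names are assumptions; a real site would tie the token to its own session storage.

```javascript
const crypto = require("crypto");

// Hypothetical in-memory store: session id -> token issued with the vote form.
const voteTokens = new Map();

// Hypothetical helper: create an unguessable token and remember it.
function issueVoteToken(sessionId) {
  const token = crypto.randomBytes(32).toString("hex");
  voteTokens.set(sessionId, token);
  return token;
}

// Hypothetical helper: accept a vote only if it carries the issued token.
function isValidVoteToken(sessionId, submitted) {
  const expected = voteTokens.get(sessionId);
  if (typeof expected !== "string" || typeof submitted !== "string") {
    return false;
  }
  const a = Buffer.from(expected);
  const b = Buffer.from(submitted);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

const token = issueVoteToken("session-1");
console.log(isValidVoteToken("session-1", token));   // token from the real form
console.log(isValidVoteToken("session-1", "guess")); // forged request
```

A token like this stops requests forged from other sites. A script that is already running inside the victim's page can read the token from the form, so this complements input filtering and output escaping rather than replacing them.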

  24. I’m sure there are two clear strategies to prevent XSS attacks:
    1. Format using tidy and then remove anything unexpected, leaving only a basic set of tags.

    2. Separate administrating and displaying markup content on different sites (like does).

    Correct me if I’m wrong.

  25. All that customization of web applications reminds me of an article on network security.

    Pick your poison: do you want to restrict the user to a known and limited set of features or chase all the hacks that people will find?

    With the first approach you define and design it once; the other approach is a never ending race to ensure your application is safe from the known tricks and hacks.

    Ok, it takes more time to develop the white list, and it most certainly feels more restrictive from a user standpoint, but I guess it depends on whether you prefer to appear on the front page because your application is great or because somebody took control of other people’s accounts.

    Maybe I’m paranoid, but I’d rather spend time expanding my applications than fixing my damaged reputation.


