Community Creators, Secure Your Code!

by Niklas Bivald, April 18, 2006

Personalization is a great feature: it allows users to make their personal pages come to life by adding colors, pictures, and even sound. But as with any user input, it is a security threat if not properly sanitized. The creation of a secure online community is a balancing act: your users should be able to personalize their pages using pseudo code or actual HTML, while remaining protected from vandals who might inject malicious JavaScript or otherwise cause harm.

One piece of the larger security puzzle is cross-site scripting (XSS). In part one of this two-article series, we will look at various XSS techniques you should be aware of, and at common methods of defending your community against them. In part two, we’ll use real-world examples to explore these techniques in greater detail.

The threat

Malicious JavaScript injections are a threat at many levels. Using a full-fledged injection, an attacker could:

Change the presentation of the attacker’s personal pages in a forbidden way (this is the lowest level of severity, but could produce a misleading or confusing experience for other users).
Execute an action whenever a user enters the attacker’s page, such as voting for the attacker in a poll or adding the attacker to a buddy or “trusted” list.
Infect the personal pages of users who visit the attacker’s page, creating a spreading virus that might, in turn, execute malicious code or propagate spyware or viruses that exploit security flaws in popular browsers.

These are just three examples of what an attacker might do, but two things are already clear:

XSS is a real threat. MySpace and many other community sites have already been attacked or compromised.
Webmasters should, therefore, make sure that their sites are properly protected.

A real-world example using eval() and AJAX

By using the eval() function, an attacker can execute long JavaScript commands and even self-made functions. The attacker could, for instance, use the XMLHTTP request object (the core component of AJAX) to send or retrieve a piece of information. An insertion that would force the victim to vote for the attacker could look something like this:

eval(
  // IE only (to shorten the example)
  http_request = new ActiveXObject("Microsoft.XMLHTTP");

  // The string to POST (taken from the community)
  send = "vote-id=123456789&vote=10";

  // We send the data to our function "nullfunction"
  http_request.onreadystatechange = nullfunction;

  // Send it to the right page
  http_request.open("POST", "voteOnAuser.php", true);

  // Sending as form data
  http_request.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
  http_request.setRequestHeader("Content-length", send.length);
  http_request.setRequestHeader("Connection", "close");

  // Send
  http_request.send(send);

  // In our case we don't want to use the data being returned.
  // If we'd wanted to (for example, to get our user id or any
  // other information) we could have used this function to
  // process the data.
  function nullfunction() {
    if (http_request

33 Reader Comments

Justin Perkins says:

April 18, 2006 at 4:23 am

Or just use something like textile or markdown and move on with your life.

What was that ajax sample showing exactly? Seemed like filler to me.
Damien Wilson says:

April 18, 2006 at 4:54 am

Be very wary of all the nifty new things Web 2.0 brings us. Seems like an old message, but I suppose we could all use a reminder now and then.

IE flops once again. Hopefully Microsoft will do a better job with IE7. :/
Niklas Bivald says:

April 18, 2006 at 6:29 am

The purpose of the ajax sample will be clearer when part two is released. It serves as an introduction for those who haven’t got a thorough knowledge of ajax…

Regards,
Niklas
ban jax says:

April 18, 2006 at 7:37 am

http://iamcal.com/publish/articles/php/processing_html/
http://iamcal.com/publish/articles/php/processing_html_part_2/
Martijn ten Napel says:

April 18, 2006 at 11:01 am

Very nice to show the bad code, but somehow the way to protect yourself is always missing in these kinds of articles.
Regular expressions are not exactly bread & butter for everyone, so if you want to get the world to notice your warning and act on it, some cut and paste examples would be very helpful.
Christian Sattel says:

April 18, 2006 at 11:44 am

Worth checking if some of the strings listed on this site slip through your validation routines:

http://ha.ckers.org/xss.html
Jeffrey Zeldman says:

April 18, 2006 at 11:59 am

Martin said:

“Very nice to show the bad code, but somehow the way to protect yourself is always missing in these kinds of articles.”

But as this was announced as Part One of a two-part series, the comment might be reserved until Part Two has run.
Brian LePore says:

April 18, 2006 at 2:56 pm

When will part two be published? In the ALA publication?

Not being well versed on this subject, it sounds like a site is less likely to be susceptible to XSS when one avoids the usage of eval and avoids muddling style, structure and behavior through the use of the style attribute and inline JavaScript. Is this the case? I know this is easier said than done.
Edward Yang says:

April 18, 2006 at 6:23 pm

It occurs to me that if someone just sat down and wrote a stack-based (not regex) parser based closely on stripped-down versions of the HTML, XHTML, XML and CSS specifications, we could have something that would deal quite nicely with attempted XSS attacks. Remember: these are all well documented specifications and the browsers, which trigger these XSS attacks, simply adhere to these specifications.

The ad hoc “tricks” the article prescribes can fall victim to clever attackers. For instance, if you were to use str_replace('javascript', '', $html) your script would still be vulnerable to javasjavascriptcript (this is documented in the XSS cheatsheet posted above, excellent reading for anybody interested in HTML validation).
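The bypass described in this comment can be reproduced in a few lines. This is only a sketch: the comment refers to PHP's str_replace, and the `naiveStrip` function below is a hypothetical JavaScript stand-in that removes the word in a single pass in the same way.

```javascript
// Hypothetical stand-in for str_replace('javascript', '', $html):
// remove every literal occurrence of "javascript" in a single pass.
function naiveStrip(html) {
  return html.split('javascript').join('');
}

// The attacker nests the forbidden word inside itself.
const attack = '<a href="javasjavascriptcript:alert(1)">click</a>';

// Removing the inner "javascript" glues "javas" and "cript" back together.
console.log(naiveStrip(attack));
// prints: <a href="javascript:alert(1)">click</a>
```

Repeating the replacement until nothing changes closes this particular hole, but it remains a blacklist and other encodings of the word may still get through.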
Phil Stewart-Jones says:

April 18, 2006 at 7:46 pm

I think the problem there is that the browser (specifically IE in this case) _doesn’t_ adhere to the specifications; or at least, that it goes beyond the spec by parsing – and executing – sloppy and/or malformed code.
Edward Yang says:

April 18, 2006 at 8:38 pm

This can be partially mitigated by using a proper DTD for your documents, but you’re right. I suppose the idea about giving users a limited toolset would help prevent malformed code. However… if the parser makes the code *valid as well*, it shouldn’t be a problem.
Carey Evans says:

April 18, 2006 at 9:13 pm

Using real HTML, CSS and URI parsers seems like the most secure solution, and it only has to be done when processing the input, not every time it’s displayed.

In Java, there’s TagSoup (http://mercury.ccil.org/~cowan/XML/tagsoup/) for parsing just about any input HTML, which is a good idea anyway, a few CSS parsers (http://www.w3.org/Style/CSS/SAC/) and the provided URI parser, and presumably other languages have the same.

If you include only the elements, attributes, CSS rules and URI methods you do understand, and correctly escape the output with the right character encoding (http://www.w3.org/International/O-charset.html), I don’t see how anything can slip through.
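The approach in this comment (keep only what you understand, escape everything on output) might look roughly like the sketch below. It assumes a real HTML parser has already turned the input into a tree. The node shape, the `ALLOWED` table and the function names are all hypothetical, and the http/https rule for href is an added assumption rather than something the comment specifies.

```javascript
// Assumed whitelist: tag name -> attribute names we are willing to emit.
const ALLOWED = { a: ['href', 'title'], p: [], b: [], i: [], em: [], strong: [] };

function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Assumed node shape (produced by whatever parser is in use):
//   { type: 'text', value: '...' }
//   { type: 'element', tag: 'a', attrs: { href: '...' }, children: [...] }
function render(node) {
  if (node.type === 'text') return escapeHtml(node.value);
  const inner = (node.children || []).map(render).join('');
  const tag = String(node.tag).toLowerCase();
  // Unknown element: drop the tag itself, keep its (escaped) contents.
  if (!Object.prototype.hasOwnProperty.call(ALLOWED, tag)) return inner;
  let attrs = '';
  for (const name of ALLOWED[tag]) {
    const value = (node.attrs || {})[name];
    if (value === undefined) continue;
    // Assumption: only absolute http(s) links are accepted in href.
    if (name === 'href' && !/^https?:\/\//i.test(value)) continue;
    attrs += ' ' + name + '="' + escapeHtml(value) + '"';
  }
  return '<' + tag + attrs + '>' + inner + '</' + tag + '>';
}

const tree = {
  type: 'element', tag: 'p', attrs: { style: 'x' }, children: [
    { type: 'text', value: 'Hi <there> ' },
    { type: 'element', tag: 'a',
      attrs: { href: 'javascript:alert(1)', onclick: 'evil()', title: 't' },
      children: [{ type: 'text', value: 'link' }] },
    { type: 'element', tag: 'script', attrs: {},
      children: [{ type: 'text', value: 'alert(1)' }] },
  ],
};

console.log(render(tree));
// prints: <p>Hi &lt;there&gt; <a title="t">link</a>alert(1)</p>
```

Because the output is rebuilt from the tree rather than patched in place, anything not named in the table never reaches the page, and the work is done once at input time as the comment suggests.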
Jeremy Epstein says:

April 19, 2006 at 1:44 am

Seems to me that if I was to summarise this article in a few words, capturing all the important information that is not already second-nature to most web developers, it would be: “The word ‘javascript’ can have line breaks (and spaces, and other separating chars?) in it.”

All the stuff about escaping special characters, white-listing HTML elements, and being careful about CSS input, have been well-known for years. It’s just that this new ‘ja-vas-cript’ IE trick has come into the limelight recently, because of the MySpace exploit that the author mentions.
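One way to stop depending on how the word "javascript" is spelled is to whitelist URL schemes instead. The sketch below is an assumption-laden illustration, not a complete filter: `isSafeUrl` and `ALLOWED_SCHEMES` are hypothetical names, the list of schemes is arbitrary, and the input is assumed to be a URL value whose character entities have already been decoded.

```javascript
// Assumed policy: only these schemes may appear in user-supplied links.
const ALLOWED_SCHEMES = ['http', 'https', 'mailto'];

function isSafeUrl(value) {
  // Remove whitespace and control characters, which some browsers
  // tolerate inside a scheme name (the "ja-vas-cript" trick).
  const compact = String(value).replace(/[\u0000-\u0020\u007f]/g, '');
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*):/.exec(compact);
  // Assumption: values without an explicit scheme are rejected outright.
  if (!match) return false;
  return ALLOWED_SCHEMES.includes(match[1].toLowerCase());
}

console.log(isSafeUrl('http://example.com/'));     // true
console.log(isSafeUrl('java\nscript:alert(1)'));   // false
console.log(isSafeUrl('JaVa\tScRiPt:alert(1)'));   // false
```

The check never mentions "javascript" at all, so a new spelling of a forbidden scheme fails for the same reason as the old ones: it is not on the list.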
Justin Simoni says:

April 19, 2006 at 4:55 am

For the web app I develop in Perl, I’ve found HTML::Scrubber to be a good way to help in cleaning up input from untrusted sources – nicely customizable in a very perlish way. Check it out –

http://search.cpan.org/~podmaster/HTML-Scrubber-0.08/Scrubber.pm

does well with javascript, although I do not know about css (you can strip out any tag you’d like), or js in css.
Trevor Spink says:

April 19, 2006 at 11:06 am

It is fascinating to see this issue discussed. We work on a number of community-based sites and will now work on a solution for this. Up to now we have been using the HTML Tidy variations for .NET at…

http://tidy.sourceforge.net/

I’m sure this will resolve most issues. We have found it fantastic! Also for editing HTML try using

http://tinymce.moxiecode.com/

We find this brilliant, and it has a range of options for narrowing the HTML tags allowed.

Hope this helps
thomas lackner says:

April 19, 2006 at 1:19 pm

Partially in response to Brian Lepore (comment #8), I’d like to help underscore the threat of XSS. *Any time* user input is accepted (even if that input comes directly from a GET or POST variable), it needs to be properly escaped on output or you are at risk of an attack.

Here’s an example that brings it home for me. Imagine a site where you store a cookie for authorization. As you may know, cookie contents are accessible through Javascript. Imagine if this site’s login page would display error messages in the URL, like login.php?error=Incorrect%20password.

Seems innocent and common enough, but if I am an attacker, and I IM one of the site’s users with a URL like login.php?error=Incorrect%20password%20
(contrived example), you can see how I’d be able to manipulate the cookies and send their login cookie to my site (through a Javascript redirect). Similarly this can be used for phishing by sending the user to a fake login page via a link on the real site.

With XmlHttpRequest, I’d even be able to force the user to perform actions (via HTTP POST/GET) on the site – such as the voting example given in this article.

To make matters worse, many of the new community sites that are springing up encourage the user to enter HTML, and correctly differentiating valid HTML from invalid HTML is a difficult process. This means that you can’t really use stuff like PHP’s strip_tags() function.

Hope this helps!
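The login.php?error=… scenario in this comment comes down to escaping on output. Here is a minimal sketch, assuming the page builds its error paragraph from the query string. The function names, the markup, and the script payload in the example are invented for illustration and are not taken from the comment.

```javascript
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical renderer for the error paragraph of a login page.
function renderLoginError(queryString) {
  const error = new URLSearchParams(queryString).get('error') || '';
  return '<p class="error">' + escapeHtml(error) + '</p>';
}

// Illustrative payload only; the original comment's payload is not shown.
const crafted = 'error=Incorrect%20password%20%3Cscript%3Ealert(1)%3C%2Fscript%3E';

console.log(renderLoginError(crafted));
// prints: <p class="error">Incorrect password &lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The message still appears on the page, but the browser displays the tag as text instead of executing it.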
Matthew J Matthiesen says:

April 20, 2006 at 3:12 am

At the moment we’ve got a half-empty glass here. I can’t judge the contents just yet, because if I really want to get to the taste I need the whole thing.
I’m sorry to say, but I find the half of this article useless. It might start making more sense when part two is out and about, but until then this seems like a very lengthy introduction.
Brian LePore says:

April 20, 2006 at 4:58 am

thomas:

I’m sorry that my previous post did not state this, but I am aware of the idea of validating input to protect users.

That said, I have never really understood why many sites have a tendency to use GET data like in your example, rather than sticking to error code numbers. I know, it is quite annoying to have to look up the different numbers when you want to use something, but it saves the worry of someone injecting HTML into your site. In my opinion, the security benefit outweighs the simplicity in development.

I like the check_tags function in the first link that *ban jax* posted. It looks like a beefed up version of strip_tags that fits the needs of most developers that need to allow HTML.
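The error-code idea in this comment can be sketched as a lookup table: the URL carries only a number, and the text shown to the user never comes from the request. The codes, messages and names below are hypothetical.

```javascript
// Assumed table of fixed messages, keyed by error code.
const LOGIN_ERRORS = {
  '1': 'Incorrect password.',
  '2': 'Unknown user name.',
};

// Returns a fixed message for a known code, or an empty string otherwise.
function loginErrorMessage(queryString) {
  const code = new URLSearchParams(queryString).get('error');
  return Object.prototype.hasOwnProperty.call(LOGIN_ERRORS, code)
    ? LOGIN_ERRORS[code]
    : '';
}

console.log(loginErrorMessage('error=1'));                 // Incorrect password.
console.log(loginErrorMessage('error=%3Cscript%3E') === ''); // true
```

Anything that is not a known code produces no output at all, so there is nothing for an attacker to inject through this parameter.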
Edward Yang says:

April 20, 2006 at 7:29 am

“I like the check_tags function in the first link that ban jax posted. It looks like a beefed up version of strip_tags that fits the needs of most developers that need to allow HTML.”

The Iamcal code is quite interesting, but it doesn’t guarantee XHTML 1.0 valid code, since it doesn’t check the children of the elements (a tag within a tag). Also, it’s not easily extensible to environments that need a broader tag base: there’s a lot more to XSS in attributes than a few protocols. Especially true if you decide to allow the style attribute (which, as the article points out, can execute JavaScript too! Fun.)

Shoot me, but I’m not sure why anyone would need image tags for most applications either.
Ross Clutterbuck says:

April 20, 2006 at 6:03 pm

Am I being totally short-sighted here, or can all these security holes be resolved in one simple stroke: don’t let your users personalise their space through real code!

My other half asked me a few days ago to help her style her MySpace stuff (and as it’s the first time I’ve really bothered going there I almost threw up when I saw how s**t it is code-wise) and then gave up after tearing my hair out for ages.

For a start it clearly says “please don’t use CSS to remove any MySpace ads” so immediately I set about doing just that – and succeeded. I then decided to play a little prank on her by adding “table {display:none}” to her style sheet and promptly destroyed the entire site in preview mode.

IMO these sorts of places – MySpace in particular – are so shoddily written it’ll take a CSS expert to write code to successfully style the soup of nested tables, divs and junk to get anything worthwhile looking at – and I doubt the majority of the user base will be these CSS experts (I know I’m not) – and certainly from my own experience anybody half web-dev savvy have their own blogs with crisp, clean blogging systems or just written their bloody own!

I’m all for personalisation and marking your cyber-territory, but surely it’s quicker and easier for the users and safer for the admins to allow personalisation through forms and options. Let the user click the settings they want and the system generates the styles.

Obviously you’ll still need to filter text input areas to avoid IE’s incompetence, but you’re already running a lot tighter ship.

Or, I say again, am I being short-sighted?
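The "forms and options" approach suggested in this comment could be sketched as follows: users pick from named choices, and the system generates the CSS itself. The option names, colors, fonts and the `.profile` selector are all assumptions made for the example.

```javascript
// Assumed set of choices offered in the settings form.
const THEME_OPTIONS = {
  background: { white: '#ffffff', black: '#000000', sky: '#cce6ff' },
  font: { serif: 'Georgia, serif', sans: 'Verdana, sans-serif' },
};

// Look up a choice by name; anything unrecognized falls back to a default.
function pick(table, key, fallback) {
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : table[fallback];
}

// Build the style rule from known values only; user text never reaches the CSS.
function buildProfileCss(choices) {
  const background = pick(THEME_OPTIONS.background, choices.background, 'white');
  const font = pick(THEME_OPTIONS.font, choices.font, 'serif');
  return '.profile { background-color: ' + background + '; font-family: ' + font + '; }';
}

console.log(buildProfileCss({ background: 'sky', font: 'sans' }));
// prints: .profile { background-color: #cce6ff; font-family: Verdana, sans-serif; }

console.log(buildProfileCss({ background: 'expression(alert(1))' }));
// prints: .profile { background-color: #ffffff; font-family: Georgia, serif; }
```

The trade-off is the one the thread goes on to debate: users get less freedom, but the only strings that end up in the stylesheet are ones the site wrote.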
Edward Yang says:

April 20, 2006 at 10:53 pm

MySpace didn’t get popular because it was well-written 😉 It got popular because it gave users control.

However, I think that you do have a point in not letting users customize space through “real code.” Isn’t that what most forums and blogs do right now by not allowing HTML when they can avoid it? BBCode and Textile are much easier to secure than raw web input.
thomas lackner says:

April 21, 2006 at 2:22 am

Well of course you don’t have to let them use real code. You could let them pick between the options you give them. But you could also just not accept their content, or their photos, or their comments..

Like the previous poster said, MySpace sold for many millions of dollars, and it was only because “making profiles pretty” really appeals to teenaged girls and the boys who lust for them. People like to do their own thing.
Ross Clutterbuck says:

April 21, 2006 at 3:31 pm

“But you could also just not accept their content, or their photos, or their comments..”

I think that’s a totally different ball game – not accepting customisation through real code has nothing to do with censorship.

I entirely agree that making “pretty profiles” is the attraction of MySpace and its ilk, but as it’s been mentioned, there are much more secure ways of going about it – allowing real-code submission is just asking for too much trouble.
Freelance Web Developer UK says:

April 21, 2006 at 4:03 pm

This is important stuff. Blogging communities allow other users to view their code, but for web 2.0 and secure programs it’s important to know that the code can be hacker-proof.
Michael O'Brien says:

April 21, 2006 at 4:47 pm

As well as allowing javascript URLs in CSS, IE also has a “feature” that lets properties be set using expressions, written in (of course) JavaScript. So you can use:

…

Just something else to watch out for…
Jim Whimpey says:

April 24, 2006 at 10:41 am

I think MySpace giving their users freedom to edit their template’s CSS was a huge mistake. Sure, the default theme is ugly, but the things that the majority of people do to their MySpaces are much worse. They just destroy them beyond any level of readability or sanity.

MySpace would be nicer place without theme editing. Pure Volume is proof of this.
Ellen Weber says:

April 24, 2006 at 1:54 pm

Your article makes a good case for the security codes necessary to hold back abuse of the system. The system still needs to be refined so that legitimate users are not kept out in the same stroke we use to stop abusers. Thanks for raising the topic so well…
Edward Vermillion says:

April 25, 2006 at 5:27 am

I know the article is about XSS, but the example used points to another problem with a lot of these types of sites, not using a validation scheme for the ‘voting’ script, or scripts that control other types of changes.

A simple check for a valid random unique id in voteOnAuser.php would kill any chance of an XSS vulnerability such as this from having any effect because the ‘vote’ would automatically be rejected.

And a big applause to #28, if you’re going to allow customization then by all means have complete control over the code yourself.
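The "valid random unique id" check proposed in this comment might be sketched like this. The comment is about voteOnAuser.php. The code below is a hypothetical JavaScript (Node.js) equivalent in which `session` stands for whatever per-user server-side storage the site has, and `issueToken` and `acceptVote` are invented names.

```javascript
const nodeCrypto = require('crypto');

// Called when the voting form is rendered: store a fresh random token
// in the user's session and embed the same value in a hidden form field.
function issueToken(session) {
  session.voteToken = nodeCrypto.randomBytes(16).toString('hex');
  return session.voteToken;
}

// Called when the vote arrives: accept it only if the submitted token
// matches the stored one. Each token can be used once.
function acceptVote(session, submittedToken) {
  const expected = session.voteToken;
  session.voteToken = null;
  if (typeof expected !== 'string' || typeof submittedToken !== 'string') return false;
  const a = Buffer.from(expected);
  const b = Buffer.from(submittedToken);
  // timingSafeEqual requires equal lengths, so check that first.
  return a.length === b.length && nodeCrypto.timingSafeEqual(a, b);
}

const session = {};
const token = issueToken(session);
console.log(acceptVote(session, token)); // true
console.log(acceptVote(session, token)); // false (token already used)
```

A blind POST like the one in the article's example would be rejected because it carries no token. A script that is already running on a page of the same site could still request the form and read the token from it, so this check complements input filtering rather than replacing it.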
Alexander Netkachev says:

April 27, 2006 at 1:29 am

I’m sure there are two clear strategies to prevent XSS attacks:
1. Format using tidy and then remove anything unexpected – leave only basic set of tags.

2. Separate administrating and displaying markup content on different sites (like blogger.com does).

Correct me if I’m wrong.
Priit Laes says:

April 27, 2006 at 12:26 pm

Instead of
one should use
😉 Makes things valid as well 😀
Jean-Marc Lagace says:

April 28, 2006 at 11:34 am

All that customization of web application reminds me of an article on network security.

Pick your poison: do you want to restrict the user to a known and limited set of features, or chase all the hacks that people will find?

With the first approach you define and design it once; the other approach is a never-ending race to ensure your application is safe from the known tricks and hacks.

OK, it takes more time to develop the white list, and it most certainly feels more restrictive from a user standpoint, but I guess it depends on whether you prefer to appear on the front page because your application is great or because somebody took control of other people’s accounts.

Maybe I’m paranoid, but I’d rather spend time expanding my applications than fixing my damaged reputation.

Cheers!
Kevin Mesiab says:

May 8, 2006 at 2:58 am

http://kevin.mesiab.com/wordpress/index.php/cragslist-vulnerability/

Apparently they need to heed the message. 😉 At the time of writing, this hole has not been patched.