Blogging from Inside the New Media Revolution

Smibs on Code: Filtering user data

By Forrest - April 29th, 2009

Displaying text supplied by a user is a major security concern. Ruby on Rails does a good job of sanitizing the SQL it sends to your database, but the web browser is still exposed. It is very easy for an attacker to type HTML or JavaScript into a text field, and you don’t want that code to run on other users’ machines. A malicious script could do all kinds of nasty things: read the session key and send it to a different server so the attacker can hijack the session, scrape data off the screen, or prompt for additional data in an attempt to get credit card information. The point is that displaying user data without any kind of filtering is very bad.
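To make the risk concrete, here is a minimal sketch of what filtering does to a piece of hostile input. It uses ERB::Util.html_escape from Ruby’s standard library, which is what the Rails “h” helper calls; the method name escape_for_display and the sample input are made up for illustration.

```ruby
require "erb"

# Hypothetical helper: turn user-supplied text into something safe to
# drop into an HTML page. The characters that are special in HTML
# (&, <, >, and quotes) are replaced with entities, so the browser
# shows the text instead of interpreting it as markup.
def escape_for_display(user_text)
  ERB::Util.html_escape(user_text)
end

# Example of what an attacker might type into a comment box.
attack = "<script>alert(document.cookie)</script>"

puts escape_for_display(attack)
# prints: &lt;script&gt;alert(document.cookie)&lt;/script&gt;
```

Rendered in a page, the escaped string shows up as harmless visible text rather than a running script.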

XKCD - Exploits of a mom: A perfect example of not filtering SQL code.

The real question is when to filter the data. There are two schools of thought on this: filter the data as soon as the user gives it to you, before you store it in the database (pre-filtering), or store exactly what the user supplies and filter the data at the point where it needs to be filtered (post-filtering).

Rails is designed more for the latter approach, and there are a number of good arguments for post-filtering. It makes mass assignment (storing a large amount of form data in the database in one step) much easier. More importantly, it lets you store exactly what the user wants to store; you don’t want to be altering the user’s data. And depending on how the data is read, it may not need to be filtered at all. An RSS reader is a perfect example: the RSS stream can carry an unfiltered version of what the user entered, while a filtered version is shown in the browser.
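As a rough sketch of the post-filtering idea, the snippet below keeps the user’s text untouched and only escapes it on the way to the browser. The method names (build_record, body_for_browser, body_unfiltered) and the hash standing in for a database row are assumptions for illustration, not code from our applications.

```ruby
require "erb"

# Hypothetical "save" step: the record holds exactly what the user typed.
def build_record(raw_text)
  { body: raw_text }
end

# Browser output: escape at display time. In a Rails view this is the
# job of the "h" helper, e.g. <%= h(comment.body) %>.
def body_for_browser(record)
  ERB::Util.html_escape(record[:body])
end

# A consumer that wants the original text gets the stored value as-is.
def body_unfiltered(record)
  record[:body]
end

record = build_record("I <3 Rails & Ruby")
puts body_for_browser(record)
puts body_unfiltered(record)
```

The catch is that every browser-facing view has to go through something like body_for_browser; one forgotten call leaves a hole.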

Despite these benefits, the method I eventually chose was pre-filtering: filtering data before it is stored in the database. This performs better, because you filter once and display the data many times. I still have various unfilter functions, which restore the parts of the data we need when we want an unfiltered or partially unfiltered string. Finally, and most importantly, I feel it is more secure. If the stored data is safe, you don’t have to constantly remember to filter when you display it. Doorbell and Smibs are large applications, with plenty of spots where you could forget to place the all-important “h” in front of your variable. It just seems safer to me to store a string that can do no damage and convert it to a dangerous string in the few instances when you need one, rather than the reverse.
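For readers who want a feel for the shape of this before the next post, here is a bare-bones sketch of pre-filtering using only Ruby’s standard library. It is not the code from Doorbell or Smibs: the method names and the plain hash standing in for the database are assumptions, and CGI.unescapeHTML plays the role of a very simple unfilter function.

```ruby
require "cgi"

# Hypothetical save step: escape once, then store the safe string.
# `store` is a plain Hash standing in for a database table.
def save_comment(store, id, raw_text)
  store[id] = CGI.escapeHTML(raw_text)
end

# Display step: the stored string is already safe, so there is nothing
# to remember here.
def comment_for_display(store, id)
  store[id]
end

# The rare case where the original text is needed: reverse the escaping.
def comment_unfiltered(store, id)
  CGI.unescapeHTML(store[id])
end

store = {}
save_comment(store, 1, "<b>hi</b> & bye")
puts comment_for_display(store, 1)   # prints: &lt;b&gt;hi&lt;/b&gt; &amp; bye
puts comment_unfiltered(store, 1)    # prints: <b>hi</b> & bye
```

The escaping cost is paid once at save time, and the dangerous form of the string only exists where comment_unfiltered is called explicitly.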

In my next post I’ll go into the code and show how I implemented the pre-filtering in our applications.

______________________

What are your thoughts on pre-filtering vs. post-filtering? Did I miss any important points?

Filed under: Smibs on Code, Technology
  1. Smibs Grow Smart Blog May 28, 2009 at 10:39 am

    [...] my last post for “Smibs on Code” I discussed the tradeoffs between pre-filtering and post-filtering user [...]
