mwhitney · 03-22-2016, 08:42 AM

The XSS filter is ultimately a mechanism for escaping data for use within HTML content. In other words, it is not a filter in the sense that word has in the security mantra "filter input, escape output".

Another source of confusion is that "filter input, escape output" sometimes leads people to believe it is a two-step process: filter the data when it is received from the user, and escape it when it is displayed to the user. In reality, the process applies to every interaction with data. You filter input from the user, then escape it before saving it to the database, which prevents various SQL errors and SQL-based attacks (however, you have to escape the data specifically for the database, so you would not use a function like html_escape() for this purpose). When you retrieve data from the database to display to the user, you filter the input from the database, then escape it for the output you are currently producing, whether that is HTML, JSON, part of a URI, or some document format (e.g. PDF).
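
To make the "escape for the specific output" point concrete, here is a rough sketch. It is written in Python rather than CodeIgniter/PHP purely for illustration; the table name, the sample comment, and the variable names are all made up, and a bound query parameter stands in for "escaping for the database".

```python
import html
import json
import sqlite3
from urllib.parse import quote

# Hypothetical user input containing characters that matter in several contexts.
comment = "Tom & Jerry's <b>\"best\"</b> episode"

# Database context: a bound parameter lets the driver handle the value,
# so the stored text is exactly what the user typed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")
conn.execute("INSERT INTO comments (body) VALUES (?)", (comment,))
stored = conn.execute("SELECT body FROM comments").fetchone()[0]

# Output contexts: the same stored value is escaped differently each time.
as_html = html.escape(stored)            # for HTML content
as_json = json.dumps({"body": stored})   # for a JSON response
as_url = quote(stored, safe="")          # for a URI component
```

The stored value stays untouched, and each output gets its own escaping at the moment it is produced. None of the three escaped forms is a valid substitute for another.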

Finally, there is some confusion (and disagreement) on the intended meaning of "filter input". In the sense of the strictest security needs, input may be whitelisted, and any input which does not match the whitelist is rejected outright. So, you would define a set of validations which state the allowed characters, the minimum and maximum length of the data (or, for numeric values, the minimum and maximum values and whether floating point values are permitted, and, if so, their precision), and similar validations which define what the data must be in order to be permitted.
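
As a minimal sketch of that whitelist approach (again in Python for illustration only; the field names, the allowed character set, and the limits are assumptions I picked for the example, not rules from any framework):

```python
import re

# Assumed rule: usernames are 3 to 20 characters from letters, digits, underscore.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_username(value: str) -> bool:
    """Return True only if the whole value matches the whitelist."""
    return USERNAME_PATTERN.fullmatch(value) is not None


def validate_quantity(value: str, minimum: int = 1, maximum: int = 100) -> bool:
    """Accept only a plain integer (no sign, no decimals) within the range."""
    if re.fullmatch(r"[0-9]{1,9}", value) is None:
        return False
    return minimum <= int(value) <= maximum
```

Anything that fails a check is rejected outright; nothing is repaired or trimmed to make it fit.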

When the requirements are slightly less strict, you might use a blacklist to filter the input. A blacklist defines what the data cannot be, so the data is rejected if certain characters are encountered. However, unless there are very few potentially dangerous inputs, this can lead to an ever-increasing list of dangerous values to reject.

An even less secure method of filtering input, and one which probably most closely matches what people think of when they read "filter input, escape output", is to eliminate only the blacklisted portion(s) of the input, rather than rejecting the input. In other words, if the character '<' is not permitted, then it would be removed from the data, but everything else would remain.
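
The difference between rejecting and stripping blacklisted characters can be sketched like this (Python for illustration; the blacklist contents and function names are hypothetical):

```python
# Assumed blacklist for the example: only the two angle brackets.
BLACKLIST = {"<", ">"}


def reject_blacklisted(value: str) -> str:
    """Blacklist as validation: refuse the whole input if it contains a listed character."""
    if any(ch in BLACKLIST for ch in value):
        raise ValueError("input contains a forbidden character")
    return value


def strip_blacklisted(value: str) -> str:
    """Blacklist as sanitization: silently drop the listed characters, keep the rest."""
    return "".join(ch for ch in value if ch not in BLACKLIST)


original = "if a < b then x > y"
stripped = strip_blacklisted(original)
```

After stripping, "a < b" and "a > b" both become the same string, so the meaning of the input is lost and the user is never told.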

Using xss_clean() on input is essentially taking the least secure of these options a step further (in the direction of reduced security) and escaping the blacklisted values in the input rather than eliminating them. As is the case with eliminating blacklisted characters, you can't recover the original input for use in other contexts, and the user often has no idea how or why the data was changed (or in some cases, even that the data changed).
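
A small sketch of why escaping on input loses information. The escape_on_input() function below is a deliberately simplified, hypothetical stand-in that only escapes angle brackets; it is not how xss_clean() is implemented, and the example is in Python only for illustration.

```python
import html


def escape_on_input(value: str) -> str:
    """Hypothetical stand-in for input-time escaping (NOT xss_clean() itself)."""
    return value.replace("<", "&lt;").replace(">", "&gt;")


# Two different things the user might have typed...
typed_tag = escape_on_input("1 < 2")
typed_entity = escape_on_input("1 &lt; 2")

# ...are indistinguishable once stored, so the original cannot be recovered.
same_after_storage = typed_tag == typed_entity

# Escaping properly at output time then escapes the stored value a second time.
rendered = html.escape(typed_tag)
```

With this stand-in, both inputs are stored as the same string, and correct output escaping later turns the stored entity into visible "&lt;" text on the page.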

***Narf*** · 03-22-2016, 11:21 AM

(03-22-2016, 08:42 AM)mwhitney Wrote: Finally, there is some confusion (and disagreement) on the intended meaning of "filter input". In the sense of the strictest security needs, input may be whitelisted, and any input which does not match the whitelist is rejected outright. So, you would define a set of validations which state the allowed characters, the minimum and maximum length of the data (or, for numeric values, the minimum and maximum values and whether floating point values are permitted, and, if so, their precision), and similar validations which define what the data must be in order to be permitted.

This confusion is at the very core of the problem - people need to know the difference between "filtering" and "validation".

Filtering, or sanitization, is what xss_clean() does - trying to strip only the invalid parts of the data.
Validation is what should be done with inputs - completely rejecting data that is not 100% valid.

Problems like this one are why correct terminology is important.