CodeIgniter Forums

Full Version: When to use the XSS filter?
In the security part of the documentation, it says

Quote: XSS filtering should only be performed on output. Filtering input data may modify the data in undesirable ways, including stripping special characters from passwords, which reduces security instead of improving it.

That part links to the documentation for the security class, which says

Quote: Note: This function should only be used to deal with data upon submission. It’s not something that should be used for general runtime processing since it requires a fair amount of processing overhead.

What should I do?
The documentation in the Security class is likely at least partially left over from a previous version. Additionally, the wording could be improved in both portions of the documentation, as xss_clean() is not "filtering" the data in the way that term is typically used when discussing security.

The mantra for secure handling of data is "filter input, escape output". This tends to be confusing, because we might see xss_clean() referred to as XSS filtering or we think of input strictly as user input from a form and output strictly as output to a view.

"filter input" can also be thought of as validation. Make sure the data that comes in is valid for the intended purpose. If a field is supposed to be a number, don't allow letters. If a field is supposed to be an email address, make sure that it's at least conforming to the basic requirements of an email address. If any of the data is not valid, reject it. It's up to you whether any given piece of data is important enough to reject all of the input or just the invalid part.

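To make the validation idea above concrete, here is a minimal sketch using PHP's built-in filter_var(). The function name validate_signup() and the field names 'age' and 'email' are made up for illustration; they are not part of CodeIgniter, whose Form Validation library would normally do this job.

```php
<?php
// Hypothetical helper: validate a small form submission and reject anything invalid.
// The function name and field names are assumptions for illustration only.
function validate_signup(array $input): array
{
    $errors = [];

    // A numeric field must be a whole number in a sensible range; letters are rejected.
    $age = filter_var($input['age'] ?? '', FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 0, 'max_range' => 150],
    ]);
    if ($age === false) {
        $errors['age'] = 'Age must be a whole number between 0 and 150.';
    }

    // An email field must at least conform to the basic shape of an address.
    $email = filter_var($input['email'] ?? '', FILTER_VALIDATE_EMAIL);
    if ($email === false) {
        $errors['email'] = 'Email address is not valid.';
    }

    return [
        'valid'  => $errors === [],
        'errors' => $errors,
        // Only hand back data when every field passed.
        'data'   => $errors === [] ? ['age' => $age, 'email' => $email] : [],
    ];
}
```

This version rejects the whole submission if any field fails; returning only the valid fields instead would be an equally reasonable choice, depending on how important each piece of data is.
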
Additionally, "input" is form data, $_POST data, $_GET data, or the URI, which is all more or less what we expect, but it's also the data retrieved from your database, from a file, or basically anywhere else. You have to define your own level of trust at some point, but at the very basic level the arguments passed to a public method in a controller, data retrieved from the database, data retrieved from a file, or data received from a third-party library should be treated the same as form/$_POST/$_GET/URI data (especially public controller methods, since they are accessible via the URI).

"escape output" means you process the data to be output and escape anything in that data which might be dangerous in the destination format. The method of escaping data is specific to the output format, as is the potentially dangerous content which must be escaped. In most cases, the text used to perform a SQL-injection attack is perfectly safe to output to HTML, just as the text used to perform an XSS attack is perfectly safe to store in the database. Most escape routines can only be performed on the data once, and some can be difficult to reverse. They also have the potential to render the data useless for other purposes.

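As a rough illustration of escaping being specific to the destination, the sketch below takes one untrusted string and prepares it for three different outputs using standard PHP functions. The sample value and the example.com URL are assumptions; SQL is left out because it would normally be handled by query bindings or the query builder.

```php
<?php
// The same untrusted value, escaped differently for each destination format.
$comment = '<script>alert("x")</script> & more';

// HTML text or attribute value: convert markup characters to entities.
$forHtml = htmlspecialchars($comment, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// URL query component: percent-encode the value.
$forUrl = 'https://example.com/search?q=' . rawurlencode($comment);

// JSON that may be embedded in a page: hex-escape characters significant to HTML.
$forJson = json_encode($comment, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

// Escaping a second time changes the data again, which is why it should
// happen once, at the point of output, and not before storage.
$escapedTwice = htmlspecialchars($forHtml, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo $forHtml, PHP_EOL, $forUrl, PHP_EOL, $forJson, PHP_EOL;
```

The original $comment is untouched throughout, so it remains usable for any other output format later.
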
As hinted above, we commonly think of the "output" portion as an HTML page, or maybe JSON or XML in specific instances. However, our SQL queries (or those generated by query builder) are also a form of output. The data inserted or updated in the database is a form of output. We may also write to a file or generate a file for download. We can even think of anything we pass to a third-party library as output. Even a redirect should be thought of as output, since we output a set of headers telling the user where to go next and quit.

So, xss_clean() should be used on data to be output to an HTML page, or any other format in which it proves useful. However, you need to consider the performance implications and the risk of the data to be output. If the risk is high but the performance is a potential issue, you may need to consider caching the output of xss_clean(), but you need to do it in a manner which permits purging/regenerating the cache if xss_clean() is updated or additional measures are required to escape the output.
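
One way the caching idea could look, as a sketch only: the cache key includes a version string, so bumping the version regenerates every entry after the cleaning routine changes. The class CleanedOutputCache, its constructor arguments, and the version strings are hypothetical, and the closure is a stand-in for a call such as $this->security->xss_clean($raw).

```php
<?php
// Hypothetical cache wrapper around an expensive cleaning routine.
// Nothing here is CodeIgniter API; names are assumptions for illustration.
final class CleanedOutputCache
{
    /** @var array<string, string> in-memory store; a real application would persist this */
    private array $store = [];

    /** @var callable(string): string */
    private $cleaner;

    private string $cleanerVersion;

    public function __construct(callable $cleaner, string $cleanerVersion)
    {
        $this->cleaner = $cleaner;
        $this->cleanerVersion = $cleanerVersion;
    }

    public function get(string $raw): string
    {
        // The version is part of the key, so a new version never reuses old entries.
        $key = $this->cleanerVersion . ':' . hash('sha256', $raw);

        if (!isset($this->store[$key])) {
            $this->store[$key] = ($this->cleaner)($raw);
        }

        return $this->store[$key];
    }
}

// Usage: count how often the (stand-in) cleaner actually runs.
$calls = 0;
$cleaner = function (string $raw) use (&$calls): string {
    $calls++;
    return htmlspecialchars($raw, ENT_QUOTES, 'UTF-8'); // stand-in, not xss_clean()
};

$cache  = new CleanedOutputCache($cleaner, 'v1');
$first  = $cache->get('<b>post</b>');
$second = $cache->get('<b>post</b>'); // served from the cache, cleaner not called again
```

The raw data stays in the database unchanged; only the cleaned copy is cached, which is what makes purging and regenerating safe.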
Thank you for your thorough answer. I guess I need to start benchmarking in order to know whether or not xss_clean() is appropriate as is for my output, or if I need to cache it. The content is user submitted (think blog posts, comments etc.), so I guess it's best practice to assume that it is malicious.

Adding to the confusion, the documentation suggests in another place that one may want to put


PHP Code:
$config['global_xss_filtering'] = TRUE;


in the config file, to run XSS filtering on all input.
Thanks for reporting these inaccuracies, it was indeed just leftover pieces from older times. Here are some quick improvements:

https://github.com/bcit-ci/CodeIgniter/c...3e64187b7b
https://github.com/bcit-ci/CodeIgniter/c...a5030340ef

Please don't use 'global_xss_filtering'. The document that describes it also says that it is DEPRECATED.
I must admit that this is something that still causes me issues too. I cannot quite get my head around filtering on output only. I understand, for instance, that if you filter on input it is impossible (or hard) to get back the original data if you needed to for some reason. I have also read a few samples of sites that had problems caused by doing it the way I normally do it, which is filtering on input. However I just can't seem to shift my own mind-set that I do not want untrusted data in my database! I also find that filtering on input only need happen in the few places input is accepted, but the data might be output in lots and lots of places.

As I said, I am not arguing that filter-on-output is wrong, just that I still can't really understand or appreciate why. Perhaps I need to re-read or read more on the topic. Or maybe I am just too set in my ways, even if they are not the most ideal way of doing things.

Best wishes,

Paul.

PS I missed the global_xss_filtering when I first used CI3 (stopped using it immediately when I saw the note about it being deprecated) but since then, I would not go back to using it now as I think dropping it has been a great benefit for me and my code. Perhaps this is an example of, if I start filtering on output exclusively, I would not want to go back once my mind set had been changed.

PPS On rereading the post from mwhitney above, I definitely forget about those other forms of output. Thank you for that post. I bet you get fed up explaining that again and again.
(09-01-2015, 05:01 AM) PaulD wrote: [...] However I just can't seem to shift my own mind-set that I do not want untrusted data in my database! I also find that filtering on input only need happen in the few places input is accepted, but the data might be output in lots and lots of places. [...]

PPS On rereading the post from mwhitney above, I definitely forget about those other forms of output. Thank you for that post. I bet you get fed up explaining that again and again.

It's no problem, and almost every time I post something on the subject I go out and search to try to find better examples or something that might make it easier to explain.

I tend to write models which allow me to change the data source relatively easily without modifying the controller, so I can sometimes be a little paranoid about the assumptions people make about their underlying data. In most of my controllers there's no way of telling whether the data came from a database, a file, or a 3rd-party API, and the data source could change tomorrow. Because the security requirements can sometimes be specific to the model, I will generally supply some rudimentary functionality to secure the data for output in common formats (or formats specific to the way that specific model is being used in the system), but I generally leave it up to the controller to apply those methods to the data where appropriate, and take additional precautions on its own.
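
A stripped-down sketch of that arrangement might look like the following. Every name here (CommentSource, ArrayCommentSource, CommentModel, escapeForHtml) is hypothetical and is not CodeIgniter's model API; the array-backed source simply stands in for a database, file, or API.

```php
<?php
// Hypothetical names throughout; a sketch of the idea, not CodeIgniter's model API.
interface CommentSource
{
    /** @return string[] raw, untrusted comment bodies */
    public function fetchAll(): array;
}

// One possible source; it could be swapped for a database- or API-backed one
// without the controller noticing.
final class ArrayCommentSource implements CommentSource
{
    private array $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function fetchAll(): array
    {
        return $this->rows;
    }
}

final class CommentModel
{
    private CommentSource $source;

    public function __construct(CommentSource $source)
    {
        $this->source = $source;
    }

    // Returns the data exactly as the source supplied it.
    public function all(): array
    {
        return $this->source->fetchAll();
    }

    // Rudimentary helper the controller may apply when the destination is HTML.
    public function escapeForHtml(string $value): string
    {
        return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    }
}

// Controller-side usage: the controller decides where escaping applies.
$model = new CommentModel(new ArrayCommentSource(['<b>hi</b>']));
$safe  = array_map([$model, 'escapeForHtml'], $model->all());
```

The model never escapes on its own here, so the same data can still be sent to a non-HTML destination with a different helper.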