[eluser]reconstrukt[/eluser]
This is a problem.
The ampersand character is a commonly used (VERY commonly used) string delimiter. There are multiple scenarios where this "semicolon insertion" is unacceptable.
----
Scenario #1: $_GET data
Let's say, for example, you want to pass a return_url in your querystring (assuming you're running CI with uri_protocol set to PATH_INFO, and global_xss_filtering to TRUE).
Code:
http://mysite.com/user/login?s=1&return_url=some_encoded_url
Because it's a good idea to cleanse GET data, let's run xss_clean() on $_SERVER['QUERY_STRING']. You'll get:
Code:
s=1&return;_url;=some_encoded_url
The "return_url" param becomes "return_url;".
----
Scenario #2: $_COOKIE data
Let's say you're reading a cookie with some delimited data (on a project I'm on now, we read top-level cookies [written by other apps] on subdomains). Again, assume you've got global_xss_filtering set to TRUE.
Let's say our cookie name is "sessdata" and the cookie data is:
Code:
u=reconstrukt&fn=Matt&uid=234&ts=20080716120000
Reading this through CI:
Code:
get_cookie('sessdata');
And you get back *altered key names*:
Code:
u=reconstrukt&fn;=Matt&uid;=234&ts;=20080716120000
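To make the effect easy to reproduce outside CI, here is a minimal sketch. It assumes the filter appends a semicolon to anything that looks like an entity name after an ampersand. The function name and the pattern are my own approximation for illustration, not code copied from CI's xss_clean().

```php
<?php
// Hypothetical stand-in for the entity-normalizing step (an assumption, not CI's actual code).
// Anything shaped like "&name" or "&#123" gets a trailing semicolon appended.
function append_entity_semicolons($str)
{
    return preg_replace('#(&\#?[0-9a-z]+)[\x00-\x20]*;?#i', '$1;', $str);
}

$cookie = 'u=reconstrukt&fn=Matt&uid=234&ts=20080716120000';

// Prints: u=reconstrukt&fn;=Matt&uid;=234&ts;=20080716120000
echo append_entity_semicolons($cookie), "\n";
```

The pattern can't tell "&fn" in delimited data from an unterminated entity like "&amp", which is why every key after the first one gets altered.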
----
Scenario #3: $_POST data
Let's say you've got a form with a field called "homepage_url", allowing a user to post their website (or any external URL) on their personal profile page. So, the user copies and pastes their link:
Code:
http://www.myblog.com?page=something&sort=2&a=1
Reading this through CI:
Code:
$this->input->post('homepage_url')
You get back a malformed URL:
Code:
http://www.myblog.com?page=something&sort;=2&a;=1
----
Now, I *love* CI. I've built a buncha sites with it (a few fairly high-profile). But this "add a semicolon so we can convert entities to ASCII later" is just plain wonky.
A recommendation? Don't make "global_xss_filtering" an all-or-nothing setting.
Instead, if we made this setting an enumerated list, you could leave filtering on for $_POST data, but leave it off for $_COOKIE data. From the config:
Code:
$config['global_xss_filtering'] = TRUE;
Becomes
Code:
$config['global_xss_filtering'] = array('POST', 'GET');
And a simple check in the Input class would leave the $_COOKIE superglobal untouched.
(This doesn't necessarily solve scenario #3 above, but is certainly an improvement.)
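A rough sketch of what that check might look like. The helper name should_xss_filter() is hypothetical and not part of CI. It assumes the setting is either the existing boolean or the proposed array of superglobal names, so current configs keep working.

```php
<?php
// Hypothetical helper (not part of CI): decide whether a given superglobal
// should be run through the XSS filter, based on the config value.
function should_xss_filter($setting, $source)
{
    // Proposed form: an array of superglobal names, e.g. array('POST', 'GET')
    if (is_array($setting)) {
        $enabled = array_map('strtoupper', $setting);
        return in_array(strtoupper($source), $enabled, true);
    }

    // Existing form: TRUE filters everything, anything else filters nothing
    return $setting === TRUE;
}

$setting = array('POST', 'GET');

var_dump(should_xss_filter($setting, 'POST'));   // bool(true)
var_dump(should_xss_filter($setting, 'COOKIE')); // bool(false)
var_dump(should_xss_filter(TRUE, 'COOKIE'));     // bool(true)
```

The Input class would then call something like this once per superglobal before cleaning it, instead of checking a single boolean.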
Matt