Welcome Guest, Not a member yet? Register   Sign In
xss_clean adds semicolon to anything with an &
#1

[eluser]dmorin[/eluser]
I've noticed that the xss_clean function seems to add a semi-colon to any string that contains an ampersand. For example, if someone writes "AT&T;Park" in a textarea and I run it through the xss_clean function, it will return "AT&T;Park".

Has anyone else seen this? Any recommendations? Thanks.

Edit: This is pretty funny.....they must use xss_clean for this forum, so both of the things above say AT&T;Park. What it should say is:

ATampersandT Park
#2

[eluser]Pascal Kriete[/eluser]
Encode it as an entity: AT&T
#3

[eluser]dmorin[/eluser]
Thanks, that has been my fix for now. It appears that the xss_clean provides additional security versus just html encoding the information though, so I have been running it through htmlspecialchars() first and then through xss_clean. Does anyone know if this is necessary?

It seems like it's an issue given that I couldn't post the correct information to this forum...
#4

[eluser]reconstrukt[/eluser]
This is a problem.

The ampersand character is a commonly used (VERY commonly used) string delimiter. There are multiple scenarios where this "semicolon insertion" is unacceptable.

----
Scenario #1: $_GET data

Let's say, for example, you want to pass a return_url in your querystring (assuming you're running CI with uri_protocol set to PATH_INFO, and global_xss_filtering to TRUE).

Code:
http://mysite.com/user/login?s=1&return_url=some_encoded_url

Because its a good idea to cleanse GET data, let's run xss_clean() on $_SERVER['QUERY_STRING']. You'll get:

Code:
s=1&return;_url;=some_encoded_url

The "return_url" param becomes "return_url;".

----
Scenario #2. $_COOKIE data

Let's say you're reading a cookie with some delimited data (on a project I'm on now, we read top-level cookies [written by other apps] on subdomains). Again, assume you've got global_xss_filtering set to TRUE.

Let's say our cookie name is "sessdata" and the cookie data is:

Code:
u=reconstrukt&fn=Matt&uid=234&ts=20080716120000

Reading this through CI:

Code:
get_cookie('sessdata');

And you get back *altered key names*

Code:
u=reconstrukt&fn;=Matt&uid;=234&ts;=20080716120000

----
Scenario #3. $_POST data

Let's say you've got a form with a field called "homepage_url", allowing a user to post their website (or any external URL) on their personal profile page. So, the user copies and pastes their link:

Code:
http://www.myblog.com?page=something&sort=2&a=1

Reading this through CI:

Code:
$this->input->post('homepage_url')

You get back a malformed URL

Code:
http://www.myblog.com?page=something&sort;=2&a;=1

----

Now, I *love* CI. I've built a buncha sites with it (a few fairly high-profile). But this "add a semicolon so we can convert entities to ASCII later" is just plain wonky.

A recommendation? Don't make "global_xss_filtering" an all-or-nothing setting.

Instead, if we made this setting an enumerated list, you could leave filtering on for $_POST data, but leave it off for $_COOKIE data. From the config:

Code:
$config['global_xss_filtering'] = TRUE;

Becomes

Code:
$config['global_xss_filtering'] = array('POST', 'GET');

And a simple check in the Input class would leave some the $_COOKIE superglobal untouched.

(This doesn't necessarily solve scenario #3 above, but is certainly an improvement.)

Matt
#5

[eluser]Pascal Kriete[/eluser]
reconstrukt, did you even try this?

The function protects GET variables (line 541 and 563):
Code:
$str = "http://mysite.com/user/login?s=1&return_url=some_encoded_url";
echo $this->input->xss_clean($str);

// Echos the url with no modifications
#6

[eluser]reconstrukt[/eluser]
What version CI are you on? Not seeing that on my end.

Try the other scenarios, you'll see what I mean. Till this is fixed, I'm subclassing the Input lib and making the 'global_xss_filtering' config key an array

Cheers
M
#7

[eluser]Pascal Kriete[/eluser]
I'm using CI 1.6.4

The regular expression it uses actually matches any &key=value&another=more pattern, so none of your examples should cause any problems at all.

EDIT: Note how the forum (which uses that same function) didn't change anything in my post, because it looks like a query string.
#8

[eluser]reconstrukt[/eluser]
We're on 1.6.0

Was this fix recently added? I'm familiar with xss_clean(), but not seeing a regex that'd match qs strings.

M
#9

[eluser]Pascal Kriete[/eluser]
It got a pretty major overhaul for 1.6.3 (1.6.4 was EE, I'm not magically ahead Smile ), if you have a test install of your current system, replace the input library with the newest version and see if you run into any problems. Should work without any major issues.
#10

[eluser]reconstrukt[/eluser]
We haven't changed the core, subclassed libs if we needed to make mods. So I will give it a shot starting next week and post here (if anyone cares.)

cheers inparo.
M




Theme © iAndrew 2016 - Forum software by © MyBB