Using xss_clean() for displaying content from WYSIWYG editor
#1

Hi friends,

I have a web application that accepts comments through a WYSIWYG editor (with basic formatting such as bold, italics, headings, lists, etc.).

I am trying to also allow users to enter manually typed HTML, like this:
Quote:<b>This is manually typed content</b>
This is formatted content

Here the first line is HTML typed manually by the user, and the second line is text formatted using the WYSIWYG editor.

This input is then stored in my database as follows (with the manually typed HTML entity-encoded):
Quote:&lt;b&gt;This is manually typed content&lt;/b&gt;
<b>This is formatted content</b>

Now, before displaying it, I run this through "$this->security->xss_clean()", and the output of xss_clean() is:
Quote:<b>This is manually typed content</b>
<b>This is formatted content</b>

So when I echo this in the browser, both lines come out as bold text, whereas I want only the editor-formatted text to be treated as HTML.

Another issue is that xss_clean() URL-decodes the text passed to it, so in some cases text containing "%" followed by certain letters gets converted to a special character.

I understand that a lot of what xss_clean() does is for security reasons, and thanks to it my app has never suffered an XSS attack so far. But I would like an alternative that preserves HTML entities and does not URL-decode text while filtering for XSS.
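
To illustrate the "%" problem, here is a minimal sketch in plain PHP. It only uses rawurldecode() and assumes that the decoding inside xss_clean() behaves the same way for these inputs; the sample strings are made up.

```php
<?php
// Plain PHP, no CodeIgniter required.
// Assumption: xss_clean() decodes "%xx" sequences the way rawurldecode() does.
$samples = [
    'up 30%before tax',  // "%be" is a valid hex pair, so it becomes one raw byte
    'up 30% before tax', // "% b" is not a hex pair, so nothing changes
    'up 30%more',        // "%mo" is not a hex pair either
];

foreach ($samples as $sample) {
    $decoded = rawurldecode($sample);
    // Report whether decoding altered the text.
    printf("%-20s changed: %s\n", $sample, $decoded === $sample ? 'no' : 'yes');
}
```

Only the first sample is altered: "%be" turns into the single byte 0xBE, which is not valid UTF-8 on its own and so shows up as a stray special character.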

So my question is: is anyone using a good alternative to xss_clean() that solves these two issues while still filtering for XSS?
I am aware of HTML Purifier, but would like to know if there are any other good alternatives.


P.S. I am aware that the % sign issue has been reported on CI's GitHub issues and is marked closed, because removing the decoding is a security concern: https://github.com/bcit-ci/CodeIgniter/issues/5125
I haven't found anything else related in CI's GitHub issues, and searching for "xss_clean" returns a lot of results on this forum. I went through a few pages but couldn't find anything relevant, so forgive me if this has already been discussed in a topic I missed.
Thanks!
#2

I guess that's why using meta-tags like [b] on forums is so popular: you can convert all < and > characters to &lt; / &gt; to avoid any script execution, then replace the meta-tags with actual HTML tags. Markdown is quite popular as well. Both use their own syntax first, THEN convert to HTML.
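
A rough sketch of the meta-tag idea in plain PHP, in case it helps. The function name render_meta_tags() and the tiny whitelist are made up for illustration; a real forum would need more tags and handling for nesting.

```php
<?php
// Hypothetical helper: escape everything, then re-enable a small whitelist of meta-tags.
function render_meta_tags(string $input): string
{
    // Neutralise all raw HTML first, so a typed "<b>" is displayed literally.
    $escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

    // Assumed minimal whitelist: meta-tag name => HTML tag name.
    $allowed = ['b' => 'b', 'i' => 'i', 'u' => 'u'];

    foreach ($allowed as $meta => $html) {
        // Non-greedy match of [tag]...[/tag], case-insensitive, across newlines.
        $pattern = '#\[' . $meta . '\](.*?)\[/' . $meta . '\]#is';
        $escaped = preg_replace($pattern, '<' . $html . '>$1</' . $html . '>', $escaped);
    }

    return $escaped;
}

// The hand-typed tag stays visible as text, the meta-tag becomes real markup.
$rendered = render_meta_tags("<b>typed by hand</b>\n[b]formatted[/b]");
```

Because the escaping happens before any tags are generated, the only HTML in the output is what the whitelist itself produced.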

I guess if it's stored in the DB correctly, and it is xss_clean that replaces &lt; with <, you could replace &lt; with -my-random-less-than-replacement- before xss_clean, then revert it afterwards. Quite an ugly solution, but it could get you going.
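
Something along these lines, as a sketch only. The wrapper name clean_preserving_entities() is made up, and $cleaner stands in for whatever filter you use; inside a CodeIgniter controller you could presumably pass [$this->security, 'xss_clean']. I only mask &lt; and &gt; on purpose: hiding numeric entities from the filter could let obfuscated attribute values slip through.

```php
<?php
// Hypothetical wrapper around any XSS filter passed in as $cleaner.
function clean_preserving_entities(string $html, callable $cleaner): string
{
    // Random token, so user input cannot predict or forge the placeholder.
    $token = 'ENT' . bin2hex(random_bytes(8)) . 'X';

    // Hide the ampersand of &lt; and &gt; so the filter sees plain text instead.
    $masked = preg_replace('/&(lt|gt);/i', $token . '$1;', $html);

    // Run the real filter on the masked string.
    $cleaned = $cleaner($masked);

    // Put the ampersands back, restoring the original entities.
    return str_replace($token, '&', $cleaned);
}

// Stand-in filter for this demo: it decodes entities, which is the
// behaviour described in this thread that we want to work around.
$fakeCleaner = function (string $s): string {
    return html_entity_decode($s, ENT_QUOTES, 'UTF-8');
};

$stored = "&lt;b&gt;typed by hand&lt;/b&gt;\n<b>formatted</b>";
$safe = clean_preserving_entities($stored, $fakeCleaner);
```

With the stand-in filter, $safe comes back identical to $stored, so the hand-typed line still displays as literal text. Whether the token survives a given real filter untouched is something you would have to verify.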
#3

Thanks @Pertti!

Yeah, using Markdown or meta-tags can avoid these issues. Unfortunately, it is not possible for me to switch to them for this project.

Regarding the replacement method: yes, that is something I had thought of too, but I only want to go for it if I don't find something better.

HTMLPurifier seems like a good alternative, but it doesn't seem to be regularly updated or maintained, which is my major concern.
Also, it doesn't support HTML5 yet.

So I am still keen to know if someone has come across this kind of issue, what library they used, and maybe what their experience with it was.