CodeIgniter and XSS protection
#11

[eluser]Padraic Brady[/eluser]
[quote author="Kenji @ CodeIgniter Users Group in Japan" date="1305298049"]Padraic, thank you for your comments, and your blog article.

The disclosure of the incident is not enough, I feel. Official Information about the security bug is too little.

And the CI documentation about XSS protection is not good; at the least, the "best security practice" at http://ellislab.com/codeigniter/user-gui...urity.html is not the best.

I think that improving the CI documentation benefits all CI users.[/quote]

I agree, and updating the documentation would take very little effort for potentially a lot of gain in educating users on how to write more secure apps with CI. That can only benefit CI's already positive reputation.
#12

[eluser]toopay[/eluser]
I really don't know what impression you had when you wrote that article. If you had given yourself more time to understand what I said, instead of hastily responding to only one part of what I wanted to say, you would realize that I did not reject a single one of the security techniques in the conclusions that emerged from your research on XSS attacks. And of course, that research deserves to be appreciated.

In fact, I'm just trying to give a more thorough understanding of data integrity policy as the basis (and as a complement to what you wrote in your blog) for when people start talking about XSS attacks. Without it, to me, it is like inviting people to discuss an exciting game of football without giving them a basic understanding of the rules (which most of them may not know).

I am not here to talk or act like a lawyer defending EllisLab. It's just a little funny to see that the emphasis of your writing is more on demanding that they include every point of your "security research" details than on promoting what developers need to know when they manage their data integrity policy directly. And as I told you, and this is purely my opinion, such childish demands are more like a mechanic who blames his tools (and also the stores that sell these tools, even though in fact they give them away for free).
#13

[eluser]Padraic Brady[/eluser]
Has it occurred to you that I didn't quote the first part of your response because I agreed with it? :P I have no dispute with what you said. What I did quote was something you said made no sense to you, so I added a clarification in the event there were any misunderstandings. I didn't want to go too far off course from the original topic.

In the same way, I'll clarify that my "demands" are simple recommendations. I'm not holding a gun to anyone's head - just offering the benefit of some 14 years of PHP and security experience for consideration.

I apologise if my references to a "security report" are a bit vague. I don't wish to disclose the full details of the vulnerabilities or the original report given EllisLab has stated in the past they prefer not to, and I want to respect their wishes in this regard. Hence my article doesn't carry the usual detail I tend to disclose and limits itself to a quick overview of the count and generic nature of what was reported.
#14

[eluser]boltsabre[/eluser]
Interesting discussion... I've recently completed a 1.5yr Diploma IT (Website Development) course (note: not a full Bachelor Degree of Computer Science, where I imagine things would have been taught more in depth. And to further compound the problems I'm about to mention, my course was initially a 2 year course, but my institution condensed it into 1.5 yrs...hmmphfff).

1.5 years is not much when you consider you have to cover topics such as HTML, JavaScript, PHP, a FrameWork, Relational Databasing, (My)SQL, UML, Project Management (and an array of other 'interpersonal, business, costing, budgets, Service Level Agreements, etc' kind of subjects), Online Privacy, Risk Analysis, Reporting/Analysis tools (ie google analytics), logs, Testing and so on. And that's not even to mention stuff like a JavaScript Framework/Library, SEO/SEM, .htaccess, JSON (etc etc) were not even touched upon. As such, unfortunately, Security (like the rest of the subjects we studied) was skimmed across in a very quick fashion.

As such, I'm going to say that both 'opinions/contributions' made by Toopay and Padraic Brady are of utmost importance and interest to me. Trying to figure out 'what constitutes good security' in today's day and age is almost impossible; there are wide-ranging conflicts of opinions and methods on every blog you read.

Having a good solid "data integrity policy" is imperative, but at the same time I fully agree that the documentation on CI is rather lacking, especially for someone, like myself, who is relatively new to the game... For example, I had to ask on the forums here about CSRF - the documentation is only 3 lines long.

A good tutorial on how to safely and accurately implement the myriad of security goodies CI has given us, combined with an overview of 'why' it is important to implement them in the first place (and perhaps even covering how (and WHY) to write your own custom form filters and extend the base validation class, thus tackling the filtering issue at its base), would, I believe, be of great benefit to this community. I know I would definitely be watching/reading it!

I shudder to think what kind of applications some 'novices/hacks' are making using CI (or any framework or just a plain old text editor for that matter) considering how hard I'm finding the whole 'security, filtering, cleansing, input, output' minefield. Whilst I understand it IS NOT CI'S RESPONSIBILITY OR PROBLEM, as mentioned, a good tutorial and improved documentation would be invaluable... then when 'novice' questions arise in the forums we could just point them in the right direction so that they can build better and safer applications, thus leaving the forum for the more important and technical aspects of this complex and daunting issue.

Just my 2 cents worth from the perspective of someone struggling, but trying hard, to build safe applications for my beloved users.
#15

[eluser]Padraic Brady[/eluser]
Security is always a complex topic. You'll probably find when it comes to programming languages specifically that we focus largely on the practical elements of what to do, when to do it, and how to do it - and less on WHY to do it. The WHY is actually a giant minefield. I know very knowledgeable programmers who when confronted with WHY are at a loss to explain it.

Take a simple example (well, relatively simple ;)). If you use Google Code Search, you can find examples of tons of PHP classes and libraries connecting to HTTPS URLs (Twitter's API for example). Now this seems a simple operation! Boot up curl and away you go. If you look really hard though, you'll notice that many clients pass in a curl config to disable something called Peer Verification (the CURLOPT_SSL_VERIFYPEER option). This option is explained almost nowhere in depth for PHP. In fact, PHP Streams disable something like this by default (just as Curl enables it by default). What does it do? It verifies the SSL certificate of the server you contact - i.e. it verifies you really are talking to Twitter, and not some hacker with a fake SSL cert who's interested in your Twitter API session ;). So, if you disable SSL Peer Verification, it's an insecure connection. Yet, it's commonly done.

Try explaining the WHY of that to most developers, and their eyes will glaze over ;). Many web developers probably don't know enough about how SSL works to even see that it is a problem. Often, it's just easier to document it very quickly (in a short paragraph as above) and then basically give them a good rule to follow (NEVER DISABLE CURLOPT_SSL_VERIFYPEER!).

I still know some developers I admire who struggle with this ;). You can imagine how complex it can get as the range of security topics increases (as it always does; look up the remote timing vulnerabilities that I blogged about recently for an example of how the WHY is so damn complex - it's a long post :)).
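As a rough sketch of that rule in practice, the snippet below builds request settings with verification explicitly switched on, once for curl and once for PHP streams. The function names and the example URL are made up for illustration; the option names are the standard curl constants and `ssl` stream context options. Newer PHP versions (5.6 and later, as far as I know) verify peers by default for streams too, so the explicit settings mainly document the intent.

```php
<?php
// Hypothetical helper: curl options for an HTTPS request with
// certificate verification left ON.
function secure_curl_options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_SSL_VERIFYPEER => true, // check the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,    // check the certificate matches the host name
    ];
}

// Hypothetical helper: the same idea for PHP streams
// (file_get_contents(), fopen() and friends).
function verified_https_context()
{
    return stream_context_create([
        'ssl' => [
            'verify_peer'      => true,
            'verify_peer_name' => true,
        ],
    ]);
}

// Usage (placeholder URL, not executed here):
// $ch = curl_init();
// curl_setopt_array($ch, secure_curl_options('https://api.example.com/'));
// $body = curl_exec($ch);
// if ($body === false) { error_log(curl_error($ch)); }
// curl_close($ch);
//
// $body = file_get_contents('https://api.example.com/', false, verified_https_context());
```

If a request fails with verification on, the usual cause is a missing or outdated CA bundle on the client, and fixing that is safer than turning the check off.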
#16

[eluser]toopay[/eluser]
I do not think new developers will find it any easier to understand this topic (XSS) when someone brings up "Security is always a complex topic" as a preface. I also still wonder why there are people on this forum who feel the need to mention how many years they have been dealing with PHP (or, if necessary, to state how much they love the preg_match() and/or levenshtein() functions, and so on) rather than directly providing information related to the issue at hand. It's not that the only thing I can be proud of is my hobby, playing the melodica, and so I cannot mention it on this forum. It's just that I think the latter approach is more useful for others. Seriously, if you give a response along the lines of "I lived in the COBOL era, stay away from me, I'm a geek! I can even eat your brain." or "Stop asking, did you know that in bursts I can reach 200 words per minute? At that speed, with my Vim, I can write all of your personal information, including your sandwich preference, roast beef and Swiss, in less than ten seconds!", that will stop us from taking the discussion further and elaborating on it.

I don't have a personal blog to write on (maybe some day I'd like to have a music blog, or one on any topic not related to programming at all), so since this thread is about XSS attacks, maybe I can explain or share something here, a nutshell version of what it really means.

[Cross-Site Scripting]

Cross-site scripting, or XSS, is the name given to attempts to attack a site by submitting data that will then be displayed back to other users with undesirable effects. This covers everything from messing with stylesheets to capturing users' password input. 'XSS holes' are places in an application's implementation that allow user data to be treated as untainted when either the data is still fully tainted or the untainting process was ineffective.

In October 2005 we saw the first large-scale XSS worm, which attacked myspace[dot]com. In around 20 hours the worm spread to infect over one million accounts (it added the text "Samy is My Hero" to account profiles).

In October 2007, a website security statistics report by Jeremiah Grossman, founder and CTO of WhiteHat Security, stated that 7 out of 10 websites were vulnerable to XSS attacks (the report is based on data obtained between January 1, 2006 and July 31, 2007).

XSS was and still is a hot topic, and the attention it has started to receive will ensure that there are hundreds more people out there adding it to their arsenal of exploits. As attention grows, so does the likelihood of any holes in your apps being found and exploited, which demands more concern.

"How do we deal with this evil?" Keep the following points in mind...

Canonical Hole

Even if you're not planning on displaying user-entered HTML data in your app, you can easily fall prey to the most common HTML-based XSS hole. Usually within an app you'll be passing data around between pages in the form of HTTP GET or POST parameters. For example, a user accesses a page, passing along an 'id' in the GET query string, which is then stashed into another link on the page to let him navigate elsewhere, like this:
Code:
... see <a href="/photos/?id=1">Geek Man's Photo</a>
while our PHP source may look like
Code:
see <a href="/photos/?id=<?php echo $_GET['id'] ?>">Geek Man's Photo</a>
With the code as is, all any attacker has to do to inject HTML into your pages is pass it along in the query string (remove the backslashes to get the actual URL): http://yourapp.com/photos/?id="><\script>alert('hijacked by newbie hacker');<\/script><

What have we learned here? Unlike Perl, for example, PHP doesn't have a taint mode built in, so we need to manage tainting ourselves. A good rule of thumb is to declare certain variables as tainted and always escape values that come from them. This tends to create a data integrity policy of not trusting any data (as InsiteFX stated above in his 6 words), which is a good mindset to have.

It's important to also note that it's not just the usual suspects ($_GET, $_POST, $_COOKIE) that should be considered user-entered and thus tainted. The $_SERVER and $_ENV superglobals don't contain data purely from the server and its environment. E.g. $_SERVER['HTTP_HOST'] and $_SERVER['REQUEST_URI'] both come directly from the client's request.

Anything you haven't pulled out of the database or explicitly filtered must be escaped when outputting HTML. So it's very important that, as developers, we have a full understanding of the risks involved and of how to avoid exposing users to unfiltered user-entered data.

[to be continued]
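To make the "always escape tainted values" rule concrete, here is a minimal sketch of the photo link with the 'id' value escaped on output. The helper name photo_link() is invented for this example; rawurlencode() and htmlspecialchars() are standard PHP functions. The value is encoded twice on purpose: once for the query string it sits in, and once for the HTML attribute that holds the finished URL.

```php
<?php
// Hypothetical helper: builds the link from the example above without
// letting the tainted 'id' value break out of the href attribute.
function photo_link(array $query): string
{
    // Treat anything that is not a plain scalar as missing.
    $id = (isset($query['id']) && is_scalar($query['id'])) ? (string) $query['id'] : '';

    // rawurlencode() keeps the value inside the query string;
    // htmlspecialchars() keeps the finished URL inside the attribute.
    $href = '/photos/?id=' . rawurlencode($id);

    return '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">Geek Man\'s Photo</a>';
}

// In a real page this would be photo_link($_GET).
echo photo_link(['id' => '"><script>alert(1);</script><']), "\n";
```

If the id is always supposed to be numeric, casting it with (int) before use is an even simpler filter; escaping on output is still worth keeping as the last line of defence.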
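A small sketch of what treating $_SERVER as tainted can look like. Both helper names and the allowed host list are assumptions made up for this example: the Host header is only accepted when it matches a list you control, and the request URI is escaped before it goes anywhere near HTML.

```php
<?php
// Hypothetical helper: only accept a Host header that is on our own list.
function trusted_host(array $server, array $allowedHosts, string $fallback): string
{
    $host = (isset($server['HTTP_HOST']) && is_string($server['HTTP_HOST']))
        ? strtolower($server['HTTP_HOST'])
        : '';

    return in_array($host, $allowedHosts, true) ? $host : $fallback;
}

// Hypothetical helper: the request URI, escaped for output in HTML.
function request_uri_html(array $server): string
{
    $uri = (isset($server['REQUEST_URI']) && is_string($server['REQUEST_URI']))
        ? $server['REQUEST_URI']
        : '/';

    return htmlspecialchars($uri, ENT_QUOTES, 'UTF-8');
}

// Usage in a real request (host names are placeholders):
// $host = trusted_host($_SERVER, ['yourapp.com', 'www.yourapp.com'], 'yourapp.com');
// echo 'You are here: ', request_uri_html($_SERVER);
```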
#17

[eluser]toopay[/eluser]
[continuation from previous post]
Tag and Bracket Balancing

But there's a deeper problem we need to address first. Consider the following inputs:
Code:
// remove the backslashes to get the actual input
<\script<\script>>
<<\script>script<\script>>
<scr<!-- foo -->ipt>
With any of these inputs, if you already have a regular expression that filters the input tag by tag, for example, the app will let the evil script tag through. So we need to ensure that the input we're processing for tags doesn't contain angle brackets that aren't connected to a tag. The naive solution is to balance the brackets first, like so:
Code:
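// Collapse runs of closing brackets into a single one
$data = preg_replace("/>>+/",">",$data);
// Collapse runs of opening brackets into a single one
$data = preg_replace("/<<+/","<",$data);
// Drop a closing bracket at the very start of the input
$data = preg_replace("/^>/","",$data);
// Close any tag that is opened but never closed
$data = preg_replace("/<([^>]*?)(?=<|$)/","<$1>",$data);
// Open any tag that is closed but never opened
$data = preg_replace("/(^|>)([^<]*?)(?=>)/","$1<$2",$data);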
While this should catch most mistakes and always create valid markup, it may not always be what the user intended. It's worth playing around a little with some example mistakes to see what you'd expect to happen. Even with our balancing logic working correctly, there are still some common cases we're not dealing with correctly, and a couple of oddities left to deal with.
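For playing around with example mistakes, the balancing steps can be wrapped in a function, as in this sketch. The function name is made up, and the last pattern is written here as (^|>)([^<]*?)(?=>), which is my reading of what the "open an unopened tag" step is meant to do, not something the post confirms.

```php
<?php
// Hypothetical wrapper around the naive bracket balancing steps.
function balance_brackets(string $data): string
{
    $data = preg_replace('/>>+/', '>', $data);                    // ">>" becomes ">"
    $data = preg_replace('/<<+/', '<', $data);                    // "<<" becomes "<"
    $data = preg_replace('/^>/', '', $data);                      // stray ">" at the start
    $data = preg_replace('/<([^>]*?)(?=<|$)/', '<$1>', $data);    // close unclosed tags
    $data = preg_replace('/(^|>)([^<]*?)(?=>)/', '$1<$2', $data); // open unopened tags
    return $data;
}

// Try a few broken inputs and see what comes out.
foreach (['<b', 'b>', '<<b>>', 'a <b>c</b>'] as $input) {
    echo $input, ' => ', balance_brackets($input), "\n";
}
```

Note how an innocent "x > y" would also be turned into a tag by the last step, which is the "not always what the user intended" problem in action.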

Protocol Filtering

Now that we know that filtering out undesirable elements and attributes is only part of the solution, there's a final piece of the puzzle left: the content inside some of the attributes.

Consider an app with a white list that allows hyperlinks and images. The following user input would be allowed :
Code:
// remove the backslash to understand the actual code/statement
<a href="java\script:foo">
<\img src="java\script:foo">
This is still within the rules of our whitelist but clearly isn't what we want. Simply removing these attributes from our whitelist isn't the best option either, since letting users input hyperlinks and images is core to many applications. So we (may) need some extra filtering. The attribute filtering we need to add applies to every attribute that points to an external resource: the 'href' attribute of the 'a' element, the 'src' attribute of the 'img' element, and so on. Each of these is a URL, which means we can nicely break them all into the following format:
Code:
protocol ":" protocol-dependant-address
Some examples show that nearly all input falls into this format
Code:
// remove the backslash to understand the actual code/statement
mailto:[email protected]
http://iamgeek.com/
ftp://iwanttobeahacker.com/
java\script:alert('foo is not always bar');
The only kind of URL that doesn't fall into this format is the relative URL, which doesn't include a protocol. The decision of whether to allow relative URLs will depend on how the data will be output.

Also there's the issue of relative URLs: do we want to allow them, and how do we find the protocol? Let's start with a basic protocol matcher and see what it finds:
Code:
// remove the backslash to understand the actual code/statement
<a >
<a href="java{\t}script:foo">
<a href="java{\n}script:foo">
<a href="java{\0}script:foo">
<a href=" java\script:foo">
<a href="JaVa\ScRiPt:foo">
It turns out that there's a lot of munging we have to perform before we have a normalized protocol string to match against. Spaces at the beginning and the end have to be stripped, but so do spaces in the middle. All whitespace and formatting characters need to be stripped too. Casing needs to be normalized. All of the above examples work in IE6, with several working in Mozilla.

If our "protocol checker" just searches for a colon to find the protocol, then it's vulnerable to various techniques all grouped under the heading of protocol hiding. Any data in HTML can be escaped using character entities:
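The munging described above could be sketched roughly like this. The function names and the list of allowed protocols are assumptions for illustration only; the point is that whitespace and control characters are stripped everywhere, and case is folded, before the protocol is compared against a whitelist. It does not deal with character entities or other encodings.

```php
<?php
// Hypothetical helper: returns the normalized protocol of a URL,
// or null when there is none (a relative URL).
function normalize_protocol(string $url): ?string
{
    // Strip spaces, tabs, newlines, NUL and other control characters
    // wherever they appear, not just at the ends.
    $clean = preg_replace('/[\x00-\x20\x7f]+/', '', $url);

    // A protocol is whatever comes before the first colon, as long as
    // no "/", "?" or "#" comes before that colon.
    if (!preg_match('/^([^:\/?#]*):/', $clean, $m)) {
        return null;
    }

    return strtolower($m[1]);
}

// Hypothetical whitelist check built on top of it.
function protocol_is_allowed(string $url, array $allowed = ['http', 'https', 'ftp', 'mailto']): bool
{
    $protocol = normalize_protocol($url);

    // Whether relative URLs (null) are acceptable is a per-app decision;
    // this sketch rejects them.
    return $protocol !== null && in_array($protocol, $allowed, true);
}

var_dump(normalize_protocol("java\tscript:foo"));         // string(10) "javascript"
var_dump(protocol_is_allowed('http://iamgeek.com/'));     // bool(true)
```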
Code:
// remove the backslash to understand the actual code/statement
<a href="&#\106;&#\97;&#\118;&#\97; &#\115;&#\99;&#\114;&#\105;&#\112;&#\116;&#\58;foo">
That's a lot of variations. There are more still: named entities, denormalized UTF-8 sequences and Unicode character entities. The best thing to do in this situation, if you are new, is to cheat and use somebody else's code. The number of ways in which HTML source filters can be exploited is growing every year, as the browsers add more and more "helper" functions to correct invalid user code. The dangers posed by unfiltered or incorrectly filtered user data grow with each new exploit, as attackers find more and more innovative ways to steal user data. For high-profile apps, protection against XSS can only become more important.
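A tiny sketch of why a checker has to decode entities before it looks for a protocol. The helper name is invented; html_entity_decode() is the standard PHP function. Keep in mind this only covers well-formed entities: as far as I know, browsers also accept forms such as numeric entities without the trailing semicolon, which html_entity_decode() leaves alone, and that is one more reason to lean on a maintained filtering library instead of rolling your own.

```php
<?php
// Hypothetical helper: decode character entities in an attribute value
// before any protocol check is run on it.
function decode_attribute(string $value): string
{
    return html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// "javascript:foo" written entirely as decimal character entities.
$hidden = '&#106;&#97;&#118;&#97;&#115;&#99;&#114;&#105;&#112;&#116;&#58;foo';

// A naive search on the raw value finds neither the protocol nor a colon.
var_dump(stripos($hidden, 'javascript:')); // bool(false)
var_dump(strpos($hidden, ':'));            // bool(false)

// After decoding, the protocol is in plain sight.
var_dump(decode_attribute($hidden));       // string(14) "javascript:foo"
```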
#18

[eluser]boltsabre[/eluser]
Wow, see, that's my point exactly... there is SO SO SO much to consider when it comes to security, it's almost overbearing, and easy to miss something. I research a topic/exploit, and get side tracked by mention of other exploits, go and research them, and my original topic/exploit gets left in the wind.

Oh how I wish there was a golden bible of steps to follow and code to implement :-) (I guess never trust user input, as mentioned, is a good place to start!)

Thanks for all the comments, help, suggestions and code snippets, it's an amazing community here at CI!!!
#19

[eluser]Padraic Brady[/eluser]
[quote]The number of ways in which HTML source filters can be exploited is growing every year, as the browsers add more and more "helper" functions to correct invalid user code.[/quote]

Quoted for TRUTH.

To offer a simple example, HTML 5 is the first HTML standard to allow a closing tag to contain attributes. Think about that for a moment. Any XSS filter/sanitiser which assumes attributes can only exist in the opening tag (and doesn't check the closing tag) will have problems. It's weird changes like this, mixed with browsers continually adding complex parsing rules, that create a field where combating XSS is very difficult.
#20

[eluser]toopay[/eluser]
[quote author="boltsabre" date="1305396886"]... there was a golden bible ...[/quote]

You will find that the more you know and understand how your system works, the more aware you are of the security holes that exist in your system and how to deal with them. Your understanding of how your system works is the key here.

Although as we see, there are so many people who preach about it, security is not a religion! ;-)



