CodeIgniter Forums

Full Version: permitted_uri_chars rejects all non-ASCII hex values

El Forum

[eluser]sawatdee[/eluser]
One of my controllers needs to accept non-ASCII strings as a parameter. But when I pass urlencoded non-ASCII characters over the URL, I get the dreaded error "The URI you submitted has disallowed characters".

So I ran a simple test. I tried using "%68%6F" (the percent-encoded form of the string "ho") as the URL segment for that parameter (I typed it right into the URL) and it worked fine. Then I kept the exact same URL but replaced that segment with the UTF-8 encoding of a single Arabic character, "%D8%A7", and it returned the error message. This error message comes from CodeIgniter, so my controller code was never even called.

My configuration for permitted_uri_chars is set to the default. I am not using a .htaccess file and I am not rerouting for that controller.

Since the same URL works with "%68%6F" but not with "%D8%A7", the problem cannot be the literal characters in my URL. The problem is that CodeIgniter refuses to accept properly encoded hex values that happen to be outside the ASCII range.

I ran several more tests and found that "%7E" (the last printable ASCII character) works, but nothing greater than that value works. The URL should accept any hex value that is properly encoded for a URL, regardless of what characters those values represent. For example, "%7F" (the ASCII delete character) could be part of a printable character in some other charset.
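
One possible explanation for these results, as a minimal sketch only: it assumes the segment has already been percent-decoded by the time the check runs, and that the check is a regex character class built from the default permitted_uri_chars value. The function name segment_is_permitted is hypothetical and is not taken from the CodeIgniter source.

```php
<?php
// Hypothetical stand-in for the framework's URI check. The pattern is an
// assumption modeled on the default permitted_uri_chars ('a-z 0-9~%.:_-').
function segment_is_permitted(string $segment): bool
{
    return preg_match('|^[a-z 0-9~%.:_\-]+$|i', $segment) === 1;
}

foreach (['%68%6F', '%7E', '%7F', '%D8%A7'] as $encoded) {
    // Assumption: the server or PHP decodes the segment before the check.
    $decoded = rawurldecode($encoded);
    printf("%s -> %s\n", $encoded, segment_is_permitted($decoded) ? 'accepted' : 'rejected');
}
```

Under those assumptions, "%68%6F" and "%7E" are accepted while "%7F" and "%D8%A7" are rejected, which matches the tests above. The raw text "%D8%A7" would itself pass this pattern, since %, D, 8, A and 7 are all permitted, so decoding before the check is the assumption that fits the observed behavior.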

I have not had time to look at the CodeIgniter code to find the bug, but the permitted_uri_chars code is definitely not behaving correctly.

El Forum

[eluser]CI Lee[/eluser]
Must resist making comment about CI rejecting, not embracing, "ho"s......


I know someone else ran into this.. whether it was here or IM; I will try and find it....

El Forum

[eluser]CI Lee[/eluser]
I wonder if this is somewhat related

Post with similar issue here

I am not super familiar with anything relating to making anything world friendly... I keep things local.

El Forum

[eluser]jkevinburton[/eluser]
[quote author="CI Lee" date="1204773105"]I wonder if this is somewhat related

Post with similar issue here

I am not super familiar with anything relating to making anything world friendly... I keep things local.[/quote]

I keep my HO's where I can see them... "locally"

El Forum

[eluser]sawatdee[/eluser]
That is exactly the problem. Even encoded URLs cannot contain characters with a hex value higher than 7E. Even ISO-8859-1 is not supported. This is a major limitation of CodeIgniter.

I have also noticed that the Email class does not support RFC 2047 header formats. I have started adding support for RFC 2047 headers to Email.php, but I have not finished yet.

I don't know if there is a formal process for changing system code, but CodeIgniter should have an i18n work group to make the system code compatible with international formats. I am very interested in i18n and would be willing to lead or be a member of such a team.

El Forum

[eluser]CI Lee[/eluser]
I know that the guys at EllisLab are uber busy at the moment.

However, their uptake of fixed code is quite rapid. If a proof of concept is submitted along with a fix, it gets a few quick tests to make sure it's sound and is then committed to SVN.

-Lee

El Forum

[eluser]Michael Wales[/eluser]
I don't claim to be an expert when it comes to encoding and RFC standards, but doesn't RFC 3986 state that URLs consist of US-ASCII characters only?

El Forum

[eluser]sawatdee[/eluser]
Yes, US-ASCII characters only. However, an encoded URL represents each non-ASCII byte as a percent sign followed by two hex digits, all of which are ASCII characters. The server then turns those hex digits back into a sequence of bytes that represents a string in some character encoding.
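
To illustrate, here is a small sketch using PHP's rawurlencode() and rawurldecode(). The variable names are only for illustration, and the Arabic letter is the same one from my earlier test.

```php
<?php
// The Arabic letter alef, written as its two UTF-8 bytes.
$alef = "\xD8\xA7";

// Percent-encoding turns each non-ASCII byte into "%" plus two hex digits.
$encoded = rawurlencode($alef);
echo $encoded, "\n"; // %D8%A7

// The encoded form contains only US-ASCII characters.
$isAscii = preg_match('/^[\x00-\x7F]+$/', $encoded) === 1;
var_dump($isAscii); // bool(true)

// Decoding gives back the original byte sequence.
$decoded = rawurldecode($encoded);
echo strlen($decoded), " bytes: ", strtoupper(bin2hex($decoded)), "\n"; // 2 bytes: D8A7
```

So the URL on the wire stays within what RFC 3986 allows, and the non-ASCII bytes only reappear after decoding.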

El Forum

[eluser]Avatar[/eluser]
Here's a suggestion, albeit not a good one. You can add the characters that are not permitted to your config.php, as below:
Code:
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_-`;';

There's a weird thing about the above string, though: if I remove the ` then it breaks. Any special chars I want in there have to be placed after the `.
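
A guess at why, as a sketch only: it assumes the setting is dropped into a regex character class with the hyphen left unescaped. The helper build_pattern is hypothetical and is not CodeIgniter's actual code. In that case a hyphen followed by another character is read as a range, so "_-`" is a valid range but "_-;" is not.

```php
<?php
// Hypothetical helper: assumes the setting is placed inside a character
// class as-is, with the hyphen not escaped.
function build_pattern(string $permitted): string
{
    return '|^[' . $permitted . ']+$|i';
}

// "_-`" is read as the range 0x5F..0x60, which is valid.
$withBacktick = build_pattern('a-z 0-9~%.:_-`;');

// "_-;" is read as the range 0x5F..0x3B, which is out of order.
$withoutBacktick = build_pattern('a-z 0-9~%.:_-;');

var_dump(preg_match($withBacktick, 'abc'));     // int(1)
var_dump(@preg_match($withoutBacktick, 'abc')); // bool(false): pattern does not compile
```

If that is what is happening, keeping the hyphen as the very last character of the string, and adding any extra characters before it, should avoid the need for the `.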