[Solved] URIs with special characters like á À ê ó ü
#11

[eluser]umefarooq[/eluser]
Really great. What about Arabic? I am also looking for a solution that supports Arabic letters in URLs with CI. I used the same method but had no luck: I get an error at URI.php line 191.

http://php.net/manual/ja/function.preg-match.php
(it was in the comments)

$pattern = '/^[-a-zA-Z0-9_\x{30A0}-\x{30FF}'
. '\x{3040}-\x{309F}\x{4E00}-\x{9FBF}\s]*$/u';
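
For anyone trying the pattern above, here is a minimal sketch of how it behaves in plain PHP, outside CodeIgniter. The helper name `is_allowed_segment` and the sample strings are made up for illustration; only the pattern comes from the php.net comment. It assumes the input is valid UTF-8, because with the `u` modifier preg_match() fails on invalid UTF-8.

```php
<?php
// Pattern quoted above: ASCII letters, digits, "-", "_", whitespace,
// Katakana (U+30A0-U+30FF), Hiragana (U+3040-U+309F) and CJK ideographs
// (U+4E00-U+9FBF). Single quotes keep PHP from touching the backslashes,
// and the "u" modifier makes PCRE read pattern and subject as UTF-8.
$pattern = '/^[-a-zA-Z0-9_\x{30A0}-\x{30FF}'
         . '\x{3040}-\x{309F}\x{4E00}-\x{9FBF}\s]*$/u';

// Hypothetical helper, not part of CodeIgniter.
function is_allowed_segment($pattern, $segment)
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $segment) === 1;
}

var_dump(is_allowed_segment($pattern, 'カタカナ')); // Katakana only
var_dump(is_allowed_segment($pattern, 'كود'));      // Arabic is outside these ranges
```

So the pattern as posted will reject Arabic; it needs an Arabic range added to the character class.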
#12

[eluser]Ochetski[/eluser]
You can look up the Arabic character range here: http://www.nies.ch/doc/entities.en.php?page=6
Find the first and the last character and write them as a range, like the first @takasia example:

"ァ-ヶ ぁ-ゞ ヲ-゚ 亜-腕 弌-熙 亜-熙 a-z 0-9~%.:_\-"
The second option didn't work for me on Unix or Windows. CI's replacement in the URI library turns '\x' into '\\x', or the character is misread. The same happens even if you change \x{30FF} to \x30FF and drop the "u" flag from the URI library regexp.

I've tried on Mac and Windows with Japanese characters, but the Apache server on Windows seems to be buggy with them. I guess any Unix will work pretty easily. I will try on Ubuntu later and come back here with results. More tests on Mac today, maybe on Ubuntu tonight.
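
The '\x' turning into '\\x' is what preg_quote() does to a backslash. A minimal sketch, assuming the URI library passes permitted_uri_chars through preg_quote() before building its character class (that is my reading of the behaviour described above, not something checked against every CI version). The variable names are made up.

```php
<?php
// One Katakana letter (KA, U+30AB) written as a PCRE escape. Single quotes,
// so PHP itself leaves the backslash alone.
$escape = '\x{30AB}';

// preg_quote() escapes regex metacharacters, including "\", "{" and "}".
$quoted = preg_quote($escape);
echo $quoted, "\n"; // \\x\{30AB\}

// Unquoted, with the "u" modifier, the class holds the code point U+30AB.
$direct = preg_match('/^[' . $escape . ']+$/u', 'カ');

// Quoted, the class holds the literal characters \ x { 3 0 A B } instead,
// so it matches the letter "x" but no longer matches the Katakana letter.
$afterQuoteKana  = preg_match('/^[' . $quoted . ']+$/u', 'カ');
$afterQuoteAscii = preg_match('/^[' . $quoted . ']+$/u', 'x');

var_dump($direct, $afterQuoteKana, $afterQuoteAscii);
```

If that assumption holds, it would explain why literal characters in the setting work while \x{...} escapes do not: literal letters are not metacharacters, so quoting leaves them untouched.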
#13

[eluser]Cesar Kohl[/eluser]
Yes Ochetski, I realized that after I posted the message above. Encoding is very tricky but I'm making progress here. Thanks!
#14

[eluser]umefarooq[/eluser]
I have tried the 2nd option but not the first one. One thing I want to ask: do I have to put every Arabic character in the regular expression to check the URL?
#15

[eluser]Cesar Kohl[/eluser]
I think so.

But before that, it would be worth reading up on encoding, because my problem was related to it. Read this: http://htmlpurifier.org/docs/enduser-utf8.html
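
One quick way to see whether encoding is the culprit is to decode a segment and check that the result is valid UTF-8. This is only a sketch: the helper name `decode_utf8_segment` is made up, and it assumes the browser sent the URL percent-encoded as UTF-8 bytes, which is common but not guaranteed.

```php
<?php
// Hypothetical helper: decode a percent-encoded URI segment and return it
// only if the decoded bytes are valid UTF-8, otherwise return null.
function decode_utf8_segment($raw)
{
    $decoded = rawurldecode($raw);
    // An empty pattern with the "u" modifier matches any valid UTF-8 string;
    // on invalid UTF-8 preg_match() returns false instead of 1.
    if (preg_match('//u', $decoded) !== 1) {
        return null;
    }
    return $decoded;
}

// The Arabic word for "code", percent-encoded as UTF-8 bytes.
var_dump(decode_utf8_segment('%D9%83%D9%88%D8%AF'));
// Three bytes from a single-byte legacy encoding: not valid UTF-8.
var_dump(decode_utf8_segment('%E3%E8%CF'));
```

If the second kind of input shows up, no character range in the regexp will help until the encoding is sorted out.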
#16

[eluser]Ochetski[/eluser]
My problem on Windows was that the Input library was not validating my URL. I had to make a MY_Input with the following:

Code:
class MY_Input extends CI_Input {
    // Override CI_Input's key filter so the characters listed below are accepted.
    function _clean_input_keys($str)
    {
        if ( ! preg_match("/^[ァ-ヶ ぁ-ゞ ヲ-゚ 亜-腕 弌-熙 亜-熙 a-z0-9:_\/-]+$/i", $str))
        {
            exit('Disallowed Key Characters.');
        }
        return $str;
    }
}

And this to my config:

Code:
$config['permitted_uri_chars'] = 'ァ-ヶ ぁ-ゞ ヲ-゚ 亜-腕 弌-熙 亜-熙 a-z 0-9~%.:_\-';

Worked just fine on Windows, Mac and Linux. I was worried about Unix/Windows differences, but it's all OK now. =]
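
To try a permitted-characters string without loading the framework, something like the following can help. It is a rough stand-in, not CI's own filter: the function name `segment_is_permitted` is made up, the string is a subset of the setting above, and I added the `u` modifier so that each range spans code points.

```php
<?php
// Illustrative subset of the setting above: Katakana, Hiragana, a kanji
// range, then the usual ASCII characters.
$permitted = 'ァ-ヶ ぁ-ゞ 亜-熙 a-z 0-9~%.:_\-';

// Hypothetical stand-in for the URI filter, not the framework's own code.
function segment_is_permitted($permitted, $segment)
{
    // The setting is dropped into a character class as-is. The "u" modifier
    // makes each range span code points rather than single bytes.
    $pattern = '/^[' . $permitted . ']+$/iu';
    return preg_match($pattern, $segment) === 1;
}

var_dump(segment_is_permitted($permitted, 'カタカナ'));
var_dump(segment_is_permitted($permitted, 'news<script>'));
var_dump(segment_is_permitted($permitted, 'ビール'));
```

One thing to watch: the long vowel mark ー (U+30FC) sits just above ヶ (U+30F6), so the range ァ-ヶ does not cover it and a word like ビール is rejected unless ー is added to the list.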
#17

[eluser]umefarooq[/eluser]
Here you have put separate Japanese letters in the regular expression. In Arabic, a word is a combination of joined letters.

Code:
كود

I just wrote "code" in Arabic; these 3 letters are used for it:

ك و د
Will preg_match parse them as separate letters?

[edited]
Wow, it really works. I put in separate words and it works really nicely. Thanks, let me look into it more.
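
A small sketch of why this works: the joining only happens when the text is displayed, while the string itself still holds one code point per letter. The variable names are just for illustration, and it assumes the source file is saved as UTF-8.

```php
<?php
// The word from the post: three letters that join up when displayed.
$word = 'كود';

// With the "u" modifier, preg_split() on an empty pattern yields one entry
// per code point, regardless of how the letters join when rendered.
$letters = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
var_dump(count($letters));

// A character class that lists the three letters separately therefore
// matches the whole joined word.
$matched = preg_match('/^[ك و د]+$/u', $word);
var_dump($matched);
```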
#18

[eluser]Ochetski[/eluser]
Better than listing separate words: a regexp can understand a range between characters, like a-z or 0-9, with any kind of character. In your case you can try the link I posted: http://www.nies.ch/doc/entities.en.php?page=6

For example:
From: د (&#1583; or \x{62F}) to ي (&#1610; or \x{64A})

You will find a lot of characters listed here: http://www.nies.ch/doc/entities.en.php?page=6
If you use the range from د to ي (that is, \x{62F}-\x{64A}, lower code point first), everything between them will probably work.

I just can't create the exact code for you because I don't understand Arabic.
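
As a starting point, here is a sketch of what such a range could look like. It assumes the range U+0621 (HAMZA) to U+064A (YEH), which covers the basic Arabic letters; starting at د (U+062F) instead would leave out earlier letters such as ا (U+0627) and ب (U+0628). The function name is made up, and the ASCII part simply mirrors the permitted characters used earlier in the thread.

```php
<?php
// Hypothetical check for a URI segment made of Arabic letters and the
// usual ASCII characters. Writing the range as code points avoids any
// doubt about order in an editor that displays Arabic right to left.
function is_permitted_arabic_segment($segment)
{
    $pattern = '/^[\x{0621}-\x{064A}a-z0-9~%.:_\-]+$/iu';
    return preg_match($pattern, $segment) === 1;
}

var_dump(is_permitted_arabic_segment('كود'));
var_dump(is_permitted_arabic_segment('كود-123'));

// The narrower range from the example above misses letters before U+062F.
$narrow = preg_match('/^[\x{062F}-\x{064A}]+$/u', 'باب');
var_dump($narrow);

// A range written high-to-low does not compile; preg_match() returns false
// (the @ only hides the warning).
$reversed = @preg_match('/^[\x{064A}-\x{062F}]+$/u', 'كود');
var_dump($reversed);
```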



