[Solved] URIs with special characters like á À ê ó ü
#1

[eluser]Cesar Kohl[/eluser]
Hi guys!

Yesterday I spent a few hours looking for a solution to this problem, but I guess nobody has had it before.

I want to make a URI like this:

Code:
http://www.example.com/search/[keyword]

Here [keyword] can be a word with special characters like á À ê ó ü, etc. I know this is possible because www.delicious.com's URIs work like this.

Code:
http://www.delicious.com/cesarkohl/inteligência

Around line 157 of config/config.php there is a regex character list:

Code:
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-';

From what I've read online, the solution seems to be writing the right regular expression for it. But I can't get it right!

Do you know the solution to it?
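
For anyone following along, here is a rough sketch of why an accented character trips this setting. It assumes, as a simplification, that the value is dropped into a case-insensitive regex character class and each URI segment is matched against it. `segment_is_allowed()` is a made-up helper for illustration, not CodeIgniter's actual code.

```php
<?php
// Hypothetical helper: NOT CodeIgniter's implementation, only a simplified model.
function segment_is_allowed($permitted, $segment)
{
    // Assumption: the permitted list is used as the body of a character class.
    $pattern = '|^[' . $permitted . ']+$|i';
    return preg_match($pattern, $segment) === 1;
}

// The default value quoted above.
$permitted = 'a-z 0-9~%.:_\-';

// A plain ASCII segment.
var_dump(segment_is_allowed($permitted, 'search-term'));

// "inteligência" written as raw UTF-8 bytes (ê is 0xC3 0xAA).
var_dump(segment_is_allowed($permitted, "intelig\xC3\xAAncia"));
```

Under that assumption the first check passes and the second fails, because the two bytes that make up ê are not in the permitted list.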
#2

[eluser]Cesar Kohl[/eluser]
Anyone?
#3

[eluser]Ochetski[/eluser]
First of all, "search/[keyword]" will be hard to set up, because you will need an .htaccess file configured for it.

You can try "search/index/[keyword]", or replace "index" with another controller method you may have.

To use special characters in your URL you may have to do something like this:
Code:
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-\?~`\]\[\}\{ãàáâäõé';
Add every special character that may appear in your search keyword.

You can also avoid putting the keyword itself in the URI, because otherwise you will have to add every possible character to the RegExp.


I suggest encoding it with base64_encode() and decoding it later with base64_decode(); it's easier and requires fewer changes to your CI setup.
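
A minimal sketch of that idea, with one caveat: plain base64 output can contain "+", "/" and "=", which do not sit well in a URI segment, so this version swaps them for the URL-safe variants. The function names `keyword_to_segment()` and `segment_to_keyword()` are hypothetical helpers, not part of CodeIgniter.

```php
<?php
// Hypothetical helpers for carrying a keyword in a URI segment.

function keyword_to_segment($keyword)
{
    // Standard base64, then replace '+' and '/' with '-' and '_'
    // and drop the trailing '=' padding.
    return rtrim(strtr(base64_encode($keyword), '+/', '-_'), '=');
}

function segment_to_keyword($segment)
{
    // Reverse the character swap.
    $b64 = strtr($segment, '-_', '+/');
    // Restore the padding so the length is a multiple of 4.
    $b64 .= str_repeat('=', (4 - strlen($b64) % 4) % 4);
    // Strict mode: returns false on characters outside the base64 alphabet.
    $decoded = base64_decode($b64, true);
    return $decoded === false ? '' : $decoded;
}

// "inteligência" written as raw UTF-8 bytes.
$segment = keyword_to_segment("intelig\xC3\xAAncia");
echo $segment, "\n";
echo segment_to_keyword($segment), "\n";
```

The resulting segment only contains letters, digits, "-" and "_", so it should get through without adding accented characters to the permitted list. The trade-off is that the URI is no longer human-readable.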
#4

[eluser]Cesar Kohl[/eluser]
I'm already using an .htaccess file.

I used your code and it still doesn't work, even with the original URI that isn't affected by the .htaccess config. =/
#5

[eluser]Ochetski[/eluser]
The code I posted is not complete; you need to add ALL the characters that could appear.
How about showing us the errors and some examples?

Perhaps you thought the "index" I mentioned was the index.php that .htaccess replaces, but it's not. It's the controller method, the function index() or main()...
#6

[eluser]takasia[/eluser]
Maybe this will help :-)
To allow Japanese characters in the URL I have this in the config file:

$config['permitted_uri_chars'] = 'ァ-ヶ ぁ-ゞ ヲ-゚ 亜-腕 弌-熙 亜-熙 a-z 0-9~%.:_\-';

I also found something like this:

http://php.net/manual/ja/function.preg-match.php
(it was in the comments)

$pattern ='/^[-a-zA-Z0-9_\x{30A0}-\x{30FF}'
.'\x{3040}-\x{309F}\x{4E00}-\x{9FBF}\s]*$/u';

U+4E00–U+9FBF is for Kanji
U+3040–U+309F is for Hiragana
U+30A0–U+30FF is for Katakana

So maybe there is a way to do something similar for the characters you need?
You can find some information here:
http://pl.wikisource.org/wiki/Unicode/Lista_znaków

I know it's in Polish, but there are English names as well, so maybe it will do?
Latin Extended-A and Latin Extended-B should cover your needs :-)
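
As a rough sketch of the same approach for accented Latin letters: the characters from the first post (á À ê ó ü) sit in the Latin-1 Supplement block, so the pattern below covers U+00C0–U+00FF plus Latin Extended-A (U+0100–U+017F). This is a standalone preg_match() check; whether the same ranges can be used in permitted_uri_chars depends on how your CodeIgniter version builds its pattern, which I have not verified.

```php
<?php
// \x{00C0}-\x{00FF}: accented letters in Latin-1 Supplement (this range also includes × and ÷)
// \x{0100}-\x{017F}: Latin Extended-A
// The /u modifier makes PCRE treat pattern and subject as UTF-8.
$pattern = '/^[-a-zA-Z0-9_\x{00C0}-\x{00FF}\x{0100}-\x{017F}\s]*$/u';

// "inteligência" written as raw UTF-8 bytes, so the result does not
// depend on the encoding this file is saved in.
var_dump(preg_match($pattern, "intelig\xC3\xAAncia"));

// A slash is not in the class, so this one is rejected.
var_dump(preg_match($pattern, 'a/b'));
```

Note that with /u, preg_match() returns false rather than 0 if the subject is not valid UTF-8, so the input encoding matters here too.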
#7

[eluser]Ochetski[/eluser]
Haha, thanks @takasia. The funny thing is that allowing Japanese characters is a problem I was looking for a solution to myself.
And now this will help me. A lot! :lol:

@Cesar Kohl
I'm not sure, but I guess you can find the interval between the chars you want and use ranges as well, something like [á-ú]. I don't know exactly how it would look, so I can't help more than that.
#8

[eluser]Cesar Kohl[/eluser]
Thanks Ochetski and takasia. Based on your thoughts I found a solution to it. Let me explain:

Problem:

Code:
http://www.example.com/folder/êêêêê

Because of this ê, I get:

Quote:The URI you submitted has disallowed characters.

Solution:

So I opened libraries/URI.php and added echo $str at line 186:

Code:
function _filter_uri($str)
    {
    echo $str;
    ...}

I refreshed the page and voilà:

Quote:á

So, ê = á.

I added that character to the list in config/config.php (line ~157), like this:

Code:
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-á';

And now everything works fine.

Of course, I then added all the other chars I need to use.

Problem solved! Thanks a lot!
#9

[eluser]Ochetski[/eluser]
Actually you have another problem: your page is not configured as UTF-8. Check CI's config.php and include an HTML meta tag declaring that the page is UTF-8 encoded.

You may have to check your editor's configuration as well; it's probably using iso-8859-1 or mac-roman (if it's a Mac).

á instead of ê means that your browser/system thinks the text is ISO, but it's actually UTF.
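
A small sketch of the mechanism, for illustration only: in UTF-8, ê is stored as two bytes, and a reader that assumes a one-byte-per-character encoding such as ISO-8859-1 treats each byte as its own character. The exact garbage you see depends on which encoding the viewer assumes.

```php
<?php
// ê encoded as UTF-8 is the byte pair 0xC3 0xAA.
$utf8 = "\xC3\xAA";

// ISO-8859-1 maps every byte straight to the code point with the same number,
// so listing the bytes shows what an ISO-8859-1 reader would display.
$codepoints = array();
foreach (str_split($utf8) as $byte) {
    $codepoints[] = sprintf('U+%04X', ord($byte));
}

echo strlen($utf8), "\n";             // 2
echo implode(' ', $codepoints), "\n"; // U+00C3 U+00AA
```

U+00C3 and U+00AA are the characters Ã and ª, so a single accented letter turns into two unrelated ones when the encodings disagree.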
#10

[eluser]takasia[/eluser]
@Ochetski - Glad to be of some help :-) Do you write Japanese-language applications?

@Cesar Kohl:

Don't forget to save the config file as UTF-8 without BOM!
The encoding of the website (see Ochetski's post) and the encoding of the PHP files where the non-Latin characters are written MUST be the same, and UTF-8 is the best solution.
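
If you are not sure whether your editor added a BOM, a quick check along these lines should tell you. `has_utf8_bom()` is a hypothetical helper name, and the path in the usage note is only an example.

```php
<?php
// Hypothetical helper: reports whether a file starts with the UTF-8 byte order mark.
function has_utf8_bom($path)
{
    // Read only the first three bytes of the file.
    $head = file_get_contents($path, false, null, 0, 3);
    // The UTF-8 BOM is the byte sequence EF BB BF.
    return $head === "\xEF\xBB\xBF";
}
```

Calling it as has_utf8_bom('application/config/config.php') (adjust the path to your install) returns true if the file starts with a BOM, in which case re-save it without one.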



