Site Search & Internationalization
#1

[eluser]haydenp[/eluser]
The current site I am developing will be supporting 8+ different languages. I'm finding that coding a site search that accounts for internationalization is a little tricky and am hoping someone in the community has 'been there done that', and can provide a few pointers.

POST -TO- GET

I'm sticking with URI segments for my search form. A user fills in the keywords form field, and the form submits to an intermediary method which appends the keywords as a segment and issues a redirect, e.g.

Code:
$kw = urlencode($this->input->post('keywords'));
redirect( 'search/kw/' . $kw );
exit();

This all works fine until one introduces internationalization, e.g. a user searches on the keywords 'Consolação Hotel'.

Under CI 1.7.2 (fresh install) this throws the usual ...

Code:
The URI you submitted has disallowed characters.

One would immediately think of adding the foreign characters to the 'permitted_uri_chars' config setting, e.g.

Code:
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-çã';

But what about ALL the other characters one will have to account for? Before you know it you are looking at something along the lines of:

Code:
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-\'`!@+=& ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿŔŕ';

... and even this is nowhere near complete ...

Question 1

As there are more permitted characters than there are unpermitted characters (I assume this is the case) would it not make sense for the _filter_uri() method in URI.php to check against a list of 'unpermitted characters' instead of checking against the list of permitted characters held in $config['permitted_uri_chars']?

So instead of having $config['permitted_uri_chars'] ... CI should have $config['unpermitted_uri_chars'].
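
To make the idea in Question 1 concrete, here is a minimal sketch of a denylist check. It is not CodeIgniter code: the function name segment_is_allowed() and the example list of unpermitted characters are assumptions made up for illustration, and a real list would need careful thought about what is actually dangerous in a URI.

```php
<?php
// Hypothetical helper, not part of CodeIgniter: a segment is rejected only
// if it contains a character from an explicit list of unpermitted characters.
function segment_is_allowed($segment, $unpermitted_chars)
{
    if ($unpermitted_chars === '') {
        return true; // empty denylist: nothing to reject
    }

    // preg_quote() escapes regex metacharacters so the list can be placed
    // inside a character class. The "u" modifier treats the input as UTF-8.
    $pattern = '/[' . preg_quote($unpermitted_chars, '/') . ']/u';

    // preg_match() returns 0 when nothing matched. It returns false on error
    // (e.g. invalid UTF-8), which this check also treats as "not allowed".
    return preg_match($pattern, $segment) === 0;
}

// Assumed example list: < > " ' { } | \ ^ `
$unpermitted = '<>"\'{}|\\^`';
```

With this list, segment_is_allowed('Consolação Hotel', $unpermitted) passes while segment_is_allowed('<script>', $unpermitted) is rejected. The trade-off is that a denylist lets through anything you forgot to list, whereas the existing permitted list fails closed.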

Question 2

Has anyone coded search functionality on a site offering multiple language support, come across such issues, AND resolved them in an elegant but robust manner?

All feedback will be greatly appreciated.

Many thanks
#2

[eluser]Josepzin[/eluser]
I have a similar problem, but I think it is better to base64_encode the string. It does not give a friendly URI, but the resulting string uses only standard characters.

BUT I have another problem... I encode the string, build the anchor "/5/DfwwfHjkls==", and then try to read this segment using $this->uri->segment(nn);

And the result is LOWERCASE!!! So base64_decode doesn't work...

Sorry, I haven't answered your questions.
#3

[eluser]haydenp[/eluser]
@Josepzin

base64_encode can be used. The problem you are experiencing is discussed in a number of forum posts.

I originally ran my solution via a database using a unique string as an identifier (similar to the way searches are done on this forum) but I would prefer to retain the search keywords in the URL. UNLESS someone can convince me otherwise ;o)
#4

[eluser]Josepzin[/eluser]
@haydenp: rawurlencode may be a solution, but I don't know if it works with CI.
$config['unpermitted_uri_chars'] is a good idea.

About base64_encode: it can be used, but when I read the value back from the URI segment it is in lowercase and decoding doesn't work...
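
One possible way around the lowercase problem is an encoding that does not depend on letter case, such as hexadecimal. The sketch below is only an illustration: the function names keywords_to_segment() and segment_to_keywords() are made up, and it assumes the keywords are passed as a single segment. The output contains only 0-9 and a-f, so it should also fit the default permitted_uri_chars.

```php
<?php
// Hypothetical helpers, not part of CodeIgniter.

// Encode the raw keyword bytes as hex digits (0-9, a-f only).
function keywords_to_segment($keywords)
{
    return bin2hex($keywords);
}

// Decode a hex segment back to the original keywords.
// Returns false if the segment is not valid hex of even length.
function segment_to_keywords($segment)
{
    if (!ctype_xdigit($segment) || strlen($segment) % 2 !== 0) {
        return false;
    }

    // Hex digits mean the same in either case, so lowercasing (by us or by
    // anything else along the way) does not change the decoded result.
    return pack('H*', strtolower($segment));
}
```

For example, segment_to_keywords(keywords_to_segment('Consolação Hotel')) gives back the original string even if the segment was upper- or lowercased in between. The cost is that the segment is twice as long as the keywords and no more readable than base64.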



