Internationalization and Routing: adding the language code (en,de,fr) inside URLs
#1

[eluser]Michael Ekoka[/eluser]
language internationalization i18n routing

For those interested in having the current language in the url like this:
http://www.mysite.com/en/my_controller/my_action/var1/var2

I have been playing around with routing and hopefully the time I spent will be worth your while.

Start with a few steps in your application's config/routes.php:


1-) You first need to set a regex for the language code that will be prepended to request urls other than the index page. The regex covers the following criteria:
- the language code should be optional
- it should be composed of 2 alphabetical characters
- if added, it must be followed by a slash
e.g. The following urls should point to the same route:
http://www.mysite.com/en/controller/action/var1/var2
http://www.mysite.com/controller/action/var1/var2

In your routes.php file insert this:
Code:
// routes.php
$prepended_lang = "(?:[a-zA-Z]{2}/)?";


2-) You also need a slightly different regex for the language code optionally appended to the index url (www.mysite.com/en). The criteria covered by that regex:
- the language code should be optional
- it should be composed of 2 alphabetical characters
- if appended to the index, the language code has an optional trailing slash
e.g. The following urls should all point to the same route:
http://www.mysite.com/
http://www.mysite.com/en
http://www.mysite.com/en/

Still in routes.php, insert this:
Code:
// routes.php
$appended_lang = "(?:[a-zA-Z]{2}/?)?";
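
As a quick sanity check, here is a small standalone sketch of how the two patterns behave. It assumes the router wraps each route key as #^key$# before matching, which is my understanding of how CI applies routes. The helper name route_matches is made up for illustration.

```php
<?php
// Standalone check of the two language patterns (no CodeIgniter needed).
$prepended_lang = "(?:[a-zA-Z]{2}/)?";
$appended_lang  = "(?:[a-zA-Z]{2}/?)?";

// Hypothetical helper: anchors the pattern the way the router is assumed to.
function route_matches($pattern, $uri)
{
    return preg_match('#^' . $pattern . '$#', $uri) === 1;
}

// Index page: no code, code alone, code with a trailing slash.
var_dump(route_matches($appended_lang, ''));    // bool(true)
var_dump(route_matches($appended_lang, 'en'));  // bool(true)
var_dump(route_matches($appended_lang, 'en/')); // bool(true)
var_dump(route_matches($appended_lang, 'eng')); // bool(false), 3 letters

// Other pages: the code is optional, but needs its slash when present.
$page = $prepended_lang . 'controller/action';
var_dump(route_matches($page, 'en/controller/action')); // bool(true)
var_dump(route_matches($page, 'controller/action'));    // bool(true)
var_dump(route_matches($page, 'encontroller/action'));  // bool(false)
```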


3-) Now let's set some routes in routes.php:

Code:
/* routes for controllers */

$route["scaffolding_trigger"] = "some_secret";

// now let's set the routes for the site index
$route["default_controller"] = "index_controller";
$route["$appended_lang"] = "index_controller"; // our index with our optional language code appended

// example of a route with the prepended optional language code
$route["{$prepended_lang}controllerX/actionY(.*)"] = "controllerA/methodB$1";


/**
* routes for controllers inside subfolders
* are similar to routes for normal controllers.
* you just need to add in the subfolder.
**/

$route["{$prepended_lang}subfolder"] = "subfolder/my_controller";
$route["{$prepended_lang}subfolder/controllerX/actionY/(.*)"] = "subfolder/controllerA/methodC/$1";


/**
* Finally you need to define a catchall route for the rest of the site.
* All other routes must come before this one if you want them to be caught.
* It simply remaps the url to the intended destination, whether the language
* code is included or not.
* e.g. the 2 following urls should point to the same controller/action if you set
* a route like described here.
* http://www.mysite.com/en/admin_folder/login_controller/authenticate_action
* http://www.mysite.com/admin_folder/login_controller/authenticate_action
**/

$route["{$prepended_lang}(.*)"]='$1';
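
To see what the catchall does to a request, here is a rough standalone sketch. It assumes the router rewrites the uri with preg_replace() on the anchored pattern. The helper name strip_lang_prefix is only for illustration.

```php
<?php
// Sketch of what the catchall route does to a request URI.
// Assumption: the router applies preg_replace() with the anchored route key.
$prepended_lang = "(?:[a-zA-Z]{2}/)?";
$pattern = '#^' . $prepended_lang . '(.*)$#';

// Hypothetical helper name, for illustration only.
function strip_lang_prefix($pattern, $uri)
{
    return preg_replace($pattern, '$1', $uri);
}

$with_code    = strip_lang_prefix($pattern, 'en/admin_folder/login_controller/authenticate_action');
$without_code = strip_lang_prefix($pattern, 'admin_folder/login_controller/authenticate_action');

echo $with_code, "\n";    // admin_folder/login_controller/authenticate_action
echo $without_code, "\n"; // admin_folder/login_controller/authenticate_action
```

One thing to keep in mind with this pattern: a top-level folder or controller whose name is exactly 2 letters (say "my/") would be taken for a language code and stripped.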


4-) Now, in your controller, if you want to know which language was requested, do this:
Code:
$lang_code = $this->uri->segment(1);

Some gotchas:
- Verify the length of $lang_code. If it is not 2 characters long, $this->uri->segment(1) did not return a language code but the first regular segment, which means the request had no language code, e.g. http://www.mysite.com/somerequest. In that case, reset $lang_code to a default language.
- If $lang_code is 2 characters long, check it against your actual list of supported language codes, in case a user requests http://www.mysite.com/fr/somerequest and you didn't prepare any fr translations. Here too, fall back to a default language for any unsupported code.
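
The two checks above could be sketched like this. The function name, the supported list and the default language are assumptions for illustration, not part of CI.

```php
<?php
// Sketch of the two gotcha checks. resolve_lang_code, the supported list
// and the default are hypothetical names/values for this example.
function resolve_lang_code($first_segment, array $supported, $default)
{
    $candidate = strtolower((string) $first_segment);

    // Not exactly 2 letters: segment 1 is not a language code at all.
    if (!preg_match('/\A[a-z]{2}\z/', $candidate)) {
        return $default;
    }

    // 2 letters, but no translation has been prepared for it.
    if (!in_array($candidate, $supported, true)) {
        return $default;
    }

    return $candidate;
}

$supported = array('en', 'de');

// In a controller the first argument would be $this->uri->segment(1).
echo resolve_lang_code('de', $supported, 'en'), "\n";          // de
echo resolve_lang_code('fr', $supported, 'en'), "\n";          // en (unsupported code)
echo resolve_lang_code('somerequest', $supported, 'en'), "\n"; // en (no code in url)
```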


Your application can now be accessed in 2 ways:

http://www.mysite.com/{lang_code}/contro...ar1/value1

and

http://www.mysite.com/controller/action/var1/value1

I'm still testing all this in my own app, please contribute if you find something odd or useful.
#2

[eluser]easymind[/eluser]
This seems to do the same:
Code:
$lang = "[a-zA-Z]{2}?";
$route["$lang/(:any)"] = "$1";

But it messes up the arguments passed to the functions (like your code does too on my system), so this works better:
Code:
$lang = "[a-zA-Z]{2}?";
$route["$lang/(:any)"] = "/$1";

But that one doesn't seem to work for controllers inside subfolders...
#3

[eluser]Michael Ekoka[/eluser]
Quote:This seems to do the same:
Code:
$lang = "[a-zA-Z]{2}?";
$route["$lang/(:any)"] = "$1";
But it messes up the arguments passed to the functions (like your code does too on my system)

I'm not sure if your regex will allow you to have a url without the language code. Why is this important? because sometimes your visitor won't type http://yoursite.com/en/fancy_page, but rather http://yoursite.com/fancy_page. The desired result is to still go to the same page with a default language set.

You need to group the language regex in parentheses before applying the '?' operator to it to make it optional. If you do, why not include the '/' directly with the language code? Hence, "([a-zA-Z]{2}/)?". In English: an upper or lower case letter, followed by an upper or lower case letter, followed by a slash; the whole thing may or may not be there.

If you group that regex in parentheses like described, you have two options when using backrefs:
1- you can start your backrefs with $2
2- or you can modify the regex slightly to use non-capturing groups, it then becomes
Code:
$lang = "(?:[a-zA-Z]{2}/)?";
It is slightly more confusing for people not very familiar with regexes, but does allow you to start your backrefs at $1.


In your route:
Code:
$route["{$lang}(:any)"] = "$2"; // option 1: capturing $lang = "([a-zA-Z]{2}/)?", so the page is backref $2
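
Here is a small standalone comparison of the two options. It is only a sketch: (.+) stands in for what I assume (:any) expands to, and the patterns are anchored the way the router is assumed to anchor them.

```php
<?php
// Same remap written both ways; only the backreference number differs.
$capturing     = "([a-zA-Z]{2}/)?";   // option 1: language code is group 1
$non_capturing = "(?:[a-zA-Z]{2}/)?"; // option 2: language code is not numbered

$uri = 'en/fancy_page';

// (.+) is an assumed stand-in for (:any).
$option1 = preg_replace('#^' . $capturing . '(.+)$#', '$2', $uri);
$option2 = preg_replace('#^' . $non_capturing . '(.+)$#', '$1', $uri);

echo $option1, "\n"; // fancy_page
echo $option2, "\n"; // fancy_page
```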

As for the code messing up on your system, can you show me what you had originally and also tell me which version of CI you are currently using (and whether it is from svn or a simple download)? I have to admit that I am using CI 1.5.4 svn, which has some significant differences from the downloaded version, and I had to apply one or 2 fixes relating to uri routing. So your problem may be related.

Quote:But that one doesn't seem to work for controllers inside subfolders…

CI has a bug for controllers inside subfolders. I have reported it here and explained a fix, but it's only for the current svn version. For a previous version's fix go here
#4

[eluser]easymind[/eluser]
I am using version 1.5.4, not from SVN. There is still a big routing bug there which I solved with a bugfix from somebody else. This was my thread:
http://ellislab.com/forums/viewthread/64013/

But for my language routing (even one subfolder deep, and it works with or without the language in the url) I use:

Code:
$route["[a-zA-Z]{2}/(:any)"] = "$1";

This is the only routing I need to have things like this work:

Code:
http://xxx.xx/en/folder/controller/function/vars/../..
http://xxx.xx/folder/controller/function/vars
http://xxx.xx/nl/controller/function/vars
http://xxx.xx/controller/function/vars

They all work fine (but I needed to fix Router.php and Codeigniter.php, as you can see in my thread or in the original bug fix)

Original bug fix.: http://codeigniter.com/bug_tracker/bug/2849/
#5

[eluser]Michael Ekoka[/eluser]
Hey easymind, I am glad that you were able to fix the controllers/sub-folder thing and that your language code works. I have also tested your language code regex ("[a-zA-Z]{2}") and I think I understand why it doesn't work for me when I omit the language code in the url. For the sake of documenting, I will explain, as this may spare some confusion for others who run into problems with this:

First, it will help to understand the difference between your routing style and mine:

In my applications, I typically have a few controllers that take care of a bunch of pages. For example, for all my forms, I will have a single Forms controller, but I never indicate the controllers in my urls. I rather use the routing for that. e.g. I don't have 'http://my_app.xxx/forms/registration', but 'http://my_app.xxx/registration'. I then include something like this in the routes:
Code:
$route["{$lang}/(registration|login|password_change|send_to_friend)"] = "forms/$1";
$route["{$lang}/(success|welcome|password_changed|thank_you)"] = "messages/$1";
$route["{$lang}/(about|disclaimer|privacy|contact)"] = "public/$1";
$route["{$lang}/(articles|blog|members|forum|wiki)"] = "private/$1";

Every time I add a form to my application, I add a handler for it to the route and in the Forms controller. The same goes for the other functionalities.

As you can see, this practice results in my application having almost no direct relations between the url routes and the controllers they invoke. In this situation your language regex (which has not been made optional) will not work if I forget to include the language code in my url.

However, if I changed my practice and started to include controllers in my urls as you do, I wouldn't need such an elaborate routing as there would be a direct relationship between urls and the controllers that they call. I could just do :

Code:
$route["{$lang}/(.*)"] = "$1";

In this case, yes, with or without the language code, the framework is able to match a controller, but not for the reason you might think. Understanding the url to controller routing mechanism might save you a few headaches when, down the road, you try to do something like this:
Code:
$route["{$lang}/key/(.*)"] = "decrypter/confirmation/$1";
and it doesn't work.


Second, we need to understand what your regex does:

Code:
$lang = "[a-zA-Z]{2}";
// this is a non optional regex, it HAS to be there for the match to work

$route["$lang/(:any)"] = "$1";
// "[a-zA-Z]{2}/(:any)"
// match an upper or lower case letter, followed by another upper or lower case letter,
// followed by a slash, followed by anything. Group that anything part and then use it
// to find a controller and a method.

Let's say that you invoke a url and omit the language code, e.g. http://my_app.xxx/forms/registration. If in this case we use your language code regex in routing, the match will fail (remember that your language code regex is not optional, it is mandatory). When a routing match fails, the framework reverts to a "failsafe mechanism" that consists of taking the url that was passed and trying to match it directly to the list of controllers, i.e. the /forms/registration url will effectively find the Forms controller and the registration method.

If we invoke a url using my routing method, and your language code regex while omitting the language code in the url, e.g. http://my_app.xxx/registration, the routing match will fail (mandatory language code is missing) and the "failsafe mechanism" will fail as well (no Registration controller found).
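
The difference can be modelled roughly in plain PHP. This is not CodeIgniter's actual router code, just a sketch under two assumptions: the first matching route wins, and when nothing matches the url is used as-is. The function name simulate_route is made up.

```php
<?php
// Rough model of the behaviour described above, not CodeIgniter's own code.
// Assumptions: first matching route wins; no match means the url is used as-is.
function simulate_route(array $routes, $uri)
{
    foreach ($routes as $pattern => $target) {
        $regex = '#^' . $pattern . '$#';
        if (preg_match($regex, $uri)) {
            return preg_replace($regex, $target, $uri);
        }
    }
    return $uri; // "failsafe": map the url straight to controller/method
}

$mandatory = array('[a-zA-Z]{2}/(registration|login)' => 'forms/$1');
$optional  = array('(?:[a-zA-Z]{2}/)?(registration|login)' => 'forms/$1');

echo simulate_route($mandatory, 'en/registration'), "\n"; // forms/registration
echo simulate_route($mandatory, 'registration'), "\n";    // registration (no such controller)
echo simulate_route($optional, 'en/registration'), "\n";  // forms/registration
echo simulate_route($optional, 'registration'), "\n";     // forms/registration
```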

This is the reason why I encourage you to modify the language code regex itself and make it optional, as it will work regardless of the routing method :
Code:
$lang = "([a-zA-Z]{2}/)?";
// an upper or lower case letter, followed by another upper or lower case letter
// followed by a slash. The whole thing is made optional with the ? operator.
or
Code:
$lang = "(?:[a-zA-Z]{2}/)?";
// see explanation in earlier post http://ellislab.com/forums/viewreply/314636/

then route like this:
Code:
// my style
$route["{$lang}(registration|login)"] = "forms/$2";
// match 2 letters followed by a slash,
// followed by either 'registration' or 'login'.
// The '2 letters and a slash' part may or may not be there.
// No matter, use the second part to call the corresponding
// method of the Forms controller

// your style
$route["{$lang}(:any)"] = "$2"; // "([a-zA-Z]{2}/)?(:any)"
// match 2 letters followed by a slash, followed by anything.
// The '2 letters and a slash' part may or may not be there.
// No matter, group the 'anything' part and use it to find
// the controller and the method.
#6

[eluser]easymind[/eluser]
Hmmmm....

Making the language code mandatory keeps you in control of the routing. But as far as I know the routing routine will follow all routing options, even if it finds more than one that fits. So after the first routing did or did not work (because the language code was included or not), it will then take the newly formed url and continue to match the rest of the routing rules.

So I think I would keep the non mandatory language code as a first routing rule and then make other routing rules to handle the redirects to registration forms and so on....

But also I like the controller->function urls. Maybe not when using the index function to pass variables to, but I just don't do that. So I try to make as little routing as possible.


Hmm, it seems not to work as I just said. The routing rules are not compared with the 'newly formed' url. I think that is what you mean....
So all routing should have a non-mandatory language part....
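
A rough model of that observation, in plain PHP. It is only a sketch that assumes the first matching rule is applied once and its result is not matched against the remaining rules. simulate_route is a made-up name, not a CI function.

```php
<?php
// Model only: the first matching rule is applied once, its result is NOT re-matched.
function simulate_route(array $routes, $uri)
{
    foreach ($routes as $pattern => $target) {
        $regex = '#^' . $pattern . '$#';
        if (preg_match($regex, $uri)) {
            return preg_replace($regex, $target, $uri);
        }
    }
    return $uri;
}

// A language-stripping rule first, then a remapping rule.
$chained = array(
    '[a-zA-Z]{2}/(.+)'     => '$1',
    '(registration|login)' => 'forms/$1',
);
echo simulate_route($chained, 'en/registration'), "\n"; // registration (second rule never runs)

// Each rule carries its own optional language part instead.
$per_rule = array(
    '(?:[a-zA-Z]{2}/)?(registration|login)' => 'forms/$1',
    '(?:[a-zA-Z]{2}/)?(.+)'                 => '$1',
);
echo simulate_route($per_rule, 'en/registration'), "\n"; // forms/registration
echo simulate_route($per_rule, 'registration'), "\n";    // forms/registration
```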
#7

[eluser]Michael Ekoka[/eluser]
Routing is an optional functionality in CI. Many developers bypass it. They rely solely on their url structure to map to a corresponding controller and method. Other developers (like me) actively use routing to remap to something different from what the url contains (if you think about it, that is the actual purpose of the routing functionality).

In your case, you seem to be using routing solely for the purpose of excluding the language code from the match. That is, if you didn't need the language code you would probably not need routing either as the framework would automatically match your urls to the controllers and methods. So for you, whether the language code is included or not dictates if the routing match will be used or if the automatic url=>controller match will be used.

For developers who actively use routing to effectively remap their urls, the language code has to be optional if they want their routing to be respected by the framework.
#8

[eluser]easymind[/eluser]
I understand you completely now and I think you were right. I use routing sometimes, and you have been posting about how to route normally AND also have a language code in the route.

You were right to do it your way.

Take care,
B
#9

[eluser]Maulwurf[/eluser]
Hey, this solved my endless tries to get routing working as you would expect it to :)
Thanks for sharing!
#10

[eluser]LuckyFella73[/eluser]
First of all thank you for contributing all this!

I implemented this into a test environment and now want to use this method
in a "real" project.

What I still don't understand is how to handle the uri segments. Let's say
I have a class "form" a method "update" and a third uri segment containing
the id of the db row to update. For the standard language there would be
2 possible urls to access my controller:

Code:
www.domain.com/form/update/id
// or for example
www.domain.com/en/form/update/id
In one case the id is found in uri segment 3 in the other it's 4. Do I have to place
multiple if statements in my controller to handle that problem or is there something
I just didn't understand?
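
One possible way around a pile of if statements, sketched in plain PHP: work out once whether segment 1 is a language code and shift every segment lookup by that offset. The function name and the supported list are assumptions for illustration. In a controller the lookup would become something like $this->uri->segment(3 + $offset).

```php
<?php
// Sketch: compute where the "real" segments start. Names are hypothetical.
function segment_offset($first_segment, array $supported)
{
    $candidate = strtolower((string) $first_segment);
    return in_array($candidate, $supported, true) ? 1 : 0;
}

$supported = array('en', 'de');

// www.domain.com/en/form/update/42, numbered from 1 like uri segments
$with_lang = array(1 => 'en', 2 => 'form', 3 => 'update', 4 => '42');
$offset = segment_offset($with_lang[1], $supported);
$id_with_lang = $with_lang[3 + $offset];
echo $id_with_lang, "\n"; // 42

// www.domain.com/form/update/42
$without_lang = array(1 => 'form', 2 => 'update', 3 => '42');
$offset = segment_offset($without_lang[1], $supported);
$id_without_lang = $without_lang[3 + $offset];
echo $id_without_lang, "\n"; // 42
```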



