Localization duplicate page problem when locale set in routes
#1

Hello everyone!

When we set the locale in routes, as the localization manual describes:

PHP Code:
$routes->get('{locale}/books', 'App\Books::index');

The problem is that the route then matches any word in the locale segment, so URLs like example.com/any_word/books become reachable.

For example, your demo page https://website2.codeigniter.com/en/download can also be opened as https://website2.codeigniter.com/anyotherword/download
or https://website2.codeigniter.com/anything/download

This creates a serious duplicate-page problem in search engines. Such pages should return a 404 error instead.

We can solve this by comparing the first segment with supportedLocales. But that does not work when the default language must be served without a locale segment in the URL.

How can we fix it correctly?
#2

(This post was last modified: 01-22-2020, 08:11 PM by jameslittle.)

So that is the way the documentation says it should behave. If CI can't match a locale, it's going to honor the route and serve your default locale.

But I definitely agree that this arrangement could have SEO ramifications for duplicate content. It's also probably not good for security to have URLs that answer to any arbitrary string.

This isn't tested, but in your controller method (App\Books::index in the example), you probably need something like:

PHP Code:
// Untested: return a 404 when segment 1 is not a supported locale.
$config = new \Config\App();

if (! in_array($this->request->uri->getSegment(1), $config->supportedLocales)) {
    throw \CodeIgniter\Exceptions\PageNotFoundException::forPageNotFound();
}

Hope that helps.
Designer, developer and Diet Dr. Pepper addict. Messing up PHP since <?= $when['year';] ?>
#3

Thanks for the answer.

This solves the problem only partially. There is a case where the default language should not appear in the URL. For example, a site may have started with a single language, so its pages without a locale in the URL already carry a lot of search weight. In this case, the described solution does not work.
#4

Hello Stormbringer,

Maybe you could use @jameslittle's solution and define some redirects manually for the old pages?
https://codeigniter4.github.io/userguide...ing-routes
#5

So this problem only occurs when setting the locale in routes, right? Not when using Content Negotiation?
#6

(01-24-2020, 12:14 AM)littlej Wrote: Hello Stormbringer,

Maybe you could use @jameslittle's solution and define some redirects manually for the old pages?
https://codeigniter4.github.io/userguide...ing-routes

That is the right decision if we change strategy and move from the old URL scheme to a new one that includes the locale. But what if there are too many old pages?
In that case we want to keep our old scheme:

1. RU (default lang) - site.com/URI
2. UK - site.com/uk/URI
3. EN - site.com/en/URI

It would be nice if all pages (old and new) in the default language stayed without a locale in the URL. This is common practice.
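
To make the scheme above concrete, here is a minimal, framework-independent sketch. The function name resolveLocale and its return shape are hypothetical, not part of CodeIgniter; the locale list is passed in as a plain array rather than read from the config.

```php
<?php
/**
 * Hypothetical helper (not a CodeIgniter API): split a URI path into
 * [locale, remaining path] when the default language has no URL prefix.
 */
function resolveLocale(string $path, array $supportedLocales, string $defaultLocale): array
{
    // Drop empty pieces produced by leading, trailing, or doubled slashes.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));
    $first    = $segments[0] ?? '';

    // Only a supported, non-default locale counts as a locale prefix.
    if ($first !== $defaultLocale && in_array($first, $supportedLocales, true)) {
        return [$first, implode('/', array_slice($segments, 1))];
    }

    // Anything else is an ordinary default-language path and is left as-is,
    // so "anything/books" or "ru/books" only resolves if a route matches it.
    return [$defaultLocale, implode('/', $segments)];
}
```

With this shape, site.com/uk/URI resolves to the uk locale, site.com/URI stays in the default language, and an unknown first segment is never silently treated as a locale.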
#7

I may be over-complicating this, but you could do what you're talking about with the Regular Expression feature in routes.

https://codeigniter4.github.io/userguide...xpressions

You would essentially be treating the locale segment like any other URL segment, bypassing the special "locale" option. You'll want to follow the RegEx route with a catch-all site.com/uri route definition.

Again... not tested, but something like this:

PHP Code:
// grab the locales and implode them into a regex alternation
$config = new \Config\App();
$locs   = implode('|', $config->supportedLocales);

// localized route first, then the catch-all route without a locale
$routes->add('(' . $locs . ')/uri', 'Controller::home');
$routes->add('uri', 'Controller::home');

If you go that route (pun intended), you might need to catch the segment to explicitly set the language using lang()

https://codeigniter4.github.io/userguide...=lang#lang

Don't know if this is the BEST solution, but it might be a path forward.
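
As a rough illustration of what that alternation matches, here is a standalone sketch that runs without the framework. The helper names localeAlternation and matchLocalizedUri are made up for this example, and the literal "uri" path stands in for a real route, as in the snippet above.

```php
<?php
// Hypothetical helper: build an alternation such as "(en|uk)" from a locale list.
function localeAlternation(array $locales): string
{
    // preg_quote escapes any regex metacharacters in the locale codes.
    $quoted = array_map(static fn (string $l): string => preg_quote($l, '#'), $locales);

    return '(' . implode('|', $quoted) . ')';
}

// Hypothetical helper: return the matched locale for "<locale>/uri", or null.
function matchLocalizedUri(string $path, array $locales): ?string
{
    $pattern = '#^' . localeAlternation($locales) . '/uri$#';

    return preg_match($pattern, trim($path, '/'), $m) === 1 ? $m[1] : null;
}
```

Only the listed locales match, so a path such as anything/uri falls through to whatever route comes next, or to a 404.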
#8

jameslittle, thanks for the hint. Both of our questions are successfully resolved by this code in routes.php:

PHP Code:
// localized routes: every supported locale except the default one
$supportedLocalesWithoutDefault = array_diff($this->config->supportedLocales, [$this->config->defaultLocale]);

if (in_array($this->request->uri->getSegment(1), $supportedLocalesWithoutDefault)) {
    $routes->get('{locale}/books', 'App\Books::index');

    // other localized routes
}

First we build the list of supported locales without the default one. Then we register the localized routes only when the first segment is one of those locales. This lets us keep the familiar URLs for the default language and blocks access to the invalid duplicate pages.
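
The decision logic can also be checked on its own, outside the routes file. In this sketch the function names are hypothetical, and the arguments stand in for the supportedLocales and defaultLocale config values used above.

```php
<?php
// Hypothetical helper: supported locales minus the default one.
function localizedPrefixes(array $supportedLocales, string $defaultLocale): array
{
    // array_diff preserves keys, so reindex to get a clean list.
    return array_values(array_diff($supportedLocales, [$defaultLocale]));
}

// Hypothetical helper: should the {locale} routes be registered for this request?
function shouldRegisterLocalizedRoutes(string $firstSegment, array $supportedLocales, string $defaultLocale): bool
{
    return in_array($firstSegment, localizedPrefixes($supportedLocales, $defaultLocale), true);
}
```

For locales ru, uk, and en with ru as the default, only uk and en as the first segment enable the localized routes; ru, an empty segment, or any other word does not.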

Anyway, this looks like a workaround. I think a proper solution should be built into the core.
#9

(This post was last modified: 01-28-2020, 03:38 PM by jameslittle.)

I'd agree with you that this is functionality that may make sense in core. More often than not, people are probably going to have to dance around the issues you faced.

Maybe there's some compelling reason why it's this way, but it's probably a good idea to open an issue on GitHub and let the team weigh in on it.

https://github.com/codeigniter4/CodeIgniter4/issues
#10

(This post was last modified: 08-27-2021, 09:42 AM by takbitdev.)

If we can reserve segment 1 of every URL for the locale, the problem is solved. This way each page gets a single canonical URL that you can declare to search engines.
For example:

https://example.com

will be converted to:

https://example.com/en

Of course this technique should apply to all URLs.

https://example.com/blog/article/1

will be converted to:

https://example.com/en/blog/article/1

I used this approach in CI3 and its SEO results were good.

Can anyone show how to do this in CI4, with some sample code?
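
One possible starting point, offered as an untested sketch rather than a CI4 recipe: a plain PHP function that computes the locale-prefixed path a request should be redirected to. The name canonicalLocalizedPath is hypothetical, and how it gets called (for example from a filter that issues a redirect) is an assumption left to the reader.

```php
<?php
/**
 * Hypothetical helper: return the canonical, locale-prefixed path to
 * redirect to, or null when the path already starts with a supported locale.
 */
function canonicalLocalizedPath(string $path, array $supportedLocales, string $defaultLocale): ?string
{
    $trimmed  = trim($path, '/');
    $segments = $trimmed === '' ? [] : explode('/', $trimmed);

    // Already prefixed with a supported locale: nothing to do.
    if ($segments !== [] && in_array($segments[0], $supportedLocales, true)) {
        return null;
    }

    // Otherwise put the default locale in front of the existing segments.
    array_unshift($segments, $defaultLocale);

    return '/' . implode('/', $segments);
}
```

With en as the default locale, / maps to /en and /blog/article/1 maps to /en/blog/article/1, while /en/blog/article/1 is left alone.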



