ciHTACCESS the best .htaccess for CodeIgniter
#1

[eluser]kuroir[/eluser]
After a lot of time working with CodeIgniter, I've found that one of its main problems is the duplicate content you can accidentally end up with.

Let's say we're using a clean installation of CodeIgniter, so default_controller is set to "welcome". Let's also suppose we use one of the simple .htaccess files floating around the web that only remove "index.php" from the URL.

So we end up with the following alternative ways to reach our default controller:

Code:
http://test.com/
http://test.com/welcome
http://test.com/welcome/
http://test.com/welcome/index
http://test.com/welcome/index/

And if you have a URL suffix, for instance ".html", there are even more ways to reach it:
Code:
http://exolimpo.com/welcome.html
http://exolimpo.com/welcome/.html
http://exolimpo.com/welcome/index.html
http://exolimpo.com/welcome/index/.html

You can even do some crazy stuff like:
Code:
http://test.com/welcome///////////////index//////////.html

Every single variant is a different page to search engines, which is bad for us. So what do we need to do?

Code:
1. Make the 'default' controller and index.php point to "/"
2. Remove the /index/ segment from the URL.
3. Remove trailing slashes.
4. Remove multiple slashes inside the URL.
5. Add the URL suffix (optional).
6. Remove slashes before the URL suffix (optional).
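
The six steps above describe a mapping from many URL variants to a single one. Here is a minimal Python sketch of that target mapping. It is not part of the .htaccess, and the function name canonical_path and its parameters are made up for illustration. It only models the intended result, so you can compare it against what your server actually redirects to:

```python
DEFAULT_CONTROLLER = "welcome"  # assumption: same default_controller as in the post


def canonical_path(path, suffix="", default_controller=DEFAULT_CONTROLLER):
    """Map any variant of a request path to the one form the six steps aim for."""
    # Step 6: strip the suffix first, so "welcome/.html" is cleaned like "welcome/".
    if suffix and path.endswith(suffix):
        path = path[: -len(suffix)]
    # Steps 3 and 4: dropping empty segments removes trailing and repeated slashes.
    segments = [s for s in path.split("/") if s]
    # Step 2: a final "index" segment is the implicit default method.
    if segments and segments[-1] == "index":
        segments.pop()
    # Step 1: index.php or the default controller on its own is the site root.
    if segments in ([], ["index.php"], [default_controller]):
        return "/"
    # Step 5: put the (optional) suffix back on the cleaned path.
    return "/" + "/".join(segments) + suffix


variants = ["/", "/welcome", "/welcome/", "/welcome/index", "/welcome/index/"]
print({canonical_path(v) for v in variants})   # {'/'}
print(canonical_path("/welcome///////////////index//////////.html", suffix=".html"))  # /
print(canonical_path("/blog//post/index/"))    # /blog/post
```

Only a trailing "index" segment is dropped here, so a path like /blog/indexes or /blog/index/5 is left as it is.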

After some testing I ended up with the following:

Code:
<IfModule mod_rewrite.c>

# ciHTACCESS, by Mario "Kuroir" Ricalde
RewriteEngine On
RewriteBase /

# Redirect index.php and the default controller (you need to edit this) to "/".
# This prevents duplicate content. ( /welcome/index , index.php => / )
RewriteRule ^(welcome(/index)?|index(\.php)?)/?$ / [L,R=301]

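# Remove a trailing /index segment from the URL. The pattern is anchored with $
# so paths such as /blog/indexes or /blog/index/5 are left alone.
# If you enable the .html suffix below, you may want /index/?(\.html)?$ instead.
RewriteRule ^(.*)/index/?$ $1 [L,R=301]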

# Remove Trailing Slashes.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)(/+)$ $1 [L,R=301]

# Remove multiple slashes in between segments.
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]

# Add the file suffix (the suffix can be set in config.php).
# RewriteCond %{REQUEST_FILENAME} !-f
# RewriteCond %{REQUEST_FILENAME} !-d
# RewriteCond %{REQUEST_URI} !\.html
# RewriteRule ^(.+)$ $1\.html [L,R=301]

# Remove any slash before .html ( to prevent site/.html )
# RewriteCond %{REQUEST_URI} \/+\.html$ [NC]
# RewriteRule ^(.+)\/+\.html$ $1\.html [L,R=301]

# Send everything to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]

</IfModule>

Test it out and tell me if you find any bugs. Just save it as your .htaccess :)

Any comments would really help. I'm wondering if anyone else needs this?
#2

[eluser]CroNiX[/eluser]
I'm not sure why this would matter. If you code properly, you use the same format for links throughout your site. It doesn't matter if you can access the content a different way, because if there is no LINK to it, no search bot will find it.
#3

[eluser]Jonas G[/eluser]
Good work.

But you don't get a duplicate content penalty for having the same page with different urls (http://googlewebmastercentral.blogspot.c...nalty.html).
#4

[eluser]kuroir[/eluser]
[quote author="CroNiX" date="1256127036"]I'm not sure why this would matter. If you code properly, you use the same format for links throughout your site. It doesn't matter if you can access the content a different way, because if there is no LINK to it, no search bot will find it.[/quote]

I've seen it happen: apps on CI and CakePHP with duplicate content because of this. It's true that if you define your URLs with $route there's less to worry about, but still.

As for "no link to it": once your app is big enough, malicious people will start messing around with your URLs. That can be fixed with a canonical link (rel="canonical"), though.

I did this just to add some extra insurance.
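
For reference, the canonical link mentioned above is a tag in the page head that names the preferred URL. Here is a rough Python sketch of building one. The helper name canonical_link_tag is hypothetical, and dropping the query string is an assumption that will not suit every site. In a real CI app you would do the equivalent in your PHP view:

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(requested_url, canonical_path):
    """Build a <link rel="canonical"> tag pointing at the preferred URL."""
    parts = urlsplit(requested_url)
    # Keep scheme and host, swap in the preferred path.
    # Assumption: query string and fragment are not part of the canonical URL.
    preferred = urlunsplit((parts.scheme, parts.netloc, canonical_path, "", ""))
    return '<link rel="canonical" href="{}">'.format(escape(preferred, quote=True))


print(canonical_link_tag("http://test.com/welcome/index/", "/"))
# <link rel="canonical" href="http://test.com/">
```

Unlike the 301 redirects, the tag does not change the URL the visitor sees. It only tells search engines which variant to index.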
#5

[eluser]kuroir[/eluser]
[quote author="Jonas G" date="1256130355"]Good work.

But you don't get a duplicate content penalty for having the same page with different urls (http://googlewebmastercentral.blogspot.c...nalty.html).[/quote]

Canonicalization (c14n) is the real issue here; check your Google Webmaster panel. It's a pain to look at the junk data generated by URL variants. Also:

Quote: In step 3, if we aren't able to detect all the duplicates of a particular page, we won't be able to consolidate all of their properties. This may dilute the strength of that content's ranking signals by splitting them across multiple URLs.
#6

[eluser]Billy Shall[/eluser]
Using pieces of this on some of my sites. Thanks
#7

[eluser]cryogenix[/eluser]
Yours is actually very similar to this one I've been using for a year now:

http://farinspace.com/codeigniter-htaccess-file/
#8

[eluser]Krynble[/eluser]
Thank you for these settings, they're very useful! Using them on all my projects now =)
#9

[eluser]Unknown[/eluser]
I registered on these forums just to say thank you. Helped me a ton.