Tutorial - encoding issue

#1
Hello

I'm new to CodeIgniter and I followed the tutorial section on creating a news application.
At the end I ran into an issue with the slug encoding.
If I write words with accents (e.g. é à è), the accents are replaced with ??? in the slug field of my database, whereas they are displayed correctly in the other fields (title and text). I looked at url_title, but it has no encoding parameter.

[Image: 1508848542-issue.jpg]

In application/config/database.php, char_set is utf8 and dbcollat is utf8_general_ci


Thanks for your help.

#2
I think this line is the cause: https://github.com/bcit-ci/CodeIgniter/b...r.php#L508

#3
I think ivantcholakov is right. A quick look at php.net suggests this:

PHP Code:
$lower_case_str = mb_strtolower($str, 'UTF-8');

http://php.net/manual/en/function.mb-strtolower.php
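
To make that concrete, here is a minimal sketch of mb_strtolower on an accented title. It assumes the mbstring extension is loaded; the variable names and the sample string are made up for illustration and are not taken from the tutorial.

```php
<?php
// Sample title containing multi-byte UTF-8 characters (illustrative only).
$title = 'Été À La Plage';

// mb_strtolower works on characters, not bytes, so accented capitals
// are lowercased and the result remains valid UTF-8.
$lower = mb_strtolower($title, 'UTF-8');

echo $lower, "\n"; // été à la plage
var_dump(mb_check_encoding($lower, 'UTF-8')); // bool(true)
```

Plain strtolower is not multi-byte aware, so depending on the PHP version and locale it either leaves the accented capitals untouched or can damage their bytes.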

Although there are some other functions in the strtolower user notes like this:

PHP Code:
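<?php
// Round-trips through ISO-8859-1, so it is only usable for characters
// that exist in that encoding.
function strtolower_utf8($inputString) {
    $outputString = utf8_decode($inputString);
    $outputString = strtolower($outputString);
    $outputString = utf8_encode($outputString);
    return $outputString;
}
?>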

Paul.

#4
Personally, I run the string through convert_accented_characters before passing it to url_title, so that you get e a e instead of é à è, which gives cleaner URLs. The previous posts are correct: strtolower isn't UTF-8 safe.
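
As a rough, framework-free sketch of that idea: demo_strip_accents and demo_slug below are hypothetical stand-ins that I made up for illustration. They are not CodeIgniter's implementation, and the accent map covers only a handful of characters.

```php
<?php
// Hypothetical stand-in for convert_accented_characters():
// a tiny hand-written map, far smaller than what a real helper covers.
function demo_strip_accents(string $str): string
{
    $map = ['é' => 'e', 'è' => 'e', 'à' => 'a', 'É' => 'E', 'È' => 'E', 'À' => 'A'];
    return strtr($str, $map);
}

// Hypothetical stand-in for url_title(): transliterate first,
// then drop anything outside a-z, 0-9, space, underscore and hyphen.
function demo_slug(string $title, string $separator = '-'): string
{
    $ascii = demo_strip_accents($title);
    $ascii = preg_replace('#[^a-z0-9 _-]#i', '', $ascii);
    $ascii = preg_replace('#\s+#', $separator, trim($ascii));
    // Plain strtolower is fine here because the string is ASCII by now.
    return strtolower($ascii);
}

echo demo_slug('Été à la plage'), "\n"; // ete-a-la-plage
```

In CodeIgniter itself the equivalent call would be along the lines of url_title(convert_accented_characters($title), '-', TRUE), with the text and url helpers loaded.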

You can put this in application/helpers/url_helper.php; I have already made the replacement for you.
https://pastebin.com/3dhY5SmY

@PaulD: Your second snippet will replace every character that does not exist in ISO-8859-1 with ?, so I wouldn't recommend it.

I just took a quick look at CI 2.2.6 and compared it with 3.1.6. In the old version all illegal characters were removed, but now they are kept, which in my opinion generates illegal URLs.

PHP Code:
$trans = array(
    '&.+?;'                 => '',
    '[^a-z0-9 _-]'          => '',
    '\s+'                   => $separator,
    '('.$q_separator.')+'   => $separator
);

PHP Code:
$trans = array(
    '&.+?;'                  => '',
    '[^\w\d _-]'             => '',
    '\s+'                    => $separator,
    '('.$q_separator.')+'    => $separator
);

https://github.com/bcit-ci/CodeIgniter/c...ec468540f1

This is referenced here:
https://github.com/bcit-ci/CodeIgniter/issues/4993

Personally, I think it should be changed back to the following, if everything other than a-z, 0-9, space, _ and - is considered illegal. Any thoughts on this?
PHP Code:
'[^a-z0-9 _-]'          => ''
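
To show the difference between the two character classes quoted above, here is a small standalone comparison. The sample title and variable names are made up, and the modifiers are my assumption of how the patterns get applied: "i" for the old one, "iu" for the new one with UTF-8 enabled.

```php
<?php
// Sample title (illustrative only).
$title = 'Café crème';

// Old character class: every byte outside a-z, 0-9, space, _ and - is dropped,
// which includes both bytes of each accented letter.
$old = preg_replace('#[^a-z0-9 _-]#i', '', $title);

// New character class in UTF-8 mode: \w also matches accented letters,
// so they survive.
$new = preg_replace('#[^\w\d _-]#iu', '', $title);

echo $old, "\n"; // Caf crme
echo $new, "\n"; // Café crème
```

The old class gives a pure ASCII result but silently loses letters, which is another reason to transliterate the accents before building the slug.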

#5
Hello

Thanks for your replies.
I will use convert_accented_characters before url_title to get cleaner URLs without accents.

#6
Read this article on supporting the full Unicode character set: "How to support full Unicode in MySQL databases".