Problems with UTF8

#1
[eluser]gh05t[/eluser]
Hey guys, I have a problem with UTF8. My view which prints text from the database won't display all the characters properly. I've done the following:
- set char_set to utf8 and dbcollat to utf8_general_ci in database.php
- edited head in the view: <meta http-equiv="content-type" content="text/html; charset=utf-8"/>
- double-checked database and tables charsets: all is set to utf8
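For reference, a minimal sketch of what the first item looks like in a CodeIgniter 2.x config file. It assumes the default connection group is named 'default' and leaves out every other key (hostname, username and so on).

```php
<?php
// application/config/database.php (CodeIgniter 2.x), charset-related keys only.
// Assumption: the active connection group is 'default'.
$db['default']['char_set'] = 'utf8';            // character set requested for the connection
$db['default']['dbcollat'] = 'utf8_general_ci'; // collation requested for the connection
```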

Is it a bug? Or am I missing something?

Please help.

I'm using:
CodeIgniter 2.0.2
MySQL 5.0.45
PHP 5.2.6
Apache 2.2.9
NetBeans 6.8

#2
[eluser]osci[/eluser]
The above should be enough I think.
I don't know whether the files have to be saved as UTF-8 (no BOM), since the data here comes from the db.

Was the db data imported from another db?

#3
[eluser]gh05t[/eluser]
The text is extracted from PDF files and inserted into the database - all done with a bash script.

When I select a couple of rows from the DB in the terminal, all the characters display properly.

#4
[eluser]WanWizard[/eluser]
Check if the page you're loading in the browser is really encoded in UTF-8 (in FF for example via View, Character Encoding).

Some Linux distros, for example, ship a default Apache config that forces the output to ISO-8859-1 (look for AddDefaultCharset).
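If editing the Apache config is not an option, one alternative is to send the charset from PHP, since AddDefaultCharset should only apply when the response does not already declare one. This is a sketch of that alternative, not the Apache setting itself; the helper name utf8_content_type_header() is made up for illustration.

```php
<?php
// Hypothetical helper (not part of CodeIgniter): returns the header line to send.
function utf8_content_type_header()
{
    return 'Content-Type: text/html; charset=utf-8';
}

// Headers can only be sent before any output has started.
if (!headers_sent()) {
    header(utf8_content_type_header());
}
```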

#5
[eluser]gh05t[/eluser]
Done that, it's UTF8.

More interesting things I've found out:
1. There are three MySQL server variables, character_set_client, character_set_connection and character_set_results, which are all set to 'latin1'. When I change them to 'utf8' they show as 'utf8' for a moment, but go back to 'latin1' when I close the command line. I'm using the "SET GLOBAL variable_name='value'" command. Is this ok?
2. In /usr/share/mysql/charsets there is no utf8.xml - I have a strong feeling that it's supposed to be there.
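On item 1: as far as I know, SET GLOBAL only changes the default that new connections start from, and each client can still announce its own character set when it connects, so what matters is what the PHP connection sees rather than what the command line shows. A small sketch for checking that from the application side. The helper name non_utf8_session_vars() is hypothetical, and the rows are assumed to come from SHOW VARIABLES LIKE 'character_set%'.

```php
<?php
// Hypothetical helper: takes rows shaped like the output of
// SHOW VARIABLES LIKE 'character_set%' and returns the session
// variables whose value is not 'utf8'.
function non_utf8_session_vars(array $rows)
{
    $watched = array(
        'character_set_client',
        'character_set_connection',
        'character_set_results',
    );
    $bad = array();
    foreach ($rows as $row) {
        if (in_array($row['Variable_name'], $watched) && $row['Value'] !== 'utf8') {
            $bad[$row['Variable_name']] = $row['Value'];
        }
    }
    return $bad;
}

// In a CodeIgniter controller the rows could be fetched like this (assumption):
// $rows = $this->db->query("SHOW VARIABLES LIKE 'character_set%'")->result_array();
// print_r(non_utf8_session_vars($rows));
```

An empty array means the three variables are already utf8 for the connection PHP is using.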

Oh yeah: I'm running the server on Fedora 8 Werewolf (I know, outdated)

Thanks for the help so far guys.

#6
[eluser]gh05t[/eluser]
I've also tried putting:
Code:
$this->db->query("SET NAMES 'utf8'");
$this->db->query("SET character_set_connection='utf8'");
$this->db->query("SET character_set_results='utf8'");
$this->db->query("SET character_set_client='utf8'");
in my controller before fetching the results, but echo mysql_client_encoding(); still returns latin1.

WTH?
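A likely reason for the latin1 result: mysql_client_encoding() reports the character set the client library itself was told about, and a SET NAMES sent as an ordinary query changes the server-side session without updating that value. The PHP manual recommends the dedicated set-charset call instead. Below is a rough sketch using mysqli rather than the old mysql_* functions from the post; force_utf8_connection() is a made-up name and $link is assumed to be an already-open connection.

```php
<?php
// Sketch only: uses mysqli instead of the mysql_* extension from the thread.
// Assumption: $link is an open mysqli connection created elsewhere.
function force_utf8_connection(mysqli $link)
{
    // Tells both the server session and the client library to use utf8.
    if (!mysqli_set_charset($link, 'utf8')) {
        return false; // the server or client library rejected the charset
    }
    // Returns the charset name the client library now reports.
    return mysqli_character_set_name($link);
}
```

With the mysql_* extension, the equivalent calls would be mysql_set_charset() (available from PHP 5.2.3) followed by mysql_client_encoding().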

#7
[eluser]WanWizard[/eluser]
The CI database driver sets the character set and the collation based on your db config, so this should not be needed.

#8
[eluser]InsiteFX[/eluser]
I have mentioned this a hundred times on the forums here: there is a bug in MySQL that always sets the collation to latin1_swedish_ci, no matter what you set the collation to.

phpMyAdmin:
1) Click on your database name.
2) Click on the Operations tab in the top tab menu.
3) Change the collation for the database, at the very bottom, to utf8_general_ci or utf8_unicode_ci.
4) SAVE

InsiteFX

#9
[eluser]gh05t[/eluser]
The problem persists. Any other ideas?


I've added
Code:
[mysqld]
skip-character-set-client-handshake=1
default-character-set=utf8

in my.cnf

All the variables (character_set_client, character_set_results etc.) are now UTF8, but echo mysql_client_encoding() still returns 'latin1'. Is there any way to convert latin1 to utf8 explicitly? I've tried
Code:
mb_convert_encoding($str,'UTF8','latin1');

but it doesn't work.

edit #2:
Now I'm really confused: mb_detect_encoding shows that the string is in UTF8, but it still doesn't display the characters properly. What the hell is going on?
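One possibility that nothing in this thread confirms is double encoding: text that was already UTF-8 gets treated as latin1 somewhere along the way and converted to UTF-8 a second time. The result is still valid UTF-8, so detection reports UTF8, yet it displays as garbage. A minimal sketch of the effect and of undoing one layer; the sample string is made up and is not data from the database in question.

```php
<?php
// Illustration only: "é" written as its two UTF-8 bytes.
$original = "\xC3\xA9";

// Simulate the fault: the UTF-8 bytes are mistaken for latin1
// and converted to UTF-8 a second time.
$double = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// The damaged string is still well-formed UTF-8, so a validity check passes.
$still_valid = mb_check_encoding($double, 'UTF-8');

// Undo one layer: converting "back" to latin1 yields the original UTF-8 bytes.
$repaired = mb_convert_encoding($double, 'ISO-8859-1', 'UTF-8');
```

If applying the last step to a stored string makes it display correctly, the data was double encoded before or during the insert; if it makes things worse, the cause is elsewhere.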

