CodeIgniter Forums
Problem retrieving words with accents when they come from MySQL




Problem retrieving words with accents when they come from MySQL - El Forum - 02-02-2011

[eluser]caperquy[/eluser]
Hello
I am implementing a CodeIgniter application where I have to retrieve from a MySQL database all documents that contain a specific word, regardless of how it is spelled. In my example I am looking for Eglise, which can be written Eglise, église or Église.
If I run the following code:

Code:
$pattern="/(e|è|é|ê|ë)gl(i|ì|í|î|ï)s(e|è|é|ê|ë)/i";
$texte="xxxx églises yyyy Eglise zzzz Église tttt";
$nb=preg_match_all($pattern, $texte, $matches, PREG_OFFSET_CAPTURE);
echo "Matches found : $nb <br />";
for ($i=0;$i<$nb;$i++)
    {
        echo "Matches[0][$i][0] = ".$matches[0][$i][0]."<br />";
    }

Three matches are found, which means that all three spellings have been detected.

If the text instead comes from the MySQL database, the spelling église is not found. What can explain that difference?
Many thanks to whoever can give me a clue.

CapErquy


Problem retrieving words with accents when they come from MySQL - El Forum - 02-02-2011

[eluser]xatrix[/eluser]
http://www.phpbuilder.com/board/showthread.php?t=10344217


Problem retrieving words with accents when they come from MySQL - El Forum - 02-02-2011

[eluser]caperquy[/eluser]
I looked at the document you mentioned. It seems to me that utf8_general_ci should be appropriate.
Does that mean that in config.php I should write:

Code:
$config['charset'] = "utf8_general_ci";

instead of

Code:
$config['charset'] = "UTF-8";

CapErquy


Problem retrieving words with accents when they come from MySQL - El Forum - 02-02-2011

[eluser]rogerwaldrup[/eluser]
Check this out: http://forums.mysql.com/read.php?103,392215,392215


Problem retrieving words with accents when they come from MySQL - El Forum - 02-02-2011

[eluser]caperquy[/eluser]
I checked what you said. I then changed all fields in the MySQL table I am using, first to utf8_general_ci and then to utf8_unicode_ci.
In both cases it still does not work.
I really do not know what to do.
CapErquy


Problem retrieving words with accents when they come from MySQL - El Forum - 02-03-2011

[eluser]xatrix[/eluser]
No, no. When you create your database or table (in phpMyAdmin or another tool) you should specify your preferred collation (e.g. utf8_general_ci).

What is the result of the query that you're running preg_match_all on?
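
On the config.php question above: the page character set and the MySQL collation are separate settings, so "utf8_general_ci" does not belong in $config['charset']. A minimal sketch of where each value usually lives, assuming a stock CodeIgniter layout and a connection group named "default" (both are assumptions; adjust the paths and group name to your install):

```php
<?php
// application/config/config.php
// Character set the framework uses by default in methods that need one.
// This stays "UTF-8"; it is not a MySQL collation name.
$config['charset'] = 'UTF-8';

// application/config/database.php
// Character set and collation used when talking to MySQL.
// Assumes the connection group is called "default".
$db['default']['char_set'] = 'utf8';
$db['default']['dbcollat'] = 'utf8_general_ci';
```

The collation only affects how MySQL compares and sorts strings; the character set decides which bytes PHP actually receives.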


Problem retrieving words with accents when they come from MySQL - El Forum - 02-03-2011

[eluser]Kobus M[/eluser]
One more thing you need to consider is that your document's character set should be exactly the same as your database character set. It is set inside the <head></head> tags in HTML, or with a call to the header() function in PHP.

In HTML it would be something like this:

Code:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />

In PHP it would look like this:

Code:
// No space before the colon; must be called before any output is sent.
header('Content-Type: text/html; charset=utf-8');

If your markup/script language does not coincide with your database charset, you will have some problems.
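
One way to see whether such a mismatch is present is to look at the raw bytes of the text that comes back from the database. The sketch below is only an illustration: describe_bytes() is a hypothetical helper name, and the two sample strings are written as explicit byte escapes so the result does not depend on how the PHP file itself is saved.

Code:
```php
<?php
// The same word in two encodings, spelled out byte by byte.
$utf8   = "\xC3\xA9glise"; // "église" in UTF-8: é is the two bytes C3 A9
$latin1 = "\xE9glise";     // "église" in ISO-8859-1: é is the single byte E9

// Hypothetical helper: report the raw bytes of a string and whether
// they form valid UTF-8.
function describe_bytes($text)
{
    return array(
        'hex'        => bin2hex($text),
        // With the u modifier, preg_match() returns 1 only when the
        // subject is valid UTF-8; on invalid input it returns false.
        'valid_utf8' => preg_match('//u', $text) === 1,
    );
}

$a = describe_bytes($utf8);
$b = describe_bytes($latin1);
echo $a['hex'], ' ', var_export($a['valid_utf8'], true), "\n";
echo $b['hex'], ' ', var_export($b['valid_utf8'], true), "\n";
```

Running the same helper on a value fetched from MySQL shows which of the two forms the application is really receiving.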

Kobus


Problem retrieving words with accents when they come from MySQL - El Forum - 02-04-2011

[eluser]caperquy[/eluser]
To answer xatrix's question, this is the text that I pass to the preg_match_all function:

Chaque étape est montrée dans des églises de différents coins de France

The word église that I am looking for is really there.

I also did what Kobus said.
I recreated my database using the following command:

Code:
CREATE DATABASE `ci_docavy` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

I also set my HTML code to look like this:

Code:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<?php
header('Content-type : text/html; charset=utf-8');
?>

Unfortunately there is no change.

CapErquy


Problem retrieving words with accents when they come from MySQL - El Forum - 02-04-2011

[eluser]Kobus M[/eluser]
There are a few final things I can think of for you to try:

1. Check the character set settings in your server configuration files (php.ini, my.cnf, etc.). All of them should be set to UTF-8.
2. Even after recreating your database with UTF-8, individual fields could still be set to ISO-8859-1 or something else. Make sure the data itself is stored as UTF-8 too.
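
Point 2 can also be checked and worked around from PHP. The following is a rough sketch, not a definitive fix: to_utf8() and count_eglise() are hypothetical helper names, the mbstring extension is assumed to be enabled, and any text that is not valid UTF-8 is assumed to be ISO-8859-1. The pattern uses the u modifier so that PCRE reads both pattern and subject as UTF-8; without it, an accented letter is matched as separate bytes and the i modifier does not pair É with é.

Code:
```php
<?php
// Hypothetical helper: return $text as UTF-8. Anything that is not
// already valid UTF-8 is ASSUMED to be ISO-8859-1.
function to_utf8($text)
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

// Hypothetical helper: count the spellings of "eglise" in UTF-8 text.
function count_eglise($text)
{
    // \x{E8}-\x{EB} covers è é ê ë and \x{EC}-\x{EF} covers ì í î ï.
    // u = treat pattern and subject as UTF-8, i = case-insensitive.
    $pattern = '/[e\x{E8}-\x{EB}]gl[i\x{EC}-\x{EF}]s[e\x{E8}-\x{EB}]/iu';
    return preg_match_all($pattern, $text, $matches);
}

// Sample text as it might arrive from a Latin-1 column or connection:
// E9 is é and C9 is É in ISO-8859-1.
$from_db = "des \xE9glises, une Eglise, une \xC9glise";
echo count_eglise(to_utf8($from_db)), "\n";
```

Converting once, right after the fetch, keeps the rest of the code working on a single encoding.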

If this does not help, I am sorry - I am out of ideas. Having struggled with this myself, I solved my issues by doing the things I suggested.

Kobus