CodeIgniter Forums
Convert database data to UTF-8 - Printable Version

+- CodeIgniter Forums (https://forum.codeigniter.com)
+-- Forum: Archived Discussions (https://forum.codeigniter.com/forumdisplay.php?fid=20)
+--- Forum: Archived Development & Programming (https://forum.codeigniter.com/forumdisplay.php?fid=23)
+--- Thread: Convert database data to UTF-8 (/showthread.php?tid=35651)



Convert database data to UTF-8 - El Forum - 11-06-2010

[eluser]Swedie[/eluser]
Hello! Some of you might've seen my questions on the topic of encoding.

Now I need a good way of turning the data in the database into UTF-8. All data is currently in ISO-8859-1 forma so upon displaying it the swedish characters are all symbols. Not so good.

A quick fix is to run utf8_encode on the data before display it. But that's not gonna work in the long run. Instead I want to convert the data to the correct encoding in the database.

Would you suggest to loop through all tables and do a utf8_encode on everything? I know one negative about this, and it's that if the loop fails and you run it again, the already encoded characters will be encoded and displayed incorrectly again.

Code suggestion is welcome. Thanks Smile


Convert database data to UTF-8 - El Forum - 11-06-2010

[eluser]dudeami0[/eluser]
First off backing up your data is a must.

Second off, MySQL has these conversions built in.

Code:
ALTER DATABASE database DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

Hope this helps Smile


Convert database data to UTF-8 - El Forum - 11-06-2010

[eluser]Swedie[/eluser]
It's backed up. I'm also in the test environment.

Your above suggestion does not work, done that both ot the whole database but also the tables themselves.

The input into the database that was once made was with a page encoding of ISO-8859-1.

Changing just the database or table encoding only alters future inserts (as far as I know), not current data.


Convert database data to UTF-8 - El Forum - 11-06-2010

[eluser]dudeami0[/eluser]
Hmm, could try exporting and importing it:

Code:
mysqldump --user=username --password=password --default-character-set=charset --compatible=mysql40 dbname > dump.sql

Code:
mysql --user=username --password=password --default-character-set=utf8 dbname < dump.sql

Changing the commands to meet your needs.


Convert database data to UTF-8 - El Forum - 11-06-2010

[eluser]LuckyFella73[/eluser]
Hej Swedie,

Phil Sturgeon has posted a very nice tutorial how to set up
a clean utf-8 invironment or how to convert an existing
projekt into utf-8 encoding (database). Here is the link
to his blog:
http://philsturgeon.co.uk/news/2009/08/UTF-8-support-for-CodeIgniter

Hope that helps


Convert database data to UTF-8 - El Forum - 11-06-2010

[eluser]InsiteFX[/eluser]
You can alter the table all you want it will not change the database collation!

In PhpMyAdmin go to operations tab and chage the collation on the bottom dropdown.
It defaults to latin1_swedish_ci

InsiteFX


Convert database data to UTF-8 - El Forum - 11-07-2010

[eluser]Swedie[/eluser]
Well sadly none of above worked. So I wrote a PHP function to loop through all of the tables and convert the data when needed using PHP's mb_convert and mb_encode functions.

I think it worked Smile

Here's the code in case anyone wants it. You can true or false to convert data or just the tables (and columns). I haven't added collate, as it self-adjust to the correct one (atleast for me).

Code:
function fixEncoding($convertData = TRUE, $convertTableAndColumns = TRUE) {
    $table_encoding = "utf8";
    $start_on = "";
    $end_after = "";
    $sql_tables = mysql_query("SHOW TABLES");
    while($table = mysql_fetch_assoc($sql_tables)) {
        if($start_on AND substr($table['Tables_in_smc_www2'],0,strlen($start_on)) < $start_on) continue;
        if($end_after AND substr($table['Tables_in_smc_www2'],0,strlen($end_after)) > $end_after) continue;
        if($convertTableAndColumns) mysql_query("ALTER TABLE ".$table['Tables_in_smc_www2']." CONVERT TO CHARACTER SET ".$table_encoding.";");
        if($convertData) {
            ob_start();
            print "Table: ".$table['Tables_in_smc_www2']." - ";
            $sql_total = mysql_query("SELECT * FROM ".$table['Tables_in_smc_www2']."");
            $total_records = mysql_num_rows($sql_total);
            print " Total records ".$total_records." - ";
            $per_run = 1000;
            $runs = @ceil($total_records/$per_run);
            if($runs <= 0) $runs = 1;
            print "Runs: ".$runs." - Completed: ";
            $done = 0;
            for($i=0; $i < $runs; $i++) {
                if($i > 0) $done += $per_run;
                if($per_run > $total_records) $per_run = $total_records;
                $sql_run = mysql_query("SELECT * FROM ".$table['Tables_in_smc_www2']." LIMIT ".$done.", ".$per_run."");
                $run_count = mysql_num_rows($sql_run);
                if($per_run > $run_count) $per_run = $run_count;
                print $done."-".($done+$per_run).", ";
                if($run_count > 0) {
                    while($row = mysql_fetch_assoc($sql_run)) {
                        unset($data);
                        unset($first);
                        unset($values);
                        foreach($row as $field => $value) {
                            if(!$first) {
                                $first['field'] = $field;
                                $first['id'] = $value;
                            }
                            if(!is_numeric($value)) {
                                $value = stripslashes(html_entity_decode($value));
                                $value = mb_convert_encoding($value, 'ISO-8859-1', mb_detect_encoding($value, 'UTF-8, ISO-8859-1', true)); //from something to iso, then over to utf-8 db tables
                                $data[$field] = $value;
                            }
                        }
                        if(is_array($data)) {
                            foreach($data as $key => $value) {
                                if($key != $first['field']) {
                                    $values .= "$key = '".addslashes($value)."', ";
                                }
                            }
                            $query = "UPDATE ".$table['Tables_in_smc_www2']." SET ".substr($values, 0, -2)." WHERE ".$first['field']." = ".$first['id']."";
                            mysql_query($query) or die("Table: ".$table['Tables_in_smc_www2'].", Fieldname: ".$first['id'].": ".mysql_error());
                        }
                    }
                }
            }
        }
        print "Done!<br>";
        sleep(1);
        ob_flush();
    }
}