Welcome Guest, Not a member yet? Register   Sign In
Convert database data to UTF-8
#1

[eluser]Swedie[/eluser]
Hello! Some of you might've seen my questions on the topic of encoding.

Now I need a good way of turning the data in the database into UTF-8. All data is currently in ISO-8859-1 forma so upon displaying it the swedish characters are all symbols. Not so good.

A quick fix is to run utf8_encode on the data before display it. But that's not gonna work in the long run. Instead I want to convert the data to the correct encoding in the database.

Would you suggest to loop through all tables and do a utf8_encode on everything? I know one negative about this, and it's that if the loop fails and you run it again, the already encoded characters will be encoded and displayed incorrectly again.

Code suggestion is welcome. Thanks Smile
#2

[eluser]dudeami0[/eluser]
First off backing up your data is a must.

Second off, MySQL has these conversions built in.

Code:
ALTER DATABASE database DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

Hope this helps Smile
#3

[eluser]Swedie[/eluser]
It's backed up. I'm also in the test environment.

Your above suggestion does not work, done that both ot the whole database but also the tables themselves.

The input into the database that was once made was with a page encoding of ISO-8859-1.

Changing just the database or table encoding only alters future inserts (as far as I know), not current data.
#4

[eluser]dudeami0[/eluser]
Hmm, could try exporting and importing it:

Code:
mysqldump --user=username --password=password --default-character-set=charset --compatible=mysql40 dbname > dump.sql

Code:
mysql --user=username --password=password --default-character-set=utf8 dbname < dump.sql

Changing the commands to meet your needs.
#5

[eluser]LuckyFella73[/eluser]
Hej Swedie,

Phil Sturgeon has posted a very nice tutorial how to set up
a clean utf-8 invironment or how to convert an existing
projekt into utf-8 encoding (database). Here is the link
to his blog:
http://philsturgeon.co.uk/news/2009/08/U...odeIgniter

Hope that helps
#6

[eluser]InsiteFX[/eluser]
You can alter the table all you want it will not change the database collation!

In PhpMyAdmin go to operations tab and chage the collation on the bottom dropdown.
It defaults to latin1_swedish_ci

InsiteFX
#7

[eluser]Swedie[/eluser]
Well sadly none of above worked. So I wrote a PHP function to loop through all of the tables and convert the data when needed using PHP's mb_convert and mb_encode functions.

I think it worked Smile

Here's the code in case anyone wants it. You can true or false to convert data or just the tables (and columns). I haven't added collate, as it self-adjust to the correct one (atleast for me).

Code:
function fixEncoding($convertData = TRUE, $convertTableAndColumns = TRUE) {
    $table_encoding = "utf8";
    $start_on = "";
    $end_after = "";
    $sql_tables = mysql_query("SHOW TABLES");
    while($table = mysql_fetch_assoc($sql_tables)) {
        if($start_on AND substr($table['Tables_in_smc_www2'],0,strlen($start_on)) < $start_on) continue;
        if($end_after AND substr($table['Tables_in_smc_www2'],0,strlen($end_after)) > $end_after) continue;
        if($convertTableAndColumns) mysql_query("ALTER TABLE ".$table['Tables_in_smc_www2']." CONVERT TO CHARACTER SET ".$table_encoding.";");
        if($convertData) {
            ob_start();
            print "Table: ".$table['Tables_in_smc_www2']." - ";
            $sql_total = mysql_query("SELECT * FROM ".$table['Tables_in_smc_www2']."");
            $total_records = mysql_num_rows($sql_total);
            print " Total records ".$total_records." - ";
            $per_run = 1000;
            $runs = @ceil($total_records/$per_run);
            if($runs <= 0) $runs = 1;
            print "Runs: ".$runs." - Completed: ";
            $done = 0;
            for($i=0; $i < $runs; $i++) {
                if($i > 0) $done += $per_run;
                if($per_run > $total_records) $per_run = $total_records;
                $sql_run = mysql_query("SELECT * FROM ".$table['Tables_in_smc_www2']." LIMIT ".$done.", ".$per_run."");
                $run_count = mysql_num_rows($sql_run);
                if($per_run > $run_count) $per_run = $run_count;
                print $done."-".($done+$per_run).", ";
                if($run_count > 0) {
                    while($row = mysql_fetch_assoc($sql_run)) {
                        unset($data);
                        unset($first);
                        unset($values);
                        foreach($row as $field => $value) {
                            if(!$first) {
                                $first['field'] = $field;
                                $first['id'] = $value;
                            }
                            if(!is_numeric($value)) {
                                $value = stripslashes(html_entity_decode($value));
                                $value = mb_convert_encoding($value, 'ISO-8859-1', mb_detect_encoding($value, 'UTF-8, ISO-8859-1', true)); //from something to iso, then over to utf-8 db tables
                                $data[$field] = $value;
                            }
                        }
                        if(is_array($data)) {
                            foreach($data as $key => $value) {
                                if($key != $first['field']) {
                                    $values .= "$key = '".addslashes($value)."', ";
                                }
                            }
                            $query = "UPDATE ".$table['Tables_in_smc_www2']." SET ".substr($values, 0, -2)." WHERE ".$first['field']." = ".$first['id']."";
                            mysql_query($query) or die("Table: ".$table['Tables_in_smc_www2'].", Fieldname: ".$first['id'].": ".mysql_error());
                        }
                    }
                }
            }
        }
        print "Done!<br>";
        sleep(1);
        ob_flush();
    }
}




Theme © iAndrew 2016 - Forum software by © MyBB