Normal Encoding vs CI Encoding = the same but different results
#1

[eluser]DarkDev[/eluser]
Hello everybody,
this is my first post on these forums, but I have been using CI for ages.
Sure I love it: it has made my working life easier and helped me out in a lot of cases.

I'm posting here because I've been struggling with the following application for 4 days with no success.

It's about UTF-8 encoding: in a normal PHP web page, everything works fine:

Normal PHP Page
Code:
<?php
    header('Content-type: text/html; charset=utf-8');
    ini_set('default_charset', 'utf-8');
    $txt = 'à ò ì ù';
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<!--
    HEAD
-->
<head>
<title>Title</title>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>

<!--
    BODY
-->
<body>
<?php
    var_dump( headers_list() );
    echo $txt;
?>
</body>

It echoes 'à ò ì ù': works great.

CI Page
Code:
class Home extends MY_Controller {

    function __construct()
    {
        parent::MY_Controller();
        header('Content-type: text/html; charset=utf-8');
        ini_set('default_charset', 'utf-8');
    }

    function index()
    {
        $data = array('txt' => 'à ò ì ù');
        $this->load->view( 'home', $data );
    }
}

---------------
Home
---------------
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<!--
    HEAD
-->
<head>
<title>Title</title>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>

<!--
    BODY
-->
<body>
<?php
    echo $txt;
?>
</body>

It echoes '� � � �' with wrong characters


Now, I'm really, really going to lose my mind. The charset config is set to 'UTF-8', so is the meta tag, and the code is the same in both cases... I don't know how to solve it.

It's not a server fault, because in the first case everything works fine; it only happens with CI, so I guess I'm doing something wrong.

Please, if you can, help me out!

Thanks so much in advance!
#2

[eluser]DarkDev[/eluser]
It's quite unbelievable: if I echo in the controller, it doesn't work, if I echo in the view, it works:

Controller
Code:
<?php
class Home extends MY_Controller {

    function __construct()
    {
        parent::MY_Controller();
    }
    
    function index() {
        
        $txt = "à ò ì ù";
        echo $txt;
        
    }
}

?>


View
Code:
$txt = "à ò ì ù";
echo $txt;

Is it possible that only my app gets this kind of strange error?
#3

[eluser]bl00dshooter[/eluser]
It's not normal behavior, but you are not supposed to echo anything in the controller; all output should be done in the view.
#4

[eluser]WanWizard[/eluser]
Works fine here in any controller I paste the echo in.
Just tried it on a new install of CI 1.7.2 and CI 2.0, still no problem.

Since you are typing in those characters, are you sure the encoding of your file is ok? Check your editor to see if your controller class file is saved in utf-8.
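
If you would rather check from PHP than from the editor, something along these lines should do. It is only a sketch: is_utf8_file() is a made-up helper name (not part of CI), and it assumes the mbstring extension is loaded. The demo writes two temporary files with raw bytes so the result does not depend on how the script itself was saved.

```php
<?php
// Returns true when the file's raw bytes form valid UTF-8.
// is_utf8_file() is a hypothetical helper, not a CodeIgniter function.
function is_utf8_file($path)
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        return false; // unreadable file: treat as "not confirmed UTF-8"
    }
    return mb_check_encoding($bytes, 'UTF-8');
}

// Two temporary files holding the same text in different encodings.
$utf8File = tempnam(sys_get_temp_dir(), 'enc');
$ansiFile = tempnam(sys_get_temp_dir(), 'enc');
file_put_contents($utf8File, "\xC3\xA0 \xC3\xB2"); // "à ò" saved as UTF-8
file_put_contents($ansiFile, "\xE0 \xF2");         // "à ò" saved as ISO-8859-1

var_dump(is_utf8_file($utf8File)); // bool(true)
var_dump(is_utf8_file($ansiFile)); // bool(false)

unlink($utf8File);
unlink($ansiFile);
```

In a real project you would pass the path of the controller or view file instead of the temporary files. Note that a file containing only plain ASCII passes the check too, since ASCII is valid UTF-8.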
#5

[eluser]cahva[/eluser]
[quote author="bl00dshooter" date="1289365505"]It's not normal behavior, but you are not supposed to echo anything in the controller; all output should be done in the view.[/quote]

That would be the normal way, yes, but it doesn't mean ALL output should be done in a view, for example when printing JSON, images etc. :) And he passed the $data to the view in his first post, so that was probably just an example.

Anyway, are you sure that the controller file is encoded in UTF-8? If you copied some original controller (e.g. Welcome) and modified that, it is probably still ANSI. I don't know what platform you are using, but at least with Notepad++ on Windows you can see which encoding is used (and you can convert the file from ANSI to UTF-8 or vice versa).
#6

[eluser]smilie[/eluser]
Could it be that your editor has different page encoding for those two files (controller and view)?

If you have shell access, try to find out encoding of those two files.

Regards,
Smilie
#7

[eluser]DarkDev[/eluser]
@cahva, WanWizard, smilie: thank you so much guys, I switched the file encoding to UTF-8 and it works fine :)

I absolutely didn't know that the file encoding would affect the final output; I simply thought that the browser would have the final word. My fault.
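
For anyone landing here with the same problem, this is roughly what was going on, as far as I understand it. A small sketch (the variable names are mine, and it assumes mbstring is available): PHP echoes string literals byte for byte as they are stored in the file, so the same 'à' is a different byte sequence depending on how the file was saved.

```php
<?php
// The same character "à" as raw bytes in two encodings.
$latin1 = "\xE0";      // ISO-8859-1 / Windows-1252 ("ANSI"): one byte
$utf8   = "\xC3\xA0";  // UTF-8: two bytes

echo bin2hex($latin1), "\n"; // e0
echo bin2hex($utf8), "\n";   // c3a0

// A lone 0xE0 is not a valid UTF-8 sequence, so a browser that was told
// "charset=utf-8" has nothing sensible to draw for it.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
```

So the header and the meta tag were right all along; it was the bytes in the controller file that did not match them.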

Are there any articles/posts where I can find out more about encoding in the web environment?

And last but not least: which editor do you suggest for working with CI?
I'm currently using DW, but I'm thinking of switching to a new one: does PHP Designer work fine?

Thank you again guys!

Regards
#8

[eluser]smilie[/eluser]
Well, if your editor saves files as e.g. UTF-8 but your text is in ISO-xxxx, then the two would be in conflict :-)

There are in total 4 points where page encoding can be set / changed:

1. file itself;
2. apache config;
3. PHP config;
4. browser itself.

All of these must 'match' to show good output. And of course your own text must match all of these criteria. It is a real bugger to get it all working well, I know from personal experience.
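
To give an idea of how the points above can be kept in sync from the PHP side, here is a rough sketch. content_type_header() is a hypothetical helper of mine, and the comments about Apache and CI only name where the matching setting usually lives; adapt it to your own setup.

```php
<?php
// Hypothetical helper: build the header from one central charset value.
function content_type_header($charset)
{
    return 'Content-Type: text/html; charset=' . $charset;
}

$charset = 'UTF-8';

// 1. File itself: set in the editor ("save as UTF-8"), not from PHP.
// 2. Apache config: e.g. the AddDefaultCharset directive.
// 3. PHP config: default_charset, here set at runtime.
ini_set('default_charset', $charset);

// 4. Browser: it follows the Content-Type header (and the meta tag).
if (!headers_sent()) {
    header(content_type_header($charset));
}

echo content_type_header($charset), "\n"; // Content-Type: text/html; charset=UTF-8
echo ini_get('default_charset'), "\n";    // UTF-8
```

In CI the charset normally comes from $config['charset'] in the application config, so it is probably cleaner to read it from there than to hard-code it as done here.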

There are tons of articles which cover this matter. Just Google for the keywords utf-8, website, encoding and you will get enough results to keep you occupied for weeks :) I do not know of any specific website that explains this in a proper manner.

Regarding the choice of editor, it is a personal choice. All editors have more or less the same capabilities. With regard to page encoding, simply choose the encoding you need from File > Properties when editing a file. Almost all editors have this option.

I personally use Aptana (on Linux), as I find it is not too hungry for resources and has everything I need (from syntax highlighting for HTML / PHP / JS / CSS to the Subversion tools, which I use extensively).

Cheers,
Smilie
#9

[eluser]DarkDev[/eluser]
Thank you so much smilie!

Only a final note on what you wrote:
"if your editor saves files as e.g. UTF-8 but your text is in ISO-xxxx, then the two would be in conflict"

If I have understood correctly, my scenario was different:
I had text in ISO (or not?) and I saved the file as ANSI --> didn't work
I switched the file encoding to UTF-8 --> it works

So all I need to do is always change the file encoding, but how do I find out "my text encoding", as you wrote?
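
In the meantime I put together a small sketch for this, in case it helps somebody. It is not a real detector: to_utf8() is a name I made up, it needs mbstring, and it simply assumes that anything which is not valid UTF-8 is ISO-8859-1, which happens to be true for my files but is only a guess in general.

```php
<?php
// Hypothetical helper: return $bytes as UTF-8.
// Assumption: input that is not valid UTF-8 is in $fallback (ISO-8859-1
// by default). That is a guess, not a detection.
function to_utf8($bytes, $fallback = 'ISO-8859-1')
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already UTF-8 (or plain ASCII), leave untouched
    }
    return mb_convert_encoding($bytes, 'UTF-8', $fallback);
}

$fromAnsiFile = "\xE0 \xF2 \xEC \xF9";                 // "à ò ì ù" from an ANSI file
$fromUtf8File = "\xC3\xA0 \xC3\xB2 \xC3\xAC \xC3\xB9"; // the same text as UTF-8

var_dump(to_utf8($fromAnsiFile) === $fromUtf8File); // bool(true)
var_dump(to_utf8($fromUtf8File) === $fromUtf8File); // bool(true)
```

Checking for valid UTF-8 first matters, because converting text that is already UTF-8 a second time would garble it.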

Really, thank you so much guys.
Regards
#10

[eluser]smilie[/eluser]
DarkDev:
Yes, in your situation it was the other way around from what I wrote, but it's the same principle: the bytes in your file were in a different encoding than the one the page announces.

Regarding character encoding, take a look here:
http://en.wikipedia.org/wiki/Character_encoding

For example, if your text has characters from, let's say, the German alphabet, then a single-byte encoding such as ISO 8859-16 can hold them.
Because ISO 8859-16 has the following:
http://en.wikipedia.org/wiki/ISO/IEC_8859-16 (see Codepage Layout). You will see that besides the standard English alphabet it contains special characters as well.

Compare it to UTF-8 (http://en.wikipedia.org/wiki/UTF-8#Codepage_layout): its single-byte range holds only the 'standard' (ASCII) ones, and every other character, including the German and Italian accented letters, is encoded as two or more bytes. UTF-8 can represent all of them, so it is the safe default; just make sure the file is really saved in the encoding the page declares :)

Cheers,
Smilie



