[eluser]tinawina[/eluser]
Hi Coccodrillo - thanks for responding again. I see your point and changed my code so that the form input echoes to the screen with the appropriate character set info. (I'm only echoing from the controller because I'm testing, never on an actual production site -- no kittens are harmed during testing!) I have a breakdown of how this panned out in my testing below. First, the code change:
Code:
echo
'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
' . $this->input->post('title') . '
</body>
</html>';
I am checking my output in Chrome, Firefox, and IE, on Windows XP and Ubuntu Linux. In the results below, "ok" means the input echoed to the screen with diacritics in place; "no" means I got something garbled. I don't have IE on my Linux machine, so there is no testing for that combination.
Code:
         | With proper HTML  | Without proper HTML
         | Windows | Linux   | Windows | Linux
---------|---------|---------|---------|--------
Chrome   | ok      | ok      | no      | no
Firefox  | ok      | ok      | ok      | ok
IE       | ok      | --      | no      | --
So this change does what I want -- input goes in with diacritics and comes out with diacritics intact. Perfect.
HOWEVER: When I try to do something with this input other than echo it to the screen, I'm back to square one.
I just tried to translate the diacritics in my string to HTML entities, and then echo the result to the screen. Here's the code:
Code:
$search = explode(",","À,È,Ì,Ò,Ù,à,è,ì,ò,ù,Á,É,Í,Ó,Ú,Ý,á,é,í,ó,ú,ý,Â,Ê,Î,Ô,Û,â,ê,î,ô,û,Ã,Ñ,Õ,ã,ñ,õ,Ä,Ë,Ï,Ö,Ü,Ÿ,ä,ë,ï,ö,ü,ÿ");
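$replace = explode(",","&Agrave;,&Egrave;,&Igrave;,&Ograve;,&Ugrave;,&agrave;,&egrave;,&igrave;,&ograve;,&ugrave;,&Aacute;,&Eacute;,&Iacute;,&Oacute;,&Uacute;,&Yacute;,&aacute;,&eacute;,&iacute;,&oacute;,&uacute;,&yacute;,&Acirc;,&Ecirc;,&Icirc;,&Ocirc;,&Ucirc;,&acirc;,&ecirc;,&icirc;,&ocirc;,&ucirc;,&Atilde;,&Ntilde;,&Otilde;,&atilde;,&ntilde;,&otilde;,&Auml;,&Euml;,&Iuml;,&Ouml;,&Uuml;,&Yuml;,&auml;,&euml;,&iuml;,&ouml;,&uuml;,&yuml;");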
$new_input = str_replace($search, $replace, $this->input->post('title'));
echo '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
' . $new_input . '
</body>
</html>';
I entered "años son sobresalientes sólo existía un puñado" into the form and this is what I see on my screen:
Code:
a�os son sobresalientes s�lo exist�a un pu�ado
If I simply echo $new_input to the screen without any HTML directives I get the same as I would with directives in place -- garbled text.
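One way to narrow down garbling like this is to look at the raw bytes PHP actually received, since str_replace compares bytes, not characters. The sketch below is only an illustration: describe_bytes is a hypothetical helper (not part of CodeIgniter), it assumes the mbstring extension is loaded, and the sample strings are hand-written byte sequences standing in for real form data.

```php
<?php
// Hypothetical helper: reports the raw bytes of a string and whether
// those bytes form valid UTF-8. Assumes the mbstring extension is loaded.
function describe_bytes($s)
{
    return array(
        'hex'        => bin2hex($s),
        'valid_utf8' => mb_check_encoding($s, 'UTF-8'),
    );
}

// "años" written with explicit byte escapes, so the result does not
// depend on the encoding this source file happens to be saved in.
$utf8   = describe_bytes("a\xC3\xB1os"); // ñ as two bytes (UTF-8)
$latin1 = describe_bytes("a\xF1os");     // ñ as one byte (ISO-8859-1)

echo $utf8['hex'], ' ', ($utf8['valid_utf8'] ? 'valid' : 'invalid'), "\n";
echo $latin1['hex'], ' ', ($latin1['valid_utf8'] ? 'valid' : 'invalid'), "\n";
// prints:
// 61c3b16f73 valid
// 61f16f73 invalid
```

In the controller, the same helper could be pointed at $this->input->post('title') and at one entry of $search. If the two do not share the same byte sequence for "ñ", the search list and the form input are in different encodings and the replacement cannot match.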
When all of this is said and done, what I need to be able to do is 1) accept a text string that might include diacritics, 2) translate any diacritics in the string into HTML entities, and 3) store the string in my database. I don't get why echoing to the screen gives me the right output, but the same form input can't be manipulated.
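Steps 1 and 2 can be sketched without a hand-built search list by using PHP's built-in htmlentities with an explicit charset. This is a minimal sketch under stated assumptions, not a drop-in fix: title_to_entities is a hypothetical helper name, the mbstring extension is assumed to be available, and the fallback treats any input that is not valid UTF-8 as ISO-8859-1, which may not hold for every browser or form. Step 3 (the database insert) is left out.

```php
<?php
// Hypothetical helper (not part of CodeIgniter): converts accented
// characters in a submitted string into named HTML entities.
function title_to_entities($raw)
{
    // Assumption: input that is not valid UTF-8 arrived as ISO-8859-1.
    // Requires the mbstring extension.
    if (!mb_check_encoding($raw, 'UTF-8')) {
        $raw = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');
    }

    // The third argument names the charset explicitly; before PHP 5.4
    // htmlentities() defaulted to ISO-8859-1. ENT_QUOTES also converts
    // single and double quotes, and <, > and & are converted as well.
    return htmlentities($raw, ENT_QUOTES, 'UTF-8');
}

// Sample input written with byte escapes so the result does not depend
// on the encoding this source file is saved in.
echo title_to_entities("a\xC3\xB1os son sobresalientes s\xC3\xB3lo"), "\n";
// prints: a&ntilde;os son sobresalientes s&oacute;lo
```

In the controller this would be called as title_to_entities($this->input->post('title')). Because htmlentities also escapes markup characters, it is only appropriate if the title is meant to be stored as plain text rather than HTML.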
Any other ideas you might have are appreciated! Thanks for reading all of this!