Thread: CI 2.0 and UTF-8 strings (CodeIgniter Forums, Archived Development & Programming, /showthread.php?tid=39646)
CI 2.0 and UTF-8 strings - El Forum - 03-16-2011

[eluser]Unknown[/eluser]
Hi,

I ran into a problem validating text fields in a form: the values just disappeared. The log message was:

Quote:
ERROR - 2011-03-16 22:03:48 --> Severity: Notice --> iconv() [...] : Wrong charset, conversion from `UTF-8' to `UTF-8//IGNORE' is not allowed <path>/system/core/Utf8.php 89

So I checked Utf8.php and found this:

Code:
function clean_string($str)

CI 2.0 and UTF-8 strings - El Forum - 04-07-2011

[eluser]Arjen van Bochoven[/eluser]
I had the same problem and narrowed it down to iconv not working correctly. Passing a correct UTF-8 string into

Code:
iconv('UTF-8', 'UTF-8//IGNORE', $str)

I use MAMP 1.9.5 and have filed a bug report. I can confirm the issue is not present in MAMP 1.7.1.

The correct way to work around this is to extend the core class with your own, as described in "Creating Core System Classes":

* Create a new file: application/core/MY_Utf8.php
* Copy/paste the code below

Code:
<?php if ( ! defined('BASEPATH')) exit('No direct script access allowed');

CI 2.0 and UTF-8 strings - El Forum - 04-07-2011

[eluser]Unknown[/eluser]
I used mb_convert_encoding instead of iconv. It works perfectly.

CI 2.0 and UTF-8 strings - El Forum - 04-07-2011

[eluser]InsiteFX[/eluser]
This is from PHP.net:

To strip bogus characters from your input (such as data from an unsanitized or other source which you can't trust to necessarily give you strings encoded according to their advertised encoding set), use the same character set as both the input and the output, with //IGNORE on the output character set.

Code:
<?php

The result of the example does not give you back the dagger character which was the original input (it got lost when htmlentities was misused to encode it incorrectly, though this is common from people not accustomed to dealing with extended character sets), but it does at least give you data which is sane in your target character set.

InsiteFX
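
For anyone landing on this thread later: the mb_convert_encoding approach mentioned above could look roughly like the sketch below. This is not the code posted in the thread (those snippets are cut off in the archive). The function name `clean_utf8_string` is hypothetical, and the sketch assumes the mbstring extension is loaded. It sets the mbstring substitute character to "none" for the duration of the call so that invalid byte sequences are dropped instead of being replaced with "?", which is similar in spirit to what `//IGNORE` is meant to do in iconv.

```php
<?php
// Hypothetical standalone helper; not part of CodeIgniter itself.
// Assumes the mbstring extension is available.
function clean_utf8_string($str)
{
    // Pure 7-bit ASCII is always valid UTF-8, so there is nothing to clean.
    if (preg_match('/[^\x00-\x7F]/S', $str) === 0) {
        return $str;
    }

    // Without mbstring we cannot clean the string; return it untouched
    // rather than risk emptying it.
    if (!function_exists('mb_convert_encoding')) {
        return $str;
    }

    // Remember the current substitute character so it can be restored.
    $previous = mb_substitute_character();

    // "none" makes mbstring drop invalid sequences instead of inserting "?".
    mb_substitute_character('none');

    // Converting UTF-8 to UTF-8 re-encodes the valid characters only.
    $clean = mb_convert_encoding($str, 'UTF-8', 'UTF-8');

    // Restore the previous global setting.
    mb_substitute_character($previous);

    return $clean;
}
```

In a CodeIgniter 2.0 application, the same body would presumably go into a `clean_string($str)` override inside a `MY_Utf8` class in application/core/MY_Utf8.php, following the steps listed earlier in the thread. That placement is an assumption based on those steps, not something tested against MAMP 1.9.5.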