Multibyte characters validation problem

#1
[eluser]fensen[/eluser]
Hi,

I'm developing a website for Japan.

I'm having difficulties with the Validation Class handling multibyte characters. I have a 100-character text field:

Code:
$rules['text'] = "max_length[100]";

When I input single byte text (English alphabet characters) the 100 limit is correctly validated.

But when I input Japanese text, I can only input 33 characters. If I input 34 characters I get the error message saying that the limit is 100 chars.

Sometimes I need to input English and Japanese text together.

Any ideas?

Thanks in advance

#2
[eluser]nmweb[/eluser]
CI has no UTF-8 support for validation, I believe. I think rewriting the rules of the Validation library is the only solution.
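The 33-character limit reported above is what you would expect if the rule counts bytes with strlen() and the Japanese text is UTF-8 encoded at 3 bytes per character: 33 characters are 99 bytes, 34 are 102. A minimal sketch of that arithmetic, assuming the script itself is saved as UTF-8 (the sample string and variable names are made up for illustration):

```php
<?php
// Ten Japanese characters; in UTF-8 each of these takes 3 bytes.
$text = 'こんにちは世界の皆様';

// strlen() counts bytes, not characters.
$bytes = strlen($text);
$bytesPerChar = strlen('あ');

// Largest number of 3-byte characters that fits under a 100-byte limit.
$maxChars = (int) floor(100 / $bytesPerChar);

echo "bytes in 10 characters: $bytes\n";
echo "max characters under max_length[100]: $maxChars\n";

// mb_strlen() counts characters, if the mbstring extension is loaded.
if (function_exists('mb_strlen')) {
    echo 'characters: ' . mb_strlen($text, 'UTF-8') . "\n";
}
```

Mixed English and Japanese input lands somewhere in between, since ASCII characters still take one byte each.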

#3
[eluser]fensen[/eluser]
Very sad to hear that... I'm not sure I want to touch CI internal libraries...

Thanks for your reply.

#4
[eluser]xwero[/eluser]
There are multibyte string functions (mbstring), but they are slower than the regular functions, and they are not part of the default PHP installation: PHP has to be built with the mbstring extension enabled.

If you are on a server where the extension is enabled (check php.ini), you can extend the validation rules with the multibyte functions.
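Extending the rules could look roughly like the sketch below. It assumes the CI 1.x convention of subclassing a core library with the default MY_ prefix and that the stock length rules take ($str, $val); the two protected helper methods are hypothetical names, and the stub class is only there so the example runs outside CodeIgniter.

```php
<?php
// Stand-in so the sketch runs outside CodeIgniter. Inside a CI 1.x app the
// real CI_Validation class is already loaded and this stub is skipped.
if (!class_exists('CI_Validation')) {
    class CI_Validation {}
}

// Assumed file: application/libraries/MY_Validation.php
class MY_Validation extends CI_Validation
{
    // Character count of a UTF-8 string; requires the mbstring extension.
    protected function char_length($str)
    {
        return mb_strlen($str, 'UTF-8');
    }

    // TRUE when $val is a plain non-negative integer such as "100".
    protected function is_length_param($val)
    {
        return preg_match('/^[0-9]+$/', (string) $val) === 1;
    }

    public function min_length($str, $val)
    {
        return $this->is_length_param($val) && $this->char_length($str) >= (int) $val;
    }

    public function max_length($str, $val)
    {
        return $this->is_length_param($val) && $this->char_length($str) <= (int) $val;
    }

    public function exact_length($str, $val)
    {
        return $this->is_length_param($val) && $this->char_length($str) === (int) $val;
    }
}
```

With something like this in place, a rule such as max_length[100] would count 100 Japanese characters as 100 rather than 300, without touching the files in the system folder.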

#5
[eluser]esra[/eluser]
[quote author="xwero" date="1205165929"]There are multibyte string functions (mbstring), but they are slower than the regular functions, and they are not part of the default PHP installation: PHP has to be built with the mbstring extension enabled.

If you are on a server where the extension is enabled (check php.ini), you can extend the validation rules with the multibyte functions.[/quote]

There are also Harry Fuecks' php-utf8 extended string functions on SourceForge, which could be used as helpers. I believe that the Akelos and Kohana frameworks check whether the PHP mbstring extension is loaded and use the multibyte string functions if it is. If it is not loaded, they use the php-utf8 string functions.

It's probably possible to reimplement Harry Fuecks' extended string functions as a CI class library, but that library would probably need to be loaded as a core library to make it available to the Validation library and other libraries. I was hoping to see EllisLab tackle this on their own at some point. Maybe something like this will happen now that EE 2.0 is based on CodeIgniter, since better support for UTF-8 should give EE a larger share of the global market.

#6
[eluser]Atasa[/eluser]
Unfortunately, there is no other way for us foreigners except to alter the Validation library.
As far as I can see, the CI team is not putting much energy into these issues.
So all you have to do in the min_length rule is, instead of
Code:
return (strlen($str) < $val) ? FALSE : TRUE;
to say
Code:
return (mb_strlen($str, 'UTF-8') < $val) ? FALSE : TRUE;

The same goes for max_length and exact_length.
If mb_strlen() is not available, just use
Code:
strlen(utf8_decode($str))
That should work too.
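If you would rather not hard-code one or the other, a small helper can choose at runtime. This is only a sketch: utf8_length() and utf8_length_fallback() are made-up names, and the fallback counts every byte that is not a UTF-8 continuation byte instead of calling utf8_decode(), which is deprecated as of PHP 8.2. For valid UTF-8 input that count is the number of characters.

```php
<?php
// Hypothetical fallback for servers without mbstring: drop the UTF-8
// continuation bytes (0x80-0xBF) and count what is left, which is one
// byte per character. Assumes $str is valid UTF-8.
function utf8_length_fallback($str)
{
    return strlen(preg_replace('/[\x80-\xBF]/', '', $str));
}

// Hypothetical helper: prefer mb_strlen() when the extension is loaded.
function utf8_length($str)
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }
    return utf8_length_fallback($str);
}

// Mixed English and Japanese input.
$mixed = 'CodeIgniter 日本語';
echo 'bytes: ' . strlen($mixed) . "\n";
echo 'characters: ' . utf8_length($mixed) . "\n";
```

The length rules can then compare utf8_length($str) against $val instead of strlen($str).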

#7
[eluser]obay[/eluser]
In CI 2.1.0 this seems to be supported already. It first checks function_exists('mb_strlen') and uses mb_strlen() if it is available. Otherwise, it falls back to strlen().

