utf-8 character count help
#1

[eluser]Bahodir[/eluser]
Hi,

Is there a built-in function for counting UTF-8 characters?
I want to limit long lines to a specific length using character_limiter(), but PHP is not giving me the proper length of the string.
For example, the following code should output 'при...', but it's echoing 'привет...'
Code:
<?php
$str = "привет";
echo character_limiter($str, 3);

Any help would be awesome.
#2

[eluser]pistolPete[/eluser]
Try the mb_strlen function:

http://www.php.net/mb_strlen
#3

[eluser]Bahodir[/eluser]
I tried. It doesn't work.
#4

[eluser]pistolPete[/eluser]
What's your php internal encoding setting?
Try:
Code:
mb_strlen($string,'UTF-8');

If that doesn't work, try this:
Code:
$strlen = preg_match_all("/.{1}/us",$utf8string,$dummy);
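
To see why strlen() and the two suggestions above disagree, here is a minimal sketch comparing them. It assumes the mbstring extension is loaded and that the source file is saved as UTF-8; the variable names are only for illustration.

```php
<?php
// Compare byte count and character count for a UTF-8 string.
$str = "привет"; // 6 Cyrillic characters, each stored as 2 bytes in UTF-8

$bytes      = strlen($str);                          // counts bytes
$chars_mb   = mb_strlen($str, 'UTF-8');              // counts characters
$chars_pcre = preg_match_all('/./us', $str, $dummy); // one match per character

echo $bytes, ' ', $chars_mb, ' ', $chars_pcre, "\n"; // prints "12 6 6"
```

The byte count is what the stock helper works with, which is why its limit is reached later than expected on multibyte text.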
#5

[eluser]Bahodir[/eluser]
pistolPete,
thank you for your help.
This code you gave me correctly counts the number of characters.
Code:
mb_strlen($string,'UTF-8');

But how do I trim my string using character_limiter()?
I tried this:
Code:
$str = "привет";
echo character_limiter(utf8_decode($str), 3);

I think it is giving me the correct length, but the decoded characters show up as ????...

Now how can I show the correct characters?
#6

[eluser]Bahodir[/eluser]
[quote author="pistolPete" date="1235886037"]What's your php internal encoding setting?
[/quote]

Oh, I think it is utf-8
#7

[eluser]pistolPete[/eluser]
[quote author="Bahodir" date="1235891451"]Oh, I think it is utf-8[/quote]

You can check it using this function:
Code:
/* Display current internal character encoding */
echo mb_internal_encoding();

I modified the helper to work with utf8 strings:
Code:
function character_limiter($str, $n = 500, $end_char = '…')
{
        // set encoding to UTF-8
        mb_internal_encoding('UTF-8');
        mb_regex_encoding('UTF-8');
        
        if (mb_strlen($str) < $n)
        {
            return $str;
        }
        
        $str = preg_replace("/\s+/", ' ', str_replace(array("\r\n", "\r", "\n"), ' ', $str));
        if (mb_strlen($str) <= $n)
        {
            return $str;
        }

        $out = '';
        $split_str = mb_split(' ',trim($str));
        
        foreach ($split_str as $val)
        {
            $out .= $val.' ';
            
            if (mb_strlen($out) >= $n)
            {
                $out = trim($out);
                return (mb_strlen($out) == mb_strlen($str)) ? $out : $out.$end_char;
            }        
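        }

        // Fallback: the limit was never reached inside the loop
        // (e.g. only surrounding whitespace pushed the length past $n)
        return trim($out);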
}

Have a look at "Extending" Helpers.
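
Note that character_limiter() keeps whole words, so a single word like "привет" comes back untruncated even with a correct character count. If a hard cut after exactly $n characters is what you want, something like the sketch below might do. The function name utf8_hard_limit is hypothetical (it is not part of CodeIgniter), and the code assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper (not part of CodeIgniter): cut a UTF-8 string after
// $n characters, ignoring word boundaries.
function utf8_hard_limit($str, $n = 500, $end_char = '...')
{
    // Nothing to cut if the string already fits
    if (mb_strlen($str, 'UTF-8') <= $n) {
        return $str;
    }

    // mb_substr() counts characters, so multibyte sequences stay intact
    return mb_substr($str, 0, $n, 'UTF-8') . $end_char;
}

echo utf8_hard_limit("привет", 3), "\n"; // prints "при..."
```

Passing 'UTF-8' explicitly to each mb_* call avoids depending on the internal encoding setting.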
#8

[eluser]Bahodir[/eluser]
Thank you once more.

I haven't checked it yet, but I hope it works.