Watermarks (image_lib) for non-latin or special characters
#1

[eluser]popovich[/eluser]
Hi,

Thank you, CodeIgniter, for existing.

Now, to business.
I am having problems generating watermarks that contain special characters, such as the em dash, and Cyrillic characters. I already use different fonts for Latin and non-Latin languages, but that does not solve the problem. I have also tried the approaches at http://de2.php.net/imagettftext , but the results showed no improvement over the CI methods.

Any thoughts are greatly appreciated.
#2

[eluser]popovich[/eluser]
OK, if Cyrillic is a hard nut to crack, what about special characters?
Here are two functions I have found that are meant to convert the characters. Neither works, though.
Oh, all my PHP files are saved in UTF-8 encoding, and the collation of the DB fields (as well as of the table itself) is utf8_unicode_ci.

Please?

Function one.
Code:
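# Source: http://bugs.typo3.org/view.php?id=5078
function ricEncode($content) {
  // convmap is (start, end, offset, mask): only U+0080..U+00FF is converted,
  // so an em dash (U+2014) or Cyrillic letters pass through unchanged
  $convmap = array(0x80, 0xff, 0, 0xff);
  return mb_encode_numericentity($content, $convmap, "utf-8");
}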

Function two:
Code:
function foxy_utf8_to_nce($utf = '') {
  if ($utf === '') return $utf;

  $max_count = 5; // flag bits in $max_mark (1111 1000 == 5 times 1)
  $max_mark = 248; // marker for a (theoretical ;-)) 5-byte char and mask for a 4-byte char

  $html = '';
  for ($str_pos = 0; $str_pos < strlen($utf); $str_pos++) {
    // square brackets: the $utf{...} offset syntax was removed in PHP 8
    $old_val = ord($utf[$str_pos]);
    $new_val = 0;

    $utf8_marker = 0;

    // only bytes above 127 can be a UTF-8 lead byte
    if ($old_val > 127) {
      $mark = $max_mark;
      for ($byte_ctr = $max_count; $byte_ctr > 2; $byte_ctr--) {
        // is the current byte a UTF-8 lead byte?
        if (($old_val & $mark) == (($mark << 1) & 255)) {
          $utf8_marker = $byte_ctr - 1;
          break;
        }
        $mark = ($mark << 1) & 255;
      }
    }

    // marker found: collect following bytes
    if ($utf8_marker > 1 and isset($utf[$str_pos + 1])) {
      $str_off = 0;
      $new_val = $old_val & (127 >> $utf8_marker);
      for ($byte_ctr = $utf8_marker; $byte_ctr > 1; $byte_ctr--) {
        $next = $str_pos + $str_off + 1;
        // check that the next byte is a UTF-8 continuation byte (10xx xxxx)
        if (isset($utf[$next]) and (ord($utf[$next]) & 192) == 128) {
          $str_off++;
          // no need for addition, bitwise OR is sufficient
          // 63 == 0011 1111: the payload bits of a continuation byte
          $new_val = ($new_val << 6) | (ord($utf[$next]) & 63);
        }
        // not valid UTF-8, but ord() > 127:
        // nevertheless convert the first char to an NCE
        else {
          $new_val = $old_val;
          break;
        }
      }
      // build the NCE code
      $html .= '&#'.$new_val.';';
      // skip the additional UTF-8 bytes
      $str_pos = $str_pos + $str_off;
    }
    else {
      $html .= chr($old_val);
    }
  }
  return $html;
}
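
For comparison, a minimal sketch (not from the bug report or from this thread): assuming the mbstring extension is available, widening the conversion map of the first function to the whole Unicode range should yield the same kind of decimal numeric character entities as the hand-written loop. The name utf8_to_nce_sketch is made up for illustration.

```php
<?php
// Hypothetical helper: encode every non-ASCII character of a UTF-8 string
// as a decimal numeric character entity (for example "&#8212;").
function utf8_to_nce_sketch(string $utf): string
{
    // convmap is (start, end, offset, mask); this one covers U+0080..U+10FFFF
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf, $convmap, 'UTF-8');
}

echo utf8_to_nce_sketch("a\u{2014}b"), "\n";
```

ASCII characters are left alone, so plain Latin text passes through unchanged.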
#3

[eluser]popovich[/eluser]
Quick update.
Initially, I was using an OpenType font (UniversStd.otf), which was rendering the em dash as a crossed square. After trying the omnipresent Arial in TrueType format (arial.ttf) and adding
Code:
html_entity_decode($text);
the Latin text seems to render fine now, including special characters, umlauts and accents. Cool.

The only thing left is Cyrillic...

PS: Is it only me, or does the solution crystallize out of thin air right after posting to a forum, before anyone has replied? :)
#4

[eluser]xwero[/eluser]
If you find the solution and post it, others will have less trouble ;)
#5

[eluser]popovich[/eluser]
The solution:
- encode everything as UTF-8 (PHP files, database tables and fields);
- use a TrueType (.ttf) font; although the FreeType documentation states that Type 1 and OpenType fonts can be used, the Cyrillic characters were not found in the font table with those formats, even though Latin characters worked fine;
- apply html_entity_decode() to be able to display special characters;
- triple-check what you are testing, from the strings you are trying to display to, well, the MVC setup (my case).
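
The steps above could be sketched roughly like this, assuming PHP with the GD extension built with FreeType support. The function names, the canvas size and the font path are placeholders for illustration and are not part of CodeIgniter's image_lib.

```php
<?php
// Hypothetical helper: turn entities such as &mdash; or &#1055; into UTF-8.
function decode_watermark_text(string $text): string
{
    return html_entity_decode($text, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: draw UTF-8 text on a blank canvas with a TrueType font.
// Returns the GD image, or null when GD/FreeType or the font file is missing.
function draw_utf8_watermark(string $text, string $fontPath, int $width = 400, int $height = 60)
{
    if (!function_exists('imagettftext') || !is_file($fontPath)) {
        return null;
    }
    $img = imagecreatetruecolor($width, $height);
    $white = imagecolorallocate($img, 255, 255, 255);
    $black = imagecolorallocate($img, 0, 0, 0);
    imagefilledrectangle($img, 0, 0, $width - 1, $height - 1, $white);

    // size 18, angle 0, baseline starting at (10, 40)
    imagettftext($img, 18, 0, 10, 40, $black, $fontPath, decode_watermark_text($text));
    return $img;
}
```

Something like draw_utf8_watermark('Привет &mdash; мир', '/path/to/arial.ttf') would then return an image, provided the font at that (assumed) path actually contains Cyrillic glyphs.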

Cheers.



