Text input not being processed properly!

#1
[eluser]TheFuteballer[/eluser]
I'm having trouble getting basic text area handling to work. I have a textarea in my application where users can paste content and submit it to a database. When I output what is stored in the database, I keep getting weird characters, as it seems CI isn't processing these characters properly.

The characters in question are 'smart' quotes, whose HTML entities are &rsquo; and &ldquo;. When these characters are submitted, I have not found a single PHP function that converts the posted input into the correct HTML entities (above) so that it displays properly in a browser.

Just as a note, I have XSS filtering on by default, but I doubt that would be causing any issues.

Any help would be greatly appreciated!

#2
[eluser]Tominator[/eluser]
Please post your code. I think the documentation says that XSS filtering (or some kind of security) is applied by default in Active Record.

#3
[eluser]TheFuteballer[/eluser]
I have a feeling it's a charset issue that's not allowing this to show properly. As an example, this block of code here: bit.ly/bSqP2l is one of many causing issues and not converting quotes properly.


My code, however, is pretty simple. In my controller I'm just pulling the textarea content
Code:
$this->input->post('sampleTextArea');

and placing it in my database.

Now I've tried implementing the following functions to replace these odd characters and none of them work:

Solution 1:
Code:
function _convert_smart_quotes($string)
{
    $search = array(chr(145), chr(8217), chr(147), chr(148), chr(151));
    // OR   $search = array('‘','’','“','”');
    $replace = array('&lsquo;', '&rsquo;', '&ldquo;', '&rdquo;', '-');
    return str_replace($search, $replace, $string);
}


Solution 2:
Code:
function normalize_special_characters( $str )
    {
        # Quotes cleanup
        $str = str_replace( chr(ord("`")), "'", $str );        # `
        $str = str_replace( chr(8217), "'", $str );        # ´
        $str = str_replace( chr(ord("„")), ",", $str );        # „
        $str = str_replace( chr(ord("`")), "'", $str );        # `
        $str = str_replace( chr(ord("´")), "'", $str );        # ´
        $str = str_replace( chr(ord("“")), "\"", $str );        # “
        $str = str_replace( chr(ord("”")), "\"", $str );        # ”
        $str = str_replace( chr(ord("´")), "'", $str );        # ´
    
        $unwanted_array = array(    'Š'=>'S', 'š'=>'s', 'Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A', 'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
                                    'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I', 'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U',
                                    'Ú'=>'U', 'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss', 'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 'ç'=>'c',
                                    'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o',
                                    'ö'=>'o', 'ø'=>'o', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ý'=>'y', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y' );
        $str = strtr( $str, $unwanted_array );
    
        # Bullets, dashes, and trademarks
        $str = str_replace( chr(149), "&bull;", $str );    # bullet
        $str = str_replace( chr(150), "&ndash;", $str );   # en dash
        $str = str_replace( chr(151), "&mdash;", $str );   # em dash
        $str = str_replace( chr(153), "&trade;", $str );   # trademark
        $str = str_replace( chr(169), "&copy;", $str );    # copyright mark
        $str = str_replace( chr(174), "&reg;", $str );     # registration mark
    
        return $str;
    }

Other solutions:
Code:
str_replace()
Code:
preg_replace()
etc.
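
A note on the attempts above: chr() only produces single bytes (0 to 255), so chr(8217) cannot stand for the right single quote (U+2019), and bytes such as chr(145) only match when the input is Windows-1252. If the posted text is actually UTF-8 (an assumption, based on the page declaring a UTF-8 charset), each smart quote arrives as a three-byte sequence. A minimal sketch under that assumption; the function name smart_quotes_to_entities is hypothetical:

```php
<?php
// Sketch: map UTF-8 encoded smart punctuation to HTML entities.
// Assumes $string is valid UTF-8; the function name is hypothetical.
function smart_quotes_to_entities($string)
{
    $map = array(
        "\xE2\x80\x98" => '&lsquo;', // U+2018 left single quote
        "\xE2\x80\x99" => '&rsquo;', // U+2019 right single quote
        "\xE2\x80\x9C" => '&ldquo;', // U+201C left double quote
        "\xE2\x80\x9D" => '&rdquo;', // U+201D right double quote
        "\xE2\x80\x94" => '&mdash;', // U+2014 em dash
    );
    // strtr() with an array replaces the byte sequences in one pass.
    return strtr($string, $map);
}

echo smart_quotes_to_entities("It\xE2\x80\x99s \xE2\x80\x9Cfine\xE2\x80\x9D");
// prints: It&rsquo;s &ldquo;fine&rdquo;
```

When the input really is UTF-8, PHP's built-in htmlentities($string, ENT_QUOTES, 'UTF-8') should cover the same characters without a hand-written map, although it also converts every other character that has a named entity.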

#4
[eluser]Tominator[/eluser]
Do you use the same encoding in HTML and PHP (meta tag + file encoding)?

#5
[eluser]TheFuteballer[/eluser]
My HTML meta declaration is UTF-8, and in case my PHP encoding was off I ran the posted content through PHP's utf8_encode() function, with no success.
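
For context, utf8_encode() converts from ISO-8859-1 to UTF-8, and in ISO-8859-1 the bytes 0x80 to 0x9F are control characters rather than smart quotes, so it would not repair Windows-1252 input. Running it over text that is already UTF-8 also double-encodes it. A hedged sketch that converts only when the input is not valid UTF-8, assuming the fallback encoding is Windows-1252 and the mbstring extension is loaded; to_utf8 is a hypothetical helper name:

```php
<?php
// Sketch: normalize posted text to UTF-8.
// Assumptions: mbstring is loaded; any non-UTF-8 input is Windows-1252.
function to_utf8($string)
{
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($string, 'UTF-8', 'Windows-1252');
}

// chr(146) is the Windows-1252 byte for the right single quote.
$fixed = to_utf8('It' . chr(146) . 's');
echo bin2hex($fixed); // 4974e2809973
```

The validity check matters here: converting text that is already UTF-8 a second time is a common source of the same kind of garbled output.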

#6
[eluser]Tominator[/eluser]
I see but are your files saved in UTF-8?

#7
[eluser]TheFuteballer[/eluser]
[quote author="Tominator" date="1272146314"]I see but are your files saved in UTF-8?[/quote]

They seem to be saved in US-ASCII (although it seems you can't be sure; I'm on OS X, by the way). Could this be the root cause?

Could you suggest the best way to re-encode the files into UTF-8?

#8
[eluser]megabyte[/eluser]
I may be wrong, but can't you just change your database table to UTF-8 encoding, and your web pages' meta charset to UTF-8?
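
In CodeIgniter terms, that suggestion might look roughly like the fragment below. This is a sketch, not a confirmed fix for this thread: it assumes a MySQL database and the default connection group, and the collation shown is only one common choice.

```php
<?php
// application/config/config.php
// Charset CodeIgniter uses for functions that need a character set.
$config['charset'] = 'UTF-8';

// application/config/database.php
// Connection charset and collation (assumes MySQL and the 'default' group).
$db['default']['char_set'] = 'utf8';
$db['default']['dbcollat'] = 'utf8_general_ci';
```

The table and column character sets in the database itself have to match as well; these settings only cover the framework and the connection.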

#9
[eluser]Tominator[/eluser]
[quote author="TheFuteballer" date="1272608339"][quote author="Tominator" date="1272146314"]I see but are your files saved in UTF-8?[/quote]

They seem to be saved in US-ASCII (although it seems you can't be sure; I'm on OS X, by the way). Could this be the root cause?

Could you suggest the best way to re-encode the files into UTF-8?[/quote]

I suggest setting everything (HTML, files, DB) to one encoding. I prefer UTF-8.

#10
[eluser]TheFuteballer[/eluser]
[quote author="megabyte" date="1272623576"]I may be wrong, but can't you just change your database table to UTF-8 encoding, and your web pages' meta charset to UTF-8?[/quote]

This has already been done and I still see the issues...


@Tominator, I've attempted to do just that and made sure everything is UTF-8, and I still have the same issues.

