CodeIgniter Forums
Text input not being processed properly!


Text input not being processed properly! - El Forum - 04-23-2010

[eluser]TheFuteballer[/eluser]
I'm having trouble with basic text-area handling. My application has a textarea where users can paste content and submit it to a database. When I output whatever is stored in the database, I keep getting weird characters, as if CI isn't processing them properly.

The characters in question are 'smart' quotes such as ’ and “, each of which has its own HTML entity. When these characters are submitted, I have not found a single PHP function I can run the posted input through that converts them into the correct HTML entities so they display properly in a browser.

Just as a note, I have XSS filtering on by default, but I doubt that is causing any issues.

Any help would be greatly appreciated!


Text input not being processed properly! - El Forum - 04-24-2010

[eluser]Tominator[/eluser]
Please post your code. I think the documentation says that Active Record applies XSS filtering (or some kind of security escaping) by default.


Text input not being processed properly! - El Forum - 04-24-2010

[eluser]TheFuteballer[/eluser]
I have a feeling it's a charset issue that's preventing this from displaying properly. As an example, this block of code: bit.ly/bSqP2l is one of many that cause problems because the quotes are not converted properly.


My code, however, is pretty simple: in my controller I'm just pulling the textarea content
Code:
$this->input->post('sampleTextArea');

and placing it in my database.

Now I've tried implementing the following functions to replace these odd characters and none of them work:

Solution 1:
Code:
function _convert_smart_quotes($string)
{
    $search = array(chr(145), chr(8217), chr(147), chr(148), chr(151));
    // OR   $search = array('‘','’','“','”');
    $replace = array('‘', '’', '“', '”', '-');
    return str_replace($search, $replace, $string);
}


Solution 2:
Code:
function normalize_special_characters( $str )
    {
        # Quotes cleanup
        $str = str_replace( chr(ord("`")), "'", $str );        # `
        $str = str_replace( chr(8217), "'", $str );        # ´
        $str = str_replace( chr(ord("„")), ",", $str );        # „
        $str = str_replace( chr(ord("`")), "'", $str );        # `
        $str = str_replace( chr(ord("´")), "'", $str );        # ´
        $str = str_replace( chr(ord("“")), "\"", $str );        # “
        $str = str_replace( chr(ord("”")), "\"", $str );        # ”
        $str = str_replace( chr(ord("´")), "'", $str );        # ´
    
        $unwanted_array = array(    'Š'=>'S', 'š'=>'s', 'Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A', 'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
                                    'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I', 'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U',
                                    'Ú'=>'U', 'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss', 'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 'ç'=>'c',
                                    'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o',
                                    'ö'=>'o', 'ø'=>'o', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ý'=>'y', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y' );
        $str = strtr( $str, $unwanted_array );
    
        # Bullets, dashes, and trademarks
        $str = str_replace( chr(149), "•", $str );    # bullet •
        $str = str_replace( chr(150), "–", $str );    # en dash
        $str = str_replace( chr(151), "—", $str );    # em dash
        $str = str_replace( chr(153), "™", $str );    # trademark
        $str = str_replace( chr(169), "©", $str );    # copyright mark
        $str = str_replace( chr(174), "®", $str );        # registration mark
    
        return $str;
    }

Other solutions:
Code:
str_replace()
Code:
preg_replace()
etc.
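
A possible reason neither function above has any effect, assuming the posted text arrives as UTF-8: chr() only produces single bytes (values outside 0-255 wrap around, so chr(8217) ends up the same as chr(25)), and chr(145) through chr(151) are the Windows-1252 bytes for these characters, not their UTF-8 form. In UTF-8 each smart quote is a three-byte sequence. A minimal sketch that matches those sequences directly; the function name and the choice of named entities are illustrative, not something CodeIgniter provides:

```php
<?php
// Hypothetical helper: assumes $string is already valid UTF-8.
function convert_smart_quotes_utf8($string)
{
    // chr() can only yield single bytes, so the multi-byte UTF-8
    // characters are written here as explicit byte sequences.
    $map = array(
        "\xE2\x80\x98" => '&lsquo;', // U+2018 left single quote
        "\xE2\x80\x99" => '&rsquo;', // U+2019 right single quote
        "\xE2\x80\x9C" => '&ldquo;', // U+201C left double quote
        "\xE2\x80\x9D" => '&rdquo;', // U+201D right double quote
        "\xE2\x80\x94" => '&mdash;', // U+2014 em dash
    );

    // strtr() with an array replaces each key with its value in one pass.
    return strtr($string, $map);
}

// prints: It&rsquo;s &ldquo;quoted&rdquo;
echo convert_smart_quotes_utf8("It\xE2\x80\x99s \xE2\x80\x9Cquoted\xE2\x80\x9D"), "\n";
```

If the page, the form submission, and the database connection are all UTF-8, this conversion should not be needed at all, since the characters can be stored and echoed as they are.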


Text input not being processed properly! - El Forum - 04-24-2010

[eluser]Tominator[/eluser]
Do you use the same encoding in HTML and PHP (meta tag + file encoding)?


Text input not being processed properly! - El Forum - 04-24-2010

[eluser]TheFuteballer[/eluser]
My HTML meta declaration is UTF-8, and in case my PHP encoding was off I ran the posted content through PHP's utf8_encode() function, with no success.
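
One thing worth knowing about utf8_encode(): it assumes its input is ISO-8859-1 and converts that to UTF-8, so running it on text that is already UTF-8 encodes it a second time and tends to produce more garbage rather than less. A hedged sketch of a safer approach, which converts only when the input is not valid UTF-8. The helper name to_utf8() is hypothetical, the mbstring extension is assumed to be loaded, and treating non-UTF-8 input as Windows-1252 is an assumption about where the pasted text comes from:

```php
<?php
// Hypothetical helper: normalise a string to UTF-8 without double-encoding.
function to_utf8($str)
{
    // Already valid UTF-8: return it untouched.
    if (mb_check_encoding($str, 'UTF-8')) {
        return $str;
    }

    // Assumption: anything else is Windows-1252, the usual source of
    // single-byte smart quotes (bytes 0x91 to 0x94).
    return mb_convert_encoding($str, 'UTF-8', 'Windows-1252');
}

$from_windows = to_utf8("\x92");          // single-byte right quote
$already_utf8 = to_utf8("\xE2\x80\x99");  // same character, already UTF-8
var_dump($from_windows === $already_utf8);
```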


Text input not being processed properly! - El Forum - 04-24-2010

[eluser]Tominator[/eluser]
I see but are your files saved in UTF-8?


Text input not being processed properly! - El Forum - 04-29-2010

[eluser]TheFuteballer[/eluser]
[quote author="Tominator" date="1272146314"]I see but are your files saved in UTF-8?[/quote]

They seem to be saved as US-ASCII (although it seems you can't be sure; I'm on OS X, by the way). Could this be the root cause?

Could you suggest the best way to re-encode the files into UTF-8?
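
On the US-ASCII report: ASCII is a subset of UTF-8, so a file that contains only ASCII bytes is already valid UTF-8, and tools will usually label it US-ASCII simply because nothing in it needs more than one byte. Re-saving only matters if the file holds literal non-ASCII characters. A small sketch to check a file from PHP; file_is_utf8() is a hypothetical helper and the mbstring extension is assumed:

```php
<?php
// Hypothetical check: report whether a file's bytes form valid UTF-8.
function file_is_utf8($path)
{
    $bytes = file_get_contents($path);

    // An unreadable file counts as "not UTF-8" here.
    return $bytes !== false && mb_check_encoding($bytes, 'UTF-8');
}

// Example: check this script itself.
var_dump(file_is_utf8(__FILE__));
```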


Text input not being processed properly! - El Forum - 04-29-2010

[eluser]megabyte[/eluser]
I may be wrong, but can't you just change your database table encoding to UTF-8, and your web pages' meta charset to UTF-8?


Text input not being processed properly! - El Forum - 04-30-2010

[eluser]Tominator[/eluser]
[quote author="TheFuteballer" date="1272608339"][quote author="Tominator" date="1272146314"]I see but are your files saved in UTF-8?[/quote]

They seem to be saved as US-ASCII (although it seems you can't be sure; I'm on OS X, by the way). Could this be the root cause?

Could you suggest the best way to re-encode the files into UTF-8?[/quote]

I suggest setting everything (HTML, files, DB) to one encoding. I prefer UTF-8.
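
For reference, a sketch of the settings this usually involves in a CodeIgniter application, assuming the stock config file layout and a MySQL database. The values are illustrative and should be checked against your own install:

```php
<?php
// application/config/config.php
$config['charset'] = 'UTF-8';

// application/config/database.php (assumes the 'default' connection group)
$db['default']['char_set'] = 'utf8';
$db['default']['dbcollat'] = 'utf8_general_ci';
```

The character set of the tables and columns themselves, and the charset declared in the page's meta tag, need to agree with these for the round trip to work.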


Text input not being processed properly! - El Forum - 04-30-2010

[eluser]TheFuteballer[/eluser]
[quote author="megabyte" date="1272623576"]I may be wrong, but can't you just change your database table encoding to UTF-8, and your web pages' meta charset to UTF-8?[/quote]

This has already been done and I still see the issues...


@Tominator, I've attempted to do just that and made sure everything is UTF-8, and I still have the same issues.