CodeIgniter Forums
lacking unicode support for file uploading class?


lacking unicode support for file uploading class? - El Forum - 09-29-2007

[eluser]Arjen van Bochoven[/eluser]
It is probably a PHP issue, but it would be nice if CI checked for Unicode support or sanitized the filename somewhat more strictly.

When I upload a file with Unicode chars in the filename, I get an error. Can anyone confirm this? I followed the guidelines in the User Guide, which gave me a working application.

I'm not too good at handling Unicode, so I took the url_title function and extended the CI_Upload class with this:
Code:
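<?php  if (!defined('BASEPATH')) exit('No direct script access allowed');


class MY_Upload extends CI_Upload {

    /**
     * Extra filename cleaning for unicode chars etc.
     *
     * Lowercases the name, turns whitespace into underscores and drops
     * every character other than a-z, 0-9, dot and underscore.
     *
     * @param  string
     * @return string
     * @author bochoven
     **/
    function clean_file_name($filename)
    {
        $replace = '_';

        // Patterns are applied in this order, each wrapped in # delimiters below.
        $trans = array(
                        '\s+'                        => $replace,   // whitespace runs become one underscore
                        '[^a-z0-9\.'.$replace.']'    => '',         // drop everything else, including non-ASCII bytes
                        $replace.'+'                 => $replace,   // collapse repeated underscores
                        $replace.'$'                 => '',         // no trailing underscore
                        '^'.$replace                 => ''          // no leading underscore
                       );

        // Lowercase first, because the character class above only allows a-z.
        $filename = strip_tags(strtolower($filename));

        foreach ($trans as $key => $val)
        {
            $filename = preg_replace('#'.$key.'#', $val, $filename);
        }

        return trim(stripslashes($filename));
    }
}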

This strips the filename down to safe characters, and all is fine.
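
To see what the cleaning steps above do to real names, here is a standalone copy that runs outside CodeIgniter. It is only a sketch for experimenting: demo_clean_file_name() is a made-up name, and the sample filenames are my own. The non-ASCII samples are written as UTF-8 byte escapes so the result does not depend on how the file is saved.

```php
<?php
// Standalone copy of the cleaning steps, for experimenting outside CodeIgniter.
// demo_clean_file_name() is a hypothetical name, not part of CI.
function demo_clean_file_name($filename)
{
    $replace = '_';
    $trans = array(
        '\s+'                     => $replace, // whitespace runs become one underscore
        '[^a-z0-9\.'.$replace.']' => '',       // drop everything else, including non-ASCII bytes
        $replace.'+'              => $replace, // collapse repeated underscores
        $replace.'$'              => '',       // no trailing underscore
        '^'.$replace              => ''        // no leading underscore
    );

    $filename = strip_tags(strtolower($filename));

    foreach ($trans as $pattern => $replacement)
    {
        $filename = preg_replace('#'.$pattern.'#', $replacement, $filename);
    }

    return trim(stripslashes($filename));
}

echo demo_clean_file_name('A picture of the beach.jpg'), "\n"; // a_picture_of_the_beach.jpg
echo demo_clean_file_name("Caf\xC3\xA9 photo.JPG"), "\n";      // caf_photo.jpg
echo demo_clean_file_name("\xE6\x97\xA5\xE6\x9C\xAC.jpg"), "\n"; // .jpg
```

The last sample shows the limit of this approach: a name written entirely in non-ASCII characters is reduced to its extension, and hyphens are dropped as well, so a fallback name may be needed for those cases.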


lacking unicode support for file uploading class? - El Forum - 09-30-2007

[eluser]xwero[/eluser]
You are right about Unicode being a PHP issue; Unicode support is scheduled for PHP 6.

Instead of cleaning up the filename, you could use the encrypt_name option in the upload configuration.
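
As far as I can tell, the option meant here is the encrypt_name preference of the File Uploading class. A minimal sketch of how it might be switched on is below. It is a fragment for a controller method, not a standalone script, and the upload path and allowed types are placeholder values of my own.

```php
<?php
// Fragment meant to live inside a CodeIgniter controller method.
$config['upload_path']   = './uploads/';  // assumed directory
$config['allowed_types'] = 'gif|jpg|png'; // assumed whitelist
$config['encrypt_name']  = TRUE;          // store the file under a generated name

$this->load->library('upload', $config);

if ( ! $this->upload->do_upload())
{
    // Upload failed: show the library's error messages.
    echo $this->upload->display_errors();
}
else
{
    // Upload succeeded: file_name holds the generated name, not the original one.
    $data = $this->upload->data();
    echo $data['file_name'];
}
```

With this setting the original filename never reaches the filesystem, so Unicode characters in it stop mattering.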


lacking unicode support for file uploading class? - El Forum - 09-30-2007

[eluser]Arjen van Bochoven[/eluser]
Thanks for the info; I could not find anything useful on php.net.

Although I like the encryption option, I don't like my end users being confronted with d28e2bca687873aba49b22dfd61c3443.jpg when they upload "A picture of the beach.jpg". I would like 'proper' filenames to pass through while improper ones get sanitized.
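
One way to get that behaviour could look like the sketch below. It is not from CI: demo_safe_file_name() is a made-up helper, and what counts as a 'proper' name (ASCII letters, digits, spaces, dots, underscores, hyphens) is my own assumption. Anything else is cleaned, and if nothing readable is left the name falls back to an md5 hash.

```php
<?php
// Hypothetical helper: keep "proper" names, sanitize the rest.
function demo_safe_file_name($filename)
{
    // Names made only of ASCII letters, digits, spaces, dots,
    // underscores and hyphens pass through untouched.
    if (preg_match('#^[A-Za-z0-9 ._-]+$#', $filename))
    {
        return $filename;
    }

    // Split off the extension so it survives the cleanup.
    $dot  = strrpos($filename, '.');
    $base = ($dot === FALSE) ? $filename : substr($filename, 0, $dot);
    $ext  = ($dot === FALSE) ? '' : substr($filename, $dot);

    // Replace each run of disallowed bytes with one underscore,
    // then trim underscores from both ends.
    $clean_base = trim(preg_replace('#[^A-Za-z0-9._-]+#', '_', $base), '_');
    $clean_ext  = preg_replace('#[^A-Za-z0-9.]+#', '', $ext);

    // Nothing readable left: fall back to a hash of the original name.
    if ($clean_base === '')
    {
        $clean_base = md5($filename);
    }

    return $clean_base.$clean_ext;
}

echo demo_safe_file_name('A picture of the beach.jpg'), "\n"; // A picture of the beach.jpg
echo demo_safe_file_name("Caf\xC3\xA9 photo.jpg"), "\n";      // Caf_photo.jpg
echo demo_safe_file_name("\xE6\x97\xA5\xE6\x9C\xAC.jpg"), "\n"; // 32 hex characters followed by .jpg
```

This only decides what the file is called; checking the file type is still up to the upload class settings.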


lacking unicode support for file uploading class? - El Forum - 09-30-2007

[eluser]xwero[/eluser]
I understand. I was just pointing out that CI has built-in options to work around the Unicode problem for files. Cleaning the filename is a nice option, so thank you for sharing.


lacking unicode support for file uploading class? - El Forum - 09-30-2007

[eluser]esra[/eluser]
Harry Fuecks maintains a collection of Unicode UTF-8 string handlers on SourceForge. The string handlers were originally designed for the WACT framework and released later as a standalone package for more widespread use.

Here is some information about the larger problem, with references to the string handlers:

http://www.phpwact.org/php/i18n/charsets

The SourceForge download for the string handlers is here:

http://sourceforge.net/projects/phputf8

These use mbstring if it is enabled in php.ini for a slight performance gain. I believe that these string handlers were added as a helper to the Kohana fork.
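
The "use mbstring if it is enabled" idea can be sketched like this. This is not code from the phputf8 package, just an illustration of the pattern; demo_strtolower() is a made-up name, and the plain strtolower() fallback only handles ASCII letters reliably.

```php
<?php
// Hypothetical wrapper: prefer mbstring when the extension is loaded.
function demo_strtolower($str)
{
    if (function_exists('mb_strtolower'))
    {
        // Multibyte-aware lowercasing, treating the input as UTF-8.
        return mb_strtolower($str, 'UTF-8');
    }

    // Fallback: byte-oriented, so only ASCII letters are reliably lowercased.
    return strtolower($str);
}

echo demo_strtolower('BEACH.JPG'), "\n"; // beach.jpg
```

A wrapper like this could replace the strtolower() call in a filename cleaner, so that accented capitals are lowercased before any further processing.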


lacking unicode support for file uploading class? - El Forum - 10-01-2007

[eluser]Référencement Google[/eluser]
Thanks for sharing, Arjen van Bochoven. I was just having the same issue and looking for a solution, and that works perfectly for me!