lacking unicode support for file uploading class?

#1
[eluser]Arjen van Bochoven[/eluser]
It is probably a PHP issue, but it would be nice if CI checked for Unicode support or sanitized the filename somewhat more strictly.

When I upload a file with Unicode characters in the filename, I get an error. Can anyone confirm this? I followed the guidelines in the User Guide, which resulted in a working application.

I'm not too good at handling Unicode, so I took the url_title function and extended the CI_Upload class with this:
Code:
<?php  if (!defined('BASEPATH')) exit('No direct script access allowed');

class MY_Upload extends CI_Upload {

    /**
     * Extra filename cleaning for unicode chars etc.
     *
     * @param  string $filename
     * @return string
     * @author bochoven
     **/
    function clean_file_name($filename)
    {
        $replace = '_';

        // Patterns are applied in order, each wrapped in # delimiters below.
        $trans = array(
            "\s+"                     => $replace, // whitespace runs become one underscore
            "[^a-z0-9\.".$replace."]" => '',       // drop everything else, including non-ASCII bytes
            $replace."+"              => $replace, // collapse repeated underscores
            $replace."$"              => '',       // remove a trailing underscore
            "^".$replace              => ''        // remove a leading underscore
        );

        $filename = strip_tags(strtolower($filename));

        foreach ($trans as $key => $val)
        {
            $filename = preg_replace("#".$key."#", $val, $filename);
        }

        return trim(stripslashes($filename));
    }
}

This strips the unsupported characters from the filename, and all is fine.
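
To see what these rules do to a name, they can be run outside CodeIgniter as a plain function. This is only a sketch: the function name clean_file_name_sketch is made up for illustration, and the second sample name is an assumed example of a UTF-8 filename.

```php
<?php
// Standalone sketch of the cleaning rules, without CI_Upload.
// clean_file_name_sketch is a hypothetical name, not part of CodeIgniter.
function clean_file_name_sketch($filename)
{
    $replace = '_';
    $rules = array(
        '\s+'                         => $replace, // whitespace runs become one underscore
        '[^a-z0-9\.' . $replace . ']' => '',       // drop everything else, byte by byte
        $replace . '+'                => $replace, // collapse repeated underscores
        $replace . '$'                => '',       // remove a trailing underscore
        '^' . $replace                => '',       // remove a leading underscore
    );

    $filename = strip_tags(strtolower($filename));

    foreach ($rules as $pattern => $replacement) {
        $filename = preg_replace('#' . $pattern . '#', $replacement, $filename);
    }

    return trim(stripslashes($filename));
}

echo clean_file_name_sketch('A picture of the beach.jpg'), "\n"; // a_picture_of_the_beach.jpg
echo clean_file_name_sketch("Caf\xC3\xA9 photo.JPG"), "\n";      // caf_photo.jpg
```

Note that the multibyte character is dropped rather than transliterated, so "é" does not become "e".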

#2
[eluser]xwero[/eluser]
You are right about Unicode being a PHP issue; Unicode support is scheduled for PHP 6.

Instead of cleaning up the filename, you could use the encrypt_name option in the upload class configuration.
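
A minimal configuration sketch, assuming the option meant here is the upload library's encrypt_name preference. The upload path and allowed types are placeholder values. In a controller, the array would be passed to $this->load->library('upload', $config).

```php
<?php
// Placeholder values; adjust to the application.
$config = array();
$config['upload_path']   = './uploads/';
$config['allowed_types'] = 'gif|jpg|png';

// Store the file under a generated name instead of the uploaded one,
// which sidesteps problematic characters in the original filename.
$config['encrypt_name']  = TRUE;
```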

#3
[eluser]Arjen van Bochoven[/eluser]
Thanks for the info, I could not find anything useful on php.net.

Although I like the encryption option, I don't like my end users being confronted with d28e2bca687873aba49b22dfd61c3443.jpg when they upload "A picture of the beach.jpg". I would like 'proper' filenames to pass through while improper ones get sanitized.
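
One way to sketch that pass-through idea is shown here. What counts as a 'proper' name is an assumption (letters, digits, space, dot, dash, underscore), and maybe_clean_file_name is a hypothetical helper, not a CodeIgniter method.

```php
<?php
// Hypothetical helper: keep acceptable names as-is, sanitize the rest.
function maybe_clean_file_name($filename)
{
    // Assumed definition of "proper": only these ASCII characters.
    if (preg_match('/\A[A-Za-z0-9 ._-]+\z/', $filename)) {
        return $filename;
    }

    // Fallback cleanup for everything else.
    $filename = strtolower($filename);
    $filename = preg_replace('/\s+/', '_', $filename);        // whitespace to underscores
    $filename = preg_replace('/[^a-z0-9._]/', '', $filename); // drop other bytes
    $filename = preg_replace('/_+/', '_', $filename);         // collapse underscores

    return trim($filename, '_');
}

echo maybe_clean_file_name('A picture of the beach.jpg'), "\n"; // A picture of the beach.jpg
echo maybe_clean_file_name("Strand f\xC3\xBCr alle.jpg"), "\n"; // strand_fr_alle.jpg
```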

#4
[eluser]xwero[/eluser]
I understand. I was just suggesting that there are built-in CI options to circumvent the Unicode problem for files. Cleaning the filename is a nice option, so thank you for sharing.

#5
[eluser]esra[/eluser]
Harry Fuecks maintains a collection of Unicode UTF-8 encoding string handlers on SourceForge. The string handlers were originally designed for the WACT framework and released later as a standalone package for more widespread use.

Some information about the larger problem, with references to the string handlers:

http://www.phpwact.org/php/i18n/charsets

The SourceForge download for the string handlers is here:

http://sourceforge.net/projects/phputf8

These use mbstring if it is enabled in php.ini for a slight performance gain. I believe that these string handlers were added as a helper to the Kohana fork.
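
The general "use mbstring when available, otherwise fall back" pattern can be sketched as follows. This is not the phputf8 API: utf8_length_sketch is a hypothetical name, and the fallback assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper illustrating the mbstring-or-fallback pattern.
function utf8_length_sketch($str)
{
    if (function_exists('mb_strlen')) {
        // mbstring is loaded: let it count the characters.
        return mb_strlen($str, 'UTF-8');
    }

    // Fallback: count every byte that is not a UTF-8 continuation byte
    // (0x80-0xBF), which equals the character count for valid UTF-8.
    return preg_match_all('/[^\x80-\xBF]/', $str, $matches);
}

echo utf8_length_sketch("Caf\xC3\xA9"), "\n"; // 4
```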

#6
[eluser]Référencement Google[/eluser]
Thanks for sharing, Arjen van Bochoven. I was just having the same issue and looking for a solution, and this works perfectly for me!

