Lacking Unicode support in the file uploading class?
#1

[eluser]Arjen van Bochoven[/eluser]
It is probably a PHP issue, but it would be nice if CI checked for Unicode support or sanitized the filename somewhat more strictly.

When I upload a file with Unicode characters in the filename, I get an error. Can anyone confirm this? I followed the guidelines in the User Guide, which otherwise resulted in a working application.

I'm not too good at handling Unicode, so I took the url_title function and extended the CI_Upload class with this:
Code:
<?php  if (!defined('BASEPATH')) exit('No direct script access allowed');


class MY_Upload extends CI_Upload {

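    /**
     * Extra filename cleaning for unicode chars etc.
     *
     * @param string $filename
     * @return string
     * @author bochoven
     */
    function clean_file_name($filename)
    {
        $replace = '_';

        // Applied in order; each key is a regex body that gets wrapped in "#" below
        $trans = array(
                        "\s+"                        => $replace, // runs of whitespace become "_"
                        "[^a-z0-9\.".$replace."]"    => '',       // drop anything but a-z, 0-9, "." and "_"
                        $replace."+"                 => $replace, // collapse repeated "_"
                        $replace."$"                 => '',       // strip a trailing "_"
                        "^".$replace                 => ''        // strip a leading "_"
                       );

        // Lowercase first, because the whitelist above only allows a-z
        $filename = strip_tags(strtolower($filename));

        foreach ($trans as $key => $val)
        {
            $filename = preg_replace("#".$key."#", $val, $filename);
        }

        return trim(stripslashes($filename));
    }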
}

This strips the problematic characters from the filename, and all is fine.
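
To make the effect of each step easier to see, here is a rough standalone sketch of the same cleaning logic that runs outside of CI. The function name clean_name_demo and the sample filenames are made up for illustration; they are not part of CI or of the class above.

```php
<?php
// Standalone copy of the cleaning steps, for experimenting outside of CI.
// clean_name_demo is a hypothetical name used only for this illustration.
function clean_name_demo($filename)
{
    $replace = '_';
    $trans = array(
        '\s+'                         => $replace, // whitespace runs become "_"
        '[^a-z0-9\.' . $replace . ']' => '',       // keep only a-z, 0-9, "." and "_"
        $replace . '+'                => $replace, // collapse repeated "_"
        $replace . '$'                => '',       // strip a trailing "_"
        '^' . $replace                => '',       // strip a leading "_"
    );

    $filename = strip_tags(strtolower($filename));

    foreach ($trans as $key => $val) {
        $filename = preg_replace('#' . $key . '#', $val, $filename);
    }

    return trim(stripslashes($filename));
}

// Plain ASCII name: lowercased, spaces turned into underscores
echo clean_name_demo('A picture of the beach.jpg'), "\n";
// a_picture_of_the_beach.jpg

// UTF-8 bytes for "Ünïcode fïle.JPG": the non-ASCII bytes are dropped
echo clean_name_demo("\xC3\x9Cn\xC3\xAFcode f\xC3\xAFle.JPG"), "\n";
// ncode_fle.jpg

// Leading whitespace and doubled underscores are tidied up
echo clean_name_demo('  my   file__name_.txt'), "\n";
// my_file_name_.txt
```

Note that non-ASCII characters are removed rather than transliterated, so a name consisting only of such characters ends up as little more than its extension.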
#2

[eluser]xwero[/eluser]
You are right about Unicode being a PHP issue; Unicode support is scheduled for PHP 6.

Instead of cleaning up the filename, you could use the encrypt_name option in the upload class configuration.
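
As a rough sketch of what that looks like: the config keys below (upload_path, allowed_types, encrypt_name) are the upload preferences from the User Guide, while random_name_sketch is a hypothetical helper that only imitates the kind of name you end up with. It is an assumption about the general idea, not CI's actual code.

```php
<?php
// Upload preferences; the values here are example settings
$config = array(
    'upload_path'   => './uploads/',
    'allowed_types' => 'gif|jpg|png',
    'encrypt_name'  => TRUE, // store the file under a random hashed name
);

// Inside a CI controller the array would be handed to the library:
// $this->load->library('upload', $config);

// Hypothetical helper: imitates a random 32-character hex name that
// keeps the original extension. Not taken from CI itself.
function random_name_sketch($original)
{
    $ext  = pathinfo($original, PATHINFO_EXTENSION);
    $name = md5(uniqid(mt_rand(), true));

    return $ext === '' ? $name : $name . '.' . $ext;
}

// Prints 32 hex characters followed by ".jpg"; the value differs on every run
echo random_name_sketch('A picture of the beach.jpg'), "\n";
```

Because the stored name never contains any of the original characters, the Unicode problem simply does not come up, at the cost of losing the readable filename.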
#3

[eluser]Arjen van Bochoven[/eluser]
Thanks for the info, I could not find anything useful on php.net.

Although I like the encryption option, I don't want my end users confronted with d28e2bca687873aba49b22dfd61c3443.jpg when they upload "A picture of the beach.jpg". I would like 'proper' filenames to pass through while improper ones get sanitized.
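
A minimal sketch of that pass-through idea, assuming "proper" means a name made only of ASCII letters, digits, spaces, dots, underscores and hyphens. Both function names and the whitelist are illustrative choices, not CI behaviour.

```php
<?php
// Hypothetical check: TRUE when the name contains only whitelisted ASCII characters
function is_plain_ascii_name($filename)
{
    return preg_match('/\A[A-Za-z0-9 ._-]+\z/', $filename) === 1;
}

// Hypothetical wrapper: leave proper names alone, strip everything else
function sanitize_if_needed($filename)
{
    if (is_plain_ascii_name($filename)) {
        return $filename; // passes through untouched, case and spaces included
    }

    // Drop every byte that is not on the whitelist
    return preg_replace('/[^A-Za-z0-9 ._-]/', '', $filename);
}

echo sanitize_if_needed('A picture of the beach.jpg'), "\n";
// A picture of the beach.jpg

// UTF-8 bytes for "Strände 2008.jpg": only the offending bytes are removed
echo sanitize_if_needed("Str\xC3\xA4nde 2008.jpg"), "\n";
// Strnde 2008.jpg
```

Unlike the clean_file_name override, this variant does not lowercase or replace spaces, so it changes a name only when it has to.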
#4

[eluser]xwero[/eluser]
I understand; I was just pointing out that CI has built-in options to work around the Unicode problem for files. Cleaning the filename is a nice option, so thank you for sharing.
#5

[eluser]esra[/eluser]
Harry Fuecks maintains a collection of Unicode UTF-8 encoding string handlers on SourceForge. The string handlers were originally designed for the WACT framework and released later as a standalone package for more widespread use.

Some information about the larger problem, with references to the string handlers:

http://www.phpwact.org/php/i18n/charsets

The SourceForge download for the string handlers is here:

http://sourceforge.net/projects/phputf8

These use mbstring if it is enabled in php.ini for a slight performance gain. I believe that these string handlers were added as a helper to the Kohana fork.
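
The general pattern of "use mbstring when it is there, otherwise fall back to plain PHP" can be sketched roughly like this. The function names are hypothetical and this is not code from the phputf8 package; it only illustrates the idea with a character count.

```php
<?php
// Fallback: count UTF-8 characters with PCRE's /u modifier.
// Returns FALSE if the input is not valid UTF-8.
function utf8_length_fallback($str)
{
    return preg_match_all('/./us', $str, $matches);
}

// Prefer mbstring when the extension is loaded, otherwise use the fallback
function utf8_length_sketch($str)
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }

    return utf8_length_fallback($str);
}

// UTF-8 bytes for "Référencement": 13 characters stored in 15 bytes
$name = "R\xC3\xA9f\xC3\xA9rencement";

echo strlen($name), "\n";             // 15 (bytes)
echo utf8_length_sketch($name), "\n"; // 13 (characters)
```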
#6

[eluser]Référencement Google[/eluser]
Thanks for sharing, Arjen van Bochoven. I was just having the same issue and looking for a solution, and this works perfectly for me!



