CodeIgniter Forums

lacking unicode support for file uploading class?

El Forum

[eluser]Arjen van Bochoven[/eluser]
It is probably a PHP issue, but it would be nice if CI checked for Unicode support or sanitized the filename somewhat more strictly.

When I upload a file with Unicode characters in the filename, I get an error. Can anyone confirm this? I followed the guidelines in the User Guide, which resulted in a working application.

I'm not too good at handling Unicode, so I took the url_title function and extended the CI_Upload class with this:
<?php  if (!defined('BASEPATH')) exit('No direct script access allowed');

class MY_Upload extends CI_Upload {

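    /**
     * Extra filename cleaning for unicode chars etc.
     *
     * @param string $filename
     * @return string
     * @author bochoven
     */
    function clean_file_name($filename)
    {
        $replace = '_';

        // Patterns are applied in order; each key is a regex body.
        $trans = array(
                        "\s+"                                => $replace, // whitespace runs become one underscore
                        "[^a-z0-9\.".$replace."]"                => '',       // drop everything else, including multibyte bytes
                        $replace."+"                        => $replace, // collapse repeated underscores
                        $replace."$"                        => '',       // no trailing underscore
                        "^".$replace                        => ''        // no leading underscore
                      );

        $filename = strip_tags(strtolower($filename));

        foreach ($trans as $key => $val)
        {
            $filename = preg_replace("#".$key."#", $val, $filename);
        }

        return trim(stripslashes($filename));
    }
}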

This strips the filename down to lowercase letters, digits, dots and underscores, and all is fine.

El Forum

You are right about Unicode being a PHP issue; Unicode support is scheduled for PHP 6.

Instead of cleaning up the filename, you could use the encrypt option of the upload configuration.
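For reference, a rough sketch of what that might look like, assuming the option meant here is the `encrypt_name` preference of the upload class. The upload path and allowed types are placeholder values, not taken from this thread.

```php
<?php
// Sketch only: assumes "the encrypt option" is the encrypt_name
// upload preference. Path and allowed types are placeholders.
$config['upload_path']   = './uploads/';
$config['allowed_types'] = 'gif|jpg|png';
$config['encrypt_name']  = TRUE; // store the file under a generated, hash-like name

// In a controller, the array would then be passed to the upload library:
// $this->load->library('upload', $config);
```

With this set, the stored name no longer depends on the characters in the original filename.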

El Forum

[eluser]Arjen van Bochoven[/eluser]
Thanks for the info, I could not find anything useful on

Although I like the encryption option, I don't like my end users being confronted with d28e2bca687873aba49b22dfd61c3443.jpg when they upload "A picture of the beach.jpg". I would like 'proper' filenames to pass through while improper ones get sanitized.
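One way to sketch that pass-through idea is below. The function name `sanitize_if_needed` is hypothetical and not part of CodeIgniter, and the definition of a 'proper' filename (ASCII letters, digits, spaces, dots, underscores and hyphens) is an assumption made for illustration.

```php
<?php
// Hypothetical helper, not part of CodeIgniter: leaves "proper" names
// untouched and sanitizes everything else.
function sanitize_if_needed($filename)
{
    // Assumption: "proper" means ASCII letters, digits, spaces,
    // dots, underscores and hyphens only.
    if (preg_match('/^[A-Za-z0-9 ._-]+$/', $filename)) {
        return $filename;
    }

    // Otherwise fall back to a strict clean-up (case is preserved here).
    $filename = preg_replace('/\s+/', '_', $filename);            // whitespace runs -> underscore
    $filename = preg_replace('/[^A-Za-z0-9._-]/', '', $filename); // drop all other bytes
    $filename = preg_replace('/_+/', '_', $filename);             // collapse repeated underscores

    return trim($filename, '_');
}

echo sanitize_if_needed('A picture of the beach.jpg'), "\n"; // A picture of the beach.jpg
echo sanitize_if_needed("Caf\xC3\xA9 photo.jpg"), "\n";      // Caf_photo.jpg
```

Note that a name made up entirely of non-ASCII characters would be reduced to just its extension, so a real implementation would probably want a fallback name for that case.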

El Forum

I understand. I was just suggesting that there are built-in CI options to work around the Unicode problem for files. Cleaning the filename is a nice option, so thank you for sharing.

El Forum

Harry Fuecks maintains a collection of Unicode UTF-8 encoding string handlers on SourceForge. The string handlers were originally designed for the WACT framework and released later as a standalone package for more widespread use.

Some information about the larger problem, with references to the string handlers.

Sourceforge download for the string handlers is here:

These use mbstring if it is enabled in php.ini for a slight performance gain. I believe that these string handlers were added as a helper to the Kohana fork.

El Forum

[eluser]Référencement Google[/eluser]
Thanks for sharing, Arjen van Bochoven. I was just having the same issue and looking for a solution, and this works perfectly for me!