UTF-8 BOM file upload conversion before Validation
#1
Question 

Hi CodeIgniters,

I am working on a web app with a standard file upload form that is submitted through a jQuery AJAX call.

Here's the deal:

Allowed files are CSV-formatted and can arrive with either a .txt or a .csv extension. They are exported from Excel, which, as we all know, adds a BOM to text files.
That's why CodeIgniter's validation throws an error, e.g. "csvFile does not have a valid file extension.".

Everything works fine if I preconvert TXT files to UTF-8 without BOM in Notepad++.
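
For context, the UTF-8 BOM is the three-byte sequence EF BB BF at the very start of the file. A minimal sketch of detecting it from a file's leading bytes (the function name hasUtf8Bom is hypothetical and not part of the app above):

```javascript
// Returns true when the byte sequence starts with the UTF-8 BOM (EF BB BF).
// `bytes` is assumed to be a Uint8Array holding at least the start of the file.
function hasUtf8Bom(bytes) {
  return bytes.length >= 3 &&
    bytes[0] === 0xEF &&
    bytes[1] === 0xBB &&
    bytes[2] === 0xBF;
}

// "id" preceded by a BOM, as a BOM-prefixed export would start
console.log(hasUtf8Bom(new Uint8Array([0xEF, 0xBB, 0xBF, 0x69, 0x64]))); // true
// The same text without a BOM
console.log(hasUtf8Bom(new Uint8Array([0x69, 0x64]))); // false
```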

What I tried was manipulating the $_FILES temp file before passing it to validation. This solution is partly based on the example found here: https://www.memorylack.com/solving-bom-o...nd-header/

Here's the code:

HTML

Code:
<?= form_open_multipart('productdata/importfile', 'class=""'); ?>
   <input class="form-control" type="file" id="csvFile" name="csvFile">
   <button type="button" id="importfile" name="importfile" class="btn btn-primary">Upload</button>
<?= form_close(); ?>

jQuery

Code:
$(document).on('click', '#importfile', function() {

   // Get the selected file
   var files = $('#csvFile')[0].files;

   if (files.length > 0)
   {
      var fd = new FormData();

      // Append data
      fd.append('csvFile', files[0]);

      $.ajax({
         "url": site_url + "/productdata/importfile",
         "headers": {'X-Requested-With': 'XMLHttpRequest'},
         "type": "POST",
         "data": fd,
         "contentType": false,
         "processData": false,
         "dataType": "json",
         "success": function(response) {
            // response visualization
         }
      });
      // other stuff
   }
});
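
One alternative to touching the temp file on the server would be to drop the BOM in the browser before the file is appended to the FormData. This is only a sketch, assuming that modifying the upload client-side is acceptable; stripUtf8Bom and fileWithoutBom are hypothetical names, and the server should still validate whatever it receives.

```javascript
// Returns the bytes without a leading UTF-8 BOM (EF BB BF), if one is present.
// Uses subarray(), so no bytes are copied.
function stripUtf8Bom(bytes) {
  var hasBom = bytes.length >= 3 &&
    bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF;
  return hasBom ? bytes.subarray(3) : bytes;
}

// Browser-side variant (assumption: a browser with Blob.arrayBuffer() support).
// Reads only the first three bytes and returns a Blob that starts after the
// BOM, or the original file when there is no BOM.
async function fileWithoutBom(file) {
  var head = new Uint8Array(await file.slice(0, 3).arrayBuffer());
  var clean = stripUtf8Bom(head);
  return clean.length < head.length
    ? file.slice(3, file.size, file.type)
    : file;
}
```

In the click handler this could be used as fileWithoutBom(files[0]).then(function(blob) { fd.append('csvFile', blob, files[0].name); /* then $.ajax(...) */ }); so the original file name, and with it the extension, is kept.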


Custom Helper

PHP Code:
function fileSandBox($csvFile)
{
    $file_handler = fopen($csvFile, "r"); // pathname to target file(s) to remove the BOM from.

    $contents = fread($file_handler, filesize($csvFile));

    fclose($file_handler);

    for ($i = 0; $i < 3; $i++)
    {
        $bytes[$i] = ord(substr($contents, $i, 1));
    }

    if ($bytes[0] == 0xef && $bytes[1] == 0xbb && $bytes[2] == 0xbf)
    {
        $file_handler = fopen($csvFile, "w");
        fwrite($file_handler, substr($contents, 3));
        mb_convert_encoding($contents, 'UTF-8');
        fclose($file_handler);
    }

    return array(
        'csvFileWithoutBom' => [
            'name' => $_FILES['csvFile']['name'],
            'type' => $_FILES['csvFile']['type'],
            'tmp_name' => $_FILES['csvFile']['tmp_name'],
            'error' => $_FILES['csvFile']['error'],
            'size' => $_FILES['csvFile']['size']
        ]
    );
}



Controller

PHP Code:
public function importFile()
{
   $response = array();

   if ($this->request->isAJAX())
   {
      $csvFileWithoutBom = fileSandBox($_FILES['csvFile']['tmp_name']);

      // Validation
      $this->validation->setRules([
         $csvFileWithoutBom['csvFileWithoutBom']['name'] =>
            'uploaded[' . $csvFileWithoutBom['csvFileWithoutBom']['tmp_name'] . ']|'
            . 'max_size[' . $csvFileWithoutBom['csvFileWithoutBom']['tmp_name'] . ',1024]|'
            . 'ext_in[' . $csvFileWithoutBom['csvFileWithoutBom']['tmp_name'] . ',csv,txt]'
      ]);

      // Not valid
      if ($this->validation->withRequest($this->request)->run() == FALSE)
      {
         $response = array(
            'token' => csrf_hash(),
            'success' => '0',
            'error' => $this->validation->getError('csvFile') // Error response
         );
      }
      else
      {
         // Valid
         if ($file = $this->request->getFile('csvFile'))
         {
            if ($file->isValid() && ! $file->hasMoved())
            {
               // all magic stuff here
            }
// etc

$_FILES

Code:
Array
(
    [csvFile] => Array
        (
            [name] => txt_file_no_bom.txt
            [type] => text/plain
            [tmp_name] => /tmp/phpDnAk5I
            [error] => 0
            [size] => 9937
        )

)

Then validation throws "txt_file_no_bom.txt is not a valid uploaded file."

I am pretty sure manipulating $_FILES directly is not a brilliant idea.
Do you guys have any suggestions on how to achieve this goal?

Thanks!
Cheers!
#2

You can turn the BOM off in your programming editor.

Each editor has a different way of doing it so you will have to find yours.
What did you Try? What did you Get? What did you Expect?

Joined CodeIgniter Community 2009.  ( Skype: insitfx )
#3

(10-18-2021, 01:42 AM)InsiteFX Wrote: You can turn the BOM off in your programming editor.

Each editor has a different way of doing it so you will have to find yours.

Thanks!

It is the easiest solution, of course.

However, these txt/csv files come from external sources I have no control over. They contain customer data, mostly exported from Excel.
I cannot just tell the user at the other end of the pipeline to take additional steps and reconvert their files because they have a BOM ("They have what!?", they would ask).

And our goal as developers is to find the most user-friendly solution, right? :)

Cheers!