CodeIgniter Forums
Extract an Excel in background (for large files) - Printable Version



Extract an Excel in background (for large files) - kaitenz - 07-08-2018

Hi,

I'm Kaitenz, nice to meet you.

I didn't want to post this question on Stack Overflow, because you need to show some code there to get answers.
But I don't have any code yet, and I'm using CodeIgniter, so I'll ask here.

I want to create a controller that will allow the users to upload a very large Excel file (I am using PhpSpreadsheet to read Excel files).
And since I will do data validation for each column and then write it to the database, it will take time to process.

My plan is to create a background process for this. Once the file has been uploaded to the server (upload only, with no data validation or insertion yet), the user can leave the page and let PHP process it in the background. But how can I achieve this? I tried searching the web, but none of the search terms I used turned up an answer.


Still learning to develop using PHP and CodeIgniter. So please bear with me. Thanks.

Kaitenz


RE: Extract an Excel in background (for large files) - Pertti - 07-09-2018

When handling big files, you are on the right track: the best solution is to upload the file and handle it with a separate process behind the scenes.

It's up to you to figure out the best solution for you, but it really comes down to three parts: a worker script, a notification system that the worker script can query to see whether it needs to do something, and a way to notify the user that their file has been processed.

For the worker script, you need to make a controller that you can call from the command line:

Code:
php /your-app-folder/index.php controller/method

Make sure you add a command-line check to your controller (https://www.codeigniter.com/user_guide/general/common_functions.html?highlight=cli#is_cli). This is so no one can launch processor-heavy tasks simply by going to the right URL in their browser.

PHP Code:
class Worker extends CI_Controller
{
    public function __construct()
    {
        parent::__construct();

        // Refuse anything that is not a command-line call.
        if (!is_cli()) {
            echo "Unauthorised access.";
            exit;
        }
    }

    public function process()
    {
        // check if there are any files to process, get one queued file

        // if not, exit

        // lock returned file so next process can't start processing same file

        // process file

        // ...
    }
}


When the file is uploaded, assuming you move it to a folder within your project structure, you can record the necessary metadata either in an additional .json file or in a DB entry.
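As a rough illustration of the .json meta-file option, here is a minimal sketch in plain PHP. The function name `queue_upload`, the queue folder, and the metadata fields are all made up for this example and are not part of CodeIgniter; adapt them to whatever your upload controller already knows about the file.

```php
<?php
// Hypothetical helper: writes a .json meta-file for an uploaded spreadsheet
// so that a worker can discover it later. Names and fields are illustrative.
function queue_upload(string $queueDir, string $uploadedFile, int $userId): string
{
    $meta = [
        'file'      => basename($uploadedFile),
        'user_id'   => $userId,
        'queued_at' => date('c'),
    ];

    $metaPath = $queueDir . '/' . basename($uploadedFile) . '.json';
    $tmpPath  = $metaPath . '.tmp';

    // Write under a temporary name first, then rename, so a worker scanning
    // for *.json never picks up a half-written file.
    if (file_put_contents($tmpPath, json_encode($meta)) === false) {
        throw new RuntimeException("Could not write $tmpPath");
    }
    if (!rename($tmpPath, $metaPath)) {
        throw new RuntimeException("Could not create $metaPath");
    }

    return $metaPath;
}
```

You would call something like this from the upload controller right after the file has been moved into place, and do nothing else in that request.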

The thing to look out for here is that you want to set it up so that you can lock files that are being processed. You could have a state field in your DB, so you fetch one row in the "pending" state but never one that is "in progress"; or, if you go with the meta-file solution, you can scan your special folder for .json files and rename each one to .locked while you process it.

This guards against something called a "race condition". Let's say it takes 5 minutes to process a file and you check for new uploads once a minute: you don't want to end up in a loop where the same file is processed multiple times before it is removed.
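The rename-to-.locked idea could look roughly like the sketch below. It is plain PHP, and `claim_next_job` is a hypothetical name. It relies on the assumption that the queue folder lives on one local filesystem, where `rename()` either succeeds for exactly one process or fails because the source file is already gone; behaviour on network shares or Windows may differ.

```php
<?php
// Hypothetical helper: claims one queued job by renaming its meta-file from
// .json to .locked. Returns the locked path, or null if nothing is queued.
function claim_next_job(string $queueDir): ?string
{
    // glob() can return false on error, so fall back to an empty list.
    $candidates = glob($queueDir . '/*.json') ?: [];

    foreach ($candidates as $jsonPath) {
        $lockedPath = substr($jsonPath, 0, -strlen('.json')) . '.locked';

        // If another worker renamed this file first, the source no longer
        // exists and rename() fails; we then simply try the next candidate.
        if (@rename($jsonPath, $lockedPath)) {
            return $lockedPath;
        }
    }

    return null;
}
```

The worker's process() method would call this first and exit straight away when it gets null back. Once the spreadsheet has been processed, delete or archive the .locked file so it does not pile up.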

Once you get the code working by calling it manually from the command line, you need to set up a cron job on your server that calls the same command once a minute.

Because all this is running behind the scenes, I suggest logging requests, at least at first, so you know the crons are running properly.
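For the logging, something as small as the following sketch is usually enough to see whether cron is firing. The function name `log_worker_event` and the log path are assumptions for illustration; CodeIgniter's own log_message() function is another option if logging is enabled in your config.

```php
<?php
// Hypothetical helper: appends one timestamped line per event to a log file.
function log_worker_event(string $logFile, string $message): void
{
    $line = sprintf("[%s] %s\n", date('Y-m-d H:i:s'), $message);

    // FILE_APPEND adds to the end of the file; LOCK_EX keeps overlapping
    // cron runs from interleaving their lines.
    file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
}
```

Calling it at the start and end of process(), for example with "worker started" and "nothing to do", gives you one or two lines per minute to check against.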


RE: Extract an Excel in background (for large files) - kaitenz - 07-09-2018

Wow, thank you for giving me some ideas. Hope I can make my first background-processing(?) file uploader. Hahaha.


RE: Extract an Excel in background (for large files) - Pertti - 07-09-2018

Good luck!

I have to say, it's very different from your typical "hit URL -> compile HTML -> display it in the browser" flow, so it does take a bit of time to wrap your head around it, but once you get more comfortable with it and get more practice, it's not that hard a concept.

Also, it's probably best not to quote me on the actual terms; I'm mostly self-taught.


RE: Extract an Excel in background (for large files) - kaitenz - 07-09-2018

Hi,

Is it possible to insert cron job logs inside the database? So we can check it in our Admin Panel?

Thank you.


RE: Extract an Excel in background (for large files) - Pertti - 07-09-2018

(07-09-2018, 03:16 AM)kaitenz Wrote: Is it possible to insert cron job logs inside the database? So we can check it in our Admin Panel?

Yes, of course.

You can create your own format that you are comfortable displaying in your own special dev dashboard controller, or you can just record each event in the DB and then browse it with phpMyAdmin or something similar if you want to.
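A minimal sketch of the DB option follows. It uses PDO with SQLite-style SQL only so that it can run on its own; the table name `worker_log`, its columns, and the function names are made up for this example. In a CodeIgniter app you would more likely go through the framework's database library, and on MySQL the CREATE TABLE syntax would differ (for instance AUTO_INCREMENT).

```php
<?php
// Hypothetical table and function names; adjust to your own schema.
function create_log_table(PDO $pdo): void
{
    $pdo->exec(
        'CREATE TABLE IF NOT EXISTS worker_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            logged_at TEXT NOT NULL,
            level TEXT NOT NULL,
            message TEXT NOT NULL
        )'
    );
}

// Records one event, e.g. from inside the worker's process() method.
function log_to_db(PDO $pdo, string $level, string $message): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO worker_log (logged_at, level, message) VALUES (?, ?, ?)'
    );
    $stmt->execute([date('Y-m-d H:i:s'), $level, $message]);
}

// Returns the newest entries first, ready to list in an admin panel.
function recent_logs(PDO $pdo, int $limit = 50): array
{
    $stmt = $pdo->prepare(
        'SELECT logged_at, level, message FROM worker_log ORDER BY id DESC LIMIT ?'
    );
    $stmt->bindValue(1, $limit, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

An admin controller would then only need to call recent_logs() and pass the rows to a view.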

I find log files a bit easier personally, but I'm not doing any log analytics per se (i.e. taking data and making graphs or anything like that). It's completely up to you.