Memory consumption bug

Hi all- I've tested this as thoroughly as I can and continue to be baffled where the buildup is coming from. If anyone wants to recreate or make suggestions please do. Note: I have already worked around this, but would like to determine if this is a framework issue so it gets addressed.

I am using a custom CLI command (App\Commands) to import a large CSV into the database. The actual processing is very simple, and uses existing models (with internal commands only, no modification) to save to the database. After processing a reasonable ~200k lines the script crashes with "Allowed memory size ... exhausted". However, there should be no accruing memory storage as all the variables are reused on each iteration. I suspect something in the model is hanging onto a variable by reference such that it doesn't get cleared on each iteration.

Here's a generalized (but still accurate) version of the code:

PHP Code:
public function run(array $params)
{
    $myModel  = new MyModel();
    $myModel2 = new MyModel2();

    $file = "data.csv";
    $row  = 0;

    if (($handle = fopen($file, "r")) !== false):
        while (($data = fgetcsv($handle, 1000, ",")) !== false):
            $row++;

            // validate line
            $num = count($data);
            if ($num != 4):
                CLI::write("Invalid number of columns on line {$row}: " . CLI::color(implode(",", $data), 'yellow'));
                continue;
            endif;

            // get the timestamp
            if (($timestamp = strtotime($data[2])) === false):
                CLI::write("Invalid timestamp on line {$row}: " . CLI::color($data[2], 'yellow'));
                continue;
            endif;

            /* PRIMARY ENTRY */
            // build the row
            $row1 = [
                'source'     => "import",
                'created_at' => date("Y-m-d H:i:s", $timestamp),
            ];

            // add to the database
            $myModel->insert($row1);
            $ref_id = $myModel->getInsertID();

            /* DEPENDENT ENTRY */
            // build the row
            $row2 = [
                'ref_id'  => $ref_id,
                'content' => $data[1],
            ];
            $myModel2->insert($row2);
        endwhile;

        fclose($handle);
    endif;
}

By default in development mode, each query object is saved for displaying in the debug bar, etc.

Can you check app/Config/Toolbar.php? There should be a setting, $maxQueries, in there that determines the maximum number of queries that are saved. If this is an app that you've been using for a while and upgrading through the releases, that setting might have been missed when the config was updated, so you should add it:

    /*
    | Max Queries
    | If the Database Collector is enabled, it will log every query that
    | the system generates so they can be displayed on the toolbar's timeline
    | and in the query log. This can lead to memory issues in some instances
    | with hundreds of queries.
    |
    | $maxQueries defines the maximum amount of queries that will be stored.
    */
    public $maxQueries = 100;

If the data in your CSV rows is rather large, then you might just need to lower that value.

This particular app started on alpha-5, and does have `$maxQueries = 100` so I don't believe that is it. I will play around with those settings and maybe disabling the toolbar to see if it makes a difference. Any docs available on the toolbar function & code?
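
One option while testing, rather than turning the whole toolbar off, is to pull the Database collector out of the collector list. This is only a rough sketch against the stock app/Config/Toolbar.php; the exact layout of that file may differ between alpha/beta releases:

PHP Code:
<?php namespace Config;

use CodeIgniter\Config\BaseConfig;
use CodeIgniter\Debug\Toolbar\Collectors\Database;
use CodeIgniter\Debug\Toolbar\Collectors\Events;
use CodeIgniter\Debug\Toolbar\Collectors\Files;
use CodeIgniter\Debug\Toolbar\Collectors\Logs;
use CodeIgniter\Debug\Toolbar\Collectors\Routes;
use CodeIgniter\Debug\Toolbar\Collectors\Timers;
use CodeIgniter\Debug\Toolbar\Collectors\Views;

class Toolbar extends BaseConfig
{
    // Leaving Database::class out of this list stops the toolbar from
    // collecting query objects at all, which takes it out of the picture
    // as a suspect for the memory growth.
    public $collectors = [
        Timers::class,
        // Database::class,
        Logs::class,
        Views::class,
        Files::class,
        Routes::class,
        Events::class,
    ];

    public $maxQueries = 100;
}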

Still no resolution on this one, but some progress! The scope (330k lines/db calls) should be well within normal operational limits. I've checked Collectors\Database and made sure it is only holding the 100 queries allowed by $maxQueries.
Here's the clincher: running the same script on the same setup with CI_ENVIRONMENT set to "production" gets much further before exhausting memory (I tested up to 10x as many rows). So I'm definitely suspicious of system/Debug/Toolbar/Collectors, but so far I have been unable to find the culprit.
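
For anyone trying to reproduce this, a simple way to watch the growth is plain memory_get_usage() inside the import loop shown in the first post — nothing framework-specific. This is just a sketch; $row is the line counter from that snippet and CLI is CodeIgniter\CLI\CLI:

PHP Code:
// print memory usage every 10k rows so the growth rate can be compared
// between CI_ENVIRONMENT=development and CI_ENVIRONMENT=production runs
if ($row % 10000 === 0) {
    CLI::write(sprintf(
        'row %d: %.1f MB in use, %.1f MB peak',
        $row,
        memory_get_usage(true) / 1048576,
        memory_get_peak_usage(true) / 1048576
    ));
}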

Have you tried importing the CSV data using MySQL's LOAD DATA statement? I found it amazingly fast.

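Something along these lines — only a sketch, assuming a MySQL/MariaDB connection, that local_infile is allowed on both the server and the PHP client (mysqli.allow_local_infile), and placeholder file/table names:

PHP Code:
$db = \Config\Database::connect();

// one round trip instead of one INSERT per CSV row
$db->query("
    LOAD DATA LOCAL INFILE 'data.csv'
    INTO TABLE import_raw
    FIELDS TERMINATED BY ','
    OPTIONALLY ENCLOSED BY '\"'
    LINES TERMINATED BY '\\n'
    IGNORE 1 LINES
");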

I’ve used that in the past too. Worked great on a CSV with a giant data set. Loaded into a temp table, did normalization within the table and then copied to the final table.

Further to my previous post, here is a post of mine from another forum:

Quote:
This may be of interest...

I was looking at different options to reduce the six minutes it took to download and convert two XLS spreadsheets into a common database table. I reduced that to less than a minute, mostly by using the following instead of stepping through each CSV row and inserting it into the table of over 22,000 rows.

https://dev.mysql.com/doc/refman/5.7/en/load-data.html

It was not easy to set up because the table did not exactly match the CSV fields, so I now create a new table that matches the CSV layout and import the data into that. After importing I add the additional columns and finally add a primary index. Those last two operations only take fractions of a second. Before adding the primary key it was necessary to delete duplicates, which also only took a fraction of a second.

Adding and verifying a couple of hundred updates (from another CSV file with a different structure) does not take long.
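
A very rough sketch of that reshaping as it might look from a CI4 command — the table and column names are placeholders and the dedupe shown here is just one way to do it, so adapt it to your own schema:

PHP Code:
$db = \Config\Database::connect();

// add the extra columns the final schema needs (takes a fraction of a second)
$db->query("ALTER TABLE import_stage ADD COLUMN source VARCHAR(20) NOT NULL DEFAULT 'import'");

// drop duplicate rows before the primary key goes on
$db->query("CREATE TABLE import_clean AS SELECT DISTINCT * FROM import_stage");
$db->query("ALTER TABLE import_clean ADD PRIMARY KEY (external_id)");

// copy into the final table
$db->query("INSERT INTO entries (external_id, source, content, created_at)
            SELECT external_id, source, content, created_at FROM import_clean");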

I'll check that out, thanks for the reference. Worth noting though that this is an issue with any script running a large number of calls, not just data imports. After importing all the data mentioned above I ran a second command to cross-reference and tally the data and yet again had to set my environment to production before running it.
