Large CSV File Import
#1

[eluser]alainm[/eluser]
Hi All,

I'm just starting to play with CodeIgniter, and I appreciate all the good work that has gone into it.

I am now building an application that will include an automated nightly import of a large CSV file: about 2 GB in size, with more than 1 million rows.

What would be the best approach for this? I'm looking at running the PHP script via a cron job, to get away from the server's timeout limits and to ensure a consistent schedule.

Will the CSVReader library work in this case?

Thoughts and comments welcome.
#2

[eluser]Sbioko[/eluser]
Why do you need other libraries? You can read such big files with Ajax by splitting the read into, for example, 20 parts. One question: why do you need such a HUGE file? It will never be uploaded through a PHP script with the default time limits :-)
#3

[eluser]danmontgomery[/eluser]
Whether you run it as a cron job or not doesn't really decide your timeout limits; those are PHP settings that can be changed no matter the context of the script (although the command-line PHP binary does default max_execution_time to 0, meaning no limit). The method I'd choose has less to do with the size and frequency of the job and more to do with the nature of it. That said, I'd lean towards cron just because of the size. If I were actually waiting for such a thing to finish, I might get bored quickly.

With a job this big, your main concern is going to be memory management. You'll want to watch execution time as well as PHP's memory usage. You'll probably also want to run it at a lower process priority so you're not hogging system resources.
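
To make the memory point concrete, here is a minimal sketch that streams the file one record at a time and hands rows off in fixed-size batches, so memory use stays roughly flat however large the CSV is. The function name `import_csv_in_batches`, the batch size, and the callback are made up for illustration; what you do with each batch (for example a multi-row INSERT) is up to you.

```php
<?php
// Hypothetical helper (not part of CodeIgniter): read a CSV file row by row
// and pass the rows to a callback in batches of $batchSize.
function import_csv_in_batches(string $path, int $batchSize, callable $handleBatch): int
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $batch = [];
    $total = 0;
    try {
        // fgetcsv() parses one record per call; the whole file is never in memory.
        while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            if ($row === [null]) {
                continue; // fgetcsv() returns [null] for a blank line
            }
            $batch[] = $row;
            $total++;
            if (count($batch) >= $batchSize) {
                $handleBatch($batch);
                $batch = []; // drop the processed rows so they can be freed
            }
        }
        if ($batch !== []) {
            $handleBatch($batch); // flush the final partial batch
        }
    } finally {
        fclose($handle);
    }
    return $total; // number of data rows seen
}
```

In a cron script you could call `set_time_limit(0)` before the import and log `memory_get_peak_usage(true)` afterwards to check that memory really does stay bounded.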
#4

[eluser]JHackamack[/eluser]
If you're talking about loading a CSV file into a MySQL table, one thing to look at is MySQL's LOAD DATA INFILE:
http://dev.mysql.com/doc/refman/5.1/en/load-data.html

That would help cut down the loading time, provided you don't need to do any processing on the columns beforehand.
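
As a rough sketch of what that could look like from PHP, the helper below only builds the statement text. The function name `build_load_data_sql`, the table name, and the file path are placeholders, and the delimiter and header settings are assumptions about the file. The `LOCAL` variant shown here also assumes `local_infile` is enabled on both the client and the server.

```php
<?php
// Hypothetical helper: build a LOAD DATA LOCAL INFILE statement for a
// comma-separated file with optional double-quoted fields.
function build_load_data_sql(string $csvPath, string $table, bool $skipHeader = true): string
{
    // Table names cannot be bound as parameters, so accept only simple identifiers.
    if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $table)) {
        throw new InvalidArgumentException("Unexpected table name: $table");
    }

    // Escape backslashes and single quotes for use inside a SQL string literal.
    $quotedPath = "'" . strtr($csvPath, ["\\" => "\\\\", "'" => "\\'"]) . "'";

    $sql = "LOAD DATA LOCAL INFILE $quotedPath"
         . " INTO TABLE `$table`"
         . " FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'"
         . " LINES TERMINATED BY '\\n'";
    if ($skipHeader) {
        $sql .= " IGNORE 1 LINES"; // assumes the first line holds column names
    }
    return $sql;
}
```

In CodeIgniter the resulting string could then be passed to `$this->db->query()`. Files written on Windows would likely need `'\r\n'` as the line terminator instead.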
#5

[eluser]alainm[/eluser]
Yeah, the file is very large because of the nature of the data: it's transactional information.

I'm looking at this as a fully automated process that would run in the background, hence the plan to use cron.



