[Best practice] Heavy background tasks?
#1

[eluser]Clooner[/eluser]
I'm working on a MySQL-heavy site right now and using CI for the front-end. The site needs to do a lot of reporting and database querying, though; thousands of queries will be needed to generate the reports. Just for testing I have a task page set up that refreshes itself on a set interval, but that won't work on the live site. I think I need to create an app in Java or Python to do this job in the background.

Does anyone know any good articles to read on this subject or has a suggestion on how to do this?
#2

[eluser]rogierb[/eluser]
Why not use a cron job to execute a script at a specific interval? A standard cron entry should do the trick.
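Something like this, roughly -- the controller and model names are made up, and it assumes a CI release with CLI support (on older versions people just have cron hit a URL with wget instead):

[code]
<?php
// application/controllers/reports.php -- hypothetical background-task controller.
// Illustrative crontab entry to drive it, e.g. every 10 minutes:
//   */10 * * * * php /var/www/index.php reports build >> /var/log/reports.log 2>&1
class Reports extends CI_Controller {

    public function build()
    {
        // Refuse to run from a browser; this is a background task only.
        if ( ! $this->input->is_cli_request())
        {
            show_error('This task can only be run from the command line.');
        }

        $this->load->model('report_model');   // hypothetical model
        $this->report_model->generate_all();  // the heavy queries live in here
    }
}
[/code]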
#3

[eluser]kgill[/eluser]
Just some quick thoughts:

The usual solution to this kind of problem is a cron job; if your host doesn't give you access to cron, get a new host. Once you've got cron working for you, identify your off-peak times and do the heavy work then. We have several processes that are very CPU intensive, so we set them to run each night at 4am, when we have almost no users.

A more expensive option is to work with a replicated database for your reporting: server 1 is your production database and gets replicated to server 2, and you then run the reporting against server 2 without affecting your production performance.
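If you do go that route, CI lets you point just the reporting code at the slave while everything else stays on the master. Roughly like this -- hostnames, the group name and credentials are all invented:

[code]
<?php
// application/config/database.php (excerpt, illustrative values only)

// Production master -- normal reads and writes
$db['default']['hostname'] = 'db-master.example.com';
$db['default']['username'] = 'app';
$db['default']['password'] = 'secret';
$db['default']['database'] = 'finance';
$db['default']['dbdriver'] = 'mysql';

// Read-only replica -- heavy reporting queries only
$db['reports']['hostname'] = 'db-slave.example.com';
$db['reports']['username'] = 'report';
$db['reports']['password'] = 'secret';
$db['reports']['database'] = 'finance';
$db['reports']['dbdriver'] = 'mysql';
[/code]

Then in the reporting code you grab the second connection with $report_db = $this->load->database('reports', TRUE); and run the big queries through $report_db.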
#4

[eluser]Clooner[/eluser]
Actually, I know about cron jobs; the problem is not that I can't run one. The app will run on dedicated servers. It's a financial site which gets updated data every 60 seconds, and every 24 minutes new reports have to be generated. As I said, it will be using thousands of queries. Sorry if I wasn't clear before, but I'm looking into server-load-reduction solutions like ActiveMQ, with several processes each generating part of the data. The site will run across multiple servers, so my thinking now is that one server will chunk the raw data and serve it to another machine that works its magic on it. That data is then served to the front end (which is in CI). My question is how to handle large quantities of queries and data (not really CI front-end related).
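Roughly what I have in mind for the workers -- just a sketch, using a plain MySQL job table instead of ActiveMQ, and with all the table and column names invented:

[code]
<?php
// worker.php -- long-running CLI process on one of the crunching boxes.
// Assumed (invented) table: jobs(id, payload, status, claimed_by).
$pdo = new PDO('mysql:host=db-master.example.com;dbname=finance', 'worker', 'secret');
$me  = php_uname('n') . ':' . getmypid();

while (true) {
    // Claim one pending chunk atomically so two workers never grab the same row.
    $claim = $pdo->prepare("UPDATE jobs SET status = 'working', claimed_by = ?
                            WHERE status = 'pending' ORDER BY id LIMIT 1");
    $claim->execute(array($me));

    if ($claim->rowCount() === 0) {
        sleep(5);       // nothing queued -- back off and poll again
        continue;
    }

    $job = $pdo->prepare("SELECT id, payload FROM jobs
                          WHERE claimed_by = ? AND status = 'working' LIMIT 1");
    $job->execute(array($me));
    $row = $job->fetch(PDO::FETCH_ASSOC);

    // ... crunch $row['payload'] here: run the heavy queries, build the report chunk ...

    $done = $pdo->prepare("UPDATE jobs SET status = 'done' WHERE id = ?");
    $done->execute(array($row['id']));
}
[/code]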
#5

[eluser]jedd[/eluser]
This is a hugely broad question.

The general response is to have a separate OLAP database for this kind of work. With a 24-minute report cycle that's less feasible (but still possible) -- you'd just have to be more careful in assessing whether it's worth the cost of maintaining a separate schema dedicated to the reporting side of the system.

Have you already done any performance testing to determine whether what you want to do works acceptably on the existing hardware and database? I'd assume you have, if you're looking at ways to reduce that load.

If it's marginal, just throw more CPU and memory at the problem.

Tune the crikey out of your database.

Depending on where you are in the development cycle, the option of de-normalising your primary database may exist.

As I say .. huge question, with way too many variables.
#6

[eluser]Clooner[/eluser]
[quote author="jedd" date="1239119207"]Depending on where you are in the development cycle, the option of de-normalising your primary database may exist.[/quote]

I'm still at the proof-of-concept phase; development will start next month. I have the system working already, but I'm sure it will fail when the load gets too high. De-normalizing is actually a really good idea and will get rid of a lot of calculations and difficult queries. There are some downsides to this, though. I just have to figure out what they are...
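What I'm picturing is a pre-computed summary table that the background task rebuilds each cycle, so the CI front end just reads flat rows instead of re-joining and re-aggregating every time. Something like this -- table and column names are all made up:

[code]
<?php
// Inside the cron-driven task -- rebuild a flat summary table in one pass.
$this->db->query("TRUNCATE TABLE report_summary");
$this->db->query("
    INSERT INTO report_summary (symbol, trade_count, total_volume, high_price, generated_at)
    SELECT symbol, COUNT(*), SUM(volume), MAX(price), NOW()
    FROM trades
    GROUP BY symbol
");

// The front end then only ever does:
$rows = $this->db->get('report_summary')->result();
[/code]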
#7

[eluser]jedd[/eluser]
Quote: There are some downsides to this, though. I just have to figure out what they are...

Well, the obvious one (and I mention this only for the other viewers at home) is that you are far more likely to end up with an inconsistent database .. hence the appeal of an OLAP one-way-sync'd bugger to handle this kind of work. A master/slave approach here (that you overwrite every half hour) wouldn't really help, unfortunately, as your reports won't be making any changes -- it's the primary DB's integrity you're worried about.

Solutions sometimes involve the dirty words 'stored procedures' .. but in reality any decent API (in our case, a Model), and a commitment not to subvert it, should suffice.
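That is, every write goes through one Model, and that Model keeps the denormalised copies in step. In very rough terms -- names invented, CI 2.x-style Model:

[code]
<?php
// application/models/trade_model.php -- hypothetical single write path.
class Trade_model extends CI_Model {

    // All inserts come through here, so the denormalised totals can't drift
    // from the raw rows -- as long as nothing writes around the Model.
    public function add_trade($symbol, $price, $volume)
    {
        $this->db->trans_start();

        $this->db->insert('trades', array(
            'symbol' => $symbol,
            'price'  => $price,
            'volume' => $volume,
        ));

        // Keep the summary table in step inside the same transaction.
        $this->db->query(
            "UPDATE report_summary
                SET trade_count = trade_count + 1,
                    total_volume = total_volume + ?
              WHERE symbol = ?",
            array($volume, $symbol)
        );

        $this->db->trans_complete();
        return $this->db->trans_status();
    }
}
[/code]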

Or, if you want a one word answer, control. Wink



