best way to create nested arrays from MySQL
#1

Hey guys, I am writing a new application using AngularJS on the front end, with CI providing both the MVC for the single-page apps and the AJAX responses for the angular.service calls. The backend database is going to be MySQL (but could just as easily be MSSQL, Oracle, or PostgreSQL, as long as it is an RDBMS). Data is passed between the client and server as JSON, but in many cases the JSON contains nested objects. For example, a list of customer orders:
Code:
{
  "customer": {
    "customer_name": "Bananaman",
    "address": {
      "line1": "29 Acacia Road",
      "line2": "Nuttytown"
    }
  },
  "orders": [
    {
      "order_id": 27,
      "order_date": "2014-01-01",
      "order_lines": [
        {
          "line_id": 1,
          "item_id": "ABC123",
          "qty": 3
        },
        {
          "line_id": 2,
          "item_id": "XYZ345",
          "qty": 1
        }
      ]
    },
    {
      "order_id": 39,
      "order_date": "2014-05-25",
      "order_lines": [
        {
          "line_id": 1,
          "item_id": "FFF333",
          "qty": 2
        },
        {
          "line_id": 2,
          "item_id": "GGG444",
          "qty": 1
        }
      ]
    }
  ]
}

So this data comes from three tables: Customers, Orders [FK on CustomerID], and OrderLines [FK on OrderID].
The example above contains one customer record, followed by two orders, each with two lines.

I am looking for a strategy to build these models that is consistent and performant.

One option is to get the customer list from the customer model as a result_array, then foreach through the array and query for the orders for that customer, then foreach through the orders to get the order lines, iteratively building the nested array and converting it to JSON using json_encode(). But this could require several hundred database calls and could be slow, as PHP is responsible for the record associations.
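
Something like this rough sketch is what I have in mind (CI3 query builder; the customers/orders/order_lines table and column names are just my assumptions based on the JSON above):
Code:
// Option one: one query per customer, then one per order (the N+1 pattern).
public function get_customers_nested()
{
    $customers = $this->db->get('customers')->result_array();

    foreach ($customers as &$customer) {
        // One query per customer...
        $orders = $this->db
            ->get_where('orders', array('customer_id' => $customer['customer_id']))
            ->result_array();

        foreach ($orders as &$order) {
            // ...and one more query per order.
            $order['order_lines'] = $this->db
                ->get_where('order_lines', array('order_id' => $order['order_id']))
                ->result_array();
        }
        unset($order);

        $customer['orders'] = $orders;
    }
    unset($customer);

    return json_encode($customers);
}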

A second option would be to return a flat result array by joining the customers, orders, and lines into a single query, then iterating through the result set and changing the nesting level each time the dimensional value (customer or order ID) changes. This becomes very unwieldy if you are more than three levels deep.
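
Sketched out, option two might look like this (again, names are assumptions; the single query relies on ordering by the key columns so each nesting level changes in sequence):
Code:
// Option two: one joined query, reduced to a nested array in PHP.
public function get_customers_nested()
{
    $rows = $this->db
        ->select('c.customer_id, c.customer_name, o.order_id, o.order_date, l.line_id, l.item_id, l.qty')
        ->from('customers c')
        ->join('orders o', 'o.customer_id = c.customer_id', 'left')
        ->join('order_lines l', 'l.order_id = o.order_id', 'left')
        ->order_by('c.customer_id, o.order_id, l.line_id')
        ->get()->result_array();

    $customers = array();
    foreach ($rows as $r) {
        $cid = $r['customer_id'];
        $oid = $r['order_id'];

        // New customer encountered: open a new outer element.
        if (!isset($customers[$cid])) {
            $customers[$cid] = array(
                'customer_name' => $r['customer_name'],
                'orders'        => array(),
            );
        }
        // New order encountered: open a new middle element
        // (order_id is NULL for customers with no orders, due to the LEFT JOIN).
        if ($oid !== null && !isset($customers[$cid]['orders'][$oid])) {
            $customers[$cid]['orders'][$oid] = array(
                'order_id'    => $oid,
                'order_date'  => $r['order_date'],
                'order_lines' => array(),
            );
        }
        // Each row carries at most one order line.
        if ($r['line_id'] !== null) {
            $customers[$cid]['orders'][$oid]['order_lines'][] = array(
                'line_id' => $r['line_id'],
                'item_id' => $r['item_id'],
                'qty'     => $r['qty'],
            );
        }
    }

    // Re-index so json_encode() emits JSON arrays rather than keyed objects.
    foreach ($customers as &$c) {
        $c['orders'] = array_values($c['orders']);
    }
    unset($c);

    return json_encode(array_values($customers));
}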

A third option would be to get the customers results, customerOrders results, and customerOrdersLines results, then use some array-intersect logic to force the subsets into their outer sets, but again this gets unwieldy with more levels and means lots of duplicated data in memory at the same time.
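
In practice, forcing the subsets into their outer sets could be done by grouping the child rows on their foreign keys rather than with array_intersect itself; a rough sketch (names assumed as before):
Code:
// Option three: three whole-table queries, associated in PHP by foreign key.
public function get_customers_nested()
{
    $customers = $this->db->get('customers')->result_array();
    $orders    = $this->db->get('orders')->result_array();
    $lines     = $this->db->get('order_lines')->result_array();

    // Index the order lines by their parent order's key.
    $lines_by_order = array();
    foreach ($lines as $l) {
        $lines_by_order[$l['order_id']][] = $l;
    }

    // Attach the lines to each order, then index the orders by customer.
    $orders_by_customer = array();
    foreach ($orders as $o) {
        $o['order_lines'] = isset($lines_by_order[$o['order_id']])
            ? $lines_by_order[$o['order_id']] : array();
        $orders_by_customer[$o['customer_id']][] = $o;
    }

    // Finally attach the orders to each customer.
    foreach ($customers as &$c) {
        $c['orders'] = isset($orders_by_customer[$c['customer_id']])
            ? $orders_by_customer[$c['customer_id']] : array();
    }
    unset($c);

    return json_encode($customers);
}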

Which method is considered the most performant, or are there any other alternatives?

The ultimate objective is to build the models so that order_model has a function to return the nested objects (i.e. it is the responsibility of the model to construct the array, rather than the controller calling the separate models and constructing the array).

One other option is to actually store the JSON objects pre-rendered in the database. Once an order is completed it is a fixed dataset, and so is safe to store as JSON.
#2

This all really depends on where you experience slow-downs in the process. For example, you could find that the query to retrieve all of the data at once is significantly faster or slower than retrieving it as multiple queries. In fact, this particular issue is responsible for a number of different design patterns related to data access.

In most cases, no matter which path you choose, the database access itself is more likely to be the performance bottleneck than the PHP code required to transform the returned data into a JSON-encoded object for your script. The several hundred database calls are the slowdown, not the fact that PHP is responsible for the record associations, because adding several hundred entries to an object or array in PHP does not take a significant amount of time.

There are pre-existing libraries (like DataMapper) to handle most of this in a generalized fashion, which will certainly generate more database calls than is strictly necessary, but will work well for most cases. If this is the only time you're going to run into a problem like this in your application, whether it is worth your time to look into them is up to you. If you're going to run into it a lot, it will probably be worthwhile. In most cases, something like DataMapper will be good enough for performance in a situation like this.

Occasionally, though, the database access is excessive and you end up needing to optimize the data access by writing something specific to your needs, in which case you're almost back to where you started. Generally, I've found that if you're really running into a situation where you might be making a significant number of queries to the database to build your dataset, it's faster to make the joins in SQL and reduce the dataset in PHP. The code required is usually not as unwieldy as it might seem at first glance, though you may need to make use of recursion and leave copious notes behind via code comments to ensure you can debug it later if an issue (or change) comes up.

Another thought is that a blended approach may serve you better. For example, you could retrieve the Customers in one query and combine the Orders and OrderLines in a second query grouped by CustomerID; or retrieve the Customers and Orders in one query and the OrderLines (grouped by OrderID) in a second; or even retrieve the data from each table as a separate query, limiting it to three queries in total (for example, "select * from Orders where CustomerID in $arrayOfCustomerIDs"). This would reduce the complexity of reducing the dataset while avoiding the potentially hundreds of queries on the database.
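
As a rough illustration of the three-query version (CI3 query builder; the region filter and column names are placeholders, and the ID lists are assumed to be non-empty):
Code:
// Blended approach: exactly three queries, batched with WHERE ... IN (...).
public function get_customers_with_orders($region)
{
    // Query 1: the customers of interest (the "region" filter is just a placeholder).
    $customers = $this->db
        ->get_where('customers', array('region' => $region))
        ->result_array();

    // Query 2: every order for those customers in one go.
    $customer_ids = array_column($customers, 'customer_id');
    $orders = $this->db->where_in('customer_id', $customer_ids)
                       ->get('orders')->result_array();

    // Query 3: every line for those orders.
    $order_ids = array_column($orders, 'order_id');
    $lines = $this->db->where_in('order_id', $order_ids)
                      ->get('order_lines')->result_array();

    // The three result sets can then be grouped by foreign key in PHP,
    // exactly as in the three-query sketch earlier in the thread.
    return array($customers, $orders, $lines);
}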

Ultimately, the idea of storing JSON in the database for completed orders is not a bad one, either, as long as the table of JSON data is used as a performance cache rather than as the data store. You could insert the data when the order is finalized and set foreign key fields on the table which would allow updates if something changes in a completed order (for some reason). I would recommend keeping the completed orders in the original tables, though, simply because you may find some day that you want (or need) to change the format of your JSON objects or the JSON proves to be an additional headache for a simple database operation.
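
As a sketch of that cache idea (the completed_order_json table and the build_order_json() helper are hypothetical):
Code:
// Cache the rendered JSON when an order is finalized; keep a FK back to
// the orders table so the cache row can be rebuilt if the order changes.
public function finalize_order($order_id)
{
    $json = $this->build_order_json($order_id); // assembled from the real tables

    $this->db->replace('completed_order_json', array(
        'order_id'    => $order_id, // FK back to orders
        'rendered'    => $json,
        'rendered_at' => date('Y-m-d H:i:s'),
    ));
}

public function get_order_json($order_id)
{
    $row = $this->db
        ->get_where('completed_order_json', array('order_id' => $order_id))
        ->row();

    // Fall back to the source tables if the cache row is missing.
    return $row ? $row->rendered : $this->build_order_json($order_id);
}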
#3

Thanks mwhitney!

Basically, take each case on its merits and don't be tempted by premature optimisation.

The factors, I guess, will be whether we are returning a small part of a large dataset (e.g. orders for a specific customer) or a significant quantity of the data (e.g. all customers and their lifetime value), and whether the user is likely to visit all of the data. (For example, if I am displaying a list of products and the user is only going to choose one, there is little point in passing the complete dataset for all products to the Angular $scope until I know which one has been chosen.)

One other option (and one of the reasons I like Angular) would be late fetch and bind of the data. If the customer list is paginated, then you could return the entire list of customers to Angular but only return the order lists for the previous, current, and next pages. This way the volume of data passed via the initial HTTP request and the subsequent AJAX requests is limited and effectively pre-fetched, which gives the appearance of speed even if the data in the background is slower than normal.
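
A minimal sketch of the server side of that idea (CI3; the endpoint name, parameter names, and page size are arbitrary):
Code:
// Returns the orders for the previous, current and next pages as JSON,
// so Angular can pre-fetch around the page being viewed.
public function orders_window($page = 1, $per_page = 10)
{
    $first = max(0, ($page - 2) * $per_page); // offset of the previous page
    $count = $per_page * 3;                   // previous + current + next

    $orders = $this->db->order_by('order_id')
                       ->limit($count, $first)
                       ->get('orders')
                       ->result_array();

    $this->output->set_content_type('application/json')
                 ->set_output(json_encode($orders));
}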

I guess the most important thing is to ensure that the interface between the Angular $scope model and the API JSON response is consistent; how that is built is the whole point of separation of concerns in the MVC world.
#4

(03-05-2015, 08:29 AM)mwhitney Wrote: Ultimately, the idea of storing JSON in the database for completed orders is not a bad one, either, as long as the table of JSON data is used as a performance cache rather than as the data store.

Yeah, coz otherwise it would be MongoDB or Firebase. Actually, CodeIgniter and Firebase sound like they should work well together.



