Welcome Guest, Not a member yet? Register   Sign In
How to structure the database - MYSQL performance question
#1

I am re-writing a legacy application in CI Bonfire. The app will be using the old database which I do have scope for re-structuring if required.

Most apps I've worked on I've always used heavily normalised databases but I know that this can have implications on performance for larger applications, which this is so I'm less inclined to normalise and instead leave the structure flatter.

I also think that heavy normalisation was more useful when business logic was encapsulated in the database structure, whereas these days we would do that in the application code.

The app hosts submissions of aircraft pictures (getting on for 1 million entries). When a new aircraft is submitted the manufacturer, model and type need to be looked up. These details are currently held in a single table, see screenshot for a sample of the data:

[Image: Image%202015-09-30%20at%2012.54.48%20PM.png]

In the legacy app this is pulled into dropdowns using SQL similar to the following:

Code:
SELECT * FROM aircraft_table GROUP BY manufacturer ORDER BY manufacturer ASC

Now the easiest thing to do would be to keep with the same system but I am concerned by the performance hit of the GROUP BY.

The other option is to normalise into Manufacturer, Manufacturer_Generic_Type, Generic_Type, Generic_Type_Type, Type tables which will obviously involve a lot of joins and come with its own performance overhead.

I had thought about using ENUMs but the application users need to be able to add new aircraft types so this solution doesn't work for that case.

All of the tables are InnoDB,they were MyISAM but I'm moving over to InnoDB primarily for performance improvements, mainly in avoiding table locking.

I'm no MySQL expert, particularly not where performance is concerned, so I'm looking for advice on what would be the best way to structure this table and others like it for optimum performance. Could some of these issues be overcome by careful use of indexes (another area in which I am a complete novice)? Any other advice?

Thanks!
Reply
#2

(09-30-2015, 05:07 AM)jhob Wrote: Could some of these issues be overcome by careful use of indexes (another area in which I am a complete novice)?

Absolutely. You should create the appropriate indexes depending on the queries that you'd execute, and the same goes for performance optimizations via restructuring.

You can't optimize the data itself and by itself for anything else than size, which you already know as normalization. Not knowing what kind of lookups you'd be doing on it, we can't give you any meaningful advice on how to structure it.
Reply
#3

(09-30-2015, 07:20 AM)Narf Wrote: Absolutely. You should create the appropriate indexes depending on the queries that you'd execute, and the same goes for performance optimizations via restructuring.

I had been reading up on just that yesterday. I don't yet know myself exactly what queries I'll running.

(09-30-2015, 07:20 AM)Narf Wrote: You can't optimize the data itself and by itself for anything else than size, which you already know as normalization. Not knowing what kind of lookups you'd be doing on it, we can't give you any meaningful advice on how to structure it.

To give an example, here's two queries that are currently in the code. There will be certainly be more complex queries that will be written:

Code:
$airports_list = $this->airports_model->select('airport')->find_all();
$countries_list = $this->airports_model->select('country')->group_by('country')->find_all();

I'm guessing that an index on the country field would help, but less so on the airport as it's a straight 'return everything'.
Reply
#4

Indexes on:
SELECTed fields
ORDER BY fields
WHERE fields
fields JOINed on
Reply
#5

(10-05-2015, 11:43 AM)CroNiX Wrote: Indexes on:
SELECTed fields
ORDER BY fields
WHERE fields
fields JOINed on

Thanks for the clarification CroNiX

I'm surprised that an index makes a difference on SELECTed fields, do you know why that is?

Is there any harm in 'over-indexing', or I suppose you could call it defensive indexing?  I appreciate there would be an overhead in maintaining the index, but for data that is not highly volatile I can't see that being too much of an issue.  Are there any other performance factors to consider?
Reply
#6

There is a good answer here: http://stackoverflow.com/questions/54461...very-field

So speed is the issue, but if you have a big enough RAM on your server....
Reply
#7

(10-05-2015, 11:57 AM)jhob Wrote:
(10-05-2015, 11:43 AM)CroNiX Wrote: Indexes on:
SELECTed fields
ORDER BY fields
WHERE fields
fields JOINed on

Thanks for the clarification CroNiX

I'm surprised that an index makes a difference on SELECTed fields, do you know why that is?

You're right to be surprised, because that's nonsense.

Indexes make a (positive) difference when used on fields that you do lookups by. That's why I previously said they should be created depending on your queries ... they're just meaningless otherwise.

Use the EXPLAIN statement to identify bottle-necks in your queries (what you've shown are PHP method calls, not SQL queries) and see what kind of look-ups on which fields they make. After you can see those points, knowing what to index should come naturally.
Reply




Theme © iAndrew 2016 - Forum software by © MyBB