What is Considered a Large Database Table?
#1

[eluser]Fielder[/eluser]
What would you guys consider to be a LARGE database (in terms of number of records on 1 table)? A number that would begin to impact query performance (standard SELECT, JOIN, ORDER BY, GROUP BY statements)?

Right now I expect my database to be 500,000 records in 1 table within 3 years from now. And likely 2 other tables near the same size.

Oh yeah, and I'm using MySQL.

Thanks for the opinions.
#2

[eluser]JHackamack[/eluser]
I've dealt with tables holding over a million records. With the right indexes and by optimizing your queries (look at EXPLAIN output for slow spots) you can most likely deal with it. If you're doing a lot of inserts and deletes, you might also want to run OPTIMIZE TABLE occasionally.
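To see the point about indexes concretely, here is a small sketch using Python's built-in sqlite3 module. The thread is about MySQL, but the principle (and the habit of checking the query plan) carries over; the table and column names here are made up for the example.

```python
# Illustrative only: SQLite stands in for MySQL; same idea applies to
# MySQL's EXPLAIN. Table name "records" and index name are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, payload TEXT)")
conn.executemany("INSERT INTO records VALUES (?, ?)",
                 ((i, "x") for i in range(10_000)))

# Without an index, the planner has to scan every row to find id = 5000.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM records WHERE id = 5000").fetchall()

# With an index, the same lookup becomes a B-tree search.
conn.execute("CREATE INDEX idx_records_id ON records (id)")
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM records WHERE id = 5000").fetchall()

print(plan_before)  # plan mentions a full scan of "records"
print(plan_after)   # plan mentions idx_records_id
```

On a million-row table the difference between those two plans is the difference between touching every row and touching a handful of index pages.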
#3

[eluser]Ben Edmunds[/eluser]
I've got a table I've been working on for a client that is over 12 million rows and 6 GB. It's slow but manageable.

I've worked with ~1 million row tables quite a bit and like JHackamack said, as long as you optimize your queries you'll be fine.
#4

[eluser]Fielder[/eluser]
Thanks, a couple other questions come to mind.

Does Microsoft SQL Server or any other database perform better with large tables than MySQL? Should I be preparing to migrate to a different database sometime in the future once I hit million-record tables in MySQL?

How is query performance affected (if at all) if I have a million-record table and I'm doing a simple SELECT * FROM table AS t1 WHERE id = 548435? The id would be indexed and unique. Are JOIN queries more taxing on a table of that magnitude? I'm wondering what types of queries are intense on a table of that size.

When these big companies start getting huge databases, besides optimization, do they do things like break large tables up into multiple smaller tables, or split the database and duplicate it on 2 or more physical servers?

Any links on database optimization 101? I want to make sure I've got a good schema now, while the project is just beginning, so I can do those optimizations in the future.
#5

[eluser]JHackamack[/eluser]
MySQL is probably the best free database I've come across for handling a million+ rows. If you want to start paying for a database, Oracle might be something to consider, but from what I've heard Microsoft SQL Server isn't as adept at handling that much data.

Using simple selects should be fine; it's how you join that could slow you down. Joining a million-row table to a 50-row table and only then applying a WHERE on the small table can be slow, since all the rows get joined before the WHERE clause takes effect; but if you filter the 50-row table first and then join against the million rows, there are far fewer rows to consider. MySQL's EXPLAIN statement helps you optimize a query by showing where an index would help.
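A quick sketch of the join point, again using sqlite3 for illustration (the tables "big" and "small" and the index name are hypothetical; modern optimizers, MySQL included, will often reorder joins for you, but the plan still shows whether the big table is being scanned or seeked):

```python
# Illustrative only: shows how an index on the join column changes the
# plan for a big-table/small-table join. Names here are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (id INTEGER, cat_id INTEGER)")
conn.execute("CREATE TABLE small (cat_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO big VALUES (?, ?)",
                 ((i, i % 50) for i in range(100_000)))
conn.executemany("INSERT INTO small VALUES (?, ?)",
                 ((i, "cat%d" % i) for i in range(50)))

query = ("SELECT big.id, small.name FROM big "
         "JOIN small ON big.cat_id = small.cat_id "
         "WHERE small.name = 'cat7'")

# No index yet: the big table has to be scanned.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# Index the join column so the planner can seek into the big table
# for just the matching category instead of scanning 100k rows.
conn.execute("CREATE INDEX idx_big_cat ON big (cat_id)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
```

In MySQL you would run `EXPLAIN SELECT ...` the same way and check the `rows` and `key` columns for each table in the join.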

There are master/slave replication setups that you can put behind a load balancer. Another option is memcache, which caches your results so they don't hit the database at all. (http://www.facebook.com/note.php?note_id=23844338919)

Take a look at the following for advice:
http://dev.mysql.com/doc/refman/5.0/en/ha-overview.html
http://www.amazon.com/High-Performance-M...940&sr=8-1
#6

[eluser]kea13[/eluser]
Hi,

we have >15 million entries in one table of a PostgreSQL database and no signs of sluggishness :)
And the machine running it isn't one of those multi-core, multi-gigabyte RAM monsters.

Regards,
Roman
