Learnings

Wednesday, February 27, 2008

DB2 Partitioning - The Why and How?

If you are dealing with large DB2 databases, you may want to think about paritioning with DB2 DPF (Database Partitioning Feature) at some point. DB2 DPF is already available with your DB2 Enterprise Server Edition but needs to be separately licensed. Before you jump into it, you will probably have questions like:
  1. I have a large SMP box with many CPU's and tons of memory, can't DB2's intra-partition parallelism take care of it? Why should I consider DPF for one box?
  2. Is DPF suitable only for data warehousing or can I safely use it for OLTP systems also?
  3. Is performance the only reason for moving to DPF or are there other benefits as well?

To start with, even for a single large SMP box, there is a limit to what parallelism can achieve without partitioning. For instance, even if you have a LARGE amount of RAM, if there is only one DB2 instance, you may still not get the best possible throughput because some internal operations will still have to serialize (there may be critical latches protecting linked lists in shared memory that can end up becoming a bottleneck).

Also, for example, if I have 12 CPU's, it does not mean that the optimizer will always start off 12 parallel threads for all my large queries (there may be some constructs in my query which prevent parallelism, for instance) - this again may lead to under-utilization of resources.

And then there is the issue of manageability - things like backups or data loads could end up taking a lot of time. With DPF, DB2 can do an effective "divide and conquer" and make your large database look like many smaller ones and give you better throughput, better utilization of available resources and better manageability of the whole setup even on a single SMP box.

That was just a teaser, if I have got you interested, you should read this.

0 Comments:

Post a Comment

<< Home