Wednesday, September 15, 2010

Of DB2 for z/OS Clustering Indexes, Implicit and Otherwise

Do all of your DB2 for z/OS tables have clustering indexes? Unless you have one or more tables with no indexes at all, the answer to this question is, "yes." How's that? Read on...

About three years ago, I posted an entry to my "old" blog (the one I maintained during my years as an independent DB2 consultant) in which I explained the importance of data clustering in a DB2 environment. A lot of you probably know that an index defined on a table will be used by DB2 to physically sequence that table's rows if it (the index) was created with the CLUSTER attribute (or if, sometime after being created, it was altered to have this attribute). The index with the CLUSTER attribute (and there can be only one such index for a given table) is referred to as the associated table's explicit clustering index (or, more commonly, as just the clustering index). Suppose, however, that none of a table's indexes has the CLUSTER attribute? What then? In that case, the table will have an implicit clustering index. That will be the index that was the first one defined on the table.
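
To make that concrete in DDL terms, here's a quick sketch (the table and index names are made up, so adjust to your environment):

-- Defining an index as a table's explicit clustering index at CREATE time:
CREATE INDEX IX_ORDER_DATE
  ON ORDER_TABLE (ORDER_DATE ASC)
  CLUSTER;

-- Giving the CLUSTER attribute to an index that already exists:
ALTER INDEX IX_ORDER_DATE CLUSTER;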

The effect of an implicit clustering index on DB2 processing is the same as that of an explicit clustering index:
  • When a new row is to be added to a table that lacks an explicit clustering index, the target data page for the row (i.e., the page into which it should go if optimal data clustering is to be maintained) will be determined via the table's implicit clustering index.
  • When a tablespace is reorganized using the IBM DB2 REORG utility, rows unloaded from the tablespace will be sorted based on the key of the implicit clustering index of the table (or tables) stored in the tablespace.
Note that the statement above pertaining to REORG TABLESPACE has not always been true. There was a time when the IBM DB2 REORG utility would re-sequence rows in a table only if the table had an explicit clustering index. I'm not sure when this behavior changed, but I know that at least since DB2 for z/OS Version 8, REORG will sort data rows according to a table's clustering key, regardless of whether the clustering index is explicit or implicit.

Given that a table with an index will always have a clustering index, and that an implicit clustering index works as does an explicit clustering index, you might think that defining an explicit clustering index on a table is no big deal. I'd disagree with that assessment, and here's why: relying on the status of an index as the "oldest" one on a table to ensure its use as the table's clustering index (if you go the implicit route) could lead to an unexpected situation if that index were to be accidentally dropped. See, if you re-create the dropped index according to the original DDL (which lacked the CLUSTER attribute), it will no longer be the table's implicit clustering index if there are other indexes defined on the table. Why? Because the re-created index will no longer be the oldest one on the table (it will instead be the newest). One of the other indexes on the table will be the "new oldest" one, and that one will be the table's new implicit clustering index. In my opinion, having an explicit clustering index on each of your tables that has an index is a best practice.
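
By the way, if you'd like to see which of your indexed tables are relying on an implicit clustering index, a catalog query along these lines should do the trick (this is a sketch based on my recollection of the catalog: in SYSIBM.SYSINDEXES, CLUSTERING is 'Y' for an index that has the CLUSTER attribute):

SELECT TBCREATOR, TBNAME
  FROM SYSIBM.SYSINDEXES
 GROUP BY TBCREATOR, TBNAME
 HAVING MAX(CASE WHEN CLUSTERING = 'Y' THEN 1 ELSE 0 END) = 0;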

I'll conclude this entry with a look at two interesting (to me, anyway) scenarios. First, consider a situation in which a table's implicit clustering index is altered with the addition of a new column (an extension of ALTER INDEX functionality introduced with DB2 for z/OS Version 8). Will it still be the table's implicit clustering index? Yes. The clustering key may have changed by way of the ALTER INDEX... ADD COLUMN operation, but the index is still the oldest one defined on the table.
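
In DDL terms, that first scenario is just something like the following (the index and column names are made up, and depending on the situation the altered index could end up in a pending state that calls for a REBUILD INDEX, so check the documentation for your DB2 release):

ALTER INDEX IX_ORDER_DATE ADD COLUMN (CUST_ID ASC);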

Scenario two: suppose that index ABC, the first one created on table XYZ, has the CLUSTER attribute. If an ALTER INDEX statement with a NOT CLUSTER specification is subsequently executed for index ABC, does that mean that the index is no longer the table's clustering index? No, that's not what it means. Until such time as an ALTER INDEX statement with a CLUSTER specification is executed for some other index on table XYZ (or until a new index with the CLUSTER attribute is defined on table XYZ), index ABC will remain the table's clustering index (albeit an implicit clustering index) by way of its status as the oldest index defined on the table.
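
Here's scenario two in DDL terms (DEF is a made-up name for one of table XYZ's other indexes):

-- ABC was the first index created on table XYZ, and it had the CLUSTER attribute.
ALTER INDEX ABC NOT CLUSTER;
-- At this point, ABC is still XYZ's clustering index (implicitly), because it is the oldest index defined on the table.
ALTER INDEX DEF CLUSTER;
-- Only now does DEF take over as the table's (explicit) clustering index.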

Take out the guesswork, OK? If you are going to have a clustering index on a table (and remember, you will if there are any indexes on the table), label it as such.

Friday, September 3, 2010

Thoughts on DB2 for z/OS Buffer Pool Sizing

A couple of years ago, on my Catterall Consulting blog, I posted an entry in which I urged mainframe DB2 people to take advantage of 64-bit addressing, a capability that debuted with DB2 for z/OS Version 8 and which allowed, among other things, the allocation of very large buffer pools. Now, I'm singing the second verse of that same song (actually, about the twentieth verse -- I have to keep repeating it): IF YOU HAVE BIG MEMORY, YOU SHOULD HAVE BIG DB2 BUFFER POOLS.

Again and again I see it: a mainframe server (or logical partition thereof) that has a dozen or more gigabytes of central storage, and a production DB2 subsystem -- the only one on that z/OS instance -- that has a buffer pool configuration with an aggregate size of a gigabyte or less. Sometimes, pages of large and very heavily accessed tables and/or indexes are cached in a pool that has 80,000 buffers, or 40,000, or 15,000, or maybe just 10,000 buffers. The result? Way high rates of I/O activity, as in thousands of disk reads per second. That's a big drag on application performance, negatively impacting both throughput and CPU efficiency. When you have the server memory available to eliminate this undesirable situation, USE IT.

So, first things first: get a handle on your buffer pool I/O situation. You can use a DB2 monitor to do this, but my preference is to use the output of the command -DISPLAY BUFFERPOOL(ACTIVE) DETAIL. More specifically, I like to issue this command once, and then again an hour later. That way I get, in the output of the second issuance of the command, one hour's worth of buffer pool activity data. For each pool referenced in the output of the second command, I sum the numbers in five fields:
  • SYNC READ I/O (R)
  • SYNC READ I/O (S)
  • PREFETCH I/O (under SEQUENTIAL PREFETCH)
  • PREFETCH I/O (under LIST PREFETCH)
  • PREFETCH I/O (under DYNAMIC PREFETCH)
I take that total figure and divide it by the number of seconds between the first and second -DISPLAY BUFFERPOOL(ACTIVE) DETAIL commands, and voila -- I have the rate of disk read I/Os per second for each active pool. If that number is less than 100 for a buffer pool, I'm not likely to have a performance concern. If it's between 100 and 1000, I'll probably want to see about increasing the size of the pool. If it's north of 1000, I will definitely want to make that pool larger -- maybe much larger (the largest number I've seen for disk read I/Os associated with a DB2 buffer pool is a little over 8000 per second -- I'm sure there are larger numbers out there).
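
To make the arithmetic concrete, here it is with some made-up numbers: suppose that the five fields for a given pool sum to 5,400,000 read I/Os, and that the two commands were issued 3,600 seconds apart. Then:

5,400,000 read I/Os / 3,600 seconds = 1,500 read I/Os per second

That pool would be squarely in my "definitely make it larger" category.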

If you have a really high rate of disk read I/O activity for a pool and you want to make it larger, can you? That depends on whether or not the z/OS system in question has enough memory to accommodate a buffer pool size increase without negatively impacting other work on the server. How do you know that? A good way is to check the demand paging rate on the system. This figure indicates the number of times per second that a page, previously moved from central storage to auxiliary storage, is brought back into central storage to satisfy a program request. It's available by way of a z/OS monitor. The lower the demand paging rate, the less pressure there is on the system's central storage resource. What you want is for the demand paging rate to be less than 10 per second. You also want it to be more than zero, because a super-low demand paging rate means that the mainframe memory your organization has paid for is probably not being leveraged as it should be for system performance. If the demand paging rate is less than one per second, memory on your system is likely being underutilized. Enlarging one or more of your DB2 buffer pools is often a very good way to put underutilized server memory to productive use. In my experience, if there is one production DB2 subsystem on a z/OS instance, and if the size of that DB2 subsystem's buffer pool configuration is less than 10% of the amount of memory available to the z/OS instance, it's almost certain that the demand paging rate is very low. If you make a DB2 buffer pool (or pools) larger, check the demand paging rate again after the fact and make sure that it's still less than 10 per second (and note that an occasional incident such as a dump can cause the demand paging rate to briefly spike -- that's not a big deal).

Something else: if a buffer pool's I/O rate is really high (as in a few thousand per second), and the pool is pretty small (e.g., 20,000 buffers or less for a 4K pool), and the system's demand paging rate is really low, don't just dink and dunk your way to a larger buffer pool configuration. In other words, don't add 1000 buffers to a 15,000-buffer pool. Go for way bigger. Think in terms of doubling the pool's current size, or tripling it, or even quadrupling it. Later, after it's bigger (like, 80,000 buffers or more), you can think about growing it in smaller chunks, percentage-wise (e.g., maybe make it 50% larger). Whenever you take any action to enlarge a buffer pool, follow that with the -DISPLAY BUFFERPOOL commands that I mentioned previously, to gauge the effect of the size increase on the pool's disk read I/O rate. When you do that, in addition to checking on read I/Os per second, look to see if the number of synchronous reads associated with sequential access (SYNC READ I/O(S)) went down. When a buffer pool is really undersized relative to the volume of requests for pages in objects assigned to the pool, it may be that when a set of 32 pages is brought into the pool via a prefetch read I/O, some of those pages are flushed from the pool before the requesting application process can access them. That drives the number of synchronous reads related to sequential access higher (because those flushed-out pages, when requested, will be read into memory individually). With a larger pool, page residency time increases, and it's less likely that prefetched pages will get flushed before they are accessed for the requesting application process.
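
As a reminder, the size change itself is made via the -ALTER BUFFERPOOL command. Here's a sketch with a made-up pool name, tripling a hypothetical 20,000-buffer 4K pool:

-ALTER BUFFERPOOL(BP2) VPSIZE(60000)

After that, issue -DISPLAY BUFFERPOOL(BP2) DETAIL twice, an hour apart, as described above, to see what the bigger pool did to the read I/O rate.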

Another thing: once an individual buffer pool (as opposed to the whole buffer pool configuration) gets to be pretty big (say, a gigabyte in size), before making it larger still consider the possibility of creating a new pool (maybe 25% or half the size of the big one) and moving some of the highest-activity objects from the really big pool to the new one. This is NOT a technical requirement, as a single 4K buffer pool can grow to a terabyte in size; rather, it's an opportunity to split some objects out in a way that will provide you with more granular buffer pool statistics and a chance to more finely tune the overall buffer pool configuration. Just a thought.
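
If you do go the new-pool route, the moves themselves are simple. A sketch, with made-up pool, database, and object names (keep in mind that a buffer pool change for a tablespace or index generally takes effect the next time the underlying data sets are opened, so a -STOP and -START of the affected objects may be in order):

-ALTER BUFFERPOOL(BP9) VPSIZE(50000)
ALTER TABLESPACE DBPROD1.TSORDERS BUFFERPOOL BP9;
ALTER INDEX PRODID.IX_ORDERS1 BUFFERPOOL BP9;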

Finally, if your DB2 environment operates in data sharing mode on a parallel sysplex mainframe cluster, keep in mind that an increase in the size of a buffer pool (and you usually want a given pool to be the same size on all members) may make an enlargement of the associated group buffer pool highly advisable -- this mainly to ensure that the group buffer pool will have enough directory entries to prevent directory entry reclaims, which get in the way of top performance. A long time ago, when the earliest users of data sharing asked for a formula to help with group buffer pool sizing, I came up with a pretty simple one: add up the size (in MB) of the local pools, and divide that sum by three (this assumes the default ratio of five directory entries for every data entry in a group buffer pool). Fifteen years later, that simple formula still works really well. So if, in a four-way data sharing group, you want to grow BP5 to 120,000 buffers (480 MB) on each member, a good size for GBP5 would be:

(480 MB X 4 members) / 3 = 1920 MB / 3 = 640 MB

If GBP5 is currently smaller than that, take it up to 640 MB (or more) before taking BP5 to 120,000 buffers.

I hope that this information will help you to assess your buffer pool configuration from a performance perspective, and I hope that you'll grow your DB2 buffer pools to improve throughput and CPU efficiency (assuming that you have enough memory for this on your system -- and I'm sure that a lot of you do).