Friday, September 23, 2016

DB2 for z/OS: Using PGFIX(YES) Buffer Pools? Don't Forget About Large Page Frames

Not long ago, I was reviewing an organization's production DB2 for z/OS environment, and I saw something I very much like to see: a REALLY BIG buffer pool configuration. In fact, it was the biggest buffer pool configuration I'd ever seen for a single DB2 subsystem: 162 GB (that's the combined size of all the buffer pools allocated for the subsystem). Is that irresponsibly large -- so large as to negatively impact other work in the system by putting undue pressure on the z/OS LPAR's central storage resource? No. A great big buffer pool configuration is fine if the associated z/OS LPAR has a lot of memory, and the LPAR in question here was plenty big in that regard, having 290 GB of memory. The 128 GB of memory beyond the DB2 buffer pool configuration size easily accommodated other application and subsystem memory needs within the LPAR, as evidenced by the fact that the LPAR's demand paging rate was seen, in a z/OS monitor report, to be zero throughout the day and night (I'll point out that the DB2 subsystem with the great big buffer pool configuration is the only one of any size running in its LPAR -- if multiple DB2 subsystems in the LPAR had very large buffer pool configurations, real storage could be considerably stressed).

A couple of details pertaining to this very large buffer pool configuration were particularly interesting to me: 1) the total read I/O rate for each individual buffer pool (total synchronous reads plus total asynchronous reads, per second) was really low (below 100 per second for all pools, and below 10 per second for all but one of the pools), and 2) every one of the buffer pools was defined with PGFIX(YES), indicating that the buffers were fixed in real storage (i.e., not subject to being paged out by z/OS). And here's the deal: BECAUSE the buffer pools all had very low total read I/O rates, page-fixing the buffers in memory was doing little to improve the CPU efficiency of the DB2 subsystem's application workload. Why? Because all of the pools were exclusively using 4 KB page frames.
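
If you want to gauge a pool's total read I/O rate yourself, one approach -- a sketch, using the made-up pool name BP2 and a made-up measurement interval -- is to issue -DISPLAY BUFFERPOOL with the DETAIL(INTERVAL) option twice, a known number of seconds apart (the second command's output reflects activity since the first):

-DISPLAY BUFFERPOOL(BP2) DETAIL(INTERVAL)
  (wait a measured interval -- say, 1800 seconds -- then issue the command again)
-DISPLAY BUFFERPOOL(BP2) DETAIL(INTERVAL)

In the second command's output, add the synchronous reads (message DSNB411I) to the prefetch reads (messages DSNB412I, DSNB413I, and DSNB414I, for sequential, list, and dynamic prefetch, respectively), and divide the sum by the interval length in seconds. A DB2 monitor's statistics report will show you the same numbers with less effort.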

Consider how it is that page-fixing buffer pools reduces the CPU cost of DB2 data access. When the PGFIX(YES) option of -ALTER BUFFERPOOL was introduced with DB2 Version 8 for z/OS, the ONLY CPU efficiency gain it offered was cheaper I/O operations. Reads and writes, whether involving disk volumes or -- in the case of a DB2 data sharing configuration on a Parallel Sysplex -- coupling facilities, previously had to be bracketed by page-fix and page-release actions, performed by z/OS, so that the buffer (or buffers) involved would not be paged out in the midst of the I/O operation. With PGFIX(YES) in effect for a buffer pool, those I/O-bracketing page-fix and page-release requests are not required (because the buffers are already fixed in memory), and that means reduced instruction pathlength for DB2 reads and writes (whether synchronous or asynchronous).
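
For reference, page-fixing is requested on a pool-by-pool basis via -ALTER BUFFERPOOL (BP1 below is just a placeholder name). A PGFIX change is a pending one: it takes effect the next time the pool is allocated, such as at the next recycle of the DB2 subsystem.

-ALTER BUFFERPOOL(BP1) PGFIX(YES)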

DB2 10 extended the CPU efficiency benefits of page-fixed buffer pools via support for 1 MB page frames. By default, in a DB2 10 (or 11) environment, a PGFIX(YES) buffer pool will be backed by 1 MB page frames if these large frames are available in the LPAR in which the DB2 subsystem runs. How does the use of 1 MB page frames save CPU cycles? By improving the hit ratio in the translation lookaside buffer, leading to more cost-effective translation of virtual storage addresses to corresponding real storage addresses for buffer pool-accessing operations. DB2 11 super-sized this concept by allowing one to request, via the new FRAMESIZE option for the -ALTER BUFFERPOOL command, that a page-fixed pool be backed by 2 GB page frames (note that 2 GB page frames may not save much more CPU than 1 MB frames, unless the size of the buffer pool with which they are used is 20 GB or more).
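
By the way, if you want to make the frame-size request explicit rather than relying on the default preference, a DB2 11 example would look like this (BP50 is a made-up pool name; the pool must be PGFIX(YES) for a large-frame request to be honored, and like PGFIX, a FRAMESIZE change takes effect the next time the pool is allocated):

-ALTER BUFFERPOOL(BP50) PGFIX(YES) FRAMESIZE(2G)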

Having described the two potential CPU-saving benefits of page-fixed buffer pools, I can make the central point of this blog entry: if you have a PGFIX(YES) buffer pool that has a low total read I/O rate, and that pool is backed by 4 KB page frames, the PGFIX(YES) specification is not doing you much good because the low read I/O rate makes cheaper I/Os less important, and the 4 KB page frames preclude savings from more-efficient virtual-to-real address translation.

This being the case, I hope you'll agree that it's important to know whether a page-fixed buffer pool with a low read I/O rate is backed by large page frames. In a DB2 11 environment, checking is very easy: just issue the command -DISPLAY BUFFERPOOL, for an individual pool or for all of a subsystem's buffer pools (in the latter case, I generally recommend issuing the command in the form -DISPLAY BUFFERPOOL(ACTIVE)). In the output for a given pool you'll see one or more instances of message DSNB546I, which might look like this:

DSNB546I  - PREFERRED FRAME SIZE 1M
        0 BUFFERS USING 1M FRAME SIZE ALLOCATED
DSNB546I  - PREFERRED FRAME SIZE 1M
        10000 BUFFERS USING 4K FRAME SIZE ALLOCATED

What would this information tell you? It would tell you that DB2 wanted this pool to be backed with 1 MB page frames (the default preference for a PGFIX(YES) pool), but the pool ended up using only 4 KB frames. Why? Because there weren't 1 MB frames available to back the pool (more on this momentarily). What you'd rather see, for a PGFIX(YES) pool that is smaller than 2 GB (or a pool larger than 2 GB for which 2 GB page frames have not been requested), is something like this:

DSNB546I  - PREFERRED FRAME SIZE 1M
        43000 BUFFERS USING 1M FRAME SIZE ALLOCATED

(This information is also available in a DB2 10 environment, though in a somewhat convoluted way as described in an entry I posted to this blog a couple of years ago.)

So, what if you saw that a PGFIX(YES) pool is backed only by 4 KB page frames, and not by larger frames (which, as noted above, are VERY much preferred for a pool that has a low total read I/O rate)? Time then for a chat with your friendly z/OS systems programmer. That person could tell you whether the LPAR has been set up to have some portion of its real storage resource managed in 1 MB (and maybe also 2 GB) page frames. Large frames are made available by way of the LFAREA parameter of the IEASYSxx member of the z/OS data set SYS1.PARMLIB. Ideally, the LFAREA specification for a z/OS LPAR should provide enough 1 MB page frame-managed space to allow PGFIX(YES) buffer pools to be backed to the fullest extent possible by 1 MB frames (and/or by 2 GB frames, as desired). It may be that DB2 is the only major user of large real storage page frames in a z/OS LPAR, and if that is the case then the amount of 1 MB (and maybe 2 GB) page frame-managed space could reasonably be set at just the amount needed to back page-fixed DB2 buffer pools (in the case of 1 MB frames, I'd determine the amount needed to back PGFIX(YES) buffer pools and increase that by about 5% to cover some smaller-scale uses of these frames in a z/OS environment). If WebSphere Application Server (WAS) is running in the same z/OS LPAR as DB2, keep in mind that WAS can use 1 MB page frames for Java heap memory -- your z/OS systems programmer should take that into account when determining the LFAREA specification for the system.
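
To make that arithmetic concrete with a made-up figure: if a subsystem's PGFIX(YES) pools total 42 GB, the roughly 5% cushion takes you to about 44 GB of 1 MB frame-managed storage, which could be requested in IEASYSxx with a specification along these lines (a sketch -- the exact form depends on the z/OS release, and more recent releases also accept a keyword form, LFAREA=(1M=...,2G=...), for requesting both large frame sizes):

LFAREA=44G

Keep in mind that a change to LFAREA takes effect at the next IPL of the z/OS system.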

There you have it. To maximize the CPU efficiency advantages of page-fixed buffer pools, make sure they are backed by large page frames. This is particularly true for pools with a low total read I/O rate. The more active a buffer pool is (and the GETPAGE rate is a good measure of activity -- it can be thousands per second for a buffer pool), the greater the CPU cost reduction effect delivered by large page frames.

And don't go crazy with this. Don't have a buffer pool configuration that's 80% of an LPAR's memory resource, and all page-fixed. That would likely lead to a high level of demand paging, and that would be bad for overall system performance. Know your system's demand paging rate, and strive to keep it in the low single digits per second or less, even during times of peak application activity. Leveraging z Systems memory for better performance is a good thing, but like many good things, it can be overdone.