Monday, August 31, 2015

DB2 for z/OS: Which Buffer Pools Should You Page-Fix?

A few days ago, I delivered a presentation on DB2 for z/OS performance monitoring and tuning for the DBA team at a large financial institution. At one point during the session, an attendee asked this question: "Which of our buffer pools should we page-fix?" Because that's a good question, and because it's a question to which the answer has changed over the years, I decided to make it the subject of this blog entry.

The ability to fix a buffer pool in memory, accomplished through the PGFIX(YES) option of the -ALTER BUFFERPOOL command, was introduced with DB2 for z/OS Version 8. [Note that PGFIX(YES), unlike other -ALTER BUFFERPOOL specifications, does not take effect immediately for an already-allocated buffer pool. Actualizing this change requires deallocation and reallocation of the pool -- something typically accomplished by stopping and restarting the associated DB2 subsystem.] The benefit initially ascribed to page-fixed DB2 buffer pools was cheaper (in terms of CPU cost) I/O operations. How so? Well, if a DB2 buffer is not fixed in memory (and PGFIX(NO) is the default), DB2 has to request that z/OS fix the buffer in memory (i.e., make it non-pageable) every time a table space or index page is read into or written from the buffer (and that is true for reads and writes from and to group buffer pools in a DB2 data sharing environment, as well as reads and writes from and to disk volumes). When the I/O operation is complete, the buffer is "de-fixed" (i.e., made pageable again). Why is this done? It's done so that the buffer won't be stolen out from under DB2 by z/OS in the midst of the I/O operation. One pair of page-fix and page-release operations is pretty inexpensive, but when these happen at a rate of thousands per second, you're talking about a fairly significant consumption of processor resources (and keep in mind that a single prefetch read that brings, say, 32 pages of an object into memory will require 32 page-fix and 32 page-release actions). When the pages of a buffer pool are fixed in memory from the get-go, obviously page-fix and page-release actions are not required for every movement of a page into or out of a buffer, and thus I/O operations associated with a page-fixed buffer pool are more CPU-efficient than they otherwise would be. 
In-DB2 CPU time can be reduced, in some cases, by several percentage points for programs accessing data in a page-fixed DB2 buffer pool.
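By way of illustration, here is what page-fixing a pool looks like (the pool name BP8 is just a placeholder -- substitute one of your own high-I/O pools):

```
-ALTER BUFFERPOOL(BP8) PGFIX(YES)
```

Remember that this particular change is pending: it takes effect when the pool is next allocated, which in practice usually means at the next stop and restart of the DB2 subsystem.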

So, the original answer to the question, "Which of our buffer pools should we page-fix?" was, "The pools with the highest I/O rates." More specifically, I'd tell people to check on the total read I/O rate for each of the buffer pools of a production DB2 subsystem, and page-fix those pools having a rate of 1000 or more read I/Os per second, and perhaps as well pools with read I/O rates in the high hundreds per second. [The read I/O rate for a buffer pool is the number of synchronous reads plus the number of prefetch reads associated with the pool, expressed as a per-second figure. I/O activity information can be obtained from a DB2 monitor, or from the output of the DB2 command -DISPLAY BUFFERPOOL(ACTIVE) DETAIL. If the command is used, it should be issued once, then issued again an hour later. Because the DETAIL form of the command reports, by default, activity since the command was last issued, the statistics in the output of the second issuance can then be divided by 3600 to get per-second figures for that one-hour interval.] In DB2 Version 8 and DB2 9 environments, specifying PGFIX(YES) for a low-I/O pool wouldn't do much good.
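To make the arithmetic concrete, here is a sketch of the command-based approach (the figures below are made up for illustration):

```
-DISPLAY BUFFERPOOL(ACTIVE) DETAIL

   ...wait one hour, then issue the command again:

-DISPLAY BUFFERPOOL(ACTIVE) DETAIL
```

If the second issuance of the command showed, for a given pool, 2,520,000 synchronous reads and 1,080,000 prefetch reads, that pool's total read I/O rate would be (2,520,000 + 1,080,000) / 3600 = 1000 per second -- a clear page-fixing candidate.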

Along came DB2 10 for z/OS, and the answer to the "Which of our buffer pools should we page-fix?" question changed. It changed because of DB2 10's support for 1 MB real storage page frames (versus traditional 4 KB frames) for page-fixed buffer pools (1 MB page frames are themselves made available in a z/OS LPAR via the LFAREA parameter in the IEASYSxx member of SYS1.PARMLIB). When a buffer pool's pages are backed by 1 MB page frames, CPU efficiency is improved because the translation of virtual storage addresses to real storage addresses in that context is a less-costly process versus the 4 KB page frame situation (this thanks to a better hit ratio in what's called the translation lookaside buffer). That being the case, in DB2 10 (and 11) environments, high-I/O buffer pools are joined, as good choices for PGFIX(YES), by high-activity pools, and some in the latter category may not be in the former category. Sure, high-I/O pools are very likely to be high-activity pools (and I think that GETPAGEs per second is a good measure of activity), but a high-activity pool could very well be a low-I/O pool. In fact, a high-activity pool could be a no-I/O pool (no read I/Os, that is): consider a pool, defined with the DB2 10-introduced PGSTEAL(NONE) option, that is used to "pin" (that is, entirely cache) one or more database objects in memory. Ideally, you would see no read I/Os for such a pool once all of the pages belonging to table spaces or indexes assigned to the pool have been read into memory from disk (and for a PGSTEAL(NONE) buffer pool, DB2 will asynchronously read all of an associated object's pages into memory when the object is first referenced after pool allocation). If a "pinning" pool has a high GETPAGE rate, PGFIX(YES) could be a nice CPU-saving move.
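A "pinning" pool of that kind might be set up as sketched below (BP9 and the VPSIZE value are placeholders -- the key point is that VPSIZE should be at least as large as the total number of pages belonging to the objects assigned to the pool):

```
-ALTER BUFFERPOOL(BP9) VPSIZE(100000) PGSTEAL(NONE) PGFIX(YES)
```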

The most recent development affecting the answer to the "Which buffer pools should we page-fix?" question is the advent of really large memory resources for z/OS LPARs. I like to see at least 20-40 GB of memory for each engine in a production z/OS LPAR (zIIP engines included, so I would want 160-320 GB -- or more -- of memory for a production z/OS LPAR with four general-purpose and four zIIP engines), and I'm seeing more LPARs that have this Big Memory characteristic. In a z/OS system with memory to burn, you might consider page-fixing ALL of the buffer pools for a production DB2 subsystem. [Here, I'm talking about a situation in which a z/OS LPAR holds a single production DB2 subsystem, and the size of that subsystem's buffer pool configuration is less than half of the LPAR's real storage size. If there are multiple DB2 subsystems in the LPAR, I'd want the combined size of those subsystems' buffer pool configurations to be not more than half of the LPAR's real storage size.] A set-up like this is not blue sky thinking on my part. It's reality. An organization with which I've worked a good bit over the past few years has a production DB2 subsystem in a z/OS LPAR with 9 engines (5 general-purpose, 4 zIIP) and 212 GB of real storage. The DB2 subsystem has a 90 GB buffer pool configuration, and every one of the buffer pools is defined with PGFIX(YES). The demand paging rate for that z/OS LPAR (my preferred measure of the pressure on a z/OS LPAR's memory resource) is zero -- even with 90 GB of the LPAR's real storage fenced off for exclusive use for buffering DB2 data, the remaining 122 GB is more than enough for other memory users in the system. Now, even if you have a z/OS LPAR with a ginormous amount of real storage, don't just change to PGFIX(YES) for all of a production DB2 subsystem's buffer pools at one time. Do that, instead, in stages, and keep an eye on the LPAR's demand paging rate (available from a z/OS monitor). You're good if that rate is zero. 
If it's non-zero but low (e.g., 1 or 2 per second), you're not in a bad situation, but you might want to put the brakes on additional use of PGFIX(YES) until you can get more memory on the system.

In conclusion, the answer to the question, "Which buffer pools should we page-fix?" depends on the nature of your DB2 environment:
  • If the z/OS LPAR's demand paging rate is on the high side (high single digits or more per second), you might want to stay away from PGFIX(YES) altogether until you can add to the LPAR's real storage resource.
  • If the z/OS LPAR's demand paging rate is low but non-zero (e.g., in the vicinity of 1 or 2 per second), consider page-fixing one or more of your buffer pools that have the highest total read I/O rates.
  • If the z/OS LPAR's demand paging rate is zero and you are running DB2 10 or 11, consider page-fixing the pools that have the highest read I/O rates and the pools that have the highest GETPAGE rates (assuming that you are managing some of the LPAR's real storage resource in 1 MB page frames).
  • If you have a z/OS LPAR with memory to burn (running z/OS on what I would call a muscle machine -- rev it up, baby), and you're running DB2 10 or 11, consider page-fixing all of the buffer pools for a production DB2 subsystem. [And one more thing: if you have one or more REALLY BIG buffer pools (meaning 20 GB or more for a single pool) in a DB2 11 environment, consider backing that pool with 2 GB page frames (if the system has such -- again, this is related to the LFAREA parameter in member IEASYSxx of SYS1.PARMLIB), by way of the new FRAMESIZE option of the -ALTER BUFFERPOOL command. For a pool that's not really big, 2 GB page frames won't make much of a difference, CPU efficiency-wise.]
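For that really-big-pool scenario, the DB2 11 syntax would look something like this (BP20 is a placeholder pool name, and keep in mind that 1 MB and 2 GB frames can only be used for a pool that is also defined with PGFIX(YES)):

```
-ALTER BUFFERPOOL(BP20) PGFIX(YES) FRAMESIZE(2G)
```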

Hope this information proves to be helpful for you. Survey your DB2 landscape, and leverage buffer pool page-fixing as it makes sense for your environment.

Wednesday, August 12, 2015

What Does a DB2 for z/OS System Look Like When You Have Memory to Burn?

I've long been an advocate of Big Memory (meaning, lots of real storage) for DB2 for z/OS systems. For years, I was disappointed at seeing one production DB2 subsystem after another running in a z/OS LPAR with 16 GB, 20 GB, maybe 40 GB of memory -- enough real storage to run decently, but not enough to let DB2 do its thing with maximum performance. It was like seeing a thoroughbred racehorse in a small corral. You want to see that beast run free in a big pasture, to see what it can do with some real space to work in.

Lately, I have noticed things changing for the better. Memory sizes for z/OS LPARs -- especially the ones that I really care about, which are those that house production DB2 subsystems -- are finally starting to get seriously large. That's large as in more than a hundred gigabytes -- sometimes several hundred gigabytes -- for one LPAR. This change is being fueled by multiple factors:
  • The cost of z Systems memory continues to go down on a per-GB basis. The z13 servers took us another big step in that direction, and if you get enough memory when upgrading your z196 or zEC12 mainframe to a z13 -- what our sales reps call "mega memory" -- then the discount versus the list price of the memory can be downright ginormous.
  • You can get -- and use -- a lot more mainframe memory than you could before. A single z13 server can be configured with as much as 10 TB of real storage (versus a previous max of 3 TB), and up to 4 TB of z13 memory can be used for a single z/OS LPAR (versus 1 TB previously), if you're running z/OS 2.2 -- soon to be available -- or z/OS 2.1 with some PTFs.
  • Organizations are finally paying attention to z Systems memory. Mainframe engines have become really powerful (about 1000 MIPS of processing capacity per CPU), and z Systems configurations were getting a little out of balance for a while, with big-time processing power being paired with too-small real storage resources. Memory sizes are now catching up with engine capacity. People are also increasingly waking up to the fact that Big Memory is somewhat analogous to zIIP engines: it boosts performance and throughput for DB2 workloads (and for Java applications, as well) without impacting the license cost of z/OS software.

Now, some DB2 for z/OS people might be thinking, "What would I do if I had a ton of memory to work with (let's say, at least a few hundred GB in a z/OS LPAR)? How might I exploit that resource, and what would my DB2 system look like with that resource exploited in a major way?" Well, I'm glad you asked. I think your DB2 system would look something like this:
  • Your buffer pool configuration size is really big, and the total read I/O rate for each pool is really low. Size-wise, given at least a few hundred GB of memory in the LPAR, I'd say that your buffer pool configuration would be at least 100 GB (i.e., the aggregate size of all buffer pools allocated for the DB2 subsystem would be 100 GB or more). In general, when a z/OS LPAR houses a single production DB2 subsystem, I think that a buffer pool configuration size that is 30-40% of the LPAR's real storage size is very reasonable; so, if you have 400 GB in the LPAR, a buffer pool configuration of 120-160 GB should fit very nicely. With a buffer pool configuration of that size, you might see really low read I/O rates for each pool (the read I/O rate for a buffer pool is the rate of all read I/Os for the pool, synchronous plus asynchronous, per second). In my mind, a "really low" total read I/O rate for a given pool is less than 100 per second. That said, with a really big buffer pool configuration you might use some pools for "pinning" certain objects in memory (you'd use the PGSTEAL(NONE) option in that case), and for those pools your target read I/O rate would be zero. Also with a really big buffer pool configuration, you might have one or more individual pools sized at 20 GB or more, and for pools of that size 2 GB real storage page frames (usable for page-fixed buffer pools starting with DB2 11) could deliver additional CPU savings. Finally, with a whole lot of real storage on hand, you might decide to page-fix most, and maybe even all, of your buffer pools, for maximum CPU efficiency.
  • Your packages associated with frequently-executed transactions that re-use threads, and packages associated with batch jobs that issue frequent commits, are bound with RELEASE(DEALLOCATE). For packages bound or rebound in a DB2 10 or DB2 11 system, almost all of the virtual storage associated with those packages when they are allocated to threads for execution goes above the 2 GB bar in the DB2 DBM1 address space, and it uses agent local pool storage versus the EDM pool, so you don't need to worry about running out of space in a virtual storage sense (RELEASE(DEALLOCATE), in combination with threads that persist through commits, increases virtual and real storage utilization). CICS-DB2 thread re-use can be boosted through protected entry threads, IMS-DB2 thread re-use can be increased via pseudo-WFI and/or WFI regions, and DDF thread re-use can be achieved with high-performance DBATs. For relatively simple transactions (those with relatively low in-DB2 CPU times), the combination of RELEASE(DEALLOCATE) packages and thread re-use can reduce in-DB2 CPU time by 10% or more. For batch programs that issue lots of commits, RELEASE(DEALLOCATE) has the added benefit of making sequential prefetch and index lookaside more effective. Note that DB2 11 provided relief for the problem of some bind/re-bind, DDL, and utility operations being blocked by RELEASE(DEALLOCATE) packages executed via persistent LOCAL threads. [If you need to keep RELEASE(DEALLOCATE) packages associated with DDF work from blocking database administration tasks, you can turn off high-performance DBAT functionality via the command -MODIFY DDF PKGREL(COMMIT), and then later turn it back on with the command -MODIFY DDF PKGREL(BNDOPT).]
  • The hit ratio for your DB2 dynamic statement cache is north of 90%. More memory allows for a larger dynamic statement cache, and that means more cache hits and more avoidance of full PREPAREs.
  • All of the RID list processing operations performed on your system are completed using only RID pool space. Starting with DB2 10, two important things related to RID list processing occurred: 1) the default RID pool size went way up (to 400 MB, from 8 MB), and 2) RID list processing operations that can't complete using only RID pool space (because there's not enough of that resource) will continue, using space in 32K-page work file table spaces. Your DB2 monitor (if it supports DB2 10) will show you the extent to which work file space is used for the completion of RID list processing operations that ran out of RID pool space, and if you see such RID list processing "spill-over" activity, you make your RID pool larger (which you can do because the LPAR in which the DB2 subsystem is running has a whole lot of memory). That action allows RID list processing operations to complete in the RID pool, and that boosts performance (versus having to use work file space).
  • You have a really big DB2 sort pool, and that reduces use of work file space for SQL sorts, and that improves SQL sort performance. The sort pool (sized per the value specified for the SRTPOOL parameter in ZPARM) is the amount of in-memory work space that can be used for each concurrent SQL-related sort executing in your DB2 system (so, if the SRTPOOL value is Y, and there are 10 large SQL-related sorts executing concurrently on your system, you could have 10Y of space in the DB2 DBM1 address space used for in-memory processing of these sorts). The default value of SRTPOOL is 10 MB (up from 2 MB prior to DB2 10). Because you have lots of real storage in your z/OS LPAR, you have a larger SRTPOOL value (maybe 40 MB or more), and that means more SQL-related sort work gets done in memory, and that is good for performance.
  • If you run DB2 in data sharing mode, your group buffer pools are large enough so that you have zero directory entry reclaims AND high "XI" GBP read hit ratios. Here, I'm assuming that your coupling facility LPARs, as well as your z/OS LPARs, have lots and lots of memory. Don't know what the "XI" GBP read hit ratio is? It's the percentage of synchronous GBP reads driven by the cross-invalidation (XI) of locally cached pages that find the needed page in the group buffer pool, versus having to get it from disk -- a ratio that gets better as a group buffer pool gets bigger.
  • With all these big uses of Big Memory, your z/OS LPAR's demand paging rate is still zero, or close to zero. The demand paging rate, my preferred indicator of pressure (or lack thereof) on a z/OS LPAR's real storage resource, is the rate at which pages that had been moved out of memory to auxiliary storage by z/OS (to make room for other pages to be brought into memory) are brought back into server memory on-demand. With a way big real storage resource in your z/OS LPAR, you're able to have a really large buffer pool configuration, lots of RELEASE(DEALLOCATE) packages executed via persistent threads, a big dynamic statement cache, and lots of RID pool and sort pool space, while still leaving plenty of memory for other DB2 and non-DB2 uses. With enough memory to go around, demand paging should be nil or close to it.
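As for the thread re-use items mentioned above, the relevant bind option and DDF commands look like this (the collection and package names are placeholders):

```
REBIND PACKAGE(COLL1.PROG1) RELEASE(DEALLOCATE)

-MODIFY DDF PKGREL(COMMIT)
-MODIFY DDF PKGREL(BNDOPT)
```

The first -MODIFY DDF command causes DBATs to behave, package-release-wise, as though RELEASE(COMMIT) were in effect, clearing the way for bind, DDL, and utility work involving DDF-accessed objects; the second command restores high-performance DBAT functionality.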

So, does your system look like that? If not, why not? Do you not have hundreds of gigabytes of real storage in the LPARs in which you run production DB2 subsystems? If you don't, work on getting there. If you do have a z/OS LPAR with tons of memory and a production DB2 subsystem running therein, and you've not leveraged that big memory resource, get to work on that; otherwise, you're missing out on optimal DB2 performance. DB2's a thoroughbred. Give it room to run.