Sunday, January 21, 2018

Db2 12 for z/OS and Insert Algorithm 2: Faster, More-Efficient INSERTs than Ever Before

Near the end of the blog entry I most recently posted prior to this one, I mentioned that a specification of MEMBER CLUSTER for a Db2 for z/OS universal table space could help maximize INSERT throughput and efficiency, in situations where the importance of that objective outweighs that of keeping data rows physically ordered by the target table's clustering key. I also noted that MEMBER CLUSTER sets you up to take advantage of a Db2 12 for z/OS enhancement called (rather cryptically) insert algorithm 2.

So, what's insert algorithm 2? I'll tell you, but before getting to the "what" of this Db2 12 feature, I want to spend a little time on the "why."

I recall seeing, way back in the mid 1980s, not long after the introduction of what was then called Db2 for MVS (the fruit of an IBM research effort that led to the invention of relational database technology), an IBM "red book" about the software product. This document cautioned users against thinking of Db2 as a database management system suitable for high-volume batch and online transactional workloads: Db2 Version 1 Release 1 was intended to support business intelligence-type queries, and was not designed to provide the performance organizations generally required for operational, "run the business" applications.

Well, that advice was widely ignored in the user community. Relational technology and the SQL data-interface language (also invented by IBM) delivered such an advance in programmer productivity versus traditional data stores that companies jumped on Db2 as a foundation for operational applications, in spite of the initial performance challenges. IBM responded to this development by providing, in successive versions and releases of Db2, features and functions that vastly boosted the ability of the DBMS to, among other things, ingest large amounts of new data records in compressed time frames. What were the bottlenecks holding back INSERT throughput and efficiency? Log processing? Index maintenance? Lock contention? Latch contention? Instruction path length? Client-server network management? Smart people at the IBM Silicon Valley Lab kept flattening these INSERT speed bumps, until organizations were routinely able to drive thousands of inserts per second into Db2 tables. And still, in some cases, that wasn't fast enough,

So we come to the Db2 11 for z/OS time frame, and the question is, how do you take Db2 INSERT throughput from really high to super-duper high? You (if you're in the Db2 development organization) do what you've always done: you identify what's holding throughput back in the current technology environment. It was determined that in a Db2 11 system, a key INSERT bottleneck involved the search for space, in pages of a target table, to hold incoming rows. How do you address that? Here, a page was taken from the Db2 playbook of long ago: asynchronous, anticipatory processing is the ticket to success.

Long ago, read I/O was seen as a major drag on Db2 performance, were it to be always handled in the traditional, synchronous way (i.e., read a single page from disk into memory on-demand, at the time a program needed the contents of the page). The solution was something that, today, we can scarcely imagine Db2 being without: prefetch. Through prefetch, Db2 reads batches of table space and/or index pages into memory, in anticipation that the pages will be requested by a process that seems likely to scan a significant number of pages. Because prefetch reads are anticipatory in nature, application elapsed time is reduced: when a prefetch-driving application process requests a page, it is likely that the page has already been read into a Db2 buffer pool in memory, so there is no need to suspend the application's task in order to perform a single-page, on-demand read. Because prefetch reads are asynchronous in nature, application elapsed time is further reduced: prefetch reads are executed by way of Db2 tasks (specifically, by preemptable SRBs in the DB2 database services address space), so application tasks can remain "on task" with regard to processing data (versus having to bring data into memory).

Apply those same concepts - anticipatory, and asynchronous - to an insert-intensive application process, and - presto! - you have a good understanding of the essential nature of insert algorithm 2.

The "how" of enabling insert algorithm 2 is largely table space-based - it can be used only for inserts into universal table spaces that are defined with the MEMBER CLUSTER option (that option can be specified in a CREATE TABLESPACE or an ALTER TABLESPACE statement). What is additionally required is telling Db2 that insert algorithm 2 is to be used for universal, MEMBER CLUSTER table spaces. That can be done at a Db2 subsystem level (at function level V12R1M500 or above) by way of a new (with Db2 12) ZPARM parameter, called DEFAULT_INSERT_ALGORITHM. When the value of that parameter is set to 2 (the default value), insert algorithm 2 will be automatically used for universal table spaces defined with MEMBER CLUSTER (unless that behavior is overridden at the table space level - more on that to come). When DEFAULT_INSERT_ALGORITHM is set to 1, insert processing will be done in the traditional way for universal MEMBER CLUSTER table spaces, unless insert algorithm 2 has been explicitly enabled for a given table space (again, table space-level control over insert algorithm 2 will be covered momentarily). When DEFAULT_INSERT_ALGORITHM is set to 0, insert algorithm 2 functionality is disabled for the subsystem - it cannot be used for any table space.

OK, table space-level control: if the ZPARM parameter DEFAULT_INSERT_ALGORITHM is set to 2 (meaning, insert algorithm 2 is used by default for universal MEMBER CLUSTER table spaces), you can de-activate the functionality (i.e., go with traditional insert processing) for an individual table space by specifying INSERT ALGORITHM 1 for the table space (either at CREATE TABLESPACE time, or via ALTER for an existing MEMBER CLUSTER universal table space). Similarly, if the ZPARM is set to 1 (traditional insert processing will be in effect, by default, for MEMBER CLUSTER universal table spaces), and you want to override that value for a particular MEMBER CLUSTER universal table space (i.e., you want insert algorithm 2 to be in effect for that table space), you can specify INSERT ALGORITHM 2 for the table space (again, either in a CREATE TABLESPACE or an ALTER TABLESPACE statement). Note that a value of 0 has a different meaning for the table space specification versus the ZPARM. For the DEFAULT_INSERT_ALGORITHM parameter in ZPARM, a value of 0 means that insert algorithm 2 functionality is disabled for the Db2 subsystem. For a MEMBER CLUSTER universal table space, INSERT ALGORITHM 0 (the default value for CREATE TABLESPACE, and a specification that is valid only for CREATE TABLESPACE) means that the insert algorithm that will be in effect for the table space is the one specified via DEFAULT_INSERT_ALGORITHM in ZPARM.

Let's say that insert algorithm 2 is in effect for a MEMBER CLUSTER universal table space, and an insert process targeting the table in that table space gets going. What happens? Here's what happens: Db2 provides a "pipe" (an in-memory area) that will be used to supply the insert process (or processes - several can use one pipe) with pages, belonging to the partition receiving new rows, that have space to hold new rows (if several partitions of a table space are receiving new rows, each partition will have its own fast-insert pipe). If row-inserting processes are taking pages from the pipe, what's putting pages into the pipe? That would be a Db2 task that puts insert-ready pages into the pipe, in an asynchronous manner (that is, in a way that does not interrupt the tasks of inserting processes, as a Db2 prefetch task does not interfere with the tasks of page-accessing programs).

What this auto-filled page pipe approach does is remove, for an inserting process, the need to find pages of a partition that have space enough to hold new rows. Even though the space search mechanism used for traditional insert processing is quite efficient (especially for segmented table spaces, thanks to the relatively precise "space available" information maintained in space map pages - and all universal table spaces are segmented), insert throughput can be expected to increase when an inserting process can simply pull "insert space available" pages from a fast-insert pipe versus having to locate such pages itself.

How much will insert throughput increase when insert algorithm 2 versus algorithm 1 (traditional insert processing) is used, all other things being equal? You probably know the answer to that question: it depends. In some cases, insert throughput might increase by a few percentage points. In other cases, the throughput improvement could be substantially greater (some tests have shown a throughput increase in the vicinity of 25%, and even larger increases have been observed in other test scenarios). CPU efficiency gains can also be quite varied - in some cases, in-Db2 CPU time (aka "class 2" CPU time) may decrease only slightly for an inserting process, while in other cases in-Db2 CPU efficiency might improve by 10%, or closer to 25%. Why the variance in throughput and CPU savings for inserting processes? One factor is indexes: generally speaking, the fewer the number of indexes defined on a target table, the greater the performance improvement delivered by insert algorithm 2 will be (the feature is used for inserts of rows into table pages, not for the insert of entries into index pages). The other main factor is the presence of bottlenecks other than row-accommodating-page-search that might "bubble to the surface" and constrain the performance of an insert process, when insert algorithm 2 is in effect. For example, consider a case in which inserts are coming from network-attached processes (i.e., DDF-using applications) that are running on servers that are physically separate from the Db2 subsystem, versus the same processes running in Linux systems on the IBM Z server that also houses the Db2 for z/OS subsystem. In the latter case, the greater network throughput that can be available thanks to IBM HiperSocket technology (memory-to-memory communications links) might result in a comparatively larger performance improvement with insert algorithm 2 in effect, because a more-pronounced network-related bottleneck might constrain the fast-insert improvement achieved with insert processes running on "boxes" that are physically separated from the Db2 server (to put this another way: when insert algorithm 2 removes one insert performance bottleneck, another bottleneck could come into play).

So, your mileage may vary, as they say.

How about the memory impact of insert algorithm 2? Those page-supplying pipes occupy some space, don't they? Yes, but a fast-insert pipe should take up less space than a data-compression dictionary (remember that every partition of every compressed table space has an associated compression dictionary), and at many sites there will be quite a few more COMPRESS YES table spaces than table spaces for which insert algorithm 2 is in effect (recall that insert algorithm 2 is only applicable to universal table spaces defined with MEMBER CLUSTER). Translation: if the real storage resource of a z/OS system were under a good bit of pressure (i.e., if the system's demand paging rate were in the mid-single-digits per second or higher), I might want to add memory to that system before making extensive use of insert algorithm 2; otherwise, you are likely good to go from a memory perspective.

IBM keeps boosting the performance of Z servers. Taking full advantage of these hardware enhancements depends in large part on software design, and Db2 12's insert algorithm 2 is part of that story. With the new insert algorithm in effect, insert rates of hundreds of thousands of rows per second have been observed in tests of Db2 12 systems (in some tests involving tables on which no indexes are defined, insert rates exceeding 1 million per second have been achieved). Does that kind of throughput require a substantial amount of CPU processing capacity? Yes. The insert-algorithm 2 value proposition is this: if you have Z horsepower to apply to an insert throughput challenge, Db2 12 lets you put it to work like never before.

Sunday, December 31, 2017

Db2 for z/OS: Opportunities for CPU Efficiency When Data Clustering is not a Priority

Not long ago, a veteran Db2 for z/OS DBA contacted me with an interesting question. He had, in the system he supported, a table with a clustering index, and he wanted to know how he could get Db2 to act, with regard to INSERT operations and also with regard to executions of the REORG utility, as though the table had no clustering index. In other words, the DBA had a table for which INSERT and REORG CPU efficiency mattered more than having rows physically ordered per a clustering key. That's understandable. There are certainly situations in which the "locality of reference" provided by a clustering index is not of significant value.

You might think to yourself, "This is not a question that's hard to answer - just issue an ALTER for the table's clustering index, with a NOT CLUSTER specification. Problem solved." Eh, not so fast. ALTER INDEX with NOT CLUSTER is a useful means of changing a table's explicit clustering index from IX1 (or whatever it's called) to IX2 (in that case, an ALTER INDEX IX1 NOT CLUSTER would be followed by an ALTER INDEX IX2 CLUSTER). ALTER INDEX with NOT CLUSTER is not a good way to cause a table to not have a clustering index, because if a table has any index then it will always have a clustering index.

"Hold on, there," some might say. "If IX1 is the clustering index for table T1, and I do an ALTER of IX1 with NOT CLUSTER and do not subsequently alter another index on T1 with a CLUSTER specification, table T1 has no clustering index, right?" Wrong. Given that scenario, T1 does not have an explicit clustering index. It has a clustering index of the implicit variety. And what index is that? Index IX1. "But wait," you might counter, "I just altered IX1 with NOT CLUSTER. How can IX1 still be the table's clustering index?" Because the rules of the Db2 for z/OS game are as follows: when the index that had been table T1's explicit clustering index is altered with NOT CLUSTER, that index will continue to be T1's clustering index - albeit in an implicit sense - until such time as another index on T1 is made the table's explicit clustering index by being altered with a CLUSTER specification."

"OK," says you, "The DBA should drop and re-create the table's indexes, without giving any of them the CLUSTER designation. Presto: a table with no clustering index." Wrong again (with all due respect), says I. The table still has an implicit clustering index. Which index will that be? The first one created for the table (so, if IX2 and IX3 and IX1 are created, in that order, on table T1, and CLUSTER was not specified for any of those indexes, IX2 will be T1's implicit clustering index).

"So, if a Db2 for z/OS table has any indexes at all, it will always have a clustering index?" Yes. I said that a few paragraphs ago, and I meant it. I said it a few years ago in an entry in this blog.

How, then, do you get Db2 to act as though there were no clustering index on T1, with respect to INSERT and/or REORG processing, if clustering for T1 is not a priority for you? I'll first address the REORG part of that question, then the INSERT part.

Until quite recently, you COULDN'T tell Db2 to ignore a table's clustering index when REORGing the associated table space - that index was going to be in the picture, one way or another. All you could do was tell Db2 how the clustering index would affect REORG processing: either the table's data would be unloaded by REORG in the order prescribed by the table's clustering index, or it would be sorted in clustering sequence after having been unloaded in "as-is" sequence (the latter approach is the utility's default - the former is used when REORG is executed with the SORTDATA NO option specified). Db2 11 for z/OS delivered an enhancement whereby REORG can be executed with no regard for a table's clustering index. That enhancement will be in effect if REORG is run with SORTDATA NO in combination with the new (with Db2 11) option RECLUSTER NO. With SORTDATA NO and RECLUSTER NO specified, REORG will run as quickly and efficiently as possible when re-clustering is not the motivation for executing REORG. And when might data re-clustering NOT be a motivation for running REORG? How about when REORG is executed with the DISCARD option to remove a lot of rows from a table in a very CPU-efficient manner (as was the case for the DBA I mentioned in the opening part of this blog entry)? How about when REORG is run to materialize a pending DDL change for a table space, such as a change in page or segment size (and each release of Db2 for z/OS adds things to the list of what can be non-disruptively changed for a table or table space or index via pending DDL - changes effected with an ALTER followed by an online REORG)?

OK, so REORG can be executed in a way that does not take a table's clustering key into account. How about INSERTs? Can they be executed without regard to a table's clustering index? The answer to that question is definitely, "Yes." How is that done? Pretty easy: specify APPEND YES when you create the table (or alter an existing table with APPEND YES). On top of that, if the table is in a universal table space, MEMBER CLUSTER YES should be in effect (through either specification of MEMBER CLUSTER YES in the CREATE TABLESPACE statement, or with MEMBER CLUSTER YES specified for an ALTER of an existing table space).

And here's an added bonus associated with MEMBER CLUSTER YES: it sets you up to exploit the new, ultra-high-efficiency and ultra-high-throughput insert algorithm introduced with Db2 12 for z/OS - an enhancement known as "Insert Algorithm 2." How does that work? I'll let you know in the next entry I write for this blog, which I expect to post in January (and I mean January 2018 - happy new year, folks).

In the spirit of "tell 'em what you told 'em," here's the quick reiteration of the main points made in this blog entry:

  • Any Db2 for z/OS table with at least one index has a clustering index, period.
  • If you want to run REORG as efficiently as you can, and re-clustering data in a table is not important for you, run REORG with SORTDATA NO and RECLUSTER NO.
  • If you want to maximize the efficiency of INSERT operations, and row-clustering is not important for you, use APPEND YES for the table in question (and MEMBER CLUSTER YES for the associated table space, if that table space is of the universal variety).
On to 2018!

Tuesday, November 21, 2017

Db2 12 SQL Enhancement: Temporal Logical Transactions

Temporal data support, introduced with Db2 10 for z/OS, is one of the more interesting SQL-related Db2 enhancements delivered in recent releases of the DBMS. Temporal data support comes in two flavors (which can both be utilized for a single table): business-time temporal and system-time temporal. With business-time temporal support enabled for a table, an organization can put future changes into the table (e.g., price changes for products or services that will not go into effect until some future date) without affecting programs that, by default, access data rows holding information that is currently in effect (among the use cases for business-time temporal: profitability forecasts, utilizing queries that access price values that will be in effect six months from now).

In contrast to business-time temporal, system-time temporal enables a look back, as opposed to forward, regarding in-effect data values. By that I mean that system-time temporal allows an application or a user to see data rows that were current at a time in the past. The mechanism through which this capability is provided is conceptually pretty simple: when system-time temporal support is enabled for a table (referred to as the base table), a logically equivalent history table is associated with the base table ("logically equivalent" means that the history table has the same columns as the base table: same names, same order, same data types - I've pointed out that physical equivalence is NOT a requirement). Subsequently, when a row is made non-current in the base table by way of an UPDATE or a DELETE operation, the "before" image of the updated or deleted row is stored - automatically by Db2 - in the associated history table. Db2 knows, thanks to a couple of timestamp columns in the base table (and the history table) that are maintained by Db2, when a row became current (i.e., when it was either inserted into the base table, or when it was updated) and when it became non-current (i.e., when it was deleted from the base table, or replaced in the base table via an UPDATE). With Db2 having that information, it can respond to a query that contains a temporal predicate, which would be of the form FOR SYSTEM TIME AS OF timestamp-value (or BETWEEN timestamp-value1 AND timestamp-value2 or FROM timestamp-value1 TO timestamp-value2).

OK, with that background information in mind, consider this scenario: a program inserts a row into base table T1, which has been enabled for system-time temporal (one could also say that T1 is enabled for "row versioning"), and then, in the same unit of work, updates the just-inserted row. Then the program commits. You might expect to find, following the completion of this unit of work, the "before" image of the row that was changed by the program's UPDATE statement; but, that row is nowhere to be found in T1_HIST (or whatever you decided to name T1's history table). Why is that so? It's so because the update of the row occurred in the same unit of work as the insert of the row. What can you do about that, if you want to see, in T1_HIST, the "before" image of the row changed by the aforementioned UPDATE? Well, you could break the single unit of work into two units of work, with the row-insert in one and the UPDATE of the row in the other. That might not be feasible, and it could be a big hassle even if feasible. Here's what likely would be a more attractive option: have two temporal logical transactions in the one unit of work.

Two what?

Temporal logical transactions, or TLTs, are a new capability introduced with Db2 12 for z/OS (at function level V12R1M500). How would you put this functionality to work in a program? It's pretty easy: first, you have the program issue the following SQL statement:


That statement, which references a new (with Db2 12) special register, tells Db2 to allow (for your session) multiple TLTs in one unit if work.

Next, the program would issue a SQL statement with this form (and note that timestamp-value1 could be CURRENT TIMESTAMP):


Then the program would perform the row INSERT into T1 as it had before. In that row, the "start" timestamp value (the value interpreted by Db2 as the time at which the row became current) would come from the value supplied via the TEMPORAL_LOGICAL_TRANSACTION_TIME special register, which was previously set by the program.

After performing the INSERT operation, the program would again issue the statement below (and again, timestamp-value2 could be CURRENT TIMESTAMP - it would just need to be a little "ahead" of the timestamp-value1 previously used, and as the temporal timestamp value goes to the picosecond level, CURRENT TIMESTAMP, used again, would take care of that "little bit ahead" requirement):


The program would then update the just-inserted row, as it had before. In a similar way as for the INSERT, the "start" timestamp value for the updated row in the base table would be timestamp-value2, based on the value to which the TEMPORAL_LOGICAL_TRANSACTION_TIME special register was set just prior to the UPDATE.

Then the program would issue a commit, as it had before, and now what would one see in T1_HIST? One would see the "before" image of the row updated within the just-completed unit of work, even though the UPDATE targeted a row that had been inserted into T1 in the same unit of work. That's temporal logical transaction functionality in action: you have inserted multiple TLTs into one physical transaction, giving you the row-change-history capability you want even for a row acted on more than once in a unit of work, without having to change the commit scope of the unit of work.

AND, that's not all - TLT functionality can also work in the other direction, allowing you to incorporate several physical transactions into one temporal logical transaction. How is that done? It's not hard: just have your program issue the statement below (keeping in mind that, as previously noted, timestamp-value can be CURRENT TIMESTAMP):


Following issuance of that statement, the program could update a row in system-time-enabled base table T1, then commit, then update the same row, then commit again. What you'd see in T1 and T1_HIST in this case (sticking with the same base and history table names used above) would be what you'd see if the updates had been made in one physical transaction versus the actual pair of physical transactions: one "before" row image captured in T1_HIST, showing the target row as it appeared prior to the first update performed by the program, and the row in T1 as it appeared after the second update performed by the program. Why no second row in T1_HIST, reflecting the change made by the second UPDATE operation? Because you told Db2 (through specifying a value for the TEMPORAL_LOGICAL_TRANSACTION_TIME special register) that you wanted the two physical transactions treated as one from a system-time temporal perspective.

And there's your overview of temporal logical transaction functionality. Having that Db2 12 capability on-hand in your shop could well open up new use cases for system-time temporal data support. Give it some thought.

Friday, October 27, 2017

Db2 12 for z/OS: Enhanced Lock Avoidance in Data Sharing Environments

In the last entry posted to this blog, I described the mechanisms (including PUNC bits and commit log sequence numbers, aka CLSNs) used by Db2 for z/OS for lock avoidance purposes (lock avoidance here refers to a capability whereby Db2 can, in many cases, verify the committed state of a data value without having to get an S-lock - i.e., a read lock - on the page or row in which the value is located). As lock and unlock requests consume CPU cycles, lock avoidance is a CPU efficiency booster for Db2 application workloads.

Db2 data sharing (a technology through which multiple Db2 subsystems in a Parallel Sysplex mainframe cluster share concurrent read/write access to one database) has long delivered unmatched levels of system scalability and availability. An enhancement delivered with Db2 12 for z/OS can reduce the CPU cost of data sharing by increasing the degree to which locks can be avoided in a data sharing environment.

In the aforementioned blog entry on Db2 lock avoidance, I described how commit log sequence numbers (CLSNs), which are maintained for each and every Db2 page set (or partition, in the case of a partitioned object), are used in support of lock avoidance. When Db2 data sharing technology was introduced in the mid-1990s (with Db2 Version 4), its use had the effect of reducing lock avoidance because a single CLSN, called the global CLSN (or GCLSN), was used for ALL group buffer pool-dependent data sets (i.e., data sets that are the object of inter-Db2 read/write interest in a data sharing group - quite likely the majority of open data sets in the system). The GCLSN value is the log point at which the currently-still-outstanding-longest-running-data-changing unit of work  got started. The implication: if ONE data-changing unit if work is running for a long time, lock avoidance for ALL group-buffer-pool-dependent data sets is negatively impacted, because in that case the GCLSN value will go a long time without being positively incremented, and in THAT case more S-locks on pages and/or rows will have to be requested to verify the committed state of data values.

Along comes Db2 12 for z/OS, and with it an important change in how CLSNs affect lock avoidance in a data sharing environment. The change: instead of a single GCLSN value reflecting the start time of the longest running still-out-there data-changing unit of work in the whole data sharing system, Db2 12 caches, in memory and in the shared communications area (SCA) in a coupling facility LPAR, the 500 oldest CLSNs in the data sharing system. What does that mean with regard to lock avoidance? Well, let's say that you're an application program accessing data in table TABLE_A, which is in table space TSPACE_A, which is group buffer pool-dependent and has a CLSN that is one of the aforementioned 500 oldest in the system. For you, lock avoidance is just as it would be in a non-data sharing system, because Db2 is using your table space's CLSN value to determine when it can retrieve committed data values from TABLE_A without having to issue S-lock requests for pages (or rows, as the case may be) in the table.

Now, let's say you're an application program accessing data in TABLE_X, which is in table space TSPACE_X, which is group buffer pool-dependent and has a CLSN that is not one of the 500 oldest in the system. What's the deal in that case? Here's the deal: for you, Db2 will use the CLSN that is the newest of the 500 oldest that it's cached. Is that going to be good with regard to the level of lock avoidance you get? It should be really good. Think about it. In a Db2 11 (or earlier) system, Db2 would be using, for you, for lock avoidance purposes, the oldest CLSN in the system. What if that CLSN is associated with a unit of work that's been chugging along for an hour, changing data in TABLE_Y (in TSPACE_Y), without committing? That really old CLSN (the global CLSN for the system) could mean a lot of lock requests not avoided for you. In a Db2 12 environment, the CLSN associated with the unit of work that started changing data in TABLE_Y an hour ago (and hasn't committed since) affects lock avoidance only for TSPACE_Y, not for TSPACE_X, which you're accessing. Does the next-oldest CLSN in the system, which is (let's suppose) 45 minutes old, affect the level of lock avoidance you're getting? Nope. How about the 10th oldest CLSN? Nope. The 100th oldest? Nope. The 352nd oldest? Uh-uh. Your level of lock avoidance is impacted by the 500th oldest CLSN in the system. Maybe that one is associated with a unit of work that has gone only a few seconds (or less) without a commit. Think you'll get more lock avoidance with a few-seconds-old CLSN versus an hour-old CLSN? Odds of that are real good, I'd say.

And there you have it - just one of multiple ways in which Db2 12 delivers improved performance versus its predecessor. Over the next several months in this blog I'll get to some of the other performance-boosting features of Db2 12. Lots to like there, I can tell you now.

Friday, September 29, 2017

Db2 for z/OS: Lock Avoidance

Not long ago, I served as part of the teach team for a Db2 12 for z/OS Technology Workshop that IBM offered in a city in the USA. During that workshop, I covered, among other things, a Db2 12 enhancement that can improve the CPU efficiency of a Db2 data sharing system by reducing locking activity in that environment. During a break, one of the workshop attendees told me (in words to this effect), "That lock avoidance enhancement is good news, but I think a fair number of Db2 for z/OS people are a little uncertain as to what lock avoidance is - and how it works - in the first place." I think this person was right on that score, and that's why I'm writing this blog entry: to facilitate understanding of the fundamentals of Db2 lock avoidance. In a follow-on entry that I'll try to post within the next couple of weeks, I'll cover the aforementioned Db2 12 data sharing-related lock avoidance enhancement.

OK, let's start with the "why" of Db2 lock avoidance. In support of a large application workload, Db2 does a lot of stuff. One thing that Db2 does about as much as anything else is issue lock and unlock requests. To see what I mean, take a look at a statistics long (aka statistics detail) report, generated by your Db2 monitor, covering a period of high activity on the system (or check the monitor's online display of lock activity). I looked at a report just now, reflecting activity in a large real-world production Db2 for z/OS subsystem, and what does it show? 106 million lock requests and 23 million unlock requests in one hour. That's about 30,000 lock requests per second, and about 6400 unlock requests per second, on average (there are more lock than unlock requests because multiple X-locks, acquired by an application process in the course of modifying data values, are released at a commit point with one unlock request). Now, the workload associated with all that lock activity is a big one, driving in excess of 700 million SQL data manipulation statements during the hour between the report's FROM and TO times, but that kind of volume is by no means unusual for a large Db2 site. Why the big lock request number? Two words: data integrity. Locks acquired on data pages or rows (depending on the granularity of locking in effect for a table space) help to ensure that data consistency is maintained in a Db2 for z/OS system, and that programs retrieve only committed data from the database (i.e., values that have been "hardened" in the database in the wake of data-change operations).

One lock request consumes very little in the way of CPU cycles, but tens of thousands of lock requests per second? That's another matter. When the execution volume for a given Db2 operation is really high, there is CPU-efficiency value in reducing the frequency of said operation. The Db2 development team had that in mind when, more than two decades ago, they delivered a capability called lock avoidance. Essentially, lock avoidance enables Db2 to reduce lock activity for an application workload by issuing S-lock requests (i.e., data-read locks associated with execution of queries) only when such locks are needed to ensure retrieval of committed data values.

How does Db2 know when an S-lock on.a page or row is needed to guarantee retrieval of committed data values (i.e., to avoid returning uncommitted data changes to a program)? It knows through the use of two lock-avoidance indicators: CLSNs and PUNC bits.

  • CLSNs - This acronym stands for "commit log sequence number." In every page of every Db2 page set, Db2 records the log point corresponding to the most recent update of the page's contents. Additionally, for every page set or partition (for partitioned objects), Db2 keeps track of the starting log point of the oldest still-in-flight unit of work that is changing data in the page set or partition. That latter log point is the commit log sequence number of the page set or partition. When going to retrieve data in a page, Db2 can check the page's last-updated log point and compare that to the CLSN of the page set or partition. Using simple numbers for example, suppose that a page's last-updated log point (information that, again, is stored in the page) is 20, and the CLSN of the page set or partition in which the page is located is 30. Because the starting log point of the oldest still-in-flight unit of work affecting the page set or partition is higher than (meaning, is later in time than) the log point of the last update of the page, Db2 knows that all the data in the page is in a committed state, so guaranteed-committed data can be retrieved from the page with no need for S-lock acquisition on the page (or on rows therein). When the  oldest still-in-flight data-changing unit of work affecting a page set or partition commits, the CLSN of the page set or partition moves up to the starting log point of what had previously been the next oldest still-in-flight data-changing unit of work affecting the page set or partition. Even if (using simple numbers as before) the last-updated log point of a page is seen to be 50, and the CLSN of the associated page set or partition is 40 (meaning that data in the page might have been updated by a still-in-flight data-changing unit of work), lock avoidance may still be possible for a data retrieval operation targeting the page, thanks to PUNC bits.  
  • PUNC bits - PUNC is short for "possibly uncommitted." Included in the control information on every Db2 data page and every index page are bits that indicate whether or not a row contains possibly uncommitted data (in the case of an index page, the PUNC bits are associated with the RIDs, or row IDs, that point to rows in the underlying Db2 table). When a row is changed, its PUNC bit is set. That being the case, when Db2 examines a row (or its RID in an index) and sees that the PUNC bit is NOT set, Db2 knows that the data in the row is committed, and the data can be retrieved without the need for an S-lock on the data page or row to ensure data committed-ness. So, why is this indicator called the "possibly" uncommitted bit, as opposed to the "for sure uncommitted" bit? Because Db2 does not synchronously reset a "turned on" PUNC bit when the data change that caused the bit to be set is committed - doing that would have an unacceptable overhead cost. Instead, PUNC bits that are turned on as a result of data-change activity are reset asynchronously, in the background, when certain events happen (one such event is execution of the REORG utility for a table space; another is when more than 25% of the rows in a page have their PUNC bits turned on and the page set's or partition's CLSN advances). Because of the asynchronous nature of PUNC bit resetting, relative to the turning on of a PUNC bit, the turned-on PUNC bit setting can only be interpreted as meaning, "Maybe the data in this row is committed and the PUNC bit hasn't been reset, or maybe the data is in fact not committed." When an application program wants only committed data values to be retrieved by queries (i.e., when isolation level UR, short for "uncommitted read," is not in effect for the program), "maybe" isn't good enough, and Db2 will request an S-lock on the row or page to ensure the committed state of data values (successful acquisition of an S-lock means that the page or row is not X-locked, and that means the row or page contains only committed data).

I want to impart now a couple more items of information pertaining to Db2 for z/OS lock avoidance. First, lock avoidance can be utilized to the maximum extent possible when a Db2 package is bound with ISOLATION(CS) - short for cursor stability - and CURRENTDATA(NO). Second, what I have described in this entry is lock avoidance as it occurs in a non-data sharing Db2 system. Lock avoidance in a Db2 data sharing environment - including the Db2 12 enhancement referenced at the start of this entry - will be the subject of my next post to this blog. I hope to have that written within the next two weeks, so check back in if you use - or are interested in - Db2 data sharing.

Wednesday, August 30, 2017

Db2 12 for z/OS SQL Enhancements: Advanced Triggers

In this, the fourth of a set of four entries covering SQL enhancements introduced with Db2 12 for z/OS, I will describe new trigger capabilities delivered by way of what are called "advanced triggers" (the kind of trigger that you could define prior to Db2 12 -- and which you can still create in a Db2 12 system -- is now referred to as a "basic trigger"). Before getting to the details of advanced trigger functionality, I want to emphasize that there are considerably more than four SQL-related enhancements delivered through Db2 12 -- I've just selected my four favorite of these enhancements for highlighting in this blog (the previous three entries in this series covered piece-wise DELETE, result set pagination, and the much-improved MERGE). A good source of information on the other SQL enhancements provided by Db2 12 is the "Application enablement" section of the "What's new" part of the Db2 12 Knowledge Center on the Web.

Note that advanced trigger functionality is available in a Db2 12 system when the activated function level is V12R1M500 or above.

OK, advanced triggers: the most important thing to know about this new kind of trigger is that it can contain (in the CREATE TRIGGER statement) a compound SQL statement. Basically, that means that you can define a trigger using SQL PL (aka SQL procedure language -- the same language that enables the creation of native SQL procedures and "native" SQL user-defined functions).

Before getting into the implications of SQL PL in the body of a trigger, I want to do a little level-setting. A trigger, for those who don't know, is a mechanism by which a data-changing action (INSERT, UPDATE, or DELETE) targeting one table can "trigger" the automatic execution of some other SQL action. A trigger can be "fired" before or after the "triggering" SQL statement has been executed. A very simple example: an AFTER UPDATE trigger can cause an UPDATE that changes column C1 of table T1 to drive an INSERT of some information into table C2.

Suppose you want the action taken when a trigger gets fired to be somewhat involved versus really simple. In that case, prior to Db2 12 it was often necessary to have the trigger call a stored procedure. That can be kind of clunky from a coding perspective. With the ability to code SQL PL in the body of an advanced trigger, you can drive a fairly sophisticated action when a triggering SQL statement executes, without having to put a stored procedure call in the trigger body (in essence, you can use SQL PL to put the equivalent of a native SQL procedure in an advanced trigger).

An example of an advanced trigger appears below, followed by some color-coded comments (this BEFORE INSERT trigger examines start and end times for classes in records to be inserted into a Db2 table, sets the end time to one hour after the start if the end time value is NULL, and returns an error if the class end time is after 9 PM):

WHEN(N.ending IS NULL OR n.ending > '21:00')
 IF (N.ending IS NULL) THEN
    SET N.ending = N.starting + 1 HOUR;
 IF (N.ending > '21:00') THEN
    SET MESSAGE_TEXT = 'Class ending time is beyond 9 pm';

Things to note about this advanced trigger:
  • You have some new options -- Because an advanced trigger can include SQL PL statements, you can debug it, just as you can debug a native SQL procedure or a compiled SQL scalar function (Data Studio is particularly handy for debugging SQL PL routines). Another new option for advanced triggers: you can provide a high-level qualifier to be used with unqualified objects referenced in the body of the trigger.
  • You can include SQL control statements (i.e., logic flow control statements) in the body of the trigger -- IF (shown in the example) is one such statement. Among the others are ITERATE, LOOP, REPEAT, and WHILE. These SQL PL statements enable the coding of a trigger that has pretty sophisticated functionality.
  • There are new possibilities for the SET statement -- With an advanced trigger, SET is not restricted to transition variables -- it can also be used with global variables and SQL variables (the latter term refers to variables declared in the body of the trigger).
And, there's something that's notable by its absence in the example CREATE TRIGGER statement -- namely, the phrase MODE DB2SQL. It is, in fact, the absence of MODE DB2SQL in a CREATE TRIGGER statement that indicates that the trigger will be an advanced trigger, as opposed to a basic trigger.

Besides providing advanced functionality versus basic triggers, advanced triggers eliminate what had been a really vexing problem encountered by many users of (what are now called basic) triggers -- a problem best illustrated by example. Suppose you create three basic triggers on a table, all of which are "fired" when a particular type of statement targets the table (e.g., an UPDATE of a certain column in the table), and all of which "fire" in the same relative time period with regard to the execution of a triggering SQL statement (e.g., all three are AFTER triggers). In that case, the order in which the triggers will fire is determined by the order in which they were created: if trigger A was created first, then trigger B and then trigger C, they will fire in that order (A then B then C) upon the execution of a triggering SQL statement. Let's say that this A-B-C trigger firing sequence is important to you. Now, suppose that that trigger A has to be modified. Only way to get that done with a basic trigger is DROP and re-CREATE. But wait! That re-CREATE will make trigger A the last of the three triggers in this example to be created, resulting in an undesirable firing order of B then C then A when a triggering SQL statement is executed. How do you change trigger A and preserve the desired A-B-C firing order? Here's how: you DROP all three triggers, then re-CREATE all three in A-B-C order. What a hassle!

Along comes advanced trigger functionality, and this problem is solved -- and not only solved, but fixed by your choice of three options. All three of these hassle-free trigger modification options are made possible by the fact that advanced triggers, like native SQL procedures and compiled SQL scalar functions, have versions. So, back to the example of the preceding paragraph: you have triggers A, B, and C, and you want to change A while maintaining the A-B-C firing sequence. If they are now advanced triggers, no problem! Here are your three -- count 'em: three -- options for changing advanced trigger A without messing up the firing sequence of the triggers (I'll assume that the current version of trigger A is V2, and I'll highlight syntax that is new with Db2 12 in red):

Choose any of the above options, and you successfully modify trigger A while maintaining the A-B-C firing sequence of the three triggers, without having to DROP and re-CREATE triggers B and C. And the crowd goes wild!

Well, there you have it: another incentive to get to Db2 12 for z/OS (and to activate function level V12R1M500 or later). Enjoy those Db2 12 SQL enhancements, folks!

Thursday, August 17, 2017

Db2 12 for z/OS SQL Enhancements: a Better MERGE

Greetings, and welcome to the third in a series of entries covering my favorite Db2 12 for z/OS SQL enhancements. The first two posts in the series described piece-wise DELETE and result set pagination. This entry focuses on the new and very much improved MERGE functionality delivered via Db2 12 (usable at function level V12R1M500 and above).

The MERGE statement, introduced with Db2 9 for z/OS, is sometimes referred to as the "upsert" statement, because it enabled, via a combination of update and/or insert operations, the "merging" of one "table" into another (I'll explain the quotes around "table" momentarily): in a MERGE statement, you'd indicate what constitutes a match between an input "table" "row" (again, quote marks to be explained) and a target table row, and where a match exists the target table row is updated with information in the matching input "table" "row," and when a match doesn't exists the input "table" "row" is inserted into the target table. That functionality sounds pretty useful, but the initial Db2 for z/OS implementation of MERGE left a good bit to be desired (read on to see what I mean).

What's great about the new versus the old Db2 for z/OS MERGE statement is best shown by way of example. Here is what a pre-Db2 12 MERGE statement might look like, with color-coded complaints following:

USING (VALUES (:hv_id, :hv_amount)
ON (A.ID = T.ID)

Complaint: the input "table" has to be represented as a series of host variable arrays -- one for each "column" of the "table" -- Thus the quotation marks I've been putting around input "table" and "row." It's not really a Db2 table that's used for MERGE-input purposes. It's a clunky representation of a table. What a hassle.

Complaint: only a very simple row-match qualification can be specified, and only a simple pair of actions are possible: when a match is found, do this update, and when a match is not found, do this insert -- Not a lot of sophistication or flexibility here.

Complaint: the NOT ATOMIC CONTINUE ON SQLEXCEPTION clause, which indicates that input "rows" (if there are several of them) are processed separately, with processing continuing in the event that an error is encountered for a given row, is required -- You might not want that behavior.

And, on top of all that, a target table row can be operated on more than once in a single execution of a pre-Db2 12 MERGE statement (it could be inserted and subsequently updated) – Again, you might not want that behavior.

Along comes Db2 12 (at function level V12R1M500 or above, as previously noted), and boy, is MERGE ever better than it was before. I'll stick with the color-coded-example approach to show how MERGE has been enhanced. A Db2 12 MERGE might look like this:


Love it: with Db2 12's new-and-improved MERGE, you can actually merge an honest-to-goodness Db2 table with another table; or, you can merge a view with a table; or (as shown in this example), you can merge the result set of a SELECT statement with a table -- So long, host variable arrays. Can't say I'll miss you.

Love it: in one MERGE statement, you can have a multitude of “when matched” and “when not matched” clauses, differentiated through various additional predicates -- Benefit: a number of update, insert, and/or delete actions can be driven via execution of a single MERGE statement (and, yes, the ability to drive DELETE operations -- in addition to UPDATE and DELETE -- via a MERGE statement is another part of the "new-and-improved" story).

Love it: you can use the SQL statement SIGNAL to provide customized error codes and messages -- Maybe you could have some fun with that, like, SET MESSAGE TEXT = 'BETTER LUCK NEXT TIME'

Love it: there's a new IGNORE option -- MERGE input rows probably hate to be ignored, but sometimes it's just necessary.

And on top of all that, with the new-and-improved MERGE, a target table row can be operated on (via INSERT, UPDATE, or DELETE) only once -- Personally, I kind of like that.

And one more thing: if an error is encountered during execution of a new-and-improved MERGE, the whole statement is rolled back -- It's all or nuthin'.

Now, if you find yourself getting all sentimental and missing the pre-Db2 12 MERGE statement, you can always bring it back: just include in your coding of the statement the NOT ATOMIC CONTINUE ON SQLEXCEPTION clause. Do that, and MERGE behavior will be as it was prior to Db2 12, right down to the requirement that input "table" "rows" be in the form of host variable arrays.

Me? I'll take the new MERGE.