Tuesday, May 24, 2011

Some Nuggets from IDUG in the OC, Part 2

Following up on the entry I posted last week, here are some more items of information picked up in sessions I attended during the IDUG 2011 North American DB2 Tech Conference, put on by the International DB2 Users Group earlier this month in Anaheim, California (in Orange County, aka "the OC"):

IBM Distinguished Engineer John Campbell delivered a presentation on DB2 10 for z/OS migration and early user experiences that was (as usual where John's concerned) full of useful, actionable content. One of the first points made during the session had to do with hash access, a new (with DB2 10) way of organizing rows in a table that can provide, under the right circumstances, super-efficient access to data. John told attendees that the "sweet spot" for hash-organization of data -- the set of data access scenarios in which hash organization would be the right choice for a table -- is smaller than he'd originally anticipated. I was pleased to hear that note of caution, as I feel that some folks have gotten a little carried away with the notion of accessing data rows via a hash key. It's something best used in a targeted way, versus generally. The hash route could be good for tables for which reads well outnumber inserts, and for which read access is dominated by single-row retrieval using a unique search argument (that would be the table's hash key column, or column group).
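For illustration, hash organization in DB2 10 is specified on CREATE TABLE; the table, column, and table space names below are hypothetical, and the HASH SPACE size would need tuning for the actual expected row count:

```sql
-- Hypothetical table suited to hash access: read-mostly, with reads
-- dominated by single-row lookups on the unique key ORDER_ID
CREATE TABLE ORDERS
  (ORDER_ID  INTEGER      NOT NULL,
   CUST_ID   INTEGER      NOT NULL,
   TOTAL_AMT DECIMAL(9,2) NOT NULL)
  IN DB1.TSORD                      -- must be a universal table space
  ORGANIZE BY HASH UNIQUE (ORDER_ID)
  HASH SPACE 64 M;                  -- fixed hash area, sized for the data
```

A point lookup such as SELECT ... WHERE ORDER_ID = ? can then locate the row directly from the hash of the key value, with no index probe.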

John also talked about the increased use of real storage (i.e., server memory) that should be expected in a DB2 10 environment relative to prior-release DB2 systems. He mentioned that DB2 10 could require 10-30% more real storage "to stand still," and more than that if people get aggressive with memory-using features such as high-performance DBATs (about which I blogged in an entry posted last month). In a lot of environments, this shouldn't be a concern, as I've seen plenty of production z/OS systems with tens of gigabytes of memory and demand paging rates in the low single digits per second; however, if you're running with DB2 V8 or DB2 9 and you're kind of tight on memory in your production z/OS LPAR (and I'd say that you are if the demand paging rate is near or more than 10 per second), consider adding memory to that LPAR prior to migrating to DB2 10 (and John pointed out that the cost of memory was substantially reduced when IBM rolled out the z196 mainframes).

John mentioned that while performance improvements (referring to CPU time) in the range of 5-10% should be expected for applications' in-DB2 processing in a DB2 10 environment (assuming that packages are rebound on the DB2 10 system), some "skinny packages" (packages with very short-running SQL statements and/or that issue very frequent commits) bound with RELEASE(COMMIT) were seen to have somewhat worse CPU efficiency in a DB2 10 environment as compared to prior-release systems. John pointed to a recently available fix (the one for APAR PM31614) that addresses this situation.

DB2 10's support of much larger numbers of concurrently active threads got prominent mention in John's presentation. Some threads, he mentioned, use more virtual storage than others, but John indicated that he's pretty confident that DB2 10 sites should be able to support at least 2500 to 3000 concurrently active threads per subsystem.

DB2 users should expect that exploitation of new features will increasingly require the use of universal table spaces, which were introduced with DB2 9. John pointed out that universal table spaces are a prerequisite for inline LOBs (a potentially major performance boost for applications that read and/or insert mostly small LOBs), the "currently committed" locking behavior (whereby data readers don't wait for the release of X-locks held for inserts or deletes of qualifying rows), and the aforementioned hash organization of tables. Fortunately, DB2 10 provides a means of migrating existing simple, segmented, and "classic" partitioned table spaces to universal table spaces without the need for an unload/drop/re-create/re-load sequence (this is done by way of ALTER TABLESPACE followed by an online REORG).
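As a sketch of that conversion path (database and table space names hypothetical), the ALTERs create pending definition changes that a subsequent online REORG materializes:

```sql
-- Single-table simple or segmented table space -> partition-by-growth UTS:
ALTER TABLESPACE DB1.TS1 MAXPARTITIONS 8;

-- Classic partitioned table space -> range-partitioned UTS:
ALTER TABLESPACE DB1.TS2 SEGSIZE 32;
```

Following each ALTER, an online REORG (e.g., REORG TABLESPACE DB1.TS1 SHRLEVEL CHANGE) puts the pending change into effect.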

John brought up a change in the behavior of the CHAR function in a DB2 10 environment, when input to the function is decimal data (among other things, values returned by the function are not padded to the left with zeros). He then informed attendees that the fix for APAR PM29124 will restore pre-DB2 10 behavior for the CHAR function operating on decimal data.
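An illustrative case, assuming a column defined as DECIMAL(5,2) holding the value 1.5:

```sql
SELECT CHAR(DEC_COL) FROM T1;
-- DB2 9:  '001.50'  (result padded on the left with zeros)
-- DB2 10: '1.50'    (leading zeros removed; the PM29124 fix
--                    restores the pre-DB2 10 result)
```

Programs that parse CHAR output positionally are the ones most likely to notice the difference.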

Near the end of his presentation, John talked about preparing for DB2 10 in a DB2 Connect sense. He mentioned that DB2 10 requires that DB2 clients (referring to DB2 Connect or the IBM Data Server Drivers) be at least at the V9.1 fix pack 1 level. Several new DB2 10 functions require that DB2 clients be at the V9.7 fix pack 3A level or higher.

IBM's Beth Hamel delivered a session on data warehousing and business intelligence on the mainframe DB2 platform. She pointed out that two of the trends that are driving growth in data warehousing on System z are consolidation (mainframe systems are very highly scalable) and the rise of "transactional analytics" (these high-volume, quick-running queries can run concurrently with more complex, longer-running BI tasks in a mainframe DB2 system thanks to z/OS's advanced Workload Manager).

Beth also noted that a very large amount of the source data that goes into data warehouses comes from mainframe systems, and she said that organizations are looking to get closer to this source data by locating their data warehouses on mainframes. Also boosting System z activity in the BI space is the fact that data warehouses are increasingly seen by businesses as being mission critical. That leads to a greater emphasis on availability -- long a bedrock strength of the mainframe platform (and that high availability story gets even better when several DB2 for z/OS systems function as a data sharing group on a Parallel Sysplex shared-data mainframe cluster).

In addition to delivering new BI-enabling technology for the mainframe platform, IBM has made moves in the areas of product pricing and packaging that can help organizations to get up and running with DB2-based data warehouses on System z in less time and at a lower cost versus a piece-by-piece implementation. Beth pointed to the InfoSphere Warehouse on System z offering, which provides cubing services to accelerate OLAP applications, SQL-based ETL functionality, and more, all managed by way of an easy-to-use interface. Beth told attendees that the cost of InfoSphere Warehouse on System z is way below that of the offering's individual components, were these to be acquired separately.

In wrapping up her presentation, Beth talked about Version 2 of the IBM Smart Analytics Optimizer, a query accelerator that attaches to a mainframe DB2 system and can deliver eye-popping performance for formerly long-running data retrieval tasks. ISAO V2 will take advantage of Netezza technology (Netezza was acquired by IBM last year) to expand the range of queries that can be processed by the ISAO and to significantly boost the system's data capacity. Beth said that she expects the beta program for ISAO V2 to begin in the third quarter of this year.

That's it for this collection of IDUG nuggets. I'll wrap up this series of posts with a Part 3 entry in a few days.

Monday, May 16, 2011

Some Nuggets from IDUG in the OC, Part 1

That's OC as in Orange County, California, where you'll find Anaheim, site of the 2011 International DB2 Users Group North American Tech Conference. The conference wrapped up on May 6, and herein I have items of information gleaned from some of the sessions I attended (posting delayed by a crazy-busy week following the IDUG event). In a few days I'll provide more of these nuggets in a Part 2 entry.

Gerald Hodge of HLS Technologies delivered a presentation on the plan management feature introduced with DB2 9 for z/OS. This capability enables you to rebind a package whilst retaining the previous version (and, optionally, the "original" version), making it easy to switch to a saved version, in case of performance regression, by issuing a REBIND PACKAGE command with the SWITCH(PREVIOUS) option (or SWITCH(ORIGINAL), to restore the original version of the package). The purpose of the plan management feature is, as some have said, to "take the fear out of rebinding." Gerald pointed out that rebinding all packages is getting to be a more and more important part of the DB2 for z/OS migration process, as performance gains -- and this is particularly true of DB2 10 for z/OS -- depend increasingly on regeneration of packages. He stressed that realizing performance gains in a DB2 10 system (especially in the area of CPU efficiency) is NOT just a matter of getting a different access path. In fact, performance improvements are expected for most packages when rebound in a DB2 10 environment (and we're talking about Conversion Mode here) even when access paths for the packages' SQL statements don't change. The packages generated in a DB2 10 system via REBIND will feature internal enhancements (some pertaining to column handling, others to predicate evaluation) that should result in DB2 doing the same thing better (if access paths are as they were in the previous release). Package regeneration is also required to get the bulk of thread-related virtual storage up above the 2 GB "bar" in the DB2 database services address space (aka DBM1), and THAT is the key to DB2 10's ability to support 5 to 10 times the number of concurrently active threads versus previous releases of the product.
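The rebind-and-fallback mechanism Gerald described looks like this in command terms (collection and package names hypothetical):

```sql
-- Rebind, retaining saved copies of the package:
REBIND PACKAGE(COLL1.MYPKG) PLANMGMT(EXTENDED)

-- If performance regresses, switch back to the saved copy:
REBIND PACKAGE(COLL1.MYPKG) SWITCH(PREVIOUS)

-- ...or, with PLANMGMT(EXTENDED), all the way back:
REBIND PACKAGE(COLL1.MYPKG) SWITCH(ORIGINAL)
```

PLANMGMT(BASIC) retains only the previous copy of a package; EXTENDED retains the "original" copy as well.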

Gerald reminded session attendees that packages last bound prior to DB2 for z/OS Version 6 will be automatically rebound when you get to DB2 10 (that's if the ABIND parameter in ZPARM is set to the default value of YES or to COEXIST; otherwise, an attempt to execute such a package will result in a -908 SQL error code), and he recommended getting such old packages rebound in the "come from" DB2 environment (be that DB2 V8 or DB2 9) BEFORE going to DB2 10. Want to see if you have packages that were last bound prior to DB2 V6? Run the DB2 pre-migration "check-out" job on your system. This job, DSNTIJPM, ships with DB2 10. You can also get the DB2 V8 and DB2 9 versions of the job (DSNTIJP8 and DSNTIJP9, respectively) by way of the fix for APAR PM04968.

Bryan Paulsen of John Deere talked about his company's experience with DB2 10. Bryan told session attendees that Deere's DB2 10 migration and fallback testing went "flawlessly." He also spoke of old DB2 functionality that is going away with DB2 10 -- in particular, the private protocol used for mainframe-DB2-to-mainframe-DB2 client-server communication. DRDA has long been the preferred protocol for DB2 distributed database processing (for all platforms on which DB2 runs -- not just the mainframe), but at some shops there are still packages that utilize private protocol. Bryan pointed people to an APAR, PK64045, that provides, for DB2 V8 and V9 users, several tools that can facilitate the identification of private protocol-using packages and the conversion of these to use the DRDA protocol. He also mentioned APAR PK92339, which introduces a new ZPARM that can be used to disable private protocol at the DB2 subsystem level. Bryan said that Deere found this private protocol disablement capability to be a very useful means of "smoking out" private-protocol-using programs prior to the migration to DB2 10.

Bryan noted that DBRMs bound directly into plans (versus packages) are another piece of old functionality that is gone in a DB2 10 environment. If DB2 10 encounters a DBRM that is bound directly into a plan, it will create a corresponding package, but Bryan suggested that people do these conversions themselves before going to DB2 10. He briefly described APAR PK62876, which delivers a new REBIND PLAN option that can be used to convert a plan's directly bound DBRMs into packages, and then to convert the plan to use a PKLIST that includes the collection into which the new packages (corresponding to the DBRMs) were bound.
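In command terms, the conversion Bryan described might look like this (plan name hypothetical):

```sql
-- Bind each DBRM in the plan as a package in the named collection,
-- then rebuild the plan with a PKLIST that points at that collection:
REBIND PLAN(MYPLAN) COLLID(MYCOLL)

-- COLLID(*) uses a default collection name of the form
-- DSN_DEFAULT_COLLID_planname:
REBIND PLAN(MYPLAN) COLLID(*)
```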

Moving on, Bryan said that Deere had successfully tested with 3000 concurrently active threads on a DB2 subsystem. He noted that Deere normally runs with 450 concurrently active threads for a subsystem, so this test result is in keeping with the expectation that DB2 10 will support 5 to 10 times more concurrently active threads versus a DB2 V8 or V9 system.

Bryan mentioned that he likes the new DB2 10 catalog table SYSPACKCOPY, which makes it easier to track the status of previous copies of a package that are maintained by way of the previously mentioned plan management functionality of DB2 (which allows retention of, and an easy "switch to," the immediate previous and -- optionally -- the "original" copy of a given package).

DB2 10 online schema enhancements -- a further expansion of changes that can be effected for a database object without the need to drop and recreate that object -- were another of the new release capabilities successfully tested by Deere. Bryan said that Deere changed the DSSIZE value for a partition-by-growth universal table space, and changed page sizes for table spaces and indexes, using the new ALTER-then-REORG process introduced with DB2 10.
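Those changes follow the same pending-ALTER-then-REORG pattern; here is a sketch with hypothetical object names (note that a page-size change is expressed by assigning a buffer pool with the desired page size):

```sql
-- Grow the maximum data set size of a partition-by-growth table space:
ALTER TABLESPACE DB1.TSPBG DSSIZE 8 G;

-- Change the page size by moving the table space to an 8K buffer pool:
ALTER TABLESPACE DB1.TS1 BUFFERPOOL BP8K0;
```

Each ALTER leaves the object in advisory REORG-pending status until an online REORG (e.g., REORG TABLESPACE DB1.TSPBG SHRLEVEL CHANGE) materializes the change.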

Sometimes, people will see a DB2 online REORG job fail in the switch phase (the last phase before clean-up) because a thread holds a read claim on the table space or partition being REORGed. DB2 10 introduced a new ZPARM, LRDRTHLD, that can help in identifying processes that hold read claims for extended periods of time (read claims are released at commit time, but sometimes read-only applications do not issue commits in a timely manner). Bryan said that Deere successfully tested this new functionality, which causes DB2 to write a trace record when the threshold is reached, using the default LRDRTHLD value of 10 minutes.
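When a REORG does fail in this way, the threads currently holding claims on the object can be displayed with a command along these lines (database and table space names hypothetical):

```sql
-DISPLAY DATABASE(DB1) SPACENAM(TS1) CLAIMERS
```

The output identifies the claim holders, which can then be traced back to the commit-averse application.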

Bryan concluded his presentation with brief descriptions of some relatively new DB2 10 APARs, including PM27811, which allows for inlining of LOB values in the skeleton package table (SPT01) in the DB2 10 directory (as part of the DB2 10 enable new function mode process, SPT01 is converted to a partition-by-growth universal table space, with package information stored as LOB values). LOB inlining should improve SPT01 access performance and reduce disk space requirements for the table space (the latter because a LOB table space cannot be compressed, but LOB values inlined in a base table can be compressed when compression is used for the corresponding table space). [Note: APAR PM27073 enables one to change the LOB inline length used for SPT01.]

Terry Berman of DST Systems also discussed DB2 10 features from a user's perspective. He started out with a positive review of the catalog restructuring accomplished as part of the DB2 10 enable new function mode (ENFM) process. In particular, SYSDBASE undergoes big changes: each of the 14 tables formerly in that table space goes into a partition-by-growth table space (a PBG table space, being a universal table space, contains one and only one table). Tests run at DST showed that the catalog structure changes greatly improved concurrency for DDL and BIND operations (Terry said that they successfully tested 20 concurrent BINDs).

One of the nice features delivered with DB2 9 for z/OS was the LASTUSED column of the SYSINDEXSPACESTATS catalog table -- a BIG help when it comes to identifying indexes that are not helping performance and are candidates for dropping (fewer indexes means better performance for INSERT and DELETE operations, and for UPDATEs of indexed columns, as well as savings with respect to disk space consumption). Terry gave a thumbs up to the introduction, with DB2 10, of LASTUSED in the SYSPACKAGE and SYSPLAN catalog tables, saying that this information facilitates identification of obsolete plans and packages. The new LASTUSED column values are maintained in DB2 10 conversion mode and are updated once per day.
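As a sketch (the 180-day threshold is an arbitrary choice), candidate indexes and packages can be surfaced with queries like these:

```sql
-- Indexes with no recorded use in roughly six months:
SELECT DBNAME, INDEXSPACE, LASTUSED
  FROM SYSIBM.SYSINDEXSPACESTATS
 WHERE LASTUSED < CURRENT DATE - 180 DAYS;

-- And, new with DB2 10, the same idea for packages:
SELECT COLLID, NAME, LASTUSED
  FROM SYSIBM.SYSPACKAGE
 WHERE LASTUSED < CURRENT DATE - 180 DAYS;
```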

Also on the topic of new DB2 10 catalog columns, Terry said that he was pleased to see read-activity metrics introduced to the real-time statistics tables in the catalog. He specifically mentioned the usefulness of two new SYSTABLESPACESTATS columns: REORGSCANACCESS, which records data accesses for a table space since the last REORG or LOAD REPLACE of the object (or since the object was created, if it hasn't been subsequently REORGed or LOAD REPLACEd), and REORGCLUSTERSENS, which shows the number of times that data in a table space was read by SQL statements that are sensitive to the clustering sequence of data in the table space.
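A quick way to put those new columns to work, sketched as a query that surfaces the most heavily scanned table spaces first:

```sql
SELECT DBNAME, NAME, REORGSCANACCESS, REORGCLUSTERSENS
  FROM SYSIBM.SYSTABLESPACESTATS
 ORDER BY REORGSCANACCESS DESC;
```

A high REORGCLUSTERSENS count relative to REORGSCANACCESS suggests that keeping the object well clustered (via REORG) is worth the trouble; a count near zero suggests the opposite.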

Terry told session attendees that their DB2 EXPLAIN tables really ought to be in Unicode in the DB2 10 environment (APAR PK85068 can help with the conversion of EBCDIC EXPLAIN tables to Unicode). Terry also pointed out that the number of PLAN_TABLE columns continues to grow: the DB2 10-format PLAN_TABLE has 64 columns -- up from 59 columns for the DB2 9 format and 58 for the DB2 V8 format (PK85068 also helps in getting EXPLAIN tables into your current release format).

Terry talked up the access path repository introduced with DB2 10, pointing out that it can be used to (among other things) set various optimization options, such as REOPT, at an individual SQL statement level, versus the package-level granularity of previous DB2 releases.

Terry concluded his presentation with information related to DB2 instrumentation. He noted that compression of SMF trace records works very well: DST saw 75.8% compression with CPU overhead that did not exceed 1% (APAR PM27872 provides a sample SMF decompression program). Terry also said that he really likes the inclusion of statement ID information in DB2 10 messages, which -- thanks to the new STMT_ID columns in the SYSPACKSTMT catalog table and the DSN_STATEMENT_CACHE_TABLE -- makes it much easier to tie error situations to SQL statements in a DB2 10 system. Also getting mention was the separation (Terry: "Finally") of lock and latch times in DB2 accounting trace data, the new IFCID 359 trace record (index page split activity), IFCID 361 (auditing of the DB2 "superusers" in the system), and IFCID 401, which provides statement-level metrics with a lower CPU overhead versus previous DB2 releases (Terry pointed out that getting the IFCID 401 information requires that packages be bound or rebound in a DB2 10 new function mode environment).
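As a sketch of how a statement ID reported in a DB2 message can be tied back to its SQL text (the table qualifier and the ID value are hypothetical), the dynamic statement cache can be externalized and then searched by ID:

```sql
-- Write the contents of the dynamic statement cache to the EXPLAIN table:
EXPLAIN STMTCACHE ALL;

-- Then look up the statement text for an ID reported in a DB2 message:
SELECT STMT_ID, STMT_TEXT
  FROM MYAUTHID.DSN_STATEMENT_CACHE_TABLE
 WHERE STMT_ID = 12345;
```

For static SQL, the same lookup goes against the new STMT_ID column in SYSIBM.SYSPACKSTMT.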

That's all for now. As I mentioned up top, more to come in a few days.