Monday, November 24, 2014

DB2 for z/OS: How Low Can Synchronous Read Wait Time Go?

Short answer: lower than I'd previously thought.

Short explanation: solid-state drives.

OK, backstory: I was recently analyzing some performance information related to a client's production DB2 for z/OS environment. Included in the documentation provided by the client were some Accounting Long reports generated by their DB2 monitor (these reports format data contained in DB2 accounting trace records). As I typically do when reviewing such reports, I checked out the wait time per synchronous read I/O operation: I located the "Class 3 wait time" heading in the report (data therein comes from DB2 accounting trace class 3), found the line for "Database I/O" under "Sync I/O," noted the "Average time" (the average time, per accounting record, spent waiting on single-page, on-demand read I/Os to complete), and divided that figure by the "Average event" number in the same line (that's the average number, in this case, of synchronous read I/O operations per accounting record). The quotient thus obtained gives you wait time per synchronous read. On seeing this number, I did a double take.
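The arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample figures are made up, and it assumes your monitor reports the "Average time" value in seconds (check the units in your own report).

```python
def wait_per_sync_read_ms(avg_time_sec, avg_events):
    """Return average wait time per synchronous read, in milliseconds.

    avg_time_sec: "Average time" for Database I/O under Sync I/O
                  (assumed here to be expressed in seconds).
    avg_events:   "Average event" value from the same report line.
    """
    if avg_events <= 0:
        raise ValueError("average event count must be greater than zero")
    # Quotient of the two report fields, converted from seconds to milliseconds.
    return avg_time_sec / avg_events * 1000.0


# Hypothetical sample values, not taken from any actual report.
avg_time_sec = 0.0125   # average sync database I/O wait per accounting record
avg_events = 50.0       # average sync read I/Os per accounting record

print(f"{wait_per_sync_read_ms(avg_time_sec, avg_events):.2f} ms per synchronous read")
```

With these made-up inputs the quotient works out to a quarter of a millisecond per synchronous read.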

Back in the mid-1990s, at an IBM storage systems conference, I delivered a presentation that provided a DB2 for z/OS perspective on I/O performance. In that presentation, I mentioned that getting average wait time per synchronous read below 30 milliseconds was a good thing, and that getting this time to 20 milliseconds was a great thing. Not long after that, large cache memory sizes became the norm for enterprise storage subsystems, and synchronous read wait times dropped sharply at sites where this technology was put to use (I recall being surprised when I first saw, in the late 1990s, an average wait time per synchronous read number that was below 5 milliseconds). As time passed, more performance-boosting technologies appeared on the mainframe storage systems scene, including faster channels, parallel access volumes (a z/OS feature that enabled multiple I/O operations targeting a single disk volume to execute concurrently), and increasingly sophisticated algorithms for managing disk controller cache memory resources. By around 2010, I was seeing average wait time per DB2 for z/OS synchronous read go as low as 1 millisecond.

The wait time per synchronous read that I saw in the data I reviewed last week? 0.25 milliseconds. That's 250 microseconds. "Your DB2 synchronous read I/Os are screaming. How do you get numbers like that?" I asked one of the client's DB2 for z/OS DBAs. "Oh," he said, "that would be the solid-state drives."

So there you have it. I certainly knew that solid-state drives (SSDs) were on the market (IBM is one of the vendors of SSDs), but I hadn't yet seen the effect of a wide-scale deployment of the technology at a DB2 for z/OS site (in some cases SSDs are used in a niche fashion for a relatively small percentage of an organization's data assets). Since SSDs don't have the actuators ("arms") associated with traditional spinning disk drives, so-called seek-time delay (the time it takes to move a read/write "head" to a particular track on a disk) is largely eliminated, and that contributes to substantially improved read I/O performance, especially for transactional workloads characterized by random reads.

It's all part of a big picture that continues to change (and for the better). Making synchronous reads go as quickly as possible is still, of course, an important aspect of optimizing DB2 for z/OS I/O performance. In that regard, utilization of SSDs is becoming a more attractive option as the technology becomes more cost-competitive relative to spinning-disk storage systems. The other key to driving down I/O wait time is to eliminate read I/Os altogether via larger DB2 buffer pools (a page that is already in a buffer pool doesn't have to be read from disk). And folks, when I say larger, I mean MUCH larger than a lot of you have been thinking. It's time to go big -- really big -- in the buffer pool department. I'll have more to say on that topic in an upcoming blog entry, so stay tuned.