In Focus

What does the concept of Re-Read mean?
Db2 table data is held in data pages within VSAM linear datasets, stored on DASD. To access and process the data, Db2 applications must read the data pages from DASD into a part of Db2 working storage called the Buffer Pool. Such read operations are, of course, required and unavoidable. As long as the data in a data page is not changed, the page may remain in the Buffer Pool, which accelerates later access to it, because reading data from Buffer Pool storage is many times faster than reading it from DASD.
However, every data page shares the Buffer Pool space with other data pages. The Buffer Pool may therefore become full, and existing data pages have to be removed from it to make space for new data pages that the application has just requested. It may then happen that the very pages that were just thrown out are requested and read again a short time later. That is called a repeated read, or Re-Read for short. If a Buffer Pool is too small for the number of different data pages that the Db2 application needs, many Re-Read operations will take place within a given time frame. Performance degrades because accesses to (slow) DASD increase and accesses to (fast) Buffer Pool storage decrease.
Buffer Pool tuning is about avoiding unnecessary DASD access; the percentage of Re-Reads in the total number of read operations is an important indicator for tuning potential in a Db2 Buffer Pool.
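The effect described above can be illustrated with a small simulation. The following Python sketch is only an illustration, not BPA4DB2's implementation: it assumes a simple least-recently-used replacement policy, and the names simulate_rereads and reread_percentage are hypothetical. It counts how many reads from DASD are Re-Reads, that is, reads of a page that had already been read before and was thrown out in the meantime.

```python
from collections import OrderedDict


def simulate_rereads(page_requests, pool_size):
    """Replay page requests against a toy LRU pool (assumed policy).

    Returns (reads, rereads): the number of reads from DASD and how
    many of those fetched a page that had been read before.
    """
    pool = OrderedDict()   # pages currently buffered, oldest first
    seen = set()           # every page ever read from DASD
    reads = 0
    rereads = 0
    for page in page_requests:
        if page in pool:
            # Page is buffered: no DASD access, just mark it as recently used.
            pool.move_to_end(page)
            continue
        reads += 1
        if page in seen:
            rereads += 1   # page was read earlier and evicted since
        seen.add(page)
        pool[page] = True
        if len(pool) > pool_size:
            pool.popitem(last=False)   # evict the least recently used page
    return reads, rereads


def reread_percentage(reads, rereads):
    """Share of Re-Reads in the total number of read operations."""
    return 100.0 * rereads / reads if reads else 0.0


if __name__ == "__main__":
    requests = [1, 2, 3, 1, 2, 3]
    for size in (2, 3):
        reads, rereads = simulate_rereads(requests, size)
        print(size, reads, rereads, reread_percentage(reads, rereads))
```

In this toy workload three different pages are requested in a cycle. A pool of two pages has to re-read every page on the second pass, whereas a pool of three pages reads each page only once.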
After BPA4DB2 has identified a high Re-Read percentage, how can I find out how much space I have to assign to a Buffer Pool?
You do not have to find out by yourself. BPA4DB2 reports the recommended size and even creates ALTER statements to adjust the Buffer Pool sizes. But the algorithm that determines the size is easy to understand. A high Re-Read percentage occurs if the number of different data pages that a Db2 application requests within a specific time frame is too high to fit into the Buffer Pool. The number of different pages is the key value, appearing within BPA4DB2 as “Distinct Getpage”. Let’s assume that within a 10-minute interval 4,000 different pages were read into a Buffer Pool. Then this pool needs a size of exactly 4,000 pages in order to operate over those 10 minutes without the need for Re-Reads. The number of Distinct Getpage operations is the reference point for the pool size, and this number obviously depends on the length of the assessment interval. The assessment period is therefore also a quality criterion for the behavior of a pool, that is, for the positive impact of the buffering.
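The sizing idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm BPA4DB2 actually uses: it assumes getpage events are available as (timestamp in seconds, page id) pairs, and the function names are hypothetical. It counts the distinct pages per assessment interval and takes the largest count as the pool size that would avoid Re-Reads within any one interval.

```python
from collections import defaultdict


def distinct_getpages_per_interval(getpages, interval_seconds):
    """Count distinct pages requested in each assessment interval.

    getpages: iterable of (timestamp_seconds, page_id) pairs (assumed format).
    Returns a dict mapping interval index to the number of distinct pages.
    """
    buckets = defaultdict(set)
    for timestamp, page in getpages:
        buckets[int(timestamp // interval_seconds)].add(page)
    return {index: len(pages) for index, pages in sorted(buckets.items())}


def pool_size_for(getpages, interval_seconds):
    """Pages needed so that no interval requires a Re-Read."""
    counts = distinct_getpages_per_interval(getpages, interval_seconds)
    return max(counts.values(), default=0)


if __name__ == "__main__":
    events = [(0, "A"), (10, "B"), (20, "A"), (610, "C"), (620, "C")]
    print(pool_size_for(events, 600))    # 10-minute assessment interval
    print(pool_size_for(events, 1200))   # 20-minute assessment interval
```

Running the same events with a longer interval yields a larger count, which mirrors the point made above: the Distinct Getpage number depends on the length of the assessment interval.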
What kind of recommendations other than Buffer Pool sizing can I expect?
There are hints for adjusting all Db2 thresholds that have an impact on Db2 performance. BPA4DB2 also identifies Re-Write operations that cause I/O problems. The I/O activity of each Db2 object is shown and analyzed in great detail, and every object is categorized. Another function provides grouping proposals for Db2 objects with similar access patterns. Memory shortages in the DBM1 address space are identified. SQL statements causing poor Buffer Pool performance are pointed out. For data sharing systems, BPA4DB2 supervises the Group Buffer Pool health status and highlights deviations in the six most important Group Buffer Pool characteristics.
How many measurements have to be done to get valid recommendations?
It depends. If you are tracking a specific performance issue and a specific workload, one initial measurement will reveal the probable problem and produce tuning recommendations, and a second measurement may already confirm success; if it does not, a third measurement after fine-tuning the Buffer Pool parameters will complete the investigation. But the preferable procedure does not count measurements. It acts proactively and sets up a regular measurement strategy. With BPA4DB2 this is quite easy, using its Host Examination Component. With an examination job scheduled for every relevant type of workload (e.g. online and batch), BPA4DB2 will collect information on a regular basis and notify the responsible Db2 administrator whenever performance problems seem to be emerging.
No, BPA4DB2 does not transfer a huge performance trace to the workstation.
Among interested parties and competitors there are a few misunderstandings about how BPA4DB2 works. One misunderstanding seems to be that a huge performance trace file has to be transferred to a workstation for analysis. Of course, BPA4DB2 does not do such silly things. Using meaningful algorithms, it drastically condenses the big output dataset produced by the Db2 performance trace. This process runs on the host system and results in a small flat file of less than one megabyte that could, in fact, be transferred to the workstation for analysis. But usually there is no transfer at all. By default, the resulting flat file is loaded automatically into BPA4DB2’s administration database within a host Db2 subsystem of your choice. Your workstation then connects to this database over the network and shows the analysis results in a friendly graphical application.
Why does BPA4DB2 produce such low CPU overhead?
Every kind of trace produces large amounts of output (because we want to see what is “inside”), and a traced application uses more CPU resources than it would with no trace enabled. So usually you will want to switch a performance trace on only when it is unavoidable. But this attitude conflicts with the desire to monitor Db2 performance on a regular basis, in order to take proactive measures that ensure satisfactory performance and avoid costs. We have found the solution: BPA4DB2 uses a sampling technique to collect performance data. Three parameters control the so-called measurement interval: trace duration, wait time, and number of runs. In a typical setup, the Db2 performance trace is activated for 2 seconds, followed by an 8-second wait period until the next trace activation. This sequence is repeated 150 times, resulting in a measurement interval of 25 minutes ((2 + 8) seconds * 150), of which the trace is active for 5 minutes (2 seconds * 150). This technique delivers more accurate results than enabling the performance trace for 5 minutes at a stretch, because a far longer period of Db2 activity can be observed. On the other hand, the sampling solution shows considerable CPU cost savings compared to enabling the trace for the whole 25 minutes.
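The arithmetic of the typical setup can be checked with a short Python sketch. The function and parameter names below are hypothetical and chosen only for illustration; they are not BPA4DB2 configuration keywords.

```python
def measurement_plan(trace_seconds, wait_seconds, runs):
    """Derive the measurement interval from the three sampling parameters."""
    interval_seconds = (trace_seconds + wait_seconds) * runs
    active_seconds = trace_seconds * runs
    return {
        "interval_minutes": interval_seconds / 60,
        "trace_minutes": active_seconds / 60,
        # Fraction of the measurement interval during which the trace is on.
        "trace_share": active_seconds / interval_seconds,
    }


if __name__ == "__main__":
    # Typical setup from the text: 2 s trace, 8 s wait, 150 runs.
    print(measurement_plan(trace_seconds=2, wait_seconds=8, runs=150))
```

With the typical values the trace is active for one fifth of the 25-minute measurement interval.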