Excessive undo block shipment and contention for undo buffers

I have seen this scenario in one my projects where  Oracle EBS workload is high.

When a select statement needs to read a block with active transactions, it has to cleanout the block and undo the changes to create a CR version. If the active transactions in the block belong to more than one instance, there is a need to combine local and remote undo information for the consistent read. That will result in CR requests for undo blocks. Depending on the amount of index blocks changed by multiple instances and the duration of the transactions, undo block shipment may become a bottleneck.

Usually that happens in applications that read recently inserted data very frequently, whereas committing infrequently . In that kind of workload, performance is degraded by the additional work related to creating CR buffers, even in single instance environments. In RAC, the problem is exacerbated by the fact that a subset of the undo information needs to be obtained from remote instances.

The symptoms for this kind of problem are not easily identifiable from a Statspack report in releases prior to Oracle10g. The main wait events are usually ‘global cache cr request’ and ‘buffer busy global CR’ (Oracle9i) or ‘gc buffer busy’ and ‘gc cr block 2/3 way’ in 10g, but there is no easy way to tell if the buffers involved are undo buffers or not. The view V$SEGMENT_STATISTICS does not include information about undo segments, so that the “Top CR Blocks Served per Segment” section in Statspack won’t help. Some V$SYSSTAT statistics can be helpful to measure how much rollback is done for consistent read, like “cleanouts and rollbacks – consistent read” and “cleanouts only – consistent read”. However, those statistics don’t tell if the cleanouts and rollbacks needed remote undo blocks in RAC.

The easiest way to determine if the CR requests are for undo is to query V$CR_BLOCK_SERVER (or X$KCLCRST). In that view, CURRENT_REQUESTS represent the number of requests for undo blocks, whereas CR_REQUESTS represents the number of requests for table and index blocks. In release 10g, V$CR_BLOCK_SERVER statistics are included in Statspack, making that analysis simpler.

As a general rule of thumb, any system where CURRENT_REQUESTS is greater than CR_REQUESTS needs attention, especially if there are significant buffer busy waits for CR.  It is also important to determine if most of the busy buffers are undo or data buffers. Although that information is available in V$WAITSTAT and in Statspack (Buffer wait statistics), there is no distinction between local and global buffer busy events. The most precise monitoring tool in this case is a SQL trace (event 10046) level 8, where p1 and p2 identifies file number and block number for the busy buffers.

Once investigation leads to undo block shipment bottleneck, the next step is to find out what index segments the pending transactions were changing. That is not a trivial task, as there is no direct way to correlate the undo blocks shipped to the pending changes in index blocks that created the need for remote undo. The segments with most current blocks shipped in Statspack are candidates, but it’s possible that a segment is subject to write-write concurrency causing current blocks to be shipped, and yet have relatively small consistent read activity. Drilling down to a SQL trace level 8 is again necessary. From there the waits for undo segments can be linked to particular SQL statements, and the execution plan for the SQL statements indicate the index segments accessed.

In this section I will present some specific tuning techniques to reduce undo block CR requests, as well as revisit the tuning techniques presented in section 2 and discuss how they affect undo block CR requests.

The first two techniques presented here can also be used to improve performance in single instance applications subject to excessive rollbacks for consistent read.

1.1 Committing more frequently

Shorter transactions will reduce the likelihood that any given index block in the cache contains uncommitted data, thereby reducing the need to access undo information for consistent read.

Advantages:-

  • Significantly reduces the frequency of rollbacks for consistent read, acting against the cause of the problem.
  • Effective for single instance as well as for RAC.

Drawbacks:-

  • Only applicable to applications that are flexible enough to control commit frequency without significant code changes.
  • May increase waits for ‘log file sync’. However, if contention for undo is high, it’s not likely that the additional ‘log file sync’ waits would be significant.

1.2 Setting _row_cr=TRUE

The RowCR feature, controlled by the hidden parameter _row_cr (true/false), avoids unnecessary cleanouts and rollbacks for data accesses involving index unique scans. If the row read by a query is not locked by any current or committed (delayed logging cleanout) transaction, the cleanout is avoided.  However, this feature is currently under development and is not fully supported. It can be used for experimentation and diagnosing purposes only, but cannot be applied to production systems.

Advantages:-

  • Reduces the frequency of rollbacks for consistent read without the need to change the application.
  • Effective for single instance as well as for RAC.

Drawbacks:-

  • Currently not supported for production systems. Only useful as a diagnosing tool in test systems.
  • Only improves index unique scans.  Any range scans, including equality predicates against non-unique indexes, will not be affected.

1.3 Increasing sequence cache sizes

As explained in section 2, this technique reduces inter-instance concurrent access to index leaf blocks. CR versions of index blocks modified by only one instance can be fabricated without the need of remote undo information.

Advantages:-

  • Ease of configuration, availability and manageability. No need to rebuild any segment.
  • Reduces the need for remote undo blocks during consistent read rollbacks.

Drawbacks:-

  • Only applicable to indexes deriving their values from Oracle sequences.
  • Only applicable if ordering is not an issue, i.e., sequences must be configured as CACHE NOORDER.
  • Huge gaps in sequence values will happen when instances are stopped and restarted. That should not be an issue if the only purpose of the sequence is to generate a unique identifier, since the maximum possible sequence value is 10E27.

1.4 Reversing the index key

As explained before, this technique increases the number of blocks concurrently modified by multiple instances. Although that is very useful when the problem is contention for index blocks, more blocks changed implies potentially more blocks rolled back for consistent read, which in turn means more undo block transfers.
Advantages:-
None. Not recommended if the bottleneck is excessive undo block shipment.

1.5 Partitioning the table by hash

Similarly to the reverse key index, this technique increases the number of blocks concurrently modified by multiple instances. Although that is very useful when the problem is contention for index blocks, more blocks changed implies potentially more blocks rolled back for consistent read, which in turn means more undo block transfers.

Advantages:-
None. Not recommended if the bottleneck is excessive undo block shipment.

1.6 Partitioning the table by list on INSTANCE_NUMBER

As explained before, this technique directs INSERT statements executed from different instances to different partitions. CR versions of index blocks modified by only one instance can be fabricated without the need of remote undo information.

Advantages:-

Practically eliminates the need for remote undo blocks in consistent read cleanouts and rollbacks.

Drawbacks:-

  • As the partition key is defined on a column unknown by the application, no queries will result in partition elimination. As a consequence, all partitions will always be scanned for all queries.
  • The fact that all partitions are scanned may increase the amount of index blocks shipped for queries substantially, since recently modified data is queried frequently. Depending on the workload, the additional waits for index blocks may overcome the benefits of eliminating waits for undo.
  • Implements schema changes that need to be maintained across application releases.
  • Schema changes may need to take place if instances are added to the cluster. A possible alternative is to create as many partitions as the maximum number of instances planned for the application.
  • May imply changes to primary key and unique constraints if local indexes are used for them.

Additional tuning advice

Successful index contention tuning depends on proper problem diagnosing, choosing the appropriate tuning technique and extensive testing. Some important tuning aspects that should always be kept in mind are given bellow.

1.1 Incomplete investigation may lead to the wrong approach

Suppose that the main wait event for a particular workload running Oracle10g is ‘gc buffer busy’, and V$SEGMENT_STATISTICS shows a particular index segment as the main contributor to ‘gc buffer busy’. Based on that information, partitioning the index or reversing its key appears to be the right thing to do. Before choosing one of those techniques, though, one should investigate if there are ‘gc buffer busy’ waits for undo, as undo segments are not listed in V$SEGMENT_STATISTICS. A comparison between the total number of waits for ‘gc buffer busy’ in V$SEGMENT_STATISTICS and the correspondent value in V$SYSTEM_EVENT.TOTAL_WAITS would help.  If most of those waits were for undo, then neither partitioning by hash nor reversing the index key would be appropriate.

1.2 Combining techniques does not always generate combined benefits

Hash partitioning the index reduces buffer busy waits; increasing sequence cache values provides instance affinity to index leaf blocks. Based on those correct statements, one may believe that combining the techniques should provide better results than employing them separately.

That is not necessarily true, because hash partitioning reduces the instance affinity to blocks provided by sequence caching. To demonstrate with an example, suppose we have an index with 200 keys per leaf block on average, and the keys are generated by a sequence defined with cache=4000. That means each cache should populate 4000/200=20 index leaf blocks on average. As a result, the index would end up having regions of approximately 20 leaf blocks populated by each instance, assuming uniform insert rate across instances. If that index were partitioned 4-way, those 4000 values in the sequence cache would be stored in 4 different partitions, with 1000 keys in each on average. As there are 200 keys per block, each cache would populate 1000/200=5 leaf blocks in each partition. Note that instance affinity to leaf blocks was reduced 4 times. Generalizing, instance affinity provided by sequence caches is reduced N times when the index is partitioned N-way.

Since instance affinity to index leaf blocks also helps to reduce buffer busy waits, there may be no need to employ hash partitioning for indexes deriving key values from sequences. If the application tolerates unbounded high sequence cache settings, it’s usually better to increase the cache until performance is acceptable, rather than trying to combine sequence caching and hash partitioning.

1.3 Exhaustive testing of alternatives leads to best results

The first interaction of employing a tuning technique may not lead to the expected results. Hash partitioning, for example, may not yield substantial improvements if the number of partitions chosen is too small, or if global cache defers are enabled. For that particular technique, as well as for reverse key indexes, disabling global cache defers should always be tested.

Tuning never end…. But we are closing this short sessions on Tuning Indexes on RAC. Your Suggestions and Questions are welcome !!

Other Sessions for this Training can be found @

Tuning indexes on RAC environments-1

Tuning indexes on RAC environments-2

Tuning indexes on RAC environments-3

Tuning indexes on RAC environments-4

Watch this place and become a tuning expert….. Subscribe by Email to get each day sessions in your inbox.

Enter your email address:

Delivered by FeedBurner

Readers who are interested on doing hands on practice, please feel free to contact us

PrintFriendly

Viewers of this post also viewed...