Wednesday, February 24, 2016

Slow full table scan due to row chaining

Few days ago I’ve received complaint that simple count on 2 million rows table is running forever.

This was the statement:
select count(1)
from CLIENT k
where k.expires is null;

I've used fake names for table name and columns.
Database version: 11.2.0.4.0

Indeed, query was running longer than I would expect. Oracle was using FULL SCAN of the table with "db file sequential read" wait events. This was little odd to me as I would expect "direct path reads" or "db file scattered reads".


It was partitioned table with 4 partitions and 294 columns.

select count(*) from dba_tab_columns where table_name = 'CLIENT';
 
 COUNT(*)
----------
      294

select owner, segment_name, partition_name, bytes, blocks
from dba_segments
where segment_name in ('CLIENT');
 
OWNER      SEGMENT_NAME    PARTITION_NAME            BYTES     BLOCKS
---------- --------------- -------------------- ---------- ----------
SCOTT       CLIENT         CLIENT_OTHER          8388608        1024
SCOTT       CLIENT         CLIENT_CITY           1643118592     200576
SCOTT       CLIENT         CLIENT_CNTR           591396864      72192
SCOTT       CLIENT         CLIENT_STRNG          52428800       6400


select table_name, partition_name, NUM_ROWS, AVG_ROW_LEN
from dba_tab_partitions
where table_name='CLIENT';

TABLE_NAME                     PARTITION_NAME           NUM_ROWS    AVG_ROW_LEN
------------------------------ ----------------------- ----------- ---------------
CLIENT                          CLIENT_OTHER             0            0
CLIENT                          CLIENT_CITY              1469420      572
CLIENT                          CLIENT_CNTR              592056       495
CLIENT                          CLIENT_STRNG             48977        565


select table_name, data_type, count(*)
from dba_tab_cols
where table_name='CLIENT'
group by table_name, data_type
order by 3 desc;
 
TABLE_NAME DATA_TYPE                                  COUNT(*)
---------- ---------------------------------------- ----------
CLIENT   NUMBER                                          191
CLIENT   VARCHAR2                                         70
CLIENT   DATE                                             32
CLIENT   TIMESTAMP(6)                                      3
CLIENT   RAW                                               2
CLIENT   CL_UTR                                            1
CLIENT   O_TIP_KAR                                         1
CLIENT   O_ZA_NA                                           1
CLIENT   O_PO_OSO                                          1

Some of the columns were collections.

select type_name, typecode
from dba_types
where type_name in (select data_type
                   from dba_tab_cols
                   where table_name='CLIENT'
                   and data_type not in ('NUMBER','VARCHAR2',
                   'DATE','TIMESTAMP(6)','RAW'));
 
TYPE_NAME                      TYPECODE                     
------------------------------ ------------------------------
CL_UTR                         COLLECTION                    
O_TIP_KAR                      COLLECTION                    
O_ZA_NA                        COLLECTION                    
O_PO_OSO                       COLLECTION                    

These were varrays used to store multivalued attributes.


In trace I've seen lots disk reads and elapsed time over 2400 seconds.

select count(1)
  from CLIENT k
where k.expires is null

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2    203.96    2450.19    5455717    8240323          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4    203.97    2450.20    5455717    8240323          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 369  (MSUTIC)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=8240323 pr=5455717 pw=0 time=1349733885 us)
   1905617    1905617    1905617   PARTITION LIST ALL PARTITION: 1 4 (cr=8240323 pr=5455717 pw=0 time=2449532855 us cost=164110 size=3801914 card=1900957)
   1905617    1905617    1905617    TABLE ACCESS FULL CLIENT PARTITION: 1 4 (cr=8240323 pr=5455717 pw=0 time=2448530798 us cost=164110 size=3801914 card=1900957)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (AGGREGATE)
1905617    PARTITION LIST (ALL) PARTITION: START=1 STOP=4
1905617     TABLE ACCESS   MODE: ANALYZED (FULL) OF 'CLIENT' (TABLE) 
                PARTITION: START=1 STOP=4

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to CLIENT                       2        0.00          0.00
  Disk file operations I/O                       29        0.00          0.00
  direct path read                             2048        0.19          9.78
  db file sequential read                   5178860        0.23       2241.08
  resmgr:internal state change                    2        0.11          0.21
  SQL*Net message from CLIENT                     1        0.00          0.00

Object statistics were telling me that all reads were from table partitions.

Session Objects Statistics
Object/Event % Time Seconds Calls - Time per Call -
Avg Min Max
Obj#(299564)
    db file sequential read 78.1% 1,757.0600s 3,677,752 0.0005s 0.0001s 0.2333s
    direct path read 0.4% 8.8314s 1,706 0.0052s 0.0004s 0.1953s
    resmgr:internal state change 0.0% 0.2162s 2 0.1081s 0.1000s 0.1162s
    Disk file operations I/O 0.0% 0.0014s 23 0.0001s 0.0000s 0.0002s
Obj#(299565)
    db file sequential read 20.5% 462.5006s 1,416,370 0.0003s 0.0001s 0.1794s
    direct path read 0.0% 0.8966s 304 0.0029s 0.0001s 0.0479s
    Disk file operations I/O 0.0% 0.0003s 6 0.0000s 0.0000s 0.0000s
Obj#(299566)
    db file sequential read 1.0% 21.5203s 84,738 0.0003s 0.0001s 0.0552s
    direct path read 0.0% 0.0587s 38 0.0015s 0.0000s 0.0206s


Hm… why am I having so many db file sequential reads with direct path reads happening also?
This is a table with lots of columns so I might have problems with chained or migrated rows.
Oracle is probably using individual block reads to fetch pieces of each row.

As I had table with more than 255 columns I would expect intra-block chaining, but this shouldn't cause sequential reads. Only if row doesn’t fit in the block I would have regular row chaining.
I’m probably having problems with row migrations.

Chained row is a row that is too large to fit into a block and if this is the root cause of the problem there isn't much I can do to improve performance. If it’s too big to fit into a block, it would be too big after rebuilding table also.

Migration of an row occurs when row is updated in a block and amount of free space in the block is not adequate to store all the row’s data. Row is migrated to another physical block.
This usually happens when you have PCTFREE set to low.

What is important for migrated rows - you can improve performance reorganizing table/partition or simply deleting/inserting chained rows.

Tanel wrote blog post on the subject "Detect chained and migrated rows in Oracle – Part 1” and I’ve decided to use his great tool Snapper to get some diagnostic info.

SQL> @sn 60 6596
@snapper all 60 1 "6596"
Sampling SID 6596 with interval 60 seconds, taking 1 snapshots...
 
-- Session Snapper v4.06 BETA - by Tanel Poder ( http://blog.tanelpoder.com ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)
 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   SID, USERNAME  , TYPE, STATISTIC                                                 ,         DELTA, HDELTA/SEC,    %TIME, GRAPH       , NUM_WAITS,  WAITS/SEC,   AVERAGES
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  6596, MSUTIC    , STAT, session logical reads                                     ,        283813,      4.74k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, user I/O wait time                                        ,          5719,      95.46,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, non-idle wait time                                        ,          5719,      95.46,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, non-idle wait count                                       ,        193388,      3.23k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, session pga memory                                        ,         -8400,    -140.21,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, enqueue requests                                          ,             2,        .03,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, enqueue releases                                          ,             2,        .03,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical read total IO requests                           ,        193740,      3.23k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical read total multi block requests                  ,           353,       5.89,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical read total bytes                                 ,    1630494720,     27.21M,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, cell physical IO interconnect bytes                       ,    1630494720,     27.21M,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, consistent gets                                           ,        283812,      4.74k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, consistent gets direct                                    ,        283810,      4.74k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical reads                                            ,        199034,      3.32k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical reads direct                                     ,        199034,      3.32k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical read IO requests                                 ,        193739,      3.23k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical read bytes                                       ,    1630486528,     27.21M,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, file io wait time                                         ,      57195780,    954.66k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, Number of read IOs issued                                 ,           353,       5.89,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, no work - consistent read gets                            ,        283808,      4.74k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, table scan rows gotten                                    ,       2881106,     48.09k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, table scan blocks gotten                                  ,         83578,       1.4k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, table fetch continued row                                 ,        200188,      3.34k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, buffer is not pinned count                                ,        200226,      3.34k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , TIME, DB CPU                                                    ,       5620720,    93.82ms,     9.4%, [@         ],          ,           ,
  6596, MSUTIC    , TIME, sql execute elapsed time                                  ,      60270147,      1.01s,   100.6%, [##########],          ,           ,
  6596, MSUTIC    , TIME, DB time                                                   ,      60270147,      1.01s,   100.6%, [##########],          ,           ,          ~ unaccounted time
  6596, MSUTIC    , WAIT, Disk file operations I/O                                  ,           123,     2.05us,      .0%, [          ],         2,        .03,     61.5us average wait
  6596, MSUTIC    , WAIT, db file sequential read                                   ,      57234629,   955.31ms,    95.5%, [WWWWWWWWWW],    192888,      3.22k,   296.72us average wait
 
--  End of Stats snap 1, end=2016-02-23 13:23:19, seconds=59.9
 
----------------------------------------------------------------------------------------------------
Active% | INST | SQL_ID          | SQL_CHILD | EVENT                               | WAIT_CLASS
----------------------------------------------------------------------------------------------------
   97% |    1 | 2q92xdvxjj712   | 0         | db file sequential read             | User I/O
    3% |    1 | 2q92xdvxjj712   | 0         | ON CPU                              | ON CPU
 
--  End of ASH snap 1, end=2016-02-23 13:23:19, seconds=60, samples_taken=99
 
PL/SQL procedure successfully completed.

Notice "table fetch continued row" statistic. Tanel wrote that this counter is usually increasing when rows are accessed with index access paths.
In my case I have full scan that is increasing the value. This count is number of chained pieces Oracle had to go through in order to find the individual pieces of the rows.
I won’t go any further in detail - just check Tanel’s blog post.


Let’s identify chained rows running ANALYZE command with the LIST CHAINED ROWS option. This command will collect information about each migrated or chained row.

SQL> analyze table SCOTT.CLIENT list chained rows;
 
Table analyzed.
 
SQL> select count(*) from chained_rows;
 
 COUNT(*)
----------
  2007045

SQL> select partition_name, count(*) from chained_rows group by partition_name;
 
PARTITION_NAME                   COUNT(*)
------------------------------ ----------
CLIENT_CITY                       1411813
CLIENT_CNTR                       552873
CLIENT_STRNG                      42359

Table with 2097647 rows has 2007045 chained/migrated rows. This was causing so many reads for simple full scan of the small table.


I have decided to rebuild table partitions without changing PCTFREE parameter to fit migrated rows into a single block.


After rebuild number of chained rows decreased.

SQL> analyze table SCOTT.CLIENT list chained rows;
 
Table analyzed.
 
SQL> select count(*) from chained_rows;
 
 COUNT(*)
----------
    37883

Now query finished in 14 secs without sequential reads happening.
select  count(1)
  from CLIENT k
  where k.expires is null

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      2.34      13.96     185802     185809          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      2.34      13.96     185802     185809          0           1

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 369  (MSUTIC)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=185809 pr=185802 pw=0 time=13965941 us)
   1905617    1905617    1905617   PARTITION LIST ALL PARTITION: 1 4 (cr=185809 pr=185802 pw=0 time=13560379 us cost=109526 size=3811234 card=1905617)
   1905617    1905617    1905617    TABLE ACCESS FULL CLIENT PARTITION: 1 4 (cr=185809 pr=185802 pw=0 time=12848619 us cost=109526 size=3811234 card=1905617)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (AGGREGATE)
1905617    PARTITION LIST (ALL) PARTITION: START=1 STOP=4
1905617     TABLE ACCESS   MODE: ANALYZED (FULL) OF 'CLIENT' (TABLE) 
                PARTITION: START=1 STOP=4

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to CLIENT                       2        0.00          0.00
  direct path read                             3569        0.11          8.99
  SQL*Net message from CLIENT                     2        0.00          0.01


Snapper also showed that I don’t have problem with row chaining.
SQL> @sn 15 6601
@snapper all 15 1 "6601"
Sampling SID 6601 with interval 15 seconds, taking 1 snapshots...
 
-- Session Snapper v4.06 BETA - by Tanel Poder ( http://blog.tanelpoder.com ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)
 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 SID, USERNAME  , TYPE, STATISTIC                                                 ,      DELTA, HDELTA/SEC, %TIME, GRAPH    , NUM_WAITS,  WAITS/SEC,   AVERAGES
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   6601, MSUTIC , STAT, Requests to/from CLIENT                                ,          1,     .07,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, user calls                                                ,          1,        .07,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, pinned cursors current                                 ,         -1,    -.07,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, session logical reads                                  ,     149590,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, CPU used when call started                             ,        227,   15.02,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, CPU used by this session                               ,        227,   15.02,      ,             ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, DB time                                                   ,       1047,   69.29,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, user I/O wait time                                     ,        424,   28.06,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, non-idle wait time                                     ,        424,   28.06,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, non-idle wait count                                    ,       3216,  212.84,      ,          ,       ,           ,       ~ per execution
   6601, MSUTIC , STAT, session uga memory                                     ,     135248,   8.95k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical read total IO requests                        ,       9354,  619.07,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical read total multi block requests               ,       9333,  617.68,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical read total bytes                              , 1225228288,  81.09M,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, cell physical IO interconnect bytes                    , 1225228288,  81.09M,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, consistent gets                                        ,     149578,    9.9k,      ,             ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, consistent gets from cache                             ,          5,     .33,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, consistent gets from cache (fastpath)                  ,          5,     .33,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, consistent gets direct                                 ,     149572,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, logical read bytes from cache                          ,      40960,   2.71k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical reads                                            ,     149548,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical reads direct                                  ,     149548,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical read IO requests                              ,       9353,  619.01,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical read bytes                                    , 1225097216,  81.08M,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, calls to kcmgcs                                           ,          5,     .33,      ,          ,          ,        ,       ~ per execution
   6601, MSUTIC , STAT, file io wait time                                      ,        304,   20.12,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, total number of slots                                  ,         -2,    -.13,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, Effective IO time                                      ,    4239980, 280.61k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, Number of read IOs issued                              ,       9354,  619.07,      ,          ,       ,        ,          ~ per execution
   6601, MSUTIC , STAT, no work - consistent read gets                         ,     149564,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, Cached Commit SCN referenced                           ,     149132,   9.87k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, table scans (cache partitions)                         ,          3,      .2,         ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, table scans (direct read)                              ,          3,      .2,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, table scan rows gotten                                 ,    3518684, 232.88k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, table scan blocks gotten                               ,     149559,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, bytes sent via SQL*Net to CLIENT                       ,        211,   13.96,      ,          ,       ,           ,   105.5 bytes per roundtrip
   6601, MSUTIC , STAT, bytes received via SQL*Net from CLIENT                 ,          8,     .53,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, SQL*Net roundtrips to/from CLIENT                      ,          2,     .13,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , TIME, DB CPU                                                    ,    2000964,   132.43ms, 13.2%, [@@     ],       ,        ,
   6601, MSUTIC , TIME, sql execute elapsed time                               ,    8500210,   562.57ms,    56.3%, [###### ],       ,        ,
   6601, MSUTIC , TIME, DB time                                                   ,    8500269,   562.57ms,    56.3%, [###### ],       ,        ,     14.62s unaccounted time
   6601, MSUTIC , WAIT, direct path read                                          ,    4059380,   268.66ms, 26.9%, [WWW    ],      3064,  202.78,  1.32ms average wait
   6601, MSUTIC , WAIT, SQL*Net message to CLIENT                              ,          4,   .26us,   .0%, [       ],      1,     .07,        4us average wait
   6601, MSUTIC , WAIT, SQL*Net message from CLIENT                            ,    8006127,   529.87ms,    53.0%, [WWWWWW ],      1,     .07,   8.01s average wait
 
--  End of Stats snap 1, end=2016-02-24 08:23:59, seconds=15.1
 
 
----------------------------------------------------------------------------------------------------
Active% | INST | SQL_ID       | SQL_CHILD | EVENT                               | WAIT_CLASS
----------------------------------------------------------------------------------------------------
 29% | 1 | gg54c4j6b9jb0   | 0         | direct path read                    | User I/O
 21% | 1 | gg54c4j6b9jb0   | 0         | ON CPU                           | ON CPU
 
--  End of ASH snap 1, end=2016-02-24 08:23:59, seconds=15, samples_taken=96


Reorganizing table solved my problem. Finally full scans on the table were running much faster.

There is interesting support note "Doc ID 238519.1" which states that trailing NULLs do not take space in the rowpiece: initially row fits in one rowpiece.
If column beyond 255 is then populated, then all the NULL columns between last populated and this new column now takes up space.
Row has to be split into two rowpieces and the new rowpiece is migrated to a new block - row becomes chained.

In our table we have trailing NULL columns so this probably caused such migration.


Unfortunately I don’t have time to perform detailed investigation.




REFERENCES
http://blog.tanelpoder.com/2009/11/04/detect-chained-and-migrated-rows-in-oracle/
Updating a Row with More Than 255 Columns Causes Row Chaining (Doc ID 238519.1)

5 comments:

  1. Thanks Marko for sharing it! A really great stuff! Doc ID 238519.1 is also really helpful!

    ReplyDelete
  2. It would be interesting to simulate row migrations, take block dumps but daily work won't wait :) Thx for comment!

    ReplyDelete
  3. Thank you for this post,
    Foued

    ReplyDelete