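Problem Overview
During a routine inspection, one database showed a large number of sessions waiting on SQL*Net message from dblink. This database uses dblinks heavily, but it had never had so many sessions waiting on the dblink event at the same time. A check of the dblink's target database found a large number of direct path read waits there.
## The source database shows a large number of SQL*Net message from dblink waits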
INST_ID EVENT COUNT(1)
-------- ---------------------------------------- --------------------
2 SQL*Net message from dblink 238
1 SQL*Net message from dblink 229
1 db file scattered read 1
2 gc cr request 1
1 Backup: MML write backup piece 1
1 latch: shared pool 1
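The statement that produced this summary is not shown. A minimal sketch of one way to get a comparable per-instance count, assuming access to `gv$session`; the `wait_class <> 'Idle'` filter is an assumption and the original query may have differed:

```sql
-- Count sessions per instance and wait event, excluding idle waits.
-- Assumption: this approximates the summary above; it is not the
-- author's exact statement.
select inst_id, event, count(1)
  from gv$session
 where wait_class <> 'Idle'
 group by inst_id, event
 order by count(1) desc;
```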
## Check the sessions waiting on this event
SQL> set linesize 500
SQL> set pagesize 400
SQL> col event for a30
SQL> col machine for a25
SQL> col program for a35
SQL> col sql_id for a14
SQL> col spid for a10
SQL> col USERNAME for a20
SQL> select a.inst_id,a.sid,a.serial#,a.LOGON_TIME,a.PREV_SQL_ID,b.spid,a.username,a.event,a.machine,a.program,a.sql_id,a.blocking_session,a.status
2 from gv$session a,gv$process b
3 where a.paddr=b.addr
4 and a.inst_id=b.inst_id
5 and event like '%&event%';
Enter value for event: SQL*Net message from dblink
old 5: and event like '%&event%'
new 5: and event like '%SQL*Net message from dblink%'
INST_ID SID SERIAL# LOGON_TIME PREV_SQL_ID SPID USERNAME EVENT MACHINE PROGRAM SQL_ID BLOCKING_SESSION STATUS
------- -------------------- -------------------- ----------------- --------------------------------------- ---------- -------------------- ------------------------------ ------------------------- ----------------------------------- -------------- -------------------- -------
1 1551 40170 20221108 23:02:35 242uywrzq8anf 20316696 username SQL*Net message from dblink machine001 JDBC Thin Client 0g0r55g7zvbc6 ACTIVE
1 1659 53863 20221108 23:17:35 242uywrzq8anf 27918786 username SQL*Net message from dblink machine001 JDBC Thin Client 0g0r55g7zvbc6 ACTIVE
1 1844 51421 20221108 23:32:48 242uywrzq8anf 17433048 username SQL*Net message from dblink machine001 JDBC Thin Client 0g0r55g7zvbc6 ACTIVE
## Find the SQL statement behind the waits
SQL> select sql_text from v$sql where sql_id='&sql_id';
Enter value for sql_id: 0g0r55g7zvbc6
old 1: select sql_text from v$sql where sql_id='&sql_id'
new 1: select sql_text from v$sql where sql_id='0g0r55g7zvbc6'
SQL_TEXT
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
select db.*,rownum from (select "ID","FL_NAME","BUREAU_CODE","PARENT_ID","IS_LEAF","CLASSIFY_ID" from "table1" where 1=1 and "PARENT_ID" = '00000000000001' ) db where rownum<200000
## Check the object the statement references
SQL> select OWNER,OBJECT_NAME,OBJECT_TYPE,STATUS from dba_objects where OBJECT_NAME='table1';
OWNER OBJECT_NAME OBJECT_TYPE STATUS
------------- ----------------------------- ----------------- ----------------------
username table1 VIEW VALID
## Check the view definition
select OWNER,VIEW_NAME,TEXT from dba_views where VIEW_NAME='table1';
## The view accesses a table in another database through a synonym
SQL> select OWNER,OBJECT_NAME,OBJECT_TYPE,STATUS from dba_objects where OBJECT_NAME='table2';
OWNER OBJECT_NAME OBJECT_TYPE STATUS
-------------- ----------------------------- ------------- --------------------
usernm table2 SYNONYM VALID
## Check the synonym details
SQL> SELECT * FROM DBA_SYNONYMS WHERE SYNONYM_NAME='table2';
OWNER SYNONYM_NAME TABLE_OWNER TABLE_NAME DB_LINK
----------------- --------------------------------- ------------------------- -------------------------- ---------------------
user table2 user1 DM_xxxxx_LOCATION LINK_xxxxxx_SBDB
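To confirm which database the link in the DB_LINK column points to, the link definition can be looked up in `DBA_DB_LINKS`. This is a sketch only: `&db_link` is a SQL*Plus substitution variable introduced here for illustration, because the real link name is masked in the output above.

```sql
-- Show the connect user and target (HOST) of a database link.
-- &db_link is a hypothetical substitution variable: enter the DB_LINK
-- value reported by DBA_SYNONYMS. The trailing % allows for a domain suffix.
select owner, db_link, username, host, created
  from dba_db_links
 where db_link like '&db_link%';
```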
## Log in to the target database and check its wait events: there are many direct path read waits, plus enq: KO - fast object checkpoint
INST_ID EVENT COUNT(1) WAIT_CLASS
-------- -------------------------------------------------- ---------- ------------------------------
1 direct path read 207 User I/O
1 db file scattered read 4 User I/O
1 log file sync 3 Commit
1 db file parallel read 2 User I/O
1 enq: KO - fast object checkpoint 2 Application
1 db file parallel write 1 System I/O
1 log file parallel write 1 System I/O
2 direct path read 253 User I/O
2 db file scattered read 4 User I/O
2 db file parallel write 3 System I/O
2 enq: KO - fast object checkpoint 3 Application
2 control file sequential read 1 System I/O
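One way to tie these waits back to individual statements on the target is to group the waiting sessions by SQL_ID. The query below is a sketch under the assumption that `gv$session` is accessible on the target; it is not taken from the original investigation.

```sql
-- On the target database: group sessions by event and SQL_ID to see
-- which statements are behind the direct path read and KO enqueue waits.
select inst_id, event, sql_id, count(1)
  from gv$session
 where event in ('direct path read', 'enq: KO - fast object checkpoint')
 group by inst_id, event, sql_id
 order by count(1) desc;
```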
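Root Cause
Before Oracle performs a direct path read, it must run an object-level checkpoint on the object being read. A direct path read takes blocks straight from disk and bypasses the buffer cache, so the blocks on disk must be current. The session therefore waits for the object's dirty blocks to be written to disk, and that wait is reported as enq: KO - fast object checkpoint.
Possible causes of high direct path read waits include:
- Many disk sorts that cannot complete in the sort area and spill to the temp tablespace (since 10g these temp reads are reported under the separate event direct path read temp).
- Many hash joins that use the temp tablespace to hold the hash area.
- Parallel execution of SQL statements.
- Full scans of large tables. Oracle 11g changed the full table scan algorithm: based on the table size, the buffer cache size, and other information, it decides whether to bypass the SGA and read directly from disk. 10g reads everything through the buffer cache, referred to as table scan (large). 11g assumes that direct path reads for a large full table scan can be faster than the scattered reads (db file scattered read) used in 10g, and that they take fewer latches.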
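To get a rough idea of which of these causes dominates, the instance-level statistics can be compared. The sketch below assumes the statistic names listed exist in this release (they are the names used in 11g); values are cumulative since instance startup, so two samples taken some time apart are more telling than a single reading.

```sql
-- Compare direct reads overall, direct reads from temp, and
-- full scans of long tables vs. scans done with direct reads.
-- Assumption: these statistic names are present in this version.
select name, value
  from v$sysstat
 where name in ('physical reads direct',
                'physical reads direct temporary tablespace',
                'table scans (long tables)',
                'table scans (direct read)');
```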
## The execution plan of this SQL statement is as follows:
--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2193K(100)| | | |
|* 1 | COUNT STOPKEY | | | | | | | |
|* 2 | VIEW | | 9999 | 2060K| 2193K (1)| 07:18:46 | | |
|* 3 | COUNT STOPKEY | | | | | | | |
| 4 | VIEW | | 281K| 56M| 2193K (1)| 07:18:46 | | |
| 5 | UNION-ALL | | | | | | | |
|* 6 | COUNT STOPKEY | | | | | | | |
| 7 | PARTITION LIST ALL | | 1863K| 79M| 16 (69)| 00:00:01 | 1 | 2 |
| 8 | PARTITION LIST ALL | | 1863K| 79M| 16 (69)| 00:00:01 | 1 | 18 |
|* 9 | TABLE ACCESS FULL | DM_xxxxxxxx_LOCATION | 1863K| 79M| 16 (69)| 00:00:01 | 1 | 36 |
|* 10 | COUNT STOPKEY | | | | | | | |
| 11 | PARTITION LIST ALL | | 1863K| 79M| 16 (69)| 00:00:01 | 1 | 2 |
| 12 | PARTITION LIST ALL | | 1863K| 79M| 16 (69)| 00:00:01 | 1 | 18 |
|* 13 | TABLE ACCESS FULL | DM_xxxxxxxx_LOCATION | 1863K| 79M| 16 (69)| 00:00:01 | 1 | 36 |
| 14 | PARTITION LIST ALL | | 280K| 55M| 2193K (1)| 07:18:43 | 1 | 2 |
| 15 | PARTITION LIST ALL | | 280K| 55M| 2193K (1)| 07:18:43 | 1 | 18 |
|* 16 | TABLE ACCESS FULL | DM_xxxxxxxx_LOCATION | 280K| 55M| 2193K (1)| 07:18:43 | 1 | 36 |
| 17 | INLIST ITERATOR | | | | | | | |
| 18 | TABLE ACCESS BY INDEX ROWID| OMS_DM_xxxxxxxx_SUBSTATION | 1590 | 149K| 252 (0)| 00:00:04 | | |
|* 19 | INDEX RANGE SCAN | IDX_DM_FL_xxxxxxx_ID | 1590 | | 9 (0)| 00:00:01 | | |
--------------------------------------------------------------------------------------------------------------------------------
OBJECT SIZE
****************************************************************************************
SEGMENT
OWNER NAME SEGMENT_TYPE S_SIZE
--------------- ----------------------------------- ------------------ ----------
user ***OMS_DM_xxxxxxxx_SUBSTATION TABLE 37
IDX_DM_FL_CLASSFY_ID INDEX 2
user1 ***DM_xxxxxxxx_LOCATION TABLE SUBPARTITION 88922
IDX_DM_FL_xxxxxxx_ID INDEX SUBPARTITION 3410
PK_DM_xxxxxxxx_LOCATION INDEX 6985
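The report above appears to come from a script that is not shown, and the unit of S_SIZE is not stated. A minimal sketch of a comparable size check against `DBA_SEGMENTS`, reporting MB as an assumption; `&segment_name` is a hypothetical substitution variable, since the real object names are masked:

```sql
-- Sum segment sizes per object; subpartitions of the same segment
-- are added together. Unit: MB (an assumption, see above).
select owner, segment_name, segment_type,
       round(sum(bytes) / 1024 / 1024) as size_mb
  from dba_segments
 where segment_name like '&segment_name'
 group by owner, segment_name, segment_type
 order by size_mb desc;
```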
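Direct path reads are mainly meant to reduce the buffer cache footprint of full table scans. Because a direct path read requires an object-level checkpoint on the object being read, it can cause I/O jitter. The execution plan of this SQL statement contains full table scans, and because the table is large, Oracle judged that direct path reads would be faster than reading through the buffer cache.
Solution
1. Preferably, tune the SQL statement to eliminate the full table scans.
2. Set the hidden parameter "_serial_direct_read"=NEVER.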
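The second option could look roughly like the sketch below. It assumes the parameter can be changed dynamically in this release; if it cannot, `scope = spfile` plus a restart would be needed instead. Because this is a hidden parameter, it is usually tested at session level first and changed instance-wide only after agreement with Oracle Support.

```sql
-- Check the current value (run as SYS; x$ tables are not visible
-- to ordinary users).
select a.ksppinm as name, b.ksppstvl as value
  from x$ksppi a, x$ksppcv b
 where a.indx = b.indx
   and a.ksppinm = '_serial_direct_read';

-- Try the change in one session first and re-run the problem SQL.
alter session set "_serial_direct_read" = never;

-- Then apply it to all RAC instances (assumes the parameter is
-- dynamically modifiable in this version).
alter system set "_serial_direct_read" = never scope = both sid = '*';
```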
References
1. Enq: KO - Fast Object Checkpoint Wait Event (Doc ID 2547319.1)