A few days ago a friend and I were discussing the advanced features of Oracle Net Services, so I looked into them.
The official Oracle documentation is here:
Enabling Advanced Features of Oracle Net Services
http://download.oracle.com/docs/cd/B19306_01/network.102/b14212/advcfg.htm#i473297
That document covers several Net Services features:
· Configuring Advanced Network Address and Connect Data Information
· Configuring Runtime Connection Load Balancing
· Configuring Transparent Application Failover
· Configuring Connections to Non-Oracle Database Services
This post focuses on TAF.
Configuring Transparent Application Failover
http://download.oracle.com/docs/cd/B19306_01/network.102/b14212/advcfg.htm#i475648
TAF is also covered in my earlier post on RAC failover:
Oracle RAC Failover in Detail
1. TAF Overview
1.1 TAF as described in the Oracle documentation:
Transparent Application Failover (TAF) is a client-side feature that allows for clients to reconnect to surviving databases in the event of a failure of a database instance. Notifications are used by the server to trigger TAF callbacks on the client-side.
TAF is configured using either client-side specified TNS connect string or using server-side service attributes. However, if both methods are used to configure TAF, the server-side service attributes will supersede the client-side settings. The server-side service attributes are the preferred way to set up TAF.
TAF can operate in one of two modes, Session Failover and Select Failover. Session Failover will recreate lost connections and sessions. Select Failover will replay queries that were in progress.
When there is a failure, callback functions will be initiated on the client-side via OCI callbacks. This will work with standard OCI connections as well as Connection Pool and Session Pool connections. Please see the OCI manual for more details on callbacks, Connection Pools, and Session Pools.
TAF will work with RAC. For more details and recommended configurations, please see the RAC Administration Guide.
TAF will operate with Physical Data Guard to provide automatic failover.
1.2 Where TAF applies
TAF works with the following database configurations to effectively mask a database failure:
(1)Oracle Real Application Clusters
(2)Replicated systems
(3)Standby databases
(4)Single instance Oracle database
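1.3 The FAILOVER_MODE parameter
The FAILOVER_MODE parameter must appear inside the CONNECT_DATA section of a connect descriptor. It accepts the subparameters listed in the table below: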
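| FAILOVER_MODE Subparameter | Description | 
| BACKUP | Specify a different net service name for backup connections. A backup should be specified when using preconnect to pre-establish connections. | 
| TYPE | Specify the type of failover. Three types of Oracle Net failover functionality are available by default to Oracle Call Interface (OCI) applications: · session: fail over the session; a new session is created automatically on the backup, but selects in progress are not recovered. · select: additionally lets open cursors continue fetching after the failure. · none: the default; no failover functionality is used. | 
| METHOD | Determines how fast failover occurs from the primary node to the backup node: · basic: establish connections at failover time. · preconnect: pre-establish connections to the backup, which gives faster failover. | 
| RETRIES | Specify the number of times to attempt to connect after a failover. If DELAY is specified, RETRIES defaults to five retry attempts. Note: If a callback function is registered, then this subparameter is ignored. | 
| DELAY | Specify the amount of time in seconds to wait between connect attempts. If RETRIES is specified, DELAY defaults to one second. Note: If a callback function is registered, then this subparameter is ignored. | 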
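To make the nesting of these subparameters concrete, here is a minimal Python sketch that renders a FAILOVER_MODE clause as text. The function name failover_mode_clause and its keyword arguments are assumptions made up for illustration; they are not part of any Oracle tool, and the output is simply a string to paste inside CONNECT_DATA.

```python
from typing import Optional

# TYPE and METHOD values accepted by FAILOVER_MODE.
VALID_TYPES = {"session", "select", "none"}
VALID_METHODS = {"basic", "preconnect"}


def failover_mode_clause(
    type_: str = "select",
    method: str = "basic",
    backup: Optional[str] = None,
    retries: Optional[int] = None,
    delay: Optional[int] = None,
) -> str:
    """Render a FAILOVER_MODE clause (hypothetical helper, not an Oracle API)."""
    if type_.lower() not in VALID_TYPES:
        raise ValueError(f"unknown TYPE: {type_}")
    if method.lower() not in VALID_METHODS:
        raise ValueError(f"unknown METHOD: {method}")

    # Optional subparameters are emitted only when a value was given.
    parts = []
    if backup is not None:
        parts.append(f"(BACKUP={backup})")
    parts.append(f"(TYPE={type_.lower()})")
    parts.append(f"(METHOD={method.lower()})")
    if retries is not None:
        parts.append(f"(RETRIES={retries})")
    if delay is not None:
        parts.append(f"(DELAY={delay})")
    return "(FAILOVER_MODE=" + "".join(parts) + ")"


print(failover_mode_clause(retries=20, delay=15))
# prints (FAILOVER_MODE=(TYPE=select)(METHOD=basic)(RETRIES=20)(DELAY=15))
```

The helper only checks the TYPE and METHOD spellings; it does not contact a database or validate the net service name passed as BACKUP.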
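2. TAF Examples
2.1 Caveats:
Do not set the GLOBAL_DBNAME parameter in the SID_LIST_listener_name section of listener.ora. A statically configured global database name disables TAF.
Connect-time failover, which is separate from TAF, is enabled by the FAILOVER=ON entry in the client's tnsnames.ora. The parameter defaults to ON when multiple addresses are listed, so clients get connect-time failover even without the entry.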
2.2 TAF with Connect-Time Failover and Client Load Balancing
sales.us.acme.com=
(DESCRIPTION=
(LOAD_BALANCE=on)
(FAILOVER=on)
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales1-server)
(PORT=1521))
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales2-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.acme.com)
(FAILOVER_MODE=
(TYPE=select)
(METHOD=basic))))
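In this example, Oracle Net connects to one of the two addresses at random. If that connection attempt fails, it tries the other address. If the instance fails after the connection is established, TAF fails the session over to the other node.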
2.3 TAF Retrying a Connection
sales.us.acme.com=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales1-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.acme.com)
(FAILOVER_MODE=
(TYPE=select)
(METHOD=basic)
(RETRIES=20)
(DELAY=15))))
This example has a single ADDRESS and sets the RETRIES and DELAY subparameters. If the failover connection fails, Oracle Net waits 15 seconds and then tries the same address again, up to 20 times.
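The retry behavior can be modeled in a few lines of Python. This is only a simplified simulation of the RETRIES and DELAY semantics, not how Oracle Net is implemented; reconnect_with_retries, try_connect, and the injectable sleep argument are all hypothetical names used for illustration.

```python
import time
from typing import Callable


def reconnect_with_retries(
    try_connect: Callable[[], bool],
    retries: int = 20,
    delay: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Attempt to reconnect up to `retries` times, waiting `delay` seconds
    after each failed attempt before the next one.

    Returns the 1-based number of the attempt that succeeded, or 0 if every
    attempt failed. Simplified model only; all names are hypothetical.
    """
    for attempt in range(1, retries + 1):
        if try_connect():
            return attempt
        if attempt < retries:
            sleep(delay)  # wait before the next attempt
    return 0


# Demo with a fake listener that comes back on the third attempt.
# The waits are recorded in a list instead of actually sleeping.
outcomes = iter([False, False, True])
waits = []
result = reconnect_with_retries(lambda: next(outcomes), sleep=waits.append)
print(result, waits)  # prints 3 [15.0, 15.0]
```

Under this model, RETRIES=20 with DELAY=15 means at most 19 waits of 15 seconds (285 seconds in total) between attempts, plus however long each connection attempt itself takes.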
2.4 TAF Pre-Establishing a Connection
sales1.us.acme.com=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales1-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.acme.com)
(INSTANCE_NAME=sales1)
(FAILOVER_MODE=
(BACKUP=sales2.us.acme.com)
(TYPE=select)
(METHOD=preconnect))))
sales2.us.acme.com=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales2-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.acme.com)
(INSTANCE_NAME=sales2)
(FAILOVER_MODE=
(BACKUP=sales1.us.acme.com)
(TYPE=select)
(METHOD=preconnect))))
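Here METHOD is set to preconnect: when the initial connection is made, a connection to the backup named by BACKUP is established as well, so the session can switch to it immediately when a failure occurs.
With BASIC there is a delay at failover time. PRECONNECT avoids that delay, but the redundant connections consume more resources. The choice is a trade-off between failover time and resource usage.
Note that BACKUP must be specified when using preconnect.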
3. Verifying TAF with Data Guard
I have tested TAF on RAC many times before; here I verify it with Data Guard.
Add the following entry to the client's tnsnames.ora:
TAFTEST=
(DESCRIPTION=
(LOAD_BALANCE=on)
(FAILOVER=on)
(ADDRESS= (PROTOCOL=tcp) (HOST=192.168.6.2) (PORT=1521))
(ADDRESS= (PROTOCOL=tcp) (HOST=192.168.6.3) (PORT=1521))
(CONNECT_DATA= (SERVICE_NAME=orcl)
(FAILOVER_MODE=
(TYPE=select)
(METHOD=basic)
)))
Test it with tnsping:
C:/Users/Administrator.DavidDai>tnsping taftest
TNS Ping Utility for 32-bit Windows: Version 11.2.0.1.0 - Production on 13-DEC-2010 00:37:08
Copyright (c) 1997, 2010, Oracle. All rights reserved.
Used parameter files:
D:/app/Administrator/product/11.2.0/dbhome_1/network/admin/sqlnet.ora
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (LOAD_BALANCE=on) (FAILOVER=on) (ADDRESS= (PROTOCOL=tcp) (HOST=192.168.6.2) (PORT=1521)) (ADDRESS= (PROTOCOL=tcp) (HOST=192.168.6.3) (PORT=1521)) (CONNECT_DATA= (SERVICE_NAME=orcl) (FAILOVER_MODE= (TYPE=select) (METHOD=basic))))
OK (20 msec)
C:/Users/Administrator.DavidDai>sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on Mon Dec 13 00:40:49 2010
Copyright (c) 1982, 2010, Oracle. All rights reserved.
SQL> conn sys/oracle@taftest as sysdba;
Connected.
SQL> select db_unique_name from v$database;
DB_UNIQUE_NAME
------------------------------
orcl_pd
Now shut down the primary database and run the query again:
SQL> select db_unique_name from v$database;
DB_UNIQUE_NAME
------------------------------
orcl_st
The session is now connected to the standby. The standby is in mounted-standby mode, which we can confirm:
SQL> select open_mode from v$database;
OPEN_MODE
----------
MOUNTED
SQL>
TAF failed over successfully.
We can also verify the TAF configuration by checking the FAILOVER_TYPE, FAILOVER_METHOD, and FAILED_OVER columns of the V$SESSION view.
The query is:
SQL> SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*)
FROM V$SESSION
GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;
MACHINE FAILOVER_TYPE FAILOVER_M FAI COUNT(*)
-------------------- ------------- ---------- --- ----------
dg1 NONE NONE NO 2
dg2 NONE NONE NO 15
WORKGROUP/DAVIDDAI SELECT BASIC YES 1
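The point of this test is to make Data Guard switchovers easier. Normally an application connects to a single instance. Suppose that database is part of a Data Guard configuration: if an incident forces a switch from the primary to the standby, the IP address changes and the application can no longer connect. That is the value of configuring TAF for Data Guard. With TAF in place, the application keeps connecting normally after a switchover, with no need to change the IP address.
Of course, with many clients, updating the connection configuration on every one of them is tedious. But many systems today connect to the database through middleware, and in that case only the connection between the middleware and the database needs to be set up.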










