Oracle 19C Data Guard
基础运维
-0
5Failovers (GAP)
原主库 |
原备库 |
|
新主库 |
独立库 |
192.168.31.90 |
192.168.31.100 |
192.168.31.100 |
192.168.31.90 |
|
cjcdb |
chendb |
chendb |
cjcdb |
Failover
:
https://docs.oracle.com/en/database/oracle/oracle-database/19/sbydb/data-guard-concepts-and-administration.pdf
Figure 9-4 Failover to a Standby Database
Performing a Failover to a Physical Standby Database
关于archive gap
的问题?
上一篇博客《04 Failovers疑问?》写了关于
archive gap的疑问,在实验中,我提前将备库关机,主库端插入大量数据产生
3个归档文件,并手动将最后
3个归档文件重命名,目的是不让备库获取到这三个归档文件,在启动备库,试图模拟出备库
archive gap场景,但是在备库端
v$archive_gap中显示空的,备库没有检测出
archive gap的存在吗?
实际上是本人对archive gap概念存在一些误解,比如主库有
1到
100个归档,我认为只要有任何归档文件在备库端获取失败都会出现
archive gap,都会记录到
v$archive_gap,通过上一篇实验发现这种理论显然是不对的,我强制将主库
98,99,100三个归档文件重命名,备库端并没有出现
archive gap,即在
v$archive_gap中不会有数据。
那么究竟什么场景才会出现archive gap?真实的场景是,备库在接收主库归档文件时有部分没有接收成功,但后续的归档文件又接收成功了,比如主库
1到
100个归档文件,出于某种原因,备库没有接收到
97,98两个归档,但是后面的
99,100归档又能正常接收,这时就会产生
archive gap,在
v$archive_gap会查到
97,98归档信息。
(感谢墨天轮平台“你好我是李白”的答疑解惑
)
实验过程如下:
场景二:archive gap下的failover
主库模拟故障,模拟归档gap
:
先停掉备库:
不接收主库产生的
redo
或归档数据
SQL> shutdown immediate
主库:生成测试数据,生成redo
和归档数据
—session 1
SQL>
declare
begin
for i in 1 .. 1000
000
loop
insert into test1 values (i);
commit;
end loop;
end;
插入数据期间,生成了3
个归档文件
[oracle@cjcos01 arch]$ pwd
/arch
……
cjcpdb_arch_1_74_1030641846.arc
cjcpdb_arch_1_75_1030641846.arc
cjcpdb_arch_1_76_1030641846.arc
主库重命名新产生的前两个归档文件,模拟归档gap
[oracle@cjcos01 arch]$ mv cjcpdb_arch_1_74_1030641846.arc cjcpdb_arch_1_74_1030641846.arc.bak
[oracle@cjcos01 arch]$ mv cjcpdb_arch_1_75_1030641846.arc cjcpdb_arch_1_75_1030641846.arc.bak
再次插入部分数据
SQL>
declare
begin
for i in 1 .. 1000
0
loop
insert into test1 values (i);
commit;
end loop;
end;
启动备库:
SQL> startup
—
备库启动时,查看对应主库日志,提示找不到
74,75
两个归档文件,无法将
74,75
发送到备库端。
2020-04-19T18:37:53.170879+08:00
Errors in file /u01/app/oracle/diag/rdbms/cjcdb/cjcdb/trace/cjcdb_tt00_2349.trc:
ORA-00308: cannot open archived log '/arch/cjcpdb_arch_1_75_1030641846.arc'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7
2020-04-19T18:37:53.171203+08:00
Errors in file /u01/app/oracle/diag/rdbms/cjcdb/cjcdb/trace/cjcdb_tt00_2349.trc:
ORA-00308: cannot open archived log '/arch/cjcpdb_arch_1_74_1030641846.arc'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7
备库:查看archive log
,实际应该是
74
到
75
,不清楚为什么会显示
73
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
———- ————- ————–
1
73
75
备库:没有接收到74,75
两个归档文件
主库重命名system01.dbf
模拟数据库故障
[oracle@cjcos01 arch]$ cd /u01/app/oracle/oradata/CJCDB/
[oracle@cjcos01 CJCDB]$ mv system01.dbf system01.dbf.bak
SQL> alter system checkpoint;
SQL> shutdown abort
主库启动失败
SQL> startup
ORACLE instance started.
Total System Global Area 1375728192 bytes
Fixed Size
9134656 bytes
Variable Size
1107296256 bytes
Database Buffers
251658240 bytes
Redo Buffers
7639040 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 1 – see DBWR trace file
ORA-01110: data file 1: '/u01/app/oracle/oradata/CJCDB/system01.dbf'
SQL> select open_mode from v$database;
OPEN_MODE
——————–
MOUNTED
备库:
1.
检查
dg
恢复模式
(
最大性能模式
)
SQL> select database_role,protection_level,protection_mode from v$database;
DATABASE_ROLE
PROTECTION_LEVEL PROTECTION_MODE
—————- ——————– ——————–
PHYSICAL STANDBY MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE
2
检查
archive_gap
,实际应该是74
到
75
,不清楚为什么会显示
73
SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
———- ————- ————–
1
73
75
主库:
SQL> select name from v$archived_log where thread#=1 and sequence# between 73 and 75;
NAME
——————————————————————————–
/arch/cjcpdb_arch_1_73_1030641846.arc
/arch/cjcpdb_arch_1_74_1030641846.arc
/arch/cjcpdb_arch_1_75_1030641846.arc
将73
归档文件拷贝到备库端
[oracle@cjcos01 arch]$ scp cjcpdb_arch_1_73_1030641846.arc cjcos02:/arch
主库在mount
状态下执行
flush redo
操作
SQL> ALTER SYSTEM FLUSH REDO TO chendb;
ALTER SYSTEM FLUSH REDO TO chendb
*
ERROR at line 1:
ORA-16416: No viable switchover targets available
备库:手动注册73
号归档,也显示归档已经注册了
SQL> alter database register logfile '/arch/cjcpdb_arch_1_73_1030641846.arc';
alter database register logfile '/arch/cjcpdb_arch_1_73_1030641846.arc'
*
ERROR at line 1:
ORA-16089: archive log has already been registered
但是archive gap
还是显示有
73
SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
———- ————- ————–
1
73
75
主库:将74
号归档文件名改回来
[oracle@cjcos01 arch]$ mv cjcpdb_arch_1_74_1030641846.arc.bak cjcpdb_arch_1_74_1030641846.arc
再次执行flush redo
SQL> ALTER SYSTEM FLUSH REDO TO chendb;
ALTER SYSTEM FLUSH REDO TO chendb
*
ERROR at line 1:
ORA-16416: No viable switchover targets available
查看主库日志,主库已经将74
归档发生备库端了,开始尝试读取
75
号归档文件。
如果flush redo
命令没生效,也可以将归档文件拷到备库端,手动执行注册
SQL>
ALTER DATABASE REGISTER PHYSICAL LOGFILE 'filespec1';
备库:只有1
个
75
号归档找不到了
SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
———- ————- ————–
1
75
75
备库:取消应用进程
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.
备库:由于存在archive gap
,是不允许常规的
failover
SQL> ALTER DATABASE FAILOVER TO chendb;
ALTER DATABASE FAILOVER TO chendb
*
ERROR at line 1:
ORA-00283: recovery session canceled due to errors
ORA-16171: RECOVER…FINISH not allowed due to gap for thr 1, seq 75-75
备库:加force
也不生效
SQL> ALTER DATABASE FAILOVER TO chendb force;
ALTER DATABASE FAILOVER TO chendb force
*
ERROR at line 1:
ORA-00283: recovery session canceled due to errors
ORA-16171: RECOVER…FINISH not allowed due to gap for thr 1, seq 75-75
强制failover:
在存在
archive gap
情况下,强制执行
failover
,会丢失数据,正式环境谨慎使用!!!
Perform a data loss failover.
If an error condition cannot be resolved, a failover can still be performed (with some data loss) by issuing the following SQL statement on the target standby database:
SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;
打开数据库
SQL> ALTER DATABASE OPEN;
查看数据
SQL> select count(*) from test1;
COUNT(*)
———-
252780
test1
表丢失了
1000
000+1000-
252780
=748220
条数据。
欢迎关注我的微信公众号"IT小Chen",共同学习,共同成长!!!