
Rebuilding and Recovering a Lost OCR and Control File on 11gR2 RAC

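Today I planned to run some experiments on my own RAC virtual machines, but they kept throwing errors on startup, probably because they had not been used for a long time (a while back a vendor came by to pitch virtualization, and all my testing moved to that platform). Checking the components one by one showed that the OCR was damaged, so there was nothing for it but to rebuild it. I recorded the rebuild process along the way.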

1. The ocrcheck command reports errors

[root@rac1 ~]# ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage
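As a rough sketch of how this check could be scripted, the helper below reads ocrcheck output on standard input and classifies it. The function name `ocr_status` is my own invention, and it only pattern-matches the message text shown in this post; it does not inspect the registry itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Oracle's tooling): classify ocrcheck output.
# Prints "damaged" if any PROT-/PROC- error code appears, "healthy" if the
# integrity check line is present, and "unknown" otherwise.
ocr_status() {
    ocr_out=$(cat)
    if printf '%s\n' "$ocr_out" | grep -Eq 'PRO[TC]-[0-9]+'; then
        echo damaged
    elif printf '%s\n' "$ocr_out" | grep -q 'Cluster registry integrity check succeeded'; then
        echo healthy
    else
        echo unknown
    fi
}

# Example usage as root on a cluster node (assumes ocrcheck is on PATH):
#   ocrcheck 2>&1 | ocr_status
```

Reading from standard input keeps the helper usable against saved output as well as a live command.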

2. Restoring with ocrconfig -restore still fails

[root@rac1 rac-cluster]# ocrconfig -restore /u01/app/11.2.0/grid/cdata/raccluster/backup00.ocr
PROT-35: The configured OCR locations are not accessible.

3. Force-stop CRS on both nodes (run the command on each node)

[root@rac1 rac-cluster]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.

[root@rac2 ~]# /u01/app/11.2.0/grid/bin/crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac2'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac2'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac2'
CRS-2673: Attempting to stop 'ora.asm' on 'rac2'
CRS-2677: Stop of 'ora.evmd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac2'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac2'
CRS-2677: Stop of 'ora.cssd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'rac2'
CRS-2677: Stop of 'ora.crf' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac2'
CRS-2677: Stop of 'ora.gipcd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac2'
CRS-2677: Stop of 'ora.gpnpd' on 'rac2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac2' has completed
CRS-4133: Oracle High Availability Services has been stopped.

4. Start the stack on node 1 in exclusive mode (this brings up the ASM instance but not CRS)

[root@rac1 ~]# crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded

5. Recreate the disk group that originally held the voting disks

[root@rac1 ~]# su - grid

[grid@rac1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Sun Feb 28 00:09:32 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> create diskgroup systemdg normal redundancy
     disk '/dev/asm-diskb', '/dev/asm-diskc', '/dev/asm-diskd'
     ATTRIBUTE 'compatible.rdbms'='11.2', 'compatible.asm'='11.2';

Diskgroup created.

6. Restore the OCR from the most recent backup

[root@rac1 ~]# ocrconfig -restore /u01/app/11.2.0/grid/cdata/raccluster/backup00.ocr

[root@rac1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
     Version                  :          3
     Total space (kbytes)     :     262120
     Used space (kbytes)      :       2900
     Available space (kbytes) :     259220
     ID                       : 1207909490
     Device/File Name         :  +SYSTEMDG
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
     Cluster registry integrity check succeeded
     Logical corruption check succeeded

7. Restore the voting disks

[root@rac1 ~]# crsctl query css votedisk
Located 0 voting disk(s).

[grid@rac1 ~]$ crsctl replace votedisk +SYSTEMDG
Successful addition of voting disk a0e3b9c0c5934f79bf67aa783618b59d.
Successful addition of voting disk 422a673bdbc94f97bff51f66f84a105e.
Successful addition of voting disk a1b110709eb84f39bf351f1b01aa0c7d.
Successfully replaced voting disk group with +SYSTEMDG.
CRS-4266: Voting file(s) successfully replaced

[root@rac1 rac-cluster]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   03bd9e6851cf4fb4bf4c92e24a3d71cc (/dev/asm-diskb) [SYSTEMDG]
 2. ONLINE   362f2cfb5b3d4f29bf319c668e0efbe4 (/dev/asm-diskc) [SYSTEMDG]
 3. ONLINE   57659c53f4284fbdbfeabafb20c3fbdd (/dev/asm-diskd) [SYSTEMDG]
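If you want to confirm the result in a script rather than by eye, something like the following could count the ONLINE voting files. This is only a sketch: `count_online_votedisks` is a made-up name, and it assumes the query output has the layout shown in this post (a numbered row per voting file with the state in the second column).

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl query css votedisk` output on stdin and
# print how many rows are numbered ("1.", "2.", ...) and marked ONLINE.
count_online_votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Example usage on a cluster node (assumes crsctl is on PATH):
#   crsctl query css votedisk | count_online_votedisks
```

With a normal-redundancy disk group built from three disks, as in this post, the expected count would be 3.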

Note: if the restore fails with errors like the following:

[grid@rac1 ~]$ crsctl replace votedisk +SYSTEMDG
CRS-4602: Failed 27 to add voting file a0e3b9c0c5934f79bf67aa783618b59d.
CRS-4602: Failed 27 to add voting file 422a673bdbc94f97bff51f66f84a105e.
CRS-4602: Failed 27 to add voting file a1b110709eb84f39bf351f1b01aa0c7d.
Failed to replace voting disk group with +SYSTEMDG.
CRS-4000: Command Replace failed, or completed with errors.

log in as the grid user, then check and correct the asm_diskstring parameter:

SQL> alter system set asm_diskstring='/dev/asm*';

System altered.

Then check and recreate the spfile, and restart ASM:

SQL> create spfile from memory;

File created.
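The fix above suggests the discovery string did not cover the disks. A quick sanity check of a candidate pattern against the device names can be sketched in shell. Note the assumptions: `matches_diskstring` is a hypothetical helper, and it uses shell glob matching, which only approximates how ASM evaluates asm_diskstring.

```shell
#!/bin/sh
# Hypothetical helper: succeed if device path $2 matches glob pattern $1.
# Uses shell `case` pattern matching as an approximation of ASM discovery.
matches_diskstring() {
    case "$2" in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

# Check the three devices used for SYSTEMDG in this post.
for dev in /dev/asm-diskb /dev/asm-diskc /dev/asm-diskd; do
    if matches_diskstring '/dev/asm*' "$dev"; then
        echo "$dev is covered"
    else
        echo "$dev is NOT covered"
    fi
done
```

This does not touch ASM at all; it is only a way to catch an obviously wrong pattern before setting the parameter.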

8. Restart CRS

[root@rac1 ~]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.

[root@rac1 ~]# crsctl start has
CRS-4123: Oracle High Availability Services has been started.

9. Startup is slow, so wait a few minutes before checking the status

[root@rac1 ~]# crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATADG.dg  ora....up.type ONLINE    ONLINE    rac1
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rac1
ora....N1.lsnr ora....er.type ONLINE    ONLINE    rac1
ora....EMDG.dg ora....up.type ONLINE    ONLINE    rac1
ora.asm        ora.asm.type   ONLINE    ONLINE    rac1
ora.cvu        ora.cvu.type   ONLINE    ONLINE    rac1
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    rac1
ora.oc4j       ora.oc4j.type  ONLINE    OFFLINE
ora.ons        ora.ons.type   ONLINE    ONLINE    rac1
ora.orcl.db    ora....se.type ONLINE    OFFLINE
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    OFFLINE   OFFLINE
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   ora....t1.type ONLINE    ONLINE    rac1
ora.rac2.vip   ora....t1.type ONLINE    ONLINE    rac1
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    rac1

[root@rac1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
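In the crs_stat table, the rows worth a second look are those whose Target is ONLINE while State is still OFFLINE (here ora.oc4j and ora.orcl.db). A small filter along these lines could pick them out. It is a sketch only: `offline_resources` is a hypothetical name, and it assumes the five-column crs_stat -t layout shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: read `crs_stat -t` output on stdin and print the name
# of every resource whose Target column is ONLINE but whose State is OFFLINE.
offline_resources() {
    awk '$3 == "ONLINE" && $4 == "OFFLINE" { print $1 }'
}

# Example usage on a cluster node (assumes crs_stat is on PATH):
#   crs_stat -t | offline_resources
```

Resources that are intentionally OFFLINE/OFFLINE, such as ora.gsd, are deliberately not reported.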

10. Starting the database, however, failed

PRCR-1079 : Failed to start resource ora.orcl.db
CRS-5017: The resource action "ora.orcl.db start" encountered the following error:
ORA-00205: error in identifying control file, check alert log for more info
. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0/grid/log/rac1/agent/crsd/oraagent_oracle/oraagent_oracle.log".
CRS-5017: The resource action "ora.orcl.db start" encountered the following error:
ORA-00205: error in identifying control file, check alert log for more info
. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0/grid/log/rac2/agent/crsd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.orcl.db' on 'rac1' failed
CRS-2674: Start of 'ora.orcl.db' on 'rac2' failed
CRS-2632: There are no more servers to try to place resource 'ora.orcl.db' on that would satisfy its placement policy

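The trace showed that the control file in SYSTEMDG was gone. The layout was poorly planned when RAC was installed, so one control file copy ended up in the same disk group as the voting disks. Because SYSTEMDG had just been recreated, that copy was lost.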

ALTER SYSTEM SET local_listener='(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.8.222)(PORT=1521))' SCOPE=MEMORY SID='orcl1';
ALTER DATABASE MOUNT /* db agent *//* {1:40511:558} */
This instance was first to mount
NOTE: Loaded library: System
SUCCESS: diskgroup DATADG was mounted
SUCCESS: diskgroup SYSTEMDG was mounted
NOTE: dependency between database orcl and diskgroup resource ora.DATADG.dg is established
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+SYSTEMDG/orcl/controlfile/current.261.904952269'
ORA-17503: ksfdopn:2 Failed to open file +SYSTEMDG/orcl/controlfile/current.261.904952269
ORA-15012: ASM file '+SYSTEMDG/orcl/controlfile/current.261.904952269' does not exist
ORA-205 signalled during: ALTER DATABASE MOUNT /* db agent *//* {1:40511:558} */
Sat Feb 27 23:48:14 2016
Shutting down instance (abort)
License high water mark = 1
USER (ospid: 21150): terminating the instance
Instance terminated by USER, pid = 21150
Sat Feb 27 23:48:14 2016
Instance shutdown complete
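To pull the missing control file names out of a long alert log without scrolling, a filter like this might help. It is a sketch under assumptions: `missing_controlfiles` is a hypothetical helper, and it relies on the ORA-00202 line format shown in the excerpt above, with plain ASCII quotes around the path.

```shell
#!/bin/sh
# Hypothetical helper: read alert log text on stdin and print each distinct
# control file path named in an ORA-00202 message.
missing_controlfiles() {
    sed -n "s/^ORA-00202: control file: '\(.*\)'\$/\1/p" | sort -u
}

# Example usage (the alert log path below is a placeholder, not from this post):
#   missing_controlfiles < /path/to/alert_orcl1.log
```

Since the message repeats on every failed mount attempt, `sort -u` collapses the duplicates.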

11. Start the database in NOMOUNT mode

[grid@rac1 trace]$ srvctl start database -d orcl -o nomount

12. On node 1, use RMAN to restore a copy of the control file into SYSTEMDG

[oracle@rac1 ~]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Sat Feb 27 23:53:35 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCL (not mounted)

RMAN> restore controlfile to '+SYSTEMDG' from '+DATADG/orcl/controlfile/current.260.901699963';

Starting restore at 2016/02/27 23:53:43
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=34 instance=orcl1 device type=DISK
channel ORA_DISK_1: copied control file copy
Finished restore at 2016/02/27 23:53:52

13. Path and name of the restored control file

ASMCMD> pwd

+systemdg/orcl/CONTROLFILE

ASMCMD> ls

current.261.904953225

14. Log in to the database on node 1 and update the control_files parameter

[oracle@rac1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Sat Feb 27 23:55:09 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> alter system set control_files='+DATADG/orcl/controlfile/current.260.901699963','+SYSTEMDG/orcl/CONTROLFILE/current.261.904953225' scope=spfile sid='*';

System altered.
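Typing two long ASM file names by hand is error-prone, so the statement could be generated from the paths instead. The sketch below only builds the SQL text and does not run it; `build_control_files_sql` is a hypothetical helper name of mine.

```shell
#!/bin/sh
# Hypothetical helper: print an ALTER SYSTEM statement that sets control_files
# to the given paths, each single-quoted and comma-separated.
build_control_files_sql() {
    cf_list=""
    for cf in "$@"; do
        if [ -n "$cf_list" ]; then
            cf_list="$cf_list,"
        fi
        cf_list="$cf_list'$cf'"
    done
    printf "alter system set control_files=%s scope=spfile sid='*';\n" "$cf_list"
}

# The two control file copies from this post.
build_control_files_sql \
    '+DATADG/orcl/controlfile/current.260.901699963' \
    '+SYSTEMDG/orcl/CONTROLFILE/current.261.904953225'
```

The printed statement can be reviewed and then pasted into a SQL*Plus session connected as sysdba.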

15. Restart the database and check its status

[grid@rac1 trace]$ srvctl stop database -d orcl

[grid@rac1 trace]$ srvctl start database -d orcl

[grid@rac1 trace]$ srvctl status database -d orcl

Instance orcl1 is running on node rac1

Instance orcl2 is running on node rac2

16. To be safe, verify that the control file parameter is correct on both nodes

[oracle@rac1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Sat Feb 27 23:58:47 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> show parameter control_file

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
control_file_record_keep_time        integer     7
control_files                        string      +DATADG/orcl/controlfile/current.260.901699963, +SYSTEMDG/orcl/controlfile/current.261.904953225

[oracle@rac2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Sat Feb 27 23:59:29 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> show parameter control_file

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
control_file_record_keep_time        integer     7
control_files                        string      +DATADG/orcl/controlfile/current.260.901699963, +SYSTEMDG/orcl/controlfile/current.261.904953225


OK, all done.
