Learning Oracle Through Case Studies: A "Bloodbath" Caused by a Misoperation on an AIX RAC Cluster
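System environment:
Operating system: AIX 5300-09
Cluster software: CRS 10.2.0.1
Database: Oracle 10.2.0.1
This case uses shared storage based on concurrent volume groups (VG Concurrent), with HACMP providing concurrent access to the volume groups.
Case analysis:
I. Symptoms:
1. The oracle user cannot access the device files
2. The CRS server fails to start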
[oracle@aix211 ~]$ls -l /dev
/dev/__vg10: No permission
/dev/audit: No permission
/dev/cd0: No permission
/dev/clone: No permission
/dev/console: No permission
/dev/error: No permission
Checking the attributes of the /dev directory shows that its ownership had been changed to oracle:dba:
[oracle@aix211 ~]$ls -ld /dev
drw-rw---- 6 oracle dba 3584 Sep 16 11:38 /dev
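The listing above also explains the failure: mode drw-rw---- has no search (x) bit in any triplet, and without search permission on a directory its entries cannot be accessed, even by the owner. A minimal sketch of a check for this condition, assuming a POSIX shell; the function name has_search_bits is hypothetical and is not part of AIX or Oracle:

```shell
#!/bin/sh
# has_search_bits MODE
# MODE is the first field of `ls -ld`, e.g. "drwxrwxr-x".
# Succeeds only if MODE describes a directory whose owner, group and
# other triplets all carry the search bit (x, or s/t which imply x).
has_search_bits() {
    case "$1" in
        d??[xs]??[xs]??[xt]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: inspect /dev on the current host.
mode=$(ls -ld /dev | awk '{ print $1 }')
if has_search_bits "$mode"; then
    echo "/dev mode $mode: search bits present"
else
    echo "/dev mode $mode: search bit missing, expect access errors"
fi
```

The check works on the mode string rather than on `test -x`, because root normally passes an access test on a directory regardless of its mode bits.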
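Change the ownership and permissions of /dev back (mode drw-rw---- has no search (x) bit, so a chmod is needed as well as the chown):
[root@aix211 /]#chown root.system /dev
[root@aix211 /]#ls -ld /dev
drw-rw---- 6 root system 3584 Sep 16 11:38 /dev
[root@aix211 /]#chmod 775 /dev
The oracle user can now access the device files normally:
[root@aix211 /]#su - oracle
[oracle@aix211 ~]$ls -l /dev
total 24
crw-rw---- 1 root system 10, 0 Aug 29 2013 IPL_rootvg
srwxrwxrwx 1 root system 0 Sep 16 10:22 SRC
brw-rw---- 1 oracle dba 88, 9 Sep 11 12:15 control1_1
brw-rw---- 1 oracle dba 88, 10 Sep 11 12:15 control2_2
brw-rw---- 1 oracle dba 88, 11 Sep 11 12:16 control3_3
crw-rw---- 1 root system 88, 0 Sep 11 12:08 datavg
However, the CRS server still fails to start!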
II. Reconfiguring CRS:
1. Wipe the OCR and voting disk contents (on both nodes)
[root@aix211 /]#dd if=/dev/zero of=/dev/rrac_ocr bs=8192 count=2560
2560+0 records in
2560+0 records out
[root@aix211 /]#dd if=/dev/zero of=/dev/rrac_vote bs=8192 count=2560
2560+0 records in
2560+0 records out
[root@aix211 /]#ls -l /dev |grep ocr
brw-rw---- 1 oracle dba 88, 1 Sep 11 12:15 rac_ocr
crw-r----- 1 root oinstall 88, 1 Sep 16 11:05 rrac_ocr
[root@aix211 /]#chown oracle:dba /dev/rrac_ocr
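Each dd command above writes bs × count = 8192 × 2560 = 20,971,520 bytes (20 MiB) of zeros to the start of the raw device. The same arithmetic can be tried safely against a scratch file instead of a device. This is only a sketch: the helper name zero_fill is hypothetical, and it assumes the mktemp utility is available on the host.

```shell
#!/bin/sh
# zero_fill FILE BS COUNT
# Write COUNT blocks of BS zero bytes to FILE, as the dd commands above do.
zero_fill() {
    dd if=/dev/zero of="$1" bs="$2" count="$3" 2>/dev/null
}

# Use a scratch file, never a real device, for this demonstration.
scratch=$(mktemp)
zero_fill "$scratch" 8192 2560

expected=$((8192 * 2560))        # bytes that dd was asked to write
actual=$(wc -c < "$scratch")     # size of the file that dd produced
actual=$((actual + 0))           # normalise any padding in the wc output
echo "expected=$expected actual=$actual"
rm -f "$scratch"
```

Pointing `of=` at the wrong device destroys its contents, so it is worth double-checking the target name before running the real commands.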
2. Run rootdelete.sh, then rerun the root.sh script to reconfigure CRS (on both nodes)
node1:
[root@aix211 install]#./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Sep 16 11:48:57.011 | ERR | failed to connect to daemon, errno(2)
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@aix211 install]#/u01/crs_1/root.sh
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured
Checking to see if any 9i GSD is up
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/u01' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: aix211 aix211-priv aix211
node 2: aix212 aix212-priv aix212
clscfg: Arguments check out successfully.
NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        aix211
CSS is inactive on these nodes.
        aix212
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.
node2:
[root@aix212 install]#./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Sep 16 11:48:57.011 | ERR | failed to connect to daemon, errno(2)
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@aix212@ /]#/u01/crs_1/root.sh
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/u01' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: aix211 aix211-priv aix211
node 2: aix212 aix212-priv aix212
clscfg: Arguments check out successfully.
NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        aix211
        aix212
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
The given interface(s), "en0" is not public. Public interfaces should be used to configure virtual IPs.
Because the silent vipca run rejected en0, run vipca manually on node2 to configure the VIPs.
At this point, CRS has been reconfigured successfully!
[root@aix212@ /]#crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
[root@aix212@ /]#crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.aix211.gsd application    ONLINE    ONLINE    aix211
ora.aix211.ons application    ONLINE    ONLINE    aix211
ora.aix211.vip application    ONLINE    ONLINE    aix211
ora.aix212.gsd application    ONLINE    ONLINE    aix212
ora.aix212.ons application    ONLINE    ONLINE    aix212
ora.aix212.vip application    ONLINE    ONLINE    aix212
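Since srvctl has to talk to the CRS daemon, it can be useful to test the crsctl check crs result in a script before moving on. A minimal sketch, assuming the 10.2 output format shown above (one "<component> appears healthy" line per daemon); the function name crs_all_healthy is hypothetical, and the output is passed on standard input so that the sketch runs without a cluster:

```shell
#!/bin/sh
# crs_all_healthy
# Reads `crsctl check crs` output on stdin and succeeds only if CSS, CRS
# and EVM are each reported as "appears healthy".
crs_all_healthy() {
    awk '
        $1 == "CSS" && /appears healthy/ { css = 1 }
        $1 == "CRS" && /appears healthy/ { crs = 1 }
        $1 == "EVM" && /appears healthy/ { evm = 1 }
        END { exit !(css && crs && evm) }
    '
}

# Sample taken from the session above; on a real node one would run
#   crsctl check crs | crs_all_healthy
sample='CSS appears healthy
CRS appears healthy
EVM appears healthy'

if printf '%s\n' "$sample" | crs_all_healthy; then
    echo "CRS stack looks healthy"
else
    echo "CRS stack is not fully up"
fi
```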
III. Re-registering the Listener and the Database
1. Register the listener
Rerunning the netca tool and choosing reconfigure is enough to re-register the listeners!
[root@aix212@ /]#crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....11.lsnr application    ONLINE    ONLINE    aix211
ora.aix211.gsd application    ONLINE    ONLINE    aix211
ora.aix211.ons application    ONLINE    ONLINE    aix211
ora.aix211.vip application    ONLINE    ONLINE    aix211
ora....12.lsnr application    ONLINE    ONLINE    aix212
ora.aix212.gsd application    ONLINE    ONLINE    aix212
ora.aix212.ons application    ONLINE    ONLINE    aix212
ora.aix212.vip application    ONLINE    ONLINE    aix212
2. Register the Database and the Instances
Register the Database:
[root@aix212@ /]#srvctl add database -h
Usage: srvctl add database -d <name> -o <oracle_home> [-m <domain_name>] [-p <spfile>] [-A <name|ip>/netmask] [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY}] [-s <start_options>] [-n <db_name>] [-y {AUTOMATIC | MANUAL}]
    -d <name>           Unique name for the database
    -o <oracle_home>    ORACLE_HOME for cluster database
    -m <domain>         Domain for cluster database
    -p <spfile>         Server parameter file for cluster database
    -A <addr_str>       Database cluster alias
    -n <db_name>        Database name (DB_NAME), if different from the unique name given by the -d option
    -r <role>           Role of the database (primary, physical_standby, logical_standby)
    -s <start_options>  Startup options for the database
    -y <dbpolicy>       Management policy for the database (automatic, manual)
    -h                  Print usage
[root@aix212@ /]#su - oracle
[oracle@aix212@ ~]$srvctl add database -d prod -o $ORACLE_HOME
[oracle@aix212@ ~]$crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....11.lsnr application    ONLINE    ONLINE    aix211
ora.aix211.gsd application    ONLINE    ONLINE    aix211
ora.aix211.ons application    ONLINE    ONLINE    aix211
ora.aix211.vip application    ONLINE    ONLINE    aix211
ora....12.lsnr application    ONLINE    ONLINE    aix212
ora.aix212.gsd application    ONLINE    ONLINE    aix212
ora.aix212.ons application    ONLINE    ONLINE    aix212
ora.aix212.vip application    ONLINE    ONLINE    aix212
ora.prod.db    application    OFFLINE   OFFLINE
Register the Instances:
[oracle@aix212@ ~]$srvctl add instance -h
Usage: srvctl add instance -d <name> -i <inst_name> -n <node_name>
    -d <name>           Unique name for the database
    -i <inst>           Instance name
    -n <node>           Node name
    -h                  Print usage
[oracle@aix212@ ~]$srvctl add instance -d prod -i prod1 -n aix211
[oracle@aix212@ ~]$srvctl add instance -d prod -i prod2 -n aix212
[oracle@aix212@ ~]$crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....11.lsnr application    ONLINE    ONLINE    aix211
ora.aix211.gsd application    ONLINE    ONLINE    aix211
ora.aix211.ons application    ONLINE    ONLINE    aix211
ora.aix211.vip application    ONLINE    ONLINE    aix211
ora....12.lsnr application    ONLINE    ONLINE    aix212
ora.aix212.gsd application    ONLINE    ONLINE    aix212
ora.aix212.ons application    ONLINE    ONLINE    aix212
ora.aix212.vip application    ONLINE    ONLINE    aix212
ora.prod.db    application    OFFLINE   OFFLINE
ora....d1.inst application    OFFLINE   OFFLINE
ora....d2.inst application    OFFLINE   OFFLINE
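The registration steps above follow a fixed pattern: one srvctl add database call, then one srvctl add instance call per instance and node pair. A minimal dry-run sketch that only prints the commands, using just the -d, -o, -i and -n flags from the usage text; the function name print_register_cmds and the ORACLE_HOME path are hypothetical placeholders:

```shell
#!/bin/sh
# print_register_cmds DB ORACLE_HOME INSTANCE:NODE...
# Prints (does not run) the srvctl commands needed to register a database
# and its instances, using only the flags shown in the usage text above.
print_register_cmds() {
    db=$1
    home=$2
    shift 2
    echo "srvctl add database -d $db -o $home"
    for pair in "$@"; do
        inst=${pair%%:*}    # text before the colon
        node=${pair#*:}     # text after the colon
        echo "srvctl add instance -d $db -i $inst -n $node"
    done
}

# The ORACLE_HOME path below is a placeholder, not the one from this system.
print_register_cmds prod /path/to/oracle_home prod1:aix211 prod2:aix212
```

Printing the commands first makes it possible to review them before running them as the oracle user.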
Start the Database with the CRS tools:
[oracle@aix212@ ~]$srvctl start database -d prod
PRKP-1001 : Error starting instance prod1 on node aix211
CRS-0184: Cannot communicate with the CRS daemon.
PRKP-1001 : Error starting instance prod2 on node aix212
CRS-0184: Cannot communicate with the CRS daemon.
Starting the instances through srvctl failed, so start them manually with sqlplus on each node:
[oracle@aix212@ ~]$sqlplus '/ as sysdba'
SQL*Plus: Release 10.2.0.1.0 - Production on Tue Sep 16 12:08:10 2014
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 1258291200 bytes
Fixed Size                  2020552 bytes
Variable Size             352324408 bytes
Database Buffers          889192448 bytes
Redo Buffers               14753792 bytes
Database mounted.
Database opened.
[oracle@aix211 aix211]$sqlplus '/ as sysdba'
SQL*Plus: Release 10.2.0.1.0 - Production on Tue Sep 16 12:09:37 2014
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 1258291200 bytes
Fixed Size                  2020552 bytes
Variable Size             335547192 bytes
Database Buffers          905969664 bytes
Redo Buffers               14753792 bytes
Database mounted.
Database opened.
Check the resource status in CRS:
[oracle@aix211 aix211]$crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....11.lsnr application    ONLINE    ONLINE    aix211
ora.aix211.gsd application    ONLINE    ONLINE    aix211
ora.aix211.ons application    ONLINE    ONLINE    aix211
ora.aix211.vip application    ONLINE    ONLINE    aix211
ora....12.lsnr application    ONLINE    ONLINE    aix212
ora.aix212.gsd application    ONLINE    ONLINE    aix212
ora.aix212.ons application    ONLINE    ONLINE    aix212
ora.aix212.vip application    ONLINE    ONLINE    aix212
ora.prod.db    application    ONLINE    ONLINE    aix211
ora....d1.inst application    ONLINE    ONLINE    aix211
ora....d2.inst application    ONLINE    ONLINE    aix212
Then restart an Instance with the CRS tools:
[oracle@aix211 aix211]$srvctl stop instance -d prod -i prod1
[oracle@aix211 aix211]$crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....11.lsnr application    ONLINE    ONLINE    aix211
ora.aix211.gsd application    ONLINE    ONLINE    aix211
ora.aix211.ons application    ONLINE    ONLINE    aix211
ora.aix211.vip application    ONLINE    ONLINE    aix211
ora....12.lsnr application    ONLINE    ONLINE    aix212
ora.aix212.gsd application    ONLINE    ONLINE    aix212
ora.aix212.ons application    ONLINE    ONLINE    aix212
ora.aix212.vip application    ONLINE    ONLINE    aix212
ora.prod.db    application    ONLINE    ONLINE    aix211
ora....d1.inst application    OFFLINE   OFFLINE
ora....d2.inst application    ONLINE    ONLINE    aix212
[oracle@aix211 aix211]$srvctl start instance -d prod -i prod1
[oracle@aix211 aix211]$crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....11.lsnr application    ONLINE    ONLINE    aix211
ora.aix211.gsd application    ONLINE    ONLINE    aix211
ora.aix211.ons application    ONLINE    ONLINE    aix211
ora.aix211.vip application    ONLINE    ONLINE    aix211
ora....12.lsnr application    ONLINE    ONLINE    aix212
ora.aix212.gsd application    ONLINE    ONLINE    aix212
ora.aix212.ons application    ONLINE    ONLINE    aix212
ora.aix212.vip application    ONLINE    ONLINE    aix212
ora.prod.db    application    ONLINE    ONLINE    aix211
ora....d1.inst application    ONLINE    ONLINE    aix211
ora....d2.inst application    ONLINE    ONLINE    aix212
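Reading these listings by eye gets tedious, so the State column can also be checked mechanically. A minimal sketch, assuming the crs_stat -t column layout shown in this article (resource name in the first field, State in the fourth); the function name offline_resources is hypothetical, and the sample rows are copied from the listing taken after srvctl stop instance:

```shell
#!/bin/sh
# offline_resources
# Reads `crs_stat -t` output on stdin and prints the name of every
# resource whose State column (4th field) is not ONLINE.
offline_resources() {
    awk '$1 ~ /^ora\./ && $4 != "ONLINE" { print $1 }'
}

# Sample rows; on a real node one would run
#   crs_stat -t | offline_resources
sample='Name           Type           Target    State     Host
------------------------------------------------------------
ora.prod.db    application    ONLINE    ONLINE    aix211
ora....d1.inst application    OFFLINE   OFFLINE
ora....d2.inst application    ONLINE    ONLINE    aix212'

printf '%s\n' "$sample" | offline_resources   # prints: ora....d1.inst
```

An empty result means every listed resource is ONLINE, which is the state reached in the final listing above.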
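At this point, the Database can be started and stopped normally with the CRS tools. The "bloodbath" caused by a misoperation has been rescued successfully!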