Oracle 10g RAC Startup Failure on AIX 5.3: VIP Drift

System environment:

Operating system: AIX 5300-09

Cluster software: CRS 10.2.0.1

Database: Oracle 10.2.0.1

[Figure: system architecture diagram]

Symptoms:

After a system reboot, either CRS fails to start on a node, or the CRS services start successfully but the CRS resources cannot come ONLINE.

[root@aix213 racg]$ cat /etc/hosts

127.0.0.1       loopback localhost      # loopback (lo0) name/address
192.168.8.214   aix214
192.168.8.106   aix106
192.168.8.213   aix213
192.168.8.115   aix213-vip
10.10.10.213    aix213-priv
192.168.8.113   aix214-vip
10.10.10.214    aix214-priv

Each node has also bound the other node's VIP, so both VIP addresses are bound on every node!

[oracle@aix214 ~]$ ifconfig -a

en0: flags=5e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 192.168.8.214 netmask 0xffffff00 broadcast 192.168.8.255
        inet 192.168.8.113 netmask 0xffffff00 broadcast 192.168.8.255
        inet 192.168.8.115 netmask 0xffffff00 broadcast 192.168.8.255
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0

[oracle@aix213 ~]$ ifconfig -a

en0: flags=5e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 192.168.8.213 netmask 0xffffff00 broadcast 192.168.8.255
        inet 192.168.8.113 netmask 0xffffff00 broadcast 192.168.8.255
        inet 192.168.8.115 netmask 0xffffff00 broadcast 192.168.8.255
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
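The comparison between /etc/hosts and the ifconfig output above can be automated. The following is a minimal sketch, not part of the original procedure: list_bound_vips is a hypothetical helper name, and it assumes that VIP host names end in "-vip" as they do in this cluster.

```shell
#!/bin/sh
# list_bound_vips HOSTS_FILE   (hypothetical helper, not an Oracle or AIX tool)
# Reads `ifconfig -a` style text on stdin and prints "name address" for
# every "-vip" entry of HOSTS_FILE whose address is currently bound.
# Assumption: VIP host names end in "-vip", as in this cluster.
list_bound_vips() {
    hosts_file=$1
    awk '
        # First input (hosts file): remember address -> VIP name.
        NR == FNR {
            sub(/#.*/, "")                 # drop trailing comments
            for (i = 2; i <= NF; i++)
                if ($i ~ /-vip$/) vip[$1] = $i
            next
        }
        # Second input (stdin): look at every "inet ADDRESS" pair.
        {
            for (i = 1; i < NF; i++)
                if ($i == "inet" && ($(i + 1) in vip))
                    print vip[$(i + 1)], $(i + 1)
        }
    ' "$hosts_file" -
}

# Example (run on each node):
#   ifconfig -a | list_bound_vips /etc/hosts
```

With no failover in progress, each node would normally be expected to report only its own VIP. On the two nodes shown here, the helper would report both aix213-vip and aix214-vip on each node.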

[root@aix214 /]$ crsctl check crs

CSS appears healthy
CRS appears healthy
EVM appears healthy

[root@aix214 /]$ crs_stat -t

Name           Type           Target    State     Host
------------------------------------------------------------
ora....13.lsnr application    ONLINE    OFFLINE
ora.aix213.gsd application    ONLINE    OFFLINE
ora.aix213.ons application    ONLINE    OFFLINE
ora.aix213.vip application    ONLINE    OFFLINE
ora....14.lsnr application    ONLINE    OFFLINE
ora.aix214.gsd application    ONLINE    OFFLINE
ora.aix214.ons application    ONLINE    OFFLINE
ora.aix214.vip application    ONLINE    OFFLINE
ora.prod.db    application    ONLINE    OFFLINE
ora....d1.inst application    ONLINE    OFFLINE
ora....d2.inst application    ONLINE    OFFLINE
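To pick out the resources that should be running but are not, the crs_stat -t listing can be filtered on its Target and State columns. This is a small sketch under the assumption that the output has the five whitespace-separated columns shown above; offline_resources is a hypothetical helper name. Note that crs_stat -t abbreviates long resource names, so the names printed are the abbreviated ones.

```shell
#!/bin/sh
# offline_resources   (hypothetical helper)
# Reads `crs_stat -t` output on stdin and prints the name of every
# resource whose Target is ONLINE but whose State is OFFLINE.
# Assumption: columns are Name, Type, Target, State, Host.
offline_resources() {
    awk '$3 == "ONLINE" && $4 == "OFFLINE" { print $1 }'
}

# Example:
#   crs_stat -t | offline_resources
```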

Check the logs:

[root@aix213 racg]$ cd /u01/crs_1/log/aix213/racg

[root@aix213 racg]$ more ora.aix213.vip.log
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
2014-05-09 17:07:05.624: [RACG][1][385112][1][ora.aix213.vip]: Invalid parameters, or failed to bring up VIP (host=aix213)
2014-05-09 17:07:05.624: [RACG][1][385112][1][ora.aix213.vip]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/crs_1
2014-05-09 17:07:05.625: [RACG][1][385112][1][ora.aix213.vip]: clsrcexecut: cmd = /u01/crs_1/bin/racgeut -e _USR_ORA_DEBUG=0 54 /u01/crs_1/bin/racgvip start aix213
2014-05-09 17:07:05.625: [RACG][1][385112][1][ora.aix213.vip]: clsrcexecut: rc = 1, time = 0.345s
2014-05-09 17:07:06.832: [RACG][1][385112][1][ora.aix213.vip]: Invalid parameters, or failed to bring up VIP (host=aix213)

......
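When the VIP log is long, the failure entries can be summarized instead of paged through. The sketch below assumes the message text "failed to bring up VIP (host=...)" and a leading "date time:" stamp on each line, as in the excerpt above; vip_failures is a hypothetical helper name.

```shell
#!/bin/sh
# vip_failures [LOG_FILE...]   (hypothetical helper)
# Prints "timestamp host" for every VIP start failure recorded in the
# given racg VIP log(s), or on stdin when no file is given.
# Assumption: lines look like
#   2014-05-09 17:07:05.624: [...]: ... failed to bring up VIP (host=aix213)
vip_failures() {
    awk '
        /failed to bring up VIP/ {
            host = "unknown"
            if (match($0, /host=[^)]*/))
                host = substr($0, RSTART + 5, RLENGTH - 5)
            ts = $1 " " $2
            sub(/:$/, "", ts)              # strip the colon after the time
            print ts, host
        }
    ' "$@"
}

# Example:
#   vip_failures /u01/crs_1/log/aix213/racg/ora.aix213.vip.log
```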

Preliminary diagnosis: the VIP configuration on the nodes is at fault!

Solution 1:

1. Stop nodeapps on all nodes, then redefine each node's VIP

[oracle@aix213 ~]$ srvctl stop nodeapps -n aix213

[oracle@aix213 ~]$ srvctl stop nodeapps -n aix214

[oracle@aix213 ~]$ srvctl modify nodeapps -A 192.168.8.115/255.255.255.0/en0 -n aix213 -o $ORACLE_HOME

[oracle@aix213 ~]$ srvctl modify nodeapps -A 192.168.8.113/255.255.255.0/en0 -n aix214 -o $ORACLE_HOME

2. Stop CRS on all nodes (crsctl stop crs and crsctl start crs require root privileges)

[oracle@aix213 ~]$ crsctl stop crs

[oracle@aix214 ~]$ crsctl stop crs

3. Restart CRS on all nodes

[oracle@aix213 ~]$ crsctl start crs

[oracle@aix214 ~]$ crsctl start crs
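The VIP address passed to srvctl modify nodeapps in step 1 is the node's "-vip" entry from /etc/hosts, combined with the netmask and interface. To avoid typing an address against the wrong node, the command can be generated from the hosts file and reviewed before it is run. This is only a sketch: print_vip_fix is a hypothetical helper, it assumes the NODE-vip naming used in this cluster, it only prints the command rather than executing it, and it leaves out the -o option used above, which can be appended as needed.

```shell
#!/bin/sh
# print_vip_fix NODE NETMASK INTERFACE [HOSTS_FILE]   (hypothetical helper)
# Looks up NODE-vip in the hosts file and prints (does not run) the
# matching `srvctl modify nodeapps` command.
# Assumption: the VIP host name is "<node>-vip".
print_vip_fix() {
    node=$1
    netmask=$2
    iface=$3
    hosts_file=${4:-/etc/hosts}
    ip=$(awk -v name="${node}-vip" '
        { sub(/#.*/, "") }                 # drop trailing comments
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$hosts_file")
    if [ -z "$ip" ]; then
        echo "no ${node}-vip entry in $hosts_file" >&2
        return 1
    fi
    echo "srvctl modify nodeapps -n $node -A $ip/$netmask/$iface"
}

# Example:
#   print_vip_fix aix213 255.255.255.0 en0
#   print_vip_fix aix214 255.255.255.0 en0
```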

Solution 2:

1. Update the VIP information in CRS, starting from the VIP entries in /etc/hosts

[root@aix213 racg]$ cat /etc/hosts

127.0.0.1       loopback localhost      # loopback (lo0) name/address
192.168.8.214   aix214
192.168.8.106   aix106
192.168.8.213   aix213
192.168.8.115   aix213-vip
10.10.10.213    aix213-priv
192.168.8.113   aix214-vip
10.10.10.214    aix214-priv

2. Modify the VIPs

[root@aix214 /]$ srvctl modify nodeapps -n aix213 -o /u01/app/oracle/product/10.2.0/db_1/ -A 192.168.8.115/255.255.255.0/en0
[root@aix214 /]$ srvctl modify nodeapps -n aix214 -o /u01/app/oracle/product/10.2.0/db_1/ -A 192.168.8.113/255.255.255.0/en0

3. Run vipca as root


4. Restart the CRS services

[root@aix214 /]$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy

[root@aix214 /]$ crs_stat -t

Name           Type           Target    State     Host
------------------------------------------------------------
ora....13.lsnr application    OFFLINE   OFFLINE
ora.aix213.gsd application    ONLINE    ONLINE    aix213
ora.aix213.ons application    ONLINE    ONLINE    aix213
ora.aix213.vip application    ONLINE    ONLINE    aix213
ora....14.lsnr application    ONLINE    OFFLINE
ora.aix214.gsd application    ONLINE    ONLINE    aix214
ora.aix214.ons application    ONLINE    ONLINE    aix214
ora.aix214.vip application    ONLINE    ONLINE    aix214
ora.prod.db    application    ONLINE    OFFLINE
ora....d1.inst application    OFFLINE   OFFLINE
ora....d2.inst application    ONLINE    OFFLINE

Start the listener services manually:

[root@aix214 /]$ crs_stat | grep lsn
NAME=ora.aix213.LISTENER_AIX213.lsnr
NAME=ora.aix214.LISTENER_AIX214.lsnr
[root@aix214 /]$ crs_start -f ora.aix214.LISTENER_AIX214.lsnr
Attempting to start `ora.aix214.LISTENER_AIX214.lsnr` on member `aix214`
Start of `ora.aix214.LISTENER_AIX214.lsnr` on member `aix214` succeeded.
[root@aix214 /]$ crs_start -f ora.aix213.LISTENER_AIX213.lsnr
Attempting to start `ora.aix213.LISTENER_AIX213.lsnr` on member `aix213`
Start of `ora.aix213.LISTENER_AIX213.lsnr` on member `aix213` succeeded.
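With more than a couple of nodes, the crs_start commands for the listeners can be derived from the crs_stat output rather than typed by hand. The following sketch assumes the NAME=... lines shown above and resource names ending in ".lsnr"; listener_start_commands is a hypothetical helper, and it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# listener_start_commands   (hypothetical helper)
# Reads full `crs_stat` output on stdin and prints one
# `crs_start -f <resource>` command per listener resource.
# Assumption: listener resources are the NAME= entries ending in ".lsnr".
listener_start_commands() {
    awk -F= '$1 == "NAME" && $2 ~ /\.lsnr$/ { print "crs_start -f " $2 }'
}

# Example (review the list, then run the printed commands as root):
#   crs_stat | listener_start_commands
```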

CRS has now started successfully:

[oracle@aix213 ~]$ crs_stat -t

Name           Type           Target    State     Host
------------------------------------------------------------
ora....13.lsnr application    ONLINE    ONLINE    aix213
ora.aix213.gsd application    ONLINE    ONLINE    aix213
ora.aix213.ons application    ONLINE    ONLINE    aix213
ora.aix213.vip application    ONLINE    ONLINE    aix213
ora....14.lsnr application    ONLINE    ONLINE    aix214
ora.aix214.gsd application    ONLINE    ONLINE    aix214
ora.aix214.ons application    ONLINE    ONLINE    aix214
ora.aix214.vip application    ONLINE    ONLINE    aix214
ora.prod.db    application    ONLINE    ONLINE    aix213
ora....d1.inst application    ONLINE    ONLINE    aix213
ora....d2.inst application    ONLINE    ONLINE    aix214

At this point, the problem is essentially resolved.