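I had long wanted to try building an ASM extended RAC, but never had the resources to test it, so I waited and waited.
Persistence pays off, ha! I finally got the resources to play with.
Two storage arrays, EMC and HDS, each located in a different machine room.
The original test system used a file system, so it first had to be converted to ASM before the ASM extended RAC could be built.
I ran into a series of problems while converting to ASM extended RAC. Solving them was frustrating at times, but once the extended RAC was up and running there was an indescribable sense of accomplishment.
Readers, don't you feel the same after cracking a big problem? :)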

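1. System environment

1.1 OS and DB versions

Host OS version: AIX 7.1 ("7100-02-03-1334")

Oracle version: Oracle 11.2.0.3 PSU10

RAC: yes

Number of nodes: 4

Storage: HDS 100 GB, EMC 50 GB

ASM or file system: cluster file system built with the Symantec Veritas volume manager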

1.2 Hardware

RAM : 128

SWAP: 13G

1.3 AIX /tmp file system

8GB

1.4 AIX JDK & JRE

IBM JDK 1.6.0.00 (64 BIT)

1.5 Directory details

/oracle 50GB

/oraclelog 30GB

/ocrvote 2GB

/archivelog 400GB

/oradata 850

1.6 Host IP configuration

100.15.64.180 testdb1

100.15.64.181 testdb2

100.15.64.182 testdb3

100.15.64.183 testdb4

100.15.64.184 testdb1-vip

100.15.64.185 testdb2-vip

100.15.64.186 testdb3-vip

100.15.64.187 testdb4-vip

100.15.64.188 testdb-scan

7.154.64.1 testdb1-priv

7.154.64.2 testdb2-priv

7.154.64.3 testdb3-priv

7.154.64.4 testdb4-priv


2. Converting the file system to ASM

2.1 Changing disk ownership and permissions

chown grid:asmadmin /dev/vx/rdmp/remc0_04a1

chown grid:asmadmin /dev/vx/rdmp/rhitachi_v0_11cd

chmod 660 /dev/vx/rdmp/remc0_04a1

chmod 660 /dev/vx/rdmp/rhitachi_v0_11cd

(Note: because the test database uses Symantec's storage multipathing software, the disk attributes do not need to be changed.)
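The mode change above can be checked with a small script. The following is a minimal sketch, not part of the original procedure: the function name asm_perm_ok is hypothetical, and it assumes that ls -ld prints the mode string as its first field. It checks only the permission bits, not the owner.

```shell
# Hypothetical helper: succeeds when the path has mode 660 (rw-rw----).
# Only the nine permission bits are compared, so the check works for
# regular files and character devices alike. A missing path fails.
asm_perm_ok() {
    ls -ld "$1" 2>/dev/null | awk '
        BEGIN { rc = 1 }
        { if (substr($1, 2, 9) == "rw-rw----") rc = 0 }
        END { exit rc }'
}

# Example use on the two devices from this section (run as root):
# for d in /dev/vx/rdmp/remc0_04a1 /dev/vx/rdmp/rhitachi_v0_11cd; do
#     asm_perm_ok "$d" || echo "wrong mode: $d"
# done
```

Running the loop on every node before starting asmca is one way to catch a device that was missed.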

2.2 Creating the ASM instance

su - grid

export DISPLAY=100.15.70.169:0.0

asmca

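(Note: create the OCRVOTE disk group with NORMAL redundancy and 2 failure groups, using at least 3 disks; 3 disks are recommended. Even if the disk group has more than 3 disks, moving the voting files into it uses only 3 of them, and crsctl query css votedisk shows the voting files on just 3 disks. The usable space of a disk group is determined by the failure group with the smallest total size.)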
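The rule of thumb at the end of the note (usable space follows the smallest failure group) can be worked through with a few lines of shell. This is only an illustration of that rule, not Oracle's exact space accounting. The function name and the "failgroup size" input format are assumptions made for the example.

```shell
# Hypothetical helper: reads "failgroup size_gb" pairs on stdin, sums the
# sizes per failure group, and prints the smallest per-group total.
smallest_failgroup_gb() {
    awk '{ total[$1] += $2 }
         END {
             min = -1
             for (fg in total)
                 if (min < 0 || total[fg] < min) min = total[fg]
             if (min >= 0) print min
         }'
}

# Example with the storage sizes from section 1.1:
# printf 'HDS 100\nEMC 50\n' | smallest_failgroup_gb
```

With 100 GB on one side and 50 GB on the other, the smaller side sets the limit, so the extra 50 GB on the larger array cannot be mirrored.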

2.3 Creating the ASM disk groups SYSDG and DATADG and changing disk group attributes

su - grid

export DISPLAY=100.15.70.169:0.0

asmca



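Note: put the storage from the same site into the same failure group.

With ASM in Oracle 11g and later, the disk group attribute compatible.rdbms needs to be raised to 11.2.0.0; it defaulted to 10.2.0.0 here. If it is left unchanged and two failure groups are used, bringing a failure group back online after it has failed and been repaired raises the following error:

ORA-15283: ASM operation requires compatible.rdbms of 11.1.0.0.0 or higher

Command to change it:

alter diskgroup SYSDG set attribute 'compatible.rdbms'='11.2.0.0';

select name,COMPATIBILITY,DATABASE_COMPATIBILITY from v$asm_diskgroup;

---- COMPATIBILITY: the ASM version the disk group is compatible with

---- DATABASE_COMPATIBILITY: the database version the disk group is compatible with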
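Since ORA-15283 is raised when compatible.rdbms is below 11.1.0.0.0, a script can compare the DATABASE_COMPATIBILITY value from the query against that threshold. The sketch below is a hypothetical helper (version_ge is not an Oracle tool) that compares two dotted version strings numerically and treats missing components as 0.

```shell
# Hypothetical helper: returns 0 when dotted version $1 >= $2, else 1.
# Components are compared numerically from left to right; a shorter
# version string is padded with zeros.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Example: warn when a disk group would hit ORA-15283 on "online disks".
# version_ge "10.2.0.0" "11.1.0.0.0" || echo "compatible.rdbms too low"
```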

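2.4 Migrating data files from the file system to ASM

No database had been created for this test, so there were no data files to migrate. If migration is needed, do it with RMAN.

2.5 Migrating the OCR and voting disks to the OCRVOTE disk group

1) Check the OCR and voting disks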

root@testdb1:/# /oracle/app/11.2.0/grid/bin/ocrcheck

Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3296
         Available space (kbytes) :     258824
         ID                       : 1187520997
         Device/File Name         : /ocrvote/ocr1
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded

root@testdb1:/# /oracle/app/11.2.0/grid/bin/crsctl query css votedisk

##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a948649dc0e14f65bf171ba2ca496962 (/ocrvote/votedisk1) []
 2. ONLINE   a5f290d560684f47bf82eb3d34db5fc7 (/ocrvote/votedisk2) []
 3. ONLINE   49617fb984fc4fcdbf5b7566a9e1778f (/ocrvote/votedisk3) []
Located 3 voting disk(s).

2) Check resource status

$ crsctl stat res -t

--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.LISTENER.lsnr
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.OCRVOTE.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.SYSDG.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.asm
               ONLINE  ONLINE       testdb1                  Started
               ONLINE  ONLINE       testdb2                  Started
               ONLINE  ONLINE       testdb3                  Started
               ONLINE  ONLINE       testdb4                  Started
ora.gsd
               OFFLINE OFFLINE      testdb1
               OFFLINE OFFLINE      testdb2
               OFFLINE OFFLINE      testdb3
               OFFLINE OFFLINE      testdb4
ora.net1.network
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.ons
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.registry.acfs
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       testdb1
ora.cvu
      1        ONLINE  ONLINE       testdb1
ora.oc4j
      1        ONLINE  ONLINE       testdb1
ora.scan1.vip
      1        ONLINE  ONLINE       testdb1
ora.testdb1.vip
      1        ONLINE  ONLINE       testdb1
ora.testdb2.vip
      1        ONLINE  ONLINE       testdb2
ora.testdb3.vip
      1        ONLINE  ONLINE       testdb3
ora.testdb4.vip
      1        ONLINE  ONLINE       testdb4

3) Back up the OCR

root@testdb1:/# /oracle/app/11.2.0/grid/bin/ocrconfig -manualbackup

root@testdb1:/# /oracle/app/11.2.0/grid/bin/ocrconfig -showbackup

4) Add the OCR to the disk group and remove the original OCR from the file system

root@testdb1:/# /oracle/app/11.2.0/grid/bin/ocrconfig -add +OCRVOTE

root@testdb1:/# /oracle/app/11.2.0/grid/bin/ocrcheck

Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3336
         Available space (kbytes) :     258784
         ID                       : 1187520997
         Device/File Name         : /ocrvote/ocr1
                                    Device/File integrity check succeeded
         Device/File Name         :   +OCRVOTE
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded

root@testdb1:/# /oracle/app/11.2.0/grid/bin/ocrconfig -delete /ocrvote/ocr1

root@testdb1:/# /oracle/app/11.2.0/grid/bin/ocrcheck

Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3336
         Available space (kbytes) :     258784
         ID                       : 1187520997
         Device/File Name         :   +OCRVOTE
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded

5) Migrate the voting disks to the disk group

root@testdb1:/# /oracle/app/11.2.0/grid/bin/crsctl query css votedisk

##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a948649dc0e14f65bf171ba2ca496962 (/ocrvote/votedisk1) []
 2. ONLINE   a5f290d560684f47bf82eb3d34db5fc7 (/ocrvote/votedisk2) []
 3. ONLINE   49617fb984fc4fcdbf5b7566a9e1778f (/ocrvote/votedisk3) []
Located 3 voting disk(s).

root@testdb1:/# /oracle/app/11.2.0/grid/bin/crsctl replace votedisk +OCRVOTE

CRS-4256: Updating the profile
Successful addition of voting disk 3a5e5e8622024f17bf0c1a4594e303f5.
Successful addition of voting disk 92ff4555f7064f70bf3c022bd687dbc5.
Successful addition of voting disk 19a1fed74b7f4fb6bf780d43b5427dc9.
Successful deletion of voting disk a948649dc0e14f65bf171ba2ca496962.
Successful deletion of voting disk a5f290d560684f47bf82eb3d34db5fc7.
Successful deletion of voting disk 49617fb984fc4fcdbf5b7566a9e1778f.
Successfully replaced voting disk group with +OCRVOTE.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced

root@testdb1:/# /oracle/app/11.2.0/grid/bin/crsctl query css votedisk

##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   3a5e5e8622024f17bf0c1a4594e303f5 (/dev/vx/rdmp/emc0_04a1) [OCRVOTE]
 2. ONLINE   92ff4555f7064f70bf3c022bd687dbc5 (/dev/vx/rdmp/hitachi_vsp0_11cc) [OCRVOTE]
 3. ONLINE   19a1fed74b7f4fb6bf780d43b5427dc9 (/dev/vx/rdmp/emc0_04c1) [OCRVOTE]
Located 3 voting disk(s).
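After the replace step it can be useful to confirm that no voting file is still left outside the target disk group. The sketch below is a hypothetical filter, not part of the original procedure. It assumes the usual crsctl query css votedisk layout, in which the file name in parentheses is the second-to-last field and the disk group in square brackets is the last field.

```shell
# Hypothetical helper: reads `crsctl query css votedisk` output on stdin and
# prints the file name of every voting file whose disk group is not $1.
# Prints nothing when all voting files are in the expected disk group.
votedisk_outside_dg() {
    awk -v dg="[$1]" '
        $2 == "ONLINE" || $2 == "OFFLINE" {
            if ($NF != dg) {
                f = $(NF - 1)
                gsub(/[()]/, "", f)   # strip the surrounding parentheses
                print f
            }
        }'
}

# Example (as root or grid on a cluster node):
# /oracle/app/11.2.0/grid/bin/crsctl query css votedisk | votedisk_outside_dg OCRVOTE
```

Run against the output before the replace, it would list the three /ocrvote files. Run against the output after it, it should print nothing.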

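3. Adding an NFS file to the OCRVOTE disk group as the third quorum disk

An ASM extended RAC needs a Linux PC server placed somewhere other than the two storage sites, with a file system created on it. That file system is NFS-mounted on the extended RAC servers, and the disk file on the NFS mount is created with dd.

3.1 NFS server information

OS version: Linux el5 x86_64

3.2 Creating the grid user on the NFS server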

groupadd -g 1000 oinstall

groupadd -g 1100 asmadmin

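useradd -u 1100 -g oinstall -G oinstall,asmadmin -d /home/grid -c "GRID Software Owner" grid

Note: the user ID and group IDs on the NFS server should match those on the production hosts.

3.3 Creating the directory on the NFS server and setting its ownership (the disk file itself is created with dd in 3.8)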

cd /oradata

mkdir votedisk

chown 1100:1100 votedisk

3.4 Editing /etc/exports on the NFS server and restarting NFS

vi /etc/exports

Add the following line:

/oradata/votedisk *(rw,sync,all_squash,anonuid=1100,anongid=1100)

service nfs stop

service nfs start

3.5 Checking that the NFS exports include the new votedisk directory

[root@ywtcdb ~]# exportfs -v

/oradata 100.15.64.*(rw,wdelay,no_root_squash,no_subtree_check,anonuid=65534,anongid=65534)

/oradata/votedisk

<world>(rw,wdelay,root_squash,all_squash,no_subtree_check,anonuid=1100,anongid=1100)

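(Note: the /oradata/votedisk entry is the newly added part; it was highlighted in red in the original post.)

3.6 Editing /etc/filesystems on the production hosts so the directory is mounted automatically at boot (run on every node)

su - root

mkdir /voting_disk

chown grid:asmadmin /voting_disk

vi /etc/filesystems

Add the following stanza: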

/voting_disk:
        dev             = "/oradata/votedisk"
        vfs             = nfs
        nodename        = ywtcdb
        mount           = true
        options         = rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys
        account         = false

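(Note: follow the format of the existing entries in /etc/filesystems exactly, including punctuation and spacing. It is best to configure the NFS mount with smit nfs, and afterwards edit the options attribute of the corresponding mount point in /etc/filesystems. The options attribute must be rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys)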
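Because the options string has to match exactly, it is easy to drop one option when editing the stanza by hand. The following is a minimal sketch of a checker. The variable and function names are hypothetical; the required option list is copied from the note above.

```shell
# Option list required for the voting-file NFS mount, copied from the note.
REQUIRED_NFS_OPTS="rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys"

# Hypothetical helper: prints each required option that is absent from the
# comma-separated option string passed as $1, one per line. Prints nothing
# when every required option is present.
missing_nfs_opts() {
    for opt in $(echo "$REQUIRED_NFS_OPTS" | tr ',' ' '); do
        case ",$1," in
            *",$opt,"*) ;;
            *) echo "$opt" ;;
        esac
    done
}

# Example: check an options string copied from /etc/filesystems.
# missing_nfs_opts "rw,bg,hard,intr,vers=3,proto=tcp"
```

The check is order-independent: it only reports options that are missing and does not complain about extra ones.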

Use smit nfs to set up automatic mounting of the NFS directory at boot:

# smit nfs

[TOP]                                                   [Entry Fields]
* Pathname of mount point                              [/voting_disk]
* Pathname of remote directory                         [/oradata/votedisk]
* Host where remote directory resides                  [ywtcdb]
  Mount type name                                      []
* Security method                                      [sys]
* Mount now, add entry to /etc/filesystems or both?     both
* /etc/filesystems entry will mount the directory       yes

3.7 Mounting the directory manually (run on every node)

/usr/sbin/nfso -p -o nfs_use_reserved_ports=1

or: nfso -p -o nfs_use_reserved_ports=1

su - root

mount -v nfs -o rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys 100.15.57.125:/oradata/votedisk /voting_disk

Note: in this command 100.15.57.125 is the IP address of the NFS server, /oradata/votedisk is the directory on the NFS server, and /voting_disk is the directory on the production host.

3.8 Creating a disk file with dd (on any one production node)

dd if=/dev/zero of=/voting_disk/vote_disk_nfs bs=1M count=1000

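3.9 Adding the new disk file to the OCRVOTE disk group

su - grid

export DISPLAY=100.15.70.169:0.0

asmca

In asmca, first change the Disk Discovery Path.

Before:

/dev/vx/rdmp/*

After:

/voting_disk/vote_disk_nfs, /dev/vx/rdmp/*

Add /voting_disk/vote_disk_nfs to the OCRVOTE disk group in a new failure group. Once it has been added, the OCRVOTE disk group shows 3 failure groups.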



3.10 Checking whether a voting disk is on the newly added disk

$ crsctl query css votedisk

##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   89210622f0864ff0bf9517205691e679 (/voting_disk/vote_disk_nfs) [OCRVOTE]
 2. ONLINE   55c4ee685a824ff3bf6ce510bf09468e (/dev/vx/rdmp/remc0_04a1) [OCRVOTE]
 3. ONLINE   159234e88fe64f55bf0d4571362c3b07 (/dev/vx/rdmp/rhitachi_v0_11cd) [OCRVOTE]
Located 3 voting disk(s).

3.11 Create the database. Once that is done, the ASM extended RAC build is complete.

4. ASM extended RAC high-availability tests

4.1 Pulling the EMC storage fibre on nodes 1 and 2 to simulate the loss of one storage side

The CSS logs are as follows:

Node 1:

2014-05-20 14:46:44.886:
[cssd(4129042)]CRS-1649:An I/O error occured for voting file: /dev/remc0_04a5; details at (:CSSNM00060:) in /oracle/app/11.2.0/grid/log/testdb1/cssd/ocssd.log.
2014-05-20 14:46:44.886:
[cssd(4129042)]CRS-1649:An I/O error occured for voting file: /dev/remc0_04a5; details at (:CSSNM00059:) in /oracle/app/11.2.0/grid/log/testdb1/cssd/ocssd.log.
2014-05-20 14:46:46.051:
[cssd(4129042)]CRS-1626:A Configuration change request completed successfully
2014-05-20 14:46:46.071:
[cssd(4129042)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .

Node 2:

2014-05-20 14:46:46.053:
[cssd(4195026)]CRS-1604:CSSD voting file is offline: /dev/remc0_04a5; details at (:CSSNM00069:) in /oracle/app/11.2.0/grid/log/testdb2/cssd/ocssd.log.
2014-05-20 14:46:46.053:
[cssd(4195026)]CRS-1626:A Configuration change request completed successfully
2014-05-20 14:46:46.071:
[cssd(4195026)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .

Node 3:

2014-05-20 14:46:46.053:
[cssd(3604942)]CRS-1604:CSSD voting file is offline: /dev/remc0_04a5; details at (:CSSNM00069:) in /oracle/app/11.2.0/grid/log/testdb3/cssd/ocssd.log.
2014-05-20 14:46:46.053:
[cssd(3604942)]CRS-1626:A Configuration change request completed successfully
2014-05-20 14:46:46.074:
[cssd(3604942)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .

Node 4:

2014-05-20 14:46:46.053:
[cssd(3015132)]CRS-1604:CSSD voting file is offline: /dev/remc0_04a5; details at (:CSSNM00069:) in /oracle/app/11.2.0/grid/log/testdb4/cssd/ocssd.log.
2014-05-20 14:46:46.053:
[cssd(3015132)]CRS-1626:A Configuration change request completed successfully
2014-05-20 14:46:46.073:
[cssd(3015132)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .

CRS status is normal:

testdb3:/oracle/app/11.2.0/grid/log/testdb3/cssd(testdb3)$ /oracle/app/11.2.0/grid/bin/crsctl stat res -t

--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.LISTENER.lsnr
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.OCRVOTE.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.SYSDG.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.asm
               ONLINE  ONLINE       testdb1                  Started
               ONLINE  ONLINE       testdb2                  Started
               ONLINE  ONLINE       testdb3                  Started
               ONLINE  ONLINE       testdb4                  Started
ora.gsd
               OFFLINE OFFLINE      testdb1
               OFFLINE OFFLINE      testdb2
               OFFLINE OFFLINE      testdb3
               OFFLINE OFFLINE      testdb4
ora.net1.network
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.ons
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.registry.acfs
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       testdb4
ora.cvu
      1        ONLINE  ONLINE       testdb3
ora.oc4j
      1        ONLINE  ONLINE       testdb3
ora.scan1.vip
      1        ONLINE  ONLINE       testdb4
ora.testdb.db
      1        ONLINE  ONLINE       testdb1                  Open
      2        ONLINE  ONLINE       testdb2                  Open
      3        ONLINE  ONLINE       testdb3                  Open
      4        ONLINE  ONLINE       testdb4                  Open
ora.testdb1.vip
      1        ONLINE  ONLINE       testdb1
ora.testdb2.vip
      1        ONLINE  ONLINE       testdb2
ora.testdb3.vip
      1        ONLINE  ONLINE       testdb3
ora.testdb4.vip
      1        ONLINE  ONLINE       testdb4

The voting disks now look like this:

$ /oracle/app/11.2.0/grid/bin/crsctl query css votedisk

##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   8a31ddf5013d4fb1bfdbb01d6fc6eb7b (/dev/rhitachi_v0_11cc) [OCRVOTE]
 2. ONLINE   1ef9486d54b24f8cbf07814d2848a009 (/voting_disk/vote_disk_nfs) [OCRVOTE]
Located 2 voting disk(s).

After the storage fibre is plugged back in, bring the disks online manually; the two storage sides then resynchronize automatically:

alter diskgroup SYSDG online disks in failgroup fail_1;

alter diskgroup DATADG online disks in failgroup fail_1;
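When several disk groups share the same failure group name, the online statements can be generated rather than typed one by one. This is a minimal sketch under that assumption. The function name is hypothetical, and the failure group name fail_1 and the disk group names are taken from the two statements above.

```shell
# Hypothetical helper: prints one ALTER DISKGROUP ... ONLINE statement per
# disk group. $1 is the failure group name, the remaining arguments are the
# disk group names.
gen_online_sql() {
    fg=$1
    shift
    for dg in "$@"; do
        printf 'alter diskgroup %s online disks in failgroup %s;\n' "$dg" "$fg"
    done
}

# Example: print the statements, review them, then run them in the ASM
# instance as a SYSASM user.
# gen_online_sql fail_1 SYSDG DATADG
```

Printing the statements first, rather than piping them straight into sqlplus, leaves room to check the failure group name before anything is executed.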

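Test result

All EMC disks went OFFLINE automatically in the ASM disk groups on every node, the HDS storage stayed in service, and the instances on all nodes kept running normally. Pulling the HDS fibre instead gave the same behaviour as pulling the EMC fibre. The conclusion: when one side's storage goes down, the ASM extended RAC keeps running on the surviving side and all instances stay up. After the fibre is plugged back in and the disks are brought online manually, the two sides resynchronize automatically.

Note: the disk group holding the voting disks brings its disks back online automatically once they are reattached.

4.2 Rebooting the hosts of nodes 1 and 2 to simulate a sudden host failure

After rebooting the hosts of nodes 1 and 2, the CRS resource status is as follows: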

$ crsctl stat res -t

--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHDG.dg
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.DATADG.dg
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.LISTENER.lsnr
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.OCRVOTE.dg
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.SYSDG.dg
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.asm
               ONLINE  ONLINE       testdb3                  Started
               ONLINE  ONLINE       testdb4                  Started
ora.gsd
               OFFLINE OFFLINE      testdb3
               OFFLINE OFFLINE      testdb4
ora.net1.network
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.ons
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.registry.acfs
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       testdb3
ora.cvu
      1        ONLINE  ONLINE       testdb3
ora.oc4j
      1        ONLINE  ONLINE       testdb3
ora.scan1.vip
      1        ONLINE  ONLINE       testdb3
ora.testdb.db
      1        ONLINE  OFFLINE
      2        ONLINE  OFFLINE
      3        ONLINE  ONLINE       testdb3                  Open
      4        ONLINE  ONLINE       testdb4                  Open
ora.testdb1.vip
      1        ONLINE  INTERMEDIATE testdb4                  FAILED OVER
ora.testdb2.vip
      1        ONLINE  INTERMEDIATE testdb3                  FAILED OVER
ora.testdb3.vip
      1        ONLINE  ONLINE       testdb3
ora.testdb4.vip
      1        ONLINE  ONLINE       testdb4

Once the hosts of nodes 1 and 2 are back up, the CRS status is as follows:

$ crsctl stat res -t

--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.LISTENER.lsnr
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.OCRVOTE.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.SYSDG.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.asm
               ONLINE  ONLINE       testdb1                  Started
               ONLINE  ONLINE       testdb2                  Started
               ONLINE  ONLINE       testdb3                  Started
               ONLINE  ONLINE       testdb4                  Started
ora.gsd
               OFFLINE OFFLINE      testdb1
               OFFLINE OFFLINE      testdb2
               OFFLINE OFFLINE      testdb3
               OFFLINE OFFLINE      testdb4
ora.net1.network
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.ons
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.registry.acfs
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       testdb3
ora.cvu
      1        ONLINE  ONLINE       testdb3
ora.oc4j
      1        ONLINE  ONLINE       testdb4
ora.scan1.vip
      1        ONLINE  ONLINE       testdb3
ora.testdb.db
      1        ONLINE  ONLINE       testdb1                  Open
      2        ONLINE  ONLINE       testdb2                  Open
      3        ONLINE  ONLINE       testdb3                  Open
      4        ONLINE  ONLINE       testdb4                  Open
ora.testdb1.vip
      1        ONLINE  ONLINE       testdb1
ora.testdb2.vip
      1        ONLINE  ONLINE       testdb2
ora.testdb3.vip
      1        ONLINE  ONLINE       testdb3
ora.testdb4.vip
      1        ONLINE  ONLINE       testdb4

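Test result

When one or more nodes go down, their VIPs fail over to the surviving nodes and all clients reconnect to the available nodes. Once the rebooted hosts are back up, CRS starts automatically and the VIPs move back as expected.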
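Failed-over VIPs show up in crsctl stat res -t as INTERMEDIATE with the detail FAILED OVER, so they can be picked out of the output with a small filter. The sketch below is a hypothetical helper, not part of the original test. It assumes the standard layout in which a resource name line starting with "ora." is followed by its instance rows.

```shell
# Hypothetical helper: reads `crsctl stat res -t` output on stdin and prints
# "resource server" for every resource instance in INTERMEDIATE state.
intermediate_resources() {
    awk '
        /^ora\./ { res = $1; next }          # remember the current resource
        {
            for (i = 1; i <= NF; i++)
                if ($i == "INTERMEDIATE") {  # server name follows the state
                    print res, $(i + 1)
                    break
                }
        }'
}

# Example (on any surviving node):
# crsctl stat res -t | intermediate_resources
```

An empty result after the rebooted nodes rejoin is a quick sign that all VIPs have moved back home.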

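4.3 Simulating a public network outage

The hosts are virtualized, so the network cable cannot be pulled. Instead, the interface carrying node 1's public IP was taken down with ifconfig en1 down.

1) On node 1, the public IP, the VIP and the SCAN IP are all on interface en1.

root@testdb1:/# netstat -in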

Name  Mtu   Network     Address            Ipkts     Ierrs  Opkts     Oerrs  Coll
en1   1500  link#2      0.14.5e.79.5c.ca   5153732   0      4066346   2      0
en1   1500  100.15.64   100.15.64.180      5153732   0      4066346   2      0
en1   1500  100.15.64   100.15.64.184      5153732   0      4066346   2      0
en1   1500  100.15.64   100.15.64.188      5153732   0      4066346   2      0
en2   1500  link#3      0.14.5e.79.5b.e6   40305463  0      44224443  2      0
en2   1500  7.154.64    7.154.64.1         40305463  0      44224443  2      0
en2   1500  169.254     169.254.78.30      40305463  0      44224443  2      0
lo0   16896 link#1                         2316784   0      2316787   0      0
lo0   16896 127         127.0.0.1          2316784   0      2316787   0      0
lo0   16896 ::1%1                          2316784   0      2316787   0      0

2) Run ifconfig en1 down to start the test

root@testdb1:/oracle/app/11.2.0/grid/bin# ifconfig en1 down

3) The CRS resource status shows that the VIP and the SCAN IP have both moved to healthy nodes

testdb3:/home/oracle(testdb3)$ crsctl stat res -t

--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.LISTENER.lsnr
               ONLINE  OFFLINE      testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.OCRVOTE.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.SYSDG.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.asm
               ONLINE  ONLINE       testdb1                  Started
               ONLINE  ONLINE       testdb2                  Started
               ONLINE  ONLINE       testdb3                  Started
               ONLINE  ONLINE       testdb4                  Started
ora.gsd
               OFFLINE OFFLINE      testdb1
               OFFLINE OFFLINE      testdb2
               OFFLINE OFFLINE      testdb3
               OFFLINE OFFLINE      testdb4
ora.net1.network
               ONLINE  OFFLINE      testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.ons
               ONLINE  OFFLINE      testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.registry.acfs
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       testdb2
ora.cvu
      1        ONLINE  ONLINE       testdb2
ora.oc4j
      1        ONLINE  ONLINE       testdb4
ora.scan1.vip
      1        ONLINE  ONLINE       testdb2
ora.testdb.db
      1        ONLINE  ONLINE       testdb1                  Open
      2        ONLINE  ONLINE       testdb2                  Open
      3        ONLINE  ONLINE       testdb3                  Open
      4        ONLINE  ONLINE       testdb4                  Open
ora.testdb1.vip
      1        ONLINE  INTERMEDIATE testdb4                  FAILED OVER
ora.testdb2.vip
      1        ONLINE  ONLINE       testdb2
ora.testdb3.vip
      1        ONLINE  ONLINE       testdb3
ora.testdb4.vip
      1        ONLINE  ONLINE       testdb4

4) Bring node 1's en1 interface back up

root@testdb1:/# ifconfig en1 up

5) The CRS resource status shows that the VIP has moved back

testdb3:/home/oracle(testdb3)$ crsctl stat res -t

--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.LISTENER.lsnr
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.OCRVOTE.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.SYSDG.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.asm
               ONLINE  ONLINE       testdb1                  Started
               ONLINE  ONLINE       testdb2                  Started
               ONLINE  ONLINE       testdb3                  Started
               ONLINE  ONLINE       testdb4                  Started
ora.gsd
               OFFLINE OFFLINE      testdb1
               OFFLINE OFFLINE      testdb2
               OFFLINE OFFLINE      testdb3
               OFFLINE OFFLINE      testdb4
ora.net1.network
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.ons
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
ora.registry.acfs
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
               ONLINE  ONLINE       testdb3
               ONLINE  ONLINE       testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       testdb2
ora.cvu
      1        ONLINE  ONLINE       testdb2
ora.oc4j
      1        ONLINE  ONLINE       testdb4
ora.scan1.vip
      1        ONLINE  ONLINE       testdb2
ora.testdb.db
      1        ONLINE  ONLINE       testdb1                  Open
      2        ONLINE  ONLINE       testdb2                  Open
      3        ONLINE  ONLINE       testdb3                  Open
      4        ONLINE  ONLINE       testdb4                  Open
ora.testdb1.vip
      1        ONLINE  ONLINE       testdb1
ora.testdb2.vip
      1        ONLINE  ONLINE       testdb2
ora.testdb3.vip
      1        ONLINE  ONLINE       testdb3
ora.testdb4.vip
      1        ONLINE  ONLINE       testdb4

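Test result

The listener on the test node (node 1) stopped. The SCAN listener, which had been running on that node, moved to another available node, and the test node's VIP failed over to another available node as well. When the interface came back up (public network restored), the VIP moved back and the listener on the test node came online automatically; the SCAN listener and SCAN VIP did not move back. We then took down the public interface on each of the other nodes in turn and found that the SCAN listener moved to the node with the lowest instance_number, while the VIP failed over to an arbitrary node.

4.4 Listener failure test

Done by killing the listener process.

Test result

Existing connections were not affected. New connections could not reach the instance on that node; applications connected to another node through TAF or automatic reconnection.

The listener process restarted automatically.

4.5 Single instance crash test

Done by killing the pmon process.

Test result

After pmon was killed, the database instance crashed and then restarted automatically; once the restart completed, sessions reconnected automatically.

4.6 Simulating a CSSD process crash

Done by killing the cssd process.

Test result

After cssd was killed, the node rebooted and its VIP failed over to another healthy node. Once the host was back up, CRS started automatically and the cluster reconfigured.

4.7 Simulating a CRSD process crash

Done by killing the crsd process.

Test result

After crsd.bin was killed, the process was restarted automatically within one minute. How it works: the crsd crash is detected by orarootagent, and crsd is restarted automatically.

4.8 Simulating an EVMD process crash

Done by killing the evmd process.

Test result

After evmd.bin was killed, the process was restarted automatically within one minute. How it works: the evmd crash is detected by ohasd, and the evmd, orarootagent and crsd processes are restarted.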