Oracle control files

This experiment tests whether the database goes down when its control files are destroyed while the database is open.

Check the control file paths

SQL> show parameter control_files

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

control_files                        string      /u01/app/oracle/oradata/orcl/control01.ctl,/u01/app/oracle/fast_recovery_area/orcl/control02.ctl

Destroy the control files by truncating them to zero bytes

[oracle@togogo ~]$ cat /dev/null > /u01/app/oracle/oradata/orcl/control01.ctl

[oracle@togogo ~]$ cat /dev/null > /u01/app/oracle/fast_recovery_area/orcl/control02.ctl

Verify the database status

SQL> select * from dual;

D

-

X

#### Force a checkpoint and switch redo logs

SQL> alter system checkpoint;

System altered.

SQL> alter system switch logfile;

System altered.

SQL> alter system switch logfile;

System altered.

#### Check the alert log

Sat Feb 10 20:09:55 2018

Thread 1 cannot allocate new log, sequence 8

Private strand flush not complete

Current log# 1 seq# 7 mem# 0: /u01/app/oracle/oradata/orcl/redo01.log

Thread 1 advanced to log sequence 8 (LGWR switch)

Current log# 2 seq# 8 mem# 0: /u01/app/oracle/oradata/orcl/redo02.log

Sat Feb 10 20:10:19 2018

Thread 1 cannot allocate new log, sequence 9

Private strand flush not complete

Current log# 2 seq# 8 mem# 0: /u01/app/oracle/oradata/orcl/redo02.log

Thread 1 advanced to log sequence 9 (LGWR switch)

Current log# 3 seq# 9 mem# 0: /u01/app/oracle/oradata/orcl/redo03.log

Thread 1 cannot allocate new log, sequence 10

Checkpoint not complete

Current log# 3 seq# 9 mem# 0: /u01/app/oracle/oradata/orcl/redo03.log

Thread 1 cannot allocate new log, sequence 10

Private strand flush not complete

Current log# 3 seq# 9 mem# 0: /u01/app/oracle/oradata/orcl/redo03.log

Thread 1 advanced to log sequence 10 (LGWR switch)

Current log# 1 seq# 10 mem# 0: /u01/app/oracle/oradata/orcl/redo01.log

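The database is still up, and the checkpoint and log switches still succeed. Why? `cat /dev/null > file` truncates the file in place but keeps the same inode, and the Oracle background processes still hold open file descriptors on the control files that they have not released, as shown below:

[oracle@togogo ~]$ ps -ef|grep ckpt|grep -v grep

oracle 8427 1 0 19:47 ? 00:00:00 ora_ckpt_orcl

[oracle@togogo ~]$ cd /proc/8427/fd

[oracle@togogo fd]$ ls -ltr |grep control

lrwx------ 1 oracle oinstall 64 Feb 10 20:09 257 -> /u01/app/oracle/fast_recovery_area/orcl/control02.ctl

lrwx------ 1 oracle oinstall 64 Feb 10 20:09 256 -> /u01/app/oracle/oradata/orcl/control01.ctl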
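The effect can be reproduced outside Oracle. The following is a minimal sketch, not Oracle's own code: it uses a temporary file as a stand-in for a control file, and the names `write_after_truncate` and `demo` are hypothetical. It keeps a descriptor open, truncates the file the way `cat /dev/null > file` does, and then issues a positioned write through the old descriptor.

```python
import os
import tempfile

BLOCK_SIZE = 16384  # block size seen in the strace output of this article


def write_after_truncate(path):
    """Truncate `path` behind an open descriptor, then write through it."""
    fd = os.open(path, os.O_RDWR)  # long-lived handle, like CKPT's fd 256
    try:
        # Same effect as `cat /dev/null > path`: same inode, length 0.
        with open(path, "w"):
            pass
        size_after_truncate = os.fstat(fd).st_size
        # Positioned write of one block at block 3 (byte offset 49152).
        written = os.pwrite(fd, b"\x15" * BLOCK_SIZE, 3 * BLOCK_SIZE)
        size_after_write = os.fstat(fd).st_size
        return size_after_truncate, written, size_after_write
    finally:
        os.close(fd)


def demo():
    """Run the experiment on a temporary stand-in file."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * (8 * BLOCK_SIZE))
        path = tmp.name
    try:
        return write_after_truncate(path)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

The old descriptor stays valid after the truncation, so the write succeeds and the file grows again as a sparse file. This only shows why the writes do not fail; it does not mean the truncated control file is still usable.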

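#### Session 1: tracing with strace

A tool worth introducing here is strace.

For example:

strace -o output.txt -T -tt -e trace=all -p 28979

This traces every system call (-e trace=all) made by process 28979, records the time spent in each call (-T) and the wall-clock start time of each call in hour:minute:second form with microseconds (-tt), and writes the result to output.txt (-o).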

[oracle@togogo fd]$ strace -fr -o /tmp/8427.log -p 8427

Process 8427 attached - interrupt to quit

#### Inspect the trace log

8427 0.000156 gettimeofday({1518265363, 443228}, NULL) = 0

8427 0.000176 gettimeofday({1518265543, 106769}, NULL) = 0

8427 0.000077 gettimeofday({1518265543, 106845}, NULL) = 0

8427 0.000072 gettimeofday({1518265543, 106917}, NULL) = 0

8427 0.000077 pwrite64(256, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\312T\0\0\2\0\0\0\0\0\0\0Q\0\0\0"..., 16384, 49152) = 16384

8427 0.006462 gettimeofday({1518265543, 113463}, NULL) = 0

8427 0.000089 gettimeofday({1518265543, 113548}, NULL) = 0

8427 0.000080 pwrite64(257, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\312T\0\0\2\0\0\0\0\0\0\0Q\0\0\0"..., 16384, 49152) = 16384

8427 0.000734 gettimeofday({1518265543, 114364}, NULL) = 0

8427 0.000081 gettimeofday({1518265543, 114443}, NULL) = 0

8427 0.000081 gettimeofday({1518265543, 114525}, NULL) = 0

8427 0.000078 gettimeofday({1518265543, 114603}, NULL) = 0

8427 0.000211 gettimeofday({1518265543, 114816}, NULL) = 0

8427 0.000080 gettimeofday({1518265543, 114891}, NULL) = 0

8427 0.000081 pread64(256, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4r\t\0\0\0\0\0\0\0\0\v\373M\21Y"..., 16384, 16384) = 16384

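Here 256 and 257 are file descriptors (FDs), the same ones listed under /proc/8427/fd earlier.

16384 is the number of bytes per call, i.e. the size of one block.

49152 is the byte offset within the file, which is block 3 (49152 / 16384 = 3).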
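The three fields can also be pulled out of the trace mechanically. The sketch below assumes the strace line layout shown in this article and a 16384-byte block size; `parse_io_call` and the regular expression are illustrative helpers, not part of strace.

```python
import re

# Matches strace lines such as: pwrite64(256, "..."..., 16384, 49152) = 16384
CALL_RE = re.compile(
    r"(?P<call>pread64|pwrite64)\(\s*(?P<fd>\d+)\s*,.*,"
    r"\s*(?P<count>\d+)\s*,\s*(?P<offset>\d+)\s*\)\s*=\s*(?P<result>-?\d+)"
)


def parse_io_call(line, block_size=16384):
    """Return fd, byte count, offset and block number for one trace line."""
    match = CALL_RE.search(line)
    if match is None:
        return None  # not a pread64/pwrite64 line, e.g. gettimeofday
    offset = int(match.group("offset"))
    return {
        "call": match.group("call"),
        "fd": int(match.group("fd")),
        "count": int(match.group("count")),
        "offset": offset,
        "block": offset // block_size,  # 49152 // 16384 == 3
        "result": int(match.group("result")),
    }


if __name__ == "__main__":
    sample = r'8427 0.000077 pwrite64(256, "\25\302\0\0\3"..., 16384, 49152) = 16384'
    print(parse_io_call(sample))
```

Running the parser over /tmp/8427.log and grouping by `fd` and `block` would show which control file blocks the traced process touches most often.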

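From the above we can conclude:

1. Process information is visible under /proc, for example /proc/8427/stat, and the files a process holds open are listed under /proc/8427/fd.

2. On Linux, file reads and writes are carried out through system calls such as pread64 and pwrite64.

3. The checkpoint process CKPT works on a cycle of about 3 seconds, updating the control files each time.

4. The pwrite64 calls write to the two files behind FDs 256 and 257, both at offset 49152. These two files are the control files, and their block size is 16384 bytes.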
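The /proc lookup from conclusion 1 can be scripted as well. This is a small Linux-only sketch; `open_files` is a hypothetical helper that reads the symlinks under /proc/<pid>/fd, which is the same information `ls -l /proc/8427/fd` prints.

```python
import os


def open_files(pid="self", substring=""):
    """Map fd number -> target path for descriptors whose path contains `substring`."""
    fd_dir = f"/proc/{pid}/fd"
    found = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
        if substring in target:
            found[int(name)] = target
    return found


if __name__ == "__main__":
    # For the CKPT process in this article one would call
    # open_files(8427, "control"), as the oracle user or root.
    print(open_files("self"))
```

Reading another user's /proc/<pid>/fd requires matching ownership or root privileges, so the call has to run as the oracle OS user to inspect Oracle background processes.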