
Recovering from a Damaged Voting Disk in 10g Clusterware

Editor: Oracle Tutorials


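The voting disk (votedisk) is critical to RAC, whether on 10g Clusterware or 11g GI. We call it the quorum disk: when a node in the RAC cluster fails and drops off the network, the voting disk is used to decide whether that node should be evicted, so that the rest of the cluster keeps running normally. If the voting disk is damaged, the cluster services cannot start, no cluster resources can be loaded, and the whole cluster stops working. We should therefore make a habit of backing up the voting disk. In 11g, the voting disk and the OCR are placed in an ASM disk group by default, so they need no special attention. In 10g Clusterware, however, they cannot be placed in an ASM disk group and, as in this environment, are used as raw devices, so the voting disk deserves particular care and should be backed up regularly, for example: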
Backing up and restoring the voting disk with dd:
Backup:  dd if=/dev/raw/raw3 of=/tmp/votedisk.bak
Restore: dd if=/tmp/votedisk.bak of=/dev/raw/raw3
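
The dd backup above can be wrapped in a small script that also records a checksum, so the image can be checked before it is ever written back to the device. This is a minimal sketch: the helper names backup_votedisk and verify_backup are hypothetical (they are not Oracle tools), and the checksum step is an addition that the original procedure does not include.

```shell
# Sketch only: backup_votedisk and verify_backup are hypothetical helpers.

# Copy the voting disk device ($1) to an image file ($2) and record
# a checksum of the image in "$2.cksum".
backup_votedisk() {
    src=$1
    dest=$2
    dd if="$src" of="$dest" bs=4096 2>/dev/null || return 1
    cksum < "$dest" > "$dest.cksum"
}

# Succeed only if the image ($1) still matches its recorded checksum.
verify_backup() {
    img=$1
    if [ ! -f "$img.cksum" ]; then
        return 1
    fi
    [ "$(cksum < "$img")" = "$(cat "$img.cksum")" ]
}

# Example with the paths used in this article:
#   backup_votedisk /dev/raw/raw3 /tmp/votedisk.bak
#   verify_backup /tmp/votedisk.bak && dd if=/tmp/votedisk.bak of=/dev/raw/raw3
```

The checksum is taken of the image file rather than of the device, because the contents of a voting disk may change while the cluster is running.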
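If, unfortunately, no backup was ever taken and the voting disk was not mirrored, the only option once it is damaged is to rebuild CRS. The following demonstrates the process:
-- Stop CRS, then deliberately corrupt the voting disk, here /dev/raw/raw3
[root@rac1 ~]# dd if=/dev/zero of=/dev/raw/raw3 bs=4096 count=12800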

Restarting CRS now fails. The ocssd.log file contains entries showing that the disk is corrupt.
PS: the logs for 10g Clusterware are found under $ORA_CRS_HOME/log/<hostname>/...
[ CSSD]2015-01-16 09:37:38.327 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2094 Oracle. All rights reserved.
[ CSSD]2015-01-16 09:37:38.327 >USER: CSS daemon log for node rac1, number 1, in cluster cluster
[ clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_CSSD))
[ CSSD]2015-01-16 09:37:38.332 [3059615952] >TRACE: clssscmain: local-only set to false
[ CSSD]2015-01-16 09:37:38.344 [3059615952] >TRACE: clssnmReadNodeInfo: added node 1 (rac1) to cluster
[ CSSD]2015-01-16 09:37:38.352 [3059615952] >TRACE: clssnmReadNodeInfo: added node 2 (rac2) to cluster
[ CSSD]2015-01-16 09:37:38.356 [3032808336] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
[ CSSD]2015-01-16 09:37:38.356 [3059615952] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
[ CSSD]2015-01-16 09:37:38.362 [3059615952] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/raw/raw3)
[ CSSD]2015-01-16 09:37:40.381 [3032808336] >TRACE: clssnmvDiskOpen: corrupt kill block on disk (0x09!=0x636c73536b696c4c)
[ CSSD]2015-01-16 09:37:40.381 [3032808336] >TRACE: clssnmDiskStateChange: state from 2 to 3 disk (0//dev/raw/raw3)
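
The check of ocssd.log can be scripted. The sketch below assumes the log uses the same message formats as the excerpt above ("clssnmvDiskOpen: corrupt kill block" and "clssnmDiskStateChange: ... disk (<index>/<path>)"). The function name find_corrupt_votedisks is hypothetical, and the log path is passed in by the caller rather than guessed.

```shell
# Sketch only: find_corrupt_votedisks is a hypothetical helper.
# $1 is the path to an ocssd.log file.
# Returns non-zero if the log has no "corrupt kill block" message;
# otherwise prints every disk path named in a clssnmDiskStateChange line.
find_corrupt_votedisks() {
    log=$1
    grep -q 'clssnmvDiskOpen: corrupt kill block' "$log" || return 1
    # State-change lines end in "disk (<index>/<path>)"; keep only <path>.
    sed -n 's#.*clssnmDiskStateChange: .* disk ([0-9]*/\(.*\))[[:space:]]*$#\1#p' "$log" | sort -u
}

# Example:
#   find_corrupt_votedisks /path/to/ocssd.log
```

With a single voting disk, as in this article, the printed path is the damaged device. With several voting disks the helper lists all of them, so the surrounding log lines still need to be read to tell which one is corrupt.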
Rebuilding CRS is simple; just run two scripts:
1. $ORA_CRS_HOME/install/rootdelete.sh
2. $ORA_CRS_HOME/install/rootdeinstall.sh
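
The two scripts can be run through a small wrapper that first checks both are present and stops if the first one fails. This is a sketch under assumptions: deconfigure_crs is a hypothetical name, the CRS home is passed as an argument, and the scripts are started from the install directory as root.

```shell
# Sketch only: deconfigure_crs is a hypothetical wrapper.
# $1 is the CRS home (the value of $ORA_CRS_HOME).
deconfigure_crs() {
    crs_home=$1
    for s in rootdelete.sh rootdeinstall.sh; do
        if [ ! -x "$crs_home/install/$s" ]; then
            echo "missing or not executable: $crs_home/install/$s" >&2
            return 1
        fi
    done
    # Run in a subshell so the caller's working directory is unchanged;
    # rootdeinstall.sh runs only if rootdelete.sh succeeded.
    (cd "$crs_home/install" && ./rootdelete.sh && ./rootdeinstall.sh)
}

# Example (as root, on each node):
#   deconfigure_crs "$ORA_CRS_HOME"
```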


Node 1:
[root@rac1 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@rac1 install]# ./rootdeinstall.sh
Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 0.590608 seconds, 17.8 MB/s

Node 2:
[root@rac2 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@rac2 install]# ./rootdeinstall.sh
Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 0.627909 seconds, 16.7 MB/s
[root@rac2 install]# dd if=/dev/zero of=/dev/raw/raw3 bs=4096 count=128000
dd: writing `/dev/raw/raw3': No space left on device
25601+0 records in
25600+0 records out
104857600 bytes (105 MB) copied, 5.40456 seconds, 19.4 MB/s
Then run $ORA_CRS_HOME/root.sh on the two nodes in turn. The software itself does not need to be reinstalled through OUI.

If the scripts fail to remove the configuration and you want the CRS reinstallation to go smoothly, you can delete the following files and directories by hand:
rm /etc/oracle/*
rm -f /etc/init.d/init.cssd
rm -f /etc/init.d/init.crs
rm -f /etc/init.d/init.crsd
rm -f /etc/init.d/init.evmd
rm -f /etc/rc2.d/K96init.crs
rm -f /etc/rc2.d/S96init.crs
rm -f /etc/rc3.d/K96init.crs
rm -f /etc/rc3.d/S96init.crs
rm -f /etc/rc5.d/K96init.crs
rm -f /etc/rc5.d/S96init.crs
rm -Rf /etc/oracle/scls_scr
rm -f /etc/inittab.crs
cp /etc/inittab.orig /etc/inittab
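
Before deleting anything by hand, it can help to see which of these files are actually present. The sketch below only lists them and removes nothing. The function name list_crs_leftovers and its optional root-prefix argument are hypothetical, added so the check can be tried against a copy of the filesystem tree.

```shell
# Sketch only: list_crs_leftovers is a hypothetical, read-only helper.
# $1 is an optional prefix (empty means the real filesystem root).
# Prints each path from the manual cleanup list that currently exists.
list_crs_leftovers() {
    root=${1:-}
    # Everything directly under /etc/oracle (includes scls_scr).
    for f in "$root"/etc/oracle/*; do
        if [ -e "$f" ]; then
            echo "$f"
        fi
    done
    # Init scripts, rc links and the saved inittab copy.
    for p in \
        /etc/init.d/init.cssd /etc/init.d/init.crs \
        /etc/init.d/init.crsd /etc/init.d/init.evmd \
        /etc/rc2.d/K96init.crs /etc/rc2.d/S96init.crs \
        /etc/rc3.d/K96init.crs /etc/rc3.d/S96init.crs \
        /etc/rc5.d/K96init.crs /etc/rc5.d/S96init.crs \
        /etc/inittab.crs
    do
        if [ -e "$root$p" ]; then
            echo "$root$p"
        fi
    done
    return 0
}

# Example:
#   list_crs_leftovers
```

An empty result suggests the scripts already cleaned up and the manual rm commands are unnecessary.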
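Summary:
Normally we keep several mirrored copies of the OCR and the voting disk, and on raw devices we additionally back them up with dd, so damage or loss is fairly unlikely. If the voting disk is nevertheless damaged and no backup exists, the only way out is to rebuild CRS by hand. This is the DBA's last resort.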
