Tuesday, 14 June 2016

RAC boot sequence

RAC Startup Sequence

Stop sequence for a node:
emctl stop dbconsole
srvctl stop database -d db_name
srvctl stop asm -n node_name
srvctl stop nodeapps -n node_name --> in 11.2 this stops only ONS and eONS, because of resource dependencies.
crsctl stop crs    (run as the root user)

Or stop CRS on all nodes of the cluster:
crsctl stop cluster -all
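
To verify that the stack is really down after these steps, something like the following can be used (a quick check, not an exhaustive one):

# reports whether CRS, CSS and EVM are still running on the local node
crsctl check crs
# the same check across all cluster nodes
crsctl check cluster -all
# the clusterware daemons (ohasd.bin, ocssd.bin, crsd.bin, ...) should no longer show up
ps -ef | grep d.bin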

Start: run the same steps in reverse order, from the bottom up.
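
A minimal sketch of the start sequence (db_name is a placeholder for your database; run the crsctl commands as root):

# start OHASD and the full clusterware stack on this node
crsctl start crs
# or, once OHASD is already running on the nodes, start the clusterware stack cluster-wide
crsctl start cluster -all
# after the clusterware and ASM are up, start the database and Database Control
srvctl start database -d db_name
emctl start dbconsole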



Level 1: OHASD Spawns:
    cssdagent         – Agent responsible for spawning CSSD.
    orarootagent     – Agent responsible for managing all root owned ohasd resources.
    oraagent         – Agent responsible for managing all oracle owned ohasd resources.
    cssdmonitor        – Monitors CSSD and node health (along with the cssdagent).
   
Level 2: OHASD rootagent spawns:
    CSSD (ora.cssd)     – Cluster Synchronization Services
    CRSD(ora.crsd)     – Primary daemon responsible for managing cluster resources.
    CTSSD(ora.ctssd)     – Cluster Time Synchronization Services Daemon
    Diskmon(ora.diskmon)     – Disk monitor daemon (used with Exadata storage)
    ACFS (ASM Cluster File System) Drivers
Level 2: OHASD oraagent spawns:
    MDNSD(ora.mdnsd)     – Multicast DNS daemon, used for DNS lookups within the cluster
    GIPCD(ora.gipcd)     – Used for inter-process and inter-node communication
    GPNPD(ora.gpnpd)     – Grid Plug & Play Profile Daemon
    EVMD(ora.evmd)     – Event Monitor Daemon
    ASM(ora.asm)     – Resource for monitoring ASM instances
Level 3: CRSD spawns:
    orarootagent     – Agent responsible for managing all root owned crsd resources.
    oraagent         – Agent responsible for managing all oracle owned crsd resources.
Level 4: CRSD rootagent spawns:
    Network resource     – To monitor the public network
    SCAN VIP(s)     – Single Client Access Name Virtual IPs
    Node VIPs         – One per node
    ACFS Registry     – For mounting ASM Cluster File System
    GNS VIP (optional)     – VIP for GNS
Level 4: CRSD oraagent spawns:
    ASM Resource     – ASM Instance(s) resource
    Diskgroup         – Used for managing/monitoring ASM diskgroups.
    DB Resource     – Used for monitoring and managing the DB and instances
    SCAN Listener     – Listener for single client access name, listening on SCAN VIP
    Listener         – Node listener listening on the Node VIP
    Services         – Used for monitoring and managing services
    ONS         – Oracle Notification Service
    eONS         – Enhanced Oracle Notification Service
    GSD         – For 9i backward compatibility
    GNS (optional)     – Grid Naming Service – Performs name resolution
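
The two agent trees described above can be observed directly with crsctl. A quick sketch (11.2 syntax; output format varies slightly between patch levels):

# lower stack: resources managed by OHASD (ora.cssd, ora.ctssd, ora.crsd, ora.asm, ...)
crsctl stat res -init -t
# upper stack: resources managed by CRSD (VIPs, SCAN, listeners, databases, services, ...)
crsctl stat res -t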
====
Clusterware startup command: $GI_HOME/bin/crsctl start crs
One way to see this startup order in practice is to look at some of the output of the root.sh script from the initial Grid Infrastructure installation, as follows:
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
CRS-2672: Attempting to start 'ora.gipcd' on 'owirac1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'owirac1'
CRS-2676: Start of 'ora.mdnsd' on 'owirac1' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'owirac1'
CRS-2676: Start of 'ora.gipcd' on 'owirac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'owirac1'
CRS-2676: Start of 'ora.gpnpd' on 'owirac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'owirac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'owirac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'owirac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'owirac1'
CRS-2676: Start of 'ora.diskmon' on 'owirac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'owirac1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'owirac1'
CRS-2676: Start of 'ora.ctssd' on 'owirac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'owirac1'
CRS-2676: Start of 'ora.asm' on 'owirac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'owirac1'
CRS-2676: Start of 'ora.crsd' on 'owirac1' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'owirac1'
CRS-2676: Start of 'ora.evmd' on 'owirac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'owirac1'
CRS-2676: Start of 'ora.asm' on 'owirac1' succeeded
CRS-2672: Attempting to start 'ora.OCRVOTDSK.dg' on 'owirac1'
CRS-2676: Start of 'ora.OCRVOTDSK.dg' on 'owirac1' succeeded
CRS-2672: Attempting to start 'ora.registry.acfs' on 'owirac1'
CRS-2676: Start of 'ora.registry.acfs' on 'owirac1' succeeded
This sequence shows that the ASM instance startup is just one step in the middle of the overall sequence: some CRS components such as CSSD and CTSSD start before ASM, while others such as CRSD, EVMD and ACFS come up after ASM starts. The sequence can also be confirmed by the timestamps and log messages in the clusterware log files (alert<hostname>.log, ocssd.log and crsd.log) and in the ASM instance alert log (alert_+ASM1.log). Here is the sequence of messages and their timestamps during the startup of the 11gR2 clusterware and the ASM instance:
OLR service started  : 2011-01-17 14:33:13.678
Starting CSS daemon 2011-01-17 14:33:18.684:
Fetching asmlib disk :ORCL:OCR1 : 2011-01-17 14:33:24.825
Read ASM header off dev:ORCL:OCR3:224:256
Opened hdl:0x1d485110 for dev:ORCL:OCR1: 2011-01-17 14:33:24.829
Successful discovery for disk ORCL:OCR1 : 2011-01-17 14:33:24.837
Successful discovery of 5 disks: 2011-01-17 14:33:24.838
CSSD voting file is online: ORCL:OCR1:  2011-01-17 14:33:50.047
CSSD Reconfiguration complete: 2011-01-17 14:34:07.729
The Cluster Time Synchronization Service started:  2011-01-17 14:34:12.333
Note: ** CSSD and CTSSD came up before ASM. The voting disks were discovered by reading the headers of the ASM disks (ORCL:OCR1) of the voting disk diskgroup, without using the ASM instance **
Starting ASM: Jan 17 14:34:13 2011 
CRS Daemon Starting 2011-01-17 14:34:30.329:  
Checking the OCR device : 2011-01-17 14:34:30.331
Initializing OCR 2011-01-17 14:34:30.337
diskgroup OCRVOTDSK was mounted : Jan 17 14:34:30 2011
OCRVOTDSK was mounted : Mon Jan 17 14:34:30 2011
The OCR service started : 2011-01-17 14:34:30.835
Verified ocr1-5: 2011-01-17 14:33:50.128
CRSD started: 2011-01-17 14:34:31.902
Note: CRSD started after ASM was up and the diskgroup holding the OCR and voting disks was mounted
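
The log files referenced above are in the standard 11.2 locations. Assuming the defaults (GRID_HOME, the hostname and the ASM SID +ASM1 are placeholders for your own environment):

# clusterware alert log and daemon logs (11.2 layout)
$GRID_HOME/log/<hostname>/alert<hostname>.log
$GRID_HOME/log/<hostname>/cssd/ocssd.log
$GRID_HOME/log/<hostname>/crsd/crsd.log
# ASM instance alert log (ADR layout)
$ORACLE_BASE/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log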
From this sequence of log messages and timestamps, we can draw a few conclusions about the startup order of the clusterware and the ASM instance:
1) CSSD and CTSSD are up before ASM.
2) The voting disks used by CSSD are discovered by reading the disk headers, not through ASM (this can be confirmed with the commands shown below).
3) The startup of CRSD has to wait until the ASM instance is up and the diskgroup holding the OCR and voting disks is mounted.
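
Points 2) and 3) can also be checked from the command line. A small sketch using standard 11.2 commands (the exact output depends on your configuration):

# voting disks are listed with the underlying disk paths, read from the disk headers
crsctl query css votedisk
# shows the diskgroup/device that holds the OCR and checks its integrity
ocrcheck
# print ora.crsd's profile from the lower stack to see its start/stop dependencies
crsctl stat res ora.crsd -init -p | grep DEPENDENCIES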


