Thursday, September 19, 2013

Upgrade RAC 11.2.0.1 to 11.2.0.3 (Part I Software Installation)

In this post I'll discuss the full implementation of upgrading Oracle RAC 11.2.0.1 to 11.2.0.3; the new 11.2.0.3 RAC will be installed on new hardware (an out-of-place upgrade).
To make this lengthy post more digestible, I've divided it into FOUR major parts:
  Part I   OS Installation, Filesystem Preparation (OCFS2 on ISCSI)
             ->Covers the Oracle Enterprise Linux 5.9 x86_64 installation, preparation of the ISCSI storage, and using OCFS2 to format the shared filesystem.
  Part II  Grid Infrastructure & Database Software Installation
             ->Covers the preparation of the Linux OS for Oracle and the installation of the 11.2.0.3 Grid Infrastructure & database software.
  Part III Standby Database Creation
             ->Covers the creation of a standby database refreshed from the 11.2.0.1 primary DB, taking advantage of a new feature that lets a standby DB on a higher 11gR2 release be refreshed from a lower 11gR2 release.
  Part IV  Database Upgrade from 11.2.0.1 to 11.2.0.3
             ->Covers switching over the new standby DB residing on the 11.2.0.3 servers to act as the primary DB, then upgrading the new primary DB from 11.2.0.1 to 11.2.0.3.

Feel free to click on the part you are interested in.

Parts I, II, and III don't require downtime, as they are carried out on different hardware as part of our out-of-place upgrade; only Part IV requires a downtime window.
The whole implementation may need less than two hours of downtime if everything goes smoothly, but it will cost the DBA long hours of work.

In this post I've tried to provide a reference for each point, and I've followed many of the recommendations of the Maximum Availability Architecture (MAA).

Part I   OS Installation, Filesystem preparation (OCFS2 on ISCSI)


The following is "good to know" information before installing Oracle 11.2.0.3 on any hardware:

Knowledge Requirements:
===================
RAC and Oracle Clusterware Best Practices and Starter Kit (Linux) [Metalink Doc ID 811306.1]
RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent) [Metalink Doc ID 810394.1]
Configuring raw devices (singlepath) for Oracle Clusterware 10g Release 2 (10.2.0) on RHEL5 [Metalink Doc ID 465001.1]

Product Support Lifetime:
===================
This document indicates that Premier Support for database release 11.2 ends in Jan 2015 and Extended Support ends in Jan 2018.

Patching Support Lifetime:
===================
This document indicates that Oracle will continue to provide security patches for version 11.2.0.3 until 27-Aug-2015.
Security patches means PSU, CPU, and SPU patches. [Metalink Doc ID 742060.1]

Hardware Certification:
=================
RAC Technologies Matrix for Linux Platforms:

The main certification matrix link covering the other platforms (Linux, Unix, Windows):

In this implementation I'll install RAC on ISCSI NAS storage.

Now let's move from the theoretical part to the technical steps...

Linux Requirements:
===============
If installing 11.2.0.3 on RHEL 5 x86_64:
The minimum requirement is Red Hat Enterprise Linux 5 Update 5 (kernel 2.6.18 or later), or Oracle Linux 5 Update 5 with the Unbreakable Enterprise Kernel (2.6.32 or later).

Partitioning requirement on the server’s local hard disk: [Minimum!]
=======================================
The local hard disk will contain the Linux OS plus the Grid Infrastructure and database binaries.
/u01  => 10G free space to hold the installation files (GI+DB). I recommend at least 30G to accommodate future generated logs.
/tmp  => 1G free space.
SWAP  => RAM is 32G, which is >8G, so SWAP = 75% of RAM = 24G.
/dev/shm => must be greater than the sum of MEMORY_MAX_TARGET for all instances if you will use the new 11g Automatic Memory Management feature, i.e. setting memory_max_target & memory_target, which manage the combined memory size of the SGA & PGA.

More about /dev/shm:
-By default, /dev/shm is sized at 50% of the RAM installed on the server; it can be resized through its tmpfs entry in /etc/fstab.
-Make the /dev/shm size equal to the memory installed on the server, or at least the sum of MEMORY_MAX_TARGET for all DBs on the server.
-/dev/shm must exist if you will use the 11g Automatic Memory Management feature (the memory_max_target parameter).
-If /dev/shm doesn't exist or isn't properly sized, the database will raise this error at startup when memory_max_target has been set:
 ORA-00845: MEMORY_TARGET not supported on this system.
-Oracle creates files under /dev/shm upon instance startup; they are removed automatically after instance shutdown.
-Oracle uses these files to manage memory dynamically between the SGA and PGA.
-It's recommended to mount /dev/shm with "tmpfs" rather than "ramfs", as ramfs is not supported for Automatic Memory Management (AMM):
 # df -h /dev/shm
 Filesystem            Size  Used Avail Use% Mounted on
 tmpfs                 16G     0  16G   0%   /dev/shm
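If /dev/shm turns out to be smaller than you need, it can be resized through its tmpfs entry in /etc/fstab. A minimal sketch, assuming a target of 30G as in the layout below (adjust the size to your RAM and the sum of MEMORY_MAX_TARGET):
 # vi /etc/fstab
 tmpfs   /dev/shm   tmpfs   size=30g   0 0
 # mount -o remount,size=30g /dev/shm   #apply the new size without a reboot
 # df -h /dev/shm                       #verify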

BTW, I'm not using this feature; I'm still sticking with sga_target & pga_aggregate_target. :-)

------------------------------
Linux OS Installation: Both Nodes (Estimated time: 3 hours)
------------------------------
Note: Perform a fresh Linux installation on all RAC nodes; DON'T clone the installation from one node to another to save time.

FS Layout:
>>>>>>>>>
The whole disk space is 300G

Filesystem   Size(G)   Size(M) used in setup
----------   -------   ---------------------
/boot        1G        1072      --Force to be a primary partition.
Swap         24G       24576     --Force to be a primary partition, 75% of RAM.
/dev/shm     30G       30720     --Force to be a primary partition.
/            20G       20480
/u01         70G       73728
/home        10G       10547
/tmp         5G        5240
/var         10G       10547
/u02         95G       The rest of the space

Note:
 If you will use ISCSI storage, avoid creating a separate partition for /usr, as this can prevent the system from booting.

Packages selection during Linux installation:
>>>>>>>>>>>>>

Desktop Environment:
 # Gnome Desktop Environment
Applications:
 #Editors -> VIM
Development:
 # Development Libraries.
 # Development Tools
 # GNOME software development
 # Java Development
 # Kernel Development
 # Legacy Software Development
 # X Software Development
Servers:
 # Legacy Network Server -> Check only: rsh-server,xinetd
 # PostgreSQL -> Check only: UNIXODBC-nnn
 # Server Configuration Tools -> Check All
Base System:
 # Administration Tools.
 # Base -> Uncheck the bluetooth and wireless packages; Check -> Device mapper multipath
 # Java
 # Legacy Software Support
 # System Tools -> Check also: OCFS2 packages
 # X Window System

FIREWALL & SELINUX MUST BE DISABLED. [Metalink Note ID 554781.1]
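A quick sketch of how I'd typically disable both on OEL 5 (assuming the stock iptables/SELinux configuration):
# service iptables stop
# chkconfig iptables off
# setenforce 0                 #switch SELinux to permissive for the current session
# vi /etc/selinux/config       #make it permanent across reboots
SELINUX=disabled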

I've uploaded the OEL 5.9 installation screenshots at this link:

Populate /etc/hosts with the IPs and their resolved names:
=======================================
# vi /etc/hosts

#You must keep the 127.0.0.1  localhost entry; if it is removed, the VIP will not work !!!
#The cluster SCAN, public, and VIP addresses should be in the same subnet.

127.0.0.1       localhost localhost.localdomain

#Public:
172.18.20.1  ora1123-node1  node1 n1
172.18.20.2  ora1123-node2  node2 n2

#Virtual:
172.18.20.3  ora1123-node1-vip node1-vip n1-vip
172.18.20.4  ora1123-node2-vip node2-vip n2-vip

#Private:
192.168.10.1      ora1123-node1-priv n1-priv node1-priv
192.168.10.2      ora1123-node2-priv n2-priv node2-priv

#Cluster:
172.18.20.10  cluster-scan

#NAS
172.20.30.100   nas nas-server

#11.2.0.1 Servers:
10.60.60.1  ora1121-node1 old1 #the current 11.2.0.1 Node1
10.60.60.2  ora1121-node2 old2 #the current 11.2.0.1 Node2

I've added the RAC node names, VIPs, and private IPs with their resolved names for both nodes, and guess what, I'm also resolving the cluster SCAN in /etc/hosts; keep it a secret, don't tell Larry :-)
Actually I'm still not convinced by the SCAN feature. If you will use it in your setup, just ask the network admin to resolve at least three SCAN IPs in the DNS for the cluster SCAN name you will use.
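Either way, it's worth verifying the name resolution before installing the Grid Infrastructure. In a DNS-based setup the SCAN name should resolve to (at least) three addresses in round-robin; in this /etc/hosts-based setup it resolves to the single IP above:
# getent hosts cluster-scan    #resolves via /etc/hosts or DNS, as per nsswitch.conf
# nslookup cluster-scan        #DNS-only check, will succeed only if the SCAN is registered in DNS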
This document will help you understand the SINGLE CLIENT ACCESS NAME (SCAN):

Upgrade the KERNEL:
==================
-Subscribe the new servers to the ULN network.
-Upgrade the kernel to the latest version.
Ensure that /etc/resolv.conf contains the DNS entries and that you are connected to the internet. Once this task is done, if you don't need the servers connected to the internet, make sure they are disconnected again for security reasons.

On linux server:
-------------------
# up2date --register

Install key? Yes

Provide the following information:

login: xxxxxx
pass:  xxxxxx
CSI#: xxxxxx


In case you still cannot establish a connection with ULN, you can use the IP 141.146.44.24 instead of the address linux-update.oracle.com under the "Network Configuration" button.
Also, in /etc/sysconfig/rhn/up2date:
      you can change this line:
      noSSLServerURL=http://linux-update.oracle.com/XMLRPC  to  noSSLServerURL=http://141.146.44.24/XMLRPC
      and this line:
      serverURL=https://linux-update.oracle.com/XMLRPC  to  serverURL=https://141.146.44.24/XMLRPC

Then proceed with updating the kernel from the same GUI or from command line as shown below:

up2date -d @  --> To download the updated packages
up2date @     --> To install the updated packages

I'm using the @ symbol to skip the GUI mode and continue in CLI mode.

Configure YUM with ULN:
--------------------------------
# cd /etc/yum.repos.d
# wget http://public-yum.oracle.com/public-yum-el5.repo
# vi public-yum-el5.repo
Modify the following:
Under both sections, [el5_latest] & [ol5_UEK_latest], change enabled=0 to enabled=1.
An excerpt:

[el5_latest]
name=Oracle Linux $releasever Latest ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL5/latest/$basearch/
gpgkey=http://public-yum.oracle.com/RPM-GPG-KEY-oracle-el5
gpgcheck=1
enabled=1

[ol5_UEK_latest]
name=Latest Unbreakable Enterprise Kernel for Oracle Linux $releasever ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL5/UEK/latest/$basearch/
gpgkey=http://public-yum.oracle.com/RPM-GPG-KEY-oracle-el5
gpgcheck=1
enabled=1
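With both repositories enabled, a minimal sketch of checking and applying the kernel update from the CLI (the kernel-uek package name is my assumption for the UEK channel; verify what yum lists on your system first):
# yum clean all
# yum list kernel\*            #see which kernel/kernel-uek versions are available
# yum update kernel-uek        #or simply 'yum update' to update everything
# reboot                       #boot into the new kernel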

Network configuration:
================
Node1:
--------
# cat /etc/sysconfig/network-scripts/ifcfg-eth0   #=>Public
DEVICE=eth0
BOOTPROTO=static
BROADCAST=172.18.20.255
IPADDR=172.18.20.1
NETMASK=255.255.255.0
NETWORK=172.18.20.0
ONBOOT=yes

# cat /etc/sysconfig/network-scripts/ifcfg-eth1   #=>ISCSI NAS
DEVICE=eth1
BOOTPROTO=static
BROADCAST=172.20.30.255
IPADDR=172.20.30.101
NETMASK=255.255.255.0
NETWORK=172.20.30.0
ONBOOT=yes

# cat /etc/sysconfig/network-scripts/ifcfg-eth3   #=>Private
DEVICE=eth3
BOOTPROTO=static
BROADCAST=192.168.10.255
IPADDR=192.168.10.1
NETMASK=255.255.255.0
NETWORK=192.168.10.0
ONBOOT=yes


Node2:
--------
# cat /etc/sysconfig/network-scripts/ifcfg-eth0   #=>Public
DEVICE=eth0
BOOTPROTO=static
BROADCAST=172.18.20.255
IPADDR=172.18.20.2
NETMASK=255.255.255.0
NETWORK=172.18.20.0
ONBOOT=yes

# cat /etc/sysconfig/network-scripts/ifcfg-eth1   #=>ISCSI NAS STORAGE
DEVICE=eth1
BOOTPROTO=static
BROADCAST=172.20.30.255
IPADDR=172.20.30.102
NETMASK=255.255.255.0
NETWORK=172.20.30.0
ONBOOT=yes

# cat /etc/sysconfig/network-scripts/ifcfg-eth3   #=>Private
DEVICE=eth3
BOOTPROTO=static
BROADCAST=192.168.10.255
IPADDR=192.168.10.2
NETMASK=255.255.255.0
NETWORK=192.168.10.0
ONBOOT=yes
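After editing the interface files, a quick sketch to apply and verify the settings on each node (interface names as configured above):
# service network restart
# ifconfig eth0 | grep "inet addr"   #public
# ifconfig eth1 | grep "inet addr"   #ISCSI NAS
# ifconfig eth3 | grep "inet addr"   #private
# ping -c 2 nas                      #reach the NAS over the storage network
# ping -c 2 n2-priv                  #from node1, reach node2 over the private network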

-----------------------------
Filesystem Preparation:
-----------------------------

RAC servers will connect to the NAS shared storage using ISCSI protocol.

ISCSI Configuration:

Required packages:
# rpm -q iscsi-initiator-utils
# yum install iscsi-initiator-utils

To make the ISCSI storage aware that the LUNs will be accessed simultaneously by more than one node, and to avoid LUN corruption, use one of the following methods (A or B):
A) Generate an IQN number
or
B) Set up a username and password for the ISCSI storage

I'll explain both of them:

A) Generate an IQN (ISCSI Qualified Name) in Linux for each node, to be registered in the NAS configuration console:

On Node1:

Generate an IQN number:
# /sbin/iscsi-iname
iqn.1988-12.com.oracle:9e963384353a

Note: the last portion of the IQN after the colon ":" is editable and can be changed to the node name; instead of "9e963384353a" you can rename it "node1". No spaces are allowed in the name.

Now insert the generated IQN to /etc/iscsi/initiatorname.iscsi
# vi /etc/iscsi/initiatorname.iscsi
#Note that the last portion of the IQN is modifiable (change it to a meaningful name)
InitiatorName=iqn.1988-12.com.oracle:node1

Do the same on Node2:

On Node2:

# /sbin/iscsi-iname
iqn.1988-12.com.oracle:18e6f43d73ad

# vi /etc/iscsi/initiatorname.iscsi
#Note that the last portion of the IQN is modifiable (change it to a meaningful name)
InitiatorName=iqn.1988-12.com.oracle:node2

Register the same IQNs you inserted in /etc/iscsi/initiatorname.iscsi on both nodes in the NAS administration console, for each LUN that will be accessed by both nodes.

(This should be done by the Storage Admin)


B) Set up a username and password for ISCSI storage:

# vi /etc/iscsi/iscsid.conf
node.session.auth.username =
node.session.auth.password =
discovery.sendtargets.auth.username =
discovery.sendtargets.auth.password =
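Depending on your initiator defaults, you may also need to enable CHAP explicitly in the same file (an assumption; check the defaults in your iscsid.conf before relying on it):
node.session.auth.authmethod = CHAP
discovery.sendtargets.auth.authmethod = CHAP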

Start the iscsi service:
# /etc/init.d/iscsi start

The same username & password should be configured in the NAS administration console. (This should be done by the Storage Admin)

Continue configuring the ISCSI:
=======================
Configure the iscsi service to start at boot:
# chkconfig iscsi on

Discover the target LUNs:

# service iscsi start
# iscsiadm -m discovery -t sendtargets -p 172.20.30.100
# (cd /dev/disk/by-path; ls -l *iscsi* | awk '{FS=" "; print $9 " " $10 " " $11}')

Whenever iscsid discovers a new target, it adds the corresponding information under the following directory:
# ls -lR /var/lib/iscsi/nodes/
# service iscsi restart
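To confirm that the node has actually logged in to the discovered targets, something like the following should do:
# iscsiadm -m session          #list the active ISCSI sessions
# iscsiadm -m node -l          #log in to all discovered targets, if not already logged in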

Create Persistent Naming: (NON Multipath Configuration)
===================
Every time the machine or the ISCSI service restarts, the /dev/sd* device names can change, e.g. data1 may end up pointing to /dev/sdc instead of /dev/sda, which is something we cannot live with at all.

Note: I have only one physical path (NIC) connecting to the NAS storage, so I'm applying a non-multipath configuration.

1) Whitelist all SCSI devices:
-- -----------------------------
# vi /etc/scsi_id.config
#Add the following lines:
vendor="ATA",options=-p 0x80
options=-g


2) Get the LUN names and their device names:
-- --------------------------------------------------
# (cd /dev/disk/by-path; ls -l *iscsi* | awk '{FS=" "; print $9 " " $10 " " $11}')

ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-archive1-lun-0 -> ../../sdn
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-archive2-lun-0 -> ../../sdr
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-backupdisk-lun-0 -> ../../sdi
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-control1-lun-0 -> ../../sdl
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-control2-lun-0 -> ../../sda
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-data1-lun-0 -> ../../sdp
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-index1-lun-0 -> ../../sdo
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-ocr1-lun-0 -> ../../sdq
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-ocr2-lun-0 -> ../../sde
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-ocr3-lun-0 -> ../../sdf
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-redo1-lun-0 -> ../../sdb
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-redo2-lun-0 -> ../../sdm
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-temp1-lun-0 -> ../../sdh
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-undo1-lun-0 -> ../../sdj
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-undo2-lun-0 -> ../../sdc
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-voting1-lun-0 -> ../../sdk
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-voting2-lun-0 -> ../../sdg
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-voting3-lun-0 -> ../../sdd

3) Get the drives' UUIDs:
-- ------------------------
scsi_id -g -s /block/sdn
scsi_id -g -s /block/sdr
scsi_id -g -s /block/sdi
scsi_id -g -s /block/sdl
scsi_id -g -s /block/sda
scsi_id -g -s /block/sdp
scsi_id -g -s /block/sdo
scsi_id -g -s /block/sdq
scsi_id -g -s /block/sde
scsi_id -g -s /block/sdf
scsi_id -g -s /block/sdb
scsi_id -g -s /block/sdm
scsi_id -g -s /block/sdh
scsi_id -g -s /block/sdj
scsi_id -g -s /block/sdc
scsi_id -g -s /block/sdk
scsi_id -g -s /block/sdg
scsi_id -g -s /block/sdd

These UUIDs are the consistent identifiers for the devices; we will use them in the next step.

4) Create the file /etc/udev/rules.d/04-oracle-naming.rules with the following format:
-- --------------------------------------------------------------------------------------
# vi /etc/udev/rules.d/04-oracle-naming.rules

#Add a line for each device specifying the device name & its UUID:
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d003000000000", NAME="archive1"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d004000000000", NAME="archive2"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d002000000000", NAME="backupdisk"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d001000000000", NAME="control1"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d005000000000", NAME="control2"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d006000000000", NAME="data1"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d007000000000", NAME="index1"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d008000000000", NAME="ocr1"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d009000000000", NAME="ocr2"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d010000000000", NAME="ocr3"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d011000000000", NAME="redo1"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d012000000000", NAME="redo2"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d013000000000", NAME="temp1"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d014000000000", NAME="undo1"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d015000000000", NAME="undo2"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d016000000000", NAME="voting1"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d017000000000", NAME="voting2"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -s /block/%k", RESULT=="360014052e3032700063d018000000000", NAME="voting3"

# service iscsi restart
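On RHEL/OEL 5 you can also reload the udev rules without restarting the ISCSI service (these commands assume the RHEL 5 udev tools):
# udevcontrol reload_rules
# start_udev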

5) Check the configuration:
-- -----------------------------
The device names under /dev should now appear as e.g. /dev/archive1 instead of /dev/sdn.
Note: fdisk -l will no longer show the new NAS devices; don't worry, use the following instead:

#(cd /dev/disk/by-path; ls -l *iscsi* | awk '{FS=" "; print $9 " " $10 " " $11}')

ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-archive1-lun-0 -> ../../archive1
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-archive2-lun-0 -> ../../archive2
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-backupdisk-lun-0 -> ../../backupdisk
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-control1-lun-0 -> ../../control1
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-control2-lun-0 -> ../../control2
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-data1-lun-0 -> ../../data1
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-index1-lun-0 -> ../../index1
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-ocr1-lun-0 -> ../../ocr1
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-ocr2-lun-0 -> ../../ocr2
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-ocr3-lun-0 -> ../../ocr3
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-redo1-lun-0 -> ../../redo1
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-redo2-lun-0 -> ../../redo2
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-temp1-lun-0 -> ../../temp1
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-undo1-lun-0 -> ../../undo1
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-undo2-lun-0 -> ../../undo2
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-voting1-lun-0 -> ../../voting1
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-voting2-lun-0 -> ../../voting2
ip-172.20.30.100:3260-iscsi-iqn.2013-7.VLA-NAS03:pefms-voting3-lun-0 -> ../../voting3

Also test the UDEV rules:
# udevtest /block/sdb | grep udev_rules_get_name

udev_rules_get_name: rule applied, 'sdb' becomes 'ocr2'
...


OCFS2 Configuration:

Required Packages:
The OCFS2 packages should have been installed during the Linux installation if you selected the right package groups.
If you didn't do so, you can download and install the required OCFS2 packages using the following commands:
# up2date --install ocfs2-tools ocfs2console
# up2date --install ocfs2-`uname -r`
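To confirm which OCFS2 packages are actually installed, and that the kernel module package matches the running kernel:
# rpm -qa | grep ocfs2
# uname -r                     #the ocfs2-<kernel> package version must match this kernel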

1) Populate /etc/ocfs2/cluster.conf:
-  -------------------------------------
In the OCFS2 configuration I'll use the heartbeat (private) NICs, not the public ones.

# mkdir -p /etc/ocfs2/
# vi /etc/ocfs2/cluster.conf
node:
        ip_port = 7000
        ip_address = 192.168.10.1
        number = 0
        name = ora1123-node1
        cluster = ocfs2

node:
        ip_port = 7000
        ip_address = 192.168.10.2
        number = 1
        name = ora1123-node2
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2

Options:
ip_port:    The default; it can be changed to any unused port.
ip_address: Using the private interconnect is highly recommended, as it's supposed to be a private network between the cluster nodes only.
number:     A unique node number from 0-254.
name:       The node name; it needs to match the hostname without the domain name.
cluster:    The name of the cluster.
node_count: The number of nodes in the cluster.

BEWARE: Be careful while editing this file; parameters must start after a tab, and a blank line must separate each stanza.
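cluster.conf must be identical on both nodes. A minimal sketch of pushing it to the second node, assuming root SSH between the nodes is allowed:
# ssh ora1123-node2 "mkdir -p /etc/ocfs2"
# scp -p /etc/ocfs2/cluster.conf ora1123-node2:/etc/ocfs2/cluster.conf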

2) Timeout Configuration:
-  -----------------------------
The O2CB cluster stack uses these timings to determine whether a node is dead or alive. Keeping default values is recommended.

# /etc/init.d/o2cb configure

Load O2CB driver on boot (y/n) [n]: y
Cluster stack backing O2CB [o2cb]:
Cluster to start on boot (Enter "none" to clear) [ocfs2]:
Specify heartbeat dead threshold (>=7) [31]: 61
Specify network idle timeout in ms (>=5000) [30000]: 60000
Specify network keepalive delay in ms (>=1000) [2000]:
Specify network reconnect delay in ms (>=2000) [2000]:

Heartbeat Dead Threshold: the number of two-second iterations before a node is considered dead. 61 is recommended for multipath users; for my setup I set it to 61, which gives a timeout of roughly 120 seconds.
Network Idle Timeout: the time in milliseconds before a network connection is considered dead. 60000 ms is recommended.

Configure the cluster stack to load on boot:
-------------------------------------------
# chkconfig --add o2cb
# chkconfig --add ocfs2
# /etc/init.d/o2cb load
# /etc/init.d/o2cb start ocfs2
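To confirm on each node that the cluster stack is loaded and the ocfs2 cluster is online:
# /etc/init.d/o2cb status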


Filesystem Partitioning: OCFS2
==================
As per the labels given to the NAS disks, I'll assign the same names as OCFS2 labels.

# fdisk -l |grep /dev
# (cd /dev/disk/by-path; ls -l *iscsi* | awk '{FS=" "; print $9 " " $10 " " $11}')

Formatting:
--------------
# mkfs.ocfs2 -F -b 4K -C 32K -N 2 -L ocr1  /dev/ocr1
# mkfs.ocfs2 -F -b 4K -C 32K -N 2 -L ocr2  /dev/ocr2
# mkfs.ocfs2 -F -b 4K -C 32K -N 2 -L ocr3  /dev/ocr3
# mkfs.ocfs2 -F -b 4K -C 32K -N 2 -L voting1  /dev/voting1
# mkfs.ocfs2 -F -b 4K -C 32K -N 2 -L voting2  /dev/voting2
# mkfs.ocfs2 -F -b 4K -C 32K -N 2 -L voting3  /dev/voting3
# mkfs.ocfs2 -F -b 4k -C 8k -N 2 -L redo1  -J size=64M /dev/redo1
# mkfs.ocfs2 -F -b 4k -C 8k -N 2 -L redo2  -J size=64M /dev/redo2
# mkfs.ocfs2 -F -b 4k -C 8k -N 2 -L control1  -J size=64M /dev/control1
# mkfs.ocfs2 -F -b 4k -C 8k -N 2 -L control2  -J size=64M /dev/control2
# mkfs.ocfs2 -F -b 4k -C 8k -N 2 -L archive1  -J size=64M /dev/archive1
# mkfs.ocfs2 -F -b 4k -C 8k -N 2 -L archive2   -J size=64M /dev/archive2
# mkfs.ocfs2 -F -b 4k -C 8k -N 2 -L undo1  -J size=64M /dev/undo1
# mkfs.ocfs2 -F -b 4k -C 8k -N 2 -L undo2  -J size=64M /dev/undo2
# mkfs.ocfs2 -F -b 4k -C 8k -N 2 -L data1  -J size=64M /dev/data1
# mkfs.ocfs2 -F -b 4k -C 8k -N 2 -L index1   -J size=64M /dev/index1
# mkfs.ocfs2 -F -b 4k -C 8k -N 2 -L temp1   -J size=64M /dev/temp1
# mkfs.ocfs2 -F -b 4k -C 1M -N 2 -L backupdisk  -J size=64M /dev/backupdisk

Options:
-F Force: overwrite the data if the device was previously formatted by OCFS2.
-b Block size, from 512 bytes to 4k (default); 4k is recommended (a smaller block size means a smaller maximum size: maxsize = 2^32 * blocksize, so with blocksize=4096 the maximum size is 16T).
-C Cluster size, from 4k (default) to 1M; 4k is recommended EXCEPT for datafile partitions, where it should equal the database block size (8k).
   For backup storage filesystems holding RMAN backups and dump files, use a bigger cluster size.
   128k is the recommended default cluster size if you're not sure which one to use.
-N Number of node slots, i.e. the number of nodes that can mount the volume concurrently; it's recommended to set it higher than required, e.g. if you have two nodes set it to 4. This parameter can be increased later using tunefs.ocfs2, but doing so can lead to bad performance.
-L Label name; labeling the volume allows consistent (persistent) naming across the cluster, even when using ISCSI.
-J Journal size, 256MB (default); recommended values are 64MB for datafiles, 128MB for vmstore, and 256MB for mail.
-T Filesystem type (datafiles, mail, vmstore):
    (datafiles) recommended for database filesystems; sets blocksize=4k, clustersize=128k, journal size=32M.
    (vmstore)   recommended for backup filesystems; sets blocksize=4k, clustersize=128k, journal size=128M.

Note: For the filesystems that hold the database files, set the cluster size with -C 8k.
      For the filesystems that hold backup files, set the cluster size with -C 1M.
      If you're not sure which cluster size to use, use 128k; it has proven to be a reasonable trade-off between wasted space and performance.
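After formatting, you can verify the labels and UUIDs of all OCFS2 volumes from either node (mounted.ocfs2 ships with ocfs2-tools):
# mounted.ocfs2 -d             #list OCFS2 devices with their labels and UUIDs
# mounted.ocfs2 -f             #show which nodes currently have each volume mounted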

Mounting the partitions:
-----------------------------
mkdir /ora_redo1
mkdir /ora_backupdisk
mkdir /ora_undo1
mkdir /ora_undo2
mkdir /ora_control2
mkdir /ora_control1
mkdir /ora_archive1
mkdir /ora_redo2
mkdir /ora_temp1
mkdir /ora_index1
mkdir /ora_archive2
mkdir /ora_data1
mkdir /ora_ocr1
mkdir /ora_ocr2
mkdir /ora_ocr3
mkdir /ora_voting1
mkdir /ora_voting2
mkdir /ora_voting3

chown -R oracle:oinstall /ora*
chmod 750 /ora*

Mounting the partitions automatically when system restart:
-----------------------------------------------------------------
vi /etc/fstab

LABEL=ocr1  /ora_ocr1 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=ocr2  /ora_ocr2 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=ocr3  /ora_ocr3 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=voting1  /ora_voting1 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=voting2  /ora_voting2 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=voting3  /ora_voting3 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=control1  /ora_control1 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=control2  /ora_control2 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=redo1  /ora_redo1 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=redo2  /ora_redo2 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=archive1  /ora_archive1 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=archive2  /ora_archive2 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=temp1  /ora_temp1 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=undo1  /ora_undo1 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=undo2  /ora_undo2 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=index1  /ora_index1 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=data1  /ora_data1 ocfs2   _netdev,datavolume,nointr   0   0
LABEL=backupdisk /ora_backupdisk ocfs2   _netdev         0   0

Partitions mount options:
>>>>>>>>>>>>>>>>>
_netdev:    mandatory; prevents attempting to mount the filesystem until the network has been enabled on the system.
datavolume: forces direct I/O; used with filesystems containing Oracle data files, control files, redo/archive logs, and voting/OCR disks. It gives the same behavior as the init.ora parameter filesystemio_options.
            The datavolume mount option MUST NOT be used on volumes hosting the Oracle home, Oracle E-Business Suite, or anything else.
nointr:     default; blocks signals from interrupting certain cluster operations (disables interrupts).
rw:         default; mounts the FS in read-write mode.
ro:         mounts the FS in read-only mode.
noatime:    disables access time updates, improving performance (important for DB/cluster files).
atime_quantum=: updates the atime of files every 60 seconds (default); frequent atime updates degrade performance.
commit=:    optional; syncs all data every 5 seconds (default). In case of failure you will lose the last 5 seconds of work (the filesystem will not be damaged, thanks to journaling). A higher value improves performance at the cost of a higher data-loss risk.

After adding these entries to /etc/fstab, you can mount the partitions using these commands:
# mount -a

OR:
# mount -L "temp1" /ora_temp1

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/cciss/c1d0p6      20G  3.9G   15G  22% /
/dev/cciss/c1d0p10     98G  7.0G   86G   8% /u02
/dev/cciss/c1d0p9     5.0G  139M  4.6G   3% /tmp
/dev/cciss/c1d0p8      10G  162M  9.4G   2% /home
/dev/cciss/c1d0p7      10G  629M  8.9G   7% /var
/dev/cciss/c1d0p2      16G     0   16G   0% /dev/shm
/dev/cciss/c1d0p5      70G  180M   66G   1% /u01
/dev/cciss/c1d0p1    1003M   76M  876M   8% /boot
tmpfs                  16G     0   16G   0% /dev/shm
/dev/sdm              1.0G  143M  882M  14% /ora_ocr1
/dev/sdd              1.0G  143M  882M  14% /ora_ocr2
/dev/sde              1.0G  143M  882M  14% /ora_ocr3
/dev/sdk              1.0G  143M  882M  14% /ora_voting1
/dev/sdi              1.0G  143M  882M  14% /ora_voting2
/dev/sdg              1.0G  143M  882M  14% /ora_voting3
/dev/sdl               10G  151M  9.9G   2% /ora_control1
/dev/sda               10G  151M  9.9G   2% /ora_control2
/dev/sdb               10G  151M  9.9G   2% /ora_redo1
/dev/sdr               10G  151M  9.9G   2% /ora_redo2
/dev/sdp              300G  456M  300G   1% /ora_archive1
/dev/sdn              300G  456M  300G   1% /ora_archive2
/dev/sdf               60G  205M   60G   1% /ora_temp1
/dev/sdj               40G  184M   40G   1% /ora_undo1
/dev/sdc               40G  184M   40G   1% /ora_undo2
/dev/sdo              200G  349M  200G   1% /ora_index1
/dev/sdq              400G  563M  400G   1% /ora_data1
/dev/sdh              500G  674M  500G   1% /ora_backupdisk

Performance Tip: Ensure updatedb doesn't scan the OCFS2 partitions by adding the "ocfs2" keyword to the "PRUNEFS =" list in /etc/updatedb.conf.
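A sketch of what the edit may look like; the placeholder stands for whatever filesystem types are already listed in your updatedb.conf:
# vi /etc/updatedb.conf
PRUNEFS = "<your existing entries> ocfs2"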


 ///////////////////////////////////////////////////////////////
 In case of using ASM for the shared storage (very quick guide)
 ///////////////////////////////////////////////////////////////
 Note: Don't use persistent naming unless you finish configuring ASM first.

 Install ASMLib 2.0 Packages:
 ---------------------------
 # rpm -qa --queryformat "%{NAME}-%{VERSION}-%{RELEASE} (%{ARCH})\n"| grep oracleasm | sort
 oracleasm-2.6.18-348.el5-2.0.5-1.el5 (x86_64)
 oracleasmlib-2.0.4-1.el5 (x86_64)
 oracleasm-support-2.1.7-1.el5 (x86_64)

 Configure ASMLib:
 ----------------
 # /usr/sbin/oracleasm configure -i
 Default user to own the driver interface []: oracle
 Default group to own the driver interface []: dba
 Start Oracle ASM library driver on boot (y/n) [n]: y
 Scan for Oracle ASM disks on boot (y/n) [y]: y

 # /usr/sbin/oracleasm init

 Use FDISK to create RAW partition for each disk:
 -----------------------------------------------
 # fdisk /dev/sdn
  n
  p
  1
 
 
  w

 Do the same for other disks....
 Commit your changes without the need to restart the system using this command:
 # partprobe

 Create ASM Disks:
 ----------------
 # /usr/sbin/oracleasm createdisk OCR1 /dev/sdn1
 # /usr/sbin/oracleasm createdisk OCR2 /dev/sdd1
 # /usr/sbin/oracleasm createdisk OCR3 /dev/sde1
 # /usr/sbin/oracleasm createdisk voting1 /dev/sdk1
 # /usr/sbin/oracleasm createdisk voting2 /dev/sdi1
 # /usr/sbin/oracleasm createdisk voting3 /dev/sdj1
 ... and so on

 SCAN ASM Disks:
 --------------
 # /usr/sbin/oracleasm scandisks
 Reloading disk partitions: done
 Cleaning any stale ASM disks...
 Scanning system for ASM disks...
 Instantiating disk "OCR1"
 Instantiating disk "OCR2"
 ....

 # /usr/sbin/oracleasm listdisks
 OCR1
 ...

 # oracleasm querydisk /dev/sdn1

 Diskgroup creation will be done from the installer.

 ////////////////////////////////////////////////////////////////////

In case you want to use RAW DEVICES for the shared storage:
Note that starting with 11gR2, using DBCA or the installer to store Oracle Clusterware or Oracle Database files on block or raw devices is not supported.


Next:

In Part II I'll continue with the OS preparation for the RAC setup, and the Grid Infrastructure and database software installation on the new servers.
