Friday, September 20, 2013

Upgrade from RAC 11.2.0.1 to 11.2.0.3 (Part II OS preparation, Grid Infrastructure and Database software installation)

In Part I I've installed the Linux OS and prepared the shared filesystem (iSCSI configuration & OCFS2).
In this part I'll prepare the Linux OS for the RAC installation, install Grid Infrastructure 11.2.0.3 and install the Database software.

Packages Requirements: (install the same packages version or later)
==================
OEL 5 Required packages for 11.2.0.2 and later versions:
--------------------------------------------------------------------
rpm -qa | grep binutils-2.17.50.0.6
rpm -qa | grep compat-libstdc++-33-3.2.3
rpm -qa | grep elfutils-libelf-0.1
rpm -qa | grep elfutils-libelf-devel-0.1
rpm -qa | grep gcc-4.1.2
rpm -qa | grep gcc-c++-4.1.2
rpm -qa | grep glibc-2.5
rpm -qa | grep glibc-common-2.5
rpm -qa | grep glibc-devel-2.5
rpm -qa | grep glibc-headers-2.5
rpm -qa | grep ksh-2
rpm -qa | grep libaio-0.3.106
rpm -qa | grep libaio-devel-0.3.106
rpm -qa | grep libgcc-4.1.2
rpm -qa | grep libstdc++-4.1.2
rpm -qa | grep libstdc++-devel-4.1.2
rpm -qa | grep make-3.81
rpm -qa | grep sysstat-7.0.2
rpm -qa | grep unixODBC-2.2.11       #=> (32-bit) or later
rpm -qa | grep unixODBC-devel-2.2.11 #=> (64-bit) or later
rpm -qa | grep unixODBC-2.2.11       #=> (64-bit) or later
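Tip: plain "rpm -qa" output doesn't show the package architecture, so the 32-bit vs 64-bit checks above are ambiguous. A sketch using rpm's query-format option to include the architecture in the listing:
rpm -qa --qf '%{NAME}-%{VERSION}-%{RELEASE} (%{ARCH})\n' | grep unixODBC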

In case you have missing packages try installing them from the Linux installation DVD:
e.g.
cd /media/OL5.9\ x86_64\ dvd\ 20130429/Server/
rpm -ivh numactl-devel-0.9.8-12.0.1.el5_6.i386.rpm

The easiest way to download and install the 11gR2 required packages and apply the OS settings is to install the oracle-rdbms-server-11gR2-preinstall package:
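e.g. (assuming the server is registered with ULN or the Oracle public yum repository is configured):
# yum install oracle-rdbms-server-11gR2-preinstall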

Make sure to install Oracle on a NON Tainted Kernel:
-------------------------------------------------------------
What does a tainted kernel mean:
 -A module outside the stock kernel has changed the kernel (e.g. a proprietary module).
 -A module has been force loaded with insmod -f.
 -Whether the Oracle installation succeeds, and whether Oracle supports that database and that Linux system, will depend on the module that tainted the kernel.
Oracle Support may not support your system (Linux, database) if a module in the kernel has tainted it.
How to check whether the kernel is tainted:
# cat /proc/sys/kernel/tainted
1
If the output is non-zero, the kernel is tainted; contact Oracle Support and ask for their help in deciding whether to proceed with the Oracle installation.
If the output is 0, the kernel is not tainted and you're good to go to install the Oracle software.
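Note: /proc/sys/kernel/tainted is actually a bit mask (the value 1 means a proprietary, non-GPL module was loaded; other bits flag forced loads and similar events). A quick sketch to dig further:
# cat /proc/sys/kernel/tainted
# lsmod     => list the loaded modules to spot the third-party one responsible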

Network Requirements:
=================
> Each node must have at least two NICs.
> Recommended to use NIC bonding for public NIC, use HAIP for private NIC (11.2.0.2 onwards).
> Recommended to use redundant switches along with NIC bonding.
> Public & private interface names must be identical on all nodes (e.g. eth0 is the public NIC on all nodes).
> Crossover cables between private RAC NICs are NOT supported (a gigabit switch is the minimum requirement). Crossover cables limit the expansion of RAC to two nodes, perform badly due to excess packet collisions, and cause unstable negotiation between the NICs.
> Public NICs and VIPs / SCAN VIPs must be on the same subnet. Private NICs must be on a different subnet.
> For private interconnect use non-routable addresses:
   [From 10.0.0.0    to  10.255.255.255 or
    From 172.16.0.0  to  172.31.255.255 or
    From 192.168.0.0 to  192.168.255.255]
> Default GATEWAY must be on the same Public | VIPs | SCAN VIPs subnet.
> If you will use the SCAN VIP, the SCAN name is recommended to resolve via DNS to a minimum of 3 IP addresses.
> /etc/hosts or DNS must include PUBLIC & VIP IPs with the host names.
> SCAN IPs should not be in /etc/hosts. Those not willing to use SCAN can still put the SCAN name there, just to let the Grid Infrastructure installation succeed.
> NIC names must NOT include DOT "."
> Every node in the cluster must be able to connect to every private NIC in each node.
> Host names for nodes must NOT have underscores (_).
> Linux Firewall (iptables) must be disabled, at least on the private network. If you plan to enable the firewall, I recommend keeping it disabled until you finish installing all the Oracle products, to make installation problems easy to troubleshoot; once you finish you can enable the firewall, then feel free to blame it if something doesn't work :-).
> Oracle recommends disabling network zeroconf, as it can cause node eviction:
  # route -n  => If you find a 169.254.0.0 line, zeroconf is enabled on your OS (the default); the next step is to disable it by doing the following:
  # vi /etc/sysconfig/network
  #Add this line:
  NOZEROCONF=yes
Restart the network:
  # service network restart
> Recommended to use JUMBO frames for interconnect: [Note: 341788.1]
  Warning: although it's available in most network devices, it's not supported by some NICs (especially Intel NICs) & switches. JUMBO frames must also be enabled on the interconnect switch (doing a test is mandatory).
  # suppose that eth3 is your interconnect NIC:
  # vi /etc/sysconfig/network-scripts/ifcfg-eth3
  #Add the following parameter:
  MTU=9000
  # ifdown eth3; ifup eth3
  # ifconfig -a eth3  => you will see MTU=9000 (the default MTU is 1500)
  Testing JUMBO frames using the traceroute command (during the test we shouldn't see "Message too long" in the output):
  =>From Node1:
  # traceroute -F node2-priv 8970
    traceroute to n2-priv (192.168.110.2), 30 hops max, 9000 byte packets
    1  node2-priv (192.168.110.2)  0.269 ms  0.238 ms  0.226 ms
  =>This test was OK
  =>In case you get the message "Message too long", try reducing the MTU until the message stops appearing.
  Testing JUMBO frames using ping: (with MTU=9000, test with 8970 bytes, not more)
  =>From Node1:
  # ping -c 2 -M do -s 8970 node2-priv
    1480 bytes from node2-priv (192.168.110.2): icmp_seq=0 ttl=64 time=0.245 ms
  =>This test was OK.
  =>In case you get the message "Frag needed and DF set (mtu = 9000)", reduce the MTU till you get the previous output.
> Stop the avahi-daemon: recommended by Oracle, as it causes node eviction and can prevent a node from re-joining the cluster [Note: 1501093.1]
  # service avahi-daemon stop
  # chkconfig avahi-daemon off

Create new Grid & Oracle home:
========================
mkdir -p /u01/grid/11.2.0.3/grid
mkdir -p /u01/oracle/11.2.0.3/db
chown -R oracle:dba /u01
chown oracle:oinstall /u01
chmod 700 /u01
chmod 750 /u01/oracle/11.2.0.3/db

Note: Oracle user, DBA and OINSTALL groups are created during Oracle Enterprise Linux installation.
Note: I'll install Grid & Oracle with the oracle user; I'll not create a new user to be the grid installation owner.

Adding environment variables to Oracle profile:
-----------------------------------------------------
I use a lot of command aliases inside the oracle user profile to speed up my administration work; I think they may be helpful for you too. Some aliases refer to helpful shell scripts, like one that checks locking sessions on the DB and more; I'll share them with you in future posts.

# su - oracle
# vi .bash_profile  


# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
 . ~/.bashrc
fi
if [ -t 0 ]; then
   stty intr ^C
fi

umask 022
# User specific environment and startup programs
unset USERNAME
ORACLE_SID=pefms1
export ORACLE_SID
ORACLE_BASE=/u01/oracle
export ORACLE_BASE
ORACLE_HOME=/u01/oracle/11.2.0.3/db; export ORACLE_HOME
GRID_HOME=/u01/grid/11.2.0.3/grid
export GRID_HOME
LD_LIBRARY_PATH=$ORACLE_HOME/lib; export LD_LIBRARY_PATH
TNS_ADMIN=$ORACLE_HOME/network/admin
export TNS_ADMIN
CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib
CLASSPATH=$CLASSPATH:$ORACLE_HOME/network/jlib; export CLASSPATH
TMP=/tmp; export TMP
TMPDIR=$TMP; export TMPDIR
PATH=$PATH:$HOME/bin:$ORACLE_HOME/bin:$GRID_HOME/bin:/usr/ccs/bin:/usr/bin/X11/:/usr/local/bin:$ORACLE_HOME/OPatch
export PATH
export ORACLE_UNQNAME=pefms
export ORACLE_HOSTNAME=ora1123-node1
alias profile='cd;. ./.bash_profile;cd -'
alias viprofile='cd; vi .bash_profile'
alias catprofile='cd; cat .bash_profile'
alias tnsping='$ORACLE_HOME/bin/./tnsping'
alias pefms='export ORACLE_SID=pefms; echo $ORACLE_SID'
alias sql="sqlplus '/ as sysdba'"
alias alert="tail -100f $ORACLE_HOME/diagnostics/pefms/diag/rdbms/pefms/pefms1/trace/alert_pefms1.log"
alias vialert="vi $ORACLE_HOME/diagnostics/pefms/diag/rdbms/pefms/pefms1/trace/alert_pefms1.log"
alias lis="vi $ORACLE_HOME/network/admin/listener.ora"
alias tns="vi $ORACLE_HOME/network/admin/tnsnames.ora"
alias sqlnet="vi $ORACLE_HOME/network/admin/sqlnet.ora"
alias sqlnetlog='vi $ORACLE_HOME/log/diag/clients/user_oracle/host_2245657081_76/trace/sqlnet.log'
alias network=" cd $ORACLE_HOME/network/admin;ls -rtlh;pwd"
alias arc="cd /ora_archive1/pefms/; ls -rtlh|tail -50;pwd"
alias p="ps -ef|grep pmon|grep -v grep"
alias oh="cd $ORACLE_HOME;ls;pwd"
alias dbs="cd $ORACLE_HOME/dbs;ls -rtlh;pwd"
alias pfile="vi $ORACLE_HOME/dbs/initpefms1.ora"
alias catpfile="cat $ORACLE_HOME/dbs/initpefms1.ora"
alias spfile="cd /fiber_ocfs_pefms_data_1/oracle/pefms; cat spfilepefms1.ora"
alias bdump='cd $ORACLE_HOME/diagnostics/pefms/diag/rdbms/pefms/pefms1/trace;ls -lrt|tail -10;pwd'
alias udump='cd $ORACLE_HOME/diagnostics/pefms/diag/rdbms/pefms/pefms1/trace;ls -lrt;pwd';
alias cdump='cd $ORACLE_HOME/diagnostics/pefms/diag/rdbms/pefms/pefms1/cdump;ls -lrt;pwd'
alias rman='cd $ORACLE_HOME/bin; ./rman target /'
alias listenerlog='tail -100f $ORACLE_BASE/diag/tnslsnr/ora1123-node1/listener/trace/listener.log'
alias vilistenerlog='vi $ORACLE_BASE/diag/tnslsnr/ora1123-node1/listener/trace/listener.log'
alias listenerpefms1log='tail -100f $ORACLE_HOME/log/diag/tnslsnr/ora1123-node1/listener_pefms1/trace/listener_pefms1.log '
alias listenerpefms2log='tail -100f $ORACLE_HOME/log/diag/tnslsnr/ora1123-node2/listener_pefms2/trace/listener_pefms2.log'
alias listenertail='tail -100f $ORACLE_BASE/diag/tnslsnr/ora1123-node1/listener/trace/listener.log'
alias cron='crontab -e'
alias crol='crontab -l'
alias df='df -h'
alias ll='ls -rtlh'
alias lla='ls -rtlha'
alias l='ls'
alias patrol='sh /home/oracle/patrol.sh'
alias datafiles='sh /home/oracle/db_size.sh'
alias locks='sh /home/oracle/locks.sh'
alias objects='sh /home/oracle/object_size.sh'
alias jobs='sh /home/oracle/jobs.sh'
alias crs='$GRID_HOME/bin/crsstat'
alias crss='crs|grep -v asm|grep -v acfs|grep -v gsd|grep -v oc4j|grep -v ora.cvu'
alias raclog='tail -100f $GRID_HOME/log/ora1123-node1/alertora1123-node1.log'
alias viraclog='vi $GRID_HOME/log/ora1123-node1/alertora1123-node1.log'
alias datafile='sh /home/oracle/db_size.sh'
alias invalid='sh /home/oracle/Invalid_objects.sh'
alias d='date'
alias dc='d;ssh n2 date'
alias aud='cd $ORACLE_HOME/rdbms/audit;ls -rtl|tail -200'
alias lastdb='/home/oracle/lastdb.sh'
alias sessions='/home/oracle/sessions.sh'
alias spid='sh /home/oracle/spid.sh'
alias spidd='sh /home/oracle/spid_full_details.sh'
alias session='/home/oracle/session.sh'
alias killsession='/home/oracle/kill_session.sh'
alias unlock='/home/oracle/unlock_user.sh'
alias sqlid='/home/oracle/sqlid.sh'
alias parm='/home/oracle/parm.sh'
alias grid='cd /u01/grid/11.2.0.3/grid; ls; pwd'
alias lsn='ps -ef|grep lsn|grep -v grep'

When adding the variables to the Oracle profile on the other node, change the node name from ora1123-node1 to ora1123-node2.

Configure SYSTEM parameters:
========================

All parameters should be same or greater on the OS:
----------------------------------------------------
# /sbin/sysctl -a | grep sem           #=> semaphore parameters (250 32000 100 142).
# /sbin/sysctl -a | grep shm           #=> shmmax, shmall, shmmni (536870912, 2097152, 4096).
# /sbin/sysctl -a | grep file-max     #=> (6815744).
# /sbin/sysctl -a | grep ip_local_port_range  #=> Minimum: 9000, Maximum: 65500
# /sbin/sysctl -a | grep rmem_default  #=> (262144).
# /sbin/sysctl -a | grep rmem_max      #=> (4194304).
# /sbin/sysctl -a | grep wmem_default #=> (262144).
# /sbin/sysctl -a | grep wmem_max     #=> (1048576).
# /sbin/sysctl -a | grep aio-max-nr    #=> (Minimum: 1048576) limits concurrent requests to avoid I/O Failures.

Note:
If the current value of any parameter is higher than the value listed above, then do not change the value of that parameter.
If you will change any parameter on /etc/sysctl.conf then issue the command: sysctl -p
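For reference, a minimal /etc/sysctl.conf sketch holding the minimum values from the checklist above (keep your existing values if they are already higher):
kernel.sem = 250 32000 100 142
kernel.shmmax = 536870912
kernel.shmall = 2097152
kernel.shmmni = 4096
fs.file-max = 6815744
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576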

Check limits.conf values:
vi /etc/security/limits.conf  

oracle   soft   nofile    131072
oracle   hard   nofile    131072
oracle   soft   nproc    131072
oracle   hard   nproc    131072
oracle   soft   core    unlimited
oracle   hard   core    unlimited
oracle   soft   memlock    50000000
oracle   hard   memlock    50000000
# Adjust MAX stack size for 11.2.0.3 => Original was 8192:
oracle   soft   stack    10240 

After updating limits.conf file, oracle user should logoff & logon to let the new adjustments take effect.
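A quick sketch to verify the new limits from a fresh oracle session:
# ulimit -n   => 131072 (open files)
# ulimit -u   => 131072 (max user processes)
# ulimit -s   => 10240  (stack size, in KB)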

Ensure mounting /usr in READ-WRITE mode:
------------------------------------------------
# mount -o remount,rw /usr

>For security reasons sys admins prefer to mount /usr in READ ONLY mode, but during the Oracle installation /usr must be in RW mode.

Restart the internet services daemon (xinetd):
----------------------------------------------
# service xinetd restart

Edit the /etc/securetty file and append it with the relevant service name:
------------------------------------------------------------------------
ftp
rlogin
rsh
rexec
telnet

Create ".rhosts" file:
This file provides user equivalence between the servers; it should be created under the Oracle user's home:
su - oracle
cd
vi .rhosts
# Add the following lines
ora1123-node1 oracle
ora1123-node2 oracle
ora1123-node1-priv oracle
ora1123-node2-priv oracle
ora1123-node1-vip oracle
ora1123-node2-vip oracle
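Some rsh implementations are picky about the permissions on .rhosts, so it's safer to restrict it:
chmod 600 /home/oracle/.rhosts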

Create hosts.equiv file:
vi /etc/hosts.equiv
#add these lines:
ora1123-node1 oracle
ora1123-node2 oracle
ora1123-node1-priv oracle
ora1123-node2-priv oracle
ora1123-node1-vip  oracle
ora1123-node2-vip  oracle

chmod 600 /etc/hosts.equiv
chown root.root /etc/hosts.equiv

Configure Host equivalence between Nodes:
-----------------------------------------------
on Both Nodes:
----------------
mkdir -p /home/oracle/.ssh
cd /home/oracle/.ssh
ssh-keygen -t rsa
ssh-keygen -t dsa

cat id_rsa.pub > authorized_keys
cat id_dsa.pub >> authorized_keys

On Node1:
cd /home/oracle/.ssh
scp authorized_keys oracle@ora1123-node2:/home/oracle/.ssh/authorized_keys_nod1

on Node2:
cd /home/oracle/.ssh
mv authorized_keys_nod1 authorized_keys

cat id_rsa.pub >> authorized_keys
cat id_dsa.pub >> authorized_keys

Copy the authorized_keys file to Node1:
scp authorized_keys oracle@ora1123-node1:/home/oracle/.ssh/


From Node1: Answer each question with "yes"
ssh ora1123-node1 date
ssh ora1123-node2 date
ssh n1 date
ssh n2 date
ssh ora1123-node1-priv date
ssh ora1123-node2-priv date

From Node2: Answer each question with "yes"
ssh ora1123-node1 date
ssh ora1123-node2 date
ssh n1 date
ssh n2 date
ssh ora1123-node1-priv date
ssh ora1123-node2-priv date
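Tip: once every host key has been accepted, a small loop (using the host names above) saves typing when repeating the test after any change:
for h in ora1123-node1 ora1123-node2 n1 n2 ora1123-node1-priv ora1123-node2-priv; do ssh $h date; done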

Enable rsh on both Nodes:
------------------------------
First verify that rsh & rsh-server packages are installed

rpm -qa|grep rsh

rsh-server-0.17-40.el5
rsh-0.17-40.el5

If the packages are not installed, install them:
you can find rsh package in CD1 under "Server" directory
you can find rsh-server package in CD3 under "Server" directory

Add rsh to PAM:
------------------
vi /etc/pam.d/rsh:
#Add the following line
auth sufficient pam_rhosts_auth.so no_hosts_equiv


Enable rsh in xinetd:
--------------------
vi /etc/xinetd.d/rsh
#Modify this line:
disable=no

-Test rsh connectivity between the cluster nodes:
From Node1: rsh n2 date
From Node2: rsh n1 date

Enable rlogin:
---------------
vi /etc/xinetd.d/rlogin
#add this line:
disable=no

Configure Hangcheck-timer:
------------------------------
If a hang occurs on a node, the module will reboot it to avoid database corruption.

*To Load the hangcheck-timer module for 2.6 kernel:

# insmod /lib/modules/`uname -r`/kernel/drivers/char/hangcheck-timer.ko  hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1

->hangcheck_tick: Defines how often, in seconds, the hangcheck-timer checks the node for hangs. The default is 60; Oracle recommends 1 second.
->hangcheck_margin: Defines how long, in seconds, the timer waits for a response from the kernel. The default is 180; Oracle recommends 10.
->hangcheck_reboot: 1 = reboot when a hang occurs, 0 = do not reboot when a hang occurs.

*To confirm that the hangcheck module is loaded, enter the following command:
# lsmod | grep hang
# output will be like below
hangcheck_timer         2428  0 

*Add the module load to the startup sequence by editing this file:

vi /etc/rc.d/rc.local
#add this line
insmod /lib/modules/`uname -r`/kernel/drivers/char/hangcheck-timer.ko  hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1

You have to put the real value of your kernel version in place of `uname -r`.
e.g.
insmod /lib/modules/2.6.32-300.32.2/kernel/drivers/char/hangcheck-timer.ko  hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1
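An alternative sketch that achieves the same persistence on OEL 5: set the module options in /etc/modprobe.conf and load the module by name.
# vi /etc/modprobe.conf
options hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1
# then in /etc/rc.d/rc.local simply add:
modprobe hangcheck-timer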

Prepare for using Cluster Time Synchronization Service - (CTSS)
----------------------------------------------------------
Oracle Grid Infrastructure 11gR2 provides a new service called Cluster Time Synchronization Service (CTSS) that can synchronize the time between cluster nodes automatically, without any manual intervention. If you want CTSS to handle this job for you, de-configure and de-install the Network Time Protocol (NTP). During the installation, when Oracle finds that NTP is not active, it will automatically activate CTSS to handle time synchronization between the RAC nodes; no further steps are required from you during the GI installation.

Disable NTP service:
# service ntpd stop
# chkconfig ntpd off
# mv /etc/ntp.conf /etc/ntp.conf.original
# rm /var/run/ntpd.pid

Disable SELINUX:
--------------
Note: Starting with 11gR2, SELinux is supported, but I'll continue disabling it. Disabling SELinux is easier than configuring it :-) it's a nightmare :-)

vi /etc/selinux/config

SELINUX=disabled
SELINUXTYPE=targeted
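Note: SELINUX=disabled takes effect only after a reboot; to switch to permissive mode immediately in the meantime:
# getenforce
Enforcing
# setenforce 0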

#################
Extra Configurations:
#################

Configure HugePages: [361468.1]
================
What is HugePages:
--------------------
HugePages is a feature that allows memory to be managed with larger pages as an alternative to the small 4KB page size.
HugePages is crucial for faster Oracle database performance on Linux if you have a large RAM and SGA > 8G.
HugePages are not only for 32-bit systems; they also improve memory performance on 64-bit kernels.

HugePages Pros:
------------------
-Doesn't allow memory to be swapped.
-Less Overhead for Memory Operations.
-Less Memory Usage.

Huge Pages Cons:
--------------------
-You must set MEMORY_TARGET and MEMORY_MAX_TARGET = 0, as the Automatic Memory Management (AMM) feature is incompatible with HugePages; otherwise you will get:
ORA-00845: MEMORY_TARGET not supported on this system

Implementation:

1-Make sure that MEMORY_TARGET and MEMORY_MAX_TARGET = 0 on All instances.
2-Make sure that all instances on the server are up.
3- Set these parameters equal or greater than SGA size: (values are in KB)

# vi /etc/security/limits.conf 
oracle   soft   memlock    20971520
oracle   hard   memlock    20971520

Here I'll set the SGA to 18G, so I set memlock to 20G in the limits.conf file.

Re-login to oracle user and check the value:
# ulimit -l

4- Create this script:

# vi /root/hugepages_settings.sh

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support 
# http://support.oracle.com
# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support 
(http://support.oracle.com) where it is intended to compute values for 
the recommended HugePages/HugeTLB configuration for the current shared 
memory segments. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and 
   you should accommodate this while calculating SGA size.
 * In case you change the DB SGA size, 
   as the new SGA will not fit in the previous HugePages configuration, 
   you had better disable the whole HugePages, 
   start the DB with the new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup 
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m
Press Enter to proceed..."
read
# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
    echo "The hugepages may not be supported in the system where the script is being executed."
    exit 1
fi
# Initialize the counter
NUM_PG=0
# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
    MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
    if [ $MIN_PG -gt 0 ]; then
        NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
    fi
done
RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`
# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
    echo "***********"
    echo "** ERROR **"
    echo "***********"
    echo "Sorry! There are not enough total of shared memory segments allocated for 
HugePages configuration. HugePages can only be used for shared memory segments 
that you can list by command:
    # ipcs -m
of a size that can match an Oracle Database SGA. Please make sure that:
 * Oracle Database instance is up and running 
 * Oracle Database 11g Automatic Memory Management (AMM) is not configured"
    exit 1
fi
# Finish with results
case $KERN in
    '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
           echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
    '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
     *) echo "Unrecognized kernel version $KERN. Exiting." ;;
esac

5-Run script hugepages_settings.sh to help you get the right value for vm.nr_hugepages parameter:
# chmod 700 /root/hugepages_settings.sh
# sh /root/hugepages_settings.sh


6-Edit the file /etc/sysctl.conf and set the vm.nr_hugepages parameter as per the script output value:
# cat /etc/sysctl.conf|grep vm.nr_hugepages
# vi /etc/sysctl.conf
vm.nr_hugepages = 9220

7-Reboot the server.

8-Check and Validate the Configuration:
# grep HugePages /proc/meminfo
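The output will look something like the following (the numbers below are illustrative; HugePages_Total should match the vm.nr_hugepages value you set, and once the instances are up HugePages_Free should drop below HugePages_Total):
HugePages_Total:  9220
HugePages_Free:    512
HugePages_Rsvd:    128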

Note: After any of the following changes, re-run the hugepages_settings.sh script and put the new value in the vm.nr_hugepages parameter:
      -Amount of RAM installed for the Linux OS changed.
      -New database instance(s) introduced.
      -SGA size / configuration changed for one or more database instances.


Increase vm.min_free_kbytes system parameter: [Doc ID 811306.1]
================================
In case you enabled regular HugePages on your system (the thing we did above), it's recommended to increase the system parameter vm.min_free_kbytes from 51200 to 524288. This will cause the system to start reclaiming memory at an earlier time than it would have before, therefore it can help to decrease LowMem pressure, hangs and node evictions.

# sysctl -a |grep min_free_kbytes
vm.min_free_kbytes = 51200

# vi /etc/sysctl.conf
vm.min_free_kbytes = 524288

# sysctl -p 

# sysctl -a |grep min_free_kbytes
vm.min_free_kbytes = 524288


Disable Transparent HugePages: [Doc ID 1557478.1]
=====================
Transparent HugePages are different from the regular HugePages we configured above; Transparent HugePages are set up dynamically at run time.
Transparent HugePages are known to cause unexpected node reboots and performance problems with both RAC and single-node systems; Oracle strongly recommends disabling them.
Note: For the UEK2 kernel, starting with 2.6.39-400.116.0, Transparent HugePages have been removed from the kernel.

Check if Transparent HugePages Enabled:
--------------------------------------
# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] never

Disable Transparent HugePages:
-----------------------------
Add "transparent_hugepage=never" to boot kernel:

# vi /boot/grub/grub.conf
kernel /vmlinuz-2.6.39-300.26.1.el5uek ro root=LABEL=/ transparent_hugepage=never 
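Alternatively (a sketch; takes effect immediately but does not survive a reboot):
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# echo never > /sys/kernel/mm/transparent_hugepage/defrag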


Configure VNC on Node1:
===================
VNC will help us log in to the Linux machine with a GUI session; from this GUI session we can run Oracle Installer to install Grid Infrastructure and the Database software, eliminating the need to go to the server room and do the installation on the server itself.

Make sure that VNCserver package is already installed:
# rpm -qa | grep vnc-server
vnc-server-4.1.2-14.el5_6.6

Modify the VNC config file:
# vi /etc/sysconfig/vncservers
Add these lines at the bottom:
VNCSERVERS="2:root"
VNCSERVERARGS[2]="-geometry 800x600 -nolisten tcp -nohttpd -localhost"

Set a password for VNC:
# vncpasswd 
Password: 
Verify:

Run a VNC session just to generate the default config files:
# vncserver :1

Configure VNC to start an Xsession when connecting:
# vi ~/.vnc/xstartup
#Uncomment these two lines:
 unset SESSION_MANAGER
 exec /etc/X11/xinit/xinitrc

Now start a VNC session on the machine:
# vncserver :1

Now you can log in from any machine (e.g. your Windows PC) using VNC Viewer to access the remote server on port 5900 or 5901; make sure these ports are not blocked by the firewall.
VNC Viewer can be downloaded from this link:

Download Oracle 11.2.0.3 installation media:
================================

Note [ID 753736.1] has all the Patch Set + PSU reference numbers.

11.2.0.3 (for Linux x86_64) is patch# 10404530; we need only the first 3 of the 7 zip files
 (1&2 for database, 3 for grid, 4 for client, 5 for gateways, 6 examples CD, 7 for deinstall).

I'll extract the first 3 zip files which have Grid and Database binaries under /u02/stage

###########################
Grid Infrastructure installation:
###########################

Setup Cluvfy:
===========
Cluvfy is a tool that checks the fulfillment of RAC and database installation prerequisites.

cd /u02/stage/grid/rpm
rpm -ivh cvuqdisk-1.0.9-1.rpm

Check the fulfillment of Grid Infrastructure setup prerequisites: (using the Cluvfy tool)
------------------------------------------------------------
cd /u02/stage/grid
./runcluvfy.sh stage -pre crsinst -n ora1123-node1,ora1123-node2  -verbose
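Tip: runcluvfy.sh can also generate fixup scripts for many failed OS checks (to be run by root as instructed in the output); a sketch:
./runcluvfy.sh stage -pre crsinst -n ora1123-node1,ora1123-node2 -fixup -verbose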

Grid installation:
============
On Node1:
Start a VNC session on the server to be able to open a GUI session with the server and run Oracle Installer:
# vncserver :1

Login to the server from your PC using VNCviewer, then from the GUI session execute the following:
# xhost +
# su - oracle
# cd /u02/stage/grid
# chmod +x runInstaller
# ./runInstaller

During the installation:
================
Click "skip software updates": 
 >Install and configure Grid Infrastructure for a cluster.

 >Advanced Installation

 >Grid Plug and Play:
   Cluster Name:  cluster
    SCAN Name: cluster-scan
    SCAN Port: 1523

 >Cluster Node Information:
   Add:
   ora1123-node2
   ora1123-node2-vip

 >Network Interface Usage:
   eth0 Public
   eth3 Private
   eth1 Do Not Use
   eth2 Do Not Use

Note: Starting with Oracle 11.2.0.2, you are no longer required to use the NIC bonding technique to configure interconnect redundancy. You can now define at most four interfaces for the redundant interconnect (private network) during the installation phase.

 >Storage Option: Shared File System

 >OCR Storage: Normal Redundancy
   /ora_ocr1/ocr1.dbf
   /ora_ocr2/ocr2.dbf
   /ora_ocr3/ocr3.dbf

Note: Oracle strongly recommends setting the number of voting disks to an odd number (3, 5, and so on), because the cluster must be able to access more than half of the voting disks at any time.


 >Voting Storage:Normal Redundancy
   /ora_voting1/voting1.dbf
   /ora_voting2/voting2.dbf
   /ora_voting3/voting3.dbf

 >Do not use IPMI

Oracle Base:
   /u01/oracle/
RAC Installation path:  /u01/grid/11.2.0.3/grid
OraInventory path:      /u01/oraInventory

At the End of the installation run:
---------------------------------
Run orainstRoot.sh on Node1, then run it on Node2:
# /u01/oraInventory/orainstRoot.sh

Run root.sh on Node1; once it finishes, run it on Node2:
# /u01/grid/11.2.0.3/grid/root.sh 

Just hit ENTER when get this message:
Enter the full pathname of the local bin directory: [/usr/local/bin]: 

Note: root.sh may take from 5 to 15 minutes to complete.

Once root.sh finish, go back to the Execute Configuration Scripts window and press "OK".

I've uploaded the screenshots to this link:

 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 In case you are doing an in-place upgrade of an older release and installing the GI into a different home, the following should be done within a downtime window:
 At the End of installation by root user run:
 -------------------------------------------
 Note: In case of doing an in-place upgrade, Oracle recommends that you leave the Oracle RAC instances running from the old GRID_HOME. 
 Execute this script:
 # /u01/grid/11.2.0.3/grid/rootupgrade.sh 
   =>Node by node (don't run it in parallel).
   =>rootupgrade.sh will restart the cluster resources on the node it runs on.
   =>Once you finish with rootupgrade.sh, click OK on the OUI window to finish the installation.
 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

The outputs of executed commands:
---------------------------
#/u01/oraInventory/orainstRoot.sh

Changing permissions of /u01/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.

Changing groupname of /u01/oraInventory to oinstall.
The execution of the script is complete.

#/u01/grid/11.2.0.3/grid/root.sh 
Node1 outputs:
-------------
[root@ora1123-node1 /u01]#/u01/grid/11.2.0.3/grid/root.sh 
Performing root user operation for Oracle 11g 

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/grid/11.2.0.3/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]: 
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...


Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/grid/11.2.0.3/grid/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
OLR initialization - successful
  root wallet
  root wallet cert
  root cert export
  peer wallet
  profile reader wallet
  pa wallet
  peer wallet keys
  pa wallet keys
  peer cert request
  pa cert request
  peer cert
  pa cert
  peer root cert TP
  profile reader root cert TP
  pa root cert TP
  peer pa cert TP
  pa peer cert TP
  profile reader pa cert TP
  profile reader peer cert TP
  peer user cert
  pa user cert
Adding Clusterware entries to inittab
CRS-2672: Attempting to start 'ora.mdnsd' on 'ora1123-node1'
CRS-2676: Start of 'ora.mdnsd' on 'ora1123-node1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'ora1123-node1'
CRS-2676: Start of 'ora.gpnpd' on 'ora1123-node1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'ora1123-node1'
CRS-2672: Attempting to start 'ora.gipcd' on 'ora1123-node1'
CRS-2676: Start of 'ora.cssdmonitor' on 'ora1123-node1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'ora1123-node1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'ora1123-node1'
CRS-2672: Attempting to start 'ora.diskmon' on 'ora1123-node1'
CRS-2676: Start of 'ora.diskmon' on 'ora1123-node1' succeeded
CRS-2676: Start of 'ora.cssd' on 'ora1123-node1' succeeded
clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting disk: /ora_voting1/voting1.dbf.
Now formatting voting disk: /ora_voting2/voting2.dbf.
Now formatting voting disk: /ora_voting3/voting3.dbf.
CRS-4603: Successful addition of voting disk /ora_voting1/voting1.dbf.
CRS-4603: Successful addition of voting disk /ora_voting2/voting2.dbf.
CRS-4603: Successful addition of voting disk /ora_voting3/voting3.dbf.
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   205267b4e4334fc9bf21154f92cd30fa (/ora_voting1/voting1.dbf) []
 2. ONLINE   83217239b9c84fe9bfbd6c5e76a9dcc1 (/ora_voting2/voting2.dbf) []
 3. ONLINE   41a59373d30b4f6cbf6f41c50dc48dbd (/ora_voting3/voting3.dbf) []
Located 3 voting disk(s).
Configure Oracle Grid Infrastructure for a Cluster ... succeeded


Node2 outputs:
-------------
[root@ora1123-node2 /u01/grid/11.2.0.3/grid]#/u01/grid/11.2.0.3/grid/root.sh 
Performing root user operation for Oracle 11g 

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/grid/11.2.0.3/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]: 
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...


Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/grid/11.2.0.3/grid/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
OLR initialization - successful
Adding Clusterware entries to inittab
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node ora1123-node1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

> The last page will show an error message saying that the Oracle Clusterware verification utility failed; just ignore it, our installation is indeed successful.

Test the installation:
================
-> Check the logs under: /u01/oraInventory/logs

By oracle:
cluvfy stage -post crsinst -n ora1123-node1,ora1123-node2 -verbose
crsctl check cluster -all
# crsctl check crs
Cluster Synchronization Services appears healthy
Cluster Ready Services appears healthy
Event Manager appears healthy

# olsnodes -n
# ocrcheck
# crsctl query crs softwareversion
# crsctl query crs activeversion
# crs_stat -t -v

Confirm clusterware time synchronization service is running (CTSS):
--------------------------------------------------------------------
# crsctl check ctss
CRS-4701: The Cluster Time Synchronization Service is in Active mode.
CRS-4702: Offset (in msec): 0

Create a crsstat script to show a nicer output format for the crs_stat command:
-----------------------------------------------------------------------
cd /u01/grid/11.2.0.3/grid/bin
vi crsstat

#--------------------------- Begin Shell Script ----------------------------
#!/bin/bash
##
#Sample 10g CRS resource status query script
##
#Description:
# - Returns formatted version of crs_stat -t, in tabular
# format, with the complete rsc names and filtering keywords
# - The argument, $RSC_KEY, is optional and if passed to the script, will
# limit the output to HA resources whose names match $RSC_KEY.
# Requirements:
# - $ORA_CRS_HOME should be set in your environment
RSC_KEY=$1
QSTAT=-u
AWK=/usr/bin/awk # change this path if awk is located elsewhere
# Table header:
echo ""
$AWK \
'BEGIN {printf "%-75s %-10s %-18s\n", "HA Resource", "Target", "State";
printf "%-75s %-10s %-18s\n", "-----------", "------", "-----";}'
# Table body:
/u01/grid/11.2.0.3/grid/bin/crs_stat $QSTAT | $AWK \
'BEGIN { FS="="; state = 0; }
$1~/NAME/ && $2~/'$RSC_KEY'/ {appname = $2; state=1};
state == 0 {next;}
$1~/TARGET/ && state == 1 {apptarget = $2; state=2;}
$1~/STATE/ && state == 2 {appstate = $2; state=3;}
state == 3 {printf "%-75s %-10s %-18s\n", appname, apptarget, appstate; state=0;}'
#--------------------------- End Shell Script ------------------------------

chmod 700 /u01/grid/11.2.0.3/grid/bin/crsstat
scp crsstat root@node2:/u01/grid/11.2.0.3/grid/bin

Now you can use the "crs" alias that has been included in the oracle profile to execute crs_stat -t in a nice format.

Change OCR backup location:
=========================
# ocrconfig -showbackup
# ocrconfig -backuploc /u01/grid/11.2.0.3/grid/cdata/cluster11g


Modify RAC configurations:
#######################

=Configure CSS misscount:
 ====================
 The CSS misscount parameter represents the maximum time, in seconds, that a network heartbeat can be missed before the problematic node is kicked out of the cluster.

Check current configurations for css misscount :
# crsctl get css misscount

It's recommended to back up the OCR disks before running the following command.
Configure css misscount (from one node only):
# crsctl set css misscount 60


#################################
Install Oracle Database Software 11.2.0.3:  
#################################

Note: It's recommended to backup oraInventory directory before starting this stage.
Note: Ensure that the clusterware services are running on both nodes.

Run cluvfy to check database installation prerequisites:
========
# cluvfy stage -pre dbinst -n ora1123-node1,ora1123-node2 -verbose
-->Ignore cluster scan errors.

Execute runInstaller:
===============
Connect to the server using VNC Viewer to open a GUI session that enables you to run Oracle Installer.
Note: Oracle Installer can also run in command line mode using the -silent and -responseFile attributes; you should prepare a response file that will hold all the installation selections.

# xhost +
# su - oracle
# cd /u02/stage/database
# ./runInstaller

During the installation:
==================
Select "skip software updates"
Select "Install database Software only"
Select "Oracle Real Application Clusters database installation" -> Select both nodes (selected by default).
Select "Enterprise" -> Selected options like (Partitioning, Data Mining, Real Application Testing)
 =>From a security perspective it's recommended to install only the options you need.
 =>From a licensing perspective there is no problem with installing an option you are not using, as Oracle charges only for the options being used.
Select "dba" group for OSDBA; leave OSOPER blank (I never had a need to log in to the database with the SYSOPER privilege).
Ignore SCAN warning in the prerequisite check page
ORACLE_BASE: /u01/oracle
ORACLE_HOME (Software Location): /u01/oracle/11.2.0.3/db

At the end of the installation: as the root user, execute /u01/oracle/11.2.0.3/db/root.sh on Node1 first, then execute it on Node2:
# /u01/oracle/11.2.0.3/db/root.sh

Go back to the Oracle Installer:
click OK.
click Close.

I've uploaded Oracle software installation snapshots to this link:

Post Steps:
########
Installation verification:
=================
# cluvfy stage -post crsinst -n ora1123-node1,ora1123-node2 -verbose
  =>All passed except the SCAN check, which I'm not using in my setup.

Do some backing up:
==============
Query Voting disks:
------------------
crsctl query css votedisk

Backing up voting disks manually is no longer required; the dd command is not supported in 11gR2 for backing up voting disks. Voting disks are backed up automatically into the OCR as part of any configuration change, and voting disk data is automatically restored to any added voting disks.

Backup the OCR: (clusterware must be up and running)
------------------
# ocrconfig -export /u01/grid/11.2.0.3/grid/cdata/cluster11g/ocr_after_DB_installation.dmp
# ocrconfig -manualbackup
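To list the manual backups just taken (a quick check):
# ocrconfig -showbackup manual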

Backup oraInventory directory:
---------------------------------
# cp -r /u01/oraInventory /u01/oraInventory_After_DBINSTALL

Backup root.sh:
-----------------
# cp /u01/grid/11.2.0.3/grid/root.sh /u01/grid/11.2.0.3/grid/root.sh._after_installation
# cp /u01/oracle/11.2.0.3/db/root.sh /u01/oracle/11.2.0.3/db/root.sh_after_installation

Backup ORACLE_HOME: 
---------------------------
# tar cvpf /u01/oracle/11.2.0.3/db_After_DB_install.tar /u01/oracle/11.2.0.3/db

Backup GRID_HOME: 
----------------------
# tar cvpf /u01/grid/11.2.0.3/grid_after_DB_install.tar /u01/grid/11.2.0.3/grid

Note: The GI Home can be backed up online while the clusterware services are up and running.

Backup the following files:
--------------------------
# cp /usr/local/bin/oraenv  /usr/local/bin/oraenv.11.2.0.3
# cp /usr/local/bin/dbhome  /usr/local/bin/dbhome.11.2.0.3
# cp /usr/local/bin/coraenv /usr/local/bin/coraenv.11.2.0.3

-Restart the RAC servers more than once and ensure that the RAC processes start up automatically.

July SPU Patch Apply:
##################
-Since October 2012 Oracle renamed the CPU (Critical Patch Update) to SPU (Security Patch Update); both are the same, it's just a renaming.
-SPU patches are cumulative; once you apply the latest patch there is no need to apply the older ones.
-To avoid making a big change to my environment, and to minimize the downtime when applying security patches, I prefer applying the SPU (formerly CPU) over the PSU patches (which contain the SPU patch plus common bug fixes that affect a large number of customers).
The OPatch utility version must be 11.2.0.3.0 or later (OPatch is the tool used to apply SPU patches):
  >> To download the latest OPatch utility: go to Metalink and search for Patch# 6880880.
   >Backup the original OPatch directory under ORACLE_HOME, then just unzip the patch file under ORACLE_HOME.
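To confirm the OPatch version after unzipping:
# $ORACLE_HOME/OPatch/opatch version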

> $PATH must refer to /usr/ccs/bin
  # export PATH=$PATH:/usr/ccs/bin

> Unzip the Patch:
  # cd $ORACLE_HOME
  # unzip p16742095_112030_Linux-x86-64.zip

Patch Installation:
=============
Remember we still don't have any running database for the time being.
Shutdown Nodeapps or crs:
# srvctl stop nodeapps -n ora1123-node1

Patch Installation:
# cd $ORACLE_HOME/16742095
# opatch napply -skip_subset -skip_duplicate -local
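After OPatch finishes, a quick sketch to confirm the patch is registered in the local inventory:
# opatch lsinventory | grep 16742095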

Go to Node2 and do the same steps for installing the latest SPU patch...


NEXT:

Part III  Create a standby database under the new 11.2.0.3 environment being refreshed from the 11.2.0.1 primary DB.

