LPAR Disaster Recovery example

Scenario: we have two LPARs. The first is a WAS DMGR (WebSphere Application Server Deployment Manager); the second is intended to take over as its backup if needed.

To keep the two LPARs aligned, we take mksysb and savevg backups on a schedule (at 4:00 am, Monday to Friday).

Backup:

0 4 * * 1-5 /export/nim/scripts/wnd8-dr.new.sh >/dev/null 2>&1

Script contents:

#!/usr/bin/ksh
# Refresh the mksysb and savevg NIM resources for each environment (v, u, p).
for env in v u p
do
    CLIENT="wnd8${env}01"
    echo "$CLIENT"

    LOG="/export/savevg/savevg.$CLIENT.log"
    echo "Starting..." | tee -a "$LOG"

    # Remove the previous resources so they can be redefined with fresh images.
    nim -Fo remove "mksysb-$CLIENT"
    nim -Fo remove "savevg-$CLIENT"

    # mk_image=yes makes NIM create the backup image from the client.
    nim -o define -t mksysb -a server=master -a location="/export/mksysb/mksysb-$CLIENT" -a source="$CLIENT" -a mk_image=yes -a exclude_files=wnd8-dr-exclude "mksysb-$CLIENT" | tee -a "$LOG"
    nim -o define -t savevg -a server=master -a location="/export/savevg/savevg-$CLIENT" -a source="$CLIENT" -a mk_image=yes -a savevg_flags=Xi -a volume_group=wsvg "savevg-$CLIENT" | tee -a "$LOG"

done
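
Because each nim command in the script is piped into tee, the script only ever sees the exit status of tee, so a failed backup can go unnoticed. The following is a minimal sketch of a wrapper that keeps the logging and preserves the exit status of the command. The name run_logged is hypothetical and not part of NIM.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper (not part of NIM): runs COMMAND, appends its stdout
# and stderr to LOGFILE while echoing them, then returns COMMAND's own
# exit status instead of the status of tee.
run_logged() {
    rl_log=$1
    shift
    rl_out=$("$@" 2>&1)     # capture the output of COMMAND
    rl_rc=$?                # exit status of COMMAND itself
    if [ -n "$rl_out" ]; then
        printf '%s\n' "$rl_out" | tee -a "$rl_log"
    fi
    return "$rl_rc"
}

# Possible use inside the backup loop (same variables as the script above):
#   run_logged "$LOG" nim -Fo remove "mksysb-$CLIENT"
#   run_logged "$LOG" nim -o define -t mksysb ... "mksysb-$CLIENT" ||
#       echo "mksysb of $CLIENT failed" | tee -a "$LOG"
```

The output is buffered until the command finishes, which is acceptable for an unattended cron job but means nothing is shown while nim is still running.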

Restore:

The restvg operation is currently manual; we are evaluating an automatic recovery, scheduled in the early hours of the morning, to protect the environment from the loss of a DMGR.

1) SPOT definition from the mksysb image

root@nim01:/# nim -o define -t spot -a server=master -a source=mksysb-wnd8p01 -a location=/export/spot/spot-wnd8p01 spot-wnd8p01

Creating SPOT in "/export/spot/spot-wnd8p01" on machine "master" from "mksysb-wnd8p01" ...

Restoring files from BOS image. This may take several minutes ...

Checking filesets and network boot images for SPOT "spot-wnd8p01".
This may take several minutes ...

2) bos_inst operation execution

nim -o bos_inst -a source=mksysb -a mksysb=mksysb-wnd8p01vir-int -a spot=spot-wnd8p01vir-int -a accept_licenses=yes -a bosinst_data=bosinst_noprompt_wnd8p01vir_data -a no_client_boot=no wnd8p02vir-int

root@wnd8p02vir-int:/#

Broadcast message from root@wnd8p02vir-int (tty) at 17:13:36 ...

*******************************************************************
*******************************************************************
*******************************************************************

NIM has initiated a bos installation operation on this machine.
Automatic reboot and reinstallation will follow shortly...

*******************************************************************
*******************************************************************
*******************************************************************


Running /etc/rc.d/rc2.d/K45vxpbx_exchanged stop
Running /etc/rc.d/rc2.d/Ksshd stop
Running /etc/rc.d/rc2.d/Kwpars stop
Stopping TCP/IP daemons: ndpd-host lpd routed gated sendmail inetd named timed rwhod iptrace dpid2 snmpd rshd rlogind telnetd syslogd
Removing TCP/IP lock files

...

TFTP BOOT ---------------------------------------------------
Server IP.....................10.15.61.27
Client IP.....................10.15.61.168
Gateway IP....................10.15.61.27
Subnet Mask...................255.255.255.0
( 1 ) Filename................./tftpboot/wnd8p02vir-int
TFTP Retries..................5
Block Size....................512
FINAL PACKET COUNT = 59393
FINAL FILE SIZE = 30408704 BYTES

Elapsed time since release of system processors: 1892282 mins 17 secs




-------------------------------------------------------------------------------
Welcome to AIX.
boot image timestamp: 15:36:24 12/30/2020
The current time and date: 16:18:49 01/05/2021
processor count: 4; memory size: 16384MB; kernel size: 45241275
boot device: /vdevice/l-lan@30000003:speed=auto,duplex=auto,bootp,10.15.61.27,,10.15.61.168,10.15.61.27


...


                 Installing Base Operating System

        Please wait...

        Approximate     Elapsed time
     % tasks complete   (in minutes)

          13               1      9% of mksysb data restored.
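
The two restore steps can be chained in a small wrapper, which is one possible starting point for the automatic recovery mentioned earlier. This is only a sketch under stated assumptions: dr_restore and the DRYRUN variable are hypothetical names, the resource names are passed as arguments rather than guessed, and by default the function only prints the nim commands instead of running them.

```shell
# dr_restore MKSYSB SPOT SPOT_LOCATION BOSINST_DATA TARGET
# Hypothetical wrapper around the two commands shown above.
# DRYRUN=1 (the default) prints the commands; DRYRUN=0 runs them.
# WARNING: with DRYRUN=0 the bos_inst step reinstalls the TARGET machine.
dr_restore() {
    if [ $# -ne 5 ]; then
        echo "usage: dr_restore MKSYSB SPOT SPOT_LOCATION BOSINST_DATA TARGET" >&2
        return 2
    fi
    dr_mksysb=$1 dr_spot=$2 dr_loc=$3 dr_bid=$4 dr_target=$5
    if [ "${DRYRUN:-1}" = 1 ]; then dr_run=echo; else dr_run=; fi

    # 1) define the SPOT from the mksysb image
    #    (this step fails if a resource named SPOT already exists)
    $dr_run nim -o define -t spot -a server=master \
        -a source="$dr_mksysb" -a location="$dr_loc" "$dr_spot" || return 1

    # 2) start the unattended BOS installation on the target LPAR
    $dr_run nim -o bos_inst -a source=mksysb -a mksysb="$dr_mksysb" \
        -a spot="$dr_spot" -a accept_licenses=yes \
        -a bosinst_data="$dr_bid" -a no_client_boot=no "$dr_target"
}

# Example (dry run, names taken from step 1 above):
#   dr_restore mksysb-wnd8p01 spot-wnd8p01 /export/spot/spot-wnd8p01 \
#       bosinst_noprompt_wnd8p01vir_data wnd8p02vir-int
```

Keeping the dry run as the default is a deliberate choice: a mistyped target name in a scheduled job would otherwise reinstall the wrong machine.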

NIM – SPOT definition

A SPOT (Shared Product Object Tree) resource is a fundamental object needed when installing or customizing new LPARs.
It contains the files (essentially /usr) that a client needs in order to mount and access the other resources (mksysb, lpp_source, etc.) during the setup phase.

root@nim01:/home/src582# nim -o define -t spot -a server=master -a source=lpp-aix720501-FULL -a location=/export/spot/spot-aix720501 spot-aix720501

bos.iconv.com 7.2.5.0 ROOT APPLY SUCCESS
bos.ecc_client.rte 7.2.5.0 USR APPLY SUCCESS
bos.ecc_client.rte 7.2.5.0 ROOT APPLY SUCCESS
bos.adt.lib 7.2.4.0 USR APPLY SUCCESS
bos.adt.lib 7.2.4.0 ROOT APPLY SUCCESS
adde.v2.rdma.rte 7.2.2.0 USR APPLY SUCCESS
adde.v2.rdma.rte 7.2.2.0 ROOT APPLY SUCCESS
adde.v2.ethernet.rte 7.2.5.0 USR APPLY SUCCESS
Java7_64.sdk 7.0.0.665 USR APPLY SUCCESS
Java7_64.jre 7.0.0.665 USR APPLY SUCCESS
Java7_64.jre 7.0.0.665 ROOT APPLY SUCCESS

File /etc/inittab has been modified.

One or more of the files listed in /etc/check_config.files have changed.
See /var/adm/ras/config.diff for details.

Checking filesets and network boot images for SPOT "spot-aix720501".
This may take several minutes ...
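
Once the definition completes, the state of the new SPOT can be confirmed before it is used for an installation. The sketch below assumes that lsnim -l reports an Rstate attribute with the value "ready for use" for a usable resource; spot_ready is a hypothetical helper name.

```shell
# spot_ready NAME
# Hypothetical helper: succeeds if `lsnim -l NAME` reports the resource
# state (the Rstate line) as "ready for use", fails otherwise.
spot_ready() {
    lsnim -l "$1" 2>/dev/null | grep 'Rstate' | grep -q 'ready for use'
}

# Example (on the NIM master):
#   spot_ready spot-aix720501 && echo "spot-aix720501 is ready"
```

Filtering on the Rstate line first avoids a false positive from other attributes, such as a previous state, that may contain the same text.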

NIM – LPP_SOURCE definition


Let’s start from the AIX 7.2 base release LPP source to define our new LPP_SOURCE resource:

root@nim01:/export/lpp_source/lpp-aix720501-FULL# nim -o define -t lpp_source -a server=master -a source=/export/lpp_source/lpp-aix720000-FULL -a location=/export/lpp_source/lpp-aix720501-FULL lpp-aix720501-FULL
Preparing to copy install images (this will take several minutes)...

/export/lpp_source/lpp-aix720501-FULL/RPMS/ppc/expect-5.42.1-3.aix6.1.ppc.rpm
/export/lpp_source/lpp-aix720501-FULL/RPMS/ppc/tcl-8.4.7-3.aix6.1.ppc.rpm
/export/lpp_source/lpp-aix720501-FULL/RPMS/ppc/tk-8.4.7-3.aix6.1.ppc.rpm
/export/lpp_source/lpp-aix720501-FULL/installp/ppc/xlsmp.rte.4.1.2.0.I
/export/lpp_source/lpp-aix720501-FULL/installp/ppc/xlsmp.aix61.rte.4.1.2.0.I


/export/lpp_source/lpp-aix720501-FULL/installp/ppc/Java7_64.sdk.7.0.0.320.I
/export/lpp_source/lpp-aix720501-FULL/installp/ppc/Java7_64.jre.7.0.0.320.I
/export/lpp_source/lpp-aix720501-FULL/installp/ppc/ICU4C.rte.7.2.0.0.I

Now checking for missing install images...

All required install images have been found. This lpp_source is now ready.

The newly created resource contains only base release filesets, so we need to update them to the desired level:

root@nim01:/export/lpp_source/lpp-aix720501-FULL# nim -o update -a packages=all -a source=/download/AIX/PTF/Aix-72-05-01 lpp-aix720501-FULL

/export/lpp_source/lpp-aix720501-FULL/installp/ppc/devices.pciex.df1060e214103404.7.2.0.0.I.1
/export/lpp_source/lpp-aix720501-FULL/installp/ppc/rsct.msg.zh_TW.3.2.6.0.I
/export/lpp_source/lpp-aix720501-FULL/installp/ppc/rsct.msg.zh_CN.3.2.6.0.I
/export/lpp_source/lpp-aix720501-FULL/installp/ppc/rsct.msg.sk_SK.3.2.6.0.I


/export/lpp_source/lpp-aix720501-FULL/installp/ppc/devices.pciex.b315506714106104.rte.7.2.1.0.U
/export/lpp_source/lpp-aix720501-FULL/installp/ppc/devices.pciex.b315506714101604.rte.7.2.1.0.U
/export/lpp_source/lpp-aix720501-FULL/installp/ppc/devices.pciex.b31503101410b504.rte.7.2.1.0.U
/export/lpp_source/lpp-aix720501-FULL/installp/ppc/devices.pciex.b3155067b3157365.rte.7.2.1.0.U
/export/lpp_source/lpp-aix720501-FULL/installp/ppc/devices.pciex.b3155067b3157265.rte.7.2.1.0.U

The resource creation and update tasks ended without errors. Let’s run another check:

root@nim01:/export/lpp_source/lpp-aix720501-FULL# nim -o check lpp-aix720501-FULL
root@nim01:/export/lpp_source/lpp-aix720501-FULL#

Let’s remove any duplicate or superseded filesets:

root@nim01:/export/lpp_source/lpp-aix720501-FULL# nim -o lppmgr -a lppmgr_flags=rub lpp-aix720501-FULL
lppmgr: Source table of contents location is /export/lpp_source/lpp-aix720501-FULL/installp/ppc/.toc
lppmgr: Building table of contents in /export/lpp_source/lpp-aix720501-FULL/installp/ppc ..
lppmgr: Building table of contents completed.
lppmgr: Generating duplicate list..
lppmgr: Generating base level duplicate list..

Results:
No filesets found that can be removed.

The LPP_SOURCE resource is now ready to use.
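
The four steps above (define, update, check, lppmgr) can be run as one sequence that stops at the first failure. This is a sketch, not a definitive procedure: build_lpp_source is a hypothetical name, and it only wraps the nim invocations already shown in this section.

```shell
# build_lpp_source NAME BASE_DIR UPDATE_DIR LOCATION
# Hypothetical wrapper: runs the four nim steps shown above in order and
# returns non-zero as soon as one of them fails.
build_lpp_source() {
    if [ $# -ne 4 ]; then
        echo "usage: build_lpp_source NAME BASE_DIR UPDATE_DIR LOCATION" >&2
        return 2
    fi
    bl_name=$1 bl_base=$2 bl_upd=$3 bl_loc=$4

    # 1) define the resource from the base release images
    nim -o define -t lpp_source -a server=master \
        -a source="$bl_base" -a location="$bl_loc" "$bl_name" || return 1
    # 2) add the updates for the desired level
    nim -o update -a packages=all -a source="$bl_upd" "$bl_name" || return 1
    # 3) check the resource
    nim -o check "$bl_name" || return 1
    # 4) remove duplicate and superseded filesets
    nim -o lppmgr -a lppmgr_flags=rub "$bl_name" || return 1
}

# Example with the paths used in this section:
#   build_lpp_source lpp-aix720501-FULL \
#       /export/lpp_source/lpp-aix720000-FULL \
#       /download/AIX/PTF/Aix-72-05-01 \
#       /export/lpp_source/lpp-aix720501-FULL
```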

May the NIMADM be with you

NIMADM, or Network Installation Manager Alternate Disk Migration, is a facility that helps you seamlessly clone your LPAR and upgrade it to a new AIX release.

In the following example, lpar01 is at level 7100-05-05-1939 and we would like to upgrade it to 7200-03-03-1914 with the least possible downtime.

The entire nimadm process is split into 12 phases, which makes it possible to fully customize the target environment.

root@nim01:/download/AIX/PTF/# nimadm -l lpp-aix720303-FULL -c lpar01-int -s spot-aix720303 -d hdisk0 -Y
Initializing the NIM master.
Initializing NIM client lpar01-int.

Verifying alt_disk_migration eligibility.
Initializing log: /var/adm/ras/alt_mig/lpar01-int_alt_mig.log
Starting Alternate Disk Migration.


Restoring device ODM database.

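+-----------------------------------------------------------------------------+
Executing nimadm phase 7.
+-----------------------------------------------------------------------------+
nimadm: There is no user customization script specified for this phase.

+-----------------------------------------------------------------------------+
Executing nimadm phase 8.
+-----------------------------------------------------------------------------+
Creating client boot image.
bosboot: Boot image is 59393 512 byte blocks.
Writing boot image to client's alternate boot disk hdisk0.

+-----------------------------------------------------------------------------+
Executing nimadm phase 9.
+-----------------------------------------------------------------------------+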
Unmounting client mounts on the NIM master.
forced unmount of /lpar01-int_alt/alt_inst/var/adm/ras/livedump
forced unmount of /lpar01-int_alt/alt_inst/var
forced unmount of /lpar01-int_alt/alt_inst/usr
forced unmount of /lpar01-int_alt/alt_inst/tmp
forced unmount of /lpar01-int_alt/alt_inst/opt/ocsinventory
forced unmount of /lpar01-int_alt/alt_inst/opt/netbackup
forced unmount of /lpar01-int_alt/alt_inst/opt/nagios
forced unmount of /lpar01-int_alt/alt_inst/opt/freeware/webmin
forced unmount of /lpar01-int_alt/alt_inst/opt/bmc
forced unmount of /lpar01-int_alt/alt_inst/opt/IBM
forced unmount of /lpar01-int_alt/alt_inst/opt
forced unmount of /lpar01-int_alt/alt_inst/logs
forced unmount of /lpar01-int_alt/alt_inst/home
forced unmount of /lpar01-int_alt/alt_inst/admin
forced unmount of /lpar01-int_alt/alt_inst

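+-----------------------------------------------------------------------------+
Executing nimadm phase 10.
+-----------------------------------------------------------------------------+
Unexporting alt_inst filesystems on client lpar01-int:

+-----------------------------------------------------------------------------+
Executing nimadm phase 11.
+-----------------------------------------------------------------------------+
Cloning altinst_rootvg on client, Phase 3.
Client alt_disk_install command: alt_disk_copy -M 7.2 -P3 -d "hdisk0"

Verifying altinst_rootvg...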
Modifying ODM on cloned disk.
forced unmount of /alt_inst/var/adm/ras/livedump
forced unmount of /alt_inst/var/adm/ras/livedump
forced unmount of /alt_inst/var
forced unmount of /alt_inst/var
forced unmount of /alt_inst/usr
forced unmount of /alt_inst/usr
forced unmount of /alt_inst/tmp
forced unmount of /alt_inst/tmp
forced unmount of /alt_inst/opt/ocsinventory
forced unmount of /alt_inst/opt/ocsinventory
forced unmount of /alt_inst/opt/netbackup
forced unmount of /alt_inst/opt/netbackup
forced unmount of /alt_inst/opt/nagios
forced unmount of /alt_inst/opt/nagios
forced unmount of /alt_inst/opt/freeware/webmin
forced unmount of /alt_inst/opt/freeware/webmin
forced unmount of /alt_inst/opt/bmc
forced unmount of /alt_inst/opt/bmc
forced unmount of /alt_inst/opt/IBM
forced unmount of /alt_inst/opt/IBM
forced unmount of /alt_inst/opt
forced unmount of /alt_inst/opt
forced unmount of /alt_inst/logs
forced unmount of /alt_inst/logs
forced unmount of /alt_inst/home
forced unmount of /alt_inst/home
forced unmount of /alt_inst/admin
forced unmount of /alt_inst/admin
forced unmount of /alt_inst
forced unmount of /alt_inst
Changing logical volume names in volume group descriptor area.
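Fixing LV control blocks...
Fixing file system superblocks...
Bootlist is set to the boot disk: hdisk0 blv=hd5

+-----------------------------------------------------------------------------+
Executing nimadm phase 12.
+-----------------------------------------------------------------------------+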
Cleaning up alt_disk_migration on the NIM master.
Cleaning up alt_disk_migration on client lpar01-int.
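
Since the run is long, it can be handy to check how far a migration got. The sketch below extracts the number of the last phase banner from nimadm output. It assumes the log file named at the start of the run contains the same "Executing nimadm phase N." lines as the console output above; last_nimadm_phase is a hypothetical helper name.

```shell
# last_nimadm_phase
# Hypothetical helper: reads nimadm output on stdin and prints the number
# of the last "Executing nimadm phase N." line found (nothing if none).
last_nimadm_phase() {
    sed -n 's/^Executing nimadm phase \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Example (log path taken from the output above):
#   last_nimadm_phase < /var/adm/ras/alt_mig/lpar01-int_alt_mig.log
```

A result of 12 only shows that the last phase started; the messages that follow it still need to be read to confirm a clean finish.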