PowerHA Cluster

Sascha Wycisk – Senior IT Architect
20 May 2014
PowerHA 7.1
News & Migration Best Practice
© 2014 IBM Corporation
Power Experten Forum, 20 May 2014
PowerHA: General Information
• IBM has announced the end of support for PowerHA 6.1. The date given is 30 April 2015 (previously 30 September 2014); see the Support Lifecycle page.
• Unlike PowerHA 6.1, the new version PowerHA 7.1 uses the CAA (Cluster Aware AIX) functionality integrated into AIX. The DMS (Dead Man Switch) is off by default.
• PowerHA 7.1.3 (available since Dec. 2013) delivers many customer requirements as new features.
• Minimum levels
  PowerHA version: >= 7.1.3
  AIX version: AIX 6.1 TL 9 with SP 1, or AIX 7.1 TL 3 with SP 1
Supported Versions / Combinations

              AIX 4.3.3  AIX 5.1  AIX 5.1 (64-bit)  AIX 5.2  AIX 5.3  AIX 6.1  AIX 7.1
HACMP 4.5     No         Yes      No                Yes      No       No       No
HACMP/ES 4.5  No         Yes      Yes               Yes      No       No       No
HACMP/ES 5.1  No         Yes      Yes               Yes      Yes      No       No
HACMP/ES 5.2  No         Yes      Yes               Yes      Yes      No       No
HACMP/ES 5.3
No
No
No
Yes
Yes
HACMP/ES 5.4.0
No
No
No
TL8+
TL4+
No
No
No
TL8+
TL4+
No
No
No
No
TL9+
TL2,SP1+
No
No
No
No
TL9+
TL2,SP1+
No
No
No
No
No
TL6+
No
No
No
No
No
TL7+
TL1 SP2
No
No
No
No
No
TL8 SP1
TL2 SP1
No
No
No
No
No
TL9 SP1
TL3 SP1
PowerHA 7.1.2
No
No
No
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD101347
PowerHA 7.1: General Information
Interesting Features in 7.1.3

Unicast-based heartbeat

The CAA environment now offers a choice between IP unicast and IP multicast for IP heartbeating.

Default settings:
• Installation = Unicast
• Migration = Multicast (Unicast can be selected)
[ PowerHA System Mirror Migration Check ]
Your cluster can use multicast or unicast messaging for heartbeat.
Multicast addresses can be user specified or default (i.e. generated by AIX).
Select the message protocol for cluster communications:
1 = DEFAULT_MULTICAST
2 = USER_MULTICAST
3 = UNICAST
Select one of the above or "h" for help or "x" to exit:

Tie Breaker Support
The following cluster split and merge handling policies are supported:
• Split: No Action / Merge: Majority
• Split: Tie Breaker / Merge: Tie Breaker
These are only supported for linked clusters:
• Split: No Action / Merge: Priority
• Split: No Action / Merge: Manual
• Split: Tie Breaker / Merge: Priority
• Split: Manual / Merge: Manual
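The policy combinations listed on this slide can be encoded as a small lookup, for example to sanity-check a planned configuration in a build script. This is only a sketch: the function name and the lowercase, underscore-separated policy spellings are made up for illustration and are not clmgr or SMIT values.

```sh
#!/bin/sh
# Classify a split/merge policy pair according to the slide.
# Prints "supported" (listed without restriction), "linked_only"
# (listed for linked clusters only) or "unsupported" (not listed).
# Hypothetical helper; the policy spellings are illustrative.
split_merge_support() {
    case "$1/$2" in
        no_action/majority|tie_breaker/tie_breaker)
            echo supported ;;
        no_action/priority|no_action/manual|tie_breaker/priority|manual/manual)
            echo linked_only ;;
        *)
            echo unsupported ;;
    esac
}

split_merge_support tie_breaker tie_breaker
split_merge_support manual manual
```

Any pair that is not on the slide falls through to "unsupported", which keeps the check conservative.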
Stretched PowerHA 7.1 Cluster with Tie Breaker Disk
[Figure: three-site stretched cluster. Labels: Site 1 (Hardware 1, Node 1, VIO 1a, VIO 1b, SAN Storage 1, S/N xxxyyyz); Site 2 (Hardware 2, Node 2, VIO 2a, VIO 2b, SAN Storage 2, S/N zzzxxxy); Site 3 (Tie Breaker, NetApp, S/N yyyxxxz); distances 20 km and 200 km; IP Heartbeat 1 - 3 via switches; SAN Heartbeat; Highly Available Service Network Gateway; disks: repository, backup repository, rootvg + datavgs on each storage system.]

clmgr enhancements

Syntactical built-in help
• Lists all possible inputs for an operation.
• Shows valid groupings.
• Provides complete required vs. optional input information.
root@sascha:/> clmgr view -h
# Available classes for clmgr action "view":
log
mirror_group
report
snapshot
root@sascha:/> clmgr view report -h
clmgr view report cluster \
TYPE=html \
[ FILE=<PATH_TO_NEW_FILE> ] \
[ COMPANY_NAME="<BRIEF_TITLE>" ] \
[ COMPANY_LOGO="<RESOLVEABLE_FILE>" ]
SMITTY search
PowerHA SystemMirror
Move cursor to desired item
| Can't find what you are looking for ? |
Cluster Nodes and Networks| |
Cluster Applications and R| Move cursor to desired item and press Enter. |
| |
System Management (CSPOC)| [MORE...785] |
Problem Determination Tool| # (cm_manage_networks_menu) |
Custom Cluster Configurati| # Add a Network |
| (cm_add_network) |
Can't find what you are lo| # Change/Show a Network |
Not sure where to start ? | (cm_change_show_network) |
| # Remove a Network |
| (cm_remove_network) |
| # Network Interfaces |
| # (cm_manage_interfaces_menu) |
| # Add a Network Interface |
| (cm_add_interfaces) |
| # Change/Show a Network Interface |
| (cm_change_show_interfaces) |
| # Remove a Network Interface |
| (cm_remove_interfaces) |
| # Define Repository Disk and Cluster IP Address |
| (cm_define_repos_ip_addr) |
| # Configure Cluster Split and Merge Policy |
| (cm_cluster_split_merge) |
| [MORE...223] |
| |
| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1=Help | /=Find n=Find Next |

clmgr enhancements – continued ...

Native HTML report
• Contains more cluster configuration information than any other report.
• Can be scheduled with AIX core functionality such as cron.
• Portable: can be sent by e-mail without loss of information.
clmgr view report cluster FILE=/tmp/a2_b2_cl_report.html TYPE=html COMPANY_NAME=My_Test_Company
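Scheduling the report through cron could look roughly like the sketch below, which only builds the crontab line and prints it. The helper name is hypothetical, and the absolute path /usr/es/sbin/cluster/utilities/clmgr is an assumption that should be checked on the target system (cron jobs usually run with a minimal PATH).

```sh
#!/bin/sh
# Build a crontab line that regenerates the HTML cluster report once a day.
# Hypothetical helper; the clmgr path is an assumption, verify it locally.
report_cron_line() {
    # $1 = hour of day (0-23), $2 = path of the HTML file to write
    printf '0 %s * * * /usr/es/sbin/cluster/utilities/clmgr view report cluster TYPE=html FILE=%s\n' "$1" "$2"
}

report_cron_line 6 /tmp/a2_b2_cl_report.html
```

The printed line can be reviewed and then added to root's crontab with `crontab -e`.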

clmgr enhancements – continued ...

Availability report
• The time period can be defined
• Can be restricted to an application controller
root@sascha:/> clmgr view report availability
No application data could be found for this report. This is often caused
by a lack of, or problem with, application monitoring. Application monitors
produce the historical data needed for this report.
Analysis begins: Wednesday, 02April2014, 00:00
Analysis ends: Friday, 02May2014, 23:59
Application analyzed: sasapp
Total time: 30 days, 23 hours, 59 minutes, 59 seconds
Uptime:
Amount: 30 days, 20 hours, 38 minutes, 15 seconds
Percentage: 99.55%
Longest period: 21 days, 1 hours, 29 minutes, 32 seconds
Downtime:
Percentage: 0.45%
...
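The uptime percentage in the sample report can be checked by hand: uptime divided by total time, both converted to seconds. The sketch below redoes that arithmetic with the figures from the report; the function names are illustrative and not part of clmgr.

```sh
#!/bin/sh
# Convert "days hours minutes seconds" to seconds.
to_seconds() {
    echo $(( $1 * 86400 + $2 * 3600 + $3 * 60 + $4 ))
}

# Print uptime as a percentage of total time, rounded to two decimals.
uptime_percent() {
    # $1 = uptime in seconds, $2 = total time in seconds
    awk -v up="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * up / total }'
}

total=$(to_seconds 30 23 59 59)   # total time from the report
up=$(to_seconds 30 20 38 15)      # uptime amount from the report
uptime_percent "$up" "$total"     # prints 99.55
```

The result matches the 99.55% shown in the report, which leaves 0.45% downtime.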





clmgr enhancements
• Embedded hyphen and leading digit support in node labels
  Examples: first-node or 2ndnode
• Dynamic hostname change (temporary or permanent)
• Several new Smart Assists for DB2, Oracle, WebSphere, MQSeries, FileNet P8 (Enterprise Content Manager), Lotus Domino Server, TSM, Tivoli Systems Director, SAP Netweaver, SAP MaxDB, SAP Live Cache Hot Standby
• Cluster copying
• Cluster Aware AIX (CAA) enhancements:
  – Scalability: CAA now supports up to 32 nodes and 1024 shared disks (PowerHA 7.1 can handle more than 1024 shared disks)
Interesting Features in 7.1.3: HyperSwap Enhancements
• Continuous availability against storage failures
• Substitutes the secondary storage to take the place of a failed primary device
  – Non-disruptive: applications keep running
  – Key value add to HA/DR deployments
Customer benefits
– Unplanned HyperSwap:
  • Continuous availability against storage failures
– Planned HyperSwap:
  • Storage maintenance without downtime
  • Storage migration without downtime
[Figure: cluster across Site 1 and Site 2 with HyperSwap (SWAP) between the primary DS8K and its synchronous mirror. Legend: active path, passive path.]
Systems Director Cluster Simulator
[Figure: Director Server and Site 1 exchange the cluster configuration as XML (<PowerHAXml><Cluster>…</Cluster></PowerHAXml>) via Export and Deploy. The Director Console in the browser displays it in connected mode or in XML mode.]
3 modes available:
● Online
● Offline Simulation
● Planning
Heartbeating
• CAA is responsible for heartbeating in a PowerHA 7.1 cluster.
• The following can be used for heartbeating:
  – IP networks
  – SAN adapters (sfwcomm)
  – Repository disk
• The disk heartbeat (net_diskhb_01) known from PowerHA 6.1 and earlier can no longer be used.
• Unless specific IP networks are explicitly configured out of heartbeating, all IP networks are used for the heartbeat.
• The repository disk heartbeat only starts when all other networks no longer work for heartbeats.
Heartbeating
IP heartbeating
• As of PowerHA 7.1.3, unicast can again be used for communication.
• Default settings:
  – Installation = Unicast
  – Migration = Multicast
• Cluster communication uses the network with the fastest RTT (Round Trip Time). If that network is disrupted, communication switches to the network that then has the fastest RTT.
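The selection rule described here (use the network with the lowest RTT, and re-evaluate when it fails) can be illustrated with a few lines of shell. This is not how CAA is implemented; it is only a sketch of the rule, with made-up network names and RTT values.

```sh
#!/bin/sh
# Read "name rtt" pairs on stdin and print the name with the lowest RTT.
# Illustrative only: CAA does this selection internally.
fastest_network() {
    awk 'NR == 1 || $2 + 0 < best { best = $2 + 0; name = $1 }
         END { if (NR > 0) print name }'
}

# All three networks healthy: net_b has the lowest RTT.
printf '%s\n' 'net_a 7' 'net_b 3' 'net_c 12' | fastest_network

# net_b disrupted and removed from the list: net_a is now the fastest.
printf '%s\n' 'net_a 7' 'net_c 12' | fastest_network
```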
Heartbeating
SAN heartbeating
• With virtualization, SAN heartbeating works with the WWPNs of the PHYSICAL adapters in the VIOS.
• A Storage Framework Communication device must be installed and configured so that virtualized clients can use it.
• Target mode must be enabled for the adapters on the VIOS.
• Client communication runs over PVID 3358 of the hypervisor.
• Not supported with LPM (Live Partition Mobility); see the link to the Information Center.
• All adapters used must be in one zone.
Repository Disk
• The repository disk cannot be mirrored with LVM.
• The repository disk plays a key role at startup and in monitoring the health of the cluster. In particular, it is used for heartbeats, cluster messages and node-to-node synchronization.
• A size of 1 GB is entirely sufficient for the repository disk.
• If the repository disk is not available when the LPAR starts, the CAA cluster is not started.
• Health management provides 2 core functions that are used in the cluster:
1. Continuous health monitoring
2. Distress time cluster communication
Repository Disk
Health Management

Continuous Health Monitoring
CAA and the disk device drivers maintain health counters per node. These health counters are updated and read at least once every two seconds by the Storage Framework device driver. The health counters of the other nodes are compared every 6 seconds to determine whether the other nodes are still functional (note that this time might change in the future if necessary). Failure of any of these writes and reads results in repository-failure-related events to CAA and PowerHA. The administrator then has to provide a new disk to replace the original failed repository disk.

Distress Time Cluster Communication
When all network interfaces have failed (e.g. for a node), the node is in distress condition. In this distress environment, CAA and the Storage Framework use the repository disk for all necessary communication between the distressed node and the other nodes. This type of communication requires a certain area of the disk to be set aside per node for writing the messages meant to be delivered to other nodes. This disk space is allocated automatically at cluster creation time. No action from the customer is needed. When operating in this mode, each node has to scan the message areas of all other nodes several times per second to receive any messages meant for it.
Since this second method is not the most efficient form of communication, as it requires a lot more polling of the disk, it is expected to be used only when the cluster is in distress mode. The failover happens automatically without any intervention from the customer's side.
Repository Disk
Possible mirroring
• With the SVC there is a way to make the repository disk highly available. This may also be possible with products from other storage vendors. Please ask your storage vendor.
Repository Disk
Example failure scenario
[Figure: PowerHA cluster across two datacenters. Datacenter 1: Hardware 1 with Cluster Node 1 (a2), VIO 1 and VIO 2. Datacenter 2: Hardware 2 with Cluster Node 2 (b2), VIO 1 and VIO 2. Both nodes: AIX 7.1 TL3 SP1, NPIV mapping, PowerHA 7.1 TL3, unicast. Links: IP Heartbeat 1 via switches, SAN heartbeat. SAN Storage 1 and SAN Storage 2 each hold rootvg + datavgs; disks labelled repository and backup repository.]
Repository Disk
Doc APAR for simultaneous node and repository disk failure
Anyone operating a PowerHA 7.1 environment should know doc APAR IV50788:
IV50788: DOC HA 7.1 HOWTO HANDLE SIMULTANEOUSLY REP DISK AND NODE FAILURE
Link: http://www-01.ibm.com/support/docview.wss?uid=isg1IV50788
The doc APAR is a procedural instruction. It covers a simultaneous node and repository disk failure in which, for availability reasons, the repository disk is re-created on another disk. When the "failing node" is available again, a cleanup must be performed to restore full functionality.
Repository Disk
Special case: total cluster failure and missing repository disk
Situation: a site failure (node and disk), followed by a failure in the node taking over that leads to a reboot before the repository disk can be replaced. Without a repository disk, the CAA cluster does not start after the reboot.

Solution:
• The cluster is restored from the cache file.
• Described in the Redbook IBM PowerHA SystemMirror 7.1.3 for AIX Reference Guide

Prerequisites:
Version | Re-create possible | Fixed
< PowerHA 7.1.2 | Special procedure during outage | Please contact your support center
= PowerHA 7.1.2 | (Yes) | Backport required. Please contact your support center *1
= PowerHA 7.1.3 and AIX 7.1 TL3 SP1 or AIX 6.1 TL9 SP1 | YES | AIX and PowerHA fix required depending on level. Please open a problem record to request fixes. *1
>= PowerHA 7.1.3 SP1 *2 and AIX 7.1 TL3 SP3 or AIX 6.1 TL9 SP3 | YES | Included (Statement of Direction)
*1 Must be installed before the failure occurs!
*2 Available since 18 May 2014
Attention:
recfgct causes an immediate halt of the cluster node
Description:
Sometimes it is necessary to run recfgct (reconfigure RSCT), e.g. to make DLPAR operations work again.
Affected:
PowerHA 7.1
Detailed Description:
On PowerHA 6.1 this is possible while HA is UP as long as no RSCT/RMC based functionality is used:
- no "Process Application Monitor" is defined
- no "User Defined Events" are configured
- "Dynamic Node Priority" is not used.
This is because HA 6.1 uses its "own" RSCT instance (topsvcs+grpsvcs).
Only the functions above are based on the RSCT/RMC domain (ctrmc and RSCT resource manager). Most HA 6.1 clusters do not use any of those three functions.
PowerHA 7.1 uses CAA and the RSCT Peer Domain (cthags) and no longer its own instance.
Therefore it is not possible to run recfgct on a PowerHA 7.1 node without downtime.
Running recfgct on an active PowerHA 7.1 node (PowerHA is UP) causes the node to "halt" !!!
Please call your local support for a valid procedure to run recfgct.
developerWorks entry: do not use with PowerHA 7.1
Attention:
An AIX update that includes RSCT filesets causes an immediate halt of the cluster node
Description:
In new AIX releases and Technology Levels, RSCT filesets are included in the BOS update downloads.
Affected:
PowerHA 7.1
Detailed Description:
PowerHA uses the RSCT RPD (cthags). The RSCT RPD uses CAA.
Thus PowerHA 7.1 depends on the RSCT RPD and thereby indirectly on CAA.
During an update of the RSCT filesets the RSCT daemon is refreshed and the node is halted.
Migration
Upgrade Path Options

Source version (rows) / Target version (columns)
           5.4.1     5.5       6.1         7.1.1  7.1.2  7.1.3
5.4        R/S/O/N   R/S/O/N*  R*/S/O*/**  *      NA     NA
5.4.1      NA        R/S/O/N   R/S/O/N     R/S/O  NA     NA
5.5        NA        NA        R/S/O/N     R/S/O  R/S/O  NA
6.1        NA        NA        NA          R/S/O  R/S/O  R/S/O
7.1.0      NA        NA        NA          S/O    S/O    S/O
7.1.1      NA        NA        NA          NA     R/S/O  R/S/O
7.1.2      NA        NA        NA          NA     NA     R/S/O

R = Rolling upgrade
S = Snapshot upgrade
O = Offline upgrade
N = Non-disruptive upgrade
* = Must upgrade to 5.4.1 or above first, making it a two-step upgrade.
** = NDU not officially supported beyond two versions
Migration
• Migration is only possible from PowerHA 6.1 or PowerHA 7.1.x.
• Clusters running PowerHA 5.5 or older must first be migrated to PowerHA 6.1.
• The following migration methods are possible:
  – Offline
  – Rolling
  – Snapshot
• The following filesets must be installed before the migration:
  – bos.cluster.rte (also contains clmigcheck; this is an AIX fileset!)
  – bos.ahafs
  – bos.clvm.enh
  – devices.common.IBM.storfwork (only if SAN adapter heartbeating is to be used)
• The communication path must be set to the interface that carries the hostname.
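The fileset prerequisite can be checked before starting the migration. The sketch below assumes that `lslpp -l <fileset>` exits non-zero when the fileset is not installed; the function name is made up, and the storage framework fileset is left out because it is only needed for SAN adapter heartbeating.

```sh
#!/bin/sh
# Print every fileset from the argument list that lslpp does not report
# as installed. Hypothetical helper for a pre-migration check.
missing_filesets() {
    for fs in "$@"; do
        lslpp -l "$fs" >/dev/null 2>&1 || echo "$fs"
    done
}

# Guarded so the sketch does nothing on systems without lslpp.
if command -v lslpp >/dev/null 2>&1; then
    missing_filesets bos.cluster.rte bos.ahafs bos.clvm.enh
fi
```

An empty result means all listed filesets were found on the node where the check ran; it has to be repeated on every cluster node.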
Migration
Unsupported components / methods
Starting with PowerHA SystemMirror 7.1, the following features are no longer available:
• IP address takeover (IPAT) via IP replacement
• Locally administered address (LAA) for hardware MAC address takeover (HWAT)
• Heartbeat over IP aliases
• The following IP network types:
  – ATM
  – FDDI
  – Token Ring
• The following point-to-point (non-IP) network types:
  – RS232
  – TMSCSI
  – TMSSA
  – Disk heartbeat (diskhb)
  – Multinode disk heartbeat (mndhb)
• WebSMIT (replaced with the IBM Systems Director plug-in)
Possible problem
Hostname as alias
The hostname must not be defined as an alias on the interface!
This can be the case when clusters
• have few IP networks
and
• were moved from a dedicated to a virtualized environment.
If the hostname is configured as an IP alias on the boot network, the clmigcheck script reports an error.
In that case a reconfiguration is necessary before the migration can be carried out.
Unicast or Multicast
The multicast IP address is generated in the case of a stretched cluster configuration. CAA generates the multicast address from a local IP address associated with the hostname of a cluster node by replacing the first byte of the IP address with 228. The address is generated during the first synchronization, at the time of CAA cluster creation. For example, you have a 2-node cluster with a single IP segment at both sites:
ha1clFp: 192.155.5.123
ha2clFp: 192.155.5.124
The default generated multicast IP in this case is 228.155.5.123.
root@ha2clFp:/ # mping -s -a 228.155.5.123 -c 2
mping version 1.1
mpinging 228.155.5.123/4098 with ttl=1:
32 bytes from 9.155.5.123 seqno=0 ttl=1 time=5.375 ms
root@ha1clFp:/ # mping -r -a 228.155.5.123
mping version 1.1
Listening on 228.155.5.123/4098:
Replying to mping from 9.155.5.124 bytes=32 seqno=0 ttl=1
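The address rule described on this slide (replace the first byte of the node's IP address with 228) is easy to reproduce, for example to predict the address before testing it with mping. A minimal sketch; the function name is made up and the input is assumed to be a dotted IPv4 address.

```sh
#!/bin/sh
# Derive the default multicast address from a node IP by replacing the
# first byte with 228, as described on the slide.
caa_default_mcast() {
    # $1 = IPv4 address associated with the node's hostname
    echo "228.${1#*.}"
}

caa_default_mcast 192.155.5.123   # prints 228.155.5.123
```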
Procedure
Migration 7.1.x > 7.1.3 and change multicast > unicast
• First, all prerequisites must be met:
  – AIX level
  – PowerHA level
• To complete the migration / update, the cluster must be restarted once.
• The switch from multicast to unicast can be done online without downtime.
root@sascha:/hacmp/scripts> lscluster -c | grep Mode
Communication Mode: multicast
root@sascha:/hacmp/scripts> clmgr modify cluster HEARTBEAT_TYPE=unicast
root@sascha:/hacmp/scripts> lscluster -c | grep Mode
Communication Mode: multicast
root@sascha:/hacmp/scripts> /usr/es/sbin/cluster/utilities/cldare "-rt" -C interactive
...
PowerHA SystemMirror Cluster Manager current state is: ST_STABLE.....completed.
root@sascha:/hacmp/scripts> lscluster -c | grep Mode
Communication Mode: unicast
Example on developerWorks
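In a script, the check before and after the change can be reduced to extracting the value of the "Communication Mode" line. The sketch below parses text in the format shown in the session; the function name is made up, and the output format is assumed to match the sample.

```sh
#!/bin/sh
# Extract the value of the "Communication Mode" line from lscluster -c
# output supplied on stdin. Hypothetical helper.
comm_mode() {
    sed -n 's/^[[:space:]]*Communication Mode:[[:space:]]*//p'
}

# Demonstration with a sample line instead of a live cluster.
printf 'Communication Mode: unicast\n' | comm_mode
```

On a cluster node the call would be `lscluster -c | comm_mode`. As the session shows, the value only changes after the cluster has been synchronized.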
Multicast address in the lscluster -i output when unicast is used
The lscluster -i output always contains a multicast address as well, regardless of the installation or migration path.
root@sascha:/> lscluster -i
Network/Storage Interface Query
...
Node sascha
Node UUID = 064dc196cf9011e3abfae6081000301e
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = E6:EB:D0:00:30:02
Smoothed RTT across interface = 7
Mean deviation in network RTT across interface = 3
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 9.154.97.99 broadcast 9.154.97.127 netmask 255.255.255.128
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.154.97.97
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
...
Analysis of a tcpdump showed no communication over this address when unicast was in use.
clmigcheck
Selecting the IP heartbeat
When the IP heartbeat is selected, the default is multicast with the multicast address generated by HACMP.
2 = USER_MULTICAST
3 = UNICAST
Select one of the above or "h" for help or "x" to exit:
clmigcheck
Display of the old net_diskhb
When the clmigcheck script is run, it points out that the old disk heartbeats are not migrated. This is not an error and the script continues normally. It is meant only as a notice.
CONFIGWARNING: The configuration contains unsupported hardware: Disk
Heartbeat network. The PowerHA network name is net_diskhb_01. This will be
removed from the configuration during the migration
to PowerHA System Mirror 7.1.
Hit <Enter> to continue
[ PowerHA System Mirror Migration Check ]
The ODM has no unsupported elements.
Hit <Enter> to continue
clmigcheck
Defect in PowerHA 6.1 SP10
• There is a defect in PowerHA 6.1 SP10 that affects the cllspvids script.
• clmigcheck calls the script to build the list of available hdisks for the repository disk.
• The list stays empty even if disks are available and correctly configured.
• To find out whether you are affected by the defect:
root@leonard:/> /usr/es/sbin/cluster/cspoc/cllspvids -n leonard sheldon
# No free disks found.
# Check each node to see if any disks need to have PVIDs allocated.
root@leonard:/> /usr/es/sbin/cluster/cspoc/cllspvids
00c0fb32fcff8cc2 ( hdisk3 on all cluster nodes )
• Unfortunately, no ifix is available yet (as of 4 May 2014).
Migration
Rolling Migration
Advantages:
• The cluster remains available while the first node is updated
• Less time pressure during the updates
Disadvantages:
• Several downtimes (each time the application is moved)
• No changes to the cluster during the mixed-cluster phase
• Automation is not possible
Note:
• The CAA cluster is only created when the last cluster node is upgraded! (caavg_private only becomes visible then)
Migration
Snapshot Migration
Advantages:
• The snapshot also serves as the backup of the cluster configuration
• The snapshot migration lends itself very well to automation
• Legacy leftovers are disposed of during deinstallation
• The CAA cluster is created right away when the snapshot is applied
Disadvantages:
• Application downtime
• All nodes are changed at the same time
Migration
Automation
clmigcheck script
• The clmigcheck script, delivered with the bos.cluster.rte fileset, supports the cluster migration. The script does not accept any parameters, not even in the form of a response file or similar.
• To automate the migration, the administrator must perform all of the script's checks. In addition, the automation must create and distribute the file /var/clmigcheck/clmigcheck.txt, which is otherwise created by the script and distributed to all cluster nodes.
root@sascha /root# cat /var/clmigcheck/clmigcheck.txt
CLUSTER_TYPE:STANDARD
CLUSTER_REPOSITORY_DISK:00f840038a4165e7
CLUSTER_MULTICAST:UNI
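An automation that replaces clmigcheck has to produce this file itself. The sketch below writes the three keys seen in the sample file; the function name is made up, only the values from the sample are known to be valid here, and other levels may expect additional keys. It writes to a scratch file, not to /var/clmigcheck/clmigcheck.txt.

```sh
#!/bin/sh
# Write a clmigcheck.txt with the three keys shown in the sample file.
# Hypothetical helper; valid values other than those in the sample are
# not covered by the source.
write_clmigcheck() {
    # $1 = target file, $2 = cluster type, $3 = repository disk PVID,
    # $4 = heartbeat setting
    {
        printf 'CLUSTER_TYPE:%s\n' "$2"
        printf 'CLUSTER_REPOSITORY_DISK:%s\n' "$3"
        printf 'CLUSTER_MULTICAST:%s\n' "$4"
    } > "$1"
}

# Example: write to a scratch file and show the result.
out=${TMPDIR:-/tmp}/clmigcheck.txt.example
write_clmigcheck "$out" STANDARD 00f840038a4165e7 UNI
cat "$out"
```

Distributing the file to all cluster nodes, which clmigcheck normally does, remains a separate step for the automation.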
Migration
Example template for an automated snapshot migration
What else to keep in mind!
• Create a backup repository disk right away when using LVM mirroring across different SAN storage systems
• Define the disk as repository backup in PowerHA
• Extend the monitoring scripts (repository errors)
• Define special tests for the repository disk in the acceptance tests for releases and new builds
• Check your own scripts for cluster build / cluster extension
• IBM PowerHA SystemMirror Rapid Deploy Cluster Worksheets
  http://www.redbooks.ibm.com/abstracts/tips1176.html?Open
Service offerings around PowerHA migrations
• 1-day workshop including a one-hour preliminary discussion by conference call
• System health check
• Release recommendation, building on the health check where applicable
• On-site migration support
[Figure: service building blocks: health check, release recommendation, migration planning, 1-day workshop, migration support]
Contact
Sascha Wycisk
Senior IT Architect
Open Client Software Support
IBM Deutschland GmbH
Niederlassung Münster
Tel.: 0171-33-73-652
E-Mail: [email protected]
