Saturday, May 11, 2013

NetApp head swap

Advanced post!

Step:1 Prepare a checklist that maps the existing hardware to the new hardware, and run any new network cables well ahead of the swap. Label all cables on the original cluster with their current and new ports.

Step:2 Validate physical layout of new equipment once the old equipment has been removed.

Step:3 Identify the desired end version of Data ONTAP and, if necessary, a “middle” version of Data ONTAP that is supported by both the old and new sets of hardware. In this case: 8.1.1P1, no middle step required.

Step:4 Download all required versions of Data ONTAP to your laptop. If possible, pre-load the OS images into the /etc/software folder. Be sure to capture the netboot files for the new versions of the OS as well.

Step:5 Check the System Configuration Guide to ensure that the new system and the desired OS support the disk shelves and shelf controller modules.

Step:6 Inventory all equipment received to ensure that it matches what is on the sales order.

a. Capture licenses for the new controllers and check that they match the old licenses.
b. Document CIFS shares for the current systems
c. Document NFS exports for the current systems
d. Document LUNs and LUN Serials for the current systems
e. Document WWNN and WWPNs for the current systems
f. Document igroups
g. Document LUN mappings
h. Document disk ownership
i. Document ASIS schedule for current systems
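The documentation items above can be captured in one pass. The following is a minimal sketch, assuming a 7-Mode system and a caller-supplied `run(filer, command)` function (hypothetical, for example a wrapper around your SSH client) that returns the command output as text. The command names are the usual 7-Mode ones; verify each against your Data ONTAP release before relying on it.

```python
from pathlib import Path
from typing import Callable, Dict

# Checklist item -> 7-Mode command that displays it.
# Assumption: these match your ONTAP release; check the man pages first.
CAPTURE_COMMANDS: Dict[str, str] = {
    "licenses": "license",
    "cifs_shares": "cifs shares",
    "nfs_exports": "exportfs",
    "luns_and_serials": "lun show -v",
    "wwnn": "fcp nodename",
    "wwpns": "fcp show adapter",
    "igroups": "igroup show",
    "lun_mappings": "lun show -m",
    "disk_ownership": "disk show -v",
    "asis_schedule": "sis config",
}


def capture_all(
    filer: str,
    run: Callable[[str, str], str],
    out_dir: Path,
) -> Dict[str, Path]:
    """Run every capture command and save each output to its own file.

    `run` is a hypothetical helper supplied by the caller; it takes the
    filer name and a command string and returns the output as text.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    saved: Dict[str, Path] = {}
    for item, command in CAPTURE_COMMANDS.items():
        path = out_dir / f"{filer}_{item}.txt"
        # Record the command in a header line so the file explains itself.
        path.write_text(f"# {filer}> {command}\n{run(filer, command)}\n")
        saved[item] = path
    return saved
```

Keeping one file per item makes it easy to compare the same capture after the swap.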

Step:7 Prepare scripts to rebuild if needed:

a. LUN Serials
b. igroups
c. LUN Maps
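One way to prepare these rebuild scripts is to generate the command lines from the data documented earlier. This is a sketch only: the `Lun`, `Igroup`, and `LunMap` records and all sample values are hypothetical, and the command forms (`igroup create`, `lun serial`, `lun map`) are the 7-Mode ones as I understand them. It assumes a LUN must be offline while its serial number is changed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Lun:
    path: str    # e.g. /vol/vol1/lun0 (hypothetical)
    serial: str  # serial number documented before the swap


@dataclass
class Igroup:
    name: str
    protocol: str  # "fcp" or "iscsi"
    ostype: str
    initiators: List[str]


@dataclass
class LunMap:
    path: str
    igroup: str
    lun_id: int


def rebuild_commands(
    luns: List[Lun], igroups: List[Igroup], maps: List[LunMap]
) -> List[str]:
    """Return the commands, in order, that would recreate the SAN config."""
    cmds: List[str] = []
    for g in igroups:
        # -f creates an FCP igroup, -i an iSCSI igroup.
        flag = "-f" if g.protocol == "fcp" else "-i"
        members = " ".join(g.initiators)
        cmds.append(f"igroup create {flag} -t {g.ostype} {g.name} {members}")
    for lun in luns:
        # Assumption: the serial can only be set while the LUN is offline.
        cmds.append(f"lun offline {lun.path}")
        cmds.append(f"lun serial {lun.path} {lun.serial}")
        cmds.append(f"lun online {lun.path}")
    for m in maps:
        cmds.append(f"lun map {m.path} {m.igroup} {m.lun_id}")
    return cmds
```

Write the returned list to a text file per controller and review it by hand; it is meant as a fallback to paste from, not something to run blindly.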

Step:8 Create a “map” of old cable locations to new cable locations.
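A cable map is easy to get wrong when port names change between platforms. The sketch below checks a map for the two most common mistakes: an in-use old port with no destination, and two cables assigned to the same new port. The port names in the usage note are made up for illustration.

```python
from collections import Counter
from typing import Dict, List


def check_cable_map(
    cable_map: Dict[str, str], old_ports_in_use: List[str]
) -> List[str]:
    """Return a list of problems found in an old-port -> new-port map."""
    problems: List[str] = []
    # Every cabled port on the old head needs a destination on the new one.
    for port in old_ports_in_use:
        if port not in cable_map:
            problems.append(f"no new port assigned for {port}")
    # No new port may receive more than one cable.
    counts = Counter(cable_map.values())
    for new_port, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"new port {new_port} assigned {n} times")
    return problems
```

For example, `check_cable_map({"e0a": "e1a", "e0b": "e1a"}, ["e0a", "e0b"])` reports that `e1a` is assigned twice. An empty list means the map is at least internally consistent.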

Step:9 Capture a copy of the /etc/rc and /etc/hosts files from the current systems. Make the changes required to these files for the new configuration and save a copy of the new files on your laptop.

Step:10 Run Info Collect script. Be sure to log output.

Step:11 Boot the new filer heads and install the end version of Data ONTAP on them. Install the new OS files onto the existing filers as well, BUT DO NOT REBOOT.

Step:12 Send ASUP from old hardware

Step:13 Confirm that each controller has only a single root volume, and that no other volume is set as diskroot.

a. Go/No-Go Checkpoint. Beyond this step, services start going offline.
b. Disable SnapVault/SnapMirror
c. Disable cluster failover
d. Disable CIFS
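The root-volume check in Step 13 is a good candidate for an explicit go/no-go test. This is a minimal sketch, assuming you have already turned the `vol status` output for one controller into a dictionary of volume name to option list; the parsing itself is left out because the output layout varies by release.

```python
from typing import Dict, List


def root_volume_problems(volumes: Dict[str, List[str]]) -> List[str]:
    """Return reasons for a no-go; an empty list means the check passed.

    `volumes` maps each volume name to its list of option strings,
    for example {"vol0": ["root"], "vol1": ["nosnap=on"]}.
    """
    roots = sorted(name for name, opts in volumes.items() if "root" in opts)
    diskroots = sorted(
        name for name, opts in volumes.items() if "diskroot" in opts
    )
    problems: List[str] = []
    if len(roots) != 1:
        problems.append(
            f"expected exactly 1 root volume, found {len(roots)}: {roots}"
        )
    for name in diskroots:
        problems.append(f"volume {name} is flagged diskroot")
    return problems
```

Run it once per controller and treat any non-empty result as a no-go until it is explained.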

Step:14 Reboot filers and verify successful upgrade of Data ONTAP.

Step:15 Copy the modified /etc/rc and /etc/hosts files required for the new hardware onto the filers

Step:16 Halt filers

Step:17 Remove the old equipment from the rack and replace with new hardware

a. Cable new systems as per cable map developed previously
b. Ensure that ONLY the previously used shelves are connected to the new heads.

Step:18 Boot into Maintenance Mode

Step:19 Confirm or change on-board FC port settings (target/initiator) – all onboard ports should be target

Step:20 Halt and boot both controllers into maintenance mode if any ports needed to be switched in the previous step

Step:21 Verify that all onboard ports on both controllers are now set to initiator or target correctly

a. Reassign disks from old controllers to new controllers
b. Verify fiber connections
c. Verify aggregate status on new equipment
d. Confirm that each head sees only one root aggregate. If more than one root aggregate is visible, re-check disk ownership. If more than two root aggregates exist across the cluster, coordinate with the customer to identify the correct root aggregate, and consider backing out if this cannot be accomplished.
e. Clear all mailbox disks (local and partner)
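The reassignment and root-aggregate rules above can be written down as a small decision helper. This is a sketch under assumptions: the system IDs are placeholders, `disk reassign -s <old> -d <new>` is the maintenance-mode form as I understand it, and treating a head that sees no root aggregate as an ownership problem is my own reading rather than something the steps state.

```python
from typing import Dict, List


def reassign_command(old_sysid: str, new_sysid: str) -> str:
    """Build the maintenance-mode command that moves disk ownership."""
    return f"disk reassign -s {old_sysid} -d {new_sysid}"


def root_aggregate_action(root_aggrs_per_head: Dict[str, List[str]]) -> str:
    """Decide the next action from the root aggregates each head sees.

    Returns "proceed", "recheck disk ownership", or "escalate".
    """
    total = sum(len(aggrs) for aggrs in root_aggrs_per_head.values())
    if total > 2:
        # More than two across the cluster: involve the customer and
        # consider backing out.
        return "escalate"
    if any(len(aggrs) != 1 for aggrs in root_aggrs_per_head.values()):
        # Assumption: zero root aggregates on a head is also treated
        # as an ownership problem.
        return "recheck disk ownership"
    return "proceed"
```

For example, `root_aggregate_action({"headA": ["aggr0"], "headB": ["aggr0"]})` returns `"proceed"`.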

Step:22 Halt new systems

Step:23 Boot the new systems

Step:24 Confirm correct root volume and system identity.
a. Install licenses for new system
b. Reboot new system and bring it fully up
c. Verify disk cabling via WireGauge and/or “storage show disk -p”
d. Check aggregate status to verify that all aggregates are online
e. Check volume status to verify that all volumes are online
f. Verify network connectivity
g. Enable CIFS

Step:25 Validate
a. Verify CIFS shares are present
b. Verify NFS exports are present
c. Verify LUN serial numbers (reset if necessary)
d. Verify igroups (reset if necessary)
e. Verify LUNs are online & mapped (reset if necessary)
f. Verify ASIS Deduplication schedules and settings (reset if necessary)
g. Verify WWNN and WWPNs are correct (reset if necessary)
h. Verify RLM/BMC setup and reconfigure if necessary
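Most of the validation items above come down to comparing what was documented before the swap with what the new heads report. A minimal sketch of that comparison follows, assuming each category (shares, exports, igroups, and so on) has already been reduced to a set of strings; the category names and values are whatever you choose to record.

```python
from typing import Dict, List, Set


def diff_inventory(
    before: Dict[str, Set[str]], after: Dict[str, Set[str]]
) -> Dict[str, Dict[str, List[str]]]:
    """Report entries that disappeared or appeared in each category.

    Categories with no differences are left out of the report.
    """
    report: Dict[str, Dict[str, List[str]]] = {}
    for category in sorted(set(before) | set(after)):
        old = before.get(category, set())
        new = after.get(category, set())
        missing = sorted(old - new)      # present before, gone now
        unexpected = sorted(new - old)   # not present before the swap
        if missing or unexpected:
            report[category] = {"missing": missing, "unexpected": unexpected}
    return report
```

An empty report means every documented item is back. Anything listed under `missing` is a candidate for the "reset if necessary" actions.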

Step:26 Install the current version of Data ONTAP again to ensure the correct kernel on the boot media

Step:27 Verify that both primary and backup kernel versions on the boot media are correct

Step:28 Verify time, autosupport, and timezone settings (reset if necessary)

Step:29 Enable Cluster Failover

Step:30 Update all firmware to the latest (or desired) revision, including battery firmware if appropriate, and reboot filers

Step:31 Perform CF Failover and Giveback test from both controllers

Step:32 Perform single-path test on both controllers

Step:33 Perform VIF failover testing on both controllers

Step:34 Send ASUP from upgraded nodes

Step:35 Re-enable SnapVault/SnapMirror

Step:36 Time for application testing

