Windows VM iSCSI timeout when doing controller giveback for NetApp storage

Jephe Wu - http://linuxtechres.blogspot.com

Problem: the Windows VM's iSCSI drive timed out and disappeared after a NetApp giveback operation, which brought Microsoft SQL Server 2012 down, as its data files are on the iSCSI D drive.
Objective: find out why the iSCSI drive disappeared during the giveback operation.
Environment: Windows 2008 R2 VM on an ESXi 5.1 cluster; the D drive is an iSCSI LUN on NetApp ONTAP 7-Mode storage; the VM uses the Microsoft software iSCSI initiator to connect to the NetApp target portal group.



Observation
During giveback between the two controller heads on the NetApp, the iSCSI drive on the ESXi Windows 2008 R2 VM lost its connection to the target; the D drive timed out and then disappeared, which brought SQL Server 2012 down. After around 18 minutes, the iSCSI drive reconnected, SQL Server was restarted, and operation resumed.

There were no issues during takeover on the NetApp; the problem only occurred during giveback.

Root Cause
After some research, the NetApp KB article below identified the problem:

Microsoft iSCSI SW Initiator takes a long time to reconnect to the filer after disruption. - http://support.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=202007

I've pasted the KB article below:
------------------

Bug ID: 202007
Title: Microsoft iSCSI SW Initiator takes a long time to reconnect to the filer after disruption.
Duplicate of:
Bug Severity: 3 - Serious inconvenience
Bug Status: Closed
Product: Data ONTAP
Bug Type: ISCSI - Windows
Description:
 During initial target discovery ("Add Target"), the Microsoft iSCSI SW
 Initiator uses the iSCSI SendTargets command to retrieve from the target
 a list of IP addresses at which the target can be accessed.
 
 In a filer configuration with multiple physical networks or multiple VLANs,
 it is possible that some of the filer's addresses are not accessible to
 a given host.  In this situation, the SendTargets response sent by the
 filer will advertise some addresses which are not accessible by that host.
 
 When the Microsoft SW initiator loses connectivity to the target (such
 as during filer reboot, takeover, and giveback), the initiator attempts
 to reestablish connectivity to the target using the following default
 algorithm, which Microsoft calls 'port-hopping':
 
   - attempt to reconnect over the same IP address which was being
     used before the disruption
   - cycle through the other IP addresses from the SendTargets response,
     attempting to reconnect, until connectivity is reestablished.
 
 Each inaccessible IP address in the list can add a delay of 15-20
 seconds, the TCP connection establishment timeout.  If there are many
 inaccessible IP addresses in the list, it may take a long time for the
 Microsoft initiator to cycle through the list before it finally
 reconnects to the target.  If the total reconnect time exceeds the
 timeout configured on the host (MaxRequestHoldTime (non-MPIO), or
 PDORemovePeriod (MPIO)), the application will experience I/O errors.
 
Workaround:
 This long reconnect time can be minimized by disabling the use of
 the Microsoft 'port-hopping' technique.  This is achieved by directing
 the Microsoft iSCSI initiator to use a specific IP address to create
 a TCP connection. The steps are:
 
 1. In the "Logon to target" box, click "Advanced ..."
 2. In the "Target Portal" list, change the value from "Default" to a
    specific IP address.  Select the same filer IP address as was
    originally specified in the "Add Target Portal" dialog under 
    "Discovery" tab.
 
 After the disruption occurs, the Microsoft initiator will use only the
 IP address previously specified in the advanced logon setting to reconnect
 to the filer and will not try other IP addresses advertised in the
 SendTargets response.
 
Notes:
 
Fixed-In Version: This bug is not scheduled to be fixed; you may opt to open a technical support case if you would like to contact NetApp regarding the status of this bug. A complete list of releases where this bug is fixed is available here.
Related Bugs
Bug Watch Status: This bug is unwatchable.
--------------------

The article below also explains the issue:
http://software.tectrade.co.uk/SAN/NSeries/gc52129616.pdf (pages 16 and 17)

-------------
Microsoft iSCSI SW Initiator takes a long time to reconnect to the storage
system after disruption
In a storage system configuration with multiple physical networks or multiple
VLANs, the Microsoft iSCSI software Initiator can take several minutes to
reconnect to the storage system.

During initial target discovery (Add Target), the Microsoft iSCSI software
initiator uses the iSCSI SendTargets command to retrieve a list of IP addresses
at which the target can be accessed.

In a storage system configuration with multiple physical networks or multiple
VLANs, it is possible that some of the storage system's addresses are not
accessible to a given host. In this situation, the SendTargets response sent by
the storage system will advertise some addresses which are not accessible by
that host.

When the Microsoft iSCSI initiator loses connectivity to the target (such as
during storage system reboot, takeover, and giveback), the initiator attempts
to reestablish connectivity to the target using the following default algorithm,
which Microsoft calls “port-hopping”:

1. Attempt to reconnect over the same IP address which was being used
   before the disruption.
2. Cycle through the other IP addresses from the SendTargets response,
   attempting to reconnect, until connectivity is reestablished.

Each inaccessible IP address in the list can add a delay of 15-20 seconds,
which is the TCP connection establishment timeout. If there are many
inaccessible IP addresses in the list, it may take a long time for the iSCSI
initiator to cycle through the list before successfully reconnecting to the target.
If the total reconnect time exceeds the timeout configured on the host
(MaxRequestHoldTime for non-MPIO, or PDORemovePeriod for MPIO), the
Windows applications experience I/O errors.

This long reconnect time can be minimized by disabling the use of the
Microsoft “port-hopping” algorithm by using a specific target IP address for
each connection in the iSCSI Initiator.

Disabling the Microsoft iSCSI port-hopping algorithm
Disable the Microsoft iSCSI port-hopping algorithm in the iSCSI Initiator to
minimize the recovery time in configurations with many iSCSI target ports.
1. Open the Microsoft iSCSI initiator applet and select the Targets tab.
2. Click Log On.
3. In the Logon to target dialog box, click Advanced.
4. In the Target Portal list, change the value from Default to one of the
storage system IP addresses specified in the Target Portals list on the
Discovery tab.

After a disruption occurs, the Microsoft initiator uses only the IP address
specified to reconnect to the storage system and does not try other IP
addresses advertised in the SendTargets response.

Note: If the specified IP address is unreachable, no failover occurs.

---------------------

Note: changing the Target Portal from 'Default' to a specific IP in the advanced settings is persistent across reboots; Microsoft support replied as follows:

----------
Once you have set the iSCSI configuration to use the specific IP address, the binding is persistent even if we restart the server or the session gets disconnected. Unfortunately there is no specific utility by which we can confirm that it's still using the specific IP, but as per the configuration it will remain as it is.
This setting will remain persistent until the entries on the Discovery tab are removed. Once the entry on the Discovery tab is removed, we will have to re-configure once again with the specific IPs.
------------------


How to prove the above solution from the NetApp KB
Install Wireshark on the Windows VM, enable capture on all interfaces, and filter for iSCSI traffic only (tcp.port==3260) during the giveback operation. You will notice the initiator trying all the IPs from the non-storage-facing interfaces when it cannot communicate with the correct IP within a certain period.
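The roughly 18-minute outage observed above is plausible given the per-address delay described in the KB. A quick back-of-envelope calculation in shell, with purely illustrative numbers (the portal count, unreachable count, and retry cycles are assumptions, not measured values):

```shell
# Hypothetical scenario: the SendTargets response advertises 12 portal IPs,
# 10 of which are unreachable from this host; the initiator makes 5 full
# passes over the list, paying ~20 s TCP connect timeout per dead address.
cycles=5
dead_addrs=10
tcp_timeout=20
total=$((cycles * dead_addrs * tcp_timeout))
echo "worst-case reconnect delay: ${total}s (~$((total / 60)) min)"
# prints: worst-case reconnect delay: 1000s (~16 min)
```

With these assumed numbers the worst case works out to about 16 minutes, in the same ballpark as the 18-minute outage seen here.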
References

What are the parameters that control how the MS iSCSI initiator survives lost TCP connections without harming applications?

==========
During transient loss of connection, instead of reporting "Device not available" immediately, the Microsoft iSCSI initiator will try to reconnect to the target and resubmit outstanding SCSI commands.
There are three registry values related to MS iSCSI retry behavior, found under the following path:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\[Instance_Number]\Parameters
Note: the [Instance_Number] may be different from system to system, depending on how many SCSI adapters already exist on the system.
    The three registry values are:
  1. DelayBetweenReconnect [default: 5 (seconds)]
  2. MaxConnectionRetries [default: 0xFFFFFFFF, infinite]
  3. MaxRequestHoldTime [default: 60 (seconds)]
Explanations: normally you don't need to modify DelayBetweenReconnect and MaxConnectionRetries. MaxRequestHoldTime is probably the only one that you may want to change. It defines how long the Microsoft iSCSI initiator should hold and retry outstanding commands before notifying the upper layer of a Device Removal event. This event usually causes I/O failures in applications using the iSCSI disk. MaxRequestHoldTime is only relevant in non-MPIO environments; when MPIO is involved, this value is ignored.
A Device Removal event can be bad for applications actively using an iSCSI Logical Unit Number (LUN), especially if a cable pull, filer reboot, filer cluster failover, etc., takes more than the MaxRequestHoldTime of 60 seconds to recover. Unless you have special requirements that need the retry window to be smaller or larger, 180 (seconds) is a good value to start with.
Note: even after a Device Removal event is reported, the Microsoft iSCSI initiator will still keep trying to reconnect to the target, as defined by the first two registry values, DelayBetweenReconnect and MaxConnectionRetries.
The Windows iSCSI host must be rebooted after changing the registry value(s).
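As a sketch, the MaxRequestHoldTime change could be applied by importing a .reg file like the one below. The instance number 0000 is a placeholder (check which [Instance_Number]\Parameters key on your system belongs to the iSCSI initiator before importing), and dword b4 is 180 seconds:

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\0000\Parameters]
"MaxRequestHoldTime"=dword:000000b4
```

Reboot the Windows iSCSI host after importing, as noted above.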
=============






Keep Oracle VM server ethX names consistent across reboots

Jephe Wu - http://linuxtechres.blogspot.com

Problem: ethX names are randomly assigned to NIC MAC addresses and change after each reboot, giving different /etc/issue output each time.

Environment: Oracle VM server 3.1.1, PCI card with multiple NIC ports


Steps:

1. use DRAC and lspci to check pci bus info

Open the DRAC, go to the network devices section, and check each NIC's MAC address.


2. draw a network diagram for each NIC MAC address


3. assign ethX to pci slot 

create 99-ethernet.rules under /etc/udev/rules.d as follows:


KERNEL=="eth*", ID=="0000:05:05.0", NAME="eth0"
KERNEL=="eth*", ID=="0000:0b:00.0", NAME="eth1"

Note: the ID can be determined by running lspci | grep Eth and prefixing the appropriate number with 0000:, or by using ethtool -i <DEVICE> | grep bus-info.

If using udev rules like this in RHEL5, kudzu should be disabled (chkconfig kudzu off) so that it does not interfere by modifying the /etc/sysconfig/network-scripts/ifcfg-* files.
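To collect the ID== values for the rules above, a small sysfs loop like this sketch prints each NIC alongside its PCI bus address (the interface names and bus addresses on your system will differ):

```shell
# List each network interface with its PCI bus address.
# Virtual interfaces (lo, bonds, bridges) have no PCI device and are skipped.
for dev in /sys/class/net/*; do
  name=$(basename "$dev")
  [ -e "$dev/device" ] || continue          # skip non-PCI interfaces
  bus=$(basename "$(readlink -f "$dev/device")")
  echo "$name -> $bus"
done
```

The printed bus addresses go directly into the ID== fields of 99-ethernet.rules.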

4. remove HWADDR from ifcfg-ethX

cd /etc/sysconfig/network-scripts/
for i in ifcfg-eth*;do sed -i 's#HWADDR#\#HWADDR#g' $i;done
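As a sanity check, the sed loop in step 4 can be tried first against a throwaway copy (the file below is a made-up sample, not a real ifcfg file):

```shell
# Demonstrate the HWADDR-commenting sed on a sample file in a temp dir.
tmpdir=$(mktemp -d)
cat > "$tmpdir/ifcfg-eth0" <<'EOF'
DEVICE=eth0
HWADDR=00:11:22:33:44:55
ONBOOT=yes
EOF
cd "$tmpdir"
for i in ifcfg-eth*; do sed -i 's#HWADDR#\#HWADDR#g' "$i"; done
grep HWADDR ifcfg-eth0    # prints: #HWADDR=00:11:22:33:44:55
```

The sed comments the line out rather than deleting it, so each NIC's MAC address stays on record in the file.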

5. use real driver in /etc/modprobe.conf

Edit /etc/modprobe.conf so that each ethX alias uses the correct kernel driver. You can confirm the driver for an interface with:

ethtool -i eth0
or lspci -vvv -s 0000:04:00.0

Reference

RHEL5: How to make NIC names persistent across reboots

https://access.redhat.com/site/solutions/16411