IN THIS ARTICLE
- Outlines how to configure VMware vSphere to better handle cluster upgrades and reboots.
- Running Qumulo Core OVA
- VMware vSphere
At Qumulo headquarters, we have applied a configuration setting to our internal VMware vSphere 6 hosts that allows them to better handle Qumulo cluster upgrades and reboots.
A storage device is considered to be in an All Paths Down (APD) state when it remains unreachable for a specified length of time. The default on vSphere 6 is 140 seconds, which is typically not long enough to survive a reboot of the host. If the host marks the datastore as APD, further I/O will cause file systems in Linux guests to become read-only and has the potential to blue-screen Windows systems.
To help avoid these types of issues during an upgrade, we recommend increasing the APD timeout values:
- From Hosts and Clusters, select the Host
- Click the Manage tab > Settings
- Select Advanced System Settings
- Ensure Misc.APDHandlingEnable is set to 1
- Change Misc.APDTimeout to the desired value in seconds (the default is 140)
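The same two settings can also be applied from the ESXi command line rather than the vSphere client. The following is a minimal sketch that only builds the `esxcli system settings advanced set` command lines for a chosen timeout; it does not run them. It assumes you have ESXi Shell or SSH access to the host, and the helper name `build_apd_commands` is hypothetical, not part of any VMware or Qumulo tooling.

```python
from typing import List


def build_apd_commands(timeout_seconds: int) -> List[str]:
    """Return esxcli command lines that enable APD handling and set the APD timeout.

    Hypothetical helper for illustration: it only formats strings and
    never contacts a host.
    """
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be a positive number of seconds")
    return [
        # Equivalent to setting Misc.APDHandlingEnable to 1 in Advanced System Settings
        "esxcli system settings advanced set -o /Misc/APDHandlingEnable -i 1",
        # Equivalent to changing Misc.APDTimeout (value is in seconds)
        f"esxcli system settings advanced set -o /Misc/APDTimeout -i {timeout_seconds}",
    ]


if __name__ == "__main__":
    for command in build_apd_commands(1200):
        print(command)
```

Review the printed commands before running them on a host, and confirm the result afterwards in Advanced System Settings.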
As seen in the screenshot below, we use a setting of 1200 seconds, which gives the hosts 20 minutes to retry accessing physical storage. It is also best to keep VMware Tools up to date on your guest operating systems so that the latest SCSI timeouts are applied as well.
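To sanity-check a candidate timeout, it can help to compare it against how long you expect storage to be unreachable. The sketch below is a deliberately simplified model, not VMware's actual APD logic: it treats the datastore as safe only if the outage ends before the timeout expires. The 480-second outage is a made-up example figure, and the function names are hypothetical.

```python
DEFAULT_APD_TIMEOUT_SECONDS = 140  # vSphere 6 default mentioned in this article


def seconds_to_minutes(seconds: int) -> float:
    """Convert a timeout in seconds to minutes."""
    return seconds / 60


def survives_outage(apd_timeout_seconds: int, outage_seconds: int) -> bool:
    """Simplified check: the outage must end before the APD timeout expires."""
    return outage_seconds < apd_timeout_seconds


if __name__ == "__main__":
    example_outage = 480  # hypothetical 8-minute outage, for illustration only
    for timeout in (DEFAULT_APD_TIMEOUT_SECONDS, 1200):
        print(
            f"timeout={timeout}s ({seconds_to_minutes(timeout):.1f} min): "
            f"survives {example_outage}s outage = {survives_outage(timeout, example_outage)}"
        )
```

Measure how long your own cluster upgrades and reboots actually take, then pick a timeout with comfortable headroom above that.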
You should now be able to successfully configure your VMware vSphere to better handle cluster upgrades and reboots.