Saturday, March 23, 2013

VM Disk Timeouts

A pretty common issue when using some SAN back-ends for virtual machines is that the VMs end up crashing, BSOD'ing, or (most commonly) remounting their "disks" read-only when there's a hiccup or failover in the storage system, often requiring a reboot to restore functionality.

Updated 9/16/2013 to incorporate excellent suggestions of commenter Greg Smith.
Updated 5/13/2014 to incorporate on-the-job learning.


The most common fix is to increase the default disk timeout in the guest VM, and sometimes in the host machine as well, since the root cause is usually that the SAN took longer than the default timeout to respond. This typically happens because the SAN was in the middle of a failover, which can take more than 60, or even more than 120, seconds in some cases. I generally recommend setting the timeout to at least 300 seconds, though I'm also perfectly happy with 600 seconds or more. I only really have an issue with anything under 180 seconds or so.

This is in keeping with industry practice, I might add: VMware sets it to 180, NetApp has long requested it be 180, and so on. I don't actually like how these timeouts are handled, and I especially do not like that in many scenarios the timeout is a global value applying to both the SAN-provided storage and any local spinning disk (which never needs or wants a timeout this long), but them's the breaks, I'm afraid.

Of course, the usual follow-up question from anyone told this is, "Ok, so where do I do that?" and then you're off to Google, and it can be annoying. Enough so that I decided to compile them all in one place, and add some scripts and such to simplify it (and be included in automated deployment tools, for instance). So, here you are.

Windows 2000, 2003, 2008, Vista, & Windows 7

Open the registry editor (regedit) and navigate to:

HKEY_LOCAL_MACHINE / System / CurrentControlSet / Services / Disk

Once there, look for 'TimeOutValue'. If it exists, edit it, and if it does not exist, right-click and choose 'Edit/Add Value' and create it. The type is REG_DWORD, and the value should be set in decimal to the timeout in seconds that you desire (so, I suggest, 300).
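If you want to push this setting out with a deployment tool rather than clicking through regedit, one option is to generate a .reg file and import it on the guest. The sketch below does that from any POSIX shell. The function name make_disk_timeout_reg and the output file name are made up for illustration, and the CRLF line endings are a cautious assumption about what regedit prefers.

```shell
#!/bin/sh
# Sketch: emit a .reg file that sets the Disk TimeOutValue described above.
# make_disk_timeout_reg is a hypothetical helper name, not part of any tool.
make_disk_timeout_reg() {
    timeout_s="${1:-300}"
    # .reg files store REG_DWORD values as 8 hex digits (300 -> 0000012c).
    hex=$(printf '%08x' "$timeout_s")
    printf 'Windows Registry Editor Version 5.00\r\n\r\n'
    printf '[HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Services\\Disk]\r\n'
    printf '"TimeOutValue"=dword:%s\r\n' "$hex"
}
```

Something like `make_disk_timeout_reg 300 > disk-timeout.reg` produces the file, which can then be imported on the Windows guest with `reg import disk-timeout.reg`. If you are already at a Windows command prompt, `reg add "HKLM\SYSTEM\CurrentControlSet\Services\Disk" /v TimeOutValue /t REG_DWORD /d 300 /f` should achieve the same thing directly.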

After that, if you're using the Microsoft iSCSI Initiator in the OS instead of having the disk passed in from a hypervisor, you should also modify the timeout value in the iSCSI initiator. On 2008, Vista, and Windows 7, navigate to:
HKEY_LOCAL_MACHINE / System / CurrentControlSet / Control / Class / {4D36E97B-E325-<HostID>
Under this key you'll find a number of subkeys named 0001, 0002 and so on. Expand each one until you find the subkey that has another subkey called 'Parameters'. Within that Parameters subkey is the value you want, MaxRequestHoldTime. Set it to 300 (decimal). There is another setting in here, LinkDownTime, that you would set instead if you're planning to use iSCSI MPIO on the Windows OS, but there are other things to set for that as well, and it is beyond the scope of this post for now.

These changes are permanent as far as I know, as well as global, so that's all you've got to do. I don't know whether a reboot is needed for them to take effect; you probably should reboot to be sure.

Linux (2.6+ non-udev)

So the 'easy' but far from elegant solution is to go in and force the timeout to be higher on every block device that needs it. This is done on 2.6 and later kernels (sysfs does not exist on 2.4) by echoing the time in seconds you want into /sys/block/<device>/device/timeout, substituting the device name for <device>. So, for example, if the main disk (sda) was being offered up from the VM host and originated on a SAN, and you wanted it to time out after 300 seconds, you'd do:

echo 300 > /sys/block/sda/device/timeout

The problem with this is that this isn't permanent, and will only survive until the system is rebooted. The quick and dirty answer to this is to add a command to do this into something like /etc/rc.local or create a full-blown init script that does it (be sure you add the command above the 'exit 0' that often ends the default rc.local file). For completeness, here's a simple script you can call from rc.local (put the contents below into a file, chmod +x it, and then call it from rc.local), that may or may not work for you out of the box (be sure to edit DISKS to be a list of the disks you care about):


#!/bin/bash
#
# nex7.blogspot.com - VM Disk Timeouts - simple script for non-udev 2.6+ kernels
# - edit DISKS to be a list of disks you want to increase the timeout on to TIMEOUT_V

TIMEOUT_V=300
DISKS="sda sdb sdc"

# Write the new timeout (in seconds) to each listed disk's sysfs timeout file.
for DISK in $DISKS; do
  echo "$TIMEOUT_V" > "/sys/block/$DISK/device/timeout"
done
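
If keeping the DISKS list up to date is a chore, a variation is to discover the disks at run time. This is only a sketch: the function name set_disk_timeouts and the SYSFS_BLOCK override are my own inventions for illustration, and it deliberately touches every sd* device, including any local disks, so it inherits the "global value" caveat mentioned earlier.

```shell
#!/bin/sh
# Sketch: raise the timeout on every sd* disk instead of a hand-kept list.
# set_disk_timeouts and SYSFS_BLOCK are illustrative names (assumptions).
set_disk_timeouts() {
    timeout_s="${1:-300}"
    sysfs_block="${SYSFS_BLOCK:-/sys/block}"
    for f in "$sysfs_block"/sd*/device/timeout; do
        # If nothing matches, the glob stays literal, so skip non-files.
        [ -f "$f" ] || continue
        if [ -w "$f" ]; then
            echo "$timeout_s" > "$f"
        else
            echo "cannot write $f (not root?)" >&2
        fi
    done
}
```

Calling `set_disk_timeouts 300` from rc.local (after defining or sourcing the function) would then cover whatever sd* disks exist at boot. Disks added later are not picked up until it runs again.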


Or, read on for the better way to do it if you have a fairly modern and mainstream distribution.

Linux (2.6+ with udev)

The slightly more complex but more elegant method, and one I wish the major Linux distributions would adopt directly into their base releases, is something like what VMware Tools does when installed on a supported Linux distribution. You can see their own explanation at this link.

The issue with this today is that not only is the rule added only if you install VMware Tools, the line it adds to the udev rules only affects disks exposed by VMware, which will not help you if you are using Xen, KVM, VirtualBox, and so on. So, something a bit more agnostic is called for. In building this little blog post, and coming upon this issue (admittedly for the umpteenth time), I decided to go ahead and finally do something about it.

My investigations so far have concluded there is no danger to 'bad' or unmatched rules in a udev rules file (at worst, you get a warning in syslog on boot from udev complaining about the lines it doesn't like, but it still parses the other rules fine). Thus, a simple single rules file put into /etc/udev/rules.d/ that contains rules for all possible OS and all possible exposed disks from a variety of virtualization hosts seems like the easiest way to go, so I give you this link. You can run the below command directly (as root) to install on most distributions (be sure /etc/udev/rules.d is where they go):

wget http://www.nex7.com/files/99-virt-scsi-udev.rules && mv 99-virt-scsi-udev.rules /etc/udev/rules.d/ && chmod 644 /etc/udev/rules.d/99-virt-scsi-udev.rules
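
If you would rather write a rule by hand, a single generic rule of roughly the following shape is one way to get a similar effect. To be clear, this is not the contents of the file linked above: the file name 99-disk-timeout-example.rules, the function name, and the rule itself are assumptions for illustration, and the ATTR{...} syntax assumes a reasonably recent udev (older releases used different key names).

```shell
#!/bin/sh
# Sketch: write one generic udev rule that sets a timeout on SCSI disks.
# This is NOT the rules file linked above; the file name and the rule
# are assumptions for illustration only.
write_timeout_rule() {
    rules_dir="${1:-/etc/udev/rules.d}"
    timeout_s="${2:-300}"
    rule_file="$rules_dir/99-disk-timeout-example.rules"
    # ATTR{type}=="0" restricts the match to SCSI direct-access devices (disks).
    cat > "$rule_file" <<EOF
ACTION=="add", SUBSYSTEM=="scsi", ATTR{type}=="0", ATTR{timeout}="$timeout_s"
EOF
    chmod 644 "$rule_file"
}
```

Unlike a per-vendor rule, this matches every SCSI disk the guest sees, local ones included, so it is blunter than matching on what each hypervisor exposes. On distributions that ship udevadm, `udevadm control --reload-rules` followed by `udevadm trigger --action=add --subsystem-match=scsi` may apply it without a reboot, though I would still verify the result.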

After putting it in /etc/udev/rules.d, just reboot. You can verify it is working with this one-liner. You're looking for at least some entries that say '300'; if there are none, either it isn't working or you don't have any disks the rules match against:

for file in `find /sys/devices -iname timeout`; do (echo $file && cat $file); done
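
For scripted checks, it can be handier to print only the entries that are still too low. A minimal sketch follows, with report_low_timeouts and the SYSFS_DEVICES override being hypothetical names of my own. Note that it reports any file named timeout under /sys/devices, so an occasional non-disk entry may show up.

```shell
#!/bin/sh
# Sketch: list timeout files under /sys/devices whose value is below a threshold.
# report_low_timeouts and SYSFS_DEVICES are hypothetical names (assumptions).
report_low_timeouts() {
    want="${1:-300}"
    root="${SYSFS_DEVICES:-/sys/devices}"
    find "$root" -name timeout -type f 2>/dev/null | while read -r f; do
        value=$(cat "$f" 2>/dev/null)
        # Skip anything that is not a plain non-negative number.
        case "$value" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$value" -lt "$want" ]; then
            echo "$f $value"
        fi
    done
}
```

Running `report_low_timeouts 300` prints one "path value" line per offending device and nothing at all when every timeout is already at or above the threshold, which makes it easy to use in a monitoring check.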

And that's it. I've tested the file on CentOS 6.3 on top of KVM, Ubuntu 12.04 on top of KVM, and the VMware ones on a variety of OSes and versions. As far as I know, the presently supported virtualization platforms and guest OSes for this file are:

Hosts

VMware 5+ (disks offered up via scsi)
KVM 1.0+ (disks offered up via ide or scsi - virtio doesn't expose timeout at guest level)
XenServer 5+ (disks offered up via scsi)

Guests

RHEL 5+ / CentOS 5+
Ubuntu 10+

If you run into any problems with this file, please let me know.

FreeBSD 9

There are two variables that appear to be of note - and common wisdom seems to jump between which one to tweak. I'll err on the side of timeout over retry here, but that may not be the best option in all situations. To modify it, and it is a global variable as far as I can tell, you need to modify 'kern.cam.da.default_timeout' and change it from its default of 60 to 300. To modify it permanently, edit your /etc/sysctl.conf and add a line like this:

kern.cam.da.default_timeout=300

If you're curious, the other variable mentioned online is 'kern.cam.da.retry_count', but I am less sure if the advice about it is fair or true.
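
For automated deployments, the sysctl.conf edit above can be made repeatable so that running it twice does not leave duplicate lines. This is a sketch only; ensure_sysctl is a made-up helper name, and the name match is a simple pattern rather than an exact string comparison.

```shell
#!/bin/sh
# Sketch: make sure a sysctl.conf file carries exactly one line for a setting.
# ensure_sysctl is a hypothetical helper name, not a FreeBSD tool.
ensure_sysctl() {
    name="$1"
    value="$2"
    conf="${3:-/etc/sysctl.conf}"
    touch "$conf"
    # Drop any existing line for this name, then append the wanted one.
    grep -v "^[[:space:]]*${name}[[:space:]]*=" "$conf" > "$conf.tmp"
    echo "${name}=${value}" >> "$conf.tmp"
    mv "$conf.tmp" "$conf"
}
```

Usage would be `ensure_sysctl kern.cam.da.default_timeout 300` as root. To change the running system without waiting for a reboot, `sysctl kern.cam.da.default_timeout=300` should work as well, though I would expect it to affect only disks attached after the change.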

NexentaStor (and other OpenSolaris-based derivatives)

So the easy way is to modify the sd timeout value. Unfortunately in OpenSolaris today, this value can only be set in /etc/system for all drives, with no config file method of setting it on a per-disk basis that I am aware of. To modify it globally, add this line to your /etc/system file and reboot:

set sd:sd_io_time=300

This is dangerous if there are any disks exposed to your VM that are not coming from a SAN and such, since this is a global value (much like the Windows one). There does exist a method of modifying the live value used by the kernel on a per-disk basis using mdb, but I've decided not to tackle building that into a script that runs on boot and when disks change at this time. If you want more info, check out Alisdair's post on the issue, found here.

40 comments:

  1. In your "Linux (2.6+ with udev)" section, there's a small change I would make to the "find" pipeline that looks for devices without correct timeouts. First it's useful to show both the full filename and the timeout there, which takes two small changes:

    $ for file in `find /sys -iname timeout`; do (echo $file && cat $file); done
    /sys/devices/pci0000:00/0000:00:1f.1/host1/target1:0:0/1:0:0:0/timeout
    30
    /sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/timeout
    30
    /sys/class/firmware/timeout
    60

    And if you look at the output from this system I found, it turns out there's this firmware timeout on there too. That doesn't seem as important to tune as the disk timeouts. What I settled on then to validate the disk timeouts are being set correctly was this pipeline, which only navigates /sys/devices where the disks are at. Here's sample output from a tuned VM install:

    $ for file in `find /sys/devices -iname timeout`; do (echo $file && cat $file); done
    /sys/devices/pci0000:00/0000:00:1f.1/host1/target1:0:0/1:0:0:0/timeout
    180
    /sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/timeout
    180

  2. should be:

    wget http://www.nex7.com/files/99-virt-scsi-udev.rules
    mv 99-virt-scsi-udev.rules /etc/udev/rules.d/
    chmod 644 /etc/udev/rules.d/99-virt-scsi-udev.rules

  3. Awesome article. Thanks a lot for sharing...

    One uncommon question:
    Do you maybe know, how to configure this disk timeout parameter for an OS X Guest VM? I've tried it already with the one from FreeBSD, but unfortunately OS X doesn't recognize it.

    Any feedback appreciated!
    Thanks - Bojan

  4. This has been the most helpful article about this problem.

    I found this while Googling about the problem I was having on Linux VMs.

    I find interesting that you have a suggested fix for Windows VMs. I've never seen this problem on my Windows VMs (2008 R2 and 7). In fact I've had my datastore offline for nearly an hour and all my Windows VMs recovered gracefully.

    I personally set this to 3600 seconds, because if there is a datastore issue, fixing it in 3 minutes is unlikely. Under an hour is to be expected.

  6. And what about XenServer? XenServer block devices (xvd*) don't have any timeout parameters. We have had very ugly crashes with XenServer due to NFS storage timeouts (not enough free space on the ZFS storage). We subsequently tested all versions from XenServer 6.2 to 6.5SP1, different NFS mount parameters (timeo, hard/soft), and different guest OSes and kernels (Ubuntu, CentOS), but without any positive results. All Linux guests in XenServer crash immediately (<1s) when the NFS server generates a long I/O response.

  8. Thanks for the post and scripts Andrew!

    I am running KVM hypervisors with RHEL/Oracle Linux and Windows guests, and all guests are utilizing virtIO drivers/disks. So, since KVM does not expose timeout values to guests, what would my solution be if Nexenta is taking more than 60 seconds to failover?

    Do I only need to adjust timeout values for all the block devices on the hypervisors? If I adjust /sys/block/sda/device/timeout to 600 on the hypervisor, does this mean my virtIO VM will effectively have a timeout setting of 600 seconds?

    I see that my RHEL VMs don't have a timeout file under /sys/block/, but my Windows VMs still have the registry key. Is this registry key ignored when Windows uses a "Red Hat VirtIO SCSI Disk Device" (that is the description under Device Manager)?

    Thanks!

    Replies
    1. These are very good questions. Too bad there were no responses. Let me know if you were able to get these answers. thanks.

  9. wget http://www.nex7.com/files/99-virt-scsi-udev.rules
    link is not working
