Saturday, March 23, 2013

VM Disk Timeouts

A pretty common issue to run into when using some SAN back-ends for virtual machines is that the VMs end up crashing, BSOD'ing, or (most commonly) remounting their "disks" read-only when there's a hiccup or failover in the storage system, often requiring a reboot to restore functionality.

Updated 9/16/2013 to incorporate excellent suggestions of commenter Greg Smith.
Updated 5/13/2014 to incorporate on-the-job learning.

The most common fix is to increase the default timeout settings in the guest VM, and sometimes in the host machine as well, as the root cause is usually that the SAN took longer than the default timeout to respond. This is usually because the SAN was involved in a failover, which can take > 60, or even > 120 seconds in some cases. I generally recommend setting the timeout to at least 300 seconds, though I'm also perfectly happy with 600 seconds or more, personally. I only really have an issue with anything under 180 seconds or so.

This is in keeping with industry standards, I might add - VMware sets it to 180, NetApp has long requested it be 180, and so on. I don't actually like how the timeouts and such are handled, and I especially do not like that in many scenarios the timeout is a global value applying to both the SAN-provided storage and any local spinning disk (which never needs or wants a timeout value this long), but them's the breaks I'm afraid.

Of course, the usual follow-up question from anyone told this is, "Ok, so where do I do that?" and then you're off to Google, and it can be annoying. Enough so that I decided to compile them all in one place, and add some scripts and such to simplify it (and be included in automated deployment tools, for instance). So, here you are.

Windows 2000, 2003, 2008, Vista, & Windows 7

Open the registry editor (regedit) and navigate to:

HKEY_LOCAL_MACHINE / System / CurrentControlSet / Services / Disk

Once there, look for 'TimeOutValue'. If it exists, edit it, and if it does not exist, right-click and choose 'Edit/Add Value' and create it. The type is REG_DWORD, and the value should be set in decimal to the timeout in seconds that you desire (so, I suggest, 300).
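For automated deployments, the same change can be shipped as a .reg file instead of being clicked through in regedit. The following is only a minimal sketch under a few assumptions: the function name make_disk_timeout_reg and the output file name are placeholders of mine, the file is built on a machine with a POSIX shell, and regedit on the guest accepts a plain ANSI file with CRLF line endings. The one non-obvious detail is that .reg files store DWORDs in hexadecimal, so 300 seconds is written as 0000012c.

```shell
#!/bin/sh
# Hypothetical helper: print a .reg file that sets the disk TimeOutValue.
# $1 is the timeout in seconds (decimal); defaults to 300.
make_disk_timeout_reg() {
    timeout_s="${1:-300}"
    # regedit expects a version header, a blank line, then the key.
    printf 'Windows Registry Editor Version 5.00\r\n\r\n'
    printf '[HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Services\\Disk]\r\n'
    # .reg files store DWORD values as 8 hex digits, not decimal.
    printf '"TimeOutValue"=dword:%08x\r\n' "$timeout_s"
}
```

You would run something like make_disk_timeout_reg 300 > disk-timeout.reg and then import the result on the guest, for example with regedit /s disk-timeout.reg. Test it on a scratch VM before rolling it out.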

After that, if you're using the Microsoft iSCSI Initiator in the OS instead of being passed the disk by a hypervisor, you should also modify the timeout value in the iSCSI initiator. On 2008, Vista, and Windows 7, navigate to:
HKEY_LOCAL_MACHINE / System / CurrentControlSet / Control / Class / {4D36E97B-E325-<HostID>
Under this key you'll find a number of subkeys named 0001, 0002 and so on. Expand each subkey until you find the one subkey that has another subkey called 'Parameters'. Within that Parameters subkey is the value you want, MaxRequestHoldTime. Modify it to 300 (decimal). There is another setting in here, LinkDownTime, that you would set instead if you're planning to use iSCSI MPIO on the Windows OS, but there are other things to set for that as well, and they are beyond the scope of this post for now.

These changes are permanent as far as I know, as well as global, so that's all you've got to do. I am not sure whether you need to reboot for them to take effect; you probably should, to be sure.

Linux (2.6+ non-udev)

So the 'easy' but far from elegant solution is to go in and force the timeout to be higher on every block device you need to do so on. This is done on 2.6 and later kernels (sysfs does not exist on 2.4) by echo'ing the time in seconds you want at /sys/block/<device>/device/timeout, substituting the device name for <device>. So, for example, if the main disk (sda) was being offered up from the VM host and originated on a SAN and you wanted to make it timeout after 300 seconds, you'd do:

echo 300 > /sys/block/sda/device/timeout

The problem with this is that this isn't permanent, and will only survive until the system is rebooted. The quick and dirty answer to this is to add a command to do this into something like /etc/rc.local or create a full-blown init script that does it (be sure you add the command above the 'exit 0' that often ends the default rc.local file). For completeness, here's a simple script you can call from rc.local (put the contents below into a file, chmod +x it, and then call it from rc.local), that may or may not work for you out of the box (be sure to edit DISKS to be a list of the disks you care about):

#!/bin/sh
# - VM Disk Timeouts - simple script for non-udev 2.6+ kernels
# - edit DISKS to be a list of disks you want to increase the timeout on to TIMEOUT_V

DISKS="sda sdb sdc"
TIMEOUT_V=300

for DISK in $DISKS; do
  echo $TIMEOUT_V > /sys/block/$DISK/device/timeout
done
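If you would rather not maintain a hard-coded disk list, a variation is to walk every sd* device that actually exposes a timeout file. This is a sketch of mine rather than part of the script above: the function name set_all_disk_timeouts and the optional sysfs-root argument are hypothetical, and it assumes every sd* disk in the guest should get the same value, which is exactly the 'global' behaviour I complained about earlier.

```shell
#!/bin/sh
# Hypothetical helper: raise the timeout on every sd* disk that exposes one.
# $1 = timeout in seconds
# $2 = sysfs root (defaults to /sys; overridable so the function can be
#      tried against a scratch directory first)
set_all_disk_timeouts() {
    timeout_s="$1"
    sysfs_root="${2:-/sys}"
    for f in "$sysfs_root"/block/sd*/device/timeout; do
        # An unmatched glob stays literal, and some devices have no
        # timeout attribute at all, so skip anything that is not a file.
        [ -f "$f" ] || continue
        echo "$timeout_s" > "$f" && echo "set $f to $timeout_s"
    done
}
```

Called from rc.local as root, that would be set_all_disk_timeouts 300. Pointing the second argument at a temporary directory lets you see what it would touch without writing to the real /sys.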

Or, read on for the better way to do it if you have a fairly modern and mainstream distribution.

Linux (2.6+ with udev)

The slightly more complex but a bit more elegant method that I see, and that I wish the various major Linux distributions would adopt directly into their base releases, is something like what the VMware Tools does when installed on a supported Linux distribution. You can see their own explanation at this link.

The issue with this today is that not only is this only added if you install the VMware Tools, but the line it adds to the udev rules only affects disks exposed using VMware, which will not help you if you are using Xen or KVM or VirtualBox and so on. So, something a bit more agnostic is called for. In building this little blog post, and coming upon this issue (admittedly for the umpteenth time), I decided to go ahead and finally do something about it.

My investigations so far have concluded there is no danger to 'bad' or unmatched rules in a udev rules file (at worst, you get a warning in syslog on boot from udev complaining about the lines it doesn't like, but it still parses the other rules fine). Thus, a simple single rules file put into /etc/udev/rules.d/ that contains rules for all possible OS and all possible exposed disks from a variety of virtualization hosts seems like the easiest way to go, so I give you this link. You can run the below command directly (as root) to install on most distributions (be sure /etc/udev/rules.d is where they go):

wget; mv 99-virt-disk-timeouts.rules /etc/udev/rules.d/; chmod 644 /etc/udev/rules.d/99-virt-disk-timeouts.rules

After putting it in /etc/udev/rules.d, just reboot. You can verify it is working with this one-liner (you're looking for results that have at least some entries that say '300'; if you don't see any, it either isn't working or you don't have any disks the rules match against):

for file in `find /sys/devices -iname timeout`; do (echo $file && cat $file); done
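On a guest with many devices it can be easier to print only the entries that are still too low. A rough sketch of that idea follows; the function name report_low_timeouts and its arguments are hypothetical, and it assumes every file named timeout under the search root holds a plain number of seconds (anything else is skipped).

```shell
#!/bin/sh
# Hypothetical helper: list timeout files whose value is below a minimum.
# $1 = minimum acceptable timeout in seconds (default 300)
# $2 = directory to search (default /sys/devices)
report_low_timeouts() {
    min_s="${1:-300}"
    root="${2:-/sys/devices}"
    find "$root" -name timeout 2>/dev/null | while read -r f; do
        value=$(cat "$f" 2>/dev/null) || continue
        # Skip anything that is not a plain non-negative integer.
        case "$value" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$value" -lt "$min_s" ]; then
            echo "$f: $value"
        fi
    done
    return 0
}
```

Running report_low_timeouts 300 prints nothing when every timeout found is at least 300. It cannot tell a working rules file apart from a guest with no matching disks, so the one-liner above is still worth a look.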

And that's it. I've tested the file on CentOS 6.3 on top of KVM, Ubuntu 12.04 on top of KVM, and the VMware ones on a variety of OS's and versions. As far as I know, the list of presently supported virtualization platforms and guest OS's of this file are:


VMware 5+ (disks offered up via scsi)
KVM 1.0+ (disks offered up via ide or scsi - virtio doesn't expose timeout at guest level)
XenServer 5+ (disks offered up via scsi)


RHEL 5+ / CentOS 5+
Ubuntu 10+

If you run into any problems with this file, please let me know.
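If you cannot fetch the file, a single vendor-agnostic rule of the kind described in this section can be written by hand. To be clear, the sketch below is not the contents of my rules file: the helper name write_timeout_rule and the file name 99-generic-disk-timeout.rules are placeholders, the rule matches every SCSI device of type 0 (direct-access disk) regardless of hypervisor, and the ATTR{} key assumes a reasonably recent udev (older releases used SYSFS{} instead).

```shell
#!/bin/sh
# Hypothetical helper: write a single vendor-agnostic udev rule.
# $1 = timeout in seconds (default 300)
# $2 = rules directory (default /etc/udev/rules.d)
# The file name 99-generic-disk-timeout.rules is a placeholder.
write_timeout_rule() {
    timeout_s="${1:-300}"
    rules_dir="${2:-/etc/udev/rules.d}"
    rule_file="$rules_dir/99-generic-disk-timeout.rules"
    # Match SCSI devices of type 0 (disks) as they are added, then write
    # the timeout into the device's sysfs directory (%p is its devpath).
    cat > "$rule_file" <<EOF
ACTION=="add", SUBSYSTEM=="scsi", ATTR{type}=="0", RUN+="/bin/sh -c 'echo $timeout_s > /sys%p/timeout'"
EOF
    chmod 644 "$rule_file"
}
```

Run write_timeout_rule 300 as root and reboot, then check the result with the verification one-liner from earlier in this section. Because the rule is not limited to a vendor string, it will also raise the timeout on any local SCSI disks in the guest.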

FreeBSD 9

There are two variables that appear to be of note - and common wisdom seems to jump between which one to tweak. I'll err on the side of timeout over retry here, but that may not be the best option in all situations. To modify it, and it is a global variable as far as I can tell, you need to modify '' and change it from its default of 60 to 300. To modify it permanently, edit your /etc/sysctl.conf and add a line like this: = 300

If you're curious, the other variable mentioned online is '', but I am less sure if the advice about it is fair or true.

NexentaStor (and other OpenSolaris-based derivatives)

So the easy way is to modify the sd timeout value. Unfortunately in OpenSolaris today, this value can only be set in /etc/system for all drives, with no config file method of setting it on a per-disk basis that I am aware of. To modify it globally, add this line to your /etc/system file and reboot:

set sd:sd_io_time=300

This is dangerous if there are any disks exposed to your VM that are not coming from a SAN and such, since this is a global value (much like the Windows one). There does exist a method of modifying the live value used by the kernel on a per-disk basis using mdb, but building this into a script to run on boot and when disks change I've decided not to try to tackle at this time. If you want more info, check out Alisdair's post on the issue, found here.


  1. In your "Linux (2.6+ with udev)" section, there's a small change I would make to the "find" pipeline that looks for devices without correct timeouts. First it's useful to show both the full filename and the timeout there, which takes two small changes:

    $ for file in `find /sys -iname timeout`; do (echo $file && cat $file); done

    And if you look at the output from this system I found, it turns out there's this firmware timeout on there too. That doesn't seem as important to tune as the disk timeouts. What I settled on then to validate the disk timeouts are being set correctly was this pipeline, which only navigates /sys/devices where the disks are at. Here's sample output from a tuned VM install:

    $ for file in `find /sys/devices -iname timeout`; do (echo $file && cat $file); done

    1. Good catch. Suggestion incorporated.

  2. should be:

    mv 99-virt-scsi-udev.rules /etc/udev/rules.d/
    chmod 644 /etc/udev/rules.d/99-virt-scsi-udev.rules

  3. Awesome article. Thanks a lot for sharing...

    One uncommon question:
    Do you maybe know, how to configure this disk timeout parameter for an OS X Guest VM? I've tried it already with the one from FreeBSD, but unfortunately OS X doesn't recognize it.

    Any feedback appreciated!
    Thanks - Bojan

  4. This has been the most helpful article about this problem.

    I found this while Googling about the problem I was having on Linux VMs.

    I find interesting that you have a suggested fix for Windows VMs. I've never seen this problem on my Windows VMs (2008 R2 and 7). In fact I've had my datastore offline for nearly an hour and all my Windows VMs recovered gracefully.

    I personally set this to 3600 seconds, because if there is a datastore issue, fixing it in 3 minutes is unlikely. Under an hour is to be expected.

  5. This comment has been removed by the author.

  6. And what with XenServer ? XenServer block devices xvd* doesn't have any timeout parameters. We have very ugly crash with XenServer due NFS storage timeouts (not enough free space on ZFS storage). We subsequently tested all versions from XenServer 6.2 to 6.5SP1, NFS mount parameters (timeo, hard/soft), different Guest OSs and kernels (Ubuntu, CentOS) but without any positive results. All linux guests in xenserver crash immediately (<1s) when NFS server generate long IO response.

  7. This comment has been removed by the author.

  8. Thanks for the post and scripts Andrew!

    I am running KVM hypervisors with RHEL/Oracle Linux and Windows guests, and all guests are utilizing virtIO drivers/disks. So, since KVM does not expose timeout values to guests, what would my solution be if Nexenta is taking more than 60 seconds to failover?

    Do I only need to adjust timeout values for all the block devices on the hypervisors? If I adjust /sys/block/sda/device/timeout to 600 on the hypervisor, does this mean my virtIO VM will effectively have a timeout setting of 600 seconds?

    I see that my RHEL VM's don't have a timeout file under /sys/block/, but my Windows VM's still have the registry key. Is this registry key ignored when Windows uses a "Red Hat VirtIO SCSI Disk Device"(that is the description under Device Manger)?


    1. These are very good questions. Too bad there were no responses. Let me know if you were able to get these answers. thanks.

  9. wget
    link is not working
