Wednesday, October 9, 2013

Lots of L2ARC not a good idea, for now

Before I get into it, let me say that the majority of the information below, and the analysis of it, comes from one of my partners in crime (coworker at Nexenta), Kirill (@kdavyd). All credit for figuring out what was causing the long export times, for reproducing the issue internally (it was initially observed at a customer) after we cobbled together enough equipment, and for the analysis to date goes to him. I'm just the messenger here.

Here's the short version, in the form of the output from a test machine (emphasis is mine):

root@gold:/volumes# time zpool export l2test

real    38m57.023s
user    0m0.003s
sys     15m16.519s

That was a test pool with a couple of TB of L2ARC, filled up, with an average block size on the pool of 4K (a worst-case scenario) -- and then simply exported, as shown. If you are presently using a pool with a significant amount of L2ARC (hundreds of GB, or TBs), (update:) and you have multiple (3+) cache vdevs, and you ever need to export your pool within a time limit so you can import it again and maintain service availability, you should read on. Otherwise this doesn't really affect you. In either case, it should be fixable long-term, so it's probably not a panic-worthy piece of information for most people. :)


If you want the workaround, by the way, it is to shoot the box in the head -- panic it, power it off hard, etc. -- rather than export the pool cleanly. This is obviously very dangerous if you do not have a proper setup with an actually utilized ZIL (preferably through nice log devices) and all your important data arriving synchronously, or if you have other pools on the system you don't want to lose access to as well. Unfortunately, if either of those is the case, the only other workaround is to remove the cache vdevs from the pool, wait out the eviction, and only then export the pool. That's better than just exporting it and waiting, since you'd be online while you removed the cache vdevs, but it's still fairly time-consuming.

Here's your culprit, where we sit for potentially minutes, or even hours if you had a sufficiently 'bad' environment for this issue:

PC: _resume_from_idle+0xf1    CMD: zpool export l2test
  stack pointer for thread ffffff4b94b44180: ffffff022cdd9820
  [ ffffff022cdd9820 _resume_from_idle+0xf1() ]
    swtch+0x145()
    turnstile_block+0x760()
    mutex_vector_enter+0x261()
    _l2arc_evict+0x8d()
    l2arc_evict+0xb0()
    l2arc_remove_vdev+0x9b()
    spa_l2cache_drop+0x65()
    spa_unload+0xa6()
    spa_export_common+0x1d9()
    spa_export+0x2f()
    zfs_ioc_pool_export+0x41()
    zfsdev_ioctl+0x15e()
    cdev_ioctl+0x45()
    spec_ioctl+0x5a()
    fop_ioctl+0x7b()
    ioctl+0x18e()
    dtrace_systrace_syscall32+0x11a()
    _sys_sysenter_post_swapgs+0x149()

Analysis is ongoing, but it appears to get considerably worse the more L2ARC you have, the bigger the individual L2ARC devices are, the slower the clock speed of your CPU (as this is a single-CPU-bound task), and the smaller your average block size (as that affects how many l2hdr entries you have).
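To put the block size point in numbers, here is a rough back-of-the-envelope sketch. It assumes one l2hdr entry per cached block and a completely full L2ARC, and it uses 2 TiB as a stand-in for the "couple of TB" on the test pool; the function name is made up for illustration, and none of this is taken from the actual ZFS code.

```python
# Rough estimate of how many l2hdr entries an export has to walk.
# Assumption (not from the ZFS source): one entry per cached block,
# so entries = bytes cached in L2ARC / average block size.

TIB = 2**40


def l2hdr_count(l2arc_bytes: int, avg_block_bytes: int) -> int:
    """Approximate number of l2hdr entries for a full L2ARC."""
    return l2arc_bytes // avg_block_bytes


if __name__ == "__main__":
    # 2 TiB is an assumed stand-in for "a couple of TB" of L2ARC.
    for block in (4 * 1024, 32 * 1024, 128 * 1024):
        entries = l2hdr_count(2 * TIB, block)
        print(f"{block // 1024:>4}K average block size -> {entries:,} entries")
```

Under those assumptions a 4K average block size means roughly 537 million entries, 32 times as many as the same L2ARC filled with 128K blocks, which is why small blocks are the worst case here.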

This should be relatively easy to fix, in theory -- ARC evictions were already made asynchronous, but it looks like this wasn't done for L2ARC as well. Once it is, this should be less of an issue.

Update 10/10/13:
Well, it appears there is code for an async l2arc eviction, and it's in play:

root@gold:/volumes# echo zfs_l2arc_async_evict::print | mdb -k
0x1 (B_TRUE)

So the investigation continues as to why this is happening anyway.

Update 10/10/13:
Kirill continues to investigate, and believes he's pinned the problem down to the async l2arc eviction not always actually being asynchronous. In his testing, much of it is done asynchronously, but the minute there are more than X evictions going, the next one gets done synchronously. This is tied to the number of cache vdevs you have. You cannot export until all the data they referenced is cleared, so the export has to wait out every synchronous eviction and only finishes once the only remaining tasks are already in progress and asynchronous. This explains why the export did complete before all of this was actually through -- the ongoing tasks were asynchronous and there were no more outstanding. Seemingly there is a 'maxalloc' being set on taskq_create by this line:

 arc_flush_taskq = taskq_create("arc_flush_tq",
     max_ncpus, minclsyspri, 1, 4, TASKQ_DYNAMIC);

The '4' is the culprit here. In Kirill's words: "Probably at the beginning of export, we are flushing ARC as well, so 2 threads are already busy, leaving us with only 2 for L2ARC. It is evicting asynchronously, just not all drives, because taskq isn’t big enough, since someone probably didn’t consider that there may be more than 2 L2ARC’s on a system :)", then adds, "Also explains why I couldn’t reproduce it with a single 1TB L2ARC - looks like you need at least three."
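Kirill's explanation can be sketched as a toy model. This is not the NexentaStor code, just a simplified illustration: it assumes each cache vdev needs one taskq slot for its eviction, that a dispatch falls back to running synchronously when no slot is free, that 2 slots are already taken by the ARC flush, and that no slot frees up again while the export is dispatching. The function and parameter names are hypothetical.

```python
from typing import List


def plan_evictions(cache_vdevs: int, maxalloc: int = 4, busy: int = 2) -> List[str]:
    """Toy model: decide whether each cache vdev's eviction runs async or sync.

    maxalloc -- assumed cap on queued tasks (the '4' in taskq_create above)
    busy     -- slots assumed to be taken already by the ARC flush
    """
    free = max(maxalloc - busy, 0)
    plan = []
    for _ in range(cache_vdevs):
        if free > 0:
            free -= 1              # a slot is available: dispatch asynchronously
            plan.append("async")
        else:
            plan.append("sync")    # no slot left: the export thread does it itself
    return plan


if __name__ == "__main__":
    for n in (1, 2, 3, 8):
        print(n, "cache vdevs:", plan_evictions(n))
```

In this model one or two cache vdevs are always evicted asynchronously, while the third and every later one is evicted synchronously by the exporting thread, which lines up with Kirill needing at least three cache vdevs to reproduce the problem.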

So the major takeaway here is that the correlation I originally described, between export time and the size of your L2ARC and of the individual L2ARC vdevs, only matters if you hit this 'bug' in the first place. Whether you hit it doesn't depend on the size of your L2ARC or of the L2ARC vdevs; it basically depends on how many of them you have -- seemingly in most situations you'll need at least 3 cache vdevs in the pool to ever run into this.

Update 10/14/13:
So apparently the arc_flush_taskq and the asynchronous ARC & L2ARC eviction code are all only in NexentaStor at the moment, and have not (yet? I cannot publicly state our intent with this code, as I have truly not been informed internally of what it is) been pushed back to illumos. Fortunately this means that on Nexenta 3.x machines you'll generally only hit this long export time problem if you have more than 2 L2ARC devices in the pool you're trying to export, and the amount of ARC related to the pool shouldn't affect export times, either. Long-term we'll of course move to fix this issue, likely by increasing the limit above 4, but only after more thorough analysis internally.

Unfortunately it seems to mean that if you're not running Nexenta 3.x, and are on an older version or using an alternative illumos-derivative, you may very well be susceptible to a long export time even with just a lot of small block data in ARC, as well as any number of L2ARC devices including just 1. I do not have a box of sufficient size with a non-Nexenta illumos distro on it at the moment to do any testing, I'm afraid.

54 comments:

  1. Thanks for the post. We are definitely hitting this bug with 3 SSD drives for L2ARC. Our failover time is typically about 20-25 minutes to export the pool, with 8TB used (pool size of 30TB), and swing it over to the other head node. It looks like I'll be removing one of the L2ARC drives before we manually swing over (upgrade, etc.) for now.

  2. Scott, if you have SLOG devices in your pools and all your datasets have sync set properly (zvols too) then it is perfectly safe to panic one node via uadmin or reboot -p.

  3. I've been dealing with this problem for a couple years now. I run OmniOS so I'm not benefiting from the work Nexenta has done on this. My systems are all HA with RSF-1.

    My procedure for doing maintenance is to first set sync=always on the pool I need to export. I then panic the system and the other head takes over the pool. When I'm finished I set sync=standard back on the pool. This has worked many times for me without issue.

    One thing I would like to point out is that removing the cache SSDs from the pool is not helpful here. As soon as you execute "zpool remove tank {cache_ssd}" you get into a blocking situation that can also last a ridiculous amount of time. On one pool I have with 8 400GB SSDs, the first remove took 12 minutes and blocked I/O the entire time. That was 12 minutes of downtime I was not planning on dealing with.

    Because of this issue, I prefer to build my high performance systems with more RAM and less L2ARC.

    Hopefully Nexenta will release their work on this to Illumos soon.
