Benchmarking Storage for VMware

I’m currently working on an eBook “Storage Basics for vSphere”.  As it’s nearly finished, I thought I’d put up a couple of extracts over the next week or two prior to release.

This part covers the basics of benchmarking storage using IOMeter, which can be downloaded from here.  Please do post a comment or rate this post (at the bottom).

Benchmarking Storage with IOMeter

Benchmarking storage can be more complicated than it first seems. With a consistent approach, it is a useful tool both for sizing new systems (by measuring existing storage that is known to be stretched) and for confirming that new implementations perform as expected and, more importantly, as required.

Performance Metrics

Benchmarking storage for virtualised environments involves three key measures:

  • Sequential MB/s, both read and write.  This usage pattern is significant for vmdk based backups and for some file server applications making use of large files.
  • 8K random IOPS (IOs per second), primarily mixed read and write but also individually for read and write.  These metrics give a good relative indication of the performance of storage for virtualised environments, since competing workloads create random patterns.
  • Latency in ms.  Latencies must be understood, since there is likely to be a point where increasing the throughput results in latency high enough to slow user response times beyond acceptable limits.  The queue depth that a storage system can support (which in turn drives latency) also ultimately determines the number of competing workloads that can be accommodated.
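The link between queue depth, IOPS and latency can be sketched with Little's Law (average outstanding IOs = IOPS x average latency). This is a steady-state approximation rather than anything IOMeter itself calculates, and the function names are illustrative only:

```python
def average_latency_ms(queue_depth: int, iops: float) -> float:
    """Estimate average latency from Little's Law.

    outstanding IOs = IOPS x latency (seconds), so
    latency = outstanding IOs / IOPS. Assumes the queue is kept full.
    """
    if iops <= 0:
        raise ValueError("iops must be positive")
    return queue_depth / iops * 1000.0


def max_queue_depth(iops: float, latency_limit_ms: float) -> int:
    """Largest queue depth that keeps the estimated latency within the limit."""
    return int(iops * latency_limit_ms / 1000.0)


# Illustrative inputs, not measurements: a system sustaining 1,000 IOPS
# with 32 outstanding IOs averages roughly 32 ms per IO.
print(average_latency_ms(32, 1000))
```

Once a system's IOPS stop rising with queue depth, every additional outstanding IO simply adds to the latency, which is why throughput and latency have to be read together.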

Performance Tools

The most common tools for assessing system performance are dd, HDTune and IOMeter.  IOMeter is a favourite since it’s free, easy to use, and highly configurable.  It runs primarily on Windows (Server 2003 or 2008).

Download and installation take less than a minute, and the IOMeter interface is simple and clean (note that it needs to be run as administrator on Server 2008).

How and Where to Test

Benchmarking involves performing the key tests iteratively with varying queue depths.  Allowing for ramp and run times, this is a time-consuming process.

Queue depth determines how much IO the OS (or application) will ‘allow’ the underlying controller to optimise.  By re-ordering and combining commands, physical IO can be streamlined or reduced by the controller.  The more commands the controller can process in this way before needing to report something back to the OS, the better the ‘hit-rate’ of such optimisations (and hence the higher the ultimate throughput), but the trade-off is latency, which can have a devastating impact on the user experience.

Generally:

  • Benchmark within a guest VM.  This is the only way to properly understand the storage performance for the virtual environment.  Wherever possible, do so using a host that is running only the guest being used for benchmarking, since CPU scheduling delays can result in overstatement of the results due to timing inaccuracies.
  • The test file size (‘sectors’ field, the area used for testing by IOMeter) must be large enough to avoid any significant percentage of controller cache hits, unless cached performance is deliberately being tested (to determine network latency, for example).  A file size of 4 or 8GB is usually sufficient for local storage, and perhaps as much as 30GB when testing NAS devices with gigabytes of cache.

Each test needs to be repeated with various queue depths to find the maximum throughput at an acceptable latency; a limit of 50ms is sometimes cited.
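Picking the winner from a queue depth sweep is simple to automate once the results have been noted down. The sketch below is a hypothetical helper, not part of IOMeter, and the numbers in EXAMPLE_SWEEP are made-up placeholders rather than measurements:

```python
from typing import NamedTuple, Optional, Sequence


class SweepResult(NamedTuple):
    queue_depth: int
    iops: float
    avg_latency_ms: float


def best_within_latency(results: Sequence[SweepResult],
                        limit_ms: float = 50.0) -> Optional[SweepResult]:
    """Return the highest-IOPS result whose average latency is within the limit.

    Returns None when no run meets the latency limit.
    """
    acceptable = [r for r in results if r.avg_latency_ms <= limit_ms]
    return max(acceptable, key=lambda r: r.iops, default=None)


# Placeholder values for illustration only (NOT measured data).
EXAMPLE_SWEEP = [
    SweepResult(queue_depth=1, iops=100.0, avg_latency_ms=10.0),
    SweepResult(queue_depth=4, iops=300.0, avg_latency_ms=13.3),
    SweepResult(queue_depth=16, iops=400.0, avg_latency_ms=40.0),
    SweepResult(queue_depth=64, iops=420.0, avg_latency_ms=152.0),
]

print(best_within_latency(EXAMPLE_SWEEP))
```

With these placeholder figures the queue depth of 64 gives the highest raw IOPS, but it is rejected because its latency is well past the 50ms limit.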

The test size is defined in IOMeter on the front sheet in sectors (8GB being 16,777,216 sectors) along with the queue depth for the test (see picture above).
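The sector count is just the file size divided by the sector size. The figure quoted above (8GB being 16,777,216 sectors) implies 512-byte sectors and binary gigabytes, so the sketch below assumes both; the function name is illustrative:

```python
SECTOR_BYTES = 512  # implied by 8GB = 16,777,216 sectors


def gib_to_sectors(size_gib: int) -> int:
    """Convert a test file size in binary gigabytes to 512-byte sectors."""
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    return size_gib * 1024**3 // SECTOR_BYTES


# Sizes suggested in the text: 4 or 8GB locally, up to 30GB for cached NAS.
for size in (4, 8, 30):
    print(size, "GB =", gib_to_sectors(size), "sectors")
```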

Most tests, and write tests especially, should be performed using a ‘ramp’ time long enough to fill the caches so that the results show sustained throughput.  A one-minute ramp and five-minute duration will usually give fairly accurate results.  This is set on the ‘Test Setup’ sheet.
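These timings add up quickly across a full sweep. As a rough planning aid, assuming every workload is run once at every queue depth and ignoring the time IOMeter needs to create the test file, the total can be estimated as follows (the function name and the choice of six queue depths are illustrative):

```python
def sweep_minutes(workloads: int, queue_depths: int,
                  ramp_min: float = 1.0, run_min: float = 5.0) -> float:
    """Total ramp plus run time for a full sweep, in minutes.

    Excludes test file creation, which depends on file size and storage speed.
    """
    return workloads * queue_depths * (ramp_min + run_min)


# Three workloads at six queue depths with the suggested
# one-minute ramp and five-minute run.
print(sweep_minutes(workloads=3, queue_depths=6), "minutes")
```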

Defining the Workloads

With the basic settings configured, the workloads need to be defined.  IOMeter includes many predefined workloads, but three core tests suffice, and these take only a minute to configure:

  • 32K Sequential read – 32K, 100% sequential, 100% read
  • 32K Sequential write – 32K, 100% sequential, 0% read
  • 8K Real Life – 8K, 100% random, 70% read
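For record keeping, or for scripting around the results, the three definitions can be written down as plain data. This is only an illustrative structure and not IOMeter's own configuration file format; the 4K alignment default follows the alignment advice in this section:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessSpec:
    """One workload definition (hypothetical structure, not an IOMeter format)."""
    name: str
    block_bytes: int
    percent_random: int      # 0 = fully sequential, 100 = fully random
    percent_read: int        # 0 = all writes, 100 = all reads
    align_bytes: int = 4096  # 4K alignment is safe for any storage


CORE_SPECS = [
    AccessSpec("32K Sequential read", 32 * 1024, percent_random=0, percent_read=100),
    AccessSpec("32K Sequential write", 32 * 1024, percent_random=0, percent_read=0),
    AccessSpec("8K Real Life", 8 * 1024, percent_random=100, percent_read=70),
]

for spec in CORE_SPECS:
    print(spec)
```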

When testing NFS volumes or any storage that isn’t 512-byte sector addressable (such as the latest SATA disks), the alignment must be set to 4K in these test definitions (4K can be used to test any storage since the file systems will be working in clusters of at least 4K).  The tests are configured from the ‘Access Specifications’ sheet.  Creating a test definition is straightforward.

Once defined, add one test (only) to the left hand pane on the Access Specifications sheet.

Setting it Running

With everything configured, click on the green flag and IOMeter will start the test, first writing out a massive test file of the size specified which will then be used for the test.  When that is written and the ramp time complete, the results sheet gives a real-time view of the storage (drag the slider left to see the results as it’s running):

Interpreting the Results

When testing networked storage with sequential workloads, often the network is the bottleneck and performance will approach n*110 MB/s (with gigabit Ethernet), where n is the number of active load-balanced paths.  Since local storage does not have this restriction, it can be significantly faster in these tests.
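A quick way to tell whether a sequential result is limited by the network rather than the storage is to compare it with this ceiling. The sketch uses the 110 MB/s per-path figure from the text (raw gigabit is 125 MB/s before protocol overhead); the 90% threshold is an arbitrary assumption, and the function names are illustrative:

```python
GBE_USABLE_MB_S = 110.0  # per-path figure quoted in the text


def network_ceiling_mb_s(active_paths: int,
                         per_path_mb_s: float = GBE_USABLE_MB_S) -> float:
    """Approximate sequential ceiling for n active load-balanced paths."""
    if active_paths < 1:
        raise ValueError("active_paths must be at least 1")
    return active_paths * per_path_mb_s


def is_network_bound(measured_mb_s: float, active_paths: int,
                     threshold: float = 0.9) -> bool:
    """True when the result is within `threshold` of the network ceiling.

    The 0.9 default is an assumed rule of thumb, not a figure from the text.
    """
    return measured_mb_s >= threshold * network_ceiling_mb_s(active_paths)


print(network_ceiling_mb_s(2))  # ceiling for two active gigabit paths
```

A result close to the ceiling says more about the network than the disks; a result far below it points back at the storage or its configuration.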

Sequential metrics are frequently cited, giving seemingly massive numbers that offer no insight as to why a server is suffering storage-related performance problems.  As noted above, random IOPS are generally much more important for virtualisation, but sequential metrics are often a good indication of the overall ‘health’ of the storage subsystem:

  • Any kind of network latency, congestion or packet loss will reduce sequential throughput, potentially significantly.
  • Storage subsystem misconfiguration, such as incorrect cache policies, will have a great impact on write performance.

Although there are far too many permutations to list ‘typical’ values, some measured values are given below.

  • Single SATA Drive: 60-90 MB/s
  • 4x SATA RAID-10: 120-150 MB/s
  • 5x SAS 10k RAID-5: 300+ MB/s

For the more important random-workload IOPS, problems are indicative of different issues:

  • The impact of queue depth can clearly be measured through the average latency.
  • Alignment issues, particularly with NFS storage, will result in greatly reduced random write throughput.

Some values measured with the 8K test defined above are:

  • Single SATA Drive: 140 IOPS
  • 4x SATA RAID-10: 490 IOPS
  • 5x SAS 10k RAID-5: 1,100 IOPS
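Converted to throughput, these random results look tiny next to the sequential figures, which is expected: throughput is simply IOPS multiplied by the IO size. A minimal conversion, assuming 8K means 8,192 bytes and using decimal megabytes (the function name is illustrative):

```python
def iops_to_mb_s(iops: float, block_bytes: int = 8 * 1024) -> float:
    """Throughput in decimal MB/s for a given IOPS figure and IO size."""
    return iops * block_bytes / 1_000_000


# The 8K random values listed above, expressed as throughput.
for label, iops in [("Single SATA Drive", 140),
                    ("4x SATA RAID-10", 490),
                    ("5x SAS 10k RAID-5", 1100)]:
    print(f"{label}: {iops_to_mb_s(iops):.1f} MB/s")
```

A single SATA drive at 140 IOPS therefore moves only a little over 1 MB/s in this test, so a low MB/s figure in an 8K random run is not in itself a sign of a problem.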

 

About the Forthcoming eBook Storage Basics for vSphere

Whilst virtualisation offers major advantages, for the SME there is a general lack of accessible quality guides to storage, and a wealth of ‘consultants’ all too eager to propose massively over-specified solutions.

The book aims to demystify the storage options for VMware and help the SME administrator avoid both consultant margins and performance and availability surprises later down the line.


14 comments

  1. [...] IOMeter on a Dell R610 (with 1 vCPU and 4GB RAM allocated) using the methods described on my blog, here.  For comparison, I’ve also run the tests on a few different storage systems, including a [...]

  2. Josh Coen says:

    James, what a great write-up. This is exactly what I was looking for. While the Iometer guide is clear in some areas, it is also very ambiguous in others, and your article really helped to clear things up for me. I am very interested in reading your e-book. Any time frame for release? Thanks again.

    • james says:

      Hi Josh, many thanks for the feedback, I’m glad it was useful. I’ve been wondering what to do with the book and am considering transferring the lot to the wiki, as it is something of a moving target. I’ll have a think and get something out next week though. Cheers, James.

  3. Mike says:

    I am also interested in your storage information. I struggle with finding any sort of time to do any sort of benchmarking on the systems I’m quoting for my SMB customers and would love to review any knowledge you can pass along.

  4. Josh Coen says:

    James, I’m looking at some results using DAS and I wanted to get your opinion, if you have some time to look at it. I haven’t been able to figure out the alignment and offset for the storage, but using viclient to create the VMFS and server 2008 R2 as the guest, the storage should be aligned properly. The numbers seem to be somewhat in line with what you state above, but one thing that worried me in the 8K tests was the MB/s.

    http://i805.photobucket.com/albums/yy337/jcoen/IO_Tests.png

    If you have time, let me know what you think, thanks.

    • james says:

      Hi Josh, I sent you a mail direct. Your numbers look OK, although you should see higher figures if you increase the queue depth to at least twice the number of physical disks in the array. Cheers, James.

  5. [...] using my usual methods, this ‘free’ build has sequential read or write performance of 11MB/s and 8K random [...]

  6. Jason says:

    Hi James,

    Follow your blog and was curious to know if there was any update on your ebook?

    TIA,

    -Jason

    • james says:

      Hi Jason, thanks for posting. I’m sorry for the extended delay on that – it’s just lack of time. I’m hoping to get some time to work on it soon.

  7. René Frej Nielsen says:

    Hi,

    I’m trying to benchmark our EqualLogic PS4000VX which I’m having performance issues with when using VMDK disks. I have done the sequential test but I’m only getting 17 MB/s from IOMeter, but I know that I can generate at least 70-80 MB/s when copying files inside the VM.

    What could I be doing wrong? Is it the 8K test size?

    • james says:

      Hi, test sequential with 32K IOs and a queue depth of 32 IOs. Test file size should probably be about 30GB with these units. HTH

  8. [...] (Tech Republic) IOPs? (Yellow Bricks) Storage System Performance Analysis with Iometer (VMware) Benchmarking Storage for VMware (Peacon) Performance Troubleshooting VMware vSphere – Storage (Virtual Insanity) NetApp TR-3808 – [...]

  9. Sounds good, I like to read your blog, just added to my favorites ;)

  10. Adam Rush says:

    Hi James

    Another great post of yours that Google has taken me to! Did you manage to finish that eBook in the end?



 

Copyright © Peacon Ltd, 2010, 2011
virtualisation blog by James Pearce
