File System Choice for NFS Servers

The VMware community is slowly waking up to NFS, and for good reason: it has many benefits over iSCSI, and it is in many ways ideal for the shared storage component of a DR installation, where older servers can easily be re-purposed with new disks to provide reasonable chunks of networked storage.

But getting the most out of NFS needs some careful configuration.  Besides the hardware choice, I’ve found the underlying file system to be one of the main performance-limiting factors.  I’ve looked in detail at VM disk performance running from NFS storage with the four main choices (ext3, JFS, XFS and ext4).  So which Linux file system is best for NFS servers providing shared storage for VMware?

The Test Rigs

I’ve performed tests on three targets:

  • a Pentium-4 with a single SATA drive (without SATA NCQ)
  • a Dell 2950 with four SATA drives running in RAID-10 (without SATA NCQ)
  • a Dell 2950 with six SATA drives running as 3x RAID-1 volumes (with SATA NCQ), then striped in Linux with mdadm to create a large RAID-10 volume (mdadm required due to BIOS volume size restrictions)
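
As a rough illustration of the third rig, the sketch below builds (but does not run) an mdadm command that stripes three existing RAID-1 volumes into one RAID-0 array, giving RAID-10 overall. The device names, the md device and the 64 KiB chunk size are assumptions for illustration only; the exact command used for these tests is not recorded here.

```python
def mdadm_stripe_command(md_device, members, chunk_kib=64):
    """Return the argv list for striping `members` into one RAID-0 array.

    The command is only constructed, never executed, so it can be
    reviewed before being run as root on a real server.
    """
    if len(members) < 2:
        raise ValueError("striping needs at least two member devices")
    return [
        "mdadm", "--create", md_device,
        "--level=0",                        # stripe across the mirrors
        f"--raid-devices={len(members)}",
        f"--chunk={chunk_kib}",             # chunk size in KiB (assumed value)
        *members,
    ]


if __name__ == "__main__":
    # Hypothetical names for the three hardware RAID-1 volumes.
    mirrors = ["/dev/sdb", "/dev/sdc", "/dev/sdd"]
    print(" ".join(mdadm_stripe_command("/dev/md0", mirrors)))
```

Returning the command as a list keeps it easy to inspect, and it can be passed to subprocess.run unchanged once the device names have been checked.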

Ext3

The default file system for many 2.6 kernels, ext3 is old and reliable.  But it just isn’t designed for files the size of VMDKs, and as a result can’t really keep up.  Sequential write performance seemed to be limited to about 60MB/s on my test platforms, and delete performance was so bad that ESXi would time out when deleting files as small as 12GB.
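
For anyone wanting a quick local sanity check of these two metrics, here is a minimal sketch that times a sequential write (flushed with fsync) and the subsequent delete of a single file. It runs directly on the server's file system rather than through NFS and ESXi, so it is not the method behind the figures above; the function name and default sizes are assumptions.

```python
import os
import tempfile
import time


def time_write_and_delete(directory, size_mb=64, block_kb=1024):
    """Write one file sequentially, then delete it.

    Returns (write throughput in MB/s, delete time in seconds).
    """
    block = b"\0" * (block_kb * 1024)
    blocks = (size_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())   # make sure the data reached the disk
        write_seconds = time.perf_counter() - start
    except BaseException:
        os.unlink(path)            # do not leave a partial file behind
        raise
    start = time.perf_counter()
    os.unlink(path)
    delete_seconds = time.perf_counter() - start
    return size_mb / write_seconds, delete_seconds


if __name__ == "__main__":
    mb_per_s, delete_s = time_write_and_delete(tempfile.gettempdir())
    print(f"write: {mb_per_s:.1f} MB/s, delete: {delete_s:.3f} s")
```

To say anything about VMDK-sized files the test file would need to be far larger than the server's RAM, otherwise caching dominates the result.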

JFS

JFS has credible underpinnings from IBM, but I found that in some configurations under a sustained heavy write workload, the background writer process (jfscommit) used steadily more CPU, which ultimately limited throughput, and severely.

XFS

XFS has excellent support for the enormous files VMware needs, and in most respects it is ideal: stable, fast and well proven.  Sequential write performance on my test rig was nearly double that of ext3 at over 100MB/s.

When tuned a little, in particular by mounting with nobarrier, delete performance seems as good as VMFS: 2TB VMDKs could be deleted on the PowerEdge test rigs almost instantly.

I did find a corner-case when used with mdadm volumes, where a random mixed read-write IO workload (which is of such importance with VMs) appeared to throttle the array queue depth to only 1 IO, with a devastating performance impact, effectively reducing array performance to that of a single disk.
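
To confirm that a datastore really is mounted with nobarrier, the options can be read back from /proc/mounts. The sketch below parses text in that format; the sample line, device and mount point are invented for illustration, and it assumes a kernel of this era on which XFS accepts the nobarrier option. Disabling barriers is generally only sensible when the write cache is protected, for example by a battery-backed controller.

```python
def mount_options(mounts_text, mount_point):
    """Return the option set for `mount_point` from /proc/mounts-style text.

    Returns None if the mount point is not listed. If it appears more
    than once, the last entry (the most recent mount) wins.
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = set(fields[3].split(","))
    return options


# Hypothetical sample line; on a real server read /proc/mounts instead.
SAMPLE = "/dev/sdb1 /srv/nfs xfs rw,noatime,nobarrier 0 0\n"

if __name__ == "__main__":
    opts = mount_options(SAMPLE, "/srv/nfs")
    print("nobarrier set:", opts is not None and "nobarrier" in opts)
```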
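
One way to watch for this kind of queue-depth collapse is to sample the kernel's in-flight request counters while the benchmark runs. The sketch below reads /sys/block/<device>/inflight, which holds two numbers: reads and writes currently in flight. This is an assumed way to observe the effect, not necessarily how it was measured here, and the device name is hypothetical; on an md array it may be more informative to sample the member devices.

```python
from pathlib import Path


def parse_inflight(text):
    """Parse the contents of an `inflight` file into (reads, writes)."""
    reads, writes = (int(value) for value in text.split()[:2])
    return reads, writes


def read_inflight(device):
    """Read current in-flight requests for a block device such as 'sdb'."""
    return parse_inflight(Path(f"/sys/block/{device}/inflight").read_text())


if __name__ == "__main__":
    # Parsing a sample string keeps the demo runnable on any machine.
    reads, writes = parse_inflight("       1        0\n")
    print("requests in flight:", reads + writes)
```

Sampled repeatedly under a mixed workload, a total that never rises above 1 would match the behaviour described above.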

Ext4

ext4-based NFS consistently provided sequential MB/s and random IOPS right at the top of the table.  It has new design features that make it far more suitable than its earlier cousin, ext3, for the very large files of interest here, and sequential write performance was just as good as XFS.  It was immune to the corner-case affecting XFS with mdadm, and I could never drive it into the runaway CPU usage seen with JFS.

The only downside is that delete performance is very much lower than XFS, which effectively limits the safe VMDK working size, depending on the speed of the NFS server and its workload.  I found VMDKs up to about 1.2TB could be consistently deleted OK.

Conclusion

It seems XFS is the choice unless using mdadm, in which case ext4 is the way to go.  ext4 needs Linux kernel 2.6.28 or higher, effectively taking Debian out of the running, but Ubuntu 10.04 LTS (or 10.10) is an easy jump and uses ext4 by default.

The only free and supported solution for VMware is the now ancient Fedora 8, so if looking at that route then only XFS should really be considered, as Fedora 8 is too old for ext4 and ext3 is too slow.
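
The kernel requirement is easy to check in a script before committing to ext4. This is a minimal sketch that compares a kernel release string against the 2.6.28 threshold given above; the function name is an assumption, and the release strings in the demo are just examples of the formats distributions use.

```python
import platform
import re


def kernel_at_least(release, minimum=(2, 6, 28)):
    """Return True if a kernel release string is at or above `minimum`.

    Only the leading numeric part is compared, so distribution suffixes
    such as '-21-server' are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    version = tuple(int(part) if part else 0 for part in match.groups())
    return version >= minimum


if __name__ == "__main__":
    for release in ("2.6.26-2-amd64", "2.6.32-21-server", platform.release()):
        try:
            print(release, "->", kernel_at_least(release))
        except ValueError as err:
            # platform.release() is not a Linux version on other systems.
            print(err)
```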


6 comments

  1. orange says:

    Hi,

    Debian 6 has been released now and includes support for ext4 :-)

    Regards.

  2. T.J. says:

    Hey,

    Why isn’t XFS a good choice for Linux software RAID (mdadm)? I did a bit of searching, and I found several cases where other folks were doing it.

    Thanks
    T.J.

    • james says:

      Hi, I found that with Debian 5/6 and Ubuntu 10.x there was a very specific problem with the combination of mdadm, XFS and NFS. When running VMs from datastores of such a configuration, the maximum queue depth at the mdadm array was exactly 1 IO, for the case that the workload contained both reads and writes.

      IO-Meter benchmarking within the VM (on an aligned partition) would show the expected performance with 100% (random) read and 100% (random) write workloads, but with any mix (say 70:30, 50:50, makes no difference) despite the IO-Meter queue depth being say 16 IOs, the queue at the array would only be 1 IO. The impact of this is that the array provides the performance of one disk only.

      I tried everything I could to get around this, but couldn’t. This is on hardware with hardware RAID and BBWC of course.

      No other file system showed this peculiarity.

      Hope that helps!

  3. sunghost says:

    Hi,
    nice article, but keep in mind that if you want to build a big server with more than 16TB you can’t use ext3 or ext4, because of the tools to resize the disk. That’s my experience so far. I’m still looking for a good alternative for my Debian system.

    • james says:

      Thanks for the feedback. XFS will take things far beyond 16TB, but I seem to recall ESX having some limitation at 16TB too?



Copyright © Peacon Ltd, 2010, 2011
Technology blog by James Pearce
