vSphere 5 Hypervisor (Free ESXi) Crippled by New Licensing

Following the excitement of the vSphere 5 product launch, the realities of the new vRAM licensing model are still becoming clear.  It looks as though vmware have simply underestimated the consolidation ratios typically in use; effectively doubling license costs at a stroke is a risky move, even with an 80% market share.

So cue another big disappointment: the vRAM entitlement for vmware Hypervisor (i.e. the free-licensed ESXi) has been set at an enforced 8GB, total.

This effectively makes the product next to useless, other than perhaps as a container for a single VM.  I guess the warning signs have been there for a while – read-only vCLI since v3.5, the host update utility gone in 4.1, and now functionality severely impaired in 5.  I wouldn’t mind betting this will be the last free release.

For the SME, the answer is simple and relatively painless: buy an Essentials pack, which is a giveaway at $495 anyway.  This provides 144GB vRAM entitlement over 6 sockets (24GB per socket), and vCenter Server too.
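To put numbers on those entitlements, here is a rough sketch in Python.  The 8GB and 144GB-over-6-sockets figures are the ones quoted in this post; the VM inventory and the function name are made up for illustration, and it assumes vRAM is counted as the configured memory of powered-on VMs.

```python
# Rough vRAM arithmetic using the entitlement figures quoted in this post.
FREE_HYPERVISOR_VRAM_GB = 8   # free ESXi 5 entitlement, total
ESSENTIALS_VRAM_GB = 144      # Essentials pack entitlement
ESSENTIALS_SOCKETS = 6


def vram_used_gb(vm_memory_gb):
    """Sum the configured memory of powered-on VMs (assumed vRAM counting rule)."""
    return sum(vm_memory_gb.values())


# Hypothetical home-lab inventory: VM name -> configured RAM in GB.
lab_vms = {"dc": 2, "vcenter": 4, "esxi-nested-1": 4, "esxi-nested-2": 4, "router": 1}

used = vram_used_gb(lab_vms)
per_socket = ESSENTIALS_VRAM_GB / ESSENTIALS_SOCKETS

print(f"vRAM in use: {used} GB")
print(f"Fits free Hypervisor: {used <= FREE_HYPERVISOR_VRAM_GB}")
print(f"Fits Essentials: {used <= ESSENTIALS_VRAM_GB} ({per_socket:.0f} GB per socket)")
```

Even a modest nested lab like this one lands well over the free entitlement, which is the whole problem.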

But where does this leave the home-lab enthusiast (and quite likely, vmware promoter, blogger, and VCP)?  I just can’t see that too many will want to stump up this kind of cash to continue research and training on the product.


vSphere 5

It’s been a while since I last posted, but I’ve been plenty busy trying to fit everything in and have a number of articles nearly ready to go.  But, there’s been a steadily increasing stream of posts appearing – and then disappearing – on vmware communities and rumours elsewhere about something rather interesting:

The much anticipated vSphere 5 launch is now very close, maybe even only weeks away.

Whilst vmware are keeping all the details confidential for now, it seems increasingly likely that v5 will arrive before the holiday season.

But what does this mean for VCPs?

Following the format of the vSphere 4 release, I’d expect VCP4s to be required only to attend a “what’s new” type course, and to have an upgrade exam available (for a limited time).

So getting in quick could again be important, both to make use of any update exams but also to keep ahead in the labour market: now is definitely the time for VCPs to clear the accumulated techy rubbish from their study desks and start sorting out some serious lab kit for VCP510 testing!


How to upgrade ESXi to 4.1 Update 1 (U1)

Patching and updating free-licensed ESXi is a little more difficult since vmware withdrew the neat host update utility, but thankfully it’s still pretty easy.  With 4.1 Update 1 recently released, now is probably the time to catch up on patching for those that haven’t yet attempted this without the host update utility.

What’s Fixed in 4.1 Update 1 (build 348481)

Quite a lot, in short.  Notably some fixes for Windows Server 2008 R2 guests, support for new guests such as Ubuntu 10.10, and a new tunable for NFS that resolves large file creation time-out (no fix for the similar NFS large file delete time-out, but watch this space).  Scalability is also up to 160 processors.

Full detail in vmware’s release notes.

Prerequisites

  • vmware vCLI (installed on Windows; note that it needs a reboot after installation)
  • vmware ESXi update package ZIP file, which can be downloaded from the downloads section of vmware.com after registration

Update Process

The update process is exactly the same as general 4.1 host patching, but this time only one patch needs to be applied: ESXi410-Update01.

After the update, the vSphere client also needs to be updated, but that is handled automatically when attempting connection.


How to Patch ESXi 4.1

Since the introduction of 4.1, vmware have withdrawn the host update utility, presumably as a gentle nudge for those on the free version to at least buy the frankly bargain-priced Essentials pack.

But the free licensed version still has a place, particularly for home labs.  I tend to use it also on machines I’m repurposing to run as NFS servers to act as datastores for DR and archive purposes, since the performance charting with datastore latency numbers and built-in health monitoring are extremely useful.

Since vmware have just released the first round of patches for ESXi 4.1, taking the build number to 320092, here’s a quick guide on how to patch ESXi 4.1 using vihostupdate.pl.  And fortunately, this release continues to run just fine on HP’s ML115 G5.

What’s Fixed in build 320092

Finding the Patches

Go to vmware’s download page and search for patches and updates to ESXi 4.1.

At time of writing, only one patch is available, ESXi410-201010001, which is about 200MB.

Applying the Patches

As ever, the host needs to be in maintenance mode, which means stopping all VMs.  Since no VM on the host can be running during the update, the patch ZIP file (no need to unpack it) and the vmware vCLI both need to be on a physical Windows-based PC.

The update process itself is pretty straightforward, although you might not think that from the 94-page vmware update guide.  Basically:

  • List out the updates in the package using vihostupdate.pl -l
  • Apply each update to the ESXi host, using vihostupdate.pl -i
  • Restart the host
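The steps above can be sketched as command lines.  This Python snippet only builds and prints them; it doesn't run anything.  The host name is hypothetical, the ZIP file name is assumed from the patch name, and the bulletin ID is a placeholder.  The -b (bundle) and -B (bulletin) options are as I understand the vCLI, so check vihostupdate.pl --help before relying on them.

```python
# Build (but do not run) vihostupdate.pl command lines for the list and apply steps.

def list_bulletins_cmd(host, bundle_zip):
    """Command to list the bulletins contained in a patch bundle (-l)."""
    return ["vihostupdate.pl", "--server", host, "-l", "-b", bundle_zip]


def install_bulletin_cmd(host, bundle_zip, bulletin_id):
    """Command to install one bulletin from the bundle (-i)."""
    return ["vihostupdate.pl", "--server", host, "-i", "-b", bundle_zip, "-B", bulletin_id]


host = "esxi01.example.local"      # hypothetical host name
bundle = "ESXi410-201010001.zip"   # patch named in this post; file name assumed

print(" ".join(list_bulletins_cmd(host, bundle)))
# Real bulletin IDs come from the -l output; this one is only a placeholder.
print(" ".join(install_bulletin_cmd(host, bundle, "EXAMPLE-BULLETIN-ID")))
```

Paths containing spaces would need quoting before being pasted into a command prompt.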

 

I’ve put a detailed step-by-step, with screenshots, in the peacon blog wiki.

Whilst updating from a PC over a wireless LAN should be OK, it is always preferable to use a wired connection in case the connection should be disturbed part way through the update process.

Rolling Back

There’s always the possibility that for whatever reason the patch process won’t work.  Whenever a host is patched, the previous version is retained by ESXi and can be restored by pressing Shift+R at the initial boot loader screen (the one with the white progress bar at the bottom), which brings up an option to roll back.

Simply press Y and then Enter to boot the old version when prompted.


vSwap with SandForce SSDs

If, like me, your lab server is struggling along maxed out with only 8GB of RAM, disk IO can be a real problem because of guest paging and vSwapping.

I should really upgrade the box, but that means ditching the trusty ML115 – and in any case, the whole point of ESX is to do more with less.  There are also still many machines appearing with 8GB max RAM capacity, such as HP’s new microserver.  So I’ve looked at other options.

Memory Over-commitment

ESXi 4 has page sharing, ballooning and vSwapping, and v4.1 adds compression.  Yet whenever my box is over-committed to any serious extent, web servers take minutes to spew out a page and any audio streaming just stops.

The disks are the issue – a 4-drive RAID-10 volume for everything.  Firing up a 3GB VM with the box already running at 90% RAM, ballooning requests guest level paging and the SATA array clatters away at something like 500 IOPS to service it.  With vSwapping doing more of the same, the controller queue depth of 128 commands pushes latency to 250ms – and everything pretty much grinds to a halt.
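That 250ms figure is roughly what Little's Law predicts for a full queue.  A quick sanity check in Python, assuming the array sustains about 500 IOPS and the 128-deep controller queue stays full (both figures from above; the function name is mine):

```python
# Little's Law: items in queue = throughput x time in system,
# so with a full queue, latency = queue depth / IOPS.

def queue_latency_ms(queue_depth, iops):
    """Average time a command waits in a full queue, in milliseconds."""
    return queue_depth / iops * 1000.0


latency = queue_latency_ms(queue_depth=128, iops=500)
print(f"{latency:.0f} ms")
```

The only ways out are fewer outstanding commands or a device that completes them much faster, which is where the SSD comes in.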

SandForce SSDs

I’ve had an eye on SSDs since reading this vmware community’s article on vSwapping to SSD at the start of the year.  The concept is simple enough – use an SSD for vSwap, which can respond 50 times quicker than a mechanical disk.  The problem is that SSDs have been pretty expensive and many of the more affordable drives have awful 4K random write performance, slower than a mechanical disk in some cases.

But there’s a new breed of cheaper, faster SSDs – I’ve been testing an OCZ Vertex II in my ML115 for a while.  There’s a compatibility issue with the ML115’s nVidia MCP55 SATA controller (the drive is detected in the BIOS, but ESXi doesn’t see it), but it runs OK on Dell’s Perc 5i and 6i RAID controllers.

SSDs on Dell Perc RAID Controllers

The 5i is an old design and lacks SATA NCQ.  The controller works fine with the SSD, but performance is sub-optimal (about 7,000 4K random IOPS) so I swapped out the 5i for a 6i since it’s a drop-in replacement – it’s a bit faster, uses less power and has SSD and SATA NCQ support.

NCQ allows the controller to pass multiple commands to the SSD at once, making use of internal parallelism in the drive and so boosting throughput.  The 4K IOPS test (50% write) jumped to 14,700 on the 6i – quite simply stunning compared to mechanical storage and just what’s needed for vSwapping!
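To put those IOPS numbers in more familiar terms, this small Python sketch converts them to throughput.  The SSD figures are the 4K test results above; the 500 IOPS figure for the SATA array is the rough number quoted earlier and may not have been measured at 4K, so treat the comparison as indicative only.

```python
BLOCK_BYTES = 4 * 1024   # 4K test block size


def iops_to_mb_per_s(iops, block_bytes=BLOCK_BYTES):
    """Convert an IOPS figure to throughput in MB/s (decimal megabytes)."""
    return iops * block_bytes / 1_000_000


results = [
    ("SATA RAID-10 array", 500),
    ("SSD on Perc 5i", 7_000),
    ("SSD on Perc 6i", 14_700),
]

for label, iops in results:
    print(f"{label}: {iops} IOPS = {iops_to_mb_per_s(iops):.1f} MB/s")

# How many times more random IO the SSD on the 6i handles than the array.
speedup = 14_700 / 500
print(f"SSD on 6i vs array: {speedup:.1f}x")
```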

NCQ also gives the SATA RAID-10 array a 30% speed boost under stress.  The 6i struggles though with the SSD, sequential write levelling out at about 60MB/s.  Neither OCZ nor Dell could help unfortunately but it’s not of concern for my random IO use anyway.

Configuring vSwapping to run on SSD

With the SSD working OK on the 6i, a datastore can be created in the usual way and then used for vSwapping (see ‘Virtual Machine Swapfile Location’ page in the vSphere client). VMs then need to be suspended and resumed for the change to take effect.

vSwapping though is a very blunt stick – ballooning generates less disk IO and usually has less impact on VMs, because the VM’s own memory management is more intelligent.  It knows about areas that shouldn’t be paged and RAM that hasn’t been used recently, whereas vSwapping just chooses a bunch of pages at random and swaps them out (and stalls the VM while it does so).

So it seems to me that guest swap space should also be on the SSD, by adding a thick-provisioned disk to each VM on the SSD, creating a 64K aligned partition on the disk in the VM, and finally moving the swap file onto that drive (see here for Linux info).  The downside is the amount of space required – essentially twice the VM’s RAM allocation.
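A rough way to size the SSD for this, sketched in Python.  It assumes the host swap file is sized at configured RAM minus any memory reservation, and that the guest swap disk is made the same size as the VM's RAM; the VM list and function name are hypothetical.

```python
def ssd_space_gb(vm_ram_gb, reservation_gb=0.0, guest_swap_factor=1.0):
    """Estimate SSD space for one VM: host swap file plus a guest swap disk.

    Assumes the host swap file is configured RAM minus any reservation,
    and the guest swap disk is guest_swap_factor x configured RAM.
    """
    vswap = vm_ram_gb - reservation_gb
    guest_swap = vm_ram_gb * guest_swap_factor
    return vswap + guest_swap


# Hypothetical VMs: name -> configured RAM in GB.
vms = {"web": 1, "whs": 2, "vcenter": 3}

total = sum(ssd_space_gb(ram) for ram in vms.values())
print(f"SSD space needed: {total:.0f} GB for {sum(vms.values())} GB of VM RAM")
```

Setting a memory reservation on a VM shrinks its host swap file, which is one way to claw some of that space back.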

Does the SSD Deliver?

It’s already been demonstrated that SQL-Server performs quite well, but that was in an environment with much more RAM in the first place and an enterprise grade SSD.  My testing is rather less scientific.

My test box typically runs with about 7GB used, so generating memory over-commitment isn’t hard – just starting a vSphere lab does the trick: two ESXi VMs, vCenter Server, a Windows domain controller and a virtual router together take the RAM load to about 15GB.  Ramping that up in a short space of time is a harsh test, and without the SSD it completely stalled the box – RDP sessions were dropped, audio streaming died and web servers appeared offline.
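For a sense of scale, the over-commitment in this test works out as follows.  This is just arithmetic on the figures above (8GB of physical RAM, about 15GB of load); the function name is mine.

```python
def overcommit_ratio(demand_gb, physical_gb):
    """Ratio of memory demanded by VMs to physical RAM in the host."""
    return demand_gb / physical_gb


physical = 8   # GB of RAM in the lab server
demand = 15    # GB, approximate load with the vSphere lab running

ratio = overcommit_ratio(demand, physical)
shortfall = demand - physical   # GB the host has to reclaim somehow

print(f"Over-commitment ratio: {ratio}")
print(f"Shortfall to reclaim: {shortfall} GB")
```

That shortfall has to come from page sharing, ballooning, compression or swap, and anything that ends up paged or swapped goes to disk.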

With the vSwap and guest paging all running from the SSD, the host survives the test with some stuttering on audio streaming.  Response from web servers on a ‘first page out’ basis after the test seems to vary from about nine to about 30 seconds.  An active session to a VM running Photoshop during test was a bit patchy but mostly usable.

Once the system stabilises, however, everything seems to perform pretty well, even with the RAM truly maxed out.

Being a home lab server, it hasn’t got hundreds of users pounding every VM so the system hangs together pretty well.  Essentially I guess that provided the active memory is comfortably within the physical RAM, performance should hold up.

SSD Swap and RAM Compression

With RAM compression disabled, the swap rate peaked at 36 MB/s with the ongoing rate depending entirely on the load.

I was expecting RAM compression to help a bit, but was surprised by how much.  RAM compression consistently reduced the swap rate by well over half with my ‘keen’ settings, but did seem to slow things down quite a bit too:

mem.MemZipAllocPct – 50
mem.MemZipLowMemMaxSwapOut – 50
mem.MemZipBalloonXferPct – 30
mem.MemZipMaxRejectionPct – 10
mem.MemSwapSkipPct – 75

Tweaking MemZipAllocPct and MemZipLowMemMaxSwapOut to 25% seems to provide a happy balance of swap and RAM compression throughputs.  Reducing the write load on the SSD is a good thing, because SSDs have a limited write cycle life.
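For reference, the two profiles can be written down side by side.  This Python snippet only records the values; it doesn't apply anything to a host.  The setting names are copied exactly as listed above, and the "keen" and "balanced" labels are just my shorthand.

```python
# 'Keen' values as listed in this post.
keen = {
    "mem.MemZipAllocPct": 50,
    "mem.MemZipLowMemMaxSwapOut": 50,
    "mem.MemZipBalloonXferPct": 30,
    "mem.MemZipMaxRejectionPct": 10,
    "mem.MemSwapSkipPct": 75,
}

# 'Balanced' profile: same as keen, with the two tweaked settings dropped to 25.
balanced = {**keen, "mem.MemZipAllocPct": 25, "mem.MemZipLowMemMaxSwapOut": 25}

# Show only what changed between the two profiles.
changed = {name: (keen[name], balanced[name]) for name in keen if keen[name] != balanced[name]}
for name, (old, new) in changed.items():
    print(f"{name}: {old} -> {new}")
```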

With this configuration, creating sudden memory pressure by starting a 3GB VM seems to let the system work everything much harder – the SSD peaks at nearly 50MB/s and compression at 30MB/s.  As I write this, a WHS VM is performing de-duplicated backups, I’m installing vCenter, and audio streaming is continuing pretty well.

In the lab environment some attention is also needed to mem.idletax, since it is not always desirable for idle VMs to be more heavily paged.

In Summary

The SandForce SSDs completely change the storage dynamic for small offices and home labs – a single SSD provides three times the random (swap/database) throughput of a 16-drive SAS array, at less than 1% of the power consumption.

Paging memory to memory is hardly new (remember EMS?), but for ESX, using SSD for swapping enables much higher RAM over-commitment and hence VM density.  Echoing the earlier vmware communities blog on the subject, vSwapping to SSD is something it seems that vmware should be looking at supporting formally, for example by adding TRIM support in the vSwapping configuration (to maintain the SSD performance) and enabling any queue depth throttling to be overridden for a vSwap datastore.

The quick RAM loading test performed here proves that there is no substitute for real RAM, but with more ordinary workloads (each with perhaps 10% of physical RAM allocated) everything holds up without a hitch with some serious over-commitment.  I can run many more VMs without any noticeable performance problems, the SSD providing the occasional burst of speed needed as different VMs demand resources.

I found best performance with the guest paging and vSwapping on the SSD and RAM compression enabled.  The balloon driver was able to recover RAM more quickly than vSwap, and VM responsiveness was dramatically improved because spinning storage latencies were not then affected by host level swapping by multiple VMs.

The bottom line – adding the SSD has significantly increased the VM capacity of my ML115, but for how long remains to be seen.


Copyright © Peacon Ltd, 2010, 2011
virtualisation blog by James Pearce
