ESXi 5 Health Status Monitoring

My ESXi 4 Health Status Monitoring script (itself based on earlier work by William Lam) continues to generate a bit of interest, so I thought it about time I tested it against ESXi 5 – even though I don’t see a place for v5 in my own test lab.

There have been some reports in the comments here that it appeared not to work.  But, since the script is working entirely through ‘official’ interfaces, I was expecting most of it to work perhaps with a few tweaks.

But, at least with vCLI v4, everything seems to work just fine!  The script, esx-health.pl, fired off the usual email just as it should.

One thing to note though – VMware have stopped including any hardware monitoring components with the ‘vanilla’ build.  So even basic reporting will require vendor-provided CIM modules – I came across this excellent guide on adding LSI modules for monitoring Dell PERC array controllers, for example.


vSphere 5: Where We’re At with Licensing

“With the introduction of VMware vSphere 5, VMware is evolving the product’s licensing to lay the foundation for customers to adopt a more cloud-like IT cost model based on consumption and value rather than physical components and capacity” - Mark Peek, VMware CFO

Whichever way we cut it, the problem is clear: consumption-based costing works well as a method of charging someone else for resources you own.  The problem for VMware is quite simple: their customers own the hardware already.  So effectively, the deal is to pay 100% of the hardware and operational costs associated with it, and then rent that capacity you’ve already bought back via VMware on top.  What!?

It’s clearly flawed and one has to wonder how this was able to get to such a major product launch with such glaring problems even on top of that – vSphere 5 supports 1TB RAM for VMs, but by the way that will cost you $70k to license.  And then there was the 8GB vRAM restriction on the free-licensed Hypervisor, rendering the product next to useless.

Community Backlash

Quite simply, what could VMware have been thinking?

Such has been the uproar that VMware has (relatively) quickly had to come back with another attempt, broadly doubling most vRAM limits previously announced and thankfully putting a very reasonable 32GB cap on the free Hypervisor (which in my mind makes it fit-for-purpose for its expected useful life).

Then quite separately there is the issue of existing customers on SnS agreements; VMware’s own upgrade terms state “Upgrades require that the Replacement License at least contain all of the functionality of the Original License”, which clearly they won’t.  I’d expect some out-of-court settlements on this point for those customers large enough to take it up.

Profits

But even now the simple fact is that host licenses previously had much larger RAM caps – 256GB per host for Essentials, Standard and Advanced, for example.  So why the change?

The argument goes that per-processor licensing is unsustainable for VMware, since increasing core counts have reduced the number of processors required.

But this actually stacks up neither with their financial results nor with processor performance increases over time.  Announcing the results of Q2 this year, Mark Peek reported,

“Total second quarter revenues increased 37% year-over-year and license revenues increased 44%. Our non-GAAP operating margin was a record 31.6% and benefited from the strong sequential increase in license revenue. We expect margins to return to below 30% in the third quarter.

Trailing 12-month free cash flows were $1.6 billion, an increase of 56% from a year ago. Our balance sheet remains strong, with cash and investments of $3.7 billion and unearned revenues of $2.1 billion.”

OK, so under vSphere 4 licensing, revenues and profits exceeded market expectations.  And this in a very flat global economy.

So what about performance?  It’s well known that Moore’s Law, predicted for the last three decades to be saturated within a decade, continues skyward with CPU performance (and RAM and disk capacity too).  To pick out an example, look at the relative performance of 2x Intel 5680s vs. 2x Xeon 2GHz – this works out at around 60% annual compound growth rate between their respective launches.  Core counts simply have to increase because, above about 3 or 4GHz, the chips get too hot and electricity moves too slowly.

Economy

Outside of VMware’s offices, the economic outlook is rather bleak to say the least.  In the UK, the truth is that Government cuts haven’t even begun (public sector spending is still actually increasing month-on-month), and the Euro zone is in all but meltdown.  Meanwhile the US is only really just contemplating cuts… things are clearly going to get a lot tougher over the next five years.

So what can be the effect of fundamentally increased licensing cost and complexity?  The revisions to vRAM limits help a bit, right now, but the sting comes twice:

  • With still no word on how VMware plan to keep up with Moore’s Law, it is effectively impossible to budget for a vSphere deployment.
  • The rolling annual average vRAM consumption based charging adds a layer of complexity and uncertainty, and actively drives businesses to embark on paper exercises to avoid license cost, for example shutting down test VMs over the weekend.  Again, how can the resultant year-end cost adjustment be budgeted for?
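To illustrate why the weekend shutdown ‘paper exercise’ is tempting, here is a minimal sketch.  It assumes the charge is driven by a simple mean of daily vRAM totals over the window, which is my simplification rather than VMware’s published method, and the 100GB production and 40GB test figures are made up purely for illustration.

```python
def average_vram_gb(daily_gb, window_days=365):
    """Mean of the most recent `window_days` daily vRAM totals (assumed model)."""
    recent = daily_gb[-window_days:]
    return sum(recent) / len(recent)


def daily_profile(days, prod_gb, test_gb, test_off_at_weekends):
    """Build a hypothetical list of daily vRAM totals; day 0 is a Monday."""
    profile = []
    for day in range(days):
        is_weekend = day % 7 in (5, 6)
        test_running = not (test_off_at_weekends and is_weekend)
        profile.append(prod_gb + (test_gb if test_running else 0))
    return profile


DAYS = 364  # 52 whole weeks, so weekends are exactly 2/7 of the window

# Hypothetical site: 100GB of production VMs, 40GB of test VMs.
always_on = average_vram_gb(daily_profile(DAYS, 100, 40, False), DAYS)
weekends_off = average_vram_gb(daily_profile(DAYS, 100, 40, True), DAYS)

print(f"Test VMs always on:      {always_on:.1f} GB average")
print(f"Test VMs off at weekend: {weekends_off:.1f} GB average")
```

Under these assumptions the average drops simply because the test VMs are off for two days in seven, which is exactly the kind of behaviour that makes the year-end figure hard to predict.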

Outlook

In common with many of the bloggers and VMware community members, I’ve personally spent thousands of hours on the product, and simply put I love the technology.  It’s made it possible to save a stack of cash in tough trading times, increase service levels and reduce IT’s impact on the environment.  This is clearly a killer combination, and exactly why VMware finds itself where it is.

But the purchasing decision has to be based on return on investment.  Since we can’t now quantify the costs over the product’s expected useful life, except to use a worst-case scenario, it is now very difficult indeed to propose a VMware based solution, especially in light of the comments from the competitors.

vRAM based charging creates headaches at every level and fundamentally changes VMware’s role, especially in the SME, from one of enabler to one of restrictor.

And while I hate to say it, Microsoft has confirmed they have no plans to go down the vRAM(/vTAX) route, “No Memory Tax. Hyper-V supports up to 1 TB of physical memory per server and up to 64 GB per VM today.”  If only it would support NFS.


vSphere 5 Hypervisor (Free ESXi) Crippled by New Licensing

Following the excitement of the vSphere 5 product launch, the realities of the new vRAM licensing model are still becoming clear.  It seems that VMware have simply underestimated the consolidation rates typically in use, as it seems a risky move indeed to effectively double license costs at a stroke – even with an 80% market share.

So cue another big disappointment: the vRAM entitlement for VMware Hypervisor (i.e. the free-licensed ESXi) has been set at an enforced 8GB, total.

This effectively makes the product next to useless, I suppose other than as a container for a single VM.  I guess the warning signs have been there for a while – read-only vCLI since v3.5, update manager gone in 4.1, and now functionality severely impaired in 5.  I wouldn’t mind betting this will be the last free release.

For the SME, the answer is simple and relatively painless: buy an Essentials pack, which is a give-away at $495 anyway.  This provides 144GB vRAM entitlement over 6 sockets, and vCenter Server too.

But where does this leave the home-lab enthusiast (and quite likely, VMware promoter, blogger, and VCP)?  I just can’t see that too many will want to stump up this kind of cash to continue research and training on the product.


vSphere 5 Licensing – the SME Nightmare

Yesterday’s vSphere 5 product launch was certainly slick and impressive, with the headline features being right on the money… but one of the more interesting points, the already infamous move to a ‘vRAM’ licensing model, has potentially serious implications for the SME customers currently running on Essentials.

The new model is simple enough – it adds the RAM allocated to running VMs into the licensing mix.  But the limits VMware have set are already outdated, and therefore give the product no shelf life at all.

Take an Essentials Plus customer with three two-socket hosts, each with a fairly standard 96GB RAM, running well utilised.  Factor in Veeam Essentials for backup and perhaps replication, the latter being to a second Essentials cluster, and today under vSphere 4, the license cost would be:

  • vSphere Essentials Plus, for primary site – $4,495
  • vSphere Essentials, for secondary site – $495
  • Veeam Essentials – $2,000
  • Total $6,990

Fast-forward to vSphere 5, which seems to offer SME customers very little new functionality, and the cost will rocket because of the RAM being used in this example.  The Essentials kits cannot be extended past 24GB per CPU (and it has been reported that this will be an enforced limit in the Essentials kits), so the customer would need to look to Standard editions.  But this affects the Veeam licensing too.

Assuming Essentials (144GB total vRAM) would be enough for a limited functionality DR site, the license cost will rise thus:

  • vSphere Standard Acceleration Kit, for primary site – $10,000
  • vSphere Standard Additional CPU License due to vRAM limits – $995
  • vSphere Essentials, for secondary site – $495
  • Veeam Management Suite – 6 sockets – $7,200
  • Total $18,690

In an instant, the total licensing cost rose by over 2.5x (and the vSphere licenses alone by about 2.3x), with barely any new features to show for it.
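The arithmetic behind the two lists above can be checked with a short sketch.  The prices are exactly as quoted in this post; the split into ‘vSphere only’ versus ‘including Veeam’ is my own grouping, done by matching on the line-item names.

```python
# Line items as quoted in the post (US dollars).
vsphere4 = {
    "vSphere Essentials Plus (primary site)": 4495,
    "vSphere Essentials (secondary site)": 495,
    "Veeam Essentials": 2000,
}
vsphere5 = {
    "vSphere Standard Acceleration Kit (primary site)": 10000,
    "vSphere Standard additional CPU license": 995,
    "vSphere Essentials (secondary site)": 495,
    "Veeam Management Suite, 6 sockets": 7200,
}


def total(items, only_vsphere=False):
    """Sum the line items, optionally skipping the Veeam (backup) lines."""
    return sum(
        price
        for name, price in items.items()
        if not (only_vsphere and name.startswith("Veeam"))
    )


total_v4, total_v5 = total(vsphere4), total(vsphere5)
vs_v4, vs_v5 = total(vsphere4, True), total(vsphere5, True)

print(f"Everything:   ${total_v4:,} -> ${total_v5:,} ({total_v5 / total_v4:.2f}x)")
print(f"vSphere only: ${vs_v4:,} -> ${vs_v5:,} ({vs_v5 / vs_v4:.2f}x)")
```

Separating the two ratios shows how much of the jump comes from the knock-on effect on the backup licensing rather than from vSphere itself.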

And this is today – we all know that RAM requirements increase over time with Moore’s law – it’s been that way since the 1960s.  This is quite different to the CPU licensing we’ve seen before, where increasing power and core count have reduced the CPU count needed over time.  Which, I guess, is VMware’s reasoning.

But unless the licensed vRAM amounts are continually adjusted, VMware might just find their customers leaving at rates following Moore’s law too.


Server Room Cooling

Practically every SME site I visit that has air conditioning in their comms room or server closet has it running as low as it goes – usually 18°C.  But why?

According to a friendly air conditioning engineer, the IT guys just like to ‘feel’ the cooling when they walk in.  But does running server rooms ‘cold’ have any merit, or is it just money being wasted?

On the one hand, it’s well known that some electronic components (such as capacitors) last longer at lower temperatures – and fan bearings are similarly affected.

But the flip side is that simply running air conditioners flat-out costs money, since as well as moving the heat generated by the servers, the systems are also moving heat from the surrounding building fabric.  By running server rooms slightly warmer than the surrounding building ambient, some of the heat load generated will be sunk into the building fabric, saving money on both parts of the thermal load.

Safe Range

Server quickspecs provide operating temperature ranges, usually 5 to 35°C or more, and a quick look at the DRAC temperature thresholds on a PowerEdge server shows that system board ambient is considered normal up to 42°C, and critical only at 47°C.

Dell and HP just don’t ask for the ambient temperature history in assessing warranty claims – so it seems that within the stated range, reliability isn’t materially affected.  It’s worth noting that the infamous (but out-dated) Google study of disks also found that higher temperatures gave, if anything, longer disk life, and since then disks have moved to fluid bearings that have far greater reliability than their predecessors.

Air conditioners add massively to the electricity bill (and carbon footprint) and excess cooling also dehumidifies the air more (due to the lower coil temperature), and this in turn leaves the environment more vulnerable to electrostatic discharge (which is bad!).

Set Points

The more something provides heating or cooling, the more it costs to run.  So what temperature set point will minimise cost?

Personally, I use a set point a few degrees higher than the surrounding building ambient, so that the comms room coolers are moving only the heat load generated by the equipment (rather than some too from the surrounding building).

For a typical SME or branch office comms room with a single high-wall mounted cooler, directing the air flow over the front of the rack will generate savings by making use of the front-to-back cooling of the equipment (creating a so-called ‘hot aisle’ at the rear).  The overall ambient can then be increased further without affecting the internal temperatures of the servers, thereby sinking much of the heat load to the surrounding building fabric.

Running the air conditioner fans at maximum speed maximises the cooling coil temperature at any particular load and therefore minimises dehumidification.

Working with What’s There

Another consideration is the types of systems installed.

Older R407C systems, because of their simple on-off design, use at least 30% more electricity and dehumidify more than DC-inverter R410a systems.

DC-inverters meanwhile tend to be most efficient running at about 80% of stated load capacity.

For Example

Say a comms room had a 5kW R407C system, two 6kW R410a systems, and an electrical load (which can be checked via the UPS management cards) of about 9kW.  In this case, setting the two R410a systems to 25°C and the R407C to 27°C might work well, as the R407C system would ‘kick in’ only if one of the R410a systems packed up (because the 9kW electrical load cannot be moved completely by one of the 6kW R410a systems, resulting in a rise in the ambient temperature).

Because of our friends at VMware, often now I find the cooling systems are way over specified.  In the above example, electrical load might have been reduced to only 3kW, in which case the R407C system wouldn’t be needed at all, and a 25°C/27°C set point split between the two R410a systems could be used instead.
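The set point reasoning in the example above can be sketched as a simple capacity check.  This is a deliberately crude model and the names are my own: it assumes units with the lowest set point run first, that all units sharing a set point run together, and that a higher set point ‘tier’ only kicks in when the tiers below it cannot move the whole load.  Real inverter systems modulate, so treat it as a planning aid only.

```python
from dataclasses import dataclass


@dataclass
class Cooler:
    name: str
    capacity_kw: float
    set_point_c: float
    healthy: bool = True


def coolers_needed(coolers, load_kw):
    """Return (names of running units, whether the load is covered)."""
    healthy = sorted((c for c in coolers if c.healthy), key=lambda c: c.set_point_c)
    running, capacity = [], 0.0
    for cooler in healthy:
        # A higher set-point tier only starts if the tiers below cannot cope.
        next_tier = bool(running) and cooler.set_point_c > running[-1].set_point_c
        if next_tier and capacity >= load_kw:
            break
        running.append(cooler)
        capacity += cooler.capacity_kw
    return [c.name for c in running], capacity >= load_kw


# The 9kW example from the text: two 6kW R410a at 25°C, one 5kW R407C at 27°C.
room = [
    Cooler("R410a-1", 6, 25),
    Cooler("R410a-2", 6, 25),
    Cooler("R407C", 5, 27),
]
normal = coolers_needed(room, 9)

room[0].healthy = False  # one R410a system packs up
degraded = coolers_needed(room, 9)

# After consolidation: 3kW load, R410a systems split 25°C/27°C, no R407C.
small_room = [Cooler("R410a-1", 6, 25), Cooler("R410a-2", 6, 27)]
consolidated = coolers_needed(small_room, 3)

print(normal, degraded, consolidated)
```

In the consolidated case the second R410a system only ever runs as a standby, which is the point of splitting the set points.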

As with everything, to maintain a reliable infrastructure, monitoring is the key.  Option boards are available for Daikin systems (for example) to connect to environmental monitoring equipment such as APC’s NetBotz range.  Alternatively, a temperature sensor can be simply attached with a cable tie to the air conditioner outlets, and alarms configured on the environment monitor accordingly after some period of observation.

Server Room Air Conditioning Quick Tips

  • Understand what you’re dealing with – the electric load (which usually equals the thermal load), the air conditioner type, and their cooling capacity.
  • Minimise the use of R407C air conditioners.
  • Create hot aisles – direct cooled air across the front of the racks as this is what the servers ‘breathe’.  Cooling at the rear of the rack is essentially wasted.
  • Don’t use comms room air conditioners to cool the building – remove only the heat generated by the equipment.
  • Focus on server internal temperatures rather than room temperatures – a 25°C ambient as recorded via the server system board sensors is absolutely fine.
  • Be mindful of how quickly and how much temperatures will rise in the event of incoming mains power failure, as UPS shutdown policies may need to be revised (UPS run time could result in overheating, in the absence of any cooling).  This can be tested by shutting down the air conditioners and observing the rate of rise.
  • Try to work out the overall room air flow, watching out for hot-spots.  Server racks are cooled front-to-back and can therefore benefit from the use of blanking plates in the racks between servers and of course wire mesh doors, but comms equipment tends to be cooled side-to-side.  Internal division of racks (with side panels) can therefore be advantageous.
  • Work out a way of monitoring the health of installed air conditioners and generating alarms when necessary.
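For the power-failure tip above, the observed rate of rise can be turned into a rough ‘time until too hot’ figure and compared with the UPS run time.  This sketch assumes the temperature keeps rising in a straight line at the observed rate (in practice it will tail off), and the sample readings, the 42°C limit and the 45 minute run time are all illustrative values rather than measurements.

```python
def minutes_until(threshold_c, samples):
    """Estimate minutes until `threshold_c` is reached.

    `samples` is a list of (minutes since cooling stopped, temperature in °C).
    Uses the straight-line rate between the first and last sample.
    Returns None if the temperature is not rising.
    """
    (t_first, temp_first), (t_last, temp_last) = samples[0], samples[-1]
    rate_c_per_min = (temp_last - temp_first) / (t_last - t_first)
    if rate_c_per_min <= 0:
        return None
    return max(0.0, (threshold_c - temp_last) / rate_c_per_min)


def ups_policy_ok(ups_runtime_min, threshold_c, samples):
    """True if servers will have shut down before the threshold is reached."""
    remaining = minutes_until(threshold_c, samples)
    return remaining is None or ups_runtime_min <= remaining


# Hypothetical readings taken with the air conditioners switched off.
readings = [(0, 25.0), (5, 27.0), (10, 29.0)]
time_left = minutes_until(42.0, readings)

print(f"About {time_left:.0f} minutes until 42°C at the observed rate")
print("UPS policy OK:", ups_policy_ok(45, 42.0, readings))
```

If the UPS run time exceeds the estimate, shutting servers down earlier is the safer policy.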

Copyright © Peacon Ltd, 2010, 2011
virtualisation blog by James Pearce
