Esx-health.pl

From blog.peacon.co.uk

esx-health.pl is a development of a Perl script written by William Lam that interrogates an ESX or ESXi host and reports its hardware health status. It can email the report, either only when there is a hardware health alert or, by default, on every run.

  • Quick jump to source code
  • Originally introduced in the blog, here
  • Requires the VMware vSphere CLI to be installed (which provides the Perl interpreter on Windows platforms)
  • Developed on a Windows platform (but should also work on Linux)
  • If this is useful to you, please consider donating just a quid towards costs!

Syntax

C:\>perl esx-health.pl --server [servername] --username [username] --password [password]
--mailhost [mailhost] --maildomain [example.com] --mailfrom [mailfrom@example.com]
--mailto [mailto@example.com;mailto2@example.com] --cpuwarnpc [75] --memwarnpc [85]
--dswarnpc [25] --dscriticalpc [10] --warnofsnapshots --warnonchange --warnonalert
--exclude ["search regex"] --concise --logfile [esx-health.log]
--statusfile [esx-health-status.txt]

Parameters

--server [servername]

[servername] is the hostname or IP of the ESXi host to be queried. Minimum version is 3.5.

--username [username], --password [password]

[username] and [password] are the credentials needed to connect to the ESXi host. A read-only account is sufficient.

--mailhost [mailhost]

[mailhost] is the hostname or IP address of an SMTP server that will accept mail to the specified mailto address - i.e. a mail server for that domain, or one with relay enabled.

--maildomain [example.com]

[example.com] is the domain of the email account to send from (mailfrom).

--mailfrom [mailfrom@example.com]

[mailfrom@example.com] is the address that the email will be sent from.

--mailto [mailto@example.com;mailto2@example.com]

[mailto@example.com] is the address that the email will be sent to. Multiple addresses can be included, separated by ';'.

--cpuwarnpc [75]

cpuwarnpc is the percentage CPU utilisation that will be considered a warning threshold, and if omitted defaults to 75 (i.e. 75%). Note that this is simply the most recent view of the system processor utilisation as would be reported by the vSphere client host summary view.

--memwarnpc [85]

memwarnpc is the percentage RAM utilisation that will be considered a warning threshold, and if omitted defaults to 85 (i.e. 85%). Note that this is simply the most recent view of the system memory utilisation as would be reported by the vSphere client host summary view.

--dswarnpc [25]

dswarnpc is the percentage of datastore free space that will be considered a warning threshold, and if omitted defaults to 25 (i.e. 25% space remaining).

--dscriticalpc [10]

dscriticalpc is the percentage of datastore free space that will be considered a critical threshold, and if omitted defaults to 10 (i.e. 10% space remaining).
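
The four thresholds above point in opposite directions: cpuwarnpc and memwarnpc are utilisation percentages (higher is worse), while dswarnpc and dscriticalpc are free-space percentages (lower is worse). The following Python sketch illustrates that difference only. It is not taken from esx-health.pl; the function names are hypothetical, and treating a value exactly on the threshold as a warning is an assumption.

```python
# Illustrative only: not the script's own code.
# Assumption: a value exactly on a threshold counts as having crossed it.

def usage_status(used_pc, warn_pc):
    """CPU/RAM: a higher utilisation percentage is worse."""
    return "warning" if used_pc >= warn_pc else "ok"

def datastore_status(free_pc, warn_pc=25, critical_pc=10):
    """Datastores: a lower free-space percentage is worse."""
    if free_pc <= critical_pc:
        return "critical"
    if free_pc <= warn_pc:
        return "warning"
    return "ok"

print(usage_status(80, 75))   # CPU at 80% against --cpuwarnpc 75
print(usage_status(60, 85))   # RAM at 60% against --memwarnpc 85
print(datastore_status(30))   # 30% free, default thresholds
print(datastore_status(18))   # 18% free, default thresholds
print(datastore_status(7))    # 7% free, default thresholds
```

With the default values, a datastore with 18% free space falls between the two thresholds and would be a warning, while one with 7% free would be critical.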

--warnofsnapshots

When warnofsnapshots is specified, any running VMs with snapshots will be considered to be alert items (default: not considered alert items).

--warnonchange

warnonchange enables tracking of the number of warnings on a given host (see also --statusfile, below), and emails a notification only if the number of warnings has changed since last run.

Note that this simply tracks the number of warnings - so if a memory threshold warning were replaced with a VM snapshot warning between runs (so both runs had 1 warning, but a different one), no email would be generated on the second run.

--warnonalert

warnonalert modifies warnonchange, so that a mail will always be sent if there is an active alert item.

If used as a regular scheduled task with --warnonchange --warnonalert specified, mail will always be sent unless:

  • there are no warnings present, and
  • there were no warnings present at last run either.
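
The interaction between these two switches can be summarised as a small decision function. The Python sketch below is one reading of the rules described above, not the script's actual logic. The function and argument names are hypothetical, and the behaviour when --warnonalert is given without --warnonchange is an assumption (here it has no effect).

```python
# Illustrative only: one interpretation of the documented behaviour.

def should_send_mail(current, previous, warn_on_change=False, warn_on_alert=False):
    """Return True if a report email would be sent for this run.

    current and previous are the warning counts for this run and the last run.
    """
    if not warn_on_change:
        return True    # default behaviour: mail on every run
    if current != previous:
        return True    # the number of warnings changed since last run
    if warn_on_alert and current > 0:
        return True    # count unchanged, but alerts are still active
    return False

# Same count on both runs (even if the warning itself differs): no mail.
print(should_send_mail(1, 1, warn_on_change=True))
# Same count, but --warnonalert keeps reporting while an alert is active.
print(should_send_mail(1, 1, warn_on_change=True, warn_on_alert=True))
# No warnings now and none at last run: nothing is sent.
print(should_send_mail(0, 0, warn_on_change=True, warn_on_alert=True))
```

Because only the count is compared, the first call returns False even when the single warning is a different one from the previous run.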

--exclude ["search regex"]

["search regex"] is a Perl regular expression against which each line item will be tested, and omitted if a match is found. This can be as simple as a specific text string.

For example, to suppress all lines containing "Power Supply 2" (which might generate a warning if it's not fitted, on some systems), specify

--exclude "Power Supply 2"

To also suppress lines containing "Fan 4" (perhaps for a similar reason), separate the search terms with a pipe ("|") character,

--exclude "Power Supply 2|Fan 4"
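
To see what such a pattern would and would not remove, it can be tried against a few lines before being put into a scheduled task. The Python sketch below mimics the filtering described above; it is not the script's code, the report lines are invented for illustration, and it assumes an unanchored, case-sensitive match. Simple patterns like this one behave the same way in Perl and in Python's re module.

```python
import re

# Illustrative only: mimics the documented "omit the line if the regex matches".
def apply_exclude(lines, exclude):
    """Drop every line that the exclude pattern matches anywhere."""
    pattern = re.compile(exclude)
    return [line for line in lines if not pattern.search(line)]

# Hypothetical report lines, not real esx-health.pl output.
report = [
    "Power Supply 1: OK",
    "Power Supply 2: Warning",
    "Fan 3: OK",
    "Fan 4: Warning",
    "Fan 40: OK",
]

for line in apply_exclude(report, "Power Supply 2|Fan 4"):
    print(line)
```

Note that the hypothetical "Fan 40" line is removed as well, because "Fan 4" matches anywhere in the line; a tighter pattern such as "Fan 4\b" would avoid that.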

--concise

When concise is specified, only lines with warnings are included. This is particularly useful when reports are viewed primarily on a hand-held device.

--logfile [esx-health.log]

[esx-health.log] is the path and file name to use for the log file, if not the default.

--statusfile [esx-health-status.txt]

[esx-health-status.txt] is the path and file name of the status file used to record the number of alerts for a particular host (for use with --warnonchange). If omitted, esx-health-status.txt will be used.

Note that one status file is required per host being checked if warnonchange is being used. So if a scheduled task calls esx-health.pl to check several hosts, a separate status file must be specified for each host.
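
One way to keep the status files apart is to derive the file name from the host name. The Python sketch below only builds and prints the command lines for two hosts. The helper name, the second host (esxi-2) and the naming scheme are assumptions for illustration, and the mail options are left out for brevity.

```python
# Illustrative only: builds one command line per host, each with its own
# status file. Only options documented on this page are used.

def build_command(host, username, password):
    """Return the argument list for one host, with a per-host status file."""
    return [
        "perl", "esx-health.pl",
        "--server", host,
        "--username", username,
        "--password", password,
        "--warnonchange",
        "--statusfile", f"esx-health-status-{host}.txt",
    ]

for host in ["esxi-1", "esxi-2"]:
    print(" ".join(build_command(host, "readonly", "letmein")))
```

Each printed line could then be used as a separate scheduled task, so that the warning count recorded for one host is never compared against another's.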

Sample Output

C:\>perl esx-health.pl --server esxi-1 --username readonly --password letmein --mailhost mail
--maildomain example.com --mailfrom Monitor@example.com --mailto somebody@example.com
Using mail host: mail.
Generating ESX host health report "esx-host-health-report.html" .
Processing esxi-1.home.local (VMware ESXi 4.0.0 build-261974): 0 alerts.
Query returned 0 alerts.
Sending email to somebody@example.com.
Done.


Sample email Output

Health-email.png


See Also

peacon blog

This wiki forms part of the peacon blog.

Creating this content takes a great deal of time (and money for test hardware and hosting). If you've found this article useful, please consider donating a quid towards the costs. And thanks for reading!
