Hyper-V Cluster Virtual Machines Current Cluster Node SCOM Monitor


First of all, apologies for not updating this blog for over a month. Life has been pretty busy outside of work. My wife gave birth to our first child a month ago and I’ve been flat out looking after our little girl, Rosie.

Apart from enjoying every moment with the little one while I’m at home, I have been asked at work to provide a solution to an issue that bothers our infrastructure support team on a daily basis.

Background

Windows 2008 R2 Hyper-V Clusters are heavily utilised in my employer’s infrastructure. There are over 700 2-node Hyper-V clusters operating in the environment. For more information about how the System Center suite operates in the environment, please refer to the Case Study from Microsoft HERE.

DPM 2010 is installed on each Hyper-V cluster node so that the nodes back up each other and the virtual machines. Every day, support teams get many SCOM alerts from the DPM management pack about failed virtual machine backups because the virtual machines are not located on the expected host. Therefore, we need to make sure virtual machines are hosted on the right host before DPM backups start.

Analysis

Initially, we thought this was going to be an easy fix: we could just set the preferred nodes for each VM in Failover Cluster Manager and configure auto failback to run before backups start. However, we then realised that auto failback does not live migrate VMs. The VMs are paused, and therefore taken offline, during the migration process. So I implemented the fix via SCOM instead.

Outcome

Basically, I have created a monitor targeting Hyper-V clusters that runs daily at 1:00am to detect whether any virtual machine cluster resources are not hosted by (one of) their preferred nodes. If so, a diagnostic task runs to check whether all cluster nodes are up (to make sure the cluster is in a healthy state before live migrations). Then, based on the outcome of the diagnostic task, a recovery task runs to migrate any VMs that are not on a preferred host to a preferred host.

Management Pack Details

Class Definition:

  • Hyper-V Cluster (Base Class: Microsoft.Windows.Server.Computer)

Discovery:

  • Hyper-V Cluster Discovery
    • Script discovery. Any Windows Server Computer that meets the following criteria:
      1. Windows Server Computer property "IsVirtualNode" = true (Windows cluster)
      2. The service "vmms" (Hyper-V Virtual Machine Management service) exists
      3. The cluster contains "Virtual Machine" as a resource type.
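
As a rough illustration of how the three criteria combine, here is a minimal Python sketch. It is not the discovery script from the management pack: the `ServerFacts` structure and the `is_hyperv_cluster` function are hypothetical names, and the sketch assumes the facts have already been collected from the server.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ServerFacts:
    """Hypothetical snapshot of the facts a discovery script would collect."""
    is_virtual_node: bool                      # the "IsVirtualNode" property
    services: Set[str] = field(default_factory=set)
    cluster_resource_types: Set[str] = field(default_factory=set)


def is_hyperv_cluster(facts: ServerFacts) -> bool:
    """Return True only when all three discovery criteria are met."""
    # Criterion 2: the "vmms" service exists (compared case-insensitively).
    has_vmms = any(name.lower() == "vmms" for name in facts.services)
    # Criterion 3: the cluster has "Virtual Machine" as a resource type.
    has_vm_resource_type = "Virtual Machine" in facts.cluster_resource_types
    # Criterion 1 is the IsVirtualNode flag itself.
    return facts.is_virtual_node and has_vmms and has_vm_resource_type
```

A plain file server cluster, for example, would pass the first criterion but fail the other two, so it would not be discovered as a Hyper-V cluster.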

Monitor:

  • Hyper-V Virtual Machine Current Cluster Node Monitor
    1. Checks if virtual machines are hosted by one of their preferred hosts.
    2. Generates a critical alert when at least one virtual machine is not on one of its preferred hosts.
    3. Runs a diagnostic task to check the status of each cluster node.
    4. If all cluster nodes are up, a recovery task runs to live migrate virtual machines that are not on a preferred host to (one of) their preferred hosts.
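
The monitor, diagnostic and recovery chain above can be sketched in Python as follows. This is only a model of the decision logic, not the actual SCOM modules: the function names, the "Up" state string and the choice of the first preferred node as the migration target are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def monitor_state(misplaced: Dict[str, List[str]]) -> str:
    """Monitor: critical when at least one VM is off its preferred hosts."""
    return "Critical" if misplaced else "Healthy"


def all_nodes_up(node_states: Dict[str, str]) -> bool:
    """Diagnostic: every cluster node must report "Up" (assumed state name)."""
    return bool(node_states) and all(
        state.lower() == "up" for state in node_states.values()
    )


def plan_migrations(
    misplaced: Dict[str, List[str]], node_states: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Recovery: list (vm, target node) pairs, or nothing if a node is down."""
    if not all_nodes_up(node_states):
        return []  # cluster not healthy, so no live migrations are attempted
    # Assumption: the first preferred node is used as the target.
    return [
        (vm, preferred[0])
        for vm, preferred in sorted(misplaced.items())
        if preferred
    ]
```

Here `misplaced` maps each VM that is on the wrong host to its list of preferred nodes. With one node down, `plan_migrations` returns an empty list, which mirrors the recovery task being skipped.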

Other Considerations:

I have created a custom probe action module to detect:

  • Any VMs on wrong hosts (not on preferred hosts)
  • Any VMs that are currently on preferred hosts
  • Any VMs that do not have preferred hosts configured
  • Whether all VMs are on preferred hosts (Boolean)
  • Whether all VMs have preferred hosts configured (Boolean)

This probe action module is then wrapped in a data source module and is also used in the monitor type for the Hyper-V Virtual Machine Current Cluster Node Monitor. If we later need to be alerted when any VMs on those 700+ clusters don’t have preferred nodes configured, I can reuse the same probe action and data source modules for the new monitor.
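
To make the five outputs of the probe more concrete, here is a small Python sketch of the same classification. It is not the probe action module itself: `PlacementReport` and `evaluate_placement` are hypothetical names, the inputs are assumed to be simple dictionaries, and the sketch assumes that a VM with no preferred hosts configured is not counted as being on a wrong host.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PlacementReport:
    on_wrong_host: List[str]
    on_preferred_host: List[str]
    no_preferred_hosts: List[str]
    all_on_preferred: bool      # no VM is on a wrong host
    all_have_preferred: bool    # every VM has preferred hosts configured


def evaluate_placement(
    current_owner: Dict[str, str], preferred_owners: Dict[str, List[str]]
) -> PlacementReport:
    """Classify each VM by comparing its current node to its preferred nodes."""
    wrong, right, unset = [], [], []
    for vm in sorted(current_owner):
        # Node names are compared case-insensitively.
        preferred = [node.lower() for node in preferred_owners.get(vm, [])]
        if not preferred:
            unset.append(vm)
        elif current_owner[vm].lower() in preferred:
            right.append(vm)
        else:
            wrong.append(vm)
    return PlacementReport(
        on_wrong_host=wrong,
        on_preferred_host=right,
        no_preferred_hosts=unset,
        all_on_preferred=not wrong,
        all_have_preferred=not unset,
    )
```

Because the two Booleans are reported separately, a second monitor for missing preferred nodes could read `all_have_preferred` from the same result without touching the placement check.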

During the development of this Management Pack, this TechNet blog post really helped:

Your MP Discoveries and Clustering

Management Pack Download:

Download both sealed and unsealed versions HERE.
