Tag Archives: Management Pack

My Experience of Using Silect MP Studio During Management Pack Update Process

Written by Tao Yang

Thanks to Silect’s generosity, I was given an NFR (Not For Resale) license for MP Studio to use in my lab last November. When I received the license, I created a VM and installed it in my lab straightaway. However, due to my workload and other commitments, I haven’t been able to spend much time exploring the product. In the meantime, I’ve been trying to get all the past and current Microsoft management packs ready so I can load them into MP Studio to build my repository.

Today, one of my colleagues came to me seeking help on an error logged in the Operations Manager log on all our DPM 2012 R2 servers (where SQL is locally installed):

image

It’s obvious that the offending script (GetSQL2012SPNState.vbs) is from the SQL 2012 MP, and we can tell that the FQDN of the computer WMI is trying to connect to is incorrect. In the pixelated area, the FQDN contains the NetBIOS computer name plus two domain names from the same forest.

I knew the SQL MP in our production environment was two versions behind (currently on version 6.4.1.0), so I wanted to find out whether the latest one (6.5.4.0) had fixed this issue.

Therefore, as I always do, I first went through the change logs in the MP guide. The only thing I could find that might be related to SPN monitoring is this line:

SPN monitor now has overridable ‘search scope’ which allows the end user to choose between LDAP and Global Catalog

image

I wasn’t really sure whether the new MP was going to fix the issue, and no, I didn’t have time to unseal and read the raw XML to figure it out, because this version of the SQL 2012 monitoring MP has 49,723 lines of code!

At this stage, I thought MP Studio might be able to help (by comparing two MPs). So I remoted back to my home lab and quickly loaded every version of the SQL MP that I have into MP Studio.

SNAGHTML10c3cde7

I then chose to compare version 6.5.4.0 (the latest version) with version 6.4.1.0 (the version loaded in my production environment):

image

It took MP Studio a few seconds to generate the comparison result, and I was surprised by how many items had been updated!

image

Unfortunately, there is no search function in the comparison result window, but fortunately, I was able to export the result to Excel. The export contained 655 rows! When I searched for the script name mentioned in the error log (GetSQL2012SPNState.vbs), I found the script had actually been updated:

image

Because the script is too long and is truncated in the Excel spreadsheet, I had to go back to MP Studio and find this entry (luckily, entries are sorted alphabetically).

Once the change was located, I could copy both the parent value and the child value to the clipboard:

image

I pasted the value into Notepad++. Since it contained some XML headers / footers plus both versions of the script, I removed the headers and footers and separated the scripts into two files.

Lastly, I used the Compare plugin for Notepad++ to compare the two scripts, and I found an additional section in the new MP (6.5.4.0) that may be related to the error we are getting (as it has something to do with generating the domain FQDN):

image

After seeing this, I took an educated guess that this could be the fix for our issue and asked my colleague to load MP version 6.5.4.0 into our test management group to see if it fixed the error. When we went to load the MP, we found out that I had already loaded it in Test (I’ve been extremely busy lately and forgot I did it!). So my colleague checked a couple of DPM servers in our test environment and confirmed the error does not exist in Test. It seems we have nailed this issue.

Conclusion

Updating management packs has always been a challenging task (for everyone, I believe). In my opinion, we all face challenges because we don’t know EXACTLY what has been changed. This is because:

  • It is impossible to read and compare each MP file (e.g. the SQL 2012 monitoring MP has around 50,000 lines of code, plus the 2008 and 2005 MPs, plus the library MP, etc.); they are just too big to read!
  • The MP guide normally only provides a vague description in the change log (if there are change logs at all).
  • Bugs caused by human error would not be captured in the change logs.
  • It can be harder to test an MP in a test environment, because test environments normally don’t have the same load as production; therefore it is harder to test some workflows (e.g. performance monitors).

And we normally rely on the following sources to make our judgement:

  • The MP guide – but only if the changes are captured in the guide, and they are normally very vague.
  • Social media (tweets and blogs) – but this is only based on the blog author’s experience; the bug you have experienced may not be seen in other people’s environments (e.g. the particular error I mentioned in this post probably wouldn’t happen in my lab, because I only have a single domain in the forest).

Normally, you’d wait for someone else to be the guinea pig, test it out and let you know if there are any issues before you start updating your environment (e.g. the recent bug in Server OS MP 6.0.7294.0 was first identified by SCCDM MVP Daniele Grandini, and the MP was soon removed from the download site by Microsoft).

In MP Studio, the feature I wanted to explore the most is the MP compare function. It provides OpsMgr administrators with a detailed view of what has changed in an MP, and you (as the OpsMgr admin) can use this information to make better decisions (e.g. whether or not to upgrade, or whether any additional overrides are required). Based on today’s experience, if I start timing from before I loaded the MPs into the repository, it took me less than 15 minutes to identify that this MP update was well worth trying (in order to fix my production issue).

Lastly, MP Studio provides many other features; I have only spent a little time on it today (and the result is positive). In my opinion, sometimes the best way to describe something is with an example, so I’m sharing today’s experience with you. I hope you’ve found it informative and useful.

P.S. Coming back to the bug in the Server OS MP 6.0.7294 that I mentioned above: I ran a comparison between 6.0.7294 and the previous version, 6.0.7292, and I can see that a lot of perf collection rules have been changed:

image

And when I export the result to Excel, I can actually see the issue described by Daniele (highlighted in yellow):

image

Oh, one last word before I call it a day. To Silect: would it be possible to provide a search function in the comparison result window (so I don’t have to rely on the Excel export)?

Updated Management Pack for Windows Server Logical Disk Auto Defragmentation

Written by Tao Yang

defragBackground

I was asked to automate Hyper-V logical disk defragmentation to address a widespread production issue at work. Without a second thought, I went for the famous AutoDefrag MP authored by my friend and SCCDM MVP Cameron Fuller.

Cameron’s MP was released in October 2013, around 1.5 years ago. When I looked into it, I realised that, unfortunately, it did not meet my requirements.

I had the following issues with Cameron’s MP:

  • The MP schema is based on version 2 (OpsMgr 2012 MP schema), which prevents it from being used in OpsMgr 2007. This is a show stopper for me as I need to use it on both 2007 and 2012 management groups.
  • The monitor reset PowerShell script used in the AutoDefrag MP uses the OpsMgr 2012 PowerShell module, which won’t work with OpsMgr 2007.
  • The AutoDefrag MP was based on Windows Server OS MP version 6.0.7026, in which the fragmentation monitors are enabled by default. However, since version 6.0.7230, these fragmentation monitors have been disabled by default. Therefore, the overrides in the AutoDefrag MP that disable these monitors are obsolete, since the monitors are already disabled.

In the end, I decided to rewrite this MP, although it is still based on Cameron’s original logic.

New MP: Windows Server Auto Defragment

I’ve given the MP a new name: Windows Server Auto Defragment (ID: Windows.Server.Auto.Defragment).

The MP includes the following components:

Diagnostic Tasks: Log defragmentation to the Operations Manager Log

There are 3 identical diagnostic tasks (for the Windows Server 2003, 2008 and 2012 logical disk fragmentation monitors). These tasks log an entry to the agent’s Operations Manager log before the defrag recovery task starts.

Group: Drives to Enable Fragmentation Monitoring

This is an empty instance group. Users can place logical disks into this group to enable the “Logical Disk Fragmentation Level” monitors from the Microsoft Windows Server OS MPs.

You may add any instances of the following classes into this group:

  • Windows Server 2003 Logical Disk
  • Windows Server 2008 Logical Disk
  • Windows Server 2012 Logical Disk

 

Group: Drives to Enable Auto Defrag

This is an empty instance group. Users can place logical disks into this group to enable the diagnostic and recovery tasks for auto defrag.

You may add any instances of the following classes into this group:

  • Windows Server 2003 Logical Disk
  • Windows Server 2008 Logical Disk
  • Windows Server 2012 Logical Disk

 

Group: Drives to Enable Fragmentation Level Performance Collection

This is an empty instance group. Users can place logical disks into this group to enable the Windows Server Fragmentation Level Performance Collection Rule.

Note: Since this performance collection rule targets the “Logical Disk (Server)” class, which is the parent class of the OS-specific logical disk classes, you can simply add any instances of the “Logical Disk (Server)” class into this group.

Event Collection Rule: Collect autodefragmentation event information

This rule collects the event logged by the “Log defragmentation to the Operations Manager Log” diagnostic tasks.

Reset Disk Fragmentation Health Rule

This rule targets the RMS / RMS Emulator. It runs every Monday at 12:00 and resets any unhealthy instances of the disk fragmentation monitors back to healthy (so the monitors’ regular detection and recovery will run again the next weekend).
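The core logic of this reset rule can be sketched as follows. This is an illustrative Python sketch only, not the actual SDK-based script shipped in the MP; the health-state values and function name are my own:

```python
# Hypothetical sketch of the reset rule's logic: walk the fragmentation
# monitor instances and reset only the unhealthy ones back to healthy.
def reset_unhealthy_monitors(monitor_states):
    """monitor_states: list of (instance, health) tuples.
    Returns the instances that were reset."""
    reset = []
    for instance, health in monitor_states:
        if health != "Success":
            # In the real MP this is a call into the OpsMgr SDK to reset
            # the monitoring state of this monitor instance.
            reset.append(instance)
    return reset

states = [("C:", "Error"), ("D:", "Success"), ("E:", "Warning")]
print(reset_unhealthy_monitors(states))  # → ['C:', 'E:']
```

Resetting only the unhealthy instances (rather than every instance) keeps the weekly reset cheap on large environments.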

Auto Defragmentation Event Report

This report lists all auto defragmentation events collected by the event collection rule within a specified time period.

image

Windows Server Fragmentation Level Performance Collection Rule

This rule collects the File Percent Fragmentation counter via WMI for Windows server logical disks. This rule is disabled by default.

If a logical drive has been placed into all 3 groups mentioned above, you’ll probably see a performance graph similar to this:

image

As shown in the figure above, number 1 indicates that the monitor has just run and the defrag recovery task was executed: the drive has been defragmented. Numbers 2, 3 and 4 indicate the fragmentation level slowly building up over the week, and hopefully you’ll see a similar pattern at a weekly interval (because the fragmentation level monitor runs once a week by default).

Various views

The MP also contains various views under the “Windows Server Logical Drive Auto Defragment” folder:

image

What’s Changed from the Original AutoDefrag MP?

Compared with Cameron’s original MP, I have made the following changes in the new version:

  • The MP is based on MP schema version 1, which works with OpsMgr 2007 (as well as OpsMgr 2012).
  • Changed the minimum version of all the referencing Windows Server MPs to 6.0.7230.0 (where the fragmentation monitors became disabled by default).
  • Sealed the Windows Server Auto Defragment MP. However, in order to allow users to manually populate the groups, I have placed the group discoveries into an unsealed MP, “Windows Server Auto Defragment Group Population”. By doing so, all MP elements are protected (in the sealed MP) while users can still use the groups defined in the MP to manage auto defrag behaviour.
  • Changed the monitor overrides from disabled to enabled, because these monitors are now disabled by default. This means users now need to manually INCLUDE the logical disks to be monitored, rather than excluding the ones they don’t want.
  • Replaced the Linked Report with a report to list auto defrag events.
  • Added a performance collection rule to collect the File Percent Fragmentation counter via WMI. This rule is also disabled by default; it is enabled for the group called “Drives to Enable Fragmentation Level Performance Collection”.
  • Updated the monitor reset script to use the SDK directly. This change is necessary to make it work with both OpsMgr 2007 and 2012. The original script would reset the monitor on every instance; the updated script only resets the monitors on unhealthy instances. Additionally, the monitor reset results are written to the RMS / RMSE’s Operations Manager log.
  • Updated LogDefragmentation.vbs script for the diagnostic task to use MOM.ScriptAPI to log the event to Operations Manager log instead of the Application log.
  • Updated the message in LogDefragmentation.vbs from “Operations Manager has performed an automated defragmentation on this system” to “Operations Manager will perform an automated defragmentation for <Drive Letter> drive on <Server Name>”. Because this diagnostic task runs at the same time as the recovery task, the defrag is just about to start, not yet finished, so I don’t believe the message should use the past tense.
  • Updated the diagnostic tasks to be disabled by default.
  • Created overrides to enable the diagnostics for the “Drives to Enable Auto Defrag” group (same group where the recovery tasks are enabled).
  • Updated the data source module of the event collection rule to use “Windows!Microsoft.Windows.ScriptGenerated.EventProvider”, looking only for event ID 4 generated by the specific script (LogDefragmentation.vbs). Using this data source module, we can filter by script name, which gives us more accurate detection.

 

How do I configure the management pack?

Cameron suggested that I use the 5 common scenarios from his original post to explain the different monitoring requirements. In his post, Cameron listed the following 5 scenarios:

01. We do not want to automate defragmentation, but we want to be alerted to when drives are highly fragmented.

In this case, you will need to place the drives that you want to monitor in the “Drives to Enable Fragmentation Monitoring” group.

02. We want to ignore disk fragmentation levels completely.

In this case, you don’t need to import this management pack at all. Since the fragmentation monitors are now disabled by default, this is the default configuration.

03. We want to auto defragment all drives.

In this case, you will need to place all the drives that you want to auto defrag into 2 groups:

  • Drives to Enable Fragmentation Monitoring
  • Drives to Enable Auto Defrag

04. We want to auto defragment all drives but disable monitoring for fragmentation on specific drives.

Previously, when Cameron released the original version, he needed to build exclusion logic because the fragmentation monitors were enabled by default. With the recent releases of the Windows Server OS management packs, we need to work with inclusion logic instead. So, in this case, you will need to add all drives whose fragmentation level you want to monitor to the “Drives to Enable Fragmentation Monitoring” group, and put a subset of those drives into the “Drives to Enable Auto Defrag” group.

05. We want to auto defragment all drives but disable automated defragmentation on specific drives.

This case is similar to case #3: you will need to place the drives that you are interested in into these 2 groups:

  • Drives to Enable Fragmentation Monitoring
  • Drives to Enable Auto Defrag

In addition to these 5 scenarios, another scenario this MP caters for is:

06. We want to collect drive fragmentation level as performance data

In this case, if you simply want to collect the fragmentation level as perf data (with or without fragmentation monitoring), you will need to add the drives that you are interested in to the “Drives to Enable Fragmentation Level Performance Collection” group.

So, How do I configure these groups?

By default, I have configured the discovery rules for these groups to discover nothing, on purpose:

image

As you can see, the default group discoveries look for any logical drives whose device name (drive letter) matches the regular expression ^$, which represents a blank / null value. Since every discovered logical device has a device name, these groups will be empty. You will need to modify the group memberships to suit your needs.
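To see why ^$ guarantees an empty group, note that the expression anchors the start (^) and end ($) of the string with nothing in between, so it only matches an empty string. A quick Python illustration (the MP itself evaluates the expression inside the group discovery, not in Python):

```python
import re

# ^$ matches only a completely empty string.
pattern = re.compile(r"^$")

print(bool(pattern.match("")))    # True: a blank device name would match
print(bool(pattern.match("C:")))  # False: every real drive letter fails
```

Since no discovered logical disk ever has a blank device name, nothing is ever added to the groups until you change the membership rules.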

For example, if you want to include C: drive of all the physical servers, the group membership could be something like this:

grouppop

Note: In SCOM, only Hyper-V VMs are discovered as virtual machines. If you are running other hypervisors, the “virtual machine” property probably won’t work.

MP Download

There are 2 management pack files included in this solution. You can download them HERE.

image

Credit

Thanks to Cameron for sharing the original MP with the community and for providing guidance, review and testing on this version. I’d also like to thank all the other OpsMgr-focused MVPs who have been involved in this discussion.

Lastly, as always, please feel free to contact me if you have questions / issues with this MP.

VMM 2012 Addendum Management Pack: Detect Failed VMM Jobs

Written by Tao Yang

Background

My MVP friend Flemming Riis needed OpsMgr to alert on failed VMM jobs. After discovering that the native VMM MPs don’t have a workflow for this, I offered my help and built this addendum MP to alert on failed and warning (Completed w/ Info) VMM jobs:

image

I thought it was going to be a quick task; as it turned out, I started writing this MP about a month ago and have only been able to release it now!

The actual MP is pretty simple: 2 rules sharing the same data source, which executes a PowerShell script to detect any failed and warning jobs in VMM. I wrote the initial version in a few hours and sent it to Flemming and Steve Beaumont to test in their environments right before the MVP Summit. After the summit, we found out the MP didn’t work in their clustered VMM environments. We then spent a lot of time emailing back and forth trying to figure out what the issue was. In the end, I had to build a VMM cluster in my lab in order to test and troubleshoot it.

So, BIG BIG “Thank You” to both Flemming and Steve for their time and effort on this MP. It is certainly a team effort!

MP Pre-Requisites

This MP has 2 pre-requisites:

  • PowerShell script execution must be allowed on the VMM servers, and the VMM PowerShell module must be installed on the VMM server (it should be by default).
  • The VMM server must be fully integrated with OpsMgr (configured via the VMM console). This integration is required because it creates the RunAs account used to run the workflows in the native VMM management pack. This addendum management pack also utilises this RunAs account.

SNAGHTML42d92eab

Alert Rules:

This MP contains 2 alert rules:

  • Virtual Machine Manager Completed w/ Info Job Alert Rule (Disabled by default)
  • Virtual Machine Manager Failed Job Alert Rule (Enabled by default)

image

Both rules share the same data source with the same configuration parameter values (to utilise Cook Down). They are configured to run on a schedule and detect failed / warning jobs since the beginning of the rule execution cycle; i.e. by default they run every 3 minutes, so they detect any unsuccessful jobs from the last 3 minutes. An alert is generated for EVERY unsuccessful job:

SNAGHTML42e07b14

SNAGHTML42e1b950

Note: Please keep in mind that these 2 rules utilise Cook Down. If you enable the “Completed w/ Info Job Alert Rule” and need to override the data source configuration parameters (IntervalSeconds, SyncTime, TimeoutSeconds), please override BOTH rules with the same values, so the script in the data source module only needs to run once per cycle and feeds its output to both workflows.
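The detection window described above can be sketched in a few lines. This is a hypothetical Python illustration of the logic only; the real data source is a PowerShell script using the VMM cmdlets, and the job field and status names below are made up for the sketch:

```python
from datetime import datetime, timedelta

def unsuccessful_jobs(jobs, interval_seconds, now):
    """Return jobs that finished within the last interval_seconds
    with a failed or warning status (illustrative status names)."""
    cutoff = now - timedelta(seconds=interval_seconds)
    return [j for j in jobs
            if j["end_time"] >= cutoff
            and j["status"] in ("Failed", "SucceedWithInfo")]

now = datetime(2015, 4, 1, 12, 0, 0)
jobs = [
    {"name": "Refresh host", "status": "Failed",          "end_time": now - timedelta(seconds=60)},
    {"name": "Create VM",    "status": "Completed",       "end_time": now - timedelta(seconds=90)},
    {"name": "Migrate VM",   "status": "SucceedWithInfo", "end_time": now - timedelta(seconds=400)},
]

# With the default 3-minute (180 s) cycle, only the job that finished within
# the last 180 seconds is picked up; the older warning job would have been
# reported in a previous cycle.
print([j["name"] for j in unsuccessful_jobs(jobs, 180, now)])  # → ['Refresh host']
```

This is also why both rules must use the same IntervalSeconds: the window each cycle inspects is defined by that parameter, and Cook Down only works if both rules ask the shared data source the same question.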

Download

Since it’s a really simple MP, I didn’t bother writing proper documentation for it. It’s really straightforward, and I think I have already provided enough information in this blog post.

Please test and tune it according to your requirements before implementing it in your production environments.

Download Link

Lastly, I’d like to thank Steve and Flemming again for their time and effort on this MP. If you have any questions in regards to this MP, please feel free to send me an email.

OpsMgr Weather Monitoring MP Updated

Written by Tao Yang

I got an email the other day from someone in Sweden regarding the Weather Monitoring MP that I released a few months ago. I was made aware that negative temperature readings are being recorded as positive values (e.g. –8 degrees is collected as 8 degrees).

First of all, apologies for this mistake. I wrote the PowerShell script for the probe action module back in July last year, when most of the world was in summer. I didn’t even think about negative values, and I couldn’t test it anyway…

Last night, I spent some time fixing the management pack. As I was fixing the code, I also found a few other issues due to inconsistencies in www.webservicex.net (where the MP gets its data from). For example, some locations have decimal points in the temperature value (e.g. Vancouver, Canada), some locations have multiple <Wind> tags in the returned data, etc. Oh well, webservicex is a free service, so there’s no point bagging them for inconsistencies.

Below is a list of bugs that are fixed in this release (1.0.1.0):

  • Incorrect temperature collected when the reading is below zero
  • Incorrect temperature collected when the reading contains decimal points
  • Script error when the pressure reading is not within a <pressure> tag (e.g. Vancouver, Canada uses a <PressureTendency> tag); in this situation, the pressure reading is not probed
  • Fixed wind direction and speed probing when there are multiple <Wind> tags in the result
  • Agent task not displaying wind speed in km/h
  • Updated the temperature-related performance views to display negative temperature readings

Note:

I’ve updated the 4 temperature-related performance views below so they can display negative values:

image

The Y-axis range is set to –130 to 134 (degrees) for the imperial unit (Fahrenheit) and –90 to 57 (degrees) for the metric unit (Celsius). According to Wikipedia, these figures are roughly the lowest and highest temperatures ever recorded on this planet. They can be customised by right-clicking the view and choosing “Personalize view…”:

image

The updated MP can be downloaded HERE. The download link in the original post has also been updated.

SCOM Management Pack: Detecting USB Storage Device Connect and Disconnect Events

Written by Tao Yang

There was a requirement at work that people be notified when a USB storage device (USB key or portable USB hard disk) is connected to or disconnected from a SCOM-monitored Windows computer.

So I wrote 2 very simple alert-generating rules to detect the USB mass storage device creation and deletion WMI events. I set both rules to run every 60 seconds, so within 60 seconds of the event, an information alert is generated in SCOM:

Alert for USB Storage Device Connection Event:

image

Alert for USB Storage Device Removal Event:

image
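Conceptually, what the 60-second WMI event queries report is the difference between the set of attached USB storage devices now and the set from the previous cycle. A hedged Python sketch of that idea (the MP itself uses WMI instance creation / deletion events, not polling, and the device IDs below are made up):

```python
def diff_devices(previous, current):
    """Compare two snapshots of attached device IDs.
    Returns (newly connected, removed) device lists."""
    connected = sorted(current - previous)
    removed = sorted(previous - current)
    return connected, removed

before = {"USBSTOR\\DISK&VEN_KINGSTON", "USBSTOR\\DISK&VEN_SEAGATE"}
after = {"USBSTOR\\DISK&VEN_KINGSTON", "USBSTOR\\DISK&VEN_SANDISK"}

connected, removed = diff_devices(before, after)
print(connected)  # → ['USBSTOR\\DISK&VEN_SANDISK']
print(removed)    # → ['USBSTOR\\DISK&VEN_SEAGATE']
```

Each entry in `connected` corresponds to a connection alert and each entry in `removed` to a removal alert, which is why a device plugged in and pulled out within the same 60-second window still produces both alerts.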

I have also created a dynamic group in the MP called Virtual Windows Computers, so I can disable both rules for virtual machines. This is how I defined the group:

image

Please note that this virtual machine discovery only detects virtual machines running on Microsoft’s virtualisation platform. If you open the System Center Internal Library MP in MPViewer and check the raw XML for the discovery “Discover if Windows Computer is a Virtual Machine”, you’ll see the WQL it uses:

image

So if you have non-Microsoft virtual machines (e.g. VMware) in your environment and you want to disable these 2 rules for those virtual machines, you will need to modify my group or create your own group in my management pack.

Download: USB Storage Device Detection Management Pack

SCOM Management Pack: Daylight Saving Time Change Detection

Written by Tao Yang

I wrote this management pack to detect system time changes caused by daylight saving. It’s called “Custom Daylight Saving Detection”.

Background:

When supporting an infrastructure that has servers across the globe, it is hard to keep track of the daylight saving schedules for all time zones. There is a requirement that we be notified when Windows servers enter or exit daylight saving time.

Functionalities:

This management pack generates alerts when:

  1. The agent computer has entered / exited daylight saving time.
  2. The agent computer is entering / exiting daylight saving time within a configurable number of days (the default is 10 days).

Inside the Custom Daylight Saving Detection Management Pack:

This MP consists of 4 rules:

image

 

Rule #1. Compare Current and Previous DaylightInEffect WMI Setting

Rule Type: Timed Script

Schedule: Runs at 1 minute past every hour on Saturdays and Sundays

Description:

The script inside this rule retrieves the DaylightInEffect value from Win32_ComputerSystem and compares it with the value previously saved in the <OpsMgr Agent Install Dir>\CustomDST_MP.Control file. If the current value and the previous value are different, an event with event ID 9999 is logged in the Operations Manager event log, and the CustomDST_MP.Control file is updated with the current value from WMI. This rule only runs during the weekend, because daylight saving changes don’t happen on weekdays.
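The compare-against-the-control-file logic can be sketched as below. This is a Python illustration of the idea only; the real rule is a timed VBScript, and everything here other than the control file name is my own naming:

```python
from pathlib import Path

def check_dst_change(current_in_effect, state_file):
    """Compare the current DaylightInEffect value with the stored one.
    Returns True when a change is detected (the real script would log
    event ID 9999 at that point)."""
    p = Path(state_file)
    if not p.exists():
        # First run: create the control file; nothing to compare yet.
        p.write_text(str(current_in_effect))
        return False
    previous = p.read_text().strip() == "True"
    if previous != current_in_effect:
        p.write_text(str(current_in_effect))  # persist the new value
        return True
    return False
```

The first run simply seeds the file, which is why no pre-configuration is needed on agent computers.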

Note:

1. If the CustomDST_MP.Control file does not exist, the script creates it and stores the current DaylightInEffect value in it. Therefore, no pre-configuration is required on agent computers.

2. If the script detects that the time zone is not impacted by daylight saving, it skips to the end of the script, and the CustomDST_MP.Control file is not created on the agent computer.

The event 9999 looks like this:

image

(Please note the timing in the above example is not correct, because for testing purposes I manually changed the value in the CustomDST_MP.Control file.)

Rule #2. The “Daylight Saving In Effect” Setting has been changed

Rule Type: Alert Generating Rule

Description: This rule detects event 9999 in the Operations Manager event log, generated by the previous rule (rule #1), and generates an information alert similar to this:

image

Rule #3. Detect Next Daylight Saving Effect Date and Time

Rule Type: Timed Script

Schedule: Runs at 12:00am every Monday

Description: The script inside this rule detects when (the exact date and time) the agent computer is entering or exiting daylight saving time. If it is within the configurable number of days (AlertDayRange; the default is 10 days), an event with event ID 9998 is logged in the Operations Manager log. The script works out the exact date and time of the daylight saving change and the approximate number of days from when the script was run.
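One way to find the next daylight saving transition is to scan forward from the current time and look for the first hour where the local UTC offset changes. The real rule is a VBScript working from WMI time zone data; the sketch below shows the same idea in Python, using the standard zoneinfo database:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

def next_dst_change(tz_name, start, alert_day_range):
    """Scan forward hour by hour for the next UTC-offset change
    within alert_day_range days. Returns None if there is none."""
    tz = ZoneInfo(tz_name)
    base_offset = start.astimezone(tz).utcoffset()
    for hour in range(1, alert_day_range * 24 + 1):
        t = start + timedelta(hours=hour)
        if t.astimezone(tz).utcoffset() != base_offset:
            return t  # first hour with a different offset
    return None       # no change in the window: nothing to log

# US daylight saving started on 8 March in 2015, so scanning 10 days
# forward from 1 March finds the transition.
change = next_dst_change("America/New_York", datetime(2015, 3, 1, tzinfo=timezone.utc), 10)
print(change)  # → 2015-03-08 07:00:00+00:00
```

Returning `None` when no transition falls inside the window corresponds to the rule logging nothing, which is why the AlertDayRange / weekly-schedule interaction described in the note below matters.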

Event 9998 looks similar to this:

image

Note: For testing purposes, I changed AlertDayRange from 10 to 150 so the alert could be generated. In a live environment, please be aware that if you change the number to be greater than 14, you will be notified multiple times, because the script runs once a week. For the same reason, if you change the number to be smaller than 7, you may not get notified at all.

How to change AlertDayRange to a number other than 10?

To change AlertDayRange from the default of 10 days, simply create an override for the parameter called “Argument”:

image

Rule #4. The system time will be changed by Daylight Saving setting soon

Rule Type: Alert Generating Rule

Description: This rule detects event 9998 in the Operations Manager event log, generated by the previous rule (rule #3), and generates an information alert similar to this:

image

This management pack can be downloaded here.

There are 3 files in the zip file. The XML file is the actual management pack (unsealed). The other 2 VBScripts are the scripts used in rules #1 and #3, for your reference.