SCOM: Monitoring an Interactive Process and The Recovery Task

Written by Tao Yang

Recently I’ve been working on a management pack for a series of apps for a business unit of my employer. There is a large number of processes that I need to monitor, and they run interactively on the console session. Auto Admin Logon is enabled on these servers: when a server starts up, it automatically logs on using the configured account and the interactive processes are started automatically.

Setting up monitors for these processes is easy. However, I went a step further and created a generic write action module to be used as recovery task that restarts the process interactively on the console session.

There is one pre-requisite for the recovery task: I had to use PsExec to launch the process on console session. PsExec can be downloaded here: http://technet.microsoft.com/en-us/sysinternals/bb897553. PsExec needs to be copied locally to the computers that are being monitored.

I’ll now use an example to go through how I set up the monitor, write action module and recovery task for notepad.exe.

01. First of all, I created a class and its discovery to target my test machine “Client01”

02. Added “Microsoft.SystemCenter.ProcessMonitoring.Library” as a reference in my MP.

image

03. Created a process monitor for notepad.exe

    • Monitor Type: Process Instance Count Monitor Type (from “Microsoft.SystemCenter.ProcessMonitoring.Library”)
    • Monitor Configuration:
        ProcessName: notepad.exe
        Frequency: 60
        MinInstanceCount: 1
        MaxInstanceCount: 1
        InstanceCountOutOfRangeTimeThresholdInSeconds: 5
  • Note: While I was setting up the monitor, I realised the process name is case sensitive. Also, Frequency is in seconds
  • image
  • This is pretty much the same as using the Process Monitoring template from the SCOM operations console (under the Authoring pane), except that I used my own class rather than targeting a group. Below is from the process monitoring wizard:
  • image

04. Now once I import the MP into my SCOM management group, I can verify it is working (from health explorer):

image

05. Because of the way this monitor works, it is only healthy when the process count is between MinInstanceCount and MaxInstanceCount (both set to 1 in this case). So the monitor’s health also turns to Error if there are, say, 2 instances of notepad running. Therefore I need to run a diagnostic task to determine how many instances are actually running, because I only want to run the recovery task when the instance count is less than 1. I created a diagnostic task that runs when the monitor’s health is in the Error state. This diagnostic has only 1 action module: “Microsoft.Windows.ScriptPropertyBagProbe”:

image

    • Module configuration:
    • ScriptName CheckProcessDiagnostic.vbs
      Arguments notepad.exe
      ScriptBody refer to the vbscript below
      TimeoutSeconds 60
    • Here’s the script:
'==========================================
' AUTHOR:            Tao Yang
' Script Name:        CheckProcessDiagnostic.vbs
' DATE:                27/01/2012
' Version:            1.0
' COMMENT:            - Script to check process state.
'                    - Used for OpsMgr Management Pack diagnostic tasks.
'==========================================
ProcessName = WScript.Arguments.Item(0)
Set oAPI = CreateObject("MOM.ScriptAPI")
Set oBag = oAPI.CreatePropertyBag()
WMIQuery = "Select * From Win32_process WHERE name = '" + ProcessName + "'"
Set objWMIService = GetObject("winmgmts:\\.\root\cimv2")
Set colProcesses = objWMIService.ExecQuery (WMIQuery)
Call oBag.AddValue("ProcessName",ProcessName)
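If colProcesses.count < 1 Then
    ' No instance is running: the recovery task should run
    Call oBag.AddValue("Result","Positive")
Else
    ' One or more instances are running: do not start another one
    Call oBag.AddValue("Result","Negative")
End If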
oAPI.Return(oBag)
  • This script returns a property bag value named “Result”. The value of “Result” is “Positive” if there is less than 1 instance of notepad.exe running. Otherwise, the value is “Negative”. Later, a condition detection module in the recovery task uses the value of “Result” to decide whether the recovery task should run.
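The two decisions described in step 05 can be sketched in PowerShell for quick local experimentation. This is only an illustration: the function names `Test-InstanceCountInRange` and `Get-DiagnosticResult` are hypothetical and are not part of the management pack or of SCOM. The instance count is passed in as a number so the logic can be tried without any running process.

```powershell
# Hypothetical helper names; they are not part of the management pack.

# Mirrors the monitor: healthy only when Min <= Count <= Max.
function Test-InstanceCountInRange {
    param([int]$Count, [int]$Min = 1, [int]$Max = 1)
    return ($Count -ge $Min -and $Count -le $Max)
}

# Mirrors the diagnostic script: "Positive" only when nothing is running.
function Get-DiagnosticResult {
    param([int]$Count)
    if ($Count -lt 1) { return 'Positive' }
    return 'Negative'
}

# Walk through the three interesting cases: none, exactly one, too many.
foreach ($count in 0, 1, 2) {
    $healthy = Test-InstanceCountInRange -Count $count
    $result  = Get-DiagnosticResult -Count $count
    "{0} instance(s): healthy={1}, Result={2}" -f $count, $healthy, $result
}
```

With 0 and with 2 instances the monitor is unhealthy, but only the 0 case yields “Positive”, which is why the diagnostic sits between the monitor and the recovery task.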

06. Create a Write Actions module for the recovery task. I’m creating a separate module for this so I can use it in recovery tasks of multiple monitors.

    • image
    • Member Module: “Microsoft.Windows.PowerShellWriteAction”
    • image
    • Module Configuration:
    • image
    • While editing this module, add the section below between </ScriptBody> and </Configuration>:

<Parameters>
<Parameter>
<Name>PsExecPath</Name>
<Value>$Config/PsExecPath$</Value>
</Parameter>
<Parameter>
<Name>PathToExe</Name>
<Value>$Config/PathToExe$</Value>
</Parameter>
<Parameter>
<Name>Context</Name>
<Value>$Config/Context$</Value>
</Parameter>
<Parameter>
<Name>Arguments</Name>
<Value>$Config/Arguments$</Value>
</Parameter>
</Parameters>
<TimeoutSeconds>$Config/TimeoutSeconds$</TimeoutSeconds>

image
Place the PowerShell script below inside the <ScriptBody></ScriptBody> section:

#=================================================
# AUTHOR:  Tao Yang
# DATE:    16/01/2012
# Version: 1.0
# COMMENT: Start an exe on the console session under LocalSystem or the Auto Admin Logon user
#=================================================

param([string]$PsExecPath, [string]$PathToExe, [string]$Context, [string]$Arguments)
# $Context should have only 2 possible values: "System" or "User". "User" needs Auto Admin Logon enabled

# Parse the output of "query session" and return the console session details
Function Get-ConsoleSessionInfo
{
    $results = Query Session
    $ConsoleSession = $results | select-string "console\s+(\w+)\s+(\d+)\s+(\w+)"
    if ($ConsoleSession)
    {
        $UserName = $ConsoleSession.Matches[0].groups[1].value
        $SessionID = $ConsoleSession.Matches[0].groups[2].value
        $State = $ConsoleSession.Matches[0].groups[3].value
        $objConsoleSession = New-Object psobject
        Add-Member -InputObject $objConsoleSession -Name "UserName" -Value $UserName -MemberType NoteProperty
        Add-Member -InputObject $objConsoleSession -Name "SessionID" -Value $SessionID -MemberType NoteProperty
        Add-Member -InputObject $objConsoleSession -Name "State" -Value $State -MemberType NoteProperty
    } else { $objConsoleSession = $null }
    Return $objConsoleSession
}

$Mode = $null
#Determine UserID
If ($Context -ieq "User")
{
    $strUserName = $null
    $DefaultPassword = $null
    #detect if auto admin is enabled, if so, retrieve username and password from registry
    $WinlogonRegKey = get-itemproperty "HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\"
    #use -eq to compare; a single "=" would assign the value and always evaluate to true
    If ($WinlogonRegKey.AutoAdminLogon -eq "1")
    {
        $DefaultUserName = $WinlogonRegKey.DefaultUserName
        $DefaultDomainName = $WinlogonRegKey.DefaultDomainName
        $DefaultPassword = $WinlogonRegKey.DefaultPassword
        $strUserName = "$DefaultDomainName`\$DefaultUserName"
    }

    If ($strUserName -and $DefaultPassword)
    {
        $Mode = "User"
    } else {
        Write-Error "Context variable set to `"User`" but Auto Admin Logon is not configured!"
    }
} elseif ($Context -ieq "System") {
    $Mode = "System"
} else {
    Write-Error "Incorrect Context variable. It can only be `"User`" or `"System`""
}

If (!(Test-Path $PsExecPath))
{
    Write-Error "Unable to locate PsExec.exe at $PsExecPath. Please make sure it has been copied to this location!"
} else {
    #Get Console Session ID
    $ConsoleSessionID = (Get-ConsoleSessionInfo).SessionID
    if ($ConsoleSessionID)
    {
        If ($Mode -eq "User")
        {
            #run app under the Auto Admin Logon user context
            $strCmd = "$PsExecPath -accepteula -i $ConsoleSessionID -d -u $strUsername -p $DefaultPassword $PathToExe $arguments"
            Write-Host "Executing $strCmd`..."
            Invoke-Expression $strCmd
        } elseif ($Mode -eq "System") {
            #run app under LOCALSYSTEM context
            $strCmd = "$PsExecPath -accepteula -i $ConsoleSessionID -d -s $PathToExe $arguments"
            Write-Host "Executing $strCmd`..."
            Invoke-Expression $strCmd
        }
    } else {
        Write-Error "No one is currently logged on to the console session at the moment."
    }
}

Note: This PowerShell script uses the “query session” command to detect the session ID of the console session.
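To see how the session-detection regular expression behaves, here is a minimal sketch that applies the same pattern to sample text instead of calling “query session”. The function name `ConvertFrom-QuerySessionOutput` is hypothetical, and the sample lines are assumptions about what the command output typically looks like; the exact column layout may differ on your systems.

```powershell
# Hypothetical helper; applies the same pattern as the recovery script
# to lines of text that are passed in, so no live session is required.
function ConvertFrom-QuerySessionOutput {
    param([string[]]$Lines)
    foreach ($line in $Lines) {
        if ($line -match 'console\s+(\w+)\s+(\d+)\s+(\w+)') {
            return New-Object psobject -Property @{
                UserName  = $Matches[1]
                SessionID = $Matches[2]
                State     = $Matches[3]
            }
        }
    }
    # No console line with a user name was found
    return $null
}

# Assumed sample output: a user is logged on to the console session.
$loggedOn = @(
    ' SESSIONNAME       USERNAME                 ID  STATE   TYPE        DEVICE',
    ' services                                    0  Disc',
    '>console           Administrator             2  Active'
)

# Assumed sample output: nobody is logged on, so the user name column is empty.
$nobody = @(
    ' SESSIONNAME       USERNAME                 ID  STATE   TYPE        DEVICE',
    ' console                                     1  Conn'
)

ConvertFrom-QuerySessionOutput -Lines $loggedOn
ConvertFrom-QuerySessionOutput -Lines $nobody
```

Because the pattern requires a user name between “console” and the numeric session ID, a console line without a user does not match and the function returns nothing. That matches the behaviour of the recovery script, which reports an error when no one is logged on to the console.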

Note: When you save the configuration of this module, please ignore this error:

image

Add the following items under the Configuration Schema tab:

image

Note: Make sure “TimeoutSeconds” type is set to “Integer” and others are set to “String”

I also defined “TimeoutSeconds” as an overridable parameter:

image

Finally, set the Accessibility to Public (so it can be used in other management packs once this management pack is sealed):

image

07. Create a recovery task that runs after the diagnostic task I created in step 5.

image

  • This recovery task has 2 modules: a condition detection module (System.ExpressionFilter) and an Actions module (From the Write Actions module I created from Step 6)

image

    • Condition Detection Module (System.ExpressionFilter):
    • image
    • Click Edit and add below:

<Expression>
<SimpleExpression>
<ValueExpression>
<XPathQuery Type="String">Diagnostic/DataItem/Property[@Name='Result']</XPathQuery>
</ValueExpression>
<Operator>Equal</Operator>
<ValueExpression>
<Value Type="String">Positive</Value>
</ValueExpression>
</SimpleExpression>
</Expression>

image

Actions Module (the module type is the write action module created in Step 6):

PsExecPath: Path to PsExec.exe on the target computer
PathToExe: The executable that you want PsExec to run
Context: 2 possible values, “User” or “System”
Arguments: Arguments for the executable that PsExec is executing
TimeoutSeconds

image

Note: Regarding the Context variable, I designed the script so that PsExec launches the executable either under LOCALSYSTEM (with the -s switch in PsExec) or under the user that’s configured for Auto Admin Logon (with the -u <username> and -p <password> switches in PsExec). When Auto Admin Logon is enabled, the default username and password are stored in the registry key HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon. If “Context” is set to “User”, the script reads the username and password from the registry and passes them to PsExec. So, if Auto Admin Logon is not configured, the script won’t work when “Context” is set to “User”.
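As an illustration of how the two contexts translate into PsExec switches, here is a small sketch that builds the argument list as an array. The function name `Get-PsExecArgumentList` is hypothetical and this is not the code used in the module above; it only uses the switches already shown in the script (-accepteula, -i, -d, -s, -u, -p). One possible advantage of an array is that it can be passed to PsExec with the call operator, so values such as a password are not re-parsed by Invoke-Expression.

```powershell
# Hypothetical helper; builds the PsExec arguments for either context.
function Get-PsExecArgumentList {
    param(
        [string]$Mode,        # "System" or "User"
        [string]$SessionID,   # console session ID
        [string]$PathToExe,
        [string]$Arguments,
        [string]$UserName,    # only used when Mode is "User"
        [string]$Password     # only used when Mode is "User"
    )
    # Switches common to both contexts
    $argList = @('-accepteula', '-i', $SessionID, '-d')
    if ($Mode -eq 'System') {
        $argList += '-s'
    } elseif ($Mode -eq 'User') {
        $argList += @('-u', $UserName, '-p', $Password)
    } else {
        throw "Unsupported mode '$Mode'"
    }
    $argList += $PathToExe
    if ($Arguments) { $argList += $Arguments }
    return $argList
}

$psExecArgs = Get-PsExecArgumentList -Mode 'System' -SessionID '2' -PathToExe 'C:\Windows\notepad.exe'
# On a machine where PsExec is available, the array could be passed like this:
# & $PsExecPath @psExecArgs
$psExecArgs -join ' '
```

The session ID and executable path in the usage lines are example values only.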

image

Note: In this example, the recovery task simply launches notepad.exe on the console session. I can also tell notepad to open a txt file by adding the path of the txt file to “Arguments”.

Note: This recovery task will error out if no one has logged on to the console session of the target computer.

Now that everything is set up, it is time to put it to the test.

image

From the screen capture, I can see the monitor’s health became Error at 10:44pm on 27/01/2012. After the diagnostic task determined that no notepad.exe was running, the recovery task kicked in and, at 10:45pm, launched notepad.exe on the console session (session ID 2). The PID of notepad.exe is 4000.

Now, when I go to the target computer, notepad is launched on the console session and I can easily get the details of notepad.exe process:

image

You can see from the above screen capture that notepad.exe was started at the same time as the recovery task ran, the session ID is 2, the owner is the account configured for Auto Admin Logon and the process ID is the same as in the output from PsExec. Therefore, this instance of notepad.exe is the one started by the recovery task!

I’ve attached the 2 scripts used in the diagnostic and recovery tasks below, as well as my sample unsealed MP.

Download From Here

Please feel free to contact me if you have any questions or suggestions.

2 comments on “SCOM: Monitoring an Interactive Process and The Recovery Task”

  1. Tao,
    I was looking for a way to display the number of process instances in a dashboard and came across your blog. I see from your script you are getting the number of process instances, would you happen to know how I could display this data in a dashboard?
    So it would have
    Servername Process Name Process Instance count

    Thanks for any help.

  2. Seth,
    This is possible but there are some catches. Two quick possible approaches come to mind and both involve authoring.
    1) Create a custom class representing a server instance(s). Probably easiest to use WindowsComputer as your base class. Give your custom class some properties (type: Int) that represent your specific services, assuming you have a finite list of specific services that you care about.
    Write a script discovery to confirm the processes’ existence and enumerate their count. Create a state view for your custom class type and personalize it to show all the interesting services and their counts.
    Note: For this type of data/dashboard to be useful you would need to refresh it fairly often; that is, run the discovery on a short interval. However, there are drawbacks to running discoveries too frequently that relate to agent load and storage of config changes in the DW. That discussion is outside the scope of this post.
    2) Almost the same as #1 above, but you could extend a base class instead of creating a new custom class. I’ve never had to do this but it is technically possible. #1 is probably easier (at least for someone like me who’s never had to extend a class).

    What might be a better solution would be to simply configure a Task (targeting Windows Computer or Server class) that runs a simple POSH script to nicely display the processes and their count. There are plenty of examples online of such a script.

    Although this thread has some age on it, there might be others who could benefit from this reply. Hopefully so. Cheers.
