Monday, June 22, 2015

Health Service Failure Runbook



Every time I apply a SCOM 2012 R2 update I go to the best source on the web, Kevin Holman’s Step by Step series.



Here is my experience: the issues I encountered and, of course, the solution.

After running all SCOM server-related updates as per Kevin’s document, I was ready to update all the managed agents waiting in the Pending Management pane.

Half updated with no issues, but the other half (200+ agents) would generate the following error:


The Agent Management Operation Agent Install failed for remote computer xxxx
Install account: xxx\ScomAction
Error Code: 8007041D
Error Description: The service did not respond to the start or control request in a timely fashion.
Microsoft Installer Error Description:
For more information, see Windows Installer log file "C:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\AgentManagement\AgentLogs\AgentInstall.LOG
C:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\AgentManagement\AgentLogs\AgentPatch.LOG
C:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\AgentManagement\AgentLogs\MOMAgentMgmt.log" on the Management Server.




The key words here are in the error description: “The service did not respond to the start or control request in a timely fashion.”
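The error code points the same way as the description. As a small illustration (Python is used only for the sketch, and the function name is hypothetical), 8007041D can be split into the standard HRESULT fields: facility 7 marks a wrapped Win32 error, and the low 16 bits, 0x041D, are decimal 1053, the Win32 code for a service that did not respond to a start or control request in time.

```python
def decode_hresult(hex_code: str) -> dict:
    """Split a 32-bit HRESULT (given as hex text) into its standard fields."""
    value = int(hex_code, 16)
    return {
        "failure": bool(value >> 31),       # top bit set means failure
        "facility": (value >> 16) & 0x7FF,  # 7 = FACILITY_WIN32
        "code": value & 0xFFFF,             # underlying Win32 error number
    }


# Error code taken from the agent install failure message above.
print(decode_hresult("8007041D"))
```

So the install failure is really a service start timeout reported through the installer, which matches the stopped HealthService found on the agents.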

When I checked the services on the affected agents, the Microsoft Monitoring Agent service (HealthService) was indeed stopped.

That, of course, generated a “Health Service Heartbeat Failure” alert, and if the agent was a cluster node or Domain Controller, a plethora of other Critical alerts came with it.

The funny part is that SCOM had successfully updated the agent but failed to restart the Health Service. That presented a challenge, since nothing can be done from within the SCOM console to resuscitate the now greyed-out agent.

A quick fix is to start the Microsoft Monitoring Agent service remotely, but that is impractical with a population of 400+ agents.



The Solution:


I created a SCORCH 2012 R2 Runbook to start the Microsoft Monitoring Agent service.




Under the hood:

The Runbook listens for the ‘Health Service Heartbeat Failure’ alert.
It pings the server to make sure it has not been shut down or rebooted.
If the ping fails, an Information alert is created, mainly so that it won’t interfere with the ‘Failed to Connect to Computer’ Critical alert that is generated immediately after.
If the ping succeeds, the information is passed to the next activity, which starts the Microsoft Monitoring Agent service.
As the last step, the Runbook closes the ‘Health Service Heartbeat Failure’ alert and writes ‘Closed by SCORCH’ in Custom Field 1 as a success stamp.
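The flow above can be sketched outside Orchestrator as follows. This is only an illustration of the decision logic in Python, not the Runbook itself: the function names and the returned action strings are hypothetical, and the "leave-alert-open" branch for a failed service start is an assumption, since the steps above do not say what happens in that case.

```python
import platform
import subprocess

SERVICE_NAME = "HealthService"  # short name of the Microsoft Monitoring Agent service


def ping_ok(host: str) -> bool:
    """Send one echo request; True if the ping command exits with code 0."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    done = subprocess.run(["ping", count_flag, "1", host],
                          stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return done.returncode == 0


def start_agent_service(host: str) -> bool:
    """Ask the remote Service Control Manager to start the service (Windows sc.exe)."""
    done = subprocess.run(["sc", f"\\\\{host}", "start", SERVICE_NAME],
                          stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return done.returncode == 0


def handle_heartbeat_failure(host, ping=ping_ok, start=start_agent_service) -> str:
    """Return the action taken for one 'Health Service Heartbeat Failure' alert."""
    if not ping(host):
        # Server is down or rebooting: only raise an Information alert.
        return "raise-information-alert"
    if start(host):
        # Service started: close the alert and stamp Custom Field 1.
        return "close-alert:Closed by SCORCH"
    # Assumption: if the service cannot be started, the alert stays open.
    return "leave-alert-open"
```

The ping and start steps are passed in as parameters so the branching can be tried with stand-ins before pointing it at real servers.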



Disclaimer: 
All software and information is provided “AS IS” with no warranties. Use at your own risk! Please test it in a Lab environment first!






Tuesday, June 9, 2015

AD User Attribute Changes Audit Report



The following SCOM 2012 R2 ACS report provides detailed attribute changes done to any Active Directory user.
The Challenge
There are many reports that provide similar information included in SCOM Audit reports. One example is located in Reporting>>Audit Reports>>DAC_-_Object_Attribute_Changes

However, there are thousands of AD attributes, including hundreds of user-related ones, which makes the above-mentioned report very convoluted. The sometimes cryptic attribute names, values and operation descriptions add to the complexity, making the report hard to read, especially for the non-technical people who are, most of the time, the recipients of SCOM reports.

Sample out-of-the-box report



The Solution:

The attached report focuses on AD User attributes displayed via Outlook which are a representation of LDAP fields and are by far the most commonly modified.

Most Common User Attributes




In my report I have replaced all AD User attributes with user-friendly names.


AD User Attribute Name         Friendly Name
displayName                    Display Name
givenName                      First Name
initials                       Initials
sn                             Last Name
mailNickname                   Email Alias
streetAddress                  Address
description                    Description
title                          Title
company                        Company
department                     Department
physicalDeliveryOfficeName     Office
msExchAssistantName            Assistant
telephoneNumber                Phone Number
l                              City
st                             State/Province
postalCode                     Zip/Postal Code
co                             Country/Region
thumbnailPhoto                 Photo
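The substitution in the table amounts to a simple lookup. The sketch below shows the idea in Python; it is not how the report itself does the replacement (that happens inside the report definition, which is not shown here), and the function name is hypothetical. Keys are the LDAP attribute names in lower case so the lookup is case-insensitive.

```python
FRIENDLY_NAMES = {
    "displayname": "Display Name",
    "givenname": "First Name",
    "initials": "Initials",
    "sn": "Last Name",
    "mailnickname": "Email Alias",
    "streetaddress": "Address",
    "description": "Description",
    "title": "Title",
    "company": "Company",
    "department": "Department",
    "physicaldeliveryofficename": "Office",
    "msexchassistantname": "Assistant",
    "telephonenumber": "Phone Number",
    "l": "City",
    "st": "State/Province",
    "postalcode": "Zip/Postal Code",
    "co": "Country/Region",
    "thumbnailphoto": "Photo",
}


def friendly_name(attribute: str) -> str:
    """Return the user-friendly label, or the raw attribute name if unmapped."""
    return FRIENDLY_NAMES.get(attribute.lower(), attribute)


print(friendly_name("physicalDeliveryOfficeName"))
```

Attributes outside the table fall through unchanged, so nothing is hidden from the reader of the report.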


Sample Report

Mundo SCOM AD User Attribute Changes Report

The report takes two parameters, each entered between two % signs: ‘User Name Contains’ (Affected User) and/or ‘Attribute Name Contains’ (Changed Attribute). Enter just %% to get all possible results.
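The % signs behave like wildcards. Assuming the report filters with a SQL LIKE-style comparison (the query itself is not shown in this post, so this is an assumption), the matching can be imitated in Python like this; the function name is hypothetical.

```python
import re


def like_match(pattern: str, text: str) -> bool:
    """Imitate a case-insensitive SQL LIKE with the % and _ wildcards."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")          # % matches any run of characters
        elif ch == "_":
            parts.append(".")           # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    regex = "".join(parts)
    return re.fullmatch(regex, text, flags=re.IGNORECASE | re.DOTALL) is not None


print(like_match("%smith%", "John Smith"))  # value wrapped in % signs: "contains"
print(like_match("%%", "any user at all"))  # bare %% matches everything
```

This is why a value wrapped in % signs acts as a "contains" filter and a bare %% returns every row.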



Preparing the AD environment:

How to enable AD Object Auditing, Audit Policies or Advanced Audit Policies setup is out of the scope of this post. 

However, here is a quick description of what is needed in order to produce the report:
On your Domain Controllers, enable the ‘Directory Service Changes’ Audit Policy Subcategory, which is part of the Directory Service Audit Policy Category. Make sure to enable both.

AD object attribute changes are captured in Event ID 5136 (“A directory service object was modified”), which is part of the above Subcategory.
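To give an idea of what a 5136 event carries, here is a minimal Python sketch that turns the event's data fields into one readable line. The field names are those of the Windows event (SubjectUserName, ObjectDN, AttributeLDAPDisplayName, AttributeValue, OperationType); the sample values and the function name are made up for illustration.

```python
# OperationType arrives as a message token rather than plain text.
OPERATION_TYPES = {"%%14674": "Value Added", "%%14675": "Value Deleted"}


def describe_change(event_data: dict) -> str:
    """Summarise one Event ID 5136 record as a single readable line."""
    raw_op = event_data["OperationType"]
    operation = OPERATION_TYPES.get(raw_op, raw_op)
    return (f'{event_data["SubjectUserName"]}: {operation} '
            f'{event_data["AttributeLDAPDisplayName"]} = {event_data["AttributeValue"]} '
            f'on {event_data["ObjectDN"]}')


# Hypothetical sample event, not taken from a real log.
sample = {
    "SubjectUserName": "jdoe",
    "ObjectDN": "CN=Jane Roe,OU=Staff,DC=example,DC=com",
    "AttributeLDAPDisplayName": "title",
    "AttributeValue": "Manager",
    "OperationType": "%%14674",
}
print(describe_change(sample))
```

A change to an existing value normally shows up as a pair of events, one "Value Deleted" for the old value and one "Value Added" for the new one.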

Enable auditing for all users via GPO, or manually for a small number of users.

For a single user, go to the ‘Advanced’ security settings, Auditing tab, and add ‘Write all properties’.


Disclaimer: 
All software and the information is provided “AS IS” with no warranties. Use at your own risk! Please test it in a Lab environment first!




Monday, April 13, 2015

BlackBerry BES 12 Management Pack for SCOM 2012 R2


Update:

Thanks all for your feedback.

<<<For those asking for a customized MP, you can email me directly if you wish to "brand" this or any MP with your company name instead of "MundoSCOM".>>>.


This management pack is for monitoring BlackBerry Enterprise Server version 12.

This MP is designed for BES 12 servers that have been upgraded from BES version 5. This setup is done in order to manage BlackBerry legacy and BB10 devices.

(Link to download xml file below)

Console View:



Discoveries:

This management pack uses a seed class that searches for the following registry key:
HKLM\SOFTWARE\Wow6432Node\BlackBerry\BES12.
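To check by hand whether a server would be picked up by the seed discovery, the same registry test can be run locally. The sketch below is only a stand-alone illustration in Python using the standard winreg module (the function name is hypothetical); the management pack itself performs the check through its own discovery, not through this script.

```python
import sys

# Key path from the discovery above, relative to HKEY_LOCAL_MACHINE.
SEED_KEY = r"SOFTWARE\Wow6432Node\BlackBerry\BES12"


def bes12_seed_key_present() -> bool:
    """True if the BES12 registry key exists on this machine."""
    if sys.platform != "win32":
        return False  # no Windows registry on other platforms
    import winreg  # standard library, Windows only
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, SEED_KEY):
            return True
    except OSError:
        # Raised when the key does not exist or cannot be opened.
        return False


print(bes12_seed_key_present())
```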


 Monitored BES and BES12 Services:



NOTE: All Monitors are enabled by default. In a Cluster setup, monitors for services set to ‘Manual’ could be disabled in order to avoid alerts from the passive node when servers are rebooted.


Distributed Application:



Disclaimer: 
The Management Pack and the information is provided “AS IS” with no warranties. Use at your own risk!

Link:



Thursday, April 9, 2015

SCOM 2012 R2 Command Notification Channel using PowerShell Fails


This issue may be common, but I couldn't find any information about it on the various SCOM blogs, so I decided to post it.

Problem:
I added a second Management Server (MS) to my lab management group.
I have some Command channels that execute different PowerShell scripts, stored locally on the RMSe, which are triggered by an event rule.
The alert from the trigger rule was generated successfully, but the PowerShell scripts were not executed.

Solution:

1. The new MS is automatically added to the “Notifications Resource Pool”, so it needs the same scripts copied locally (same folder structure on both management servers).





2. The Notifications account needs appropriate permission on the folder/share where these scripts are stored to execute them.




3. To keep the same content in both script folders, I created a batch file with a Robocopy command, which is executed weekly via a Scheduled Task.




       Command in batch file (a single line):
       robocopy "C:\SCOM\ScriptFolder" "\\MS2\C$\SCOM\ScriptFolder" /E /ZB /X /PURGE /COPYALL /TEE /LOG:E:\Copy_from_HD_to_Ext_HD.log

       Meaning of the switches used in the above command:

       /E        :: copy subdirectories, including Empty ones.
       /ZB       :: use restartable mode; if access denied use Backup mode.
       /X        :: report all eXtra files, not just those selected.
       /PURGE    :: delete dest files/dirs that no longer exist in source.
       /COPYALL  :: COPY ALL file info (equivalent to /COPY:DATSOU).
       /TEE      :: output to console window, as well as the log file.
       /LOG:file :: output status to LOG file (overwrite existing log).
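Because a missing or stale script on one management server fails silently, it can be worth checking now and then that the two folders really match. The following Python sketch is an optional add-on, not part of the original setup; the function names are hypothetical. It compares both trees by relative path and SHA-256 hash.

```python
import hashlib
from pathlib import Path


def tree_digest(root: str) -> dict:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    base = Path(root)
    return {
        p.relative_to(base).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(base.rglob("*"))
        if p.is_file()
    }


def out_of_sync(source: str, dest: str) -> list:
    """List relative paths that are missing, extra, or different between the trees."""
    src, dst = tree_digest(source), tree_digest(dest)
    return sorted(name for name in src.keys() | dst.keys()
                  if src.get(name) != dst.get(name))


# Example call with the folders from the Robocopy command above:
# print(out_of_sync(r"C:\SCOM\ScriptFolder", r"\\MS2\C$\SCOM\ScriptFolder"))
```

An empty list means both management servers hold identical scripts; anything else names the files to look at.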



      Thanks to ITBloggerTips for posting this very useful Robocopy command!!

     http://itbloggertips.com/2013/05/robocopy-command-copy-only-new-changed-files-sync-both-the-drive/


Disclaimer:
The information is provided “AS IS” with no warranties. Use at your own risk!

