First of all, apologies for not having been able to blog for over a month. I changed jobs. I started my new job 4 weeks ago and, with everything going on, I couldn’t find time to sit down and write blog articles. I am now working in a large “System Center Shop” which runs SCCM, SCOM, SCVMM, Opalis, SCDPM and Hyper-V across sites spread nationwide over Australia. So hopefully my future posts will cover other System Center products as I get my hands on them, not just SCCM and SCOM.
Anyway, in the last couple of days I noticed that on the 2-node RMS cluster at work, whenever RMS is running on Node B, Event ID 29104 is logged in the Operations Manager event log:
“OpsMgr Config Service failed to send the dirty state notifications to the dirty OpsMgr Health Services. This may be happening because the Root OpsMgr Health Service is not running.”
This event was only generated when RMS was running on Node B. Also, when Node B was the active node, the RMS health state was greyed out in the SCOM console. If I failed RMS over to Node A, everything was fine.
After spending some time troubleshooting the issue, I found that some registry entries did not match between the 2 nodes. They are under HKLM\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Server Management Groups\<Management Group Name>\
On Node B, “IsRootHealthService” was set to “0”, whereas on Node A it was set to “1”.
Also on Node B, there was a set of subkeys under “Parent Health Services”. This set of keys should not exist on an RMS.
So, to fix the issue, I first failed RMS over to Node A, then changed “IsRootHealthService” on Node B to “1” and deleted the “Parent Health Services” key from Node B. After that, I failed RMS back to Node B. The 29104 events were no longer being logged and the RMS health state was no longer grey.
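If you want to check for this kind of mismatch without eyeballing regedit on both nodes, something like the following Python sketch might help. It is only an illustration, not what I actually ran: the function names are my own, `read_values` assumes the Remote Registry service is reachable on each node, and it only compares values directly under the key (it does not look at subkeys such as “Parent Health Services”).

```python
def read_values(computer, subkey):
    """Read all values directly under HKLM\\<subkey> on a (remote) computer.

    Hypothetical helper: Windows only, and assumes the Remote Registry
    service is running on the target node.
    """
    import winreg  # imported here so the rest of the file loads on any OS

    with winreg.ConnectRegistry(computer, winreg.HKEY_LOCAL_MACHINE) as hive:
        with winreg.OpenKey(hive, subkey) as key:
            value_count = winreg.QueryInfoKey(key)[1]
            values = {}
            for index in range(value_count):
                name, data, _type = winreg.EnumValue(key, index)
                values[name] = data
            return values


def diff_registry_values(node_a, node_b, missing="<missing>"):
    """Return {name: (value_on_a, value_on_b)} for every value that differs."""
    differences = {}
    for name in sorted(set(node_a) | set(node_b)):
        a_value = node_a.get(name, missing)
        b_value = node_b.get(name, missing)
        if a_value != b_value:
            differences[name] = (a_value, b_value)
    return differences
```

You would call `read_values` once per node (with the computer name in the form `\\NODEA`, and the subkey path from above with your own management group name filled in) and pass the two dictionaries to `diff_registry_values`. In my case I would have expected “IsRootHealthService” to show up as `(1, 0)`.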
Again, I did not consult Microsoft on this one. Please take a backup of the registry keys before you change them. I am not responsible for any damage this may cause.
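For the backup, one option is the built-in `reg export` command, run locally on the node you are about to change. Below is a small Python sketch that wraps it. The function names and the backup file path are my own assumptions, and the key path you pass in needs your own management group name.

```python
import subprocess


def build_reg_export_command(key_path, backup_file):
    """Build the 'reg export' command line; /y overwrites an existing file."""
    return ["reg", "export", key_path, backup_file, "/y"]


def backup_key(key_path, backup_file):
    """Export the key to a .reg file; raises CalledProcessError on failure.

    Windows only: relies on reg.exe being available on the node.
    """
    subprocess.run(build_reg_export_command(key_path, backup_file), check=True)
```

For example, `backup_key(r"HKLM\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Server Management Groups", r"C:\Temp\mg-backup.reg")` would export that whole branch, which can be re-imported later if something goes wrong.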