I have seen numerous posts floating around regarding the SCOM Availability Reports showing "Monitoring Unavailable" even though the objects were healthy for the time period. For example, I can run the SCOM Availability Report, select "Exchange 2007 Service", select the date range, and expect lots of green, yellow, and red, but instead I see mostly dark gray.
The issue could be caused by the following:
- Bad Calculations during health rollups
- Bad Performance
- The Data Warehouse is behind
- and others
Several blog posts already exist regarding the above issues and can be found with Google/Bing, so we will not be addressing them here. Instead, we will be dealing with one very specific issue, which as far as I can tell is a bug, but I am not going to hold my breath waiting for a fix.
Data Warehouse Availability Aggregation Process
SCOM has a process called the Data Warehouse Availability Aggregation Process. The Data Warehouse Availability Aggregation Process is somewhat complicated with about 10 steps that you probably don't care about. If you do, you can check this diagram, which gives a fairly decent picture of what is going on in the process.
Remember the issues various people continue to have with Management Server Resource Pools becoming unavailable in SCOM 2012? That resource pool unavailability is calculated just like it is for any other object. Inside the Data Warehouse, a table called "dbo.HealthServiceOutage" records the outage data whenever an object becomes unavailable. However, sometimes it forgets to enter an outage end time. And that is the key to our current issue.
So let's take a look at the Health Service Outage table. A row with a NULL EndDateTime means one of two things:
- The object that is "unavailable" is truly still unavailable. This SHOULD be NULL.
- The object is now healthy, but an EndDateTime did not get written. Unless you have a big problem in your environment, you should have very few of these.
Before you change anything, keep the following in mind:
- This is NOT supported by Microsoft
- You SHOULD back up your Data Warehouse before making any changes
- After we make the changes, your Data Warehouse might have to do A LOT of recalculation, causing a hit in performance for a short period of time, or it might cause your Data Warehouse to fall behind for a little while. If you are having performance issues with your Data Warehouse, you should address them first.
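Before touching anything, it helps to see which outage rows are still open. The following is only a sketch of that lookup, run against an in-memory SQLite stand-in rather than the real SQL Server Data Warehouse. The table name and the StartDateTime, EndDateTime, and DWLastModifiedDateTime columns come from this post; ManagedEntityRowId and all of the sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory stand-in for dbo.HealthServiceOutage. The real table lives in the
# SQL Server Data Warehouse; ManagedEntityRowId is an assumed column name.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE HealthServiceOutage (
        ManagedEntityRowId     INTEGER,
        StartDateTime          TEXT,
        EndDateTime            TEXT,
        DWLastModifiedDateTime TEXT
    )
""")
conn.executemany(
    "INSERT INTO HealthServiceOutage VALUES (?, ?, ?, ?)",
    [
        # Made-up sample rows: one closed outage and two open ones.
        (101, "2013-01-05 02:00:00", "2013-01-05 02:30:00", "2013-01-05 02:31:00"),
        (102, "2013-01-07 11:15:00", None, "2013-01-07 11:16:00"),
        (103, "2013-02-01 08:00:00", None, "2013-02-01 08:01:00"),
    ],
)

# Open outages are the rows where no end time was ever written.
open_outages = conn.execute("""
    SELECT ManagedEntityRowId, StartDateTime
    FROM HealthServiceOutage
    WHERE EndDateTime IS NULL
    ORDER BY StartDateTime
""").fetchall()

for entity_id, started in open_outages:
    print(entity_id, started)
```

The query cannot tell you which open rows are stale and which belong to objects that really are still down. You have to make that call by comparing the list against what the console shows as healthy.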
How to Make My Objects Available Again in SCOM Availability Reports
- Gets the rows from the query above
- Updates the DWLastModifiedDateTime to the current UTC Date and Time
- Updates the EndDateTime to match the StartDateTime
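The steps above can be sketched roughly as follows. This is not the original script, just a minimal illustration of the same logic against an in-memory SQLite stand-in for dbo.HealthServiceOutage. ManagedEntityRowId, the helper name close_stale_outages, and the sample rows are all hypothetical. The sketch also restricts the update to an explicit list of entities you have judged to be stale, on the assumption that objects that truly are still unavailable should keep their NULL EndDateTime.

```python
import sqlite3
from datetime import datetime, timezone

# In-memory stand-in for dbo.HealthServiceOutage (assumed column layout).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE HealthServiceOutage (
        ManagedEntityRowId     INTEGER,
        StartDateTime          TEXT,
        EndDateTime            TEXT,
        DWLastModifiedDateTime TEXT
    )
""")
conn.executemany(
    "INSERT INTO HealthServiceOutage VALUES (?, ?, ?, ?)",
    [
        (102, "2013-01-07 11:15:00", None, "2013-01-07 11:16:00"),  # stale: object is healthy now
        (103, "2013-02-01 08:00:00", None, "2013-02-01 08:01:00"),  # object really is still down
    ],
)


def close_stale_outages(conn, stale_entity_ids, now_utc):
    """Hypothetical helper: close open outage rows for the given entities.

    Sets EndDateTime to StartDateTime and stamps DWLastModifiedDateTime
    with the supplied UTC timestamp. Returns the number of rows changed.
    """
    if not stale_entity_ids:
        return 0
    placeholders = ",".join("?" * len(stale_entity_ids))
    cur = conn.execute(
        f"""
        UPDATE HealthServiceOutage
        SET EndDateTime = StartDateTime,
            DWLastModifiedDateTime = ?
        WHERE EndDateTime IS NULL
          AND ManagedEntityRowId IN ({placeholders})
        """,
        (now_utc, *stale_entity_ids),
    )
    conn.commit()
    return cur.rowcount


now_utc = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
updated = close_stale_outages(conn, [102], now_utc)
rows = conn.execute("""
    SELECT ManagedEntityRowId, StartDateTime, EndDateTime, DWLastModifiedDateTime
    FROM HealthServiceOutage
    ORDER BY ManagedEntityRowId
""").fetchall()
print(updated, rows)
```

On the real Data Warehouse the same idea would be a T-SQL UPDATE, with GETUTCDATE() supplying the current UTC time. Treat the column list and the filter here as assumptions to check against your own schema, and take that backup first.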