JAMA00

SCOM 2007 R2

Archive for the ‘Agent Settings’ Category

Agent proxy

Posted by MarcKlaver on July 1, 2011

Until now we enabled the agent proxy setting only for agents that required it, and we used a script to do so (see this link for more information). Now Microsoft has come up with something new in the Exchange 2010 management pack: it will not discover anything until agent proxying is enabled on the Exchange servers, so we can no longer enable it afterwards. That left us with a choice:

  1. Manually enable the agent proxy setting for all Exchange 2010 servers (now and in the future), which means an Exchange 2010 server will not be discovered until we do.
  2. Enable the agent proxy setting for all agents.

At the moment, around 60 percent of our agents already have proxying enabled. So what is the advantage of not enabling it by default for all agents? From a security point of view, the setting already has to be enabled on all security-sensitive servers (AD, Exchange, ISA, Citrix, etc.). And since we do not know when an Exchange server will be connected to our environment, we decided to enable it for all agents.

This is the script to do it:

$rootMS = "RMS.TEST.LOCAL"

#-------------------------------------------------------------------------------
# Add the Operations Manager snap-in and connect to the root management server.
#-------------------------------------------------------------------------------
Add-PSSnapin "Microsoft.EnterpriseManagement.OperationsManager.Client"
Set-Location "OperationsManagerMonitoring::"
New-ManagementGroupConnection -ConnectionString:$rootMS

# Enable proxying for every agent where it is currently disabled.
# ProxyingEnabled is a boolean, so test it directly instead of matching on text.
$NoProxy = Get-Agent | Where-Object { -not $_.ProxyingEnabled }
$NoProxy | ForEach-Object { $_.ProxyingEnabled = $true }
$NoProxy | ForEach-Object { $_.ApplyChanges() }

See also Kevin Holman’s blog

Posted in Agent Settings | 2 Comments »

What is the optimal setting for my environment when it comes to missed heartbeats?

Posted by rob1974 on July 14, 2010

I’m probably not the first to write about heartbeats and how they work, so here is just a quick overview of the mechanism.

Heartbeat mechanism: an agent sends a heartbeat to a management server on the monitoring port, 5723. The management server (the RMS, actually) keeps track of the last heartbeat received from each agent in its management group. When an agent’s heartbeat hasn’t been received for X time, a “heartbeat failure” alert is generated. The heartbeat failure alert in turn triggers a ping to the agent-managed server’s FQDN. When this fails as well, another alert is generated: the “server unreachable” alert.

This post is about X, the time to wait before a heartbeat failure alert is generated. X isn’t directly configurable; it is the product of two settings: the agent setting “heartbeat interval” (in seconds) and the server setting “number of missed heartbeats allowed”. The defaults are 60 seconds and 3 missed heartbeats, so by default X is 60 × 3 = 180 seconds, or 3 minutes.
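To make the arithmetic concrete, here is a minimal sketch in plain PowerShell. Get-HeartbeatAlertDelay is a hypothetical helper written for this illustration, not a SCOM cmdlet; it only multiplies the two settings and does not read them from the management group.

```powershell
# Hypothetical helper (not part of the SCOM snap-in): computes X from the
# agent heartbeat interval and the number of missed heartbeats allowed.
function Get-HeartbeatAlertDelay {
    param(
        [int]$IntervalSeconds = 60,        # agent setting, default 60 seconds
        [int]$MissedHeartbeatsAllowed = 3  # server setting, default 3
    )
    # X = interval x number of missed heartbeats allowed
    New-TimeSpan -Seconds ($IntervalSeconds * $MissedHeartbeatsAllowed)
}

# Defaults: 60 x 3 = 180 seconds = 3 minutes
(Get-HeartbeatAlertDelay).TotalMinutes

# Allowing 8 missed heartbeats at the same interval: 60 x 8 = 480 seconds = 8 minutes
(Get-HeartbeatAlertDelay -MissedHeartbeatsAllowed 8).TotalMinutes
```

Returning a TimeSpan rather than a bare number keeps the unit explicit, which helps because the interval is configured in seconds while the rest of this post talks about X in minutes.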

I assume the default is fine in most cases, but if you have a large and complex network, or reboot servers without putting them in maintenance mode, it can generate a lot of noise. So here’s what we did to find the optimal value for X without having to experiment with the settings themselves.

By running the query below you can see the number of alerts for “Health Service HeartBeat Failure” and “Failed to Connect to Computer” with a closed status.

SELECT AlertStringName, COUNT(*) AS Number_of_alerts FROM AlertView
WHERE ResolutionState = 255
AND (AlertStringName = 'Health Service HeartBeat Failure' OR AlertStringName = 'Failed to Connect to Computer')
GROUP BY AlertStringName
ORDER BY 2 DESC

If you haven’t changed the retention for closed alerts, this covers about one week. For our environment it turned out to be around 3000 heartbeat alerts, roughly one alert per server per week. However, most of these alerts were gone before anyone had looked at the “problem”.

In an earlier post I gave some queries to identify auto-closing alerts. I modified the query a bit to show only the heartbeat failure and failed to connect to computer alerts.

By running the query below you can see the number of heartbeat failure alerts which were closed within 2 minutes after creation.

SELECT AlertStringName, COUNT(*) AS Number_of_alerts FROM AlertView
WHERE ResolutionState = 255
AND ResolvedBy = 'system'
AND (AlertStringName = 'Health Service HeartBeat Failure' OR AlertStringName = 'Failed to Connect to Computer')
AND DATEDIFF(MI, TimeRaised, TimeResolved) <= 2
GROUP BY AlertStringName
ORDER BY 2 DESC

In my environment this query showed that 1450 alerts were auto-closed within the first 2 minutes. These agents recovered within 2 minutes of passing the default 3-minute threshold, so if X had been 5 minutes, those 1450 alerts would probably never have been raised.
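The same reasoning can be repeated for other values of X. The sketch below is only an illustration: the alert lifetimes are made-up sample numbers (not data from our environment), Get-RemainingAlertCount is a hypothetical helper, and it assumes that an alert which auto-closed within d minutes would not have been raised at all if X had been d minutes longer.

```powershell
# Hypothetical helper (not a SCOM cmdlet): estimates how many alerts would
# still be raised if X were changed from $CurrentX to $CandidateX minutes.
function Get-RemainingAlertCount {
    param(
        [int[]]$LifetimeMinutes,  # minutes between TimeRaised and TimeResolved
        [int]$CandidateX,         # value of X to evaluate, in minutes
        [int]$CurrentX = 3        # value of X in effect when the alerts were raised
    )
    $extraWait = $CandidateX - $CurrentX
    # Only alerts that stayed open longer than the extra wait would still appear.
    @($LifetimeMinutes | Where-Object { $_ -gt $extraWait }).Count
}

# Made-up sample lifetimes in minutes; replace them with your own query results.
$sample = 1, 1, 2, 2, 4, 5, 9, 30, 120

foreach ($x in 3, 5, 8, 15) {
    $left = Get-RemainingAlertCount -LifetimeMinutes $sample -CandidateX $x
    "X = {0} min: {1} of {2} alerts remain" -f $x, $left, $sample.Count
}
```

With real data, the lifetimes would come from the difference between TimeRaised and TimeResolved for the auto-closed alerts, the same columns the query above filters on.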

I’ve plotted X against the expected number of heartbeat failures. Please note that I left out quite a lot of values for X without adjusting the scale, and that even with X at 1 hour I would still get a few heartbeat failures.

image

So what’s the optimal X for this environment?

Actually, you still can’t say what the setting should be. It depends on what is acceptable for your environment, because we’re talking about how fast you can detect that a server is down. Setting X to 60 minutes would give us the fewest heartbeat alerts, but it wouldn’t make any sense either. I believe finding a balance between noise and the moment we really have to take a look is more important; I’d say the optimal X for my environment is 7-8 minutes. This will leave about 800 heartbeat alerts weekly, which is acceptable for us.

Also note that you might miss unexpected reboots whatever the value of X is. If it’s important not to miss them, pick up the unexpected-reboot event from the System event log with an alert rule and make that alert critical.

Posted in Agent Settings, Management Servers | 4 Comments »

Setting agent proxying

Posted by MarcKlaver on January 21, 2010

Several management packs (if not all) require you to change the agent’s security settings to allow the agent to act as a proxy:

proxysetting

We are using a slightly modified version of the Operations Manager Support Team blog script, which will enable this setting, based on a class name. We run this script on a daily basis.

But then the question is: how do I know which class I can use?

When an agent should have the proxy setting enabled but does not, an alert is generated on the management server. It looks like this:

image

 Note: We have lowered the severity to warning instead of the default critical.

In the “Alert Description” you can find which management pack wants the proxy setting enabled (Microsoft Windows DFSReplication in this example). When you export this management pack to XML, you can look up the required class.
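Listing the class names in an exported management pack can be scripted. The sketch below assumes the export follows the usual management pack layout, with classes declared as ClassType elements under TypeDefinitions/EntityTypes/ClassTypes. The embedded XML is a trimmed, made-up stand-in, and the Example.* IDs are not real classes.

```powershell
# Trimmed, made-up stand-in for an exported management pack (illustration only).
$sampleXml = @'
<ManagementPack>
  <TypeDefinitions>
    <EntityTypes>
      <ClassTypes>
        <ClassType ID="Example.Replication.Service" Accessibility="Public" Abstract="false" Hosted="true" Singleton="false" />
        <ClassType ID="Example.Replication.Volume" Accessibility="Public" Abstract="false" Hosted="true" Singleton="false" />
      </ClassTypes>
    </EntityTypes>
  </TypeDefinitions>
</ManagementPack>
'@

$mp = [xml]$sampleXml

# Collect the ID attribute of every class the management pack defines.
$classIds = @($mp.SelectNodes("//ClassTypes/ClassType") |
    ForEach-Object { $_.GetAttribute("ID") })

$classIds
```

For a real export you would load the file instead, for example with [xml](Get-Content <path to the exported XML>). The list only shows which classes exist; you still have to pick the one that matches the component named in the alert.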

For our production environment we have now enabled agent proxying for these classes:

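#-------------------------------------------------------------------------------
# Citrix servers for zone data collection.
#-------------------------------------------------------------------------------
Citrix.PresentationServer.ServerRole

#-------------------------------------------------------------------------------
# Cluster nodes.
#-------------------------------------------------------------------------------
Microsoft.Windows.Cluster.Node
Microsoft.Windows.Cluster.Service

#-------------------------------------------------------------------------------
# DFSReplication service
#-------------------------------------------------------------------------------
Microsoft.Windows.DfsReplication.Service

#-------------------------------------------------------------------------------
# DNS Server 2003
#-------------------------------------------------------------------------------
Microsoft.Windows.DNSServer.2003.Server

#-------------------------------------------------------------------------------
# DNS Server 2008
#-------------------------------------------------------------------------------
Microsoft.Windows.DNSServer.2008.Server

#-------------------------------------------------------------------------------
# Exchange 2003 Servers
#-------------------------------------------------------------------------------
Microsoft.Exchange.ServerRole.2003

#-------------------------------------------------------------------------------
# Exchange 2007 discovery helper.
#-------------------------------------------------------------------------------
Microsoft.Exchange2007.Component.DiscoveryHelper

#-------------------------------------------------------------------------------
# ISA server 2006
#-------------------------------------------------------------------------------
Microsoft.ISAServer.2006.Firewall.ServerRole

#-------------------------------------------------------------------------------
# NLB windows 2008.
#-------------------------------------------------------------------------------
Microsoft.Windows.NetworkLoadBalancing.2008.ServerRole

#-------------------------------------------------------------------------------
# SMS 2003
#-------------------------------------------------------------------------------
Microsoft.SMS.2003.Microsoft_SMS_2003_Providers_Installation
Microsoft.SMS.2003.Microsoft_SMS_2003_Site_Database_Servers_Installation

#-------------------------------------------------------------------------------
# System Center Configuration Manager 2007
#-------------------------------------------------------------------------------
Microsoft.SystemCenter.ConfigurationManager.2007.Microsoft_SMSv4_Providers_Installation
Microsoft.SystemCenter.ConfigurationManager.2007.Microsoft_SMSv4_Primary_Site_Servers_Installation
Microsoft.SystemCenter.ConfigurationManager.2007.Microsoft_SMSv4_Site_Database_Servers_Installation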

Posted in Agent Settings, Management Servers | 6 Comments »