JAMA00

SCOM 2007 R2

Archive for December, 2009

One or more management servers do not get new updates from the RMS.

Posted by rob1974 on December 22, 2009

We’ve had a few issues where management servers regularly logged event 21024 (requesting new updates) but never received a new update. The affected management server remained functional with its old configuration. This happened when we installed a new management pack, approved an agent or set an override, but not every time, which makes the cause very hard to find. Most likely it has something to do with file locking of the .edb files. We made sure our antivirus wasn’t scanning these files. This seemed to help a bit, but it still happens every now and then.

To solve this issue we had to remove the configuration for this management group on the RMS and restart the health service:

Stop health service

Rename the Health Service State folder (..\System Center Operations Manager 2007\Health Service State)

Start health service

Because the management servers keep on serving their agents, it’s difficult to determine whether a management server has this issue. We used to stumble upon it by accident (e.g. we couldn’t move an agent to a management server, or we kept getting alerts from a rule we had disabled), but we really wanted to know as soon as it happened.

We compared the modification dates of the opsmgrconnector.config.xml (in ..\System Center Operations Manager 2007\Health Service State\Connector Configuration Cache\<management group>\) files and found that the RMS differed quite a bit from the management servers, but the management servers all had more or less the same date.

We found that the difference in modification dates between the management servers and the root management server was always under 24 hours (it looks like a forced configuration update once a day, although it might just be some discovery or our set agent proxy script). The management servers’ configuration XML files were always within 1 minute of each other.

We’ve created and scheduled a script on an agent-managed machine to check the differences between the config files every hour. When a threshold is exceeded, the script writes an event to the application log. The thresholds are shown in the table below.

             informational   warning      critical
Diff MS-RMS  >24 hours       >36 hours    >48 hours
Diff MS-MS   >1 minute       >2 minutes   >5 minutes
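
The threshold check can be sketched roughly as follows. This is not our scheduled script, only a minimal Python illustration of the table: the function names and the server names in the usage note are made up, and in practice the timestamps would come from the modification time of each opsmgrconnector.config.xml (for example via os.path.getmtime).

```python
from datetime import datetime, timedelta

# Thresholds from the table, highest severity first.
MS_RMS_THRESHOLDS = [
    ("critical", timedelta(hours=48)),
    ("warning", timedelta(hours=36)),
    ("informational", timedelta(hours=24)),
]
MS_MS_THRESHOLDS = [
    ("critical", timedelta(minutes=5)),
    ("warning", timedelta(minutes=2)),
    ("informational", timedelta(minutes=1)),
]


def classify(diff, thresholds):
    """Return the highest severity whose threshold is exceeded, or None."""
    diff = abs(diff)
    for severity, limit in thresholds:
        if diff > limit:
            return severity
    return None


def check_config_ages(rms_time, ms_times):
    """Compare config modification times.

    rms_time is the RMS timestamp; ms_times maps a (hypothetical)
    management server name to its timestamp.
    """
    findings = []
    # Each management server against the RMS.
    for name, ms_time in ms_times.items():
        severity = classify(rms_time - ms_time, MS_RMS_THRESHOLDS)
        if severity:
            findings.append((severity, f"{name} vs RMS"))
    # Each pair of management servers against each other.
    names = sorted(ms_times)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            diff = ms_times[first] - ms_times[second]
            severity = classify(diff, MS_MS_THRESHOLDS)
            if severity:
                findings.append((severity, f"{first} vs {second}"))
    return findings
```

Each finding would then be written to the application log with the matching event level, so that a SCOM rule can pick it up.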

 

We’ve chosen to let SCOM pick up these events, as the management servers still accept alerts even when they haven’t received a configuration update recently. Just make sure the initial rules are distributed to the agent.

If you are experiencing the same issue, please vote for the bug report on the Connect site as well.


Posted in Management Servers, troubleshooting | 2 Comments »

Creating computergroups

Posted by rob1974 on December 9, 2009

Our environment consists of 60+ forests over 60+ customers. These customers need to be separated from each other, as the administrators of these customers are in different teams. This post explains our approach to automating as much as possible.

We started off by scripting a manually installed agent. The manual installation takes care of all the settings we need to create groups based on a registry setting, as well as the certificate and the connection to the management servers.

During installation an engineer has to supply a customer string. The customer string is of the form customer_subgroup1[_subgroup2], where subgroup2 is optional. This string is placed in a registry key (called customername), together with a customer value derived from the supplied string (called customerid). The customer id was created to give the reporting guys an easy way to find all servers from 1 customer, but it is useful in our approach to creating groups too.
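
As an illustration, splitting the customer string could look like the Python sketch below. It assumes that customerid is simply the first component of the string; the function name and the dictionary keys other than customername and customerid are invented for this example.

```python
def parse_customer_string(value):
    """Split customer_subgroup1[_subgroup2] into its parts.

    Assumption: customerid is the first underscore-separated component.
    """
    parts = value.split("_")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"not of the form customer_subgroup1[_subgroup2]: {value!r}")
    return {
        "customername": value,
        "customerid": parts[0],
        "subgroup1": parts[1],
        "subgroup2": parts[2] if len(parts) == 3 else None,
    }
```

A string such as acme_prod_dc (a made-up customer) would give customerid acme, subgroup1 prod and subgroup2 dc.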

creating discoveries

First of all, we created discoveries for the registry keys we needed. We simply created these from the console as properties inherited from Windows Computer. After creation we exported the management pack, sealed it and reimported it. The management pack is called jamaAttributes. Sealing is needed to use the discovered values in another management pack (sealing can be done from within the R2 authoring console).

This is what the discovery looks like for customername:

<Discovery ID="jamaAD_MicrosoftWindowsComputer_CustomerNameDiscovery" Enabled="true" Target="MicrosoftWindowsLibrary6172210!Microsoft.Windows.Computer" ConfirmDelivery="false" Remotable="true" Priority="Normal">
  <Category>PerformanceCollection</Category>
  <DiscoveryTypes>
    <DiscoveryClass TypeID="jamaType_WindowsComputer_jamaAttributes">
      <Property TypeID="jamaType_WindowsComputer_jamaAttributes" PropertyID="jamaProperty_JamaTypeWindowsComputerJamaAttributes_CustomerName" />
    </DiscoveryClass>
  </DiscoveryTypes>
  <DataSource ID="AttributeDiscoveryGeneratedByUI828e02077e6d480b9f104cd757ece492" TypeID="MicrosoftWindowsLibrary6172210!Microsoft.Windows.RegistryDiscoverySingleProvider">
    <ComputerName>$Target/Property[Type="MicrosoftWindowsLibrary6172210!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
    <AttributeName>AttributeDiscoveryRulef5afc3f6856842e69c6b078c940f7f25</AttributeName>
    <Path>SOFTWARE\jama\OpsMgr2007\attributes\customer_name</Path>
    <PathType>1</PathType>
    <AttributeType>1</AttributeType>
    <Frequency>86400</Frequency>
    <ClassId>$MPElement[Name="jamaType_WindowsComputer_jamaAttributes"]$</ClassId>
    <InstanceSettings>
      <Settings>
        <Setting>
          <Name>$MPElement[Name="jamaType_WindowsComputer_jamaAttributes"]/jamaProperty_JamaTypeWindowsComputerJamaAttributes_CustomerName$</Name>
          <Value>$Data/Values/AttributeDiscoveryRulef5afc3f6856842e69c6b078c940f7f25$</Value>
        </Setting>
        <Setting>
          <Name>$MPElement[Name="MicrosoftWindowsLibrary6172210!Microsoft.Windows.Computer"]/PrincipalName$</Name>
          <Value>$Target/Property[Type="MicrosoftWindowsLibrary6172210!Microsoft.Windows.Computer"]/PrincipalName$</Value>
        </Setting>
      </Settings>
    </InstanceSettings>
  </DataSource>
</Discovery>

The jama strings are names we gave to these elements afterwards. If you only use the console, they will be unreadable GUIDs. We still have a few of those in our XML, but we don’t use them later on and therefore did not replace them.

creating groups

Our first approach was just to use the console to create a group. As jamaAttributes has been sealed, we can easily create a server-based discovery based on the customername/customerid properties. We have a property that is separated with underscores, and we use it to create several groups for our customers.

The groups were saved to an unsealed MP called jamaGroups.

Group1 = customer (property name = customerid). This group contains all servers from 1 customer. This can be achieved by adding all the subgroups for this customer (creating nested groups) or by adding all the servers directly. We aren’t using nested groups, for reasons I will explain later on.

[Screenshot: ScreenHunter_131]

Subgroup1 = logical unit for a customer. The engineers can define these groups however they think is needed for that customer to make management easier. However, we recommend using Prod(uction), Accept(ance), etc. We also recommend using as few subgroups as possible, and even adding a bogus prod group for small customers. The reason for this is DNS, which will be covered in a future post.

[Screenshot]

Subgroup2 = another logical unit for a customer. This one is optional and doesn’t have many restrictions other than naming conventions. The most used subgroups here are "application" or "location" groups, e.g. domain controllers, Exchange servers, or locations such as Amsterdam, Paris, New York, etc.

[Screenshot: ScreenHunter_134]

However, we soon realised we were missing a lot of alerts, as SCOM no longer monitors with the computer as the top-level entity. The 2 classes we needed to add to our groups were Virtual Servers (clusters) and the Health Watcher.

adding virtual servers

Virtual servers were also missing when we scoped on a group. However, the same registry keys are available when we extend virtual servers with the property as well. The group discovery now includes virtual server and computer objects, using 2 discoveries.

The xml of extending virtual servers with our customername property:

<Discovery ID="jamaAD_MicrosoftWindowsClusterVirtualServer_CustomerNameDiscovery" Enabled="true" Target="MicrosoftWindowsClusterLibrary6172210!Microsoft.Windows.Cluster.VirtualServer" ConfirmDelivery="false" Remotable="true" Priority="Normal">
  <Category>PerformanceCollection</Category>
  <DiscoveryTypes>
    <DiscoveryClass TypeID="jamaType_VirtualServer_jamaAttributes">
      <Property TypeID="jamaType_VirtualServer_jamaAttributes" PropertyID="jamaProperty_JamaTypeVirtualServerJamaAttributes_CustomerName" />
    </DiscoveryClass>
  </DiscoveryTypes>
  <DataSource ID="AttributeDiscoveryGeneratedByUI246f3178580348039635fc6d8409da94" TypeID="MicrosoftWindowsLibrary6172210!Microsoft.Windows.RegistryDiscoverySingleProvider">
    <ComputerName>$Target/Property[Type="MicrosoftWindowsLibrary6172210!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
    <AttributeName>AttributeDiscoveryRule4195ea8a61b9497f9083c85d315a3a03</AttributeName>
    <Path>SOFTWARE\Getronics\jama\OpsMgr2007\attributes\customer_name</Path>
    <PathType>1</PathType>
    <AttributeType>1</AttributeType>
    <Frequency>86400</Frequency>
    <ClassId>$MPElement[Name="jamaType_VirtualServer_jamaAttributes"]$</ClassId>
    <InstanceSettings>
      <Settings>
        <Setting>
          <Name>$MPElement[Name="jamaType_VirtualServer_jamaAttributes"]/jamaProperty_JamaTypeVirtualServerJamaAttributes_CustomerName$</Name>
          <Value>$Data/Values/AttributeDiscoveryRule4195ea8a61b9497f9083c85d315a3a03$</Value>
        </Setting>
        <Setting>
          <Name>$MPElement[Name="MicrosoftWindowsLibrary6172210!Microsoft.Windows.Computer"]/PrincipalName$</Name>
          <Value>$Target/Property[Type="MicrosoftWindowsLibrary6172210!Microsoft.Windows.Computer"]/PrincipalName$</Value>
        </Setting>
      </Settings>
    </InstanceSettings>
  </DataSource>
</Discovery>

The group in the console would become:

[Screenshot: ScreenHunter_136]

adding health watcher

The health watcher monitors whether a server has a heartbeat with the management servers. When a management server has missed the heartbeat 3 (or any configured number of) consecutive times, it will try to ping the server to check whether the server is down or just the agent has a problem. When there’s no heartbeat it raises a heartbeat failure alert. When the ping fails as well, it also raises a "computer unreachable" alert. So basically it alerts you when a server is down.
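
The alert logic described above boils down to something like the following Python sketch. It only models the behaviour as described in this post, not SCOM's actual implementation; the function and parameter names are invented.

```python
def watcher_alerts(missed_heartbeats, ping_succeeds, threshold=3):
    """Return the alerts raised for one server, as described above.

    missed_heartbeats: consecutive heartbeats missed by the management server.
    ping_succeeds: whether the follow-up ping from the management server works.
    threshold: configured number of missed heartbeats (3 in the text).
    """
    if missed_heartbeats < threshold:
        return []
    alerts = ["heartbeat failure"]
    if not ping_succeeds:
        # The ping failed as well, so the server itself is probably down.
        alerts.append("computer unreachable")
    return alerts
```

A server whose agent has stopped but which still answers a ping therefore produces only the heartbeat failure alert.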

However since we’re using computergroups to scope our operators, they won’t see these alerts. And what is the use of monitoring when you don’t even see when a server is down?

An MSDN blog post by Steve Rachui gave the answer to this question.

But as the post already suggests, this is where things get tough. We have to leave our clickable console and start to understand what the XML looks like so we can modify it to what we need. We need to add the health watcher’s discovery to each group’s discoveries and make sure the references are correct.

Looking at the group mp xml

If you’re happy with manually editing the XML, Steve Rachui’s blog will be enough to figure it out. However, we wanted to go 1 step further: manual edits are prone to mistakes and, most importantly, we are lazy and don’t want to do this 600 times.

This management pack contains 4 top-level elements in the XML: Manifest, TypeDefinitions, Monitoring and LanguagePacks (XML files are best read with Internet Explorer as it lets you collapse the parts you aren’t looking at; for editing, Notepad++ with the XML add-on helps a lot).

The Manifest contains the name of the management pack and its version, as well as the references to other management packs: Microsoft.Windows.Library, Microsoft.SystemCenter.Library, Microsoft.SystemCenter.InstanceGroup.Library and our jamaAttributes, each with a version number and public key token as well as an alias to use in the rest of the XML. As long as the public key token doesn’t change, the referenced management packs can be upgraded without changing the references here. In the listing below, references prefixed with + are shown collapsed.

<Manifest>
  <Identity>
    <ID>jamaGroups</ID>
    <Version>1.0.0.4</Version>
  </Identity>
  <Name>jamaGroups</Name>
  <References>
   +<Reference Alias="jamaAttributes1530">
    <Reference Alias="MicrosoftSystemCenterInstanceGroupLibrary6172210">
      <ID>Microsoft.SystemCenter.InstanceGroup.Library</ID>
      <Version>6.1.7221.0</Version>
      <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
    </Reference>
   +<Reference Alias="SystemCenter">
   +<Reference Alias="MicrosoftWindowsLibrary6172210">
  </References>
</Manifest>

I’ll jump to LanguagePacks now, as the other 2 parts contain IDs of the form UINAMEunreadablerandomstring, which I couldn’t make sense of. In the LanguagePacks it becomes clear: this is where each random string gets paired with something readable.

Replacing the random strings with something more readable will help you understand the XML a lot better. One thing is important though: each string needs to be unique within your management group. To achieve this, we start with the name of the management pack and add a description that makes sense and is unique within this management pack (we used the group name without spaces, hyphens or underscores). Replace all occurrences of the string in the entire management pack with the new name.

We’ve also added a description to the management pack name. The agent version is that of our scripted agent and helps us identify whether we are running the correct group MP version.

<DisplayStrings>
  <DisplayString ElementID="jamaGroups">
    <Name>jamaGroups</Name>
    <Description>Agent version: 2.5.0 build 396</Description>
  </DisplayString>
  <DisplayString ElementID="jamaGroupsCustomerSub1Sub2.Group">
    <Name>Customer-Sub1-Sub2</Name>
  </DisplayString>
  <DisplayString ElementID="jamaGroupsCustomerSub1Sub2.Group.DiscoveryRule">
    <Name>Populate Customer-Sub1-Sub2</Name>
    <Description>This discovery rule populates the group 'Customer-Sub1-Sub2'</Description>
  </DisplayString>
</DisplayStrings>
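
The naming convention used for the element IDs above (management pack name plus the group name without spaces, hyphens or underscores) can be sketched in Python like this. The helper name is made up; the .Group and .Group.DiscoveryRule suffixes are taken from the listing.

```python
import re


def group_element_ids(mp_name, group_name):
    """Build readable element IDs from the MP name and the group name."""
    # Drop spaces, hyphens and underscores from the group name.
    compact = re.sub(r"[ _-]", "", group_name)
    group_id = f"{mp_name}{compact}.Group"
    return {
        "group": group_id,
        "discovery": f"{group_id}.DiscoveryRule",
    }
```

For the management pack jamaGroups and the group Customer-Sub1-Sub2 this gives the two IDs shown in the DisplayStrings listing.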

So back to TypeDefinitions: this tells you what kind of class your group is. Since we just replaced the random strings, we can actually read this without going back and forth to the DisplayStrings part to find out which name we are looking at. Groups are by default a non-hosted singleton class, which is fine for what we’re trying to do.

<ClassTypes>
  <ClassType ID="jamaGroupsCustomerSub1Sub2.Group" Accessibility="Public" Abstract="false" Base="MicrosoftSystemCenterInstanceGroupLibrary6172210!Microsoft.SystemCenter.InstanceGroup" Hosted="false" Singleton="true" />
</ClassTypes>

So now to the actual group definitions. Each group consists of 3 membership rules: 1 to add computers to the group, 1 to add virtual servers to the group and 1 to add the health watcher objects of the Windows computers (already contained in this group) to this group.

<Discovery ID="jamaGroupsCustomerSub1Sub2.Group.DiscoveryRule" Enabled="true" Target="jamaGroupsCustomerSub1Sub2.Group" ConfirmDelivery="false" Remotable="true" Priority="Normal">
  <Category>Discovery</Category>
  <DiscoveryTypes>
    <DiscoveryRelationship TypeID="MicrosoftSystemCenterInstanceGroupLibrary6172210!Microsoft.SystemCenter.InstanceGroupContainsEntities" />
  </DiscoveryTypes>
  <DataSource ID="GroupPopulationDataSource" TypeID="SystemCenter!Microsoft.SystemCenter.GroupPopulator">
    <RuleId>$MPElement$</RuleId>
    <GroupInstanceId>$MPElement[Name="jamaGroupsCustomerSub1Sub2.Group"]$</GroupInstanceId>
    <MembershipRules>
      <MembershipRule>
        <MonitoringClass>$MPElement[Name="jamaAttributes1530!jamaType_WindowsComputer_jamaAttributes"]$</MonitoringClass>
        <RelationshipClass>$MPElement[Name="MicrosoftSystemCenterInstanceGroupLibrary6172210!Microsoft.SystemCenter.InstanceGroupContainsEntities"]$</RelationshipClass>
        <Expression>
          <SimpleExpression>
            <ValueExpression>
              <Property>$MPElement[Name="jamaAttributes1530!jamaType_WindowsComputer_jamaAttributes"]/jamaProperty_JamaTypeWindowsComputerJamaAttributes_CustomerName$</Property>
            </ValueExpression>
            <Operator>Equal</Operator>
            <ValueExpression>
              <Value>customer_sub1_sub2</Value>
            </ValueExpression>
          </SimpleExpression>
        </Expression>
      </MembershipRule>
      <MembershipRule>
        <MonitoringClass>$MPElement[Name="SystemCenter!Microsoft.SystemCenter.HealthServiceWatcher"]$</MonitoringClass>
        <RelationshipClass>$MPElement[Name="MicrosoftSystemCenterInstanceGroupLibrary6172210!Microsoft.SystemCenter.InstanceGroupContainsEntities"]$</RelationshipClass>
        <Expression>
          <Contains>
            <MonitoringClass>$MPElement[Name="SystemCenter!Microsoft.SystemCenter.HealthService"]$</MonitoringClass>
            <Expression>
              <Contained>
                <MonitoringClass>$MPElement[Name="MicrosoftWindowsLibrary6172210!Microsoft.Windows.Computer"]$</MonitoringClass>
                <Expression>
                  <Contained>
                    <MonitoringClass>$Target/Id$</MonitoringClass>
                  </Contained>
                </Expression>
              </Contained>
            </Expression>
          </Contains>
        </Expression>
      </MembershipRule>
      <MembershipRule>
        <MonitoringClass>$MPElement[Name="jamaAttributes1530!jamaType_VirtualServer_jamaAttributes"]$</MonitoringClass>
        <RelationshipClass>$MPElement[Name="MicrosoftSystemCenterInstanceGroupLibrary6172210!Microsoft.SystemCenter.InstanceGroupContainsEntities"]$</RelationshipClass>
        <Expression>
          <SimpleExpression>
            <ValueExpression>
              <Property>$MPElement[Name="jamaAttributes1530!jamaType_VirtualServer_jamaAttributes"]/jamaProperty_JamaTypeVirtualServerJamaAttributes_CustomerName$</Property>
            </ValueExpression>
            <Operator>Equal</Operator>
            <ValueExpression>
              <Value>customer_sub1_sub2</Value>
            </ValueExpression>
          </SimpleExpression>
        </Expression>
      </MembershipRule>
    </MembershipRules>
  </DataSource>
</Discovery>

Script the XML

Even if you don’t understand the XML above, you’ll see that the only thing that changes when you add more groups using the same criteria is the part in blue (the group name, e.g. customer_sub1_sub2).

Our original plan was to create nested groups. However, you can’t script this, as a group gets a unique ID upon import of the management pack and nested groups are created based on this ID. Because this ID is random, we fill each group directly with all the computers that belong in it. Also, the operators wouldn’t see a difference when they click the “scope” button: it would just list all available groups to them, whether they are nested or not.

So we actually have 3 different group criteria/expressions: for customer, customer_sub1 and customer_sub1_sub2 (example above).

expression for customer:

<Expression>
  <SimpleExpression>
    <ValueExpression>
      <Property>$MPElement[Name="jamaAttributes1530!jamaType_WindowsComputer_jamaAttributes"]/jamaProperty_JamaTypeWindowsComputerJamaAttributes_CustomerId$</Property>
    </ValueExpression>
    <Operator>Equal</Operator>
    <ValueExpression>
      <Value>customer</Value>
    </ValueExpression>
  </SimpleExpression>
</Expression>

expression for customer_sub1:

<Expression>
  <RegExExpression>
    <ValueExpression>
      <Property>$MPElement[Name="jamaAttributes1530!jamaType_WindowsComputer_jamaAttributes"]/jamaProperty_JamaTypeWindowsComputerJamaAttributes_CustomerName$</Property>
    </ValueExpression>
    <Operator>MatchesRegularExpression</Operator>
    <Pattern>^customer_sub1_.*$</Pattern>
  </RegExExpression>
</Expression>
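
To see which servers each of the 3 criteria selects, the expressions can be mimicked in Python. This is only an illustration with invented function names, and it uses Python's re module rather than SCOM's regular expression engine. Note that the customer_sub1 pattern as written requires a trailing underscore, so a server whose string is exactly customer_sub1 is not matched; the include_exact option shows a possible alternative pattern, which is an assumption and not something we run.

```python
import re


def in_customer_group(customerid, customer):
    """Customer group: CustomerId equals the customer name."""
    return customerid == customer


def in_sub1_group(customername, customer, sub1, include_exact=False):
    """customer_sub1 group: CustomerName matches a regular expression."""
    prefix = re.escape(f"{customer}_{sub1}")
    if include_exact:
        # Hypothetical variant that also accepts a string without subgroup2.
        pattern = f"^{prefix}(_.*)?$"
    else:
        # Same shape as the pattern in the XML: ^customer_sub1_.*$
        pattern = f"^{prefix}_.*$"
    return re.match(pattern, customername) is not None


def in_sub2_group(customername, full_name):
    """customer_sub1_sub2 group: CustomerName equals the full string."""
    return customername == full_name
```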

All the different groups can be derived from the 1 string that we set upon agent install. When we create a new agent, we have to edit an ini file with all the allowed customername strings, and the agent install fails when the supplied string is not in it. This is because we don’t want too many groups to be created.

Because we have the customername string and the rest of the XML doesn’t change, we can script the XML. I’ve used VBS as I’ve done lots of scripting with VBS in the past. I probably should have used the xmldom object, but I only found out about this object when I was nearly finished :). A simple writeline function to a file did the trick as well. Just use any scripting language you are comfortable with…

So first I read the ini and put all the possible strings in an array (arrRawGroups). Then I created the 3 different kinds of groups I would need and put them in arrays as well. Below is an example of one of the functions I used to create the customer groups.

function findTopLevelGroups(arrRawGroups)
' function finds the "customer names" and creates the toplevelgroups based on this name.
' returns array of "customer: " groups to create.
    Dim strRawGroup
    Dim arrHelper
    Dim strTopGroup
    Dim strTgroup
    Dim arrTopGroups()
    Dim bExist
    Dim intCount

    intCount = 0
    Redim arrTopGroups(intCount)

    wscript.echo "Finding Top Level groups."

    for each strRawGroup in arrRawGroups
        ' the customer name is the part before the first underscore
        arrHelper = Split(strRawGroup, "_", -1, 1)
        strTopGroup = arrHelper(0)
        if arrTopGroups(0) = "" then
            arrTopGroups(0) = strTopGroup
            intCount = intCount + 1
        else
            ' only add the customer when it is not in the array yet
            bExist = FALSE
            for each strTgroup in arrTopGroups
                if strTgroup = strTopGroup then
                    bExist = TRUE
                end if
            next
            if not bExist then
                Redim Preserve arrTopGroups(intCount)
                arrTopGroups(intCount) = strTopGroup
                intCount = intCount + 1
            end if
        end if
    next

    findTopLevelGroups = arrTopGroups

end function
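
For readers who prefer another language, the same idea (deriving all 3 kinds of groups from the raw strings, each listed once in order of first appearance) might look like this in Python. It is a sketch and not a translation of our script; the function names and dictionary keys are invented.

```python
def unique_in_order(items):
    """Remove duplicates while keeping the first occurrence order."""
    return list(dict.fromkeys(items))


def derive_groups(raw_groups):
    """Derive the customer, customer_sub1 and customer_sub1_sub2 groups."""
    split = [raw.split("_") for raw in raw_groups]
    return {
        "customer": unique_in_order(p[0] for p in split),
        "customer_sub1": unique_in_order(
            "_".join(p[:2]) for p in split if len(p) >= 2
        ),
        "customer_sub1_sub2": unique_in_order(
            "_".join(p[:3]) for p in split if len(p) >= 3
        ),
    }
```

Given the made-up strings acme_prod_dc, acme_prod_exchange and globex_prod, this yields the customers acme and globex, the customer_sub1 groups acme_prod and globex_prod, and 2 customer_sub1_sub2 groups.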

After that it’s just a matter of echoing the Manifest, TypeDefinitions, Monitoring and LanguagePacks to an XML file and replacing the “blue” strings with the appropriate group names. I found only the Monitoring part a bit challenging. Not that the coding is difficult, but the XML parts are quite large and a simple typo makes all the difference in the world. Last but not least, when you automatically increase the version number of the management pack file, you can upgrade the existing management pack whenever a new group needs to be created.

There is another approach to all this: Mike Eisenstein created a non-singleton group which creates instances of any subgroup. For us this had the drawback that it would create groups for any name found in the registry, but some might like his approach as well. See case 11 on his blog.
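
With an XML library instead of writeline, the version bump and one of the small fragments could be generated as in the Python sketch below. It uses the standard xml.etree.ElementTree module; the helper names are invented, and only the DisplayString fragment is shown, not the full management pack.

```python
import xml.etree.ElementTree as ET


def bump_version(version):
    """Increase the last component of a dotted version string."""
    parts = version.split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)


def display_string(element_id, name, description=None):
    """Build one DisplayString element for the LanguagePacks section."""
    node = ET.Element("DisplayString", {"ElementID": element_id})
    ET.SubElement(node, "Name").text = name
    if description is not None:
        ET.SubElement(node, "Description").text = description
    return node


def display_strings(entries):
    """Build the DisplayStrings element from (id, name, description) tuples."""
    root = ET.Element("DisplayStrings")
    for element_id, name, description in entries:
        root.append(display_string(element_id, name, description))
    return root
```

Building the elements this way lets the library take care of escaping and closing tags, which removes one source of typos in the large Monitoring section.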

Posted in grouping and scoping | 1 Comment »