JAMA00

SCOM 2007 R2

Updating manual installed agents from the console

Posted by MarcKlaver on May 9, 2011

We have created a management pack that updates manually installed agents from a task in the Operations console. Before you can use this management pack, however, you must have implemented the JDF framework for file distribution. The management pack depends on that framework, so without a working framework it is of no use to you.

 

Below is the final XML file for your management pack called jamaAgent.Update.xml:

<?xml version="1.0" encoding="utf-8"?><ManagementPack ContentReadable="true" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <Manifest>
    <Identity>
      <ID>jamaAgent.Update</ID>
      <Version>0.4.0.0</Version>
    </Identity>
    <Name>jamaAgent.Update</Name>
    <References>
      <Reference Alias="MSWL">
        <ID>Microsoft.Windows.Library</ID>
        <Version>6.1.7221.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="MSSCL">
        <ID>Microsoft.SystemCenter.Library</ID>
        <Version>6.1.7221.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
    </References>
  </Manifest>
  <Monitoring>
    <Tasks>
      <Task ID="jamaAgent.Update.ConsoleTask.AgentUpdate" Accessibility="Public" Enabled="true" Target="MSSCL!Microsoft.SystemCenter.HealthService" Timeout="300" Remotable="true">
        <Category>Custom</Category>
        <WriteAction ID="PA" TypeID="MSWL!Microsoft.Windows.ScriptWriteAction">
          <ScriptName>jamaTaskUpdateAgent.vbs</ScriptName>
          <Arguments />
          <ScriptBody><![CDATA[
'-------------------------------------------------------------------------------
' File   : jamaTaskUpdateAgent.vbs
' Use    : Script for the jama Agent Update task.
' SVN    : Revision: 136
'          Date: 2011-04-12 09:14:33 +0200 (Tue, 12 Apr 2011)
'
' Note(s): 1) ---
'-------------------------------------------------------------------------------
option explicit
on error goto 0
setlocale("en-us")

const INT_RETRIES           = 5
const INT_DEFAULT_WAIT_TIME = 120
const STR_DOWNLOAD_PATH     = "/files/opsmgr/updates/agent/"

jamaMain()

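'-------------------------------------------------------------------------------
' jamaMain
'
' Use    : Main entry for this script.
' Input  : ---
' Returns: ---
' Note(s): 1) ---
'-------------------------------------------------------------------------------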
function jamaMain()
    dim jdf
    dim iMinutesToWait
    dim bForceUpdate

    if(not jdfLoadFramework(jdf, null, null, null, null, null)) then
        wscript.echo "The Jama Distribution Framework could not be loaded. No update is performed."
    else
        wscript.echo "Jama Distribution Framework loaded" & vbNewLine & vbNewLine & jdf.GetInfoString(wscript.fullname)
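        '---------------------------------------------------------------------------
        ' Your code here!
        '---------------------------------------------------------------------------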
        if(jamaInitialize(iMinutesToWait, bForceUpdate)) then
            if((bForceUpdate = false) and jamaRestartRequired()) then
                wscript.echo "Update could not be scheduled due to dependent services." & vbNewLine & _
                             "Use the force option to override this behaviour and force the update to run."
            else
                if(jamaScheduleUpdateIn(jdf, iMinutesToWait) = true) then
                    wscript.echo "Update is scheduled to run in " & iMinutesToWait & " minutes."
                else
                    wscript.echo "Update could not be scheduled. No update will be performed!"
                end if
            end if
        else
            wscript.echo "Initialization failed. No update will be performed!"
        end if
    end if
end function

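'-------------------------------------------------------------------------------
' jamaInitialize
'
' Use    : Initialize the script.
' Input  : iMinutesToWait - integer - Number of minutes to wait before
'                                     activating the schedule (output)
'          bForceUpdate   - boolean - True when /force:true is given (output)
' Returns: Boolean - TRUE  - No errors detected.
'                    FALSE - An error was detected.
' Note(s): 1) ---
'-------------------------------------------------------------------------------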
function jamaInitialize(byref iMinutesToWait, byref bForceUpdate)
    dim colNamedArguments
    dim bResult
    dim bFound

    bResult = false
    bFound  = false
    bForceUpdate = false
    set colNamedArguments = WScript.Arguments.Named

    if(colNamedArguments.Exists("minutes")) then
        iMinutesToWait = cint(lcase(colNamedArguments.Item("minutes")))
        bFound         = true
        bResult        = true
    end if

    if(colNamedArguments.Exists("force")) then
        if(lcase(colNamedArguments.Item("force")) = "true") then
            bForceUpdate = true
        end if
    end if

    if(not bFound) then
        iMinutesToWait = INT_DEFAULT_WAIT_TIME
        bResult        = true
    end if

    jamaInitialize = bResult
end function

'-------------------------------------------------------------------------------
' jamaRestartRequired
'
' Use    : Checks if an update of the agent will trigger a restart or reboot.
' Input  : ---
' Returns: Boolean - TRUE  - A restart is required.
'                    FALSE - A restart is not required.
' Note(s): 1) If the dependency could not be determined, this function will
'             return true.
'          2) For Windows 2000 this function will always return true.
'          3) When true on a 2003 server, a reboot is required but not forced.
'          4) When true on 2008 or higher, the dependent services could be
'             restarted by the installer service.
'
'-------------------------------------------------------------------------------
function jamaRestartRequired()
    dim strCmd
    dim iResult
    dim bResult
    dim fso
    dim fh
    dim strTempFile
    dim strLine

    strTempFile = "jamaTaskUpdateAgent.$$$"
    bResult = true
    strCmd  = "tasklist /fo csv /m EventCommon.dll /FI ""imagename ne HealthService.exe"" /FI ""imagename ne MonitoringHost.exe"" > " & strTempFile
    jamaRun(strCmd)

    set fso = CreateObject("scripting.filesystemobject")
    if(fso.FileExists(strTempFile)) then
        on error resume next
            err.clear
            set fh = fso.OpenTextFile(strTempFile, 1)                           ' Open for reading.
            if(err.number = 0) then
                do while(not fh.AtEndOfStream)
                    strLine = lcase(fh.ReadLine)
                    if(instr(strLine, "info: no tasks") > 0) then               ' No dependencies found.
                        bResult = false
                    end if
                loop

                fh.Close                                                        ' Release the file before deleting it.
                set fh = fso.GetFile(strTempFile)
                fh.delete(true)                                                 ' Delete the file.
            end if
        on error goto 0
    end if

    jamaRestartRequired = bResult
end function

function jamaRun(byval strCmd)
    dim objShell
    dim iResult

    iResult      = 0
    set objShell = wscript.createObject("wscript.shell")
    iResult      = objShell.run("cmd /c " & strCmd, 0, true) ' hidden and wait for result
    set objShell = nothing

    jamaRun = iResult
end function

'-------------------------------------------------------------------------------
' jamaScheduleUpdateIn
'
' Use    : Retrieve and schedule the required update.
' Input  : jdf            - object  - JDF object
'          iMinutesToWait - integer - Minutes to wait for the schedule.
' Returns: Boolean - TRUE  - No errors.
'                    FALSE - An error was detected.
' Note(s): 1) ---
'-------------------------------------------------------------------------------
function jamaScheduleUpdateIn(byref jdf, iMinutesToWait)
    dim strSourceFile
    dim strTargetFile
    dim strTargetDir
    dim iCount
    dim iResult
    dim bResult
    dim strCmd
    dim fso

    iCount  = 0
    iResult = 0                                                         ' Stays 0 when the file is already present.
    bResult = false
    set fso = CreateObject("scripting.filesystemobject")
    strSourceFile = "jamaAgentUpdate" & jdf.Platform & ".exe"
    strTargetDir  = jdf.ExpandString("$JDF_IN_DIR$" & "jamaAgentUpdate\")
    if(jdf.CreateDirectory(strTargetDir)) then
        strTargetFile = strTargetDir & strSourceFile
        bResult = fso.FileExists(strTargetFile)
        if(not bResult) then
            do
                iResult = jdf.GetFile("jdfBaseDistribution", STR_DOWNLOAD_PATH & strSourceFile, strTargetFile)
                iCount  = iCount + 1
            loop while((iCount < INT_RETRIES) and (iResult <> 0))
        end if
        if(iResult = 0) then
            bResult = fso.FileExists(strTargetFile)
            if(bResult) then
                strCmd  = strTargetFile & " /verysilent"
                bResult = jdf.ScheduleTaskIn(strCmd, iMinutesToWait)
            end if
        end if
    end if
    jamaScheduleUpdateIn = bResult
end function

'-------------------------------------------------------------------------------
' From template.vbs
'-------------------------------------------------------------------------------
'-------------------------------------------------------------------------------
' jdfLoadFramework
'
' Use    : Load and initialize the JDF framework.
' Input  : objJDF               - object - Object passed back with framework.
'          bInitialize          - bool   - true or null for initializing.
'          strVersion           - string - Minimum framework version required.
'          bForceVersion        - bool   - true, false or null.
'          strCustomerId        - string - Customer id or null.
'          strDefaultUploadPath - string - Default upload path.
' Returns: bool - TRUE  - Initialization of the framework succeeded.
'                 FALSE - Initialization of the framework failed.
' Note(s): 1) strVersion = null
'                 Any version of the framework will be accepted. The value of
'                 bForceVersion will be ignored.
'
'          2) strVersion = "x.x.x"
'                 Integer values, separated by dots, e.g. : "3.5.11"
'
'          3) bForceVersion = null/false
'                 The given version is a minimum version required for the
'                 framework.
'
'          4) bForceVersion = true
'                 The framework version must exactly match with the given
'                 version number.
'
'          5) strCustomerId = null
'                 The customer id will be retrieved from the registry. See
'                 the documentation for more information.
'
'          6) strDefaultUploadPath = null
'                 The default upload path will be set to the root: "/". This
'                 argument can hold JDF variables (both default and custom
'                 provided in the jdf.jdp file). See the documentation for more
'                 information.
'
'          7) Normal use is:
'                 jdfLoadFramework(jdf, null, null, null, null, null)
'                 where jdf is the object you pass to the function.
'-------------------------------------------------------------------------------
function jdfLoadFramework(byref objJDF, byval bInitialize, byval strVersion, byval bForceVersion, byval strCustomerId, byval strDefaultUploadPath)
    dim bResult, fso, strFrameworkFile, objReg, strResult

    bResult   = false
    strResult = ""
    on error resume next
        err.clear
        set objReg=GetObject("winmgmts:{impersonationLevel=impersonate}!\\.\root\default:StdRegProv")
        if(err.number = 0) then
            objReg.GetStringValue  &H80000002, "SOFTWARE\Company\jama\jdf", "path", strResult     ' If you use another registry location, change this value!
            if((err.number = 0) and (strResult <> "") and (not isnull(strResult))) then           ' We got something, so try it.
                strFrameworkFile = strResult & "jdf.wsc"                                          ' This now should be the full path to the framework file.
                set fso = CreateObject("Scripting.FileSystemObject")
                if(err.number = 0) then
                    if(fso.FileExists(strFrameworkFile)) then
                        set objJDF = GetObject("script:" & strFrameworkFile)                      ' File exists, try to load it.
                        if(err.number = 0) then                                                   ' Framework found and object loaded.
                            if(bInitialize = false) then                                          ' No initialization.
                                bResult = false
                            else                                                                  ' Initialize the framework for use.
                                bResult = objJDF.InitializeFramework(strVersion, bForceVersion, wscript.scriptname, strCustomerId, strDefaultUploadPath)
                            end if
                        end if
                    end if
                end if
            end if
        end if
    on error goto 0
    jdfLoadFramework = bResult
end function
]]></ScriptBody>
          <TimeoutSeconds>300</TimeoutSeconds>
        </WriteAction>
      </Task>
    </Tasks>
  </Monitoring>
  <LanguagePacks>
    <LanguagePack ID="ENU" IsDefault="false">
      <DisplayStrings>
        <DisplayString ElementID="jamaAgent.Update">
          <Name>jamaAgent.Update</Name>
        </DisplayString>
        <DisplayString ElementID="jamaAgent.Update.ConsoleTask.AgentUpdate">
          <Name>jama Agent Update</Name>
        </DisplayString>
      </DisplayStrings>
    </LanguagePack>
  </LanguagePacks>
</ManagementPack>

Now the script in this management pack expects a few things.

 

const STR_DOWNLOAD_PATH = "/files/opsmgr/updates/agent/"

This constant specifies the path on the remote SCP server where the update files can be retrieved. If you use another location, change the value of this constant.

 

strSourceFile = "jamaAgentUpdate" & jdf.Platform & ".exe"

This value is used to generate the name of the update package to retrieve. We have repackaged the two update files required for the agent update into a single executable (see also this article from Kevin Holman for more information). You can download an archive with three packages (one for each supported platform) on this location. If you create your own, make sure the name of the update package matches the final value of the strSourceFile variable:

jamaAgentUpdateIA64.exe -> For the Itanium platform
jamaAgentUpdateX64.exe -> For the x64 platform
jamaAgentUpdateX86.exe -> For the x86 platform
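
The naming rule can be sketched as follows. This is a Python illustration, not the VBScript from the management pack; the function name is hypothetical, and the platform strings are assumed to be exactly the values that jdf.Platform returns.

```python
# Sketch of how the remote path of the update package is put together.
STR_DOWNLOAD_PATH = "/files/opsmgr/updates/agent/"  # path on the SCP server
SUPPORTED_PLATFORMS = ("IA64", "X64", "X86")         # assumed jdf.Platform values


def remote_update_path(platform: str) -> str:
    """Return the remote path of the update package for one platform."""
    if platform not in SUPPORTED_PLATFORMS:
        raise ValueError("unsupported platform: " + platform)
    return STR_DOWNLOAD_PATH + "jamaAgentUpdate" + platform + ".exe"


for name in SUPPORTED_PLATFORMS:
    print(name, "->", remote_update_path(name))
```

If you move the packages to another directory on the server, only STR_DOWNLOAD_PATH changes; the file names stay the same.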

 

What we actually do is very simple.

  • We detect the platform we are running on and construct the correct file name of the file to retrieve (this update package contains the required .msp files for the agent).
  • We retrieve the update package.
  • We schedule the update package to run in ## minutes and to run unattended.
  • We hope for the best 🙂
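
The retrieval step retries a failed transfer a few times before giving up. A minimal Python sketch of that loop is shown here, assuming (as in the script) that a result of 0 means success; get_file is a hypothetical stand-in for the jdf.GetFile call.

```python
INT_RETRIES = 5  # same value as in the task script


def fetch_with_retries(get_file, retries=INT_RETRIES):
    """Call get_file() until it returns 0 or the retries are used up.

    get_file stands in for jdf.GetFile and returns 0 on success.
    Returns a tuple (success, attempts).
    """
    attempts = 0
    result = None
    while attempts < retries:
        result = get_file()
        attempts += 1
        if result == 0:
            break
    return result == 0, attempts


# Hypothetical transfer that fails twice and then succeeds.
outcomes = iter([1, 1, 0])
print(fetch_with_retries(lambda: next(outcomes)))
```

Only when the download succeeded and the file really exists on disk is the package handed to the scheduler.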

If something goes wrong with the installation, the failure can be found in the log file created by the update. There are two logfiles:

%TEMP%\jamaAgentUpdate.<platform>.log
%TEMP%\jamaAgentUpdate.ENU.<platform>.log

The package itself also checks the platform it is running on and only starts the update if it is the version built for that platform.

 

As you can see in the .XML file, we use the jdfBaseDistribution account for retrieving the file. So you don’t need to configure the SCP server for every customer you have. Just make the file available for the jdfBaseDistribution account. The task itself is targeted against the health service. Select the “Agent By Version” leaf in the console, to get an overview of all your agents and their version:

image

If you select a health service, the task pane will show you the new update task:

image

Note that the task is not version aware, so it is always available and will always run. You can therefore run an update over a currently installed agent (which works without issues). After you select the task, the task dialog is shown and you can change the arguments for the script.

image

If you don’t change anything, the task will try to schedule the update to run in 120 minutes. If you need a different delay, use the /minutes:## argument; the update is then scheduled ## minutes in the future. So if I want the update to run in 5 minutes:

image

 

 

The first thing the task does is check for dependencies (for more information about the dependencies, see this link). If no dependencies are found, the job is scheduled ## minutes in the future (or 120 if no override is given):

image

 

 

If a dependency is found, the task will not schedule the update:

image

But as the output shows, you can override the dependency detection by adding the /force:true option to the argument list. If it is set, the update will always run, even if dependencies are detected.
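
How the two arguments interact can be sketched in a few lines of Python. This is an illustration of the logic only, not the script itself; the function names are hypothetical, and the 120-minute default is taken from the script.

```python
INT_DEFAULT_WAIT_TIME = 120  # minutes, as in the task script


def parse_task_arguments(argv):
    """Parse WSH-style named arguments such as /minutes:5 and /force:true."""
    minutes = INT_DEFAULT_WAIT_TIME
    force = False
    for arg in argv:
        if not arg.startswith("/") or ":" not in arg:
            continue  # not a named argument
        name, value = arg[1:].split(":", 1)
        name = name.lower()
        if name == "minutes":
            minutes = int(value)
        elif name == "force":
            force = value.lower() == "true"
    return minutes, force


def may_schedule(force, dependencies_found):
    """The update is scheduled unless dependencies exist and force is off."""
    return force or not dependencies_found


minutes, force = parse_task_arguments(["/minutes:5", "/force:true"])
print(minutes, force, may_schedule(force, dependencies_found=True))
```

Without any arguments the sketch falls back to 120 minutes and no force, which matches the default behaviour described above.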

 

After the update is finished, the agent should reflect the new version information:

image

You can now update your manually installed agents using the Operations console 🙂


Posted in Agent | Leave a Comment »

JDF: Jama Distribution Framework

Posted by MarcKlaver on March 28, 2011

This blog describes our framework for file distribution between a SCOM agent and our central SCOM environment. After this framework is installed, you will be able to transfer files between your SCOM agent and a central location.  Before you continue, be warned: I cannot deliver a single file that does everything I just promised. I will, however, guide you through the work needed to set up a file distribution framework for SCOM, but you will need to change scripts, compile the management pack, and install and configure a secure copy server. Changes are kept to a minimum and most actions are automated with scripts. But first things first: let’s describe the general idea behind the framework.

 

Transferring files

The idea behind the framework is to write a set of scripts that makes it possible to transfer files to and from the SCOM agent. The main trigger for this wish was our installed base, which consists of manually installed agents only. Updating these agents by hand is a long job for every CU update. If we could just schedule an update from the Operations console and sit back, that would be great. But in order to do that, we need to be able to get the required update files to the SCOM agent. Keep in mind that we do not have a single customer but multiple customers, all with (or without) their own file distribution method. Being able to distribute and update from the Operations console itself would be ideal.

But transferring files should also be secure. Our solution: Secure Copy, based on private / public key pairs for data transfers.

And since we have multiple customers, we need to make sure that Customer A is not able to see or change any data we retrieve from Customer B (and vice versa). Therefore we have to implement a secure way to copy files to and from the SCOM agent and separate data based on customers. Our solution: A Secure Copy implementation that is capable of creating virtual directory structures, based on a customer (secure copy) account.

We did not want to change any firewall rules (with the exception of adding a new management server). So we basically wanted to use communication from the SCOM agent on TCP port 5723 and only initiate communication from the SCOM agent. Our solution: A Secure Copy server configured to listen on port 5723 for secure copy clients.

Finally it needed to be implemented on all our agents, not just a subset of agents. Our solution: A Microsoft Windows Scripting Component, written in vbscript, which will run on all Windows versions we support (Windows 2000 and higher).

So if we draw this as a high-level overview, it looks like this:

image

And yes it looks very simple 🙂 So the first thing we need is a secure copy server.

 

Secure Copy server

The secure copy server can be any implementation, as long as it supports private/public key pairs and is able to create virtual directories. We use the Bitvise WinSSHD service. Before we start implementing this service, we need to set some rules.

  1. Only logins with private/public ssh key pairs are allowed.
  2. Only secure copy is allowed (no sftp or ssh).
  3. There will be only one general (shared) account, and this account has read-only access to its own virtual directories. We call the shared account jdfBaseDistribution. This prevents accounts from uploading data to a “shared” location. This is also the account that will be used to distribute the framework.
  4. Data will be compressed as much as possible, but the framework will not rely on the secure copy implementation and will “zip” the files to upload into a “package”.
  5. When a “package” is downloaded, it is expected to be a “zip” file, which can be extracted at the SCOM agent side by the framework.
  6. Files (not “packages”) that will be downloaded, must be compressed files to reduce network traffic (this is not forced by the framework).
  7. Customer accounts will have the naming convention: jdfCustomer_<CustomerName>
  8. The root (virtual) directory for every account will always be empty. All directories will be read only, with the exception of the “upload” directory. But it will not be possible to remove files from the “upload” directory. The following structure will be used for every customer:

image 

Of course you can implement your own directories and rules for read/write, but these are the rules we use to explain the framework. If you start with these, the examples and code will work correctly.

 

Note: The shared account (jdfBaseDistribution) will not have a /upload directory!

A separate account for each customer

We also need an account for every customer we support. Each customer will have the same folder layout (as shown above), but these will be virtual directories. This will result in the same “view” for every customer, but data will be uploaded and downloaded from customer-specific locations on the Secure Copy server.

image

Note: Files can not be deleted from the upload directory.

As can be seen, the account jdfCustomer_JAMA has four virtual mount paths. And with the exception of the root path (which should never be used), all data is ‘re-directed’ to a directory specific for that account. So if we create a second account (jdfCustomer_JAMA00), you can see that it has its own ‘re-directed’ directories.

image

This results in the same view for both customers (the Virtual mount path), but data being stored and retrieved from physically different locations.
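
The redirection idea can be sketched in Python. This is a toy model, not how the secure copy server implements it: the physical root is hypothetical, and only two mount points are modelled, "/upload" (named in the rules above) and "/files" (the prefix used by STR_DOWNLOAD_PATH in the update script). The real set of mount points is only visible in the screenshots.

```python
# Hypothetical physical root on the secure copy server.
PHYSICAL_ROOT = "D:/jdf"
# Assumed mount points; the real list is only shown in the screenshots.
CUSTOMER_MOUNTS = ("/files", "/upload")


def physical_path(account, virtual_path):
    """Map a virtual path seen by an account to its own physical directory."""
    for mount in CUSTOMER_MOUNTS:
        if virtual_path == mount or virtual_path.startswith(mount + "/"):
            if mount == "/upload" and account == "jdfBaseDistribution":
                raise PermissionError("the shared account has no /upload")
            return PHYSICAL_ROOT + "/" + account + virtual_path
    raise PermissionError("the root directory is never used")


for account in ("jdfCustomer_JAMA", "jdfCustomer_JAMA00"):
    print(account, "->", physical_path(account, "/upload/report.zip"))
```

Both accounts ask for the same virtual path, yet each ends up in a directory of its own, which is the separation between customers we are after.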

We only allow the secure copy protocol and only logins with private/public key pairs. On how to configure winsshd with private/public key pairs, see this link.

 

ssh key pairs

For each customer we need to create an ssh public/private key pair. We use puttygen.exe to create the key pairs. The public key is saved as “jdfCustomer_<CustomerName>.public” and the private key is saved as “jdfCustomer_<CustomerName>.private”.

Note: You cannot use a passphrase for the private key file (not supported by the framework). So be sure to distribute the private key file only to the correct customer.

When creating the keys with puttygen.exe, you should use the default parameters, as shown below:

image

When your implementation of the secure copy server is working correctly you can continue with the next step: configuring the framework. Just make sure you can use the secure copy server with private/public key pairs before you continue.

 

The JDF Framework

The JDF Framework itself needs to be configured for use in your environment. What is required is fully described in the documentation of the framework. Although I tried to create a generic framework, still some environment dependent settings are required. But first you should download the framework 🙂

The framework can be downloaded here

The documentation can be downloaded here

 

What’s next

If you have set up the framework, you can start testing it. The framework download includes two examples of how to use the framework. The first thing we created with this framework was the ability to update our manually installed agents from the console (we have this working in our test environment). In my next blog I will create a management pack which will update your (manually) installed agents to CU4. Just make sure you have the framework up and running 🙂 If you fail to do so, just write a comment and I will try to answer your questions.

Posted in management packs | 1 Comment »

SCOM’s “un-discovery”. What doesn’t work here… And how to correct it.

Posted by rob1974 on January 26, 2011

 

SCOM’s main benefit of monitoring imho is its ability to discover what is running on a server and, based on that information, start to monitor the server with the appropriate rules. When you follow Microsoft’s best practices you’ll first perform a lightweight discovery to create a very basic class and have the heavier discoveries run against that basic class. This is pretty good stuff actually: it helps the performance of an agent quite a lot, as the agent only runs heavy discoveries if the server has an application role and never runs them on servers which have nothing to do with that application.

However, I’ve recently found out a drawback with this 2 step discovery, which I can probably explain the best with a real world example:

Discover the Windows domain controllers on “windows computers”. (The management pack this discovery runs from is an exception; usually such a discovery is in the application MP itself. Apparently MS considered domain controllers to be basic info. Similar discoveries for workstations and member servers can be found in this MP as well.) For this discovery a WMI query is used to determine whether the “windows computer” is also a domain controller: SELECT NumberOfProcessors FROM Win32_ComputerSystem WHERE DomainRole > 3. If this returns something, it’s a DC.
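
Why "DomainRole > 3" is enough becomes clear from the values that Win32_ComputerSystem.DomainRole can take. The Python sketch below lists them; the helper function name is hypothetical and only mirrors the WHERE clause of the query.

```python
# Documented values of the Win32_ComputerSystem.DomainRole property.
DOMAIN_ROLES = {
    0: "Standalone Workstation",
    1: "Member Workstation",
    2: "Standalone Server",
    3: "Member Server",
    4: "Backup Domain Controller",
    5: "Primary Domain Controller",
}


def is_domain_controller(domain_role: int) -> bool:
    """Same test as the WHERE clause: only roles 4 and 5 are DCs."""
    return domain_role > 3


for value, name in DOMAIN_ROLES.items():
    print(value, name, is_domain_controller(value))
```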

image

When it is a “windows domain controller” it will run a few other discoveries to determine more info.

image

Just by looking at the classes you can imagine it’s not really lightweight anymore.

image

So far so good: on all my Windows computers I run a simple query, and if that query returns something SCOM will also run a script that finds more interesting stuff about the DC.

But here’s the catch with this kind of discovery. Suppose I don’t need a certain DC anymore, but I still need to keep the server as it’s running some application I still need to use and monitor. What will happen? The lightweight discovery will do its job. It will correctly determine that the server is not a “windows domain controller” anymore and as a result it won’t run the script-discovery anymore.

You might ask, why is that bad, we didn’t want that, did we? Yes, you are correct, we didn’t want to run this discovery against servers that aren’t DCs, but SCOM doesn’t unlearn the discovered classes automatically. Because this discovery never runs again, SCOM never learns that this server no longer has the “Active Directory Domain Controller Computer Role”. And this is the class that is used for targeting rules and monitors. So although SCOM knows the server isn’t a “windows domain controller” anymore, it is still monitoring the “Active Directory Domain Controller Computer Role”. This will result in quite a lot of noise (script errors, LDAP failures, etc.).
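
The effect can be reproduced with a toy model in Python. This is a strong simplification of what SCOM really does (the function and the way classes are stored are hypothetical), but it shows why the second class is never removed.

```python
BASE = "Windows Domain Controller"
ROLE = "Active Directory Domain Controller Computer Role"


def run_discoveries(server, discovered):
    """One discovery cycle for a single server.

    server: dict with the real state, e.g. {"is_dc": True}
    discovered: set of class names currently known for this server.
    """
    # Step 1: lightweight discovery, runs on every Windows computer.
    if server["is_dc"]:
        discovered.add(BASE)
    else:
        discovered.discard(BASE)

    # Step 2: heavy discovery, only targeted at objects found in step 1.
    if BASE in discovered:
        discovered.add(ROLE)
    # There is no else branch: once step 1 stops matching, step 2 never
    # runs again, so nothing ever removes the role class.
    return discovered


classes = set()
run_discoveries({"is_dc": True}, classes)
print(sorted(classes))
run_discoveries({"is_dc": False}, classes)  # the server is demoted
print(sorted(classes))                       # the role class is still there
```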

For now, there’s just a workaround available. You will need to override the second discovery for that particular server. As the first discovery doesn’t include this server as an object of class, you can’t override the discovery for a “specific object of class: Windows Domain Controller”. You’ll need to create a group and include the server object. Then override the object discovery “for a group…” and choose the group you’ve just created.

image

What’s the point of disabling a discovery that didn’t run anyway? Well now you can go to powershell and run the “Remove-DisabledMonitoringObject” cmdlet. This will remove the discovered object classes for this discovery and all of the monitoring attached to those classes.

Discoveries make SCOM stand out from other monitoring tools, but they need to work both ways. Finding this out took me about one day. And that’s just one issue with one server (DNS was also installed on this server and had the same issue). Loads of servers might change role without me knowing about it, and when it’s not reported to me I’ll just have extra noise in SCOM. I’m just not sure whether this can be picked up within SCOM itself or whether the “un-discovery” needs to be done by the MP’s logic. For the AD part it needs to be picked up by Microsoft anyway, but if the logic is built into the management pack then it will have an impact on all the custom-built MPs by all you SCOM authors out there.

Posted in general, management packs, troubleshooting | 3 Comments »

Distributing files with SCOM

Posted by MarcKlaver on October 29, 2010

Didn’t you wish there was a way to distribute files using the SCOM environment? And not depend on others to get a file across? Well, we did, and we wrote a management pack that does just that: distributing files to the target servers. But hold on, don’t get too excited; it is still SCOM and not a file distribution application, so our solution has a few disadvantages:

  1. Targeting – If you target the MP to a class, all servers in that class will get the MP (even if all rules in the MP are disabled by default).
  2. Scheduling – There is no way to schedule a delivery to a remote server. As soon as the MP is imported and the RMS detects the new configuration, the MP will be distributed.

Both issues can result in very high network traffic, if not taken into account. Now the first one we can slightly control, by targeting at specific classes. The more specific the class the better. Of course we wanted our files to be distributed to all computers 🙂 The second one is only under control by controlling the time of the MP import. Of course this is not ideal, but till now we had nothing.

What we do is create a management pack that holds a script. In that script, all other files we want to distribute are placed in a comment section at the bottom of the script, after being converted to hex notation. It looks something like this:

'<BEGIN_FILE>jamaMaintenanceMode.vbs
'272D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D
'0A272046696C6520202020202020203A206A616D614D61696E
'74206C6F6720656E74727920746F207075742074686520636F
'61696E74656E616E6365206D6F64652E0D0A2720246376735F
'652E7662732C7620312E3420323030392F30382F3331203130
'2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D2D

At the other side (the agent side) we reconstruct the files and place them in a fixed location. Now we have distributed the files using SCOM! The solution can distribute both text-based and binary files, but binary files double in size when converted to hex notation. So distributing a 500K binary file ends up as a 1MB script to be distributed (and a 2MB script on the agent side, where it is stored in Unicode).
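
The conversion itself is easy to illustrate. The Python sketch below is not the generator from the download; it assumes 50 hex characters per comment line (the width of the sample above) and uses a hypothetical '<END_FILE> marker, since the sample only shows the begin marker.

```python
import binascii

HEX_CHARS_PER_LINE = 50  # the sample lines above are 50 hex characters wide


def encode_file(name, data):
    """Turn bytes into VBScript comment lines holding upper-case hex."""
    hex_text = binascii.hexlify(data).decode("ascii").upper()
    lines = ["'<BEGIN_FILE>" + name]
    for i in range(0, len(hex_text), HEX_CHARS_PER_LINE):
        lines.append("'" + hex_text[i:i + HEX_CHARS_PER_LINE])
    lines.append("'<END_FILE>")  # hypothetical end marker
    return lines


def decode_file(lines):
    """Rebuild (name, bytes) from the comment lines made by encode_file."""
    name = lines[0][len("'<BEGIN_FILE>"):]
    hex_text = "".join(line[1:] for line in lines[1:-1])
    return name, binascii.unhexlify(hex_text)


payload = b"'-------\r\n' File : demo.vbs\r\n"
encoded = encode_file("demo.vbs", payload)
assert decode_file(encoded) == ("demo.vbs", payload)
hex_chars = sum(len(line) - 1 for line in encoded[1:-1])
print(len(payload), "bytes ->", hex_chars, "hex characters")
```

Every byte becomes two hex characters, which is where the doubling in size comes from; storing each character as two bytes of Unicode on the agent doubles it once more.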

So in short it is a nice method for distributing either small files or to a very limited set of targets. Distributing large files to all agents isn’t a good idea with this method (but it will work).

 

What do you need:

image

The above files and directory structure.  You should create this directory structure before continuing:

include_binary – This is the location where you need to place the binary files which need to be distributed.

include_text – This is the location where you need to place the text files which need to be distributed.

output – This directory is used to store the generated code.

The files you need can be found here!

 

Importing these MPs without editing them results in a management pack that does nothing. So if you want to distribute a file, this is how:

  1. You should edit the BuildTargetScript.cmd file for every change in your distribution you want to be delivered:

    image

    The script that will generate the files, will check for this version number. If it is not equal to what is found in the registry it will recreate the files.  The STR_TARGET_DIR variable will hold your target directory on the target machines. This variable is used to check if the files can be generated correctly on your local desktop.

  2. You should also change the jamaTextDistribution.vbs

    image

    These two constants will be combined to create your target directory for the files:

    %SystemDrive%\STR_BASE_DIR\STR_DISTRIBUTION_DIR

    All files will be placed inside the above directory (this combined directory must be equal to the STR_TARGET_DIR from the BuildTargetScript.cmd file above).

    Secondly, you should change where in the registry you would like to store the version number:

    image

    You only have to do this once. If the key can not be found, it will be created.

     

  3. Finally you have to change the jamaDistribution.Distribution.xml file that will be distributed. Running the BuildTargetScript.cmd file will generate an output file in the output directory. The complete contents of that file need to be inserted into this .xml file. Below you can find the location inside the .xml file where you need to paste the contents of the output file:

    image 

  4. Now import the two management packs and your file(s) will be distributed.  NOTE: By default the script will run every hour, but if all files are present, the impact is minimal.
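The two consistency rules from the steps above (the combined directory must equal STR_TARGET_DIR, and the files are only recreated when the version differs from the registry) can be sketched in a few lines of Python. This is only an illustration of the checks, not the code from the scripts; the function names and the sample values are made up:

```python
def combined_target_dir(system_drive, base_dir, distribution_dir):
    """Join drive, base directory and distribution directory with backslashes."""
    parts = [system_drive.rstrip("\\"), base_dir.strip("\\"), distribution_dir.strip("\\")]
    return "\\".join(parts)


def dirs_match(target_dir, combined):
    """Compare two Windows paths, ignoring case and a trailing backslash."""
    return target_dir.rstrip("\\").lower() == combined.rstrip("\\").lower()


def needs_rebuild(installed_version, wanted_version):
    """Recreate the files when the stored version is missing or different."""
    return installed_version != wanted_version


# Hypothetical values standing in for the constants in the scripts.
combined = combined_target_dir("C:", "jama", "distribution")
print(combined)
print(dirs_match("c:\\jama\\distribution\\", combined))
print(needs_rebuild(None, "1.0.2"), needs_rebuild("1.0.2", "1.0.2"))
```

A missing registry value is modelled as None here, so a first run always rebuilds the files.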

 

What’s next?

Well, since we can now get any file to the agent side, we can start building a complete file distribution system (which we will :)) and after that we can finally automate the update of our manually installed agents and fully automate the manual installation!

Posted in management packs | Leave a Comment »

WINS Connector Alert

Posted by rob1974 on August 26, 2010

The WINS connector checks the WINS lookup by a DNS server. I suppose the monitor only runs when you have configured DNS to use WINS forward lookup.

image

For this monitor to work you need a static record in WINS that doesn’t exist in DNS (this is not mentioned in the knowledge, but the monitor runs an nslookup, so make sure the name doesn’t resolve through DNS itself), and you need to configure the monitor to look up this HostName. By default the monitor looks up “PlaceHolder”, so you could just create a WINS record named “PlaceHolder”.

I had done this, but I still received errors. When I ran the nslookup query I did receive a valid response for placeholder, so the WINS connector does work.

The knowledge mentions a debug flag that should give some helpful troubleshooting information to solve the issue. The description of the debug events I got:

DNS.TTL.vbs : Starting DNS.TTL.vbs Host:PlaceHolder Server:xxx.xxx.xxx.xxx

DNS.TTL.vbs : Writing Property Bag . State=False ttl1:0 ttl2:0 Authority Flag:

It’s not helpful at all and, even worse, it’s wrong as well. The VBScript file’s name is ttl.vbs, so look for this in the “system center management” folder. The parameters are wrong too, so running it manually with them fails as well.

To run the script manually on the DNS server, open a command prompt and run:

path.to.ttl.vbs>cscript /nologo ttl.vbs <hostname> <dnsserverip> false

<hostname> = static WINS entry, which doesn’t exist in DNS (placeholder). (The script says to fill in the FQDN, but this is incorrect as well.)

<dnsserverip> = listening IP address of the DNS server

boolean = debug flag. When you set this to true you get the worthless debug information in SCOM, so keep it on false 🙂

When you save the output to an xml file and open it you’ll get something like this.

<Collection>
  <DataItem type="System.PropertyBagData" time="2010-08-25T17:18:49.1162249+02:00" sourceHealthServiceId="BA0AF2AD-5058-0DA0-D5D0-BF3CDD878B88">
    <ConversionType>StateData</ConversionType>
    <Property Name="state" VariantType="8">ERROR</Property>
  </DataItem>
</Collection>
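If you save that output often, reading the state by eye gets old quickly. Below is a minimal Python sketch (my own helper, not something that ships with the management pack) that pulls the properties out of such a property bag. It assumes the saved XML uses plain straight quotes around the attribute values:

```python
import xml.etree.ElementTree as ET

# Sample property bag, copied from the script output above.
SAMPLE = """<Collection>
  <DataItem type="System.PropertyBagData" time="2010-08-25T17:18:49.1162249+02:00" sourceHealthServiceId="BA0AF2AD-5058-0DA0-D5D0-BF3CDD878B88">
    <ConversionType>StateData</ConversionType>
    <Property Name="state" VariantType="8">ERROR</Property>
  </DataItem>
</Collection>"""


def read_properties(xml_text):
    """Return a dict of property name -> value for every Property element."""
    root = ET.fromstring(xml_text)
    return {prop.get("Name"): (prop.text or "") for prop in root.iter("Property")}


props = read_properties(SAMPLE)
print(props["state"])
```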

I’ve modified the script so that it shows the stdout of the nslookup and prints a line when it exits the regex compare function early (which it shouldn’t when it functions OK).

Just save the script below to a temp location and run it from there. When you run this vbs and all goes well you should just get this as output:

std_output:

------------
Got answer:
    HEADER:
        opcode = QUERY, id = 1, rcode = NOERROR
        header flags:  response, want recursion, recursion avail.
        questions = 1,  answers = 1,  authority records = 0,  additional = 0

    QUESTIONS:
        xxxxxxxxxxx, type = PTR, class = IN
    ANSWERS:
    ->  xxxxxxxxxxxxxxxxx

        name = xxxxxxxxxxxxxxxxxx
        ttl = 168 (2 mins 48 secs)

------------
Server:  xxxxxxxxxxxxxxxxxx
Address:  xxxxxxxxxxxxxxxxxx

------------
Got answer:
    HEADER:
        opcode = QUERY, id = 2, rcode = NOERROR
        header flags:  response, auth. answer, want recursion, recursion avail.
        questions = 1,  answers = 1,  authority records = 0,  additional = 0

    QUESTIONS:
        xxxxxxxxxxxxxx, type = A, class = IN
    ANSWERS:
    ->  xxxxxxxxxxxx

        internet address = xxxxxxxxxxxx
        ttl = 1200 (20 mins)

------------
Name:    xxxxxxxxxxxxxxx
Address:  xxxxxxxxxxxx

When it fails it will log a line before this output:

exit function at regex: 16

This means the WINS_LOOKUP_REGEX array fails at "^\s*ttl = ",_ or the line after that.

I couldn’t be bothered to figure out the exact regular expression mismatch, rewrite the WINS_LOOKUP_REGEX array, disable the monitor and create a new one with a new script. I’ve just disabled this monitor, as it just gives me incorrect information.
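For anyone who does want to dig further, the matching technique itself is easy to reproduce outside SCOM. The Python sketch below mimics the idea of the GetSubMatches function: every pattern has to match at the start of whatever text is left, and the index of the first pattern that doesn’t match is reported. The pattern list and the sample texts are shortened, made-up examples, not the real nslookup output, so this shows one way such a chain can break rather than the actual cause:

```python
import re


def first_failing_pattern(patterns, text):
    """Match each pattern at the start of the remaining text, in order.

    Returns the index of the first pattern that does not match,
    or None when the whole chain matches.
    """
    remaining = text
    for index, pattern in enumerate(patterns):
        match = re.match(pattern, remaining)  # re.match anchors at the start
        if match is None:
            return index
        remaining = remaining[match.end():]
    return None


# Shortened, hypothetical pattern chain in the style of WINS_LOOKUP_REGEX.
patterns = [r".*\r\n", r"\s*answers = ", r"[0-9]*", r".*\r\n", r"\s*ttl = ", r"[0-9]*"]

good = "HEADER:\r\n    answers = 1, more\r\n    ttl = 168 (2 mins)\r\n"
# One extra line before the ttl line is enough to break a fixed chain.
bad = "HEADER:\r\n    answers = 1, more\r\n    internet address = 10.0.0.1\r\n    ttl = 168\r\n"

print(first_failing_pattern(patterns, good))
print(first_failing_pattern(patterns, bad))
```

The design weakness is visible here: a chain with a fixed number of "skip one line" patterns depends on the exact number of lines in the nslookup output.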

Modified script:

'
' Microsoft Corporation
' Copyright (c) Microsoft Corporation. All rights reserved.
'
' ttl.vbs
'
' Determine if a wins connector is healthy.
'
' Parameters -
'     TargetComputer   The FQDN of the computer targeted by the script.
'     Server           The listening ip
'     DebugFlag        True / False

Option Explicit

SetLocale("en-us")

Const DNS_TRACEEVENTNUMBER	= 1125
Const DNS_SCRIPTNAME = "DNS.TTL.vbs"
Const SCOM_ERROR=1
Const SCOM_WARNING=2
Const SCOM_INFORMATIONAL=4
Const SCOM_PB_STATEDATA = 3
Const NSLOOKUP_PATH = "%SystemRoot%\system32\nslookup.exe" 

Dim ImagePath, oWMI, rc, oArgs, oAPI, oDiscoveryData, oInst, SourceID, ManagedEntityId, TargetComputer, OSVersion, oDebugFlag
Dim TTL1 , TTL2, AuthorityFlag
dim host,server,bolDebug
dim objAPI  ,boolWins , oPropertyBag
Dim  sCommand,iErrCode, sOutput, sError,m_sNetshPath ,aSubMatches,  oShell, objArgs

Dim WINS_LOOKUP_REGEX

WINS_LOOKUP_REGEX = Array( _
                                 ".*\r\n",_
                                 ".*\r\n",_
                                 ".*\r\n",_
                                 ".*\r\n",_
                                 ".*\r\n",_
			      "^[\s]*questions = [0-9]*,",_
			      "^[\s]*answers = ",_
			      "[0-9]*,",_
			      "^[\s]*authority records = ",_
			      "[0-9]*",_
                                 ".*\r\n",_
                                 ".*\r\n",_
                                 ".*\r\n",_
                                 ".*\r\n",_
                                 ".*\r\n",_
                                 ".*\r\n",_
                                 "^\s*ttl = ",_
                                 "[0-9]*.*" )         

'***************
'
' start here.
'
'***************
On Error Resume Next

Set objAPI = CreateObject("MOM.ScriptAPI")
If Err.Number <> 0 Then
  Wscript.Quit
end if
Set oPropertyBag = objAPI.CreateTypedPropertyBag(3)
If Err.Number <> 0 Then
  ThrowErrorAndExit "CreateStateDataTypedPropertyBag failed. code = " & Err.Number
end if

Set objArgs = WScript.Arguments
If objArgs.Count <> 3 Then
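    ' LogScriptEvent expects script name, event id, severity and message.
    Call objAPI.LogScriptEvent(DNS_SCRIPTNAME, DNS_TRACEEVENTNUMBER, SCOM_ERROR, "Usage: " & DNS_SCRIPTNAME & " <host> <dnsserverip> <debug: true | false>")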
    wscript.Quit
End If 

host = objArgs(0)
server = objArgs(1)
bolDebug= objArgs(2)

Set oShell = CreateObject("WScript.Shell")
boolWins=cbool(false)
TTL1=0
TTL2=0

Set oShell = CreateObject("WScript.Shell")

trace "Starting " & DNS_SCRIPTNAME & " Host:" & host  & " Server:" & server 

sOutput=ExecuteCmd("-debug -querytype=a " +host + " "+ server ,NSLOOKUP_PATH  ,true)
If  LCase(sOutput) <> "error" Then

	If   GetSubMatches( WINS_LOOKUP_REGEX, sOutput, sOutput, aSubMatches) Then

		TTL1=cint(aSubMatches(8) )
		AuthorityFlag=cint(aSubMatches(4) )

    end if
end if

if 	AuthorityFlag=1 then
	Wscript.sleep (1020)
	sOutput=ExecuteCmd("-debug -querytype=a " +host + " "+ server ,NSLOOKUP_PATH  ,true)
	If  LCase(sOutput) <> "error" Then

		If   GetSubMatches( WINS_LOOKUP_REGEX, sOutput, sOutput, aSubMatches) Then

			TTL2=cint(aSubMatches(8) )
			if ttl1>ttl2 then
				 boolWins=cbool(true)
			end if
		end if
    end if
end if	

wscript.echo ""
wscript.echo "std_output:"
wscript.echo sOutput 

trace "Writing Property Bag . State=" & cstr(boolWins) & " ttl1:" & cstr(ttl1) & " ttl2:" & cstr(ttl2) & " Authority Flag:" & cstr(	AuthorityFlag)

if boolWins=true then
	oPropertyBag.AddValue "state", "OK"
else
	oPropertyBag.AddValue "state", "ERROR"
end if
objAPI.AddItem(oPropertyBag)
If Err.Number <> 0 Then ThrowErrorAndExit "Error adding state data to property bag. code = " & Err.Number
objAPI.ReturnItems
If Err.Number <> 0 Then ThrowErrorAndExit "Error returning property bag data. code = " & Err.Number  

Set objAPI = Nothing
Set oPropertyBag = Nothing

Wscript.Quit

'*******************************************************

Sub ThrowErrorAndExit(Message)

   Err.Clear
   ' Use objAPI here: oAPI is declared but never assigned in this script.
   Call objAPI.LogScriptEvent(DNS_SCRIPTNAME, DNS_TRACEEVENTNUMBER, SCOM_ERROR, Message)
   WScript.Quit

End Sub

Sub Trace(Message)
   If (bolDebug) Then
      Call objAPI.LogScriptEvent(DNS_SCRIPTNAME, DNS_TRACEEVENTNUMBER, SCOM_INFORMATIONAL, Message)
   End If

End Sub

Function GetSubMatches(ByVal aRegexes, ByVal sText, ByRef sRemainingText, ByRef aCapturedSubMatches)
  Dim oRegex
  Set oRegex = New RegExp
  oRegex.Global = False

  Dim oMatches
  Dim oMatch
  Dim sPattern
  Dim aSubMatches()
  aCapturedSubMatches = aSubMatches

  GetSubMatches = False

  Dim i
  Dim lSubMatchCount

  lSubMatchCount = 0
  sRemainingText = sText

  dim intCount
  intCount = 0

  For i = 0 To UBound(aRegexes)
    sPattern = aRegexes(i)
    oRegex.Pattern = "^" & sPattern
    Set oMatches = oRegex.Execute(sRemainingText)
    if oMatches.Count = 0 then
        wscript.echo "exit function at regex: " & intCount
    end if
    If oMatches.Count <> 1 Then
      sRemainingText = sText
      Exit Function
    End If

    Set oMatch = oMatches(0)
    sRemainingText = Mid(sRemainingText, oMatch.Length + 1)

    ' save output If odd line, or only line.

    If i Mod 2 = 1 Then
      lSubMatchCount = lSubMatchCount + 1
      ReDim Preserve aSubMatches(lSubMatchCount - 1)
      aSubMatches(lSubMatchCount - 1) = oMatch.Value
    elseIf UBound(aRegexes)=0 Then
      lSubMatchCount = lSubMatchCount + 1
      ReDim Preserve aSubMatches(lSubMatchCount - 1)
      aSubMatches(lSubMatchCount - 1) = oMatch.Value
    End If
    intCount = intCount + 1
  Next

  GetSubMatches = True
  aCapturedSubMatches = aSubMatches

End Function

Function ExecuteCmd(strOptionToUse, strCmdToUse, boolReadOutput)
Dim ncControlcommand
Dim oShell
Dim curDir
Dim strExecOut

Set oShell = CreateObject("WScript.Shell")
curDir = oShell.CurrentDirectory
ncControlcommand =  "cmd.exe /C """ & QuoteWrap(strCmdToUse) & " " & strOptionToUse & " " &"""" 

IF boolReadOutput Then
    strExecOut = RunCmd(ncControlcommand,true)
Else
    strExecOut = RunCmd(ncControlcommand,false)
End If
ExecuteCmd = strExecOut
End Function

Function RunCmd(CmdString, boolGetOutPut)
    Dim wshshell
    Dim oExec
    Dim output
    Dim strOutPut

    Set wshshell = CreateObject("WScript.Shell")
    Set oExec = wshshell.Exec(CmdString)
    Set output = oExec.StdOut
    Do While oExec.Status = 0
         WScript.Sleep 100
         if output.AtEndOfStream = false then
            IF boolGetOutPut Then
                    strOutPut = strOutPut & output.ReadAll
                End IF
         else
              exit Do
         End If
    Loop
    IF boolGetOutPut Then
        strOutPut = strOutPut & output.ReadAll
    Else
        strOutPut = "1"
    End IF    

    If oExec.ExitCode <> 0 Then
         strOutPut = "Error"

    End If

    Set wshshell = Nothing
    RunCmd = strOutPut
End Function

Function QuoteWrap(myString)
      If (myString <> "") And (left(mySTring,1) <> Chr(34)) And (Right(myString,1) <> Chr(34)) Then
            QuoteWrap = Chr(34) & myString & Chr(34)
      Else
            QuoteWrap = myString
      End If
End Function

Function IsValidObject(ByVal oObject)
  IsValidObject = False

  If IsObject(oObject) Then
    If Not oObject Is Nothing Then
      IsValidObject = True
    End If
  End If
End Function

Posted in management packs, troubleshooting | Leave a Comment »

What is the optimal setting for my environment when it comes to missed heartbeats?

Posted by rob1974 on July 14, 2010

I’m probably not the first to write about heartbeats and its mechanism. So just a quick overview of how it works.

Heartbeat mechanism: An agent sends out a heartbeat on the monitoring port 5723 to a management server. The management server (the RMS, actually) keeps track of the last received heartbeat of each agent in its management group. When an agent’s heartbeat hasn’t been received for X time, a “heartbeat failure” alert is generated. The heartbeat failure alert in turn triggers a ping to the agent-managed server’s FQDN. When this fails as well, another alert is generated: the “server unreachable” alert.

This blog is about the “X” time to wait before a heartbeat failure alert is generated. X isn’t a directly configurable time. It’s a combination of 2 settings: the agent setting “the heartbeat interval” in seconds and the server setting “number of missed heartbeats allowed”. The defaults are 60 seconds and 3 missed heartbeats allowed. So by default “X” is 60×3 = 180 seconds = 3 minutes.
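The arithmetic is trivial, but writing it down helps when you start comparing settings. A tiny Python sketch (the function name is mine, not a SCOM setting):

```python
def heartbeat_failure_delay(interval_seconds, missed_allowed):
    """Seconds without a heartbeat before the heartbeat failure alert is raised."""
    return interval_seconds * missed_allowed


default_x = heartbeat_failure_delay(60, 3)
print(default_x, "seconds, or", default_x / 60, "minutes")

# Keeping the 60 second interval, 8 missed heartbeats gives an X of 8 minutes.
x_eight = heartbeat_failure_delay(60, 8) / 60
```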

image

image 

I assume the default is OK in most cases, but if you have a large and complex network, or reboot servers without putting them in maintenance mode, it might generate a lot of noise as well. So here’s what we did to find the optimal setting for X without having to experiment with the settings themselves.

By running the query below you can see the number of alerts for “Health Service HeartBeat Failure” and “Failed to Connect to Computer” with a closed status.

Select alertstringname, count(*) as Number_of_alerts from AlertView
where ResolutionState = 255
and (alertstringname = 'Health Service HeartBeat Failure' or alertstringname = 'Failed to Connect to Computer')
group by AlertStringNAme
order by 2 DESC

If you haven’t changed the retention for closed alerts, this gives the number for about 1 week. For our environment it turned out to be around 3000 heartbeat alerts, which is about one alert per server per week. However, most of these alerts were gone before anyone looked at the “problem”.

In this post I’ve already given some queries to identify the auto-closing alerts. I modified the query a bit to only see heartbeat failure and failed to connect to computer alerts.

By running the query below you can see the number of heartbeat failure alerts which were closed within 2 minutes after creation.

Select alertstringname, count(*) as Number_of_alerts from AlertView
where ResolutionState = 255
and ResolvedBy = 'system'
and (alertstringname = 'Health Service HeartBeat Failure' or alertstringname = 'Failed to Connect to Computer')
and DATEDIFF(MI,TimeRaised,TimeResolved) <= 2
group by AlertStringNAme
order by 2 DESC

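The result of this query in my environment was that 1450 alerts were auto-closed within the first 2 minutes. An alert is raised after X = 3 minutes without a heartbeat, so an alert that closes within 2 minutes means the heartbeat came back within about 5 minutes. So if X had been 5 minutes, it probably would have prevented 1450 alerts.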
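The same reasoning can be repeated for any candidate X. Here is a small Python sketch of that estimate; it assumes you have exported, per closed alert, the number of minutes between TimeRaised and TimeResolved. The sample numbers are made up, and since DATEDIFF only counts minute boundaries the result is an approximation:

```python
def alerts_remaining(open_minutes, current_x, candidate_x):
    """Estimate how many alerts would still be raised with a larger X.

    open_minutes: minutes each alert stayed open before it auto-closed.
    An alert that closed within (candidate_x - current_x) minutes would
    not have been raised at all with the larger X.
    """
    extra_wait = candidate_x - current_x
    if extra_wait <= 0:
        return len(open_minutes)
    return sum(1 for minutes in open_minutes if minutes > extra_wait)


# Hypothetical export: minutes between TimeRaised and TimeResolved per alert.
sample = [0, 1, 1, 2, 3, 5, 12, 45, 90]

for x in (3, 5, 8):
    print("X =", x, "->", alerts_remaining(sample, 3, x), "alerts")
```

With the made-up sample this leaves 9, 5 and 3 alerts for X = 3, 5 and 8 minutes.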

I’ve plotted X versus the expected number of heartbeat failures. Please note that I left out quite a lot of values for X without adjusting the scale for this, and that even after 1 hour I would still get a few heartbeat failures.

image

So what’s the optimal X for this environment?

Actually, you still can’t say what the setting should be. It depends on what is acceptable for your environment, as we’re talking about how fast you can detect whether a server is down or not. Setting X to 60 would give us the fewest heartbeat alerts, but it wouldn’t make any sense either. I believe finding a balance between noise and the moment we have to take a look is more important. I’d say the optimal X for my environment is 7-8. This will leave about 800 heartbeat alerts weekly, but this is acceptable for us.

Also note you might miss unexpected reboots whatever the value for X is. If it’s important not to miss them, just pick up the event about unexpected reboot from the system eventlog by an alert rule and make that alert critical.

Posted in Agent Settings, Management Servers | 4 Comments »

The Moving Average threshold alerts.

Posted by rob1974 on July 14, 2010

A common tuning strategy is to look at the top 25 most common alerts. In the ‘Microsoft ODR Report Library’ you can find this report. When I first started looking at this report I noticed several alerts in this top X which I had never seen in the console before. The only way I could see these alerts was by creating a “closed alert” view. Most of these alerts were closed by “system” within 1-2 minutes.

I wanted to know how many alerts in my environment were automatically closed within 5 minutes. The query below gives the count for those alerts.

select count(*) from AlertView
where ResolutionState = 255
and ResolvedBy = 'system'
and DATEDIFF(MI,TimeRaised,TimeResolved) <= 5

In my environment more than 1/3 of the total alerts (run only the first line to get the total) were closed within 5 minutes. Most of these alerts aren’t being looked at, because they close so fast. So why are they generated in the first place?

With a few more queries I hoped to find some details about the alerts. The first identifies which alerts are generated the most and closed within the time limit (thanks to Brian McDermott for helping out with these).

Select top 10 alertstringname, count(*) from AlertView
where ResolutionState = 255
and ResolvedBy = 'system'
and DATEDIFF(MI,TimeRaised,TimeResolved) < 5
group by AlertStringNAme
order by 2 DESC

To identify which objects generated the most of these errors.

select top 10 monitoringobjectfullname, count(*) from AlertView
where ResolutionState = 255
and ResolvedBy = 'system'
and DATEDIFF(MI,TimeRaised,TimeResolved) < 5
group by monitoringobjectfullname
order by 2 DESC

I can’t really give a strategy to tune these alerts, as they seem to be incidents where 1 server is near its threshold value for some time (just going over it, dropping under it again, etc.). You should tune this based on the monitor and its configuration. However, most of these alerts seem to come from “moving average” type monitors.

A moving average is an average over several samples; with each new sample, the oldest sample value is dropped for the new one. With a moving average the monitor compares its average value to its threshold at the same rate as the sample frequency, and this can lead to a lot of state changes and thus alerts.

In the table below I’ve created fictional data to prove my point. The 3rd column displays the moving average over 5 samples. As you can see, after the 5th sample it gives a value every sample. The 4th column takes just the average over 5 samples, then takes 5 new samples and creates a new value.

Suppose we have a 1 minute sample rate. For moving average it means getting 9 alerts in 30 minutes which exist for 1 to 3 minutes. When we just make an average over 5 minutes and then drop the results and collect 5 new points it would have been 3 alerts which exist 5 to 10 minutes. Still a lot, but because the alert exists as “new” longer, it might have been noted by some operator and some admin might have taken a look at the server and resolved the issue.
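The difference between the two ways of averaging is easy to play with in a few lines of Python. The sketch below is only an illustration with its own made-up samples (a window of 3 and a threshold of 80, not the data from my table), counting how often each method flips from healthy to over the threshold:

```python
def moving_average(samples, window):
    """One value per sample once the window is full."""
    return [sum(samples[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(samples))]


def block_average(samples, window):
    """One value per complete, non-overlapping block of `window` samples."""
    return [sum(samples[i:i + window]) / window
            for i in range(0, len(samples) - window + 1, window)]


def count_alerts(values, threshold):
    """Count the transitions from healthy to over the threshold."""
    alerts = 0
    over = False
    for value in values:
        now_over = value > threshold
        if now_over and not over:
            alerts += 1
        over = now_over
    return alerts


# Hypothetical samples hovering around a threshold of 80.
samples = [78, 82, 79, 84, 77, 78, 86, 79, 74, 85, 82, 76]

moving = moving_average(samples, 3)
blocks = block_average(samples, 3)

print(len(moving), "moving values,", count_alerts(moving, 80), "alerts")
print(len(blocks), "block values,", count_alerts(blocks, 80), "alerts")
```

With these samples the moving average produces 10 values and 3 alerts, while the block average produces 4 values and 1 alert.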

 image

Tuning “moving average” monitors is quite difficult, and the only way to reduce alerts is to use a lower sample rate. A higher threshold value doesn’t work, as these are incidents where a server is near the threshold value.

Posted in management packs, troubleshooting | 1 Comment »

another DNS tuning post!

Posted by rob1974 on July 6, 2010

 

Actually, this is my first DNS tuning post, but as the DNS mp has proven itself quite noisy you can find loads of other blogs with tuning tips. I haven’t found this one anywhere, so here goes my tip about the “DNS 200X Forwarder Availability Monitor” and why you should disable it or configure it properly.

What does this monitor do:

It just runs an A-record (ns)lookup for www.microsoft.com. So basically it assumes we have a forwarder in place and that resolving www.microsoft.com actually uses this forwarder. Although many probably do (unless you use root hints), it’s actually very similar to the “DNS 200X External Resolution Monitor”, which does an NS-record lookup against www.microsoft.com (but override this to microsoft.com for a better result, as that’s a correct NS-record lookup).

So the “forwarder availability monitor” doesn’t actually test forwarder availability, at least not anything that isn’t already being tested by the external resolution monitor.

But there’s a use for this test. When you do use “conditional forwarding” in a DNS server, you can configure this test to look up a record in a domain covered by the forwarding rule.

E.g. you have a conditional forwarding rule for “my2nddomain.com”. Set an override on the “host” to a-valid-a-record.my2nddomain.com and actually make use of the forwarder availability monitor.

When you make use of conditional forwarders, configure this monitor to actually use one of them. But when you don’t use forwarders, or you use the “forward all domains” option, then just do like me and disable the monitor.

image

Posted in management packs | Leave a Comment »

Setting an override on a rule with PowerShell (OpsMgr snapin part 3)

Posted by MarcKlaver on May 12, 2010

A few examples on how to use the two cmdlets.

To try the cmdlets, type:

add-pssnapin mkropsmgr

This command adds the PowerShell snapin. This is required to use the cmdlets. You should only have to use this once in every session you run.

Get-RuleParameter -ClassCriteria "DisplayName like '%'" -RuleCriteria "DisplayName like '%'" -Verbose

This command will show you all overridable parameters from all rules. If you want to see the whole output, you should be patient 🙂 This command also needs to be run on the RMS server, since it will default to the “localhost” value to connect to the SDK service and the user running the command must have sufficient privileges to query the SCOM environment.

Get-RuleParameter -RootManagementServer "RMS01" -Credential $myCred -ClassCriteria "DisplayName like 'SQL%'" -RuleCriteria "DisplayName like 'Collect%'" -Verbose

This command will show you all overridable parameters of all rules whose display name starts with “Collect”, for all classes whose display name starts with “SQL”. The command will also try to connect to the RMS called “RMS01” and will ask for your credentials to use for connecting to this RMS. And finally it uses the -Verbose switch to show additional information about the progress of the command.

add-pssnapin mkrOpsMgr

add-pssnapin "Microsoft.EnterpriseManagement.OperationsManager.Client";

set-location "OperationsManagerMonitoring::";

$con = new-managementGroupConnection -ConnectionString:"RMS01" -Credential $myCred;

$mp = get-managementpack | where {$_.Name -eq "mpTestRuleOverride"}

Set-RuleOverride -ManagementPack $mp -ClassCriteria "DisplayName='SQL 2005 DB'" -RuleCriteria "DisplayName='Collect Database Free Space (%)'" -Parameter "Enabled" -Value "False" -Verbose -Enforce

These commands will load both the mkrOpsMgr and the Microsoft Operations Manager snapin, followed by a connection to the SDK on the server called “RMS01” (it will ask for credentials). They will retrieve the object for the “mpTestRuleOverride” management pack (which must exist) and will disable the rule with DisplayName “Collect Database Free Space (%)” from the class with the DisplayName “SQL 2005 DB”. This is done by setting the parameter “Enabled” to “False”. This override is enforced with the -Enforce switch.

add-pssnapin mkrOpsMgr

add-pssnapin "Microsoft.EnterpriseManagement.OperationsManager.Client";

set-location "OperationsManagerMonitoring::";

$con = new-managementGroupConnection -ConnectionString:"RMS01" -Credential $myCred;

$mp = get-managementpack | where {$_.Name -eq "mpTestRuleOverride"}

Set-RuleOverride -RootManagementServer "RMS01" -Credential $myCred -ManagementPack $mp -ClassCriteria "DisplayName='SQL 2005 DB'" -RuleCriteria "DisplayName='Collect Database Free Space (%)'" -DataSource "DS" -Parameter "IntervalSeconds" -Value "7777"

These commands will override the rule “Collect Database Free Space (%)” from the class “SQL 2005 DB” and set the value for the “IntervalSeconds” parameter to 7777. Note that running these commands a second time will generate an error indicating there is an override conflict on the rule. You can use the override name shown by the error to change the existing override, using the -OverrideId parameter of the Set-RuleOverride cmdlet.

Running this command the second time on my system, resulted in the error shown below:

: Verification failed with [1] errors:
-------------------------------------------------------
Error 1:
: Failed to verify Override [mkrOpsMgr.SetOverrideRule.0888eee2d8754f47820bac3e7887d7d5].
Override [mkrOpsMgr.SetOverrideRule.0888eee2d8754f47820bac3e7887d7d5] is a duplicate to Override [mkrOpsMgr.SetOverrideRule.9415ca506dd540d683c907bbdf1062aa] defined within the same ManagementPack. Please remove any one of the duplicate overrides.
-------------------------------------------------------

Failed to verify Override [mkrOpsMgr.SetOverrideRule.0888eee2d8754f47820bac3e7887d7d5].Override [mkrOpsMgr.SetOverrideRule.0888eee2d8754f47820bac3e7887d7d5] is a duplicate to Override [mkrOpsMgr.SetOverrideRule.9415ca506dd540d683c907bbdf1062aa] defined within the same ManagementPack. Please remove any one of the duplicate overrides.

The first OverrideId in the error (ending in 7d5) is the Id generated by the cmdlet when it was run for the second time. The second OverrideId (ending in 2aa) is the existing one in the management pack. If you need to change an existing override, you should supply the existing OverrideId to the cmdlet:
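If you capture the error text in a script, picking out the existing OverrideId can be automated. A minimal Python sketch (my own helper, not part of the snapin) that looks for the Id following “is a duplicate to Override”; it assumes the wording of the error stays as shown above:

```python
import re

# Error text as shown above, with the console line wraps removed.
ERROR_TEXT = (
    "Failed to verify Override [mkrOpsMgr.SetOverrideRule.0888eee2d8754f47820bac3e7887d7d5]."
    "Override [mkrOpsMgr.SetOverrideRule.0888eee2d8754f47820bac3e7887d7d5] is a duplicate to "
    "Override [mkrOpsMgr.SetOverrideRule.9415ca506dd540d683c907bbdf1062aa] defined within the "
    "same ManagementPack. Please remove any one of the duplicate overrides."
)


def existing_override_id(error_text):
    """Return the Id of the override that already exists, or None if not found."""
    # \s+ between the words tolerates line wraps in captured console output.
    match = re.search(r"is\s+a\s+duplicate\s+to\s+Override\s+\[([^\]]+)\]", error_text)
    return match.group(1) if match else None


print(existing_override_id(ERROR_TEXT))
```

The returned value is the one to pass to the -OverrideId parameter.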

add-pssnapin mkrOpsMgr

add-pssnapin "Microsoft.EnterpriseManagement.OperationsManager.Client";

set-location "OperationsManagerMonitoring::";

$con = new-managementGroupConnection -ConnectionString:"RMS01" -Credential $myCred;

$mp = get-managementpack | where {$_.Name -eq "mpTestRuleOverride"}

Set-RuleOverride -RootManagementServer "RMS01" -Credential $myCred -ManagementPack $mp -ClassCriteria "DisplayName='SQL 2005 DB'" -RuleCriteria "DisplayName='Collect Database Free Space (%)'" -DataSource "DS" -Parameter "IntervalSeconds" -Value "8888" -overrideid "mkrOpsMgr.SetOverrideRule.9415ca506dd540d683c907bbdf1062aa"

These commands will now set the same override to the new value of 8888.

I now realize I have not implemented a method to remove an override, so a new version will be built 🙂

Posted in management packs, PowerShell | 3 Comments »

Targeting rules and customers

Posted by MarcKlaver on May 6, 2010

We are managing several customers with one management group and one management environment. Some rules are very customer specific and we wanted to prevent rules specific for customer A being visible at customer B (and all other customers).

To accomplish this, we tried this:

  • We targeted a disabled rule to the windows computer object and enabled the rule for targets specific for customer A (using our computer attributes we also use for customer group population, see also this link). Unfortunately, this didn’t work. The rule ended up at all windows computer targets, regardless of the rule being enabled or disabled for the target. 
  • Next we tried a new management pack, with only disabled rules. Again the rules were distributed over all customers.

So our conclusion is that the agent itself determines if a rule is enabled/disabled, not the RMS. The RMS only checks if the agent has a target for a rule/mp.

To prevent the rules from being distributed to customer B (and all other customers), we needed to be able to target only to computers for customer A. The only option we found was to create a class to target computers at a customer level. Again we used our registry key we also use for the group population. But this time it is used to discover a class, specific to the customer.

We decided that it wasn’t a problem that a customer specific rule was available on all servers of a single customer, so we created one class for each customer:

image 

or in XML:

<TypeDefinitions>
  <EntityTypes>
    <ClassTypes>
      <ClassType ID="jama.CustomerId.CustomerA" Accessibility="Public" Abstract="false" Base="Windows!Microsoft.Windows.ComputerRole" Hosted="true" Singleton="false"/>
      <ClassType ID="jama.CustomerId.CustomerB" Accessibility="Public" Abstract="false" Base="Windows!Microsoft.Windows.ComputerRole" Hosted="true" Singleton="false"/>
      <ClassType ID="jama.CustomerId.CustomerC" Accessibility="Public" Abstract="false" Base="Windows!Microsoft.Windows.ComputerRole" Hosted="true" Singleton="false"/>
      <ClassType ID="jama.CustomerId.CustomerD" Accessibility="Public" Abstract="false" Base="Windows!Microsoft.Windows.ComputerRole" Hosted="true" Singleton="false"/>
    </ClassTypes>
  </EntityTypes>
</TypeDefinitions>

Next we created a discovery, to discover the “customer” classes:

image

The discovery will discover any customer class we have defined:

image

or in XML:

<DiscoveryTypes>
  <DiscoveryClass TypeID="jama.CustomerId.CustomerA" />
  <DiscoveryClass TypeID="jama.CustomerId.CustomerB" />
  <DiscoveryClass TypeID="jama.CustomerId.CustomerC" />
  <DiscoveryClass TypeID="jama.CustomerId.CustomerD" />
</DiscoveryTypes>

The discovery itself is done with a script. Originally I wanted to create this code:

strCustomerId = ucase(jamaGetAttribute(STR_ATTRIBUTE_CUSTOMER_ID)) ' retrieve customer id
if(strCustomerId <> "") then
    set g_objClass = g_objDiscoveryData.CreateClassInstance("$MPElement[Name='jama.CustomerId." & strCustomerId & "']$")
end if

But the authoring console didn’t like that (I never checked it with the operations console, but I assume the import will fail as well), because it didn’t know anything about the element "jama.CustomerId." & strCustomerId 🙂 When checking the management pack, it wants to resolve all “MPElement” names in the scripts, and it cannot do that for a name that is built from a variable at run time.

So I created a select case statement covering all customer names in the management pack:

strCustomerId = ucase(jamaGetAttribute(STR_ATTRIBUTE_CUSTOMER_ID))

if(strCustomerId <> "") then
    select case strCustomerId
        case "CUSTOMERA" set g_objClass = g_objDiscoveryData.CreateClassInstance("$MPElement[Name='jama.CustomerId.CustomerA']$")
        case "CUSTOMERB" set g_objClass = g_objDiscoveryData.CreateClassInstance("$MPElement[Name='jama.CustomerId.CustomerB']$")
        case "CUSTOMERC" set g_objClass = g_objDiscoveryData.CreateClassInstance("$MPElement[Name='jama.CustomerId.CustomerC']$")
        case "CUSTOMERD" set g_objClass = g_objDiscoveryData.CreateClassInstance("$MPElement[Name='jama.CustomerId.CustomerD']$")
    end select

    g_objClass.AddProperty "$MPElement[Name='Windows!Microsoft.Windows.Computer']/PrincipalName$", "$Target/Property[Type='Windows!Microsoft.Windows.Computer']/PrincipalName$"

    g_objClass.AddProperty "$MPElement[Name='System!System.Entity']/DisplayName$", "$Target/Property[Type='Windows!Microsoft.Windows.Computer']/PrincipalName$"
    g_objDiscoveryData.AddInstance(g_objClass)
end if
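The idea behind the select case is a fixed lookup from customer id to class name, so that every class name appears literally in the script. Sketched in Python (purely as an illustration; the discovery itself stays VBScript), the same lookup looks like this. It also makes one thing explicit that the select case above leaves open: a customer id without a matching case gives no class at all, so nothing should be added for it.

```python
# Fixed table: every class name is written out literally, as in the select case.
CUSTOMER_CLASSES = {
    "CUSTOMERA": "jama.CustomerId.CustomerA",
    "CUSTOMERB": "jama.CustomerId.CustomerB",
    "CUSTOMERC": "jama.CustomerId.CustomerC",
    "CUSTOMERD": "jama.CustomerId.CustomerD",
}


def class_for_customer(customer_id):
    """Return the class ID for a customer id, or None when it is unknown or empty."""
    return CUSTOMER_CLASSES.get(customer_id.strip().upper())


for value in ("customera", "CustomerD", "", "unknown"):
    print(repr(value), "->", class_for_customer(value))
```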

Combining all of this, the final XML file looks like this:

<?xml version="1.0" encoding="utf-8"?><ManagementPack ContentReadable="true" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <Manifest>
    <Identity>
      <ID>jamaClass</ID>
      <Version>1.0.0.0</Version>
    </Identity>
    <Name>jamaClass</Name>
    <References>
      <Reference Alias="System">
        <ID>System.Library</ID>
        <Version>6.0.6278.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="Windows">
        <ID>Microsoft.Windows.Library</ID>
        <Version>6.0.6278.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
    </References>
  </Manifest>
  <TypeDefinitions>
    <EntityTypes>
      <ClassTypes>
        <ClassType ID="jama.CustomerId.CustomerA" Accessibility="Public" Abstract="false" Base="Windows!Microsoft.Windows.ComputerRole" Hosted="true" Singleton="false"/>
        <ClassType ID="jama.CustomerId.CustomerB" Accessibility="Public" Abstract="false" Base="Windows!Microsoft.Windows.ComputerRole" Hosted="true" Singleton="false"/>
        <ClassType ID="jama.CustomerId.CustomerC" Accessibility="Public" Abstract="false" Base="Windows!Microsoft.Windows.ComputerRole" Hosted="true" Singleton="false"/>
        <ClassType ID="jama.CustomerId.CustomerD" Accessibility="Public" Abstract="false" Base="Windows!Microsoft.Windows.ComputerRole" Hosted="true" Singleton="false"/>
      </ClassTypes>
    </EntityTypes>
  </TypeDefinitions>
  <Monitoring>
    <Discoveries>
      <Discovery ID="jama.CustomerId.Server.Discovery" Enabled="true" Target="Windows!Microsoft.Windows.Server.Computer" ConfirmDelivery="false" Remotable="true" Priority="Normal">
        <Category>Discovery</Category>
        <DiscoveryTypes>
          <DiscoveryClass TypeID="jama.CustomerId.CustomerA" />
          <DiscoveryClass TypeID="jama.CustomerId.CustomerB" />
          <DiscoveryClass TypeID="jama.CustomerId.CustomerC" />
          <DiscoveryClass TypeID="jama.CustomerId.CustomerD" />
        </DiscoveryTypes>
        <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.TimedScript.DiscoveryProvider">
          <IntervalSeconds>86400</IntervalSeconds>
          <SyncTime />
          <ScriptName>jama.CustomerId.Server.Discovery.vbs</ScriptName>
          <Arguments />
          <ScriptBody><![CDATA['
option explicit
setLocale("en-us")
on error goto 0

const EVENT_ID_FOUND_CLASS   = 110

const STR_ATTRIBUTE_CUSTOMER_ID = "customer_id"
const HKLM                      = &H80000002
const STR_ATTRIBUTE_KEY         = "SOFTWARE\jama00\OpsMgr2007\attributes"

dim g_objAPI
dim g_objClass
dim g_objDiscoveryData

jamaInitialize()
jamaCreateDiscovery()
g_objAPI.Return(g_objDiscoveryData)

function jamaInitialize()
    set g_objAPI           = CreateObject("MOM.ScriptAPI")
    set g_objDiscoveryData = g_objAPI.CreateDiscoveryData(0, "$MPElement$", "$Target/Id$")
end function

function jamaCreateDiscovery()
    dim strCustomerId

    strCustomerId = ucase(jamaGetAttribute(STR_ATTRIBUTE_CUSTOMER_ID))
    if(strCustomerId <> "") then
        jamaWriteLog EVENT_ID_FOUND_CLASS, "Discovered class: jama.CustomerId." & strCustomerId
        select case strCustomerId
            case "CUSTOMERA" set g_objClass = g_objDiscoveryData.CreateClassInstance("$MPElement[Name='jama.CustomerId.CustomerA']$")
            case "CUSTOMERB" set g_objClass = g_objDiscoveryData.CreateClassInstance("$MPElement[Name='jama.CustomerId.CustomerB']$")
            case "CUSTOMERC" set g_objClass = g_objDiscoveryData.CreateClassInstance("$MPElement[Name='jama.CustomerId.CustomerC']$")
            case "CUSTOMERD" set g_objClass = g_objDiscoveryData.CreateClassInstance("$MPElement[Name='jama.CustomerId.CustomerD']$")
            case else        set g_objClass = nothing    ' Unknown customer id: discover nothing.
        end select
        if(not (g_objClass is nothing)) then
            g_objClass.AddProperty "$MPElement[Name='Windows!Microsoft.Windows.Computer']/PrincipalName$", "$Target/Property[Type='Windows!Microsoft.Windows.Computer']/PrincipalName$"
            g_objClass.AddProperty "$MPElement[Name='System!System.Entity']/DisplayName$", "$Target/Property[Type='Windows!Microsoft.Windows.Computer']/PrincipalName$"
            g_objDiscoveryData.AddInstance(g_objClass)
        end if
    end if
end function

function jamaGetAttribute(byval strAttribute)
    dim strResult
    dim objReg

    strResult = ""
    on error resume next
        err.clear
        set objReg=GetObject("winmgmts:{impersonationLevel=impersonate}!\\.\root\default:StdRegProv")
        if(err.number <> 0) then
            strResult = ""
        else
            objReg.GetStringValue HKLM, STR_ATTRIBUTE_KEY, strAttribute, strResult
            if(err.number <> 0) then
                strResult = ""
            elseif(isnull(strResult)) then
                strResult = ""
            end if
        end if
    on error goto 0

    jamaGetAttribute = strResult
end function

function jamaWriteLog(iEventId, strMsg)
    dim bEnabled : bEnabled = true      ' Set to false to disable logging.

    if(bEnabled) then
        dim objAPI
        dim strTemp

        set objAPI = CreateObject("MOM.ScriptAPI")
        if((iEventId >= 100) and (iEventId < 20000)) then
            call objAPI.LogScriptEvent(wscript.scriptname, iEventId, 4, strMsg)
        else
            strTemp = "Out of range event id found: " & cstr(iEventId) & ". Original message: " & strMsg
            call objAPI.LogScriptEvent(wscript.scriptname, 0, 4, strTemp)
        end if
    end if
end function
]]></ScriptBody>
          <TimeoutSeconds>300</TimeoutSeconds>
        </DataSource>
      </Discovery>
    </Discoveries>
  </Monitoring>
  <LanguagePacks>
    <LanguagePack ID="ENU" IsDefault="false">
      <DisplayStrings>
        <DisplayString ElementID="jamaClass">
          <Name>jamaClass</Name>
          <Description>Agent version: 3.1.0  build 442 </Description>
        </DisplayString>
        <DisplayString ElementID="jama.CustomerId.Server.Discovery">
          <Name>jama.CustomerId.Server.Discovery (name)</Name>
          <Description>jama.CustomerId.Server.Discovery (description)</Description>
        </DisplayString>
        <DisplayString ElementID="jama.CustomerId.CustomerA">
          <Name>jama.CustomerId.CustomerA (name)</Name>
          <Description>jama.CustomerId.CustomerA (description)</Description>
        </DisplayString>
        <DisplayString ElementID="jama.CustomerId.CustomerB">
          <Name>jama.CustomerId.CustomerB (name)</Name>
          <Description>jama.CustomerId.CustomerB (description)</Description>
        </DisplayString>
        <DisplayString ElementID="jama.CustomerId.CustomerC">
          <Name>jama.CustomerId.CustomerC (name)</Name>
          <Description>jama.CustomerId.CustomerC (description)</Description>
        </DisplayString>
        <DisplayString ElementID="jama.CustomerId.CustomerD">
          <Name>jama.CustomerId.CustomerD (name)</Name>
          <Description>jama.CustomerId.CustomerD (description)</Description>
        </DisplayString>
      </DisplayStrings>
    </LanguagePack>
  </LanguagePacks>
</ManagementPack>
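The per-customer parts of the file above (the ClassType lines, the DiscoveryClass lines, the case lines in the script and the DisplayString blocks) all follow a fixed pattern, so they lend themselves to being generated from a list of customer names. A minimal sketch of such a generator, assuming Python as the scripting language and with all function names invented for this example (it is not the script we actually use):

```python
def class_id(customer):
    """Class ID used throughout the management pack for one customer."""
    return "jama.CustomerId." + customer


def class_type_line(customer):
    """<ClassType> element for the TypeDefinitions section."""
    return (f'<ClassType ID="{class_id(customer)}" Accessibility="Public" '
            'Abstract="false" Base="Windows!Microsoft.Windows.ComputerRole" '
            'Hosted="true" Singleton="false"/>')


def discovery_class_line(customer):
    """<DiscoveryClass> element for the DiscoveryTypes section."""
    return f'<DiscoveryClass TypeID="{class_id(customer)}" />'


def case_line(customer):
    """VBScript case line; the element name is a literal, so it can be resolved."""
    element = f"$MPElement[Name='{class_id(customer)}']$"
    return (f'case "{customer.upper()}" set g_objClass = '
            f'g_objDiscoveryData.CreateClassInstance("{element}")')


def display_string_block(customer):
    """<DisplayString> block for the language pack."""
    cid = class_id(customer)
    return "\n".join([
        f'<DisplayString ElementID="{cid}">',
        f"  <Name>{cid} (name)</Name>",
        f"  <Description>{cid} (description)</Description>",
        "</DisplayString>",
    ])


if __name__ == "__main__":
    # Print the four fragments for every customer in the list.
    for customer in ["CustomerA", "CustomerB", "CustomerC", "CustomerD"]:
        print(class_type_line(customer))
        print(discovery_class_line(customer))
        print(case_line(customer))
        print(display_string_block(customer))
```

Adding a customer then comes down to adding one name to the list and pasting (or templating) the generated fragments into the matching sections of the XML file.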

This must be adjusted for every new customer we define in our environment, and yes, this is fully scripted 🙂 The last thing to do is to seal the management pack, so that other management packs can reference it. This is done with MPSeal.exe from the installation CD:

 

MPSeal.exe jamaClass.xml /I .\mp /keyfile privatekey_JAMA.snk /company "JAMA" /outdir .\sealed

Note the “/I .\mp” option: I created a subdirectory called “mp” that contains all referenced management packs. MPSeal.exe needs access to the referenced management packs before it can seal yours. The file “privatekey_JAMA.snk” contains the private key we use for sealing all our management packs. The sealed result is stored in the “sealed” subdirectory.
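If sealing is part of the same scripted process, the command line above can be assembled in code. The sketch below assumes Python and only uses the options shown in the command above. The function names `mpseal_command` and `seal` are made up for this example, and MPSeal.exe is assumed to be on the path.

```python
import subprocess


def mpseal_command(mp_file, reference_dir=r".\mp",
                   key_file="privatekey_JAMA.snk",
                   company="JAMA", out_dir=r".\sealed"):
    """Build the MPSeal.exe argument list for one unsealed management pack."""
    return ["MPSeal.exe", mp_file,
            "/I", reference_dir,      # directory with the referenced packs
            "/keyfile", key_file,     # private key used for sealing
            "/company", company,
            "/outdir", out_dir]       # where the sealed pack is written


def seal(mp_file):
    """Run MPSeal.exe; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(mpseal_command(mp_file), check=True)


if __name__ == "__main__":
    # Show the command that would be run for jamaClass.xml.
    print(" ".join(mpseal_command("jamaClass.xml")))
```

Passing the arguments as a list avoids quoting problems when a path or the company name contains spaces.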

 

So now I can finally select a target that exists only at a single customer site:

[image: selecting the customer class as the rule target]

Disabling the rule by default lets you enable it with an override for one or just a few servers within the targeted customer (CustomerC in this example), without distributing the rule(s) to other customers.
