JAMA00

SCOM 2007 R2

HP Proliant MP discovery is missing Networks, Storage and Management Processor

Posted by MarcKlaver on April 25, 2013

image

Recognize the above situation?

When you have installed the HP Proliant Management Pack and use the SNMP-based agents, some discoveries do not seem to work. To solve this, you must also configure the SNMP settings on your server. For the SNMP-based agents to read all required information, an SNMP community string must be created.

You should configure two things within the SNMP Service Properties:

clip_image002

Under “Service”, enable all options.

 

clip_image002[5]

  • Under “Security”, add a read-only community string and make sure that SNMP packets are accepted from the “localhost” host. Note that the “Send authentication trap” option is optional.
  • Restart your SNMP Service
  • Restart the SCOM service

And after a few minutes the result should look like this:

image

When the HP agents start, they read the SNMP settings and use them to access the SNMP-based information from the agent :)

Posted in management packs

Configure Custom Service Monitoring, an alternative method.

Posted by rob1974 on June 12, 2012

We are running SCOM in a “service provider” solution. When you do this, you want to have standardized environments and tune everything in a generic manner. However, our main problem is that every customer is different and has different requirements. Simple changes like a disk threshold or setting up service monitoring for specific services on specific computers can be quite time-consuming. They can also have a big impact on the performance of the environment (a simple override to one Windows 2008 logical disk gets distributed to all Windows 2008 servers!).

Of course, we could have authors and advanced operators do this job, but there’s no way of auditing changes or forcing advanced operators to use certain override MPs only. We’re convinced this will lead to performance issues and dependency problems because overrides will be saved everywhere. So we believe this can only be done by a small number of people who really understand what they are doing.

To counter this, we use an alternative way of setting common configurations and overrides. We create a script that reads a registry key for a threshold value to use or, as I will show in this blog, a configuration value. An example of a threshold value: we’ve created a disk free space script that uses a default value (both a % and an MB value), but when an override is present in the registry it will use that value instead. The registry override in turn can be set with a SCOM task with just normal operator rights (you can restrict task access for operators in their SCOM profile so not every operator can do this). The change is instant and has no configuration impact on the SCOM servers.
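As a rough illustration of that lookup order (a Python sketch, not the script we actually run; the function and parameter names are made up for this example, the registry read is replaced by a plain argument, and falling back to the default on an unreadable value is an assumption):

```python
def effective_threshold(default_value, registry_override=None):
    """Return the override when one is present and numeric, else the default.

    registry_override stands in for whatever was read from the registry;
    None means the value does not exist.
    """
    if registry_override is None:
        return default_value
    try:
        return float(registry_override)
    except (TypeError, ValueError):
        # Assumption: a malformed override should not break monitoring,
        # so the default is used instead.
        return default_value


if __name__ == "__main__":
    print(effective_threshold(10))        # no override present
    print(effective_threshold(10, "5"))   # override present
```

Because the value is resolved on every run of the script, a change to the registry takes effect on the next run without anything being redistributed.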

Now to the service monitoring example of this principle. What we’ve done is create a class discovery that checks for a certain registry key being present. If it is present, it will target 1 monitor and 1 rule to that class. The monitor runs a script every 5 minutes and checks the registry key for service names. Every service name in this registry key will be checked for a running state. If one or more services aren’t in a running state, the monitor will become critical and an alert will be generated. When all services are in a running state again, the monitor will reset to healthy and close the alert.
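The decision the monitor script makes can be sketched as follows. This is a simplified Python model, not the management pack's script: the list of names stands in for the registry values, the dictionary stands in for the real service status query, and treating an unknown service as "not running" is an assumption.

```python
def evaluate_services(monitored, states):
    """Return (health, failed) for the monitored service names.

    monitored: service names as read from the registry key.
    states:    mapping of service name -> state string, e.g. "Running".
    """
    failed = [
        name for name in monitored
        # Assumption: a service that cannot be found counts as failed.
        if states.get(name, "").lower() != "running"
    ]
    health = "Critical" if failed else "Healthy"
    return health, failed


if __name__ == "__main__":
    current = {"Spooler": "Running", "W32Time": "Stopped"}
    print(evaluate_services(["Spooler", "W32Time"], current))
```

The list of failed services is what would end up in the state change context, so the operator can see which service caused the critical state.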

By running a task from the Windows Computer view, you can set up the monitoring for the first time:

image

Overriding the task with the service name (not the display name) will add the service to the registry.

image 

When the discovery has run, the found instance will be visible in the jama Custom Service Monitoring view and more tasks will be available. When it really is the first service, it might take up to 24 hours before the instance is found, as we’ve set the discovery to a daily interval. But you can always restart the System Center Management service to speed up the discovery.

The new tasks are:

- Set monitored service. Basically the same task as the one available in the computer view; just the target is different. It can add additional services to monitor without any impact on the SCOM backend, and the service will be monitored instantly as the data source checks the registry on each run.

- List monitored service. Reads the registry and lists all values.

- Remove monitored service. Removes a service from the registry key and deletes the key if the service was the last value. When the key is deleted, the class discovery removes the instance on the next discovery run. Overriding the task with “removeall” will also delete the key.

- The “list all services on the computer” task doesn’t have real value for this management pack; it was just added for checking a service name from the SCOM console.
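The behaviour of the remove task can be modelled like this. It is a Python sketch under assumptions: the set stands in for the values stored in the registry key, `None` stands in for "key deleted", and the case-insensitive comparison is my own choice for the example.

```python
def remove_monitored_service(services, name):
    """Return the remaining services, or None when the key should be deleted.

    services: set of service names currently stored in the key.
    name:     the service to remove, or "removeall" to drop everything.
    """
    if name.lower() == "removeall":
        return None
    remaining = {s for s in services if s.lower() != name.lower()}
    # Removing the last value deletes the key itself, which in turn lets
    # the class discovery drop the instance on its next run.
    return remaining if remaining else None


if __name__ == "__main__":
    print(remove_monitored_service({"Spooler", "W32Time"}, "Spooler"))
    print(remove_monitored_service({"Spooler"}, "Spooler"))
```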

See below for some screenshots of the tasks and health explorer.

image

Task output of “List monitored services”:

image

The health explorer has additional knowledge and shows which services have failed through the state change context:

image

 image

 

 

So what’s the benefit of all this?

- The SCOM admins/authors don’t have to make a central change, except for importing the management pack.

- The support organization doesn’t have to log a change and wait for someone to implement it; they can make the change themselves with just SCOM operator rights (or with rights to edit a registry key on a computer locally), and it takes effect pretty much instantly.

- From a performance view the first service that is added will have some impact on the discovery (config churn), but additional services don’t have any impact.

However, there’s still no auditing. We’ve made this task available to what we call “key users”, and we have 1 or 2 key users per customer. If you set up object access auditing on the registry key, that could give an idea of who changed it.

The performance benefit for this monitor is probably minimal. However, using this principle for disk thresholds gives a huge benefit, as that’s a monitor that is always active on all systems, and overriding values through a registry key on the local system removes all override distribution (I might be posting that MP as well).

If you want to check out this MP, you can download the management pack jamaCustomService here. I’ve uploaded it unsealed, but I recommend sealing it if you really want to put it into production.

Posted in general, management packs

Website monitoring, another gotcha!

Posted by rob1974 on February 13, 2012

Recently I had an issue with website monitoring in a SCOM demo environment. I had configured a website test (through the template) every 2 minutes. I had created a DNS zone hosting the FQDN for this website. Then I paused the DNS zone and waited for the HTTP test to fail. I expected the HTTP test to fail and have an alert in SCOM within 2 minutes. However, after 10 minutes I had nothing… Not really what you want when you do a live demo.

So what was happening here? Some of you might already have guessed: it’s DNS caching on the client running the HTTP test. So how do you stop this? Well, there are 3 things you can do.

 

1. By default, the DNS Client service is started on a Windows machine. Simply stop the DNS Client service and the caching will stop (DNS queries will still resolve).

2. Increase the interval of the HTTP test (that is, run it less often). Anything more than 15 minutes will do…

3. Decrease the default cache time for the queries to something less than your test interval.

 

As I was giving a demo, option 2 was not an option for me. But I would seriously think about this when I do website tests in production. Obviously, service level agreements should play a part in this, but a delay of at most 30 minutes on an SLA of 8 hours would definitely be acceptable for me.

Option 1 was no option either. Besides caching, the DNS Client service also registers domain-joined hosts in DNS, so it is not something I would recommend. Caching also helps performance, so I’m not sure I would ever want to disable it.

So I was left with option 3, but how to do this? In HKLM\SYSTEM\CurrentControlSet\services\Dnscache\parameters, create the DWORD MaxCacheTtl (and MaxNegativeCacheTtl, if negative caching is not already set to 0 with “NegativeCacheTime”) and give it a value below the interval of the website test. For a 2-minute test I used a value of 90 (seconds).

dnssettings

Normally I think option 2 will be the best to go for. There is no use in running tests more often than you need to. However, if you have a dedicated host for running website tests and you run those tests more often than every 15 minutes, consider reducing the max. cache time of the DNS client.
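To get a feel for why the cache time matters, here is a deliberately simplified model in Python. It is a back-of-the-envelope sketch, not a description of how the DNS client works internally; the function name is made up, and the 900-second record lifetime in the example is an assumption based on the 15 minutes mentioned above.

```python
def worst_case_detection_delay(test_interval_s, record_ttl_s, max_cache_ttl_s=None):
    """Rough upper bound, in seconds, before a DNS outage shows up in the test.

    Worst case: the name was cached just before the outage, so it stays
    resolvable for the whole cache lifetime, and the next test run only
    happens one full interval after the entry expires.
    """
    if max_cache_ttl_s is None:
        cache_s = record_ttl_s
    else:
        # A MaxCacheTtl-style cap limits how long an entry may be cached.
        cache_s = min(record_ttl_s, max_cache_ttl_s)
    return cache_s + test_interval_s


if __name__ == "__main__":
    # 2-minute test, assumed 15-minute cached record, no cap.
    print(worst_case_detection_delay(120, 900))
    # Same test with the cache capped at 90 seconds.
    print(worst_case_detection_delay(120, 900, max_cache_ttl_s=90))
```

Under these assumptions the cap brings the worst case down from 1020 seconds to 210 seconds, which is the difference between a demo that works and one that doesn't.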

Posted in general, troubleshooting

Stop storing data (partial or temporary) into the data warehouse database

Posted by MarcKlaver on January 19, 2012

To facilitate the use of the data warehouse database, there are 3 default overrides for an environment that has its data warehouse enabled.

image

If you need to (partially or temporarily) stop storage to the data warehouse, you can override the default overrides (again) to set the Drop Items parameter to true. After propagation to the management server, this will cause the items to be dropped (and not stored in the data warehouse database).

Note that while this is possible, I assume it is an unsupported configuration :)

Posted in Uncategorized

Agent proxy

Posted by MarcKlaver on July 1, 2011

Until now we have set the agent proxy for an agent only when required, and we used a script to do this. See this link for more information. But now Microsoft has come up with something new in the Exchange 2010 management pack. It will not discover anything until you have turned the agent proxy on for the Exchange servers (so we can’t do this afterwards anymore). This meant we needed to make a choice:

  1. Manually enable the proxy agent setting for all Exchange 2010 servers (now and in the future). Which means an Exchange 2010 server will not be discovered until we actually do.
  2. Enable the proxy agent for all agents.

At the moment, around 60 percent of the agents already have the proxy functionality enabled. So what’s the advantage of not enabling this setting by default for all agents? Looking at security, you already have to enable this setting for all security-sensitive servers (AD, Exchange, ISA, Citrix, etc.). And since we have no knowledge of when an Exchange server is connected to our environment, we decided to enable it for all agents.

This is the script to do it:

$rootMS="RMS.TEST.LOCAL"

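#-------------------------------------------------------------------------------
# Add operations manager snapin and connect to the root management server.
#-------------------------------------------------------------------------------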
add-pssnapin "Microsoft.EnterpriseManagement.OperationsManager.Client";
set-location "OperationsManagerMonitoring::";
new-managementGroupConnection -ConnectionString:$rootMS;

## set proxy enabled for all agents where it is disabled
$NoProxy = get-agent | where {$_.ProxyingEnabled -match "False"}
$NoProxy | foreach {$_.ProxyingEnabled = $true}
$NoProxy | foreach {$_.ApplyChanges()}

See also Kevin Holman’s blog

Posted in Agent Settings

State changes from disabled monitors

Posted by MarcKlaver on May 18, 2011

While we try to reduce the number of state changes, we stumbled into a bug in the agent software. We were investigating our top 50 of state change monitors, using Kevin Holman’s queries.

When we looked at the results, we saw some strange things. First, the number of state changes was equal for a lot of monitors.

An example of this:

image

Having exactly the same number of state changes for two different monitors? When we looked into these monitors, they were disabled by default in the management pack where the monitor was defined. Digging further showed that there was no override present in the system that would enable these monitors.

We now had a problem, because 48 of our top 50 were monitors that were disabled by default, without any override to enable the monitor.

This turns out to be a bug in the agent software. The moment the agent (re-)initializes, either by starting or by coming out of maintenance mode, it will detect the monitor and initialize it. When it realizes it should disable the monitor (by default), it will send a state change for the disabled monitor.

This is also the case for monitors that are default enabled, but are disabled using a custom override. Unfortunately a fix for this issue is not as easy as it sounds (according to Microsoft support) and a fix will not be realized in the R2 version of SCOM.

So if you have tuned a lot (like us) and find these monitors, just skip them. You can’t fix this one :)

Below is a list of monitors we found that were default disabled during this investigation and that we now exclude from the query.

Microsoft.Windows.Server.2003.LogicalDisk.AvgDiskSecPerWrite
Microsoft.Windows.Server.2003.LogicalDisk.AvgDiskSecPerRead
Microsoft.Windows.Server.2003.NetworkAdapter.NetworkAdapterConnectionHealth
Microsoft.Windows.Server.2008.LogicalDisk.AvgDiskSecPerWrite
Microsoft.Windows.Server.2008.LogicalDisk.AvgDiskSecPerRead
Microsoft.SystemCenter.Ping
Microsoft.SystemCenter.AgentManagement.EndToEndEventMonitorError
Microsoft.SystemCenter.HealthService.ConfigurationStateWarningLevel
Microsoft.SystemCenter.HealthService.ConfigurationProcessing
Microsoft.SystemCenter.HealthService.Security.DataIntegrityCheck
Microsoft.SystemCenter.HealthService.ConfigurationStateCriticalLevel
Microsoft.SystemCenter.AgentManagement.EndToEndEventMonitorWarning
SMSv4_dependent_service_running__Background_Intelligent_Transfer_Service_16_Rule.AdvancedAlertCriteriaMonitor
SMS_2003_dependent_service_running__Background_Intelligent_Transfer_Service_13_Rule.AdvancedAlertCriteriaMonitor
Microsoft.SQLServer.2005.DBFile.DiskFreeSpace
Microsoft.SQLServer.2005.Database.Configuration.RecoveryModel
Microsoft.SQLServer.2005.Database.TransactionLogSizeMegabytesMonitor
Microsoft.SQLServer.2005.Database.Configuration.TrustWorthy
Microsoft.SQLServer.2005.Database.Configuration.AutoUpdateStatAsync
Microsoft.SQLServer.2005.Database.Configuration.TornPageDetection
Microsoft.SQLServer.2005.Database.DBSizePercentMonitor
Microsoft.SQLServer.2005.Database.Configuration.AutoCreateStat
Microsoft.SQLServer.2005.Database.Configuration.DBChaining
Microsoft.SQLServer.2005.Database.TransactionLogSizePercentMonitor
Microsoft.SQLServer.2005.Database.DBSizeMegabytesMonitor
Microsoft.SQLServer.2005.Database.Configuration.AutoUpdateSet
Microsoft.SQLServer.2005.Database.Configuration.AutoShrink
Microsoft.SQLServer.2005.Database.DBSizePercentageChangeMonitor
Microsoft.SQLServer.2005.Database.Configuration.AutoClose
Microsoft.Windows.Server.2003.TerminalServerRole.InactiveSessions
Microsoft.Windows.Server.2003.TerminalServerRole.ActiveSessions
Microsoft.Windows.Server.2003.TerminalServerRole.CPUPerSession
Microsoft.Windows.InternetInformationServices.2003.WebSite.WebServiceCurrentISAPIExtensionRequests.Monitor
Microsoft.Windows.InternetInformationServices.2003.WebSite.WebServiceCurrentConnections.Monitor
Microsoft.Windows.InternetInformationServices.2003.WebSite.WebServiceBytesTotalSec.Monitor
Microsoft.Windows.InternetInformationServices.2003.WebSite.WebServiceBytesSentSec.Monitor
Microsoft.Windows.InternetInformationServices.2003.WebSite.WebServiceBytesReceivedSec.Monitor
Microsoft.Windows.InternetInformationServices.2003.WebSite.WebServiceISAPIExtensionRequestsSec.Monitor
Microsoft.Windows.InternetInformationServices.2003.WebServer.WebServiceISAPIExtensionRequestsSec.Monitor
Microsoft.Windows.InternetInformationServices.2003.WebServer.ASP.NETRequestsQueued.Monitor
Microsoft.Windows.InternetInformationServices.2003.WebServer.ASP.NETRequestsCurrent.Monitor
Microsoft.Windows.InternetInformationServices.2003.WebServer.ASP.NETWorkerProcessRestarts.Monitor
Microsoft.Windows.InternetInformationServices.2003.WebServer.WebServiceBytesReceivedSec.Monitor

 

If you want to exclude any (default) disabled monitor that you found, exclude it in the query as shown below:

use OperationsManager
go

select distinct top 50 count(sce.StateId) as NumStateChanges, m.MonitorName, mt.typename AS TargetClass
from StateChangeEvent sce with (nolock)
join state s with (nolock) on sce.StateId = s.StateId
join monitor m with (nolock) on s.MonitorId = m.MonitorId
join managedtype mt with (nolock) on m.TargetManagedEntityType = mt.ManagedTypeId
where m.IsUnitMonitor = 1
and m.MonitorName not in (
    'Microsoft.Windows.Server.2003.LogicalDisk.AvgDiskSecPerWrite',
    'Microsoft.Windows.Server.2003.LogicalDisk.AvgDiskSecPerRead'
)
group by m.MonitorName,mt.typename
order by NumStateChanges desc
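With a list as long as the one above, typing the exclusion clause by hand gets tedious. If you run the query from a script, the clause can be generated; the following is a Python sketch with a hypothetical helper name. It assumes a database driver that uses `?` as its parameter placeholder (pyodbc is one such driver) and only builds the text and the parameter list; it does not connect to anything.

```python
def build_exclusion_clause(monitor_names):
    """Build a parameterized 'not in' clause for the given monitor names.

    Returns (sql_fragment, parameters). An empty list yields an empty
    fragment, because 'not in ()' is not valid SQL.
    """
    names = list(monitor_names)
    if not names:
        return "", []
    placeholders = ", ".join("?" for _ in names)
    return f"and m.MonitorName not in ({placeholders})", names


if __name__ == "__main__":
    clause, params = build_exclusion_clause([
        "Microsoft.Windows.Server.2003.LogicalDisk.AvgDiskSecPerWrite",
        "Microsoft.Windows.Server.2003.LogicalDisk.AvgDiskSecPerRead",
    ])
    print(clause)
    print(params)
```

Passing the names as parameters instead of pasting them into the query text also avoids quoting mistakes.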

Posted in Agent

What property is discovered?

Posted by MarcKlaver on May 17, 2011

I think we have all used these Kevin Holman queries to handle the config churn in our environment. But what if you still have config churn issues, but don’t see any issues with these queries?

We still had config churn, but running the queries from Kevin Holman did not point us to the reason for that config churn. But we got some information from a support case.

First, we can retrieve exactly what has changed using this query (run against the data warehouse database):

use OperationsManagerDW
go

select * from dbo.ManagedEntityProperty
where DWCreatedDateTime > dateadd(hh,-24,getutcdate())
order by DWCreatedDateTime, ManagedEntityRowId

This will result in output similar to this:

image

It will give you all properties that changed within the last 24 hours and what exactly changed. Now when you “click” on a PropertyXML or DeltaXml value, a new window will open showing you exactly which properties there are and which have changed.

At this point we don’t have any idea where to find this in a management pack, but we will get there. From the above output, take the ManagedEntityRowId and place it in the next query:

use OperationsManagerDW
go

select * from ManagedEntity
where ManagedEntityRowId = 121403

This will result in output similar to this:

image

The ManagedEntityGuid is what we need here. Place it in the next query (which will run against the operations database):

use OperationsManager
go

select * from BaseManagedEntity
where BaseManagedEntityId = '3B9F6E60-02B5-8369-859F-8047093CE33F'

The result is:

image

The next thing we need is the BaseManagedTypeId. Place it in the next query:

use OperationsManager
go
select * from ManagedType
where ManagedTypeId = '10C1C7F7-BA0F-5F9B-C74A-79A891170934'

Which results in:

image

Here you can find the TypeName that is discovered (Microsoft.SQLServer.Database in this case). Use the ManagementPackId to get the actual management pack:

Use OperationsManager
Go
Select * from ManagementPack where ManagementPackId = 'BCD6DCCF-C46C-A1F5-3C8D-BB4E99E2A6A3'

And the final result will be:

image

So we now know that the property is of type “Microsoft.SQLServer.Database” and that it is discovered in the “Microsoft.SQLServer.Library” management pack (aka “Microsoft SQL Server Core Library”).

Note that if you are only interested in the actual management pack name, you can also use this query (which uses the ManagedEntityGuid from the first query against the operations database):

use OperationsManager
Select * from ManagementPack
where ManagementPackId = (
    select ManagementPackId from ManagedType
    where ManagedTypeId = (select BaseManagedTypeId from BaseManagedEntity
        where BaseManagedEntityId = '3B9F6E60-02B5-8369-859F-8047093CE33F'
    )
)

This will result in the same output as the last screenshot (but now you don’t know the type for the data). When you have this information, you can look up the corresponding discoveries so you can fine tune them if required.

Posted in config churn

Updating manually installed agents from the console

Posted by MarcKlaver on May 9, 2011

We have created a management pack that will update manually installed agents from a task in the operations console. However, before you can use this management pack you should have implemented the JDF framework for file distribution. This management pack depends on the framework, so if you are not able to set up the framework, this management pack is useless to you.

 

Below is the final XML file for your management pack called jamaAgent.Update.xml:

<?xml version="1.0" encoding="utf-8"?><ManagementPack ContentReadable="true" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <Manifest>
    <Identity>
      <ID>jamaAgent.Update</ID>
      <Version>0.4.0.0</Version>
    </Identity>
    <Name>jamaAgent.Update</Name>
    <References>
      <Reference Alias="MSWL">
        <ID>Microsoft.Windows.Library</ID>
        <Version>6.1.7221.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="MSSCL">
        <ID>Microsoft.SystemCenter.Library</ID>
        <Version>6.1.7221.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
    </References>
  </Manifest>
  <Monitoring>
    <Tasks>
      <Task ID="jamaAgent.Update.ConsoleTask.AgentUpdate" Accessibility="Public" Enabled="true" Target="MSSCL!Microsoft.SystemCenter.HealthService" Timeout="300" Remotable="true">
        <Category>Custom</Category>
        <WriteAction ID="PA" TypeID="MSWL!Microsoft.Windows.ScriptWriteAction">
          <ScriptName>jamaTaskUpdateAgent.vbs</ScriptName>
          <Arguments />
          <ScriptBody><![CDATA[
'-------------------------------------------------------------------------------
' File   : jamaTaskUpdateAgent.vbs
' Use    : Script for the jama Agent Update task.
' SVN    : Revision: 136
'          Date: 2011-04-12 09:14:33 +0200 (Tue, 12 Apr 2011)
'
' Note(s): 1) ---
'-------------------------------------------------------------------------------
option explicit
on error goto 0
setlocale("en-us")

const INT_RETRIES           = 5
const INT_DEFAULT_WAIT_TIME = 120
const STR_DOWNLOAD_PATH     = "/files/opsmgr/updates/agent/"

jamaMain()

'-------------------------------------------------------------------------------
' jamaMain
'
' Use    : Main entry for this script.
' Input  : ---
' Returns: ---
' Note(s): 1) ---
'-------------------------------------------------------------------------------
function jamaMain()
    dim jdf
    dim iMinutesToWait
    dim bForceUpdate

    if(not jdfLoadFramework(jdf, null, null, null, null, null)) then
        wscript.echo "The Jama Distribution Framework could not be loaded. No update is performed."
    else
        wscript.echo "Jama Distribution Framework loaded" & vbNewLine & vbNewLine & jdf.GetInfoString(wscript.fullname)
        '---------------------------------------------------------------------------
        ' Your code here!
        '---------------------------------------------------------------------------
        if(jamaInitialize(iMinutesToWait, bForceUpdate)) then
            if((bForceUpdate = false) and jamaRestartRequired()) then
                wscript.echo "Update could not be scheduled due to dependent services." & vbNewLine & _
                             "Use the force option to override this behaviour and force the update to run."
            else
                if(jamaScheduleUpdateIn(jdf, iMinutesToWait) = true) then
                    wscript.echo "Update is scheduled to run in " & iMinutesToWait & " minutes."
                else
                    wscript.echo "Update could not be scheduled. No update will be performed!"
                end if
            end if
        else
            wscript.echo "Initialization failed. No update will be performed!"
        end if
    end if
end function

'-------------------------------------------------------------------------------
' jamaInitialize
'
' Use    : Initialize the script.
' Input  : iMinutesToWait - integer - Number of minutes to wait before
'                                     activating the schedule (output)
'          bForceUpdate   - boolean - True when the "force" argument is "true"
'                                     (output)
' Returns: Boolean - TRUE  - No errors detected.
'                    FALSE - An error was detected.
' Note(s): 1) ---
'-------------------------------------------------------------------------------
function jamaInitialize(byref iMinutesToWait, byref bForceUpdate)
    dim colNamedArguments
    dim bResult
    dim bFound

    bResult = false
    bFound  = false
    bForceUpdate = false
    set colNamedArguments = WScript.Arguments.Named

    if(colNamedArguments.Exists("minutes")) then
        iMinutesToWait = cint(lcase(colNamedArguments.Item("minutes")))
        bFound         = true
        bResult        = true
    end if

    if(colNamedArguments.Exists("force")) then
        if(lcase(colNamedArguments.Item("force")) = "true") then
            bForceUpdate = true
        end if
    end if

    if(not bFound) then
        iMinutesToWait = INT_DEFAULT_WAIT_TIME
        bResult        = true
    end if

    jamaInitialize = bResult
end function

'-------------------------------------------------------------------------------
' jamaRestartRequired
'
' Use    : Checks if an update of the agent will trigger a restart or reboot.
' Input  : ---
' Returns: Boolean - TRUE  - A restart is required.
'                    FALSE - A restart is not required.
' Note(s): 1) If the dependency could not be determined, this function will
'             return true.
'          2) For windows 2000 this function will always return true.
'          3) When true on a 2003 server, a reboot is required but not forced.
'          4) When true on 2008 or higher, the dependent services could be
'             restarted by the installer service.
'
'-------------------------------------------------------------------------------
function jamaRestartRequired()
    dim strCmd
    dim iResult
    dim bResult
    dim fso
    dim fh
    dim strTempFile
    dim strLine

    strTempFile = "jamaTaskUpdateAgent.$$$"
    bResult = true
    strCmd  = "tasklist /fo csv /m EventCommon.dll /FI ""imagename ne HealthService.exe"" /FI ""imagename ne MonitoringHost.exe"" > " & strTempFile
    jamaRun(strCmd)

    set fso = CreateObject("scripting.filesystemobject")
    if(fso.FileExists(strTempFile)) then
        on error resume next
            err.clear
            set fh = fso.OpenTextFile(strTempFile, 1)                           ' Open for reading.
            if(err.number = 0) then
                do while(not fh.AtEndOfStream)
                    strLine = lcase(fh.ReadLine)
                    if(instr(strLine, "info: no tasks") > 0) then               ' No dependencies found.
                        bResult = false
                    end if
                loop

                set fh = fso.GetFile(strTempFile)
                fh.delete(true)                                                 ' delete file.
            end if
        on error goto 0
    end if

    jamaRestartRequired = bResult
end function

function jamaRun(byval strCmd)
    dim objShell
    dim iResult

    iResult      = 0
    set objShell = wscript.createObject("wscript.shell")
    iResult      = objShell.run("cmd /c " & strCmd, 0, true) ' hidden and wait for result
    set objShell = nothing
    jamaRun      = iResult
end function

'-------------------------------------------------------------------------------
' jamaScheduleUpdateIn
'
' Use    : Retrieve and schedule the required update.
' Input  : jdf            - object  - JDF object
'          iMinutesToWait - integer - Minutes to wait for the schedule.
' Returns: Boolean - TRUE  - No errors.
'                    FALSE - An error was detected.
' Note(s): 1) ---
'-------------------------------------------------------------------------------
function jamaScheduleUpdateIn(byref jdf, iMinutesToWait)
    dim strSourceFile
    dim strTargetFile
    dim strTargetDir
    dim iCount
    dim iResult
    dim bResult
    dim strCmd
    dim fso

    iCount = 0
    bResult = false
    set fso = CreateObject("scripting.filesystemobject")
    strSourceFile = "jamaAgentUpdate" & jdf.Platform & ".exe"
    strTargetDir  = jdf.ExpandString("$JDF_IN_DIR$" & "jamaAgentUpdate\")
    if(jdf.CreateDirectory(strTargetDir)) then
        strTargetFile = strTargetDir & strSourceFile
        bResult = fso.FileExists(strTargetFile)
        if(not bResult) then
            do
                iResult = jdf.GetFile("jdfBaseDistribution", STR_DOWNLOAD_PATH & strSourceFile, strTargetFile)
                iCount  = iCount + 1
            loop while((iCount < INT_RETRIES) and (iResult <> 0))
        end if
        if(iResult = 0) then
            bResult = fso.FileExists(strTargetFile)
            if(bResult) then
                strCmd  = strTargetFile & " /verysilent"
                bResult = jdf.ScheduleTaskIn(strCmd, iMinutesToWait)
            end if
        end if
    end if
    jamaScheduleUpdateIn = bResult
end function

'-------------------------------------------------------------------------------
' From template.vbs
'-------------------------------------------------------------------------------
'-------------------------------------------------------------------------------
' jdfLoadFramework
'
' Use    : Load and initialize the JDF framework.
' Input  : objJDF               - object - Object passed back with framework.
'          bInitialize          - bool   - true or null for initializing.
'          strVersion           - string - Minimum framework version required.
'          bForceVersion        - bool   - true, false or null.
'          strCustomerId        - string - Customer id or null.
'          strDefaultUploadPath - string - Default upload path.
' Returns: bool - TRUE  - Initialization of the framework succeeded.
'                 FALSE - Initialization of the framework failed.
' Note(s): 1) strVersion = null
'                 Any version of the framework will be accepted. The value of
'                 bForceVersion will be ignored.
'
'          2) strVersion = "x.x.x"
'                 Integer values, separated by dots, e.g. : "3.5.11"
'
'          3) bForceVersion = null/false
'                 The given version is a minimum version required for the
'                 framework.
'
'          4) bForceVersion = true
'                 The framework version must exactly match with the given
'                 version number.
'
'          5) strCustomerId = null
'                 The customer id will be retrieved from the registry. See
'                 the documentation for more information.
'
'          6) strDefaultUploadPath = null
'                 The default upload path will be set to the root: "/". This
'                 argument can hold JDF variables (both default and custom
'                 provided in the jdf.jdp file). See the documentation for more
'                 information.
'
'          7) Normal use is:
'                 jdfLoadFramework(jdf, null, null, null, null, null)
'                 where jdf is the object you pass to the function.
'-------------------------------------------------------------------------------
function jdfLoadFramework(byref objJDF, byval bInitialize, byval strVersion, byval bForceVersion, byval strCustomerId, byval strDefaultUploadPath)
    dim bResult, fso, strFrameworkFile, objReg, strResult

    bResult   = false
    strResult = ""
    on error resume next
        err.clear
        set objReg=GetObject("winmgmts:{impersonationLevel=impersonate}!\\.\root\default:StdRegProv")
        if(err.number = 0) then
            objReg.GetStringValue  &H80000002, "SOFTWARE\Company\jama\jdf", "path", strResult     ' If you use another registry location, change this value!
            if((err.number = 0) and (strResult <> "") and (not isnull(strResult))) then           ' We got something, so try it.
                strFrameworkFile = strResult & "jdf.wsc"                                          ' This now should be the full path to the framework file.
                set fso = CreateObject("Scripting.FileSystemObject")
                if(err.number = 0) then
                    if(fso.FileExists(strFrameworkFile)) then
                        set objJDF = GetObject("script:" & strFrameworkFile)                      ' File exist, try to load it.
                        if(err.number = 0) then                                                   ' Framework found and object loaded.
                            if(bInitialize = false) then                                          ' No initialization.
                                bResult = false
                            else                                                                  ' Initialize the framework for use.
                                bResult = objJDF.InitializeFramework(strVersion, bForceVersion, wscript.scriptname, strCustomerId, strDefaultUploadPath)
                            end if
                        end if
                    end if
                end if
            end if
        end if
    on error goto 0
    jdfLoadFramework = bResult
end function
]]></ScriptBody>
          <TimeoutSeconds>300</TimeoutSeconds>
        </WriteAction>
      </Task>
    </Tasks>
  </Monitoring>
  <LanguagePacks>
    <LanguagePack ID="ENU" IsDefault="false">
      <DisplayStrings>
        <DisplayString ElementID="jamaAgent.Update">
          <Name>jamaAgent.Update</Name>
        </DisplayString>
        <DisplayString ElementID="jamaAgent.Update.ConsoleTask.AgentUpdate">
          <Name>jama Agent Update</Name>
        </DisplayString>
      </DisplayStrings>
    </LanguagePack>
  </LanguagePacks>
</ManagementPack>
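The version arguments described in the script header (a minimum version by default, an exact match when bForceVersion is true) can be illustrated with a small Python sketch. This is not the framework's own code: the function names are made up for this example, and it assumes that no check is done at all when no version is given.

```python
def parse_version(text, width=4):
    """Turn a dotted version such as "3.5.11" into a tuple of integers.

    The tuple is padded with zeros so "3.5" and "3.5.0" compare as equal.
    """
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def version_accepted(framework_version, requested_version=None, force_exact=False):
    """Return True when the installed framework satisfies the request."""
    if requested_version is None:
        # Assumption: without a requested version there is nothing to check,
        # so force_exact is ignored.
        return True
    installed = parse_version(framework_version)
    requested = parse_version(requested_version)
    if force_exact:
        # The framework version must match exactly.
        return installed == requested
    # Otherwise the requested version is only a minimum.
    return installed >= requested
```

Comparing integer tuples instead of strings matters here: as text "3.5.11" sorts before "3.5.2", while as numbers 11 is larger than 2.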

Now the script in this management pack expects a few things.

 

const STR_DOWNLOAD_PATH = "/files/opsmgr/updates/agent/"

This constant specifies the path on the remote SCP server where the update files can be retrieved. If you use another location, change the value of this constant.

 

strSourceFile = "jamaAgentUpdate" & jdf.Platform & ".exe"

This value is used to generate the name of the update to retrieve. We have re-packaged the two update files required for the agent update into a single executable (see also this article from Kevin Holman for more information). You can download an archive with three packages (for all supported platforms) at this location. If you create your own, just make sure the final name of the update package corresponds with the final strSourceFile variable value:

jamaAgentUpdateIA64.exe –> For the Itanium platform
jamaAgentUpdateX64.exe –> For the x64 platform
jamaAgentUpdateX86.exe –> For the x86 platform
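To make the naming rule concrete, here is a minimal Python sketch of how the package name and remote path fit together. It is only an illustration: the function names are hypothetical, and it assumes that jdf.Platform returns exactly "IA64", "X64" or "X86", which is inferred from the three file names above.

```python
DOWNLOAD_PATH = "/files/opsmgr/updates/agent/"   # same value as STR_DOWNLOAD_PATH
SUPPORTED_PLATFORMS = ("IA64", "X64", "X86")     # assumed platform strings


def update_package_name(platform):
    """Build the package file name for one platform."""
    if platform not in SUPPORTED_PLATFORMS:
        raise ValueError("unsupported platform: " + platform)
    return "jamaAgentUpdate" + platform + ".exe"


def remote_package_path(platform, download_path=DOWNLOAD_PATH):
    """Full path of the package on the secure copy server."""
    return download_path + update_package_name(platform)
```

If you repackage the updates yourself, the name of each executable has to match what update_package_name would produce for that platform.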

 

What we actually do is very simple.

  • We detect the platform we are running on and construct the correct file name of the file to retrieve (this update package contains the required .msp files for the agent).
  • We retrieve the update package.
  • We schedule the update package to run in ## minutes and to run unattended.
  • We hope for the best :)

If something goes wrong with the installation, the failure can be found in the log files created by the update. There are two log files:

%TEMP%\jamaAgentUpdate.<platform>.log
%TEMP%\jamaAgentUpdate.ENU.<platform>.log

The package itself will also check again whether we are running on the correct platform and only start the update if the package matches the platform.

 

As you can see in the .XML file, we use the jdfBaseDistribution account for retrieving the file. So you don’t need to configure the SCP server for every customer you have. Just make the file available for the jdfBaseDistribution account. The task itself is targeted against the health service. Select the “Agent By Version” leaf in the console, to get an overview of all your agents and their version:

image

If you select a health service, the task pane will show you the new update task:

image

Note that the task is not version aware, so it will always be available and will always run. This means you can run an update over a currently installed agent (which works without issues). After selecting the task, the task pane will be shown and you can change the arguments for the script.

image

If you don’t change anything, the task will try to schedule the update to run in 120 minutes. If you require another time, use the /minutes:## argument. The task will then be scheduled ## minutes in the future. So if I want the update to run in 5 minutes:

image
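The handling of the /minutes:## argument can be sketched roughly as follows. This is a Python illustration under assumptions, not the task's real script: the function names are hypothetical, and falling back to the default when the value is not a number is my own choice.

```python
from datetime import datetime, timedelta

DEFAULT_DELAY_MINUTES = 120


def delay_from_arguments(arguments):
    """Return the delay in minutes from a list such as ["/minutes:5"]."""
    for argument in arguments:
        name, _, value = argument.partition(":")
        if name.lower() == "/minutes" and value.isdigit():
            return int(value)
    # No usable /minutes argument: keep the default of 120 minutes.
    return DEFAULT_DELAY_MINUTES


def scheduled_start(arguments, now):
    """Moment at which the update package would be started."""
    return now + timedelta(minutes=delay_from_arguments(arguments))
```

Calling scheduled_start(["/minutes:5"], datetime.now()) gives a start time five minutes from now. With an empty argument list it is two hours from now.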

 

 

The first thing the task does is check for dependencies (for more information about the dependencies, see this link). If no dependencies are found, the job is scheduled ## minutes in the future (or 120 if no override is given):

image

 

 

If a dependency is found, the task will not schedule the update:

image

But as the output shows, you can override the dependency detection by adding the /force:true option to the argument list. If it is set, the update will always run, even if dependencies are detected.
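The decision described above (schedule when there are no dependencies, or when /force:true is given) comes down to a few lines. The Python sketch below uses hypothetical names and takes the list of detected dependencies as input. How the real task detects them is not covered here.

```python
def force_requested(arguments):
    """True when the argument list contains /force:true."""
    for argument in arguments:
        name, _, value = argument.partition(":")
        if name.lower() == "/force":
            return value.lower() == "true"
    return False


def should_schedule_update(dependencies_found, arguments):
    """Decide whether the update job may be scheduled."""
    if not dependencies_found:
        # Nothing depends on the agent, so the update can be scheduled.
        return True
    # Dependencies were detected: only continue when explicitly forced.
    return force_requested(arguments)
```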

 

After the update is finished, the agent should reflect the new version information:

image

You can now update your manually installed agents using the operations console :)

Posted in Agent | Leave a Comment »

JDF: Jama Distribution Framework

Posted by MarcKlaver on March 28, 2011

This blog describes our framework for file distribution between a SCOM agent and our central SCOM environment. After this framework is installed, you will be able to transfer files between your SCOM agent and a central location. Before you continue, be warned: I cannot deliver a single file which will do all that I just promised. I will, however, guide you through the work that needs to be done to set up a file distribution framework for SCOM, but you will need to change scripts, compile the management pack and install and configure a secure copy server. Changes will be minimized as much as possible and most actions are automated with scripts. But first things first, let’s describe the general idea behind the framework...

 

Transferring files

The idea behind the framework is to write a set of scripts that would make it possible to transfer files to and from the SCOM agent. The main trigger for this wish was our installed base, which consists of manually installed agents only. Updating these agents manually is a long job for every CU update. If we could just schedule an update from the operations console and sit back, that would be great. But in order to do that, we need to be able to get the required update files to the SCOM agent. Keep in mind that we do not have a single customer but multiple customers, all with (or without) their own file distribution method. Being able to distribute and update from the operations console itself would be ideal.

But transferring files should also be secure. Our solution: Secure Copy, based on private / public key pairs for data transfers.

And since we have multiple customers, we need to make sure that Customer A is not able to see or change any data we retrieve from Customer B (and vice versa). Therefore we have to implement a secure way to copy files to and from the SCOM agent and separate data based on customers. Our solution: A Secure Copy implementation that is capable of creating virtual directory structures, based on a customer (secure copy) account.

We did not want to change any firewall rules (with the exception of adding a new management server). So we basically wanted to use communication from the SCOM agent on TCP port 5723 and only initiate communication from the SCOM agent. Our solution: A Secure Copy server configured to listen on port 5723 for secure copy clients.

Finally, it needed to be implemented on all our agents, not just a subset of agents. Our solution: A Microsoft Windows Scripting Component, written in VBScript, which will run on all Windows versions we support (Windows 2000 and higher).

So if we draw this as a high-level overview, it looks like this:

image

And yes it looks very simple :) So the first thing we need is a secure copy server.

 

Secure Copy server

The secure copy server can be any implementation, as long as it supports private/public key pairs and is able to generate virtual directories. We use the Bitvise WinSSHD service. Now before we start implementing this service, we need to make some rules.

  1. Only logins with private/public ssh key pairs are allowed.
  2. Only secure copy is allowed (no sftp or ssh).
  3. There will only be one general (shared) account and this account has read-only access to its own virtual directories. We call the shared account jdfBaseDistribution. This will prevent accounts from uploading data to a “shared” location. This is also the account that will be used to distribute the framework.
  4. Data will be compressed as much as possible, but the framework will not rely on the secure copy implementation and will “zip” the files to upload into a “package”.
  5. When a “package” is downloaded, it is expected to be a “zip” file, which can be extracted at the SCOM agent side by the framework.
  6. Files (not “packages”) that will be downloaded, must be compressed files to reduce network traffic (this is not forced by the framework).
  7. Customer accounts will have the naming convention: jdfCustomer_<CustomerName>
  8. The root (virtual) directory for every account will always be empty. All directories will be read-only, with the exception of the “upload” directory. But it will not be possible to remove files from the “upload” directory. The following structure will be used for every customer:

image 

Of course you can implement your own directories and rules for read/write, but these are the rules we use to explain the framework. If you start with these, the examples and code will work correctly.
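Rules 4 and 5 above (files are zipped into a “package” before upload, and a downloaded package is extracted on the agent side) can be illustrated with Python's standard zipfile module. This is only a sketch of the idea, with hypothetical function names. The framework itself is a VBScript component and does its own packaging.

```python
import os
import zipfile


def create_package(file_paths, package_path):
    """Zip the given files into one package, ready for upload."""
    with zipfile.ZipFile(package_path, "w", compression=zipfile.ZIP_DEFLATED) as package:
        for path in file_paths:
            # Store only the file name, not the full local path.
            package.write(path, arcname=os.path.basename(path))
    return package_path


def extract_package(package_path, target_dir):
    """Extract a downloaded package and return the names it contained."""
    with zipfile.ZipFile(package_path) as package:
        package.extractall(target_dir)
        return sorted(package.namelist())
```

Compressing in the framework rather than relying on the secure copy server keeps the traffic small, whichever server implementation is used.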

 

Note: The shared account (jdfBaseDistribution) will not have a /upload directory!

A separate account for each customer

We also need an account for every customer we support. Each customer will have the same folder layout (as shown above), but these will be virtual directories. This will result in the same “view” for every customer, but data will be uploaded and downloaded from customer specific locations on the Secure Copy server.

image

Note: Files cannot be deleted from the upload directory.

As can be seen, the account jdfCustomer_JAMA has four virtual mount paths. And with the exception of the root path (which should never be used), all data is ‘re-directed’ to a directory specific for that account. So if we create a second account (jdfCustomer_JAMA00), you can see that it has its own ‘re-directed’ directories.

image

This results in the same view for both customers (the Virtual mount path), but data being stored and retrieved from physically different locations.

We only allow the secure copy protocol and only logins with private/public key pairs. For how to configure WinSSHD with private/public key pairs, see this link.
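The effect of the virtual mount paths can be mimicked in a few lines of Python. Note that this is a toy model: the physical root directory below is a made-up placeholder (the real locations are whatever you configure in the secure copy server), and only the /upload directory is taken from the rules above.

```python
import posixpath

PHYSICAL_ROOT = "/data/jdf"   # hypothetical storage root on the server


def physical_path(account, virtual_path):
    """Map the path an account sees to an account specific location."""
    normalized = posixpath.normpath("/" + virtual_path.lstrip("/"))
    if normalized == "/":
        # The virtual root stays empty and is never used for data.
        raise ValueError("the virtual root is not used for data")
    return posixpath.join(PHYSICAL_ROOT, account, normalized.lstrip("/"))
```

Both jdfCustomer_JAMA and jdfCustomer_JAMA00 can ask for /upload/report.zip, but the two requests end up in different directories, so neither customer can see the other's data.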

 

ssh key pairs

For each customer we need to create an ssh public/private key pair. We use puttygen.exe to create the key pairs. The public key is saved as “jdfCustomer_<CustomerName>.public” and the private key is saved as “jdfCustomer_<CustomerName>.private”.

Note: You cannot protect the private key file with a passphrase (not supported by the framework). So be sure to only distribute the private key file to the correct customer.

When creating the keys with puttygen.exe, you should use the default parameters, as shown below:

image

When your implementation of the secure copy server is working correctly you can continue with the next step: configuring the framework. Just make sure you can use the secure copy server with private/public key pairs before you continue.
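One way to test the server is a manual download with a command line secure copy client. The Python sketch below only builds such a command line. It assumes PuTTY's pscp as the client (a guess based on the keys being created with puttygen, the framework may use something else), and the function name is hypothetical. The key file name follows the naming convention described above.

```python
SCP_PORT = 5723   # the secure copy server listens on the SCOM agent port


def build_download_command(account, server, remote_path, local_path):
    """Return a pscp command line (as a list) for one download."""
    private_key = account + ".private"
    return [
        "pscp",
        "-batch",                 # never prompt, fail instead
        "-scp",                   # force the SCP protocol (no SFTP)
        "-P", str(SCP_PORT),      # connect to port 5723
        "-i", private_key,        # private key file for this account
        account + "@" + server + ":" + remote_path,
        local_path,
    ]
```

The list can be passed to subprocess.run, or joined with spaces and run from a command prompt on a test agent. If the download works with the key and fails without it, the server side is ready for the framework.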

 

The JDF Framework

The JDF Framework itself needs to be configured for use in your environment. What is required is fully described in the documentation of the framework. Although I tried to create a generic framework, some environment-dependent settings are still required. But first you should download the framework :)

The framework can be downloaded here

The documentation can be downloaded here

 

What’s next

If you have set up the framework, you can start testing it. The framework download includes two examples of how to use the framework. The first thing we created with this framework was the ability to update our manually installed agents from the console (we have this working in our test environment). In my next blog I will create a management pack which will update your manually installed agents to CU4. Just make sure you get the framework up and running :) If you fail to do so, just write a comment and I will try to answer your questions.

Posted in management packs | 1 Comment »

SCOM’s “un-discovery”. What doesn’t work here… And how to correct it.

Posted by rob1974 on January 26, 2011

 

SCOM’s main monitoring benefit, imho, is its ability to discover what is running on a server and, based on that information, start monitoring the server with the appropriate rules. When you follow Microsoft’s best practices you’ll first perform a lightweight discovery to create a very basic class and have the heavier discoveries run against that basic class. This is pretty good stuff actually. It helps the performance of an agent quite a lot, as it will only run heavy discoveries if the server has an application role and never run them on servers which have nothing to do with that application.

However, I’ve recently found a drawback with this two-step discovery, which I can probably explain best with a real world example:

Discover the windows domain controllers on “windows computers”. (The management pack this discovery runs from is an exception: usually the discovery is in the application mp itself, but apparently MS considered being a domain controller basic info. Similar discoveries for workstations and member servers can be found in this mp as well.) For this discovery a WMI query is used to determine whether the “windows computer” is also a domain controller (SELECT NumberOfProcessors FROM Win32_ComputerSystem WHERE DomainRole > 3; if this returns something, it’s a DC).
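The DomainRole > 3 condition in that query works because of how Win32_ComputerSystem numbers the roles. A short Python sketch of the lookup (the function name is just for illustration):

```python
# DomainRole values as documented for the Win32_ComputerSystem WMI class.
DOMAIN_ROLES = {
    0: "Standalone Workstation",
    1: "Member Workstation",
    2: "Standalone Server",
    3: "Member Server",
    4: "Backup Domain Controller",
    5: "Primary Domain Controller",
}


def is_domain_controller(domain_role):
    """Same test as the WHERE clause of the discovery query."""
    return domain_role > 3
```

Only the values 4 and 5 pass the test, so the query returns a row on domain controllers and nothing on workstations and member servers.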

image

When it is a “windows domain controller” it will run a few other discoveries to determine more info.

image

Just by looking at the classes you can imagine it’s not really lightweight anymore.

image

So far so good: on all my windows computers I run a simple query, and if that query returns something SCOM will also run a script that finds more interesting stuff about the DC.

But here’s the catch with this kind of discovery. Suppose I don’t need a certain DC anymore, but I still need to keep the server as it’s running some application I still need to use and monitor. What will happen? The lightweight discovery will do its job. It will correctly determine that the server is not a “windows domain controller” anymore and as a result it won’t run the script-discovery anymore.

You might ask: why is that bad? We didn’t want that discovery to run, did we? Yes, you are correct, we didn’t want to run this discovery against servers that aren’t DCs, but SCOM doesn’t unlearn the discovered classes automatically. Because the script discovery never runs again, SCOM never unlearns that this server no longer has the “Active Directory Domain Controller Computer Role”. And this is the class that is used for targeting rules and monitors. So although SCOM knows the server isn’t a “windows domain controller” anymore, it is still monitoring the “Active Directory Domain Controller Computer Role”. This results in quite a lot of noise (script errors, LDAP failures, etc.).

For now, there’s just a workaround available. You will need to override the 2nd discovery for that particular server. As the first discovery no longer includes this server as an object of the class, you can’t override the discovery for a “specific object of class: Windows Domain Controller”. You’ll need to create a group and include the server object. Then override the object discovery “for a group…” and choose the group you’ve just created.

image

What’s the point of disabling a discovery that didn’t run anyway? Well, now you can go to PowerShell and run the “Remove-DisabledMonitoringObject” cmdlet. This will remove the discovered object classes for this discovery and all of the monitoring attached to those classes.
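To make the mechanism easier to follow, here is a very simplified Python model of the two-step discovery and the workaround. It is a toy with made-up class and method names, not how SCOM works internally. It only shows why the role object survives the demotion and why the override plus cleanup removes it.

```python
class Server:
    def __init__(self, name, is_dc):
        self.name = name
        self.is_dc = is_dc


class ManagementGroup:
    def __init__(self):
        self.basic_class = set()    # "windows domain controller" objects
        self.role_class = set()     # "AD Domain Controller Computer Role" objects
        self.disabled_for = set()   # servers with the 2nd discovery disabled

    def run_discoveries(self, server):
        # Step 1: lightweight discovery, runs on every windows computer.
        if server.is_dc:
            self.basic_class.add(server.name)
        else:
            self.basic_class.discard(server.name)
        # Step 2: heavy discovery, only targeted at the basic class.
        if server.name in self.basic_class and server.name not in self.disabled_for:
            self.role_class.add(server.name)
        # There is no "else": step 2 never runs against a server outside
        # the basic class, so it can never report that the role is gone.

    def disable_role_discovery(self, server):
        # The override through a group that contains the server.
        self.disabled_for.add(server.name)

    def remove_disabled_monitoring_objects(self):
        # Cleanup of objects that belong to a disabled discovery.
        self.role_class -= self.disabled_for
```

Run the discoveries for a DC, demote it and run them again: the server leaves basic_class but stays in role_class. Only after disable_role_discovery and remove_disabled_monitoring_objects is the stale role object gone.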

Discoveries make SCOM stand out from other monitoring tools, but discovery needs to work both ways. Finding this out took me about 1 day. And that’s just 1 issue with 1 server (DNS was also installed on this server and had the same issue). Loads of servers might change role without me knowing about it, and when that isn’t reported to me I’ll just have extra noise in SCOM. I’m just not sure whether this can be picked up within SCOM itself or whether the “un-discovery” needs to be done by the mp’s logic. For the AD part it needs to be picked up by Microsoft anyway, but if the logic is built into the management pack, it will have an impact on all the custom-built mp’s by all you SCOM authors out there.

Posted in general, management packs, troubleshooting | 2 Comments »

 