November 2007 - Posts
MOM 2005 Management Server Agent Stuck Queue
One of the nicer aspects of MOM 2005 is its ability to cache information in case there is a loss of
network connectivity. This ability is available to both agents and management servers. The queue sizes
are easily changed under the global settings, in both the agent settings and the management server
properties, when you right-click the Global Settings icon under the Administration folder in the
MOM Administrator console.
I would recommend keeping the default settings for the queue size or increasing it in small increments
of a few megabytes as needed. I would not recommend a queue larger than 90 megabytes for management
servers or larger than 8 megabytes for agent computers. Doing so could introduce other problems.
We had been troubleshooting a problem where our management server queue would fill up.
Using Performance Monitor, I added the Queue Space Percentage Used counter for all the management servers
to see what percentage of the queue space was used. To my surprise, every management server's queue
was at 85% in use or higher. It stayed this high for an hour or so, even after the MOM service was
properly restarted. This is where I was introduced to the problem of a 'stuck queue', where something
in the queue has caused the agent to stop processing items/transactions after a certain point.
The fix is to clear the queue, or cache: stop the MOM service on your management server and then
delete the folder
c:\documents and settings\all users\application data\microsoft\microsoft operations manager\*your management group name*
Restarting the MOM service will then recreate the folder and files.
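The stop, delete, restart sequence can be sketched in Python. This is only an illustration, not a supported tool: the function name clear_mom_queue is made up for this example, the service name "MOM" is an assumption you should confirm on your own server, and the `net stop` / `net start` calls need an elevated prompt on the management server.

```python
import shutil
import subprocess
from pathlib import Path


def clear_mom_queue(queue_dir, service_name="MOM", run=subprocess.run):
    """Stop the service, delete the queue folder, then start the service again.

    service_name="MOM" is an assumption; confirm the real service name first.
    Returns True if the folder existed and was removed, False otherwise.
    """
    queue_dir = Path(queue_dir)
    # Stop the service so nothing is holding the queue files open.
    run(["net", "stop", service_name], check=True)
    try:
        if queue_dir.is_dir():
            shutil.rmtree(queue_dir)
            removed = True
        else:
            removed = False
    finally:
        # Start the service even if the delete failed; per the post,
        # starting it recreates the folder and files.
        run(["net", "start", service_name], check=True)
    return removed
```

Pass it the full path of the management group folder shown above. The run parameter exists only so the sequence can be rehearsed with a stand-in before touching a real server.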
If you use the ManualMC.txt file to discover computers, have you ever gotten this error in Event Viewer?
Event Type: Error
Event Source: Microsoft Operations Manager
Event Category: None
Event ID: 21074
Date: 10/16/2007
Time: 10:08:31 AM
User: N/A
Computer: MOMMANAGEMENTSERVERNAME
Description:
Computer Discovery is ignoring incorrect machine name "AGENTCOMPUTERNAME " in
"ROOTDRIVELETTER:\Program Files\Microsoft Operations Manager 2005\ManualMC.txt".
The entry does not validate to a format that can be used in this text file.
Please specify entries in one of the supported formats.
Supported Formats:
Netbios Computer Name
Domain\Computer Name
FQDN name
Well, the answer is in the event description:
Computer Discovery is ignoring incorrect machine name "AGENTCOMPUTERNAME "
There is an extra space that was pasted in with the computer name, and the computer will never be discovered until the extra
space is removed. These are always fun at 2:00 am. Enjoy!
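A quick way to catch this before 2:00 am is to scan the file for entries with stray whitespace. The sketch below assumes ManualMC.txt holds one computer name per line; the function name find_padded_entries is made up for this example.

```python
def find_padded_entries(lines):
    """Return (line_number, raw_entry) for entries with leading or trailing whitespace."""
    problems = []
    for number, line in enumerate(lines, start=1):
        entry = line.rstrip("\r\n")  # drop only the line ending, keep spaces and tabs
        # Empty lines are skipped; lines made up only of whitespace are reported too.
        if entry and entry != entry.strip():
            problems.append((number, entry))
    return problems
```

To use it, open your ManualMC.txt, pass the file object to find_padded_entries, and print each result with repr() so the extra space is visible.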
If you have multiple management servers in the same time zone that are members of the same management group, and you have noticed that your SQL Server is getting overtaxed during the discovery process, stagger the times when your management servers run discovery. In the MOM Admin console, under Computers -> Management Servers, right-click a management server and select Properties. Click the Discovery tab, uncheck the use global settings option, and move the discovery time by about an hour, so that one management server runs discovery at 1 am and the other runs it at 2 am.
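With more than two management servers it can help to plan the offsets first. The sketch below only computes a schedule; the times still have to be entered by hand on each server's Discovery tab. The function name, server names, and one-hour default gap are illustrative assumptions.

```python
from datetime import datetime, timedelta


def staggered_times(servers, first_run="01:00", gap_minutes=60):
    """Assign each server a discovery time, spaced gap_minutes apart."""
    start = datetime.strptime(first_run, "%H:%M")
    schedule = {}
    for index, server in enumerate(servers):
        run_at = start + timedelta(minutes=index * gap_minutes)
        # Only the time of day is kept, so the schedule wraps past midnight.
        schedule[server] = run_at.strftime("%H:%M")
    return schedule
```

For two servers with the defaults, this gives the 1 am and 2 am split described above.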
Earlier this week I created a group with a "&" in the name. In the right-hand display pane, Q&A would display as QA; in the list view on the left side, it would be displayed as Q&A.
This was on a new OpsMgr 2007 management group with SP1 RC.
Two good bits on targeting in Operations Manager 2007.
Part one is about the differences between MOM 2005 and Operations Manager 2007. http://blogs.technet.com/momteam/archive/2007/10/31/targeting-series-part-1-differences-between-2005-and-2007.aspx
Part two is about why targeting computer groups can fail in Operations Manager 2007. http://blogs.technet.com/momteam/archive/2007/11/14/targeting-series-part-2-why-targeting-a-computer-group-fails.aspx
Check out this site; they have gotten more active over the past few months.
The following text is from an email blast from the Operations Manager Product team.
The product team is excited to announce the availability of the Operations Manager 2007 Service Pack 1 Release Candidate, on Microsoft Connect.
The key goals of this public distribution of the Service Pack 1 RC is to gain additional feedback from production deployments, to satisfy customers who are awaiting a stable product before the end of the year, and assure that the final RTM is of the highest possible quality. The RTM of Service Pack 1 is targeted for mid-February 2008, and will:
· Address critical customer initiated issues, including a roll-up of all hotfixes.
· Improve the ability to deploy and use Operations Manager 2007.
· Enhance the supportability of Operations Manager 2007.
This Service Pack 1 RC is a publicly available release that is fully supported, and which will be fully upgradable to the final RTM version of the Service Pack. Support options include:
· A dedicated SP1 newsgroup at Microsoft.public.opsmgr.sp1 (available under http://www.microsoft.com/communities/newsgroups/en-us/default.aspx)
· Microsoft Customer Support Services (CSS)
To learn more about the service pack and to download the release candidate do the following:
· Go to http://connect.microsoft.com and sign-in with your Live ID
· Click on the "My Participation" link
· Select the "System Center Operations and Service Management" link
For customers who were not already part of the OpsMgr beta they can sign-up via the http://connect.microsoft.com/systemcenter page.
We ran into this problem a few weeks ago. For a little more than a week, the MOM 2005 daily statistics that we collect nightly were exactly the same. I found this a bit strange, so I ran the stat collection in the morning, which gave the same results.
After a bit of digging I found that several key SQL jobs for the OnePoint database were no longer running on schedule. It seems that the last round of patching stopped the SQL jobs from running on their proper schedule. (Patching took place right after the last time the SQL jobs ran.) Once I got the three SQL jobs that start with "OnePoint - TodayStatisticsUpdate*" running again, the statistics started updating as expected.
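It took us over a week to notice the repeated numbers, so a simple check on the collected statistics may be worth having. The sketch below counts how many of the most recent snapshots are identical to the latest one. The snapshot shape (anything comparable with ==, such as a dict of counters) and the function name are assumptions for illustration.

```python
def stale_run_length(snapshots):
    """Count how many of the most recent snapshots are identical to the latest one."""
    if not snapshots:
        return 0
    latest = snapshots[-1]
    count = 0
    # Walk backwards from the newest snapshot until the values differ.
    for snapshot in reversed(snapshots):
        if snapshot != latest:
            break
        count += 1
    return count
```

A result of 2 or more on nightly statistics would be a reason to go check whether the SQL jobs are still running.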
Have you ever wanted to quit your IT job and sell all of your belongings to travel the world and fish? Well, a co-worker of mine did just that. Follow Jason's adventure at http://www.onfurlow.com