Shaun Cassells at MyITForum.com

Systems Management Server (SMS) 2003, System Center Configuration Manager (SCCM or ConfigMan) 2007, PowerShell, scripting and security (including patching), Finance, Fitness and Fun

May 2007 - Posts

How to Determine SMS Thread Responsible for High CPU Use [sms] Q234508

This article has been retired off of Microsoft’s KB archive.  This technique works perfectly for SMS 2003 and several other server applications.  I thought posting it here again would be more useful and searchable.
 
SMS: How to Determine SMS Thread Responsible for High CPU Use [sms]
ID: Q234508    CREATED: 03-JUN-1999   MODIFIED: 30-JUN-2000
winnt:2.0
winnt
PUBLIC | kbSMS200

======================================================================
SUMMARY
======================================================================
If System Management Server 2.0 or 2003 causes high processor utilization on the site server, there is a way to track the SMSEXEC thread responsible in order to begin troubleshooting the problem.

MORE INFORMATION
======================================================================
To determine the SMS thread responsible for high CPU utilization, run Performance Monitor from a remote server, and then add the counters in the following table:

+------------------------------------------------------------------------+
| Object    | Counter          | Instance(s)                             |
+------------------------------------------------------------------------+
| Processor | % Processor Time | <all instances>                         |
+------------------------------------------------------------------------+
| Thread    | % Processor Time | <all instances needed> example: smsexec |
+------------------------------------------------------------------------+
| Thread    | ID Thread        | <all instances needed>                  |
+------------------------------------------------------------------------+
| Thread    | Thread State     | <all instances needed>                  |
+------------------------------------------------------------------------+
Definition of Thread State:
   The current state of the thread:
   0 = Initialized
   1 = Ready
   2 = Running
   3 = Standby
   4 = Terminated
   5 = Wait
   6 = Transition
   7 = Unknown

   A Running thread is using a processor; a Standby thread is about to use one. A Ready thread wants to use a processor, but is waiting for a processor because none are free. A thread in Transition is waiting for a resource in order to execute, such as waiting for its execution stack to be paged in from disk. A Waiting thread has no use for the processor because it is waiting for a peripheral operation to complete or a resource to become free.

Using Performance Monitor in Report (CTRL+R) view:
   Locate the column with the thread(s) using the greatest processor time(s) and note the ID Thread number.

Using Performance Monitor in Chart (CTRL+C) view:
   Locate the line with the thread(s) using the greatest processor time(s), and note the Thread Instance.
   Locate the ID Thread counter with the same instance number. Highlight (right-click) that counter.
   The number in the boxes Last, Ave, Mix, and Max will all be the same. This is the ID Thread number.

Convert the decimal version of the ID Thread (given on the report) to hex using the Calculator (Calc.exe).

To begin troubleshooting the root of your high processor utilization, find the log file in the SMS\logs directory on your site server using the command Findstr /i <HexID>*.log.

SQL Perf

1) Setup Performance Monitor Logs:

For every connection to SQL server, a new SPID (SQL Server Process Identifier) will be created.  For each SPID, a new thread of the SQLSERVER.EXE process is created, which is also given a KPID ("KPID" is the kernel-process ID. Under SQL Server for Windows this is the thread ID number, also known as "ID thread," and is assigned by Windows when the thread is created).

A> To identify which threads of a process are causing high CPU utilization, perform the following steps (steps assume Windows 2000/XP):

1.  On start menu, select 'Run', and type "Perfmon"
2.  Ensure you have "System Monitor" selected.  If any counters are added by default, remove them from display.
3.  In the toolbar, click the "+" to add counters to your display.
4.  (select the proper machine if running perfmon against a remote machine)
5.  Under "Performance Object", select 'Thread'
6.  Accept the default choice for "Select counters from list", "%ProcessorTime"
7.  Under "Select instances from list:", scroll down to the first SQL Server thread, displayed as "sqlservr/0".  Click on "sqlsrvr/0", then scroll down to the last SQL Server thread (may be different name), for example "sqlsrvr/9".   Hold down the "Shift" keyboard key, and click on the last instance, so that all SQL Server thread instances are selected.  You may wish to scroll up and down the list to verify that you have them all selected.  When you have all SQL Server thread instances selected, click "Add"
8.  Click "Close" on the "Add Counters" dialogue box
9.  Click the lightbulb icon on the toolbar to enable highlighting.
10. In its idle state, all threads should use less than 5% processor utilization over time.  The counter line for the current thread selected should now be highlighted in the display.  If any threads display that they are significantly above 0% processor utilization over time, and do not seem to release the processor, note their "Instance" number.  This is referred to as the "Thread ID", and is the thread responsible for increased processor utilization by SQL Server.
11. Minimize the performance monitor console.

B> To identify which component of SMS is responsible for the increased SQL Server processor utilization, perform the following steps (steps assume Windows 2000/XP):

1.  On start menu, select 'Run', and type "Perfmon"
2.  Ensure you have "System Monitor" selected.  If any counters are added by default, remove them from display.
3.  In the toolbar, click the "+" to add counters to your display.
4.  (select the proper machine if running perfmon against a remote machine)
5.  Under "Performance Object", select 'Thread'
6.  Under "Select counters from list:", select "ID Thread"
7.  Under "Select instances from list:", select all SQL Server Thread IDs noted above, in the previous step.  Note, you may have multiple thread ids identified.  If this is the case, then select the first thread id, then use "CTRL-Click" to select multiple other threads.  Click "Add" when all appropriate thread ids are selected.
8.  Click "Close" on the "Add Counters" dialogue box.
9.  For each Thread ID, there will be a value specified just under the graph display, such as "Last", "Average", etc.  Note the number in "Last".  This is the KPID of the thread.
10. Minimize this instance of Performance Monitor.
11. Launch the SQL Query Analyzer, attached to the target instance of SQL Server.
12. Run the following query: select SPID, KPID, status, hostname, dbid, cmd from master..sysprocesses
13. The results pane of the Query Analyzer should populate with the results of the query.  Note that for each SPID, there is a KPID.  Locate the KPID identified in step 9 above.  Note the SPID for each KPID identified in step 9.
14. Run the following query: sp_who2.  Each row will contain a distinct SPID.  Locate the rows for each SPID identified above.  Scroll the results window the right, and note the "ProgramName" column.  The data contained in this column will identify BY NAME the component of SMS which has initiated a connection to the SQL Server to perform a work item.  You will see “ProgramName"s like "SMS_COLLECTION_EVALUATOR", which are the names of the threads within SMS that  would attach to the database.

You have now identified which thread in SMS is causing the SQL server to "peg" the CPU(s) of the SQL Server. To see what the work item is that SMS is issuing to the SQL Server which is causing the performance issue, run the following query in Query Analyzer:

DBCC INPUTBUFFER ( SPID ) for the SPID noted above.  The output, under the "Event Info" column will contain the first 255 characters of the TSQL in the work item.  This should give you clues as to what the SMS component is attempting to do which is pegging the processor(s) on the SQL Server.

2) Monitor the ENT site to determine rather or not CPU is consumed at 100% for an extended amount of time. It’s normal at times for the CPU to spike to 100% as long as it is not for an extended period (this is subjective depending on your server). The goal here is to create a baseline by monitoring this server.

Reference Articles:

234508 SMS: How to Determine the SMS Thread Responsible for High CPU Utilization
http://support.microsoft.com/?id=234508
117559 INFO: How to Correlate Spid, Kpid, and Thread Instance
http://support.microsoft.com/?id=117559
121555 Compression Performance Tuning in Systems Management Server
http://support.microsoft.com/?id=121555
224587 HOW TO: Troubleshoot Application Performance with SQL Server
http://support.microsoft.com/?id=224587
150934 How to Create a Performance Monitor Log for NT Troubleshooting
http://support.microsoft.com/?id=150934

What Anti-Virus scanning exclusions should be considered for system and servers?

Shaun's Note: I am not the orginal author.  I have updated some of the info:

Consider the following file scanning exceptions for your Anti-Virus software where applicable:

NOTE: The %systemroot% is normally the C:\WINDOWS or C:\WINNT directory depending on your OS. 

NOTE: the %systemroot% variable will not work as an exclusion for some OSs.  So make sure to spell out full path in your exclusion files (GPO or via AntiVirus Server)

1.) %systemroot%\System32\Spool (and all the sub-folders and files)
2.) %systemroot%\SoftwareDistribution\Datastore
Refer to the following article for information:
KB822158 - Virus scanning recommendations for computers that are running Windows Server 2003, Windows 2000, or Windows XP http://support.microsoft.com/kb/822158
3.) Any Network Drives that are mapped.


The following steps are Server Role specific:
==========================================================
1.) If your system is also a Domain Controller (DC) / DNS / DHCP also exclude the following from Anti-Virus Scanning:
a.) %systemroot%\Sysvol folder (include all the sub-folders and files)
b.) %systemroot%\system32\dhcp folder (include all the sub-folders and files)
c.) %systemroot%\system32\dns folder (include all the sub-folders and files)
d.) %systemroot%\ntds
Refer to the following article for information:
KB822158 - Virus scanning recommendations for computers that are running Windows Server 2003, Windows 2000, or Windows XP http://support.microsoft.com/kb/822158

2.) If File Replication (NTFR) service is running on your system, make sure your Anti-Virus software is compatible:
KB815263 - Antivirus, backup, and disk optimization programs that are compatible with the File Replication Service
http://support.microsoft.com/kb/815263
And exclude:
a.) %systemroot%\ntfrs folder (include all the sub-folders and files)
b.) Files that have the .log and .dit extension

3.) If you have IIS installed, exclude:
a.) The IIS compression directory (default compression directory is %systemroot%\IIS Temporary Compressed Files)
b.) %systemroot%\system32\inetsrv folder
c.) Files that have the .log extension
Refer to the following knowledge base articles for reference:
KB817442 - IIS 6.0: Antivirus Scanning of IIS Compression Directory May Result in 0-Byte File
http://support.microsoft.com/kb/817442
KB821749 - Antivirus software may cause IIS to stop unexpectedly
http://support.microsoft.com/kb/821749

4.) If you have SQL installed, you may want to exclude the SQL folder and databases files (or database file types) from scanning for performance reasons:
KB309422 - Guidelines for choosing antivirus software to run on the computers that are running SQL Server
http://support.microsoft.com/kb/309422

5.) If you have Exchange installed, perform the relevant file-based scanning exclusions listed in Knowledge Base articles:
KB328841 - Exchange and antivirus software
http://support.microsoft.com/kb/328841
KB823166 - Overview of Exchange Server 2003 and antivirus software
http://support.microsoft.com/kb/823166
KB245822 - Recommendations for troubleshooting an Exchange Server computer with antivirus software installed
http://support.microsoft.com/kb/245822

6.) If you have Cluster services, make sure your Anti-Virus software is compatible:
KB250355 - Antivirus Software May Cause Problems with Cluster Services
http://support.microsoft.com/kb/250355
NOTE: If you have a SQL cluster, make sure that you exclude these locations from virus scanning:
a.) Q:\ (Quorum drive)
b.) %systemroot%\Cluster
c.) SQL Server data files that have the .mdf extension, the .ldf extension, and the .ndf extension

7.) If you have Sharepoint installed, you should exclude:
a.) Drive:\Program Files\SharePoint Portal Server
b.) Drive:\Program Files\Common Files\Microsoft Shared\Web Storage System
c.) Drive:\MSDEDatabases (particularly on SBS) (where Drive: is the drive letter where you installed SharePoint Portal Server)
Refer to the following knowledge base articles for reference:
KB320111 - Random Errors May Occur When Antivirus Software Scans Microsoft Web Storage System
http://support.microsoft.com/kb/320111
KB322941 - Microsoft's Position on Antivirus Solutions for Microsoft SharePoint Portal Server
http://support.microsoft.com/kb/322941

8.) If you have a Systems Management Server (SMS), you should exclude folders:
a.) SMS\Inboxes
b.) SMS_CCM\ServiceData
Refer to the following knowledge base articles for reference:
KB327453 - Antivirus programs may contribute to file backlogs in SMS 2.0 and in SMS 2003
http://support.microsoft.com/kb/327453

NOTE: If you exclude the SMS\Inboxes directory from virus scanning or remove the antivirus software, you may make the site server and all clients vulnerable to potential virus risks. The client base component files reside in the SMS\Inboxes directory.

9.) If you have a MOM (Microsoft Operations Manager) Server, you consider excluding:
a.) Drive:\Documents and Settings\All Users\Application Data\Microsoft\Microsoft Operations Manager
b.) Drive:\Program Files\Microsoft Operations Manager 2005 (where Drive: is the drive letter where profiles are located)

10.) If you have an Internet Security and Acceleration Server (ISA) Server, you should exclude:
a.) The ISALogs folder. By default, the ISALogs folder is located in the folder where you installed ISA Server. Typically, this location is Drive:\Program Files\Microsoft ISA Server.
Refer to the following knowledge base articles for reference:
KB887311 - Event ID 5, event ID 14079, and event ID 14176 are logged in the Application log on your Internet Security and Acceleration Server 2000 computer
http://support.microsoft.com/kb/887311

11.) If you have a Windows Software Update Services (WSUS) Server role, you consider excluding:
a.) Drive:\MSSQL$WSUS
b.) Drive:\WSUS
(where Drive: is the drive letter where you installed Windows Software Update
Services)
Also refer to the following knowledge base articles for reference:
KB900638 - Multiple symptoms occur if an antivirus scan occurs while the Wsusscan.cab file is copied
http://support.microsoft.com/kb/900638


MORE INFORMATION:
KB49500 - List of antivirus software vendors
http://support.microsoft.com/kb/49500
KB129972 - Computer viruses: description, prevention, and recovery
http://support.microsoft.com/kb/129972

Small Business Server (SBS):
========================================
KB885685 - How to troubleshoot the POP3 Connector in Windows Small Business Server 2003
http://support.microsoft.com/kb/885685

SOX050603700001 - How do I exclude a file from AV scanning?
SOX040212700018 - Anti Virus Software and System State Backup
SOX060301700048 - ISA 2004 Firewall Service crashes intermittently with Event ID: 5 Source: Microsoft Firewall
SOX060307700037 - MOM 2005/ File level Anti-virus scanners
SOX061205700029 - MOM Agent Installation fails with -2147023277

KB837932 - Event ID 2108 and Event ID 1084 occur during inbound replication of Active Directory in Windows 2000 Server and in Windows Server 2003
http://support.microsoft.com/kb/837932
Anti-Virus folder exclusions have not been configured (Exchange)
http://www.microsoft.com/technet/prodtechnol/exchange/Analyzer/9fb755f5-5f0b-4817-a584-70c76a62eae2.mspx
Process: Manage Antivirus Software on Domain Controllers
http://www.microsoft.com/technet/solutionaccelerators/cits/mo/winsrvmg/adpog/adpog3.mspx#EHBBG

Keywords:
AV scanning, Scan exceptions, Antivirus scanning, first level scanning exclusions, first level scanning exceptions, Server Roles, Server scanning