The way your clients behave and how you would like them to behave are not necessarily the same thing. That isn’t always bad, but if you understand how they behave and can explain that, then you (and your stakeholders) can be reasonably confident that everything is under control. If not, then it’s time to do some investigation and possibly make some adjustments.
For example, the following two graphs show how two populations of clients report ConfigMgr heartbeat discovery:
You’ll notice the graphs are quite different. Part of the reason is that the first population has clients reporting heartbeat every day, and thus most clients report within a day, a good number more in two days, and the rest progressively over time. This would be reasonable for a client-base that is supposed to report heartbeat daily and is mostly powered up in the office each day (online to the ConfigMgr servers) with a small fraction of machines being mobile or being rebuilt over time. If that’s expected behavior, then all is well in this environment.
The second population has a small fraction of clients reporting in during the first two days, a quiet period (presumably a weekend), and then linearly more clients reporting heartbeat until day 7. From there the ‘laggards’ report in similarly to the first population, but at a slightly steeper rate. In this case the heartbeat discovery rate is set to 7 days, so that curve makes sense.
Both populations look to be reasonably healthy, in that a very high fraction of the clients report in within their expected heartbeat discovery cycles. When the heartbeat rate is set to daily then that conclusion can be reached more quickly, and any future disturbances in this curve would be caught more quickly. So it is advisable to use a frequent heartbeat cycle if possible (if servers and network allow it).
How can you produce a baseline client activity curve for your clients? The following SQL will do it for you:
declare @olddate datetime, @NullVal datetime, @today datetime
set @today=GETDATE() -- @today must be set before it is used below
set @olddate=DATEADD(day,-7, @today)
set @NullVal = CONVERT(datetime,'1/1/1980')
select datediff(DAY, AgentTime, @today) 'DaysAgo', count(DISTINCT Name0) 'Clients' into #temp1
from v_R_System sys full join (select ResourceId, MAX(AgentTime) as AgentTime from v_AgentDiscoveries where agentname<>'SMS Discovery Data Manager' AND agentname not like '%!_AD!_System%' ESCAPE '!' group by ResourceId) disc on disc.resourceid=sys.resourceid where client0=1 and obsolete0=0
group by datediff(DAY, AgentTime, @today)
-- thanks to http://www.sqlteam.com/article/calculating-running-totals for the basis of this part:
SELECT a.DaysAgo, SUM(b.Clients) 'Total Clients', a.Clients 'Daily Clients'
FROM #temp1 a CROSS JOIN #temp1 b WHERE (b.DaysAgo <= a.DaysAgo)
GROUP BY a.DaysAgo,a.Clients ORDER BY a.DaysAgo,a.Clients
Plug the results into Excel and produce the graph from there. Now you understand your clients better!
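If you would rather check the running-total step outside SQL, the same idea can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical and the sample counts are invented, not taken from either population described above.

```python
from itertools import accumulate

def cumulative_clients(daily_counts):
    """daily_counts maps DaysAgo -> number of clients whose latest
    heartbeat was that many days ago.
    Returns (days_ago, daily_clients, total_clients) rows ordered by days_ago."""
    days = sorted(daily_counts)
    daily = [daily_counts[d] for d in days]
    totals = list(accumulate(daily))  # running total, like the self-join in the SQL
    return list(zip(days, daily, totals))

# Invented sample data, for illustration only
sample = {0: 700, 1: 150, 2: 60, 5: 40, 9: 50}
for days_ago, daily, total in cumulative_clients(sample):
    print(days_ago, daily, total)
```

The 'total' column is what you would plot to get the baseline curve; the 'daily' column shows how many clients each day contributes.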
p.s. This kind of analysis gets really interesting when you compare key client service activities (such as patch installation, inventory reporting, software distribution status, etc.) vs. these baseline curves. If any of those deviate significantly from the baseline, then there’s probably something wrong with those subsystems. The queries for that analysis are not very different from those above.
Those of us that have been in the ConfigMgr (SCCM/SMS) business for a while have had the joy (and challenge) of learning the new versions as they’re developed and then released. There are many approaches to that learning curve, and I’m a fan of all of them. You can learn top-down, meaning you review the marketing and product documentation that highlights the differences and then focus on whatever details are relevant to you. You can experiment with the new version, seeing how it works, what’s changed, and what’s challenging - that’s a middle ground approach to me. And then there’s the bottom-up approach of looking at the technical changes and trying to understand why they were made and how you can use them.
For some reason I actually like the latter approach most of all. By seeing the technical implementation details I can understand what’s really changed. A good example is log files, both client-side and server-side. If a new log is introduced and it has some substance, then that must be an important component, and it must add some important functionality. The high-level details give us the context of that importance, but the low-level details give us the clues to make it work well.
As I’ve mentioned in the past, the first moment at which I fell in love with ConfigMgr was looking at the database. That’s hard to explain, but I still maintain that ConfigMgr is a data-centric system and thus the database is very important. When I’m learning a new version of ConfigMgr, I start by focusing on the database changes.
An easy approach to looking at database changes is to fire up SQL Server Management Studio, connect to your new ConfigMgr database, expand the Views node, and look for changes. With up to 1,000 views that’s problematic, though you can ease the process by ignoring the collection views, inventory views, and ‘secondary views’ (those that seem to be intermediate views). But that all depends on your memory of the previous version’s database schema and what’s important, so it’s easy to miss fun new views.
Another alternative is to query for the differences. Those queries (assuming you’re doing them from the database server with the previous version) would be:
-- lost views
SELECT * from sysobjects where type='V'
and name not like 'v_CM_RES_COLL_%' and name not like '_RES_COLL_%' and name not like 'v_HS_%'
and name not in
(SELECT name from [new_version_server].[new_version_database].dbo.sysobjects where type='V'
and name not like 'v_CM_RES_COLL_%' and name not like '_RES_COLL_%' and name not like 'v_HS_%')
order by name
-- new views
SELECT * from [new_version_server].[new_version_database].dbo.sysobjects where type='V'
and name not like 'v_CM_RES_COLL_%' and name not like '_RES_COLL_%' and name not like 'v_HS_%'
and name not in
(SELECT name from sysobjects where type='V'
and name not like 'v_CM_RES_COLL_%' and name not like '_RES_COLL_%' and name not like 'v_HS_%')
order by name
You’ll notice the queries ignore the collection and hardware inventory history views. Collections are a constant and I’d rather compare hardware inventory changes by looking at the SMS_def.mof changes.
The lost views will likely be a manageable number but the list of new views could be quite numerous, depending on how big the version differences are between the versions you’re comparing. Service packs are likely to introduce few changes, major versions will introduce a lot, and minor versions will be somewhere in between.
If there are a lot of changes then you’ll want to do variations on the above queries, filtering out groups of views as you understand them. Views do often have naming conventions that group related views to each other and so if you understand the significance of the group then you can eliminate them from your list of views to study.
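If you export the view names from both databases to text, the same comparison (including filtering out groups of views you already understand) can be sketched in Python. This is a rough sketch under assumptions: the function name is hypothetical, the sample view names are only for illustration, and it uses simple case-insensitive prefix matching rather than the SQL LIKE patterns in the queries above.

```python
def compare_views(old_views, new_views, ignore_prefixes=()):
    """Return (lost, added) view names, skipping any view whose name
    starts with one of the ignored prefixes (case-insensitive)."""
    prefixes = tuple(p.lower() for p in ignore_prefixes)

    def keep(name):
        return not name.lower().startswith(prefixes)

    old = {v for v in old_views if keep(v)}
    new = {v for v in new_views if keep(v)}
    return sorted(old - new), sorted(new - old)

# Illustrative names only; substitute the lists exported from your own databases
old = ["v_R_System", "v_AgentDiscoveries", "v_HS_DISK", "v_OldExample"]
new = ["v_R_System", "v_AgentDiscoveries", "v_CH_EvalResults", "v_CM_RES_COLL_SMS00001"]
lost, added = compare_views(old, new, ignore_prefixes=("v_CM_RES_COLL_", "v_HS_"))
print("lost:", lost)
print("new:", added)
```

As you come to understand a group of views, add its prefix to ignore_prefixes and rerun to shrink the list you still need to study.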
Admittedly, interpreting what a new view adds can be tricky. The name of the views and the columns will give clues. Doing queries against the view when it’s got some data will help further. To a large degree you just have to follow your instincts and focus on those that seem most interesting. And this is just one way to learn the new version, so don’t spend too much time focusing on this technique.
Finally, you might ask what can be seen by using this technique to compare ConfigMgr 2007 with ConfigMgr 2012. My suggestion is that it’s too early to jump to conclusions. ConfigMgr 2012 will change as time goes on (otherwise it would be released already). Exciting view changes are blog posts for future dates.
p.s. I focus on views here, as opposed to tables, because views are what we have always been encouraged to use as ConfigMgr customers. Generally that works, but sometimes there are interesting table additions that don’t get reflected in views. For that reason you may want to also look at tables, but that’s an easy extension of the above concepts.
Throughout my career I’ve been amazed by the diversity of the environments that you and I work in or serve. People sometimes ask why computer management is often so complex, but to me the answer is simply that there’s no one-size-fits-all solution. That’s part of what makes this field so much fun!
Furthermore, each of the variables is a scale, with different organizations falling at different points in each dimension. In no particular order:
- size by count of clients - 100 clients to 300,000+ clients
- size by number of server sites - 1 site to 18,0000 sites
- size by number of locations – often much more than the number of server sites
- size by the size of individual sites or locations - 2 clients to 10,000 clients at an individual site or location
- 'stability' of clients - all users at designated, fixed machines to many users swapping many machines and often on the road at many locations (possibly connecting by various means). Rebuilding machines or having multiple clients per user can further complicate this variable
- features used - such as inventory collection, operating system deployment, software distribution, patching, remote control, software metering, and configuration management
- sophistication of feature use - out of the box defaults to highly customized inventories, extensive reporting, numerous and fancy packages, etc.
- network complexity - no links or all good links, to a mish-mash of link types, including complex scenarios such as satellite links
- administration and engineering organization – one unit, centralized and working well together, to many units, decentralized and centralized, not working well together
- business organization (stakeholders) – centralized to many business units with widely varying goals, philosophies and needs
- security - out of the box, generic to highly locked down with third party tools and complete paranoia
- domain/forest models - one domain to multi-master domain model, or multi-domain model
- administration and engineering sophistication - barely know what they're doing to gurus with programming skills ready to change or fix anything
- management support – “don’t screw up”, to “I trust you, and if the users give you a hard time – why don’t they take computer management seriously?”
- sensitivity to problems / risk tolerance - go with the flow to it must work first time, every time, forever
- history - starting clean, or heavily invested in a competitor, to heavily invested in ConfigMgr
- vendors - pure Microsoft shop to mix of Microsoft, Linux, 3rd parties, etc.
- hardware – all one kind of high-end PCs/laptops to a mish-mash of low-end to high-end PCs from various vendors plus Apple. Devices further complicate this story
- operating systems – mostly XP or Win7 to a sprinkling of everything ever offered
- budget - money is not an object to do it all for free (or get the vendor to pay for it)
- management wisdom - total wisdom to hasn't got a clue
- time frame - take the time it needs to “you've got until Friday”
- experience doing projects
- job security – cover my ass and hope for the best, to let’s make it happen and I can handle the risks
What other variables would you add?
I just had the most amazing day with my new co-workers at 1E! Yes, that means I’m now working at 1E, and thus not working at Microsoft. I had a wonderful 12 years at Microsoft, working with many great people in various roles, but I decided it was time to take my career to the next level.
Those of you that have known me throughout my career know that I have always loved ConfigMgr (originally SMS) and also know that it has been all about the community for me. We, the community (customers, Microsoft, and partners) have made ConfigMgr great, through good times and bad. We’ve continually learned from each other, shared tools, found solutions, etc. I started as a customer (at the Government of Ontario), and have participated in the community through books, my magazine column, this blog, presentations at MMS, forums, etc. At Microsoft it was a huge honor to be a technical writer for several years and then to be part of the team that deployed ConfigMgr to 300,000 clients in a complex set of environments as the key internal customer of ConfigMgr. That included ‘dogfooding’ ConfigMgr through many releases and thus working with the product team to ensure it was ready for your production environments.
So working at 1E is the right move for my next career stage. Working with MANY customers in person at the 2010 Partner of the Year will allow me to take my passion for ConfigMgr and computer management to new highs and to contribute in new ways to the community and 1E in particular. Working with the all-star team here at 1E is incredibly inspiring.
It’s always sad to leave, but because the team I left at Microsoft is so great, I’m more than confident they will continue to dogfood ConfigMgr at the high level we all expect and will be very successful with the ongoing operations of ConfigMgr at Microsoft and related environments.
1E, as most of you know, has been a great partner company for ConfigMgr for over a decade. 1E provides consulting services and proven content distribution, power management, user-driven software deployment, and client health solutions, deployed at many ConfigMgr customers, including many of the largest customers. That history of innovation is getting even better, with products such as NightWatchman Server Edition and AppClarity. See http://www.1E.com/ for more background.
I look forward to working, talking, and sharing with you in my new role. This is gonna be great!!
MMS 2011 is just days away, and while there are many things that make a conference great, mementos are key amongst them. Yes, swag!
I’m sure there will be some great stuff at MMS, but my latest addition will be hard to beat:
This awesome collector’s item is a “WBEM” team mug from about a dozen years ago, where WBEM means Web-based Enterprise Management. WBEM is the standard that WMI is based on (which in turn is Windows Management Instrumentation). WMI is central to ConfigMgr and grew out of the same team that now brings us ConfigMgr (SMS back in those days). WBEM is the standard that makes WMI a management platform for more than just Windows. WBEM is hard to pronounce literally, and thus the close-enough “web-M” is used. Notice the web theme on the mug.
Thanks go to Margaret Boos.
This mug will, of course, get a prominent place on my wall-of-swag: http://myitforum.com/cs2/blogs/pthomsen/archive/2010/04/17/146119.aspx
Inactive clients = offline clients + unhealthy clients.
That’s something I’ve said in words many times in the past, but the formula is a powerful way to articulate it. We all did at least basic algebra in high school and thus understand that if you know 2 variables in such an equation then you can solve for the third variable. So if we know two of the client health values confidently then we know the third one confidently – that’s a breakthrough!
To recap, those of us that do computer management client health know that fully unhealthy clients and offline clients share a key trait: they don’t report any data to the servers. Offline clients are ‘good’ in the sense that they can’t do anything bad to other computers (like infecting them with viruses), and they can’t be managed, so they should be removed from the denominator when calculating computer management success (such as patch management success). But fully unhealthy clients do the same thing (don’t report data to the servers). Therefore we can’t directly distinguish between offline clients and fully unhealthy clients.
If we have a small number of inactive clients then the whole issue is insignificant. But if we have too many of them then we do care about this issue. For example, if we do a patch (software update) management deployment and 96% of the clients are reporting as compliant after 3 weeks while our SLA is 98% then we would be worried. If we checked the status and found that 2.3% are reporting “Enforcement state unknown”, 1.5% failed to install the patches (for the usual patch installation issues), and all other ‘buckets’ are very small, then obviously the “Enforcement state unknown” clients are the important ones. As client health administrators we have to explain what that bucket means.
“Enforcement state unknown” brings us back to the formula. These are generally inactive clients, which is easily demonstrated with a simple query or report. But are they offline or unhealthy? If we could reliably distinguish the offline clients (‘O’ in the formula) then we could take them out of the patch management success calculation. Or we could focus on the unhealthy clients and fix them. In a future post I’ll discuss the offline client problem but for now let’s agree that there is no good solution for it at this time.
The good news is that with ConfigMgr 2012 we should be able to reliably identify the unhealthy clients (‘U’ in the formula). We know ‘I’ and ‘U’ and therefore can calculate ‘O’! The unhealthy clients are determined by the ConfigMgr 2012 “ccmeval.exe” program, which has few dependencies and runs on the clients themselves, and therefore should be very reliable. We already have ‘I’ from the server-side data. Therefore the formula allows us to know ‘O’ with great accuracy. In my case I found that 7% of the “Enforcement state unknown” clients were unhealthy and therefore the SUM administrator could be confident that most of the “Enforcement state unknown” clients were simply offline. He could adjust his compliance calculation accordingly and know that he had more than 98% compliance, meaning that he had indeed achieved his SLA.
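To make the arithmetic concrete, here is a small Python sketch using the numbers from this example. It rests on assumptions of my own: the 2.3% “Enforcement state unknown” bucket is treated as the inactive clients (‘I’), all percentages are of the total client count, and the function name is hypothetical.

```python
def adjusted_compliance(compliant_pct, inactive_pct, unhealthy_share_of_inactive):
    """Solve I = O + U for O, then remove the offline clients from the
    denominator of the compliance calculation.
    Percentages are of all clients; the share is a fraction between 0 and 1."""
    unhealthy_pct = inactive_pct * unhealthy_share_of_inactive  # U
    offline_pct = inactive_pct - unhealthy_pct                  # O = I - U
    adjusted = 100.0 * compliant_pct / (100.0 - offline_pct)
    return offline_pct, adjusted

# 96% compliant, 2.3% inactive, 7% of the inactive clients found unhealthy
offline, adjusted = adjusted_compliance(96.0, 2.3, 0.07)
print(f"offline: {offline:.2f}%  adjusted compliance: {adjusted:.2f}%")
```

With these inputs the offline clients come to a little over 2% of the population, and removing them from the denominator lifts the compliance figure from 96% to just over 98%.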
p.s. Credit goes to Josh Pointer (Principal PM, ConfigMgr product team) for emphasizing this concept to me. He didn’t word it this way but the concept is the same.
I’m thrilled to see that the powers-that-be accepted my latest proposal for a Microsoft Management Summit presentation. Kristina Ashment and I will be presenting BG02, “Client Health in Configuration Manager 2012 – How Microsoft IT is Using It”. Kristina is the ConfigMgr product group program manager currently responsible for client health and she will present the design details. I’ll present the ‘how we use it’ details, and thus practical information on how you’ll benefit from the ConfigMgr 2012 (AKA v.Next) client health feature. Kristina and I have enjoyed working together in various capacities for about 3 years so I’m sure we’ll give you a great presentation!
ConfigMgr 2012 takes client health to a new level and so there’s plenty to talk about. All our research tells us that client health (no matter how you define it) is amongst the hottest topics in the computer management world, so I’m sure the conversation will be important to all of us.
As you might expect, I’ll get into some of the details in this blog but the presentation will be a more cohesive story. We also want to talk with you about client health (or client management generally) throughout the week, so be sure to say Hi wherever you see either of us.
p.s. BTW – I’ve been to MMS before, and spoken there as well. If you happen to be interested in that story, this link will fill you in on the details. Admittedly I’m proud of my contributions but I’m even more pleased to be seeing everyone again and to be able to engage in the conversations about how we can make computer management even better! (ok, I’ll say it: 14!…)
Any “client health” team actually does many other things as well, including client reach, client installation, client movement, and (occasionally) client deinstallation or downgrade. Recently I had to do some client downgrading and movement. You would think that’s easy enough, but as usual (at least in our large environment), there were some tricks. Hopefully the following details will save you some effort when you have to do something similar.
This scenario should be rare – usually you can do many of these operations via simple software distribution. That was my first approach, but for whatever reason it didn’t work (I won’t bore you with the details). So if that fails and you have privileges on the machines, and the number of machines is reasonable, then a more direct approach may be best.
First of all I adapted my main log analysis script to look for online clients that we had privileges on (more details on that in a future post, since it’s a generally useful tool – the core points are to parse a list, do some pinging for online status, and verify privileges). Then the script executes a batch file, which is where things get complex.
You would normally hope to do everything remotely (from your console), but a crucial issue is that not all of the relevant clients will be online at the same time. So you’ll have to run the script multiple times against the same clients (since you don’t necessarily know which were successful the first time). You don’t want to downgrade the clients repeatedly, so you have to check their version to ensure they’re not the lower version (and thus don’t need to be downgraded). How do you check the client version remotely? – via WMI (either in a class or in the registry). But WMI remote access is commonly blocked by the firewall and so that probably won’t work. Thus you have to check client-side. That means two scripts – one to prepare the client and another to do the work, including the version check.
So the first batch file (executed by the script mentioned above), does operations such as:
xcopy <source>\smsv4 \\%computer%\c$\temp\ccmsetup_v4 /s /y
xcopy <source>\smsv5 \\%computer%\c$\temp\ccmsetup_v5 /s /y
copy <source>\RemoveSitecodeRegkeys.reg \\%computer%\c$\temp\ccmsetup_v4
copy <source>\klist_XP2003\klist.exe \\%computer%\c$\temp\ccmsetup_v4
copy <source>\regcheck.vbs \\%computer%\c$\temp\ccmsetup_v4
copy <source>\remove_client_part2.bat \\%computer%\c$\temp\ccmsetup_v4
<source>\PsExec.exe \\%computer% cmd.exe /c c:\temp\ccmsetup_v4\remove_client_part2.bat
Hopefully you don’t need to use GPO-based site assignments, in which case you won’t need the .reg or klist files (e-mail me if you need those details). And I’m quite sure that klist.exe doesn’t work on XP, which is why I include a shutdown command below (a rare option that happened to be available and relevant in this scenario). An important point is that in my case our remote privileges account does not have ‘log on as a service’ (or similar) privileges. All I can do is copy files to the client and run psexec (and we’re glad to be able to do that). So we’re fairly limited in what we can do.
The second batch file (executed on the client), does operations such as:
cscript.exe //B c:\temp\ccmsetup_v4\regcheck.vbs
IF ERRORLEVEL 1 GOTO DONE
regedit /s c:\temp\ccmsetup_v4\RemoveSitecodeRegkeys.reg
c:\temp\ccmsetup_v4\ccmsetup.exe SMSSITECODE=<sitecode> FSP=<FSP FQDN> CCMLOGMAXSIZE=100000 CCMENABLELOGGING=TRUE CCMLOGLEVEL=0 DISABLESITEOPT=TRUE DISABLECACHEOPT=TRUE CCMLOGMAXHISTORY=5 SMSCACHESIZE=10000
rem one more time, just in case (but it only helps temporarily on XP):
regedit /s c:\temp\ccmsetup_v4\RemoveSitecodeRegkeys.reg
rem we have to wait for the install to complete:
rem rmdir /s /q c:\temp\ccmsetup_v4
rmdir /s /q c:\temp\ccmsetup_v5
shutdown /r /t 0 /f
:DONE
And to do that client version checking, you’ll need a simple VBscript (regcheck.vbs) such as:
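Set oShell = CreateObject("WScript.Shell")
version = oShell.RegRead( "HKLM\SOFTWARE\Microsoft\SMS\Mobile Client\ProductVersion")
result = 0
' assumption: exit code 1 means the client is already at the lower version, so the batch file skips the downgrade
if version = "4.00.6487.2000" then result = 1
wscript.echo "exiting with " & result
wscript.quit result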
(The ERRORLEVEL behavior was a little non-intuitive to me but testing proved that this worked and I didn’t have time to argue with it.)
Over time we got all the clients downgraded and moved back to where they should be, and all was good again. But the effort was not quite as trivial as we originally hoped.
One of my pet peeves is reports (of any sort) that are long lists of computers. But the reality is that many people use such reports. I’ve found that people like to take such lists, import them into Excel, and do their own analysis. I must admit that’s cool in that people are analyzing data and getting work done. (My own approach is to build reports that summarize data or to query directly and adjust the queries until I get the information I need).
Once our internal customers have such lists of computers with suspicious client health issues they come to my team and ask us what’s going on with those computers. If the list of computers is reasonably small then you can use a simple “IN” clause, such as:
select count(*) from v_R_System where active0=1 and name0 in ('computer1', 'computer2', 'computer3')
But what if the list is very long, as in thousands of clients or more? In that case you can import the computer names into a temp table and query against that. For example:
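CREATE TABLE #temp1( name varchar(30) )
-- the file path and format options below are placeholders; this assumes one computer name per line
BULK INSERT #temp1 FROM '<path>\<computer_list_file>' WITH (ROWTERMINATOR = '\n')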
The trick is obviously to do a bulk import from the file with details as to how the file is formatted. A simple trick, once you know it. The only complication is that the file must be available to the SQL Server service on the server itself, as opposed to your console.
Then you can do queries such as:
select count(*) from v_R_System where active0=1 and name0 in (select name from #temp1)
So working with lists of computers is possible, no matter how long they are. The one caution I’ll offer is that if the list was produced significantly long before you do the investigation then be sure to go back to the original source. Otherwise too many of the clients will have changed states (for various reasons) and thus you won’t be able to draw any meaningful conclusions.
p.s. If you have shorter lists of computers and they’re also in long lists (as opposed to being comma delimited with single quotes (why don’t people do that?…)), then you might like to use Notepad2.exe or a similar program to easily adjust the lines. In Notepad2 you can select the block (Ctrl-A) and then do a block “Modify Line” followed by a “Join Lines”. That only takes seconds.
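If you don’t have a suitable editor handy, the same quoting-and-joining step can be sketched in Python. The function name is hypothetical; the sketch assumes one computer name per line and skips blank lines.

```python
def to_in_list(text):
    """Turn one-name-per-line text into a quoted, comma-delimited list
    suitable for pasting into a SQL IN (...) clause."""
    names = [line.strip() for line in text.splitlines() if line.strip()]
    # Double any single quotes so each name stays a valid SQL string literal
    return ", ".join("'" + n.replace("'", "''") + "'" for n in names)

print(to_in_list("computer1\ncomputer2\n\ncomputer3\n"))
# prints: 'computer1', 'computer2', 'computer3'
```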
When you’re investigating computer management client health issues you may benefit from knowing the history of the computer. When was it last rebooted? Did it crash at that time? Did someone stop certain services? Retrieving such details from the event logs on the client via WMI is very easy and is well documented – just bing it (by which I mean, of course, that you should use your favorite search engine). But what if the client health issues are probably causing WMI to be broken, or even RPC generally? For example, when you try that method you get error messages such as “The remote server machine does not exist or is unavailable”.
Fortunately PSexec.exe can help (someday I’ll have to investigate what magical protocol enables PSexec). You will need administrator privileges on the remote computers, but assuming that, the following batch file should do the trick:
copy \\<server>\<share>\EventVwr_query.xml \\<client>\c$\windows\temp
<path>\psexec.exe \\<client> cmd.exe /q /c "wevtutil query-events /structuredquery:true /f:Text c:\windows\temp\EventVwr_query.xml > c:\windows\temp\temp.txt"
del \\<client>\c$\windows\temp\EventVwr_query.xml
So it copies an event query file to the client, uses wevtutil to run it (in the system context), removes the query file, and uses the output (which you should also delete when you’re done).
A key question is: how do you create the query file? I couldn’t find any documentation on how to do it manually but it’s easy to do interactively. Just start Event Viewer on any computer as you normally would, right-mouseclick, and create a custom view (which really means to create a query). Specify your options such as which Windows log, the source (such as WMI), event IDs, or other details. Now switch to the XML tab of that dialog box and copy the XML code you will see there. Paste it into the “EventVwr_query.xml” file and you’re ready to go.
p.s. In my case I wrap the above with a vbscript in order to inspect the event history of many clients, and that includes parsing the output file. I look forward to sharing more of those details in future posts.
If you have a complex computer environment (typical of large companies, but true elsewhere I’m sure), then you probably have multiple Active Directory Organizational Units (OUs). If you have multiple ConfigMgr hierarchies, you may intend that certain clients go into certain OUs, each corresponding to the relevant hierarchy (and then apply GPOs to get the clients into the right hierarchies). Or maybe clients end up in the wrong OUs by accident. In such cases client health problems could boil down to confirming that computers are in the right OU. Or if they’re not in any OU of the intended domain then you have another problem, though with similar effect. Thus checking the OU of a computer, or bunch of computers, can often help you to understand why you’re missing expected clients.
So how do you check the OU of the computer(s)? There are plenty of ways, including scripts (amongst my favorites), but sometimes a command line solution is the best bet. It’s quick and easy. In that case, you might create a batch file to run the following command, taking the computer name (or computer name pattern, as here) as a parameter.
ldifde -f computers.ldf -s <domain.company.com> -d "dc=domain,dc=company,dc=com" -r "(&(objectCategory=computer)(cn=<computer_name_pattern>*))" -l cn,ou
You won’t need the “-s” parameter if the “-d” domain is the same as the one that the computer you’re running the command on is joined to. The CN can be a specific computer name or a pattern (with “*” for the wildcard), though the command is much faster on a large domain where you know the first part of the computer name at least.
LDIFDE has plenty of articles on the internet so it’s easy to find examples for similar problems, or the details on how to figure it out for yourself. LDIFDE is available on domain controllers, but you can also install an ‘AD lite’ on any Windows Server 2008 R2 server (and others?) by adding the “Active Directory Lightweight Directory Services” role (which doesn’t make it into a domain controller). Or you can grab the relevant files and use them on Windows 7 (I did that long ago, and thus forget the details).
p.s. Sorry to my Facebook friends who would rather not be spammed on such topics. I’m trying to figure out how to disconnect my blog from Facebook (it got linked long ago).
I hope that those of you evaluating/beta testing ConfigMgr 2012 (previously known as v.Next) are checking out the wonderful client health additions. I’ll get into more details on those soon but for now one issue you may encounter is that when you’re looking at the ccmeval.exe results (in the v_CH_EvalResults view) you’ll find that the “Result” column is numeric. That’s fine but what do those numbers mean? There’s no lookup table (at least not yet), so all you can do is guess.
My research suggests the following values, which you can easily add to your queries as I’ve done in a CASE clause. The final terminology will likely be different but you get the idea (I hope).
select healthCheckDescription, case Result
when 1 then 'TBD'
when 2 then 'n/a'
when 3 then 'test failed' -- and thus fix not tried and/or fix not available
when 4 then 'fix failed' -- and the test must have failed too, in order for the fix to be tried
when 5 then 'n/a - dependent test failed'
when 6 then 'fix worked' -- so the test must have failed
when 7 then 'all tests passed'
else 'unexpected result'
end 'result', count(distinct netbiosname) 'clients' from v_CH_EvalResults
group by healthCheckDescription, Result
order by count(*) desc
One of my more common needs is to analyze log files (which are really just text files) for recurring issues. If lots of clients have the issue, or some clients have the issue a lot, then it's worth pursuing (if it's rare then it's just 'one of those things'). So how do we do such analysis? We could spend a lot of time reading such files, or delegate that work to someone, but the more practical solution is to get a computer to do it - it's actually quite easy.
The following code gives a starting point: open the file, split it into lines, find the lines you're interested in, and then do something with the parts that are useful. The code only covers the core steps. Finding the interesting lines and the interesting parts is left to you (think of the "instr" and "mid" functions especially).
set fso = CreateObject("Scripting.FileSystemObject")
set logfile = fso.opentextfile( filename ) ' filename = path to your log file
content = logfile.readall
logfile.close
log_lines = split( content, vbCRLF )
for j=0 to ubound(log_lines)
    values = split( log_lines(j), "," ) ' it's possible your files are not comma delimited...
    if ubound(values) >= 1 then subroutine values(0), values(1) 'do something with the data ("subroutine" stands in for your own Sub)
next
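To show what "finding the interesting lines" and "doing something with the useful parts" might look like, here is a minimal sketch of the same idea in Python (my choice for the illustration; the post itself uses VBScript). The log layout (client name in the first comma-separated field), the marker text, and the sample data are all made up, so treat them as assumptions and adjust them to your own logs.

```python
from collections import Counter

def count_issues(log_text, marker, delimiter=","):
    """Count, per client, the lines that contain `marker`.

    Assumes (hypothetically) that the first delimited field is the client name.
    """
    counts = Counter()
    for line in log_text.splitlines():
        if marker not in line:      # same role as InStr(line, marker) in VBScript
            continue
        fields = line.split(delimiter)
        if len(fields) < 2:         # skip lines that don't have the expected shape
            continue
        counts[fields[0].strip()] += 1
    return counts

# Made-up sample content standing in for a real log file.
sample_log = (
    "PC001,download failed,0x80070005\n"
    "PC002,policy refreshed,0\n"
    "PC001,download failed,0x80070005\n"
    "PC003,download failed,0x80072ee2\n"
)

issue_counts = count_issues(sample_log, "download failed")
for client, hits in issue_counts.most_common():
    print(client, hits)
```

Sorting by count puts the "some clients have the issue a lot" cases at the top, and the number of distinct clients in the result tells you whether lots of clients are affected.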
ConfigMgr v.Next has a lot of wonderful improvements, and I look forward to talking about my favorites over time. But often the small ones are very delightful, and I’m pleased to share my thoughts on those as well. One of them is that the ConfigMgr v.Next site settings are entirely stored in the database, and thus can be queried. Historically they’ve been stored in the site control file, and thus required manual or tricky file parsing to read. In ConfigMgr 2007, if not earlier, there was a database representation of those values but that took a lot of parsing so that wasn’t easy either. In v.Next they’re only in the database and are largely already parsed for you.
The following query should make them reasonably easy to read if you’re looking for client-specific settings. There are about 174 such settings, so that’s a good start. But if you want other settings then you’ll need to do variations on this query to get them (and I hope to cover them in future blog postings).
select ClientComponentName 'Agent', Flags 'Enabled', Name 'Property',
case Value1 -- Value1 is assumed to hold the registry type
    when 'REG_SZ' then Value2
    when 'REG_DWORD' then cast(Value3 as varchar(20))
    when '' then cast(Value3 as varchar(20))
end 'Value'
from dbo.SC_ClientComponent agents join SC_ClientComponent_Property props on agents.ID=props.ClientComponentID
You’ll see lots of details related to software inventory, hardware inventory, software metering, software updates, etc. Very useful stuff. For example, you can confirm all your sites are consistently configured. You can confirm your predecessor configured things reasonably. Stuff like that.
The trickiest problem you’ll soon notice is that properties like agent schedules are stored as WMI tokens, which mean a lot to WMI but not so much to you and me. I don’t know of a SQL mechanism to translate them, so that’s when I revert to VBScript. The following scriptlet gives you an idea of how to do that. Just substitute the relevant values.
Set loc = CreateObject("WbemScripting.SWbemLocator")
Set WbemServices = loc.ConnectServer(server, "root\sms\site_" & sitecode)
Set clsScheduleMethods = WbemServices.Get("SMS_ScheduleMethods")
Interval = "0001200000100018" 'insert your token here
clsScheduleMethods.ReadFromString Interval, avTokens
For each vToken In avTokens
    WScript.Echo vToken.GetObjectText_ ' print the decoded schedule properties
Next
Some coworkers of mine (Partha Chandran, Chandra Kothandaraman, and Jitendra Kalyankar) recently released a whitepaper on power management which we hope you will find useful. In it they include the following report which I think is especially informative:
A bit of trivia is that during the last couple of years one of my biggest projects was a power management solution evaluation. I didn’t contribute to the ConfigMgr R3 power management solution but I did help to look for solutions that would help us to save power dollars and CO2 like any other company. Early on I came up with the idea of graphing the power consumption data over the average day, as shown in this report. I hadn’t seen that done by anyone before, so I may well have ‘invented’ that idea. If so it is one of my favorite contributions to the computer management field.
My coworkers explain some of the benefits of this report in the whitepaper but there are a few points I think are worth making:
- if maximizing power savings were your only goal then the ideal scenario would be for the computer and monitor lines to be flat along the X axis (i.e. no power consumption). Of course that’s not realistic because computers do provide considerable value when used properly, so the trick is to find the right curve.
- most computers are not shared amongst shift workers so the computer power consumption should reflect people’s work patterns. If they work 8 hours a day, 5 days per week, then the curve should reflect that if you’re only looking at workdays
- most computers are used by users, as opposed to being used by ‘service’ programs such as server applications, test software, ‘build’ software or other uses. Therefore when the user is not present the computer should be ‘off’.
- users almost always use their computers via the monitor, so if the monitor is on then it’s reasonable for the computer to be on. If the monitor is off then there’s rarely reason for the computer to be on.
- users generally work 9 to 5 (more or less) so late night hours (or weekends) should mean the monitor is off and thus the computer is off.
So in an ideal world:
- both the monitor consumption and computer consumption would be almost flat along the X axis except from 9AM to 5PM (or whatever hours your workers work, and assuming you’re using local time)
- any space between the monitor consumption and the computer consumption lines is a wasted opportunity for savings (i.e. the user is not using the computer and yet it’s on)
- exceptions can be made for a middle-of-the-night maintenance window and for an early morning get-the-computer-ready-for-the-user early power-up
- exceptions can also be made because complex computers (as opposed to simple devices) need some time to get everything up to speed, so it’s reasonable for the computer to stay powered up during work hours, when users are often away from their computers for short periods. The lunch hour may be the only time during the work day when power consumption could commonly go down
- latency between the monitor line going down and the computer line going down reflects your power management policies. During the work day it’s reasonable to have a large latency because users come and go to meetings, lunch, hallway conversations, etc., but after hours that’s less likely
- exceptions can also be made for special computers that are used for automated testing, various server functions, remote access by power users, etc., so those two lines in reality won’t quite converge and are very unlikely to ever get quite to the X axis.
From the above report we can conclude:
- power management did save money in that both lines generally did move closer to the X axis and also moved closer to each other
- there’s a lot of opportunity for further power savings but we have to remember that this is Microsoft which is by its nature a company of power users who often use their computers more like servers and really do access them remotely after hours. How much more savings can be made is difficult to judge in this case.
Even if you don’t impose power management on your users, I believe that doing such a report on your machines will be quite informative.
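If you want to try this on your own data, the "average day" curve and the gap between the two lines can be sketched along these lines. This is only an illustration in Python under my own assumptions: the sample format (hour of day, percent of computers on, percent of monitors on) and the numbers are made up, and your power management solution will have its own way of exposing data-by-hour details.

```python
from collections import defaultdict

def average_by_hour(samples):
    """Average the samples for each hour of the day.

    samples: iterable of (hour_of_day, computers_on_pct, monitors_on_pct),
    a hypothetical format with one tuple per hour per day.
    Returns {hour: (avg_computers_on_pct, avg_monitors_on_pct)}.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for hour, computers_on, monitors_on in samples:
        entry = totals[hour]
        entry[0] += computers_on
        entry[1] += monitors_on
        entry[2] += 1
    return {hour: (c / n, m / n) for hour, (c, m, n) in sorted(totals.items())}

def gap_by_hour(averages):
    """Space between the computer line and the monitor line for each hour."""
    return {hour: computers - monitors
            for hour, (computers, monitors) in averages.items()}

# Made-up samples: two days, one night-time hour and one work hour.
samples = [
    (3, 40, 5), (3, 30, 5),        # 3 AM on day 1 and day 2
    (10, 90, 80), (10, 94, 86),    # 10 AM on day 1 and day 2
]

averages = average_by_hour(samples)
gaps = gap_by_hour(averages)
for hour in averages:
    print(hour, averages[hour], gaps[hour])
```

In this made-up data the night-time gap is much larger than the daytime one, which is the kind of space between the lines that points to computers being on with nobody at the monitor.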
p.s. Note that this post applies to any power management solution that provides data-by-hour details, directly or otherwise.