A look at some computer management reports produced by scripts
Summary: In recent posts we've examined some techniques for automatically generating and delivering highly customized reports. Here are some examples (especially the graphs).
The techniques I posted enable you to automatically produce highly formatted, analytical computer management reports, complete with graphs and e-mailing to your internal customers. But before you invest the effort into producing them you'll want to visualize what they could look like, and how that benefits you. This posting gives a variety of examples. (Some time ago I blogged a couple of graphs, and occasionally I've shown some in various presentations, so they may look familiar.)
This is a very typical report format example:
The "active" (left) graph is ConfigMgr client activity as seen at the central site on one of our hierarchies after a week. The "healthy" (right) graph is after two weeks. You'll notice the sudden dip in mid-November. Clearly we had a serious infrastructure problem at that time. These reports often help to alert us to such issues, so that if our server monitoring doesn't catch the event, we can quickly go looking for it (and in this case we can see our Infrastructure Team fixed it within a day or so). There's a smaller dip at the end of December, which would be the holiday period. Because these reports show activity after a week or two, the dips don't become obvious right away (unless the issue is especially bad). You'll also notice that the table gives the numbers for the day the report was produced, and even highlights less-than-optimal numbers by giving them a yellow background.
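The yellow highlighting in the table is easy to picture in code. Here's a minimal sketch in Python (not necessarily what my own reports use); the 80% threshold, the function names, and the sample numbers are all assumptions made up for illustration:

```python
from html import escape

# Hypothetical threshold: percentages below this count as "less than optimal".
OPTIMAL_THRESHOLD = 80.0

def render_cell(value, threshold=OPTIMAL_THRESHOLD):
    """Return an HTML table cell, with a yellow background when value is below threshold."""
    style = ' style="background-color: yellow"' if value < threshold else ""
    return f"<td{style}>{value:.1f}%</td>"

def render_table(rows, threshold=OPTIMAL_THRESHOLD):
    """rows is a list of (label, percentage) pairs; returns an HTML table as a string."""
    lines = ["<table>"]
    for label, pct in rows:
        lines.append(f"<tr><td>{escape(label)}</td>{render_cell(pct, threshold)}</tr>")
    lines.append("</table>")
    return "\n".join(lines)

if __name__ == "__main__":
    # Made-up numbers, not taken from any real report.
    print(render_table([("Active (1 week)", 86.2), ("Healthy (2 weeks)", 74.9)]))
```

The resulting HTML fragment can be dropped into the body of a report e-mail; only the cells that fall below the threshold get the highlight.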
The following snippet is for another hierarchy which happens to be smaller and more stable, as is typical for most people's "production" hierarchies. The swings in activity are less dramatic. The holiday dip is the only real anomaly.

This is a graph for a particular site:

This site obviously had a serious problem in late November. It also had the holiday dip in late December. Since then the software inventory and advertisement activity have been very consistent with each other and at a respectable 80% level (in one week). The heartbeat (green) activity is much higher (90%), but heartbeats are sent much more frequently, so computers that are online for even a brief period will report heartbeat where they might not report the other activity. Heartbeat is also simpler, so clients that are 'partially healthy' might report heartbeat where they wouldn't report other activity.
The most important point with that graph is the descending blue line - there's no good reason to explain that descent. So obviously something's wrong with hardware inventory at this site. Someone should be investigating it.
From those graphs we're obviously seeing a trend that no matter which site or hierarchy we look at, there's a dip in late December. And yet these are 'percentage' reports, rather than absolute numbers. So you might wonder why the percentages would dip in December - sure, there's less activity then, but there should also be fewer computers online. So the activity as a percentage of online computers should be about the same. But these reports don't account for offline computers - they just report on the basis of the total ConfigMgr (SMS) population. That's why I call these "activity" reports - it's the total activity for the client population. We'd like to distinguish online vs. offline, and offline vs. broken, but those are difficult challenges in themselves. It's much easier to just report on the whole population. I've talked about those issues a bit in previous posts, and will get into them more in future posts.
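To make the arithmetic concrete, here's a small Python sketch comparing the two denominators. All of the numbers and names are hypothetical, chosen only to show why a percentage of the total population dips over a holiday even when the online clients behave exactly as usual:

```python
def activity_pct(active, population):
    """Active clients as a percentage of the given population."""
    if population <= 0:
        return 0.0
    return 100.0 * active / population

# Hypothetical numbers, for illustration only.
TOTAL_CLIENTS = 10_000  # every client known to the hierarchy

PERIODS = {
    # name: (clients online, clients reporting activity)
    "normal week": (9_000, 8_100),
    "holiday week": (6_000, 5_400),
}

def summarize(periods, total):
    """Return {name: (pct of total population, pct of online clients)}."""
    return {
        name: (activity_pct(active, total), activity_pct(active, online))
        for name, (online, active) in periods.items()
    }

if __name__ == "__main__":
    for name, (of_total, of_online) in summarize(PERIODS, TOTAL_CLIENTS).items():
        print(f"{name}: {of_total:.0f}% of total, {of_online:.0f}% of online")
```

With these made-up numbers the share of online clients is 90% in both weeks, while the share of the total population falls from 81% to 54%. The second figure is the kind of number these reports show, because the online count is the hard part to get.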
Client health reports are not always about 'client activity'. Here's one for 'client reach' (how many clients are in the hierarchy):

We can see that in early November there was a nice (and gradual) increase in the client count. That reflects a new effort to increase the client reach (a change of boundary strategy, in particular). Through December the numbers declined a little bit, probably reflecting that users were decommissioning computers faster than they were buying new ones - people generally don't start projects when holidays are imminent, and they do clean up their environment. In mid-January there was a sudden drop of about 20,000 clients - this corresponds with a new 'dogfooding' (beta) project where we needed to separate a site from the hierarchy. So that's an intentional drop.
The above reports have largely looked at activity for hierarchies and sites over the previous week or two for each day. What if we look at it one day at a time? Here's a datacenter site where the heartbeats are set to every 2 days, on a fixed (as opposed to relative) schedule:

There seem to be a couple of days where the heartbeat activity dipped dramatically, but then recovered. That might be a reporting problem, as opposed to an infrastructure problem - that's one downside of complex, script-based reporting: subtle anomalies can expose bugs in the scripts, causing a failure to record data (until the bug is fixed or the anomaly passes).
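One cheap way to tell a reporting gap from a real dip is to check whether the reporting script recorded anything at all for the days in question. This is only a sketch, assuming the daily results are kept as a list of dates; the function name and the sample dates are hypothetical:

```python
from datetime import date, timedelta

def missing_days(recorded, start, end):
    """Return the dates in [start, end] for which no data point was recorded."""
    have = set(recorded)
    gaps = []
    day = start
    while day <= end:
        if day not in have:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps

if __name__ == "__main__":
    # Made-up dates: the script recorded nothing on the 3rd.
    recorded = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 4)]
    for gap in missing_days(recorded, date(2024, 1, 1), date(2024, 1, 4)):
        print("no data recorded for", gap.isoformat())
```

If a suspicious day shows up in that list, the dip is more likely the script's fault than the infrastructure's.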
Here's a day-by-day report for a desktop (non-datacenter) site with more typical agent settings:

That's a good example of why it's generally not a good idea to look too closely at client health or client activity - there are too many anomalies, making it difficult to extract meaning from the reports. But we have started reporting activity at the daily level over the last few months, and it has proved useful in some cases. That's especially true when a serious problem occurs and some activity drops to zero, and then we fix it and want to verify it goes back to normal (at least above zero).
For contrast, here's a datacenter site, looking at 2-week activity. It's hard to get closer to 100% success than that. Of course datacenter servers don't move around, are powered up all the time, aren't rebuilt frequently, etc. But we can dream of 'desktop' client health looking like this:

From the above discussion you might say that these are actually reports of site and hierarchy health, rather than client health. There's a lot of truth to that, but the client health story is still contained in these reports. Server-side problems affect all your clients all at once, and so they tend to be dramatic, but they're also relatively easy to fix (fix the configuration, disk, or whatever, and you're done). Environmental and client-side problems take longer to fix because there are many more moving parts, and your options for making the repair are more limited.
The important point is that by establishing trends over 'normal' periods, you can tell when client-side or environmental issues get worse (or better). If your clients generally show 85% activity over 1 week and you start seeing 80%, then a change in strategy may be needed. If you go to 90%, then you must be doing something right. And if you've gotten 85% activity for the last year, in a well managed shop, then you can confidently tell management that 100% is never going to happen (unless they chain the users to their desks).
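That comparison against a 'normal' baseline can be sketched in a few lines of Python. The tolerance of 3 percentage points, the function name, and the sample history are assumptions for illustration; what counts as a meaningful change is a judgment call for your own environment:

```python
from statistics import mean

def compare_to_baseline(history, current, tolerance=3.0):
    """Compare the current activity percentage with the average of a 'normal' period.

    history   -- percentages observed during the normal period
    current   -- the latest percentage
    tolerance -- hypothetical threshold, in percentage points, before a change counts
    Returns (baseline, delta, status), where status is 'worse', 'better' or 'normal'.
    """
    baseline = mean(history)
    delta = current - baseline
    if delta <= -tolerance:
        status = "worse"
    elif delta >= tolerance:
        status = "better"
    else:
        status = "normal"
    return baseline, delta, status

if __name__ == "__main__":
    normal_period = [85.0, 84.0, 86.0, 85.0]  # made-up weekly activity percentages
    for latest in (80.0, 85.5, 90.0):
        baseline, delta, status = compare_to_baseline(normal_period, latest)
        print(f"latest {latest:.1f}% vs baseline {baseline:.1f}%: {delta:+.1f} points ({status})")
```

A plain average is the simplest possible baseline; it ignores seasonal effects like the holiday dip, so a known-quiet period should be kept out of the history.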
So there are some examples of client health reporting using 'complex' reporting. I've got others that are in 'prototype' stages. And there are other subtleties to the reports above. But you get the idea. What's most important is that you think about your reporting needs and consider applying the techniques I've detailed to make your computer management even more successful.
I've focused on client health here, but the same techniques can easily be used for patch management compliance reports, software distribution reports, OSD reports, DCM reports, etc. The queries will be different, but the technical implementation will be almost identical. More importantly, the business benefit (to your internal customers) will be comparable as well. So it is worth your time to invest a little effort in these techniques.