February 2007 - Posts

Summary: one frustrating element of working in a high-tech field with lots of smart people is that it's easier to come up with ideas than it is to implement them. Besides which, you're already busy with planned projects, typical escalations, your own priorities, etc. How do you evaluate ideas to ensure the good ones are pursued when they should be?

As techies we're naturally going to be inclined to think in terms of the technical merits - we have an issue that boils down to some technical nastiness, and we can see technical solutions. As keen problem solvers, we may want to jump into implementing that technical solution right away. But should we?

The solution may not be the best, for a variety of reasons. It might cause more grief than it solves, in the short term or the long term. It might divert efforts from even more important issues. It might be more expensive than the problem it's solving.

But at the same time you don't want to miss out on the good ideas, or ideas that will be good when the time is right. So how can you differentiate between the good ideas and the not so good, or the ideas that should be pursued now vs. the ideas that should be pursued later? I like the following form as a way to capture the ideas and quickly weigh their merits. Other team members can review the form to confirm the thinking. Would you add sections to it for consideration? Are any inappropriate?

Of course you don't want to discourage creativity by introducing red tape, so the intention is that the details in each section should be limited to 100 words. This is not meant to be a requirements document, a design document, or a similar engineering document. It's just a quick capture and analysis of the idea.

Title:

Problem to be Solved:

Business Impact of Problem:

Urgency of Solution:

General Design of Solution:

Detailed Design of Solution:

Side-Effects of Solution (positive and negative, and how they’re mitigated):

How do we know the solution will solve the problem:

Risks:

Is the solution consistent with general team principles, policies and goals (if so or if not, which ones):

Cost of Solution (design, build, test, implementation, documentation, ongoing use, ongoing maintenance, etc.):

Process for implementation of solution:

User and Internal Customer Impact (and need for user communications, helpdesk communications, etc. if relevant):

Administrator, Process, and Tools Impact:

Which other projects will be delayed while relevant people work on this solution:

Alternative Solutions Considered (and why they were rejected):

Summary of why the solution must be put in place now:
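If you keep completed forms in a script-friendly shape, the 100-word limit is easy to check automatically. The following is only a minimal sketch: the function name, the dictionary layout, and the sample idea are all made up for illustration, and only the section titles and the 100-word figure come from the form above.

```python
# Hypothetical helper: section titles come from the form above, but the
# function name and the dict-based layout are assumptions for illustration.
WORD_LIMIT = 100  # per-section limit suggested in the post


def overlong_sections(form: dict[str, str], limit: int = WORD_LIMIT) -> dict[str, int]:
    """Return {section title: word count} for every section over the limit."""
    counts = {title: len(text.split()) for title, text in form.items()}
    return {title: n for title, n in counts.items() if n > limit}


# Made-up sample idea; the "Risks" entry is deliberately too long.
idea = {
    "Title": "Automate stale client cleanup",
    "Problem to be Solved": "Inactive clients skew our reports.",
    "Risks": "word " * 120,
}

print(overlong_sections(idea))
```

Words are counted by splitting on whitespace, which is crude but good enough for a quick sanity check before the form goes to the rest of the team.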

Posted by pthomsen | 1 comment(s)

Summary: we're not all the same as each other. That's obvious, but when it comes to computer management issues, I see 22 categories of differences. So we have to be careful in choosing and deploying computer management solutions.

One reason I've dedicated most of my career to computer management is because of its complexity. No matter how long I do it, I learn something new every day, and have a variety of exciting challenges at all times. My employers value my skills enough to pay me a decent wage. That's great for me, but not so good for my employers or other customers who want a simple solution that 'just works'. After all, most organizations have a real business to run, and computer management is not it. So a 'silver bullet' that keeps the costs very low and gives them the solutions they need when they need them would be ideal.

I sympathize with their goal, but reality is what it is. I've seen a lot of 'silver bullets' come and go over the years. And there are a lot of products available in the computer management business, each of which promises to be the ultimate solution. Some of my managers, or other techies I've talked to, have advocated various ultimate solutions of their own. Supposedly, all we have to do is document the various challenges we might face and their solutions, and then we're all set (which probably means outsourcing or getting a junior guy to do it). Or we write a bunch of wonderful scripts or web pages that provide everything a user could need and cost us next to nothing. Or we just tell the users that they can't have everything they want - they'll take what we give them, from a limited list of solutions we can deliver at low cost.

There are various ways to articulate the challenges of computer management (yet more blogs to come), but I like the following approach. The idea is that we each differ from one another in various ways. I count 22 ways (dimensions, if you will). None are binary, and few have a universal 'the left is best and the right is wrong' scale. All points on each dimension could be the 'right' position for any organization. Sometimes an organization will make the wrong choices for its own benefit, and sometimes not, but the computer management solution provider cannot know which is which. So all solutions in each dimension are 'acceptable'. If you agree that we have 22 dimensions, and maybe there are 10 significant points on each dimension, then that's 10 to the 22nd power, which is 10,000,000,000,000,000,000,000 valid variants (someone correct my math, please, if I'm wrong). Now that's a challenge.
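For anyone who wants to check that arithmetic, here is a minimal Python sketch. It assumes, as the estimate above does, that every dimension has exactly 10 significant points and that the dimensions can be combined independently. The constant and function names are made up for illustration.

```python
# Figures from the estimate above: 22 dimensions, with an assumed
# 10 significant points on each. All names here are hypothetical.
DIMENSIONS = 22
POINTS_PER_DIMENSION = 10


def count_variants(dimensions: int, points: int) -> int:
    """Count the combinations when each dimension is chosen independently."""
    return points ** dimensions


variants = count_variants(DIMENSIONS, POINTS_PER_DIMENSION)
print(f"{variants:,}")  # prints 10,000,000,000,000,000,000,000
```

So the figure holds under those assumptions: a 1 followed by 22 zeros. In practice some combinations are far more common than others, so treat it as an upper bound on the variety rather than a count of real environments.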

This is my list of the 'dimensions':

  1. size by client - 100 clients to 300,000 clients
  2. size by number of locations (sites) - 1 site to 18,0000
  3. size by size of individual sites - 2 clients to 10,000 clients at an individual site
  4. 'stability' of clients - all users at designated, fixed machines to many users swapping many machines and often on the road at many locations (possibly connecting by various means)
  5. features used - specific features (such as inventory collection, software distribution, patch management, remote control, metering, DCM, NAP, etc.) to all features or even added-on features (such as asset mgmt)
  6. sophistication of feature use - out of the box defaults are fine to highly customized inventories, extensive reporting, numerous and fancy packages, etc.
  7. network complexity - no links or all good links, to a mish-mash of link types with an emphasis on bad links
  8. techie organization - one group, centralized and working well together, to many units, decentralized and centralized, not working well together
  9. business organization (users, as opposed to end-users) - central head office only to many business units with widely varying goals, philosophies and needs
  10. security - out of the box, generic to highly locked down with third party tools and complete paranoia
  11. domain models and similar infrastructure complexity - one domain to multi-master domain model, or multi-domain model
  12. techie sophistication - barely know what they're doing to gurus with programming skills ready to change or fix anything
  13. sensitivity to problems - go with the flow to it must work first time, every time, forever
  14. history - starting clean to heavily invested in SMS 2003
  15. vendors - pure Microsoft shop to mix of Microsoft, UNIX, NetWare, etc.
  16. hardware - all one kind of high-end PC and servers to a mishmash of low-end to high-end PCs from various vendors plus Macs
  17. OSes - all Windows XP SP2 to a sprinkling of everything ever offered
  18. budget - money is no object to wanting to do it for free (or get the vendor to pay for it)
  19. management wisdom - total wisdom to hasn't got a clue
  20. time frame - take the time it needs to 'you've got until Friday'
  21. experience doing projects - have set up 5 PCs in one day to have deployed a variety of different kinds of systems at a variety of different kinds of organizations
  22. risk tolerance - naively optimistic to completely paranoid

The computer management challenges also cut in the other direction. Developing a computer management solution is a heck of a challenge in itself. It has to keep all the complexities in mind. A lot of history, in a lot of customer environments, is going to be an advantage. Having staff who have seen the challenges will increase the odds that the most significant ones are addressed. The most common combinations of the dimensions will have been hit, and if the issues were serious enough then the fixes will be in place. If you also happen to be a large, complex organization and thus can test a broad spectrum of the combinations in your own home environment, you will have yet another advantage.

I happen to have a favorite product that I feel meets those criteria, and that's why I choose to work where I do. I like to think history and the market support my point of view. But it's an ongoing debate. In the end the market will prove who is right.

p.s. I originally wrote this article back in 1999. So it's been developing for a while. But if you disagree, or want to add dimensions, I would be very pleased to hear them. We need more debate in this blog (Garth is doing a great job on that point, but I want to hear from you as well).
Posted by pthomsen | 1 comment(s)

Summary: I got confirmation (well, close enough - it's posted at mms2007.com) that I'll be speaking at MMS again this year.

You can call it pathetic, but the highlight of my year for the last 10 years has been the SMS User's Conference / Microsoft Management Summit. Yes, this will be the 10th anniversary of the conference, and thus the 10th time I have attended and (most humbling) the 10th one I will have been a speaker at. I have the bags to prove it. Life is good...

SY22 System Center Configuration Manager 2007: Microsoft IT’s Experiences
Track(s): Systems Management
Session Type(s): Breakout
Product(s): Configuration Manager 2007
Paul Thomsen describes how Microsoft IT planned for Beta 2 of SCCM 2007 and then upgraded their largest production sites to manage 150,000 clients with SCCM 2007 Beta 2. The deployment covered core services (SUM, SWD, Inventory, etc.) plus NAP, DCM, OSD, Internet-based scenarios, and even some device management scenarios. Learn why Microsoft IT changed their infrastructure and processes, the challenges they faced, technical tricks they learned, cool insights, business benefits and much more.

I think you'll find the session interesting and useful for your SCCM 2007 planning. It may be a little less technical than my usual topics around scripting, hardware inventory extensions, etc. but at a comparable level to my sessions on patch management, best practices, etc. The timing is such that it will be a lot more current than my previous sessions. There will be Q&A, but feel free to talk anytime throughout the week about SMS - it's my favorite subject!

What's so great about MMS? Well, I should clarify that I do value family events, activities with friends, vacations, religious events, national events, etc. But MMS has a lot of that and more - it's a family and friends event of sorts (we are a community), it's a break from the norm in a fun city (so a vacation, kind of), and it brings people from around the world (it's supra-national). OK, it's a little hard to tie in a religious element, but we do get a chance to remember the wondrous technologies that make computer management possible.

More to the point, the conference is a wonderful chance to chat with peers and associates, share ideas, learn new stuff that can be applied to our everyday work lives, find solutions to tricky problems, and meet new people. Mix in some good food and beverages, in some wonderful environments, and you're bound to have a good time. And let's not forget the swag!

In any case, I look forward to seeing you there. I haven't confirmed hotel bookings yet, but if it works the same as last year, I'll be at the San Diego Marriott Hotel & Marina, Sunday to Friday. At the base of what I think is their north tower they have a nice lounge. Between 8PM and 10PM (depending on other events) a bunch of us would usually meet up and relax each day. That was a great chance to chat without the risk (or excitement) of making a run for the border. And they have a very nice IPA... I'll see you there if not elsewhere.

Summary: Microsoft IT beta tests new versions of Microsoft software to help make them customer-worthy. That's what we call "dogfooding". What does that mean specifically...?

I've been in Microsoft IT for over 4 years and have been through a bunch of dogfood cycles: some major releases (SMS 2003, SCCM 2007) and many less major ones (SP1, SP2, R2, ITMUv2, ITMUv3, DMFP, etc.). They're not all the same, but there are some general truths that I can share. I won't give any details about the current cycle - it's a very fluid story. The observations from our current dogfooding efforts that have lasting value will be blog topics in future months (when they're not 'secrets'). And my observations may not be true for the dogfooding done by other MSIT teams (such as Exchange or AD). SMS is central to dogfooding other products, such as Windows and Office, but I don't personally get involved in those, so my observations may not apply to them either.

The dogfooding cycle basically consists of the following:

  • the product team tells us what's coming, about 2 years before the action happens. This is the most 'secret' stuff, but is also the most fluid.
  • we tell the product team what we need. So do other customers. And the product team and senior management have their own ideas.
  • we learn the details of new features from specifications, e-mails, conversations, and similar sources. The details often change over time.
  • we and the product team settle on "shared goals" and "exit criteria". Basically we get into the specifics in a long series of boring meetings.
  • we write requirements documents, design specifications, test plans, and similar documentation.
  • we fine-tune the plan
  • the testing begins, including production deployment to servers and clients (thousands to hundreds of thousands of clients). Much bug filing ensues.
  • we adjust the plan according to reality, but only a little, because the product must ship
  • the product goes to "TAP" customers, who do their own form of dogfooding. But they expect a reasonable degree of quality, and the MSIT experience proves it's there.
  • the product may go to a larger group of beta testers, on varying scales. This is for lab-only work, but the quality must be higher still because a lot of first impressions are being formed.
  • repeat the cycle of plan fine-tuning, testing, adjustment, and beta testing as we hit each milestone - beta 1, beta 2, release candidate 1, release candidate 2 (sometimes), release to manufacturing (RTM)

This is all done while we do normal production work - the show must go on, after all. Microsoft IT is a real production shop, with real users, SLAs, processes, etc. The production work for our team includes escalations from our Infrastructure and Services SMS admins, typical projects necessary to adapt to changing realities, and various 'cool' ideas from misc. sources (often management). It's rather like planning to put on an opera while you're performing a symphony, during the symphony itself. 

My experience is that the phase that really matters is the "testing". That's when the rubber meets the road. In a major cycle, a lot of us work 14-hour days, 6 days per week, for 2 or 3 weeks in a row during this period. And the following weeks are only a little better. Everyone is very interested in all the activities, so the attention is intense. Slippage on the plan is not really tolerated. All your technical and organizational abilities are tested as you try to rapidly investigate MANY tricky issues. Your family must be tolerant, and your manager must be adept at delivering tricky news. Your diplomatic skills are applied as you debate with the product team as to why certain issues are important, or with your admin peers as to why other issues are not important. In the meantime you're trying to learn how complex new features really work in the real world, but with only limited tidbits of information. The team is strained. Stress is a given. Occasionally you get pizza and/or beer.

Of course you might expect that the Microsoft products are perfect as soon as our brilliant developers write them. Or that any issues are worked out by the product team's serious and diligent professional testers. And largely that is true - they do a wonderful job on both fronts, and I have the utmost respect for them. But the reality is that the range of production environments for computer management products is hugely complex (yet another blog to come). The product team testers have a wonderful arsenal of hardware, software, people, time, and professional skills to apply to testing, but there are lots of scenarios they can't test in the labs. You would test the product in your environment, and probably do, but do you want to find serious issues? So that's where Microsoft IT comes in. We catch most of the big nasty issues, and innumerable small ones, before you get a chance. Admittedly we won't catch them all, but hopefully whatever gets through is tolerable and can be readily corrected.

I should also mention that during the testing phase of dogfooding, the product team is very much in the trenches with us. MSIT finds the issues, but they wrestle them to the ground and fix them.

Does anyone remember SMS 2.0? Enough said. 
Admittedly in those days Microsoft IT didn't do a lot of SMS, but in those days Microsoft also saw IT as a 'production operation', just as you do. A fearless leader, named Rick Devenuti, came along a few years ago as our CIO. Somehow he had the brilliant idea that we would both run a world-class production operation and run dogfood at the same time. Counterintuitive, to say the least, but it worked. It's not surprising that we haven't always dogfooded, but I'm sure glad we do now.

What's the scale? For us 10,000 clients is small, and we'll do all possible clients (say 250,000) in the release candidate stages. Beta 2 could be something like 100,000 or 150,000. ALL production functions are tested over the test cycle, and the magnitude will increase with each phase. So we'll do some software deployments, patching, inventory, etc. in early parts of the dogfooding, but we'll do ALL production services for a couple of months toward the end. In the early stages we may hive off part of the infrastructure for dogfooding, but in the end we'll bring it all back together again to have just a few (or one) huge hierarchy. All normal production monitoring is done throughout, including MOM monitoring, Infrastructure Team monitoring, Services monitoring (including Release Management), and Engineering testing and monitoring. There are many ways to find issues, and we apply them all. [and yes, when I say 250,000 clients, I mean 250,000 30-day cycle clients - see an earlier blog on that topic]

So why do we do it? Well, there are the obvious answers about product quality, market share, stockholder value, and so forth. But those of us in the trenches see those as abstract concepts. We really do it for you, our customers, peers, and friends. We all look at ourselves frequently and say 'man, it's good that bug didn't get out the door'. That's not to say that we're any more noble than anyone else - we all make our contributions where we can. But I hope you sleep better at night knowing that somewhere someone has eaten a lot of dogfood.

Posted by pthomsen | 3 comment(s)