The routing group quorum is a particularly interesting one because it's what decides whether a user will see the “Limited Functionality” message on their Skype client when a pool failover is NOT taking place. Take an example pool of 12 Front-End servers. What makes this interesting is that a user's routing group could exist on any three of the twelve Front-End servers.
F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | F11 | F12
I’ve used color-coded numbers in the diagram above to show a 12-server Front-End pool and how different user accounts can belong to routing groups that sit on different servers. For the sake of the conversation, imagine that the routing group where my Skype user account resides sits on servers F3, F6, and F8, which are the servers colored black.
If I were to reboot nodes F1 and F2 at the same time, it wouldn’t have any effect on my Skype account. If I were to reboot nodes F1, F2, and F3 at the same time, my routing group still wouldn’t lose quorum (remember, my routing group is marked by the black servers): only F3 goes down, so two of its three replicas, F6 and F8, stay up. In addition, rebooting servers F1-F3 wouldn’t have any effect on the pool quorum either, because 9 of the 12 nodes would still be available, which is more than half.
Rebooting nodes F1, F2, F3, and F4 at the same time would be a risky move. I wouldn’t be affecting the pool quorum, because a majority of the nodes (8 out of 12) would still be up and running. However, by doing this I run the risk of losing routing group quorum for any Skype accounts whose routing group sits on two or more of the servers being rebooted at the same time.
The routing groups I could lose here would be the Red routing groups, because they sit on nodes F1 and F4, which means the only replica the Red routing groups would still have standing is on node F9. With the loss of 2 out of 3 replicas, a routing group goes into “Limited Functionality” mode and the client is in a degraded state.
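To make the arithmetic in the scenario above concrete, here is a minimal Python sketch. It is only a model of the reasoning in this post, not how Skype for Business actually evaluates quorum: it assumes a simple "more than half" rule for both the pool and each routing group, and the names `has_majority`, `assess_reboot`, and `ROUTING_GROUPS` are hypothetical, made up for illustration. The Black and Red placements are the ones from the example.

```python
# Illustrative model only: assumes a simple "more than half" majority rule,
# as described in the scenario above. All names here are hypothetical.

POOL = {f"F{i}" for i in range(1, 13)}  # the 12 Front-End servers

# Replica placement taken from the example in the text.
ROUTING_GROUPS = {
    "black": {"F3", "F6", "F8"},
    "red": {"F1", "F4", "F9"},
}


def has_majority(up: int, total: int) -> bool:
    """Return True when strictly more than half of the members are up."""
    return 2 * up > total


def assess_reboot(rebooted):
    """Report pool quorum and per-routing-group quorum for a set of rebooted servers."""
    down = set(rebooted) & POOL
    pool_ok = has_majority(len(POOL - down), len(POOL))
    groups_ok = {
        name: has_majority(len(replicas - down), len(replicas))
        for name, replicas in ROUTING_GROUPS.items()
    }
    return pool_ok, groups_ok


if __name__ == "__main__":
    for servers in (["F1", "F2", "F3"], ["F1", "F2", "F3", "F4"]):
        pool_ok, groups_ok = assess_reboot(servers)
        print(servers, "pool quorum:", pool_ok, "routing groups:", groups_ok)
```

Under these assumptions, rebooting F1-F3 keeps both routing groups at two of three replicas, while rebooting F1-F4 leaves the pool with quorum but drops the Red routing group to a single replica on F9.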
Lesson learned from this: We should now understand why Microsoft’s suggested approach is to reboot Skype Front-End servers one at a time. Doing so avoids any potential loss of functionality for Skype users, even when the reboot wouldn’t impact the pool quorum itself.