Lync Server 2013 introduced the need for a minimum of three Front End (FE) servers to make a pool and routing group quorum. Over the course of versions from Lync 2013 to Skype 2015 the story appears to have changed, and I am not quite sure why. I have heard talk of needing an odd number of servers in a Lync 2013 or Skype environment, which is not true. The need for three FE servers comes from how Windows Fabric works and what it takes to make quorum; it does not mean the pool has to stay at an odd number of FE servers. Microsoft has stated that the maximum number of FE servers in a pool is 12, which is not an odd number, and we can have anywhere from 3 to 12 FE servers in a pool and still make pool or routing group quorum.
The only difference with an even number of FE servers in a pool (4, 6, 8, 10, or 12) is that the SQL back-end server comes into play as the deciding factor for “pool” quorum decisions. If you have an odd number of servers in the pool (3, 5, 7, 9, or 11), SQL does not come into play as a voter.
What makes a Pool Quorum?
Being the tiebreaker or decision maker for an even number of FE servers in a pool is not a difficult job for SQL Server; its main function in that scenario is to be ready if called upon. The best way to explain this is to understand what makes quorum. The chart below shows, from the pool's point of view, how many FE servers are needed to make quorum.
|Total number of Front End Servers in the pool|Number of servers that must be running for pool to be functional|
|---|---|
|8-9|Any 4 of the first 7 servers|
|10-12|Any 5 of the first 9 servers|
What makes a Routing Group Quorum?
A routing group quorum follows almost the same line of thought as a pool quorum. One of the key differences is the need to have three replicas (copies) of the data and nothing more. What we have had since Lync 2013 and Skype 2015 are routing groups. A user is assigned to a routing group once their Skype account is created; which routing group they belong to is beyond the control of the administrator. Each user has what we consider a primary replica of their routing group and two secondary copies. Routing groups prefer to have three copies, and if they do not have all three, the user's routing group is still functional. However, if a user's routing group were to fall to a single replica, the user would be in “Limited Functionality Mode”. This can happen when too many FE servers in a single pool are rebooted at the same time.
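The replica behaviour described above can be sketched as a small lookup. This is only an illustration of the paragraph, not how Windows Fabric implements it: the function name `routing_group_state` and the state labels are made up here, and the result for zero running replicas is an assumption, since the text only covers the cases of three, two, and one.

```python
def routing_group_state(replicas_running: int) -> str:
    """Map the number of running replicas (0-3) to the user experience.

    Hypothetical helper that mirrors the prose: three copies are preferred,
    fewer than three still works, and a single copy means Limited
    Functionality Mode.
    """
    if not 0 <= replicas_running <= 3:
        raise ValueError("a routing group has at most three replicas")

    if replicas_running >= 2:
        return "functional"
    if replicas_running == 1:
        return "limited functionality"
    # Assumption: the article does not say what happens with no replicas up.
    return "unavailable"


if __name__ == "__main__":
    for running in (3, 2, 1):
        print(running, routing_group_state(running))
```

Thinking about the problem this way makes the reboot warning concrete: patching FE servers one at a time keeps each routing group at two or more running replicas.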
Where does SQL come into play?
Let us take a scenario where we have three FE servers in a pool. The SQL server does not come into play here. Why not? We have an odd number of FE servers, so if one FE server goes down we still have quorum with the two remaining FE servers. If a second FE server goes down, we lose quorum. The important factor here is that the SQL server is not a voter in this scenario.
Let us take another example where we have four FE servers in a pool and lose one. Even after losing a single FE server we still have quorum. If we lose two, however, the FE servers on their own no longer hold a majority, and quorum requires a majority. That is where the SQL server comes into play as the decision maker or tiebreaker.
So in the second example, if we lose two of the four FE servers and have two left, SQL casts its vote and there are now three voters up in the pool. With three votes out of five (four FE servers plus SQL, with two FE servers down), majority rules and we still have pool quorum.
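The vote counting in the two examples above can be written out as a short sketch. It is a simplified model of this article's explanation only: the function name `has_pool_quorum` and its parameters are made up for illustration, and it does not try to reproduce the "any 4 of the first 7" style rules from the chart for larger pools.

```python
def has_pool_quorum(total_fe: int, running_fe: int, sql_available: bool = True) -> bool:
    """Return True if the running voters form a strict majority.

    Simplified model: in a pool with an even number of Front End servers the
    SQL back end counts as one extra voter; in a pool with an odd number only
    the Front End servers vote.
    """
    if not 0 <= running_fe <= total_fe:
        raise ValueError("running_fe must be between 0 and total_fe")

    sql_votes = total_fe % 2 == 0  # SQL is only a voter in even-sized pools
    total_voters = total_fe + (1 if sql_votes else 0)
    running_voters = running_fe + (1 if sql_votes and sql_available else 0)

    # Quorum means strictly more than half of all voters are up.
    return running_voters * 2 > total_voters


if __name__ == "__main__":
    print(has_pool_quorum(3, 2))                       # three FE servers, one down
    print(has_pool_quorum(3, 1))                       # three FE servers, two down
    print(has_pool_quorum(4, 2))                       # four FE servers, two down, SQL up
    print(has_pool_quorum(4, 2, sql_available=False))  # same, but SQL is also down
```

The last call shows why the SQL back end matters in an even-sized pool: with two FE servers and SQL both unavailable, only two of five voters remain.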
Let me see this “Fabric”
During installation, Windows Fabric creates a local configuration file at C:\ProgramData\Windows Fabric\<server.domain.com>\Fabric\ClusterManifest.current.xml. This is a location change from Lync Server 2013, which stored the file at C:\Program Files\Windows Fabric\bin\ClusterManifest.current.xml.
<Node NodeName="FE01.CONTOSO.COM" IPAddressOrFQDN="fe01.contoso.com" IsSeedNode="true" NodeTypeRef="FrontEndNode" FaultDomain="FD:/FAULTDOMAIN1" UpgradeDomain="UD:/UPGRADEDOMAIN1" />
<Node NodeName="FE02.CONTOSO.COM" IPAddressOrFQDN="fe02.contoso.com" IsSeedNode="true" NodeTypeRef="FrontEndNode" FaultDomain="FD:/FAULTDOMAIN2" UpgradeDomain="UD:/UPGRADEDOMAIN2" />
<Node NodeName="FE03.CONTOSO.COM" IPAddressOrFQDN="fe03.contoso.com" IsSeedNode="true" NodeTypeRef="FrontEndNode" FaultDomain="FD:/FAULTDOMAIN3" UpgradeDomain="UD:/UPGRADEDOMAIN3" />
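If you want to pull the node list out of entries like these rather than read them by eye, a few lines of Python will do it. The sketch below parses only the three `Node` lines shown above, wrapped in a made-up `<Nodes>` root so the fragment is well-formed on its own. A real ClusterManifest file has more structure around these entries and may declare an XML namespace, which this sketch does not handle; the helper name `seed_nodes` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Node entries copied from the article. The <Nodes> wrapper is an assumption
# added here so that the fragment parses as a standalone document.
FRAGMENT = """
<Nodes>
  <Node NodeName="FE01.CONTOSO.COM" IPAddressOrFQDN="fe01.contoso.com" IsSeedNode="true" NodeTypeRef="FrontEndNode" FaultDomain="FD:/FAULTDOMAIN1" UpgradeDomain="UD:/UPGRADEDOMAIN1" />
  <Node NodeName="FE02.CONTOSO.COM" IPAddressOrFQDN="fe02.contoso.com" IsSeedNode="true" NodeTypeRef="FrontEndNode" FaultDomain="FD:/FAULTDOMAIN2" UpgradeDomain="UD:/UPGRADEDOMAIN2" />
  <Node NodeName="FE03.CONTOSO.COM" IPAddressOrFQDN="fe03.contoso.com" IsSeedNode="true" NodeTypeRef="FrontEndNode" FaultDomain="FD:/FAULTDOMAIN3" UpgradeDomain="UD:/UPGRADEDOMAIN3" />
</Nodes>
"""


def seed_nodes(xml_text: str) -> list[str]:
    """Return the IPAddressOrFQDN of every Node flagged IsSeedNode="true"."""
    root = ET.fromstring(xml_text.strip())
    return [
        node.get("IPAddressOrFQDN", "")
        for node in root.iter("Node")
        if node.get("IsSeedNode", "").lower() == "true"
    ]


if __name__ == "__main__":
    for fqdn in seed_nodes(FRAGMENT):
        print(fqdn)
```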
Odd or Even Shouldn’t Matter
At the end of the day, whether I have three FE servers or eight, the choice should not come down to an odd or even number of servers in my pool but to capacity requirements and justifications. The more we deal with the Skype FE pool, the more those of us who skipped the clustering class come to understand that the underlying “Fabric” layer, and how it relates to Skype, rests on core Windows concepts of clustering and what makes up a quorum. So the next time someone mentions clustering with regard to Windows, take heed, for you never know which application down the road will use some of those same key concepts.