Meaning of "minimum servers" in Application Request Routing

The product team is currently working on the ARRv1 RTW release. Online help, in addition to the walkthroughs, will be part of this release effort.

That said, it has come to my attention that the "minimum servers" setting (available on the Health Test page) has not been documented very well. (More info on ARR is here.  You can also download x86 ARR here and x64 ARR here.)

As you know, one of the goals of ARR is to provide high availability among application servers.  ARR achieves this by 1) monitoring the health of the application servers and 2) forwarding requests only to the healthy servers.  Sounds simple enough, but it is actually a bit more complicated than that.  Let's suppose that you have 10 application servers in a server farm where ARR is responsible for load balancing the requests.  Given your traffic load, let's also suppose that you need at least 5 healthy servers at any given time in order to sufficiently handle the normal amount of traffic.

In such an environment, if 2 of the servers become unhealthy, there is no real impact because you still have 8 healthy servers that can handle the traffic.  However, what happens if you have fewer than 5 healthy servers, say only 3?  What is the desired behavior?  Given that you need at minimum 5 healthy servers to service the normal amount of traffic, having only 3 healthy servers isn't sufficient.  I am oversimplifying the problem, but broadly there are two possible outcomes:

  1. Honor the health of application servers and send all requests to 3 healthy servers.  In this case, you are overloading the 3 remaining healthy servers and most, if not all, of your end users will have bad experiences.  Also, depending on the resiliency of the application servers, the experience may get even worse over time.
  2. Disregard the health of application servers and load balance as usual.  In this case, some requests are routed to the healthy servers and some are routed to unhealthy servers, based on the load balancing algorithm.  While this is not ideal, it is a better alternative than the first because only some of your end users will have bad experiences.

To address such situations, ARR has introduced the concept of "minimum servers": the minimum number of healthy servers that you must have in your environment for ARR to keep honoring the health of the application servers.  When the number of healthy servers falls below this value, ARR starts to disregard health so that it can keep providing service to some end users without bringing down the entire service.  In the above example, the "minimum servers" value would be set to 5.
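To make the rule concrete, here is a minimal sketch in Python of the decision described above.  This is not ARR's actual implementation; the names `Server`, `eligible_servers` and `minimum_servers` are hypothetical and exist only for illustration.  It also assumes that health is honored when the healthy count is exactly equal to the "minimum servers" value, which matches the "fewer than 5" wording in the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    """Hypothetical stand-in for one application server in the farm."""
    name: str
    healthy: bool


def eligible_servers(servers: List[Server], minimum_servers: int) -> List[Server]:
    """Return the servers that the load balancer may route requests to.

    Assumption: health is honored while the number of healthy servers is
    at least `minimum_servers`; below that, health is disregarded and the
    whole farm is used.
    """
    healthy = [s for s in servers if s.healthy]
    if len(healthy) >= minimum_servers:
        # Enough healthy capacity: route only to healthy servers.
        return healthy
    # Too few healthy servers: ignore health and load balance as usual.
    return list(servers)


def make_farm(total: int, healthy_count: int) -> List[Server]:
    """Build an illustrative farm where the first `healthy_count` servers are healthy."""
    return [Server(f"app{i}", healthy=(i < healthy_count)) for i in range(total)]
```

With the numbers from the example, `eligible_servers(make_farm(10, 8), 5)` returns only the 8 healthy servers, while `eligible_servers(make_farm(10, 3), 5)` returns all 10 servers, healthy or not.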
