IIS7 and Failover Clustering

We recently published an article on how to enable failover clustering for IIS7 with Microsoft Cluster Service (MSCS). MSCS increases the availability of applications by failing over to a second machine that is on standby, or by simply taking the sick machine out of the network so that no more requests are routed to it.

There is a fundamental problem with configuring failover clustering for IIS, however: IIS itself is not an application. IIS is an application platform, and it's hard to come up with a clear yea or nay on when to fail over. We don't know enough to decide whether an application is terminally sick and can't recover. Is it when a web application returns 500 errors, when the application is so overloaded that TCP connections can't be established anymore, or when its Application Pool crashes repeatedly? What if more than one application is hosted on the IIS machine? Should IIS fail over only when W3SVC is terminated? Should IIS fail over when one of the web applications is dead, or when 50% of the apps are dead? 90%?
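To make the "50%? 90%?" question concrete, here is a minimal sketch of a percentage-based policy. It is written in Python purely for illustration (the published sample is VBScript), and the function name, the threshold parameter, and the idea of passing in a map of application name to healthy/unhealthy are all assumptions, not something taken from the article.

```python
def should_fail_over(app_health, dead_threshold=0.5):
    """Decide whether enough hosted apps are dead to justify a failover.

    app_health: dict mapping a (hypothetical) application name to True
                when that application is considered healthy.
    dead_threshold: fraction of dead apps that triggers a failover,
                    e.g. 0.5 for "half of the apps are dead".
    """
    if not app_health:
        # Nothing is hosted, so there is nothing to fail over for.
        return False
    dead = sum(1 for healthy in app_health.values() if not healthy)
    return dead / len(app_health) >= dead_threshold


# Example: one of two apps is dead, which meets the default 50% threshold.
print(should_fail_over({"shop": True, "blog": False}))
```

Even a policy this small raises the real question: someone still has to decide what "healthy" means for each application and which threshold is acceptable.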

There is no good way to make these decisions without involving the people who manage the IIS machine and the web application developers who run code on the IIS box. That's why we decided to publish sample code that you can take and adjust for your specific needs.

The sample code in the article is a script that can be configured as a Generic Script Resource in MSCS. The script monitors the run-time status of the IIS7 "Default Web Site" and the IIS7 "DefaultAppPool" and triggers a failover if either of them is stopped. It is trivial to change the Site or Application Pool name in the script, and it should be easy for a developer familiar with VBScript to add custom functionality to it.
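The check described above boils down to something like the following sketch. Again, this is Python for illustration and not the VBScript from the article. The `get_state` callback is a hypothetical stand-in for however you query IIS for the state of a site or application pool, and the "Stopped"/"Started" strings are assumed labels, not the values the real script compares against.

```python
# Objects to watch: (kind, name) pairs using the names from the article.
MONITORED = [
    ("site", "Default Web Site"),
    ("apppool", "DefaultAppPool"),
]


def is_alive(get_state, monitored=MONITORED):
    """Return False as soon as any monitored object reports "Stopped".

    get_state: hypothetical callable taking (kind, name) and returning
               a state string such as "Started" or "Stopped".
    """
    for kind, name in monitored:
        if get_state(kind, name) == "Stopped":
            return False  # the cluster would treat this as a failure
    return True


# Example with a fake state table instead of a real IIS query.
fake_states = {
    ("site", "Default Web Site"): "Started",
    ("apppool", "DefaultAppPool"): "Stopped",
}
print(is_alive(lambda kind, name: fake_states[(kind, name)]))
```

Watching a different site or pool is just a matter of editing the list, and custom checks (for example the percentage policy discussed earlier) would slot in where the state comparison happens.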

Happy failover!
