Crippled Relate+ 2 R2 web site, low CPU hang, DebugDiag, WinDbg and a happy ending
Like it or not, sooner or later you'll probably end up in front of a crippled production environment without any meaningful logs to hold your hand. The pressure is high, so let this heavily shortened story begin.
This particular website is hosted on four web servers, two of which are located in a DMZ. After upgrading to Relate+ 2 R2, the websites hosted on the DMZ servers randomly started to show the unwanted symptoms of a low CPU hang. When browsing the web site you end up with a "ReadResponse() failed: The server did not return a response for this request." error. And the event logs, EPiServer logs and HTTPERR logs all say: nothing!
Tortured by the memories of earlier unexplainable web server errors, I realized that it was once again time to grab DebugDiag and try to get a memory dump from at least one of the servers. With the dump on disk it was time to start WinDbg and put on my Tess Ferrandez goggles.
After some dirty work and precious time lost (just getting WinDbg to work with a dump from a different OS version is a challenge) I stumbled on this webpage:
To shorten the story even more (time is precious):
The root cause of the low CPU hang was the assemblies configured in the compilation section of web.config. From the webpage linked above:
"When the CLR loads an assembly which has an Authenticode signature, it will always try to verify the signature by connecting to the Internet to generate Publisher evidence for the assembly. This is true even if there is no internet connectivity available."
In this particular case the URL requested was http://crl.thawte.com/ThawtePrem. Since our hosting partner has a very strict firewall configuration for outgoing HTTP traffic (and more), the unexpected request was blocked, which in turn caused the startup process to halt completely while it waited for the verification call to time out.
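A quick way to confirm this kind of suspicion from the affected server is to try the same outgoing request yourself with a short timeout. The following is only a minimal sketch in Python, not what I actually ran during the incident: the function name `can_reach` and the default timeout are my own assumptions, and it deliberately bypasses any proxy configured in the environment so that it tests the direct path through the firewall.

```python
import http.client
import urllib.error
import urllib.request


def can_reach(url: str, timeout: float = 5.0) -> bool:
    """Return True if any HTTP response comes back from `url` within `timeout` seconds.

    Hypothetical helper for illustration. A blocked or silently dropped
    request shows up as False (connection error or timeout).
    """
    # An empty ProxyHandler disables proxies picked up from the environment,
    # so the check exercises the direct outgoing route.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered (for example with 404), so the network path is open.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused, unreachable, DNS failure, or timed out.
        return False
```

Calling `can_reach(...)` on a DMZ server with the CRL URL found in the dump, and again on a server outside the DMZ, should make the difference between the two environments visible without having to wait for the site to hang.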
To fix this problem you either have to modify the Windows\Microsoft.Net\FrameworkXX\2.0...\Aspnet.config file so that it contains
<configuration>
  <runtime>
    <generatePublisherEvidence enabled="false"/>
  </runtime>
</configuration>
or open the firewall for outgoing requests to the CRL host.
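With four servers involved it is easy to patch the config on one and forget another, so a small check can help. This is a hedged sketch, assuming the config file content has already been read into a string; the function name `publisher_evidence_disabled` is hypothetical, and the path to the file is intentionally left out because it differs per framework version.

```python
import xml.etree.ElementTree as ET


def publisher_evidence_disabled(config_xml: str) -> bool:
    """Return True if <generatePublisherEvidence enabled="false"/> is present
    under <configuration>/<runtime> in the given config XML text.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(config_xml)
    if root.tag != "configuration":
        return False
    element = root.find("./runtime/generatePublisherEvidence")
    if element is None:
        # Element missing: the CLR keeps its default behaviour.
        return False
    return element.get("enabled", "").strip().lower() == "false"
```

Reading the file on each server and passing its text to this function gives a simple yes or no per machine.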
If you bump into this low CPU hang error, I can probably give you some more concrete advice on the debugging process. This post was just an attempt to clear my thoughts, so the details are left out.