Event System Bug in CMS 6
There is a (minor) bug in the EPiServer Event Management System when using TCP. This blog post describes how the bug behaves and what we did to get around it.
For those new to EPiServer, the event system provides a mechanism for distributing events within an EPiServer CMS site, between EPiServer CMS sites on the same physical server (enterprise), and between EPiServer CMS sites on separate servers (load-balanced standard and enterprise). Read more about it here.
In my case the customer had 3 servers, one backend and two frontend servers, all running EPiServer CMS 6 (6.0.530.1), and we were using TCP instead of UDP.
A while after the first setup of our test environment, we started to find warning messages in the log files:
Exception calling IEventReplication::RaiseEvent on remote object. System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host.
How the bug manifests itself
- Backend server starts
- Frontend server starts
- Content change on backend, event sent to frontend. Cache invalidation OK
- Frontend recycles
- Content change on backend, event sent to frontend. Error, connection forcibly closed. Cache not invalidated. Warning message in log file.
- Content change on backend, event sent to frontend. Cache invalidation OK.
As you can see, the cache update in step 5 is lost. The next update (step 6) works, because the connection is recreated and the whole cache is cleared. In other words, the bug only affects the first content change after a restart of the frontend server(s).
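The sequence above can be modelled with a small simulation. This is only a sketch of the observed behaviour, not EPiServer code: the class names (Frontend, Connection, Backend) and the "generation" counter are hypothetical, and the assumption is that the backend keeps reusing a cached connection that the frontend dropped when it recycled.

```python
class Frontend:
    """Hypothetical stand-in for a frontend server receiving cache events."""

    def __init__(self):
        self.generation = 0    # bumped on every recycle
        self.invalidated = []  # events that actually reached the cache

    def recycle(self):
        # Assumption: a recycle drops every connection opened before it.
        self.generation += 1


class Connection:
    """Hypothetical cached connection from the backend to one frontend."""

    def __init__(self, frontend):
        self.frontend = frontend
        self.generation = frontend.generation

    def send(self, event):
        if self.generation != self.frontend.generation:
            raise ConnectionResetError(
                "An existing connection was forcibly closed by the remote host"
            )
        self.frontend.invalidated.append(event)


class Backend:
    """Hypothetical sender that reuses its connection until it fails."""

    def __init__(self, frontend):
        self.frontend = frontend
        self.connection = None
        self.log = []

    def publish(self, event):
        if self.connection is None:
            self.connection = Connection(self.frontend)
        try:
            self.connection.send(event)
            return True
        except ConnectionResetError as exc:
            # The event is lost; only a warning is logged.
            self.log.append(f"WARN {exc}")
            self.connection = None  # the next publish reconnects
            return False


frontend = Frontend()
backend = Backend(frontend)

results = [backend.publish("page-1")]      # step 3: OK
frontend.recycle()                         # step 4: frontend recycles
results.append(backend.publish("page-2"))  # step 5: lost
results.append(backend.publish("page-3"))  # step 6: OK again

print(results)               # [True, False, True]
print(frontend.invalidated)  # ['page-1', 'page-3']
```

Only the first event after the recycle is dropped, which matches what we saw in the log files.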
How we got around it
We solved the problem by using wsHttpBinding instead of netTcpBinding. This is a bit problematic, as HTTP runs a security check before opening a port (which TCP does not). This means that you need to run the following command on the frontend servers (30001 is the port used for cache invalidation, + is a wildcard that matches any host name, and IIS_IUSRS is the built-in IIS group that application pool identities belong to):
netsh http add urlacl url=http://+:30001/ user=IIS_IUSRS
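If you have several frontend servers or ports, a small helper can build the command for you. This is a minimal sketch: the function name urlacl_command and the --apply switch are made up for illustration, and actually running the command is assumed to require an elevated prompt on Windows.

```python
import subprocess
import sys


def urlacl_command(port, user="IIS_IUSRS"):
    """Build the netsh arguments that reserve http://+:<port>/ for `user`."""
    return [
        "netsh", "http", "add", "urlacl",
        f"url=http://+:{port}/",
        f"user={user}",
    ]


if __name__ == "__main__":
    cmd = urlacl_command(30001)
    print(" ".join(cmd))
    # Hypothetical --apply switch: only run netsh when explicitly asked to,
    # and only on Windows.
    if sys.platform == "win32" and "--apply" in sys.argv:
        subprocess.run(cmd, check=True)
```

Without --apply the script only prints the command, so you can review it before running it on each frontend server.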
The problem has been registered as a bug by EPiServer support:
#64186: Cache update fail at first time publish in load balance environment using tcp.