I remember seeing it before but I don't know if we ever got any wiser on the cause of it.
We had major issues with it during deploys: after a restart, the server seemed to try to sync with the other server, which took a hit and became pretty much unresponsive.
We had three servers and ended up with a cascading effect where we had to juggle the servers as they became unresponsive one by one, and it could take over an hour until everything settled down and all servers were stable. Nightly/scheduled IIS resets could trigger the cascade anew.
I don't really have a solution for you, only a recommendation:
Set up a new environment where the servers are on an isolated subnet and configure remote events to use UDP multicast instead of TCP. We have not seen the issue since we made that move, and the servers are stable and reliable.
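For reference, a rough sketch of what the UDP multicast setup could look like in web.config, using the same service and contract names as the TCP configuration elsewhere in this thread. Treat it as a starting point under assumptions, not a verified configuration: the `udpTransportCustom` extension and its type name are as I recall them from the EPiServer remote events documentation and should be checked against your installed version, and the multicast address and port (`239.255.255.19:5000`) are placeholders to replace with values valid on your subnet.

```xml
<system.serviceModel>
  <extensions>
    <bindingElementExtensions>
      <!-- Assumption: UDP transport element shipped in EPiServer.Events; verify the type name for your version -->
      <add name="udpTransportCustom"
           type="Microsoft.ServiceModel.Samples.UdpTransportElement, EPiServer.Events" />
    </bindingElementExtensions>
  </extensions>
  <services>
    <service name="EPiServer.Events.Remote.EventReplication">
      <!-- Placeholder multicast address and port; every node listens on the same address -->
      <endpoint name="RemoteEventServiceEndPoint"
                contract="EPiServer.Events.ServiceModel.IEventReplication"
                binding="customBinding"
                bindingConfiguration="RemoteEventsBinding"
                address="soap.udp://239.255.255.19:5000/RemoteEventService" />
    </service>
  </services>
  <client>
    <!-- Same address as the service endpoint, so one send reaches all nodes -->
    <endpoint name="RemoteEventServiceClientEndPoint"
              address="soap.udp://239.255.255.19:5000/RemoteEventService"
              binding="customBinding"
              bindingConfiguration="RemoteEventsBinding"
              contract="EPiServer.Events.ServiceModel.IEventReplication" />
  </client>
  <bindings>
    <customBinding>
      <binding name="RemoteEventsBinding">
        <binaryMessageEncoding />
        <udpTransportCustom multicast="True" />
      </binding>
    </customBinding>
  </bindings>
</system.serviceModel>
```

Unlike the TCP setup, the configuration is identical on every node because there are no per-host addresses, which is also why the servers need to share a subnet where multicast traffic is allowed.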
Thanks @Erik. Yes, it's very frustrating and quite a strange issue :( ... onwards the troubleshooting goes. I may have to try configuring UDP as you mentioned.
We have exactly the same issue and have had it for quite some time. We use a load-balanced environment with 2 nodes of EPiServer. However, our configuration differs a little bit.
Node 1
<system.serviceModel>
  <serviceHostingEnvironment aspNetCompatibilityEnabled="true" multipleSiteBindingsEnabled="true" />
  <services>
    <service name="EPiServer.Events.Remote.EventReplication">
      <endpoint name="RemoteEventServiceEndPoint"
                contract="EPiServer.Events.ServiceModel.IEventReplication"
                binding="netTcpBinding"
                bindingConfiguration="RemoteEventsBinding"
                address="net.tcp://hostname-01:1337/RemoteEventService" />
    </service>
  </services>
  <client>
    <endpoint name="RemoteEventServiceClientEndPoint"
              address="net.tcp://hostname-02:1337/RemoteEventService"
              binding="netTcpBinding"
              bindingConfiguration="RemoteEventsBinding"
              contract="EPiServer.Events.ServiceModel.IEventReplication" />
  </client>
  <bindings>
    <netTcpBinding>
      <binding name="RemoteEventsBinding" portSharingEnabled="true">
        <security mode="None" />
      </binding>
    </netTcpBinding>
  </bindings>
</system.serviceModel>
Node 2
<system.serviceModel>
  <serviceHostingEnvironment aspNetCompatibilityEnabled="true" multipleSiteBindingsEnabled="true" />
  <services>
    <service name="EPiServer.Events.Remote.EventReplication">
      <endpoint name="RemoteEventServiceEndPoint"
                contract="EPiServer.Events.ServiceModel.IEventReplication"
                binding="netTcpBinding"
                bindingConfiguration="RemoteEventsBinding"
                address="net.tcp://hostname-02:1337/RemoteEventService" />
    </service>
  </services>
  <client>
    <endpoint name="RemoteEventServiceClientEndPoint"
              address="net.tcp://hostname-01:1337/RemoteEventService"
              binding="netTcpBinding"
              bindingConfiguration="RemoteEventsBinding"
              contract="EPiServer.Events.ServiceModel.IEventReplication" />
  </client>
  <bindings>
    <netTcpBinding>
      <binding name="RemoteEventsBinding" portSharingEnabled="true">
        <security mode="None" />
      </binding>
    </netTcpBinding>
  </bindings>
</system.serviceModel>
We have noticed that the issue happens after we deploy a new version. I guess some kind of initialization happens after the deploy and causes the issue. Any help/advice would be much appreciated.
Hi,
I'm having a pretty major problem with the EPiServer Events Service and cannot get to the bottom of it...
This problem is causing pretty major stability issues on the site. We are running two IIS servers (V8.5) in a load-balanced environment, with the events provider configured over TCP.
The configuration is the same on both servers, as follows:
The site runs fine most of the time, but occasionally the events service seems to encounter problems shutting down and fails to restart!
We are receiving exceptions relating to the event service being unavailable and messages being dropped from server A. The events service has died or does not start on server B. The application cannot be started, with errors on initialization relating to the ASP.NET compatibility mode. Eventually it does start and can be brought back online.
The error has happened on both servers at different times.
Can anyone help me understand why this happens? Or are there perhaps problems with the configuration?
Any help would be greatly appreciated...
Regards,
Cevin
The stack trace from the unhandled async exception is below...
[Pasting files is not allowed]