InitializationModule loop problem

Hi All,

One of our load-balanced Epi (version 9.5) apps is experiencing an issue where one of the web servers gets into some sort of initialization loop. Last time, in the space of 12 minutes, the InitializationModule was fired 23 times. Most of the time, a fair amount of RAM is released during the incident. However, the app pool is not being recycled.

Has anyone experienced a similar issue?

Thanks,
Maciej

#151290

Jul 14, 2016 10:26

Daniel Ovaska

Add some logging to the module and see if it fails to run for some reason. I think Episerver will try to run it again if it fails, so that might be a reason...

#151291

Jul 14, 2016 10:48

Hi Daniel,

But they shouldn't be running at all as the application is up and running.

Maciej

#151293

Jul 14, 2016 10:54

Daniel Ovaska

Yup, but if the module throws an exception during Initialize, it might run again on the next request, and so on...

#151295

Jul 14, 2016 11:23

Hi,

But why would it have to initialize again if it has been running for many hours?

#151296

Jul 14, 2016 11:25

Daniel Ovaska

It never ran the initialize method successfully...and will try again and again until it does.

I'm just guessing, so turning on logging with log4net and adding some extra logging to the Initialize method of the failing module should show whether my guess is correct or not.

Add a try/catch to see if the Initialize method throws any exception, and a Log.Info at the end of the Initialize method to see that it actually gets there.

I know I once ended up with a hundred or so event handlers for the publish event because my module had an error in it somewhere and I didn't handle it correctly.

#151297

Edited, Jul 14, 2016 11:41

Daniel,

The site was running for a number of hours before it went into this initialization loop. Just to be on the safe side, I will add logging to all my custom initialization modules.

Thank you very much for your help.

#151298

Jul 14, 2016 11:46

Daniel Ovaska

Might be related to load balancing as well. Might be that you get all traffic to a single server(?) and that the other server only starts after 10 mins or so when it gets its first request. Turning on logging will show that as well...

#151299

Jul 14, 2016 12:33

Hi Daniel,

I think I wasn't clear enough when describing the problem. The issue is not present during the initial application start. It happens at random and, as far as I can see in the event log, it is not related in any way to an application pool recycle.

Maciej

#151300

Jul 14, 2016 12:54

Daniel Ovaska

I understand. It is strange that this should happen after 10 mins. The only reasons I can think of are the above.

#151303

Jul 14, 2016 13:59

Anders Schliemann

Did you ever get to the bottom of this? We are regularly experiencing this in a load-balanced environment as well. The InitializationEngine continues to loop many times, and the site never becomes accessible until a manual IIS reset. I'm pretty sure that it is not because of heavy load on a single server, since it also happens at night when load is minimal.

I have added a link containing the contents of the Episerver log when initialization loop occurs:

https://ufile.io/6fe8

#171067

Edited, Nov 01, 2016 9:46

Daniel Ovaska

It's probably worth opening a support ticket. The only thing I see in the logs that looks a bit weird is this:

2016-11-01 03:24:53.337,INFO,EPiServer.Events.Providers.EventProviderService,Cancel sending event message as the EventProviderService doesn't have any configured providers.,

Since you have a load-balanced site, remote events need to be configured for it to run properly. Have you configured those? Does cache invalidation seem to work OK?

#171069

Nov 01, 2016 10:24

Hi,

To troubleshoot, try: http://www.epiwiki.se/tools/application-restart-detector

In our case, setting fcnMode (https://msdn.microsoft.com/en-us/library/system.web.configuration.fcnmode%28v=vs.110%29.aspx?f=255&MSPPError=-2147217396) to Disabled did the trick.
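For reference, a minimal sketch of where that setting lives in web.config. It assumes the site runs on .NET Framework 4.5 or later (where the `fcnMode` attribute on `httpRuntime` was introduced) and shows only the relevant attribute; any attributes already present on your `httpRuntime` element should be kept as they are.

```xml
<configuration>
  <system.web>
    <!-- Assumption: merge fcnMode into your existing httpRuntime element
         rather than adding a second one. Documented values are
         NotSet, Default, Single and Disabled. -->
    <httpRuntime fcnMode="Disabled" />
  </system.web>
</configuration>
```

With `Disabled`, ASP.NET file change notifications are turned off, so it is worth checking that deployments still restart the application the way you expect.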

Thanks,

Maciej

#171074

Nov 01, 2016 11:05

Anders Schliemann

Hi Maciej,

Thanks for your follow-up. I'm struggling to understand how fcnMode = Disabled prevents the loop. When I look at our files on the website, none of them changed at the time of the recycle. Which files did you find that actually changed when the application pool recycled? Was it modules?

Best regards

Anders

#171076

Nov 01, 2016 11:41

Hi Anders,

I couldn't see any files being changed. However (as far as I can remember), the Application Restart Detector was returning the ConfigurationChange value.

Thanks,

Maciej

#171077

Nov 01, 2016 11:55

Daniel Ovaska

Checking for application restart seems like a great first step...

If that is the case, you can use a memory dump or similar afterwards to find out exactly what went wrong...

#171085

Nov 01, 2016 13:41

Syed Shah

Hi,

We are having the same issue on Azure Web Apps. The application is running fine, then all of a sudden Azure decides to replace an instance and the site goes into an infinite loop. Azure support thinks, based on their dump analysis, that cached objects are not being properly released. How can we confirm that the cached objects are properly released?

Thanks,

Syed

#171337

Nov 08, 2016 12:24

Daniel Ovaska

I've seen it happen a few times. Every time, it has been a developer mistake where they fetched a silly number of objects and then cached them wrongly. I had one client who fetched the entire AD worth of users and cached that every 10 minutes, causing memory to spike. If you check through the logic where you do caching, it's almost always possible to build it in a smarter way that doesn't need to cache as much data. Using recursive GetChildren or similar in Episerver to crawl through the entire page tree unnecessarily is not an uncommon mistake. If you cache stuff, avoid caching objects that have connections to plenty of other objects (like PageData). Store only what you need in a separate flat object instead. Normally this will both result in a much smaller cache and make it easier for .NET to garbage collect when necessary. And avoid using static dictionaries or similar to build your own cache. Then you are basically screwed when memory is running out. Do use Episerver's caching system, or at least the .NET standard one.

#171339

Edited, Nov 08, 2016 12:59