InitializationModule loop problem

Hi All,

One of our Epi (version 9.5) apps (load balanced) is experiencing an issue where one of the web servers gets into some sort of initialization loop. Last time around, the InitializationModule was fired 23 times in the space of 12 minutes. Most of the time, a fair amount of RAM is released during the incident. However, the app pool is not being recycled.

Has anyone experienced a similar issue?

Thanks,
Maciej

#151290
Jul 14, 2016 10:26

Add some logging to the module and see if it fails to run for some reason. I think Episerver will try to run it again if it fails, so that might be the reason...

#151291
Jul 14, 2016 10:48

Hi Daniel,

But they shouldn't be running at all as the application is up and running.

Maciej

#151293
Jul 14, 2016 10:54

Yup, but if the module throws an exception during initialize it might run again on next request...and so on...

#151295
Jul 14, 2016 11:23

Hi, 

But why would it have to initialize again if it has been running for many hours?

#151296
Jul 14, 2016 11:25

It never ran the initialize method successfully...and will try again and again until it does.

I'm just guessing, so turning on logging with log4net and adding some extra logging to the Initialize method of the failing module should show whether my guess is correct or not.

Add a try/catch to see if the Initialize method throws any exception, and a Log.Info call at the end of the Initialize method to see that it actually gets there.
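A minimal sketch of what that logging could look like, assuming the site references log4net directly. The module name `MyCustomInitialization` is hypothetical, and the dependency on `EPiServer.Web.InitializationModule` is only an example; keep whatever dependencies your module already declares.

```csharp
using System;
using EPiServer.Framework;
using EPiServer.Framework.Initialization;
using log4net;

// Hypothetical module name; apply the same pattern to each custom module.
[InitializableModule]
[ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
public class MyCustomInitialization : IInitializableModule
{
    private static readonly ILog Log = LogManager.GetLogger(typeof(MyCustomInitialization));

    public void Initialize(InitializationEngine context)
    {
        Log.Info("Initialize starting");
        try
        {
            // ... existing initialization code goes here ...

            // Only reached if nothing above threw.
            Log.Info("Initialize completed");
        }
        catch (Exception ex)
        {
            Log.Error("Initialize failed", ex);
            throw; // rethrow so the module behaves exactly as before
        }
    }

    public void Uninitialize(InitializationEngine context)
    {
        Log.Info("Uninitialize called");
    }
}
```

If "Initialize starting" shows up repeatedly without a matching "Initialize completed", the module is failing part way through. If both show up repeatedly, something else is restarting the initialization.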

I know I ended up having a hundred or so event handlers for publish event because my module had an error in it somewhere and I didn't handle it correctly.

#151297
Edited, Jul 14, 2016 11:41

Daniel,

The site had been running for a number of hours before it went into this initialization loop. Just to be on the safe side, I will add logging to all my custom initialization modules.

Thank you very much for your help.

M

#151298
Jul 14, 2016 11:46

Might be related to load balancing as well. Might be that you get all traffic to a single server(?) and that the other server only starts after 10 mins or so when it gets its first request. Turning on logging will show that as well...

#151299
Jul 14, 2016 12:33

Hi Daniel,

I think I wasn't clear enough when describing the problem. The issue is not present during the initial application start. It happens very randomly and, as far as I can see in the event log, it is not related in any way to an application pool recycle.

Maciej

#151300
Jul 14, 2016 12:54

I understand. It is strange that this should happen after 10 mins. The only reasons I can think of are the above.

#151303
Jul 14, 2016 13:59

Did you ever get to the bottom of this? We are regularly experiencing this on a load balanced environment as well. The InitializationEngine keeps looping and the site never becomes accessible until a manual IIS reset. I'm pretty sure that it is not because of heavy load on a single server, since it also happens at night when load is minimal.

I have added a link to the contents of the Episerver log from when the initialization loop occurs:

https://ufile.io/6fe8

#171067
Edited, Nov 01, 2016 9:46

Probably worth opening a support ticket. The only thing I see in the logs that looks a bit weird is this:

2016-11-01 03:24:53.337,INFO,EPiServer.Events.Providers.EventProviderService,Cancel sending event message as the EventProviderService doesn't have any configured providers.,

Since you have a load balanced site, remote events need to be configured for it to run properly. Have you configured those? Does cache invalidation seem to work OK?

#171069
Nov 01, 2016 10:24

Hi,

To troubleshoot, try: http://www.epiwiki.se/tools/application-restart-detector

In our case, setting fcnMode (https://msdn.microsoft.com/en-us/library/system.web.configuration.fcnmode%28v=vs.110%29.aspx?f=255&MSPPError=-2147217396) to Disabled did the trick.
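For reference, fcnMode is an attribute on the httpRuntime element in web.config and requires .NET Framework 4.5 or later. A minimal fragment, assuming you merge it into your existing httpRuntime element rather than adding a second one:

```xml
<configuration>
  <system.web>
    <!-- Keep any attributes your httpRuntime element already has. -->
    <httpRuntime fcnMode="Disabled" />
  </system.web>
</configuration>
```

Note that with file change notifications disabled, the application no longer restarts automatically when files such as web.config change, so you need to recycle it manually after a deployment.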

Thanks,

Maciej

#171074
Nov 01, 2016 11:05

Hi Maciej,

Thanks for your follow up. I'm struggling to understand how fcnMode = disabled prevents the loop. When I look at our files on the website none of them changed at the recycle time. Which files did you find that actually changed at the recycle of the application pool? Was it modules?

Best regards

Anders

#171076
Nov 01, 2016 11:41

Hi Anders,

I couldn't see any files being changed. However (as far as I can remember), the Application Restart Detector was returning the ConfigurationChange value.
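I don't know how the linked tool works internally, but ASP.NET itself exposes the shutdown reason through `HostingEnvironment.ShutdownReason`, and `ConfigurationChange` is one of the values of that enum. A minimal sketch of logging it yourself, assuming log4net is available and that your Global.asax class derives from `EPiServer.Global`:

```csharp
using System;
using System.Web.Hosting;
using log4net;

public class Global : EPiServer.Global
{
    private static readonly ILog Log = LogManager.GetLogger(typeof(Global));

    protected void Application_End(object sender, EventArgs e)
    {
        // ShutdownReason is an ApplicationShutdownReason value,
        // for example ConfigurationChange or HostingEnvironment.
        Log.Warn("Application shutting down. Reason: " + HostingEnvironment.ShutdownReason);
    }
}
```

If this line appears in the log each time the initialization runs again, the loop is really a series of application restarts rather than a failing module.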

Thanks,

Maciej

#171077
Nov 01, 2016 11:55

Checking for application restart seems like a great first step...

If that is the case, you can then use a memory dump or similar to find out exactly what went wrong...

#171085
Nov 01, 2016 13:41

Hi,

We are having the same issue on Azure web apps. The application is running, then all of a sudden Azure decides to replace an instance and the app goes into an infinite loop. Based on their dump analysis, Azure support thinks that cached objects are not being released properly. How can we confirm that the cached objects are properly released?

Thanks,

Syed

#171337
Nov 08, 2016 12:24

I've seen it happen a few times. Every time it has been a developer mistake: fetching a silly number of objects and then caching them the wrong way. I had one client who fetched every user in AD and cached that every 10 minutes, causing memory to spike.

If you go through the logic where you do caching, it's almost always possible to build it in a smarter way that doesn't need to cache as much data. Using recursive GetChildren or similar in Episerver to crawl through the entire page tree unnecessarily is not an uncommon mistake.

If you cache stuff, avoid caching objects that have connections to plenty of other objects (like PageData). Store only what you need in a separate flat object instead. Normally this will both result in a much smaller cache and make it easier for .NET to garbage collect when necessary.

And avoid using static dictionaries or similar to build your own cache; then you are basically screwed when memory is running out. Do use Episerver's caching system, or at least the .NET standard one.
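A minimal sketch of the "flat object" idea, using the standard .NET `MemoryCache` from System.Runtime.Caching rather than Episerver's own cache. The `MenuItem` and `MenuCache` names are hypothetical, and this is only one way to do it:

```csharp
using System;
using System.Runtime.Caching;

// Hypothetical flat type holding only the fields the view needs,
// instead of caching PageData instances.
public sealed class MenuItem
{
    public string Name { get; set; }
    public string Url { get; set; }
}

public static class MenuCache
{
    // Returns the cached items for a key, calling load() only on a cache miss.
    // load() must not return null, because MemoryCache rejects null values.
    public static MenuItem[] GetOrAdd(string key, Func<MenuItem[]> load, TimeSpan lifetime)
    {
        var cached = MemoryCache.Default.Get(key) as MenuItem[];
        if (cached != null)
        {
            return cached;
        }

        var items = load();
        // Absolute expiration lets the entry be evicted, unlike a static dictionary.
        MemoryCache.Default.Set(key, items, DateTimeOffset.UtcNow.Add(lifetime));
        return items;
    }
}
```

Usage would be something like `MenuCache.GetOrAdd("top-menu", () => BuildMenu(), TimeSpan.FromMinutes(10))`, where `BuildMenu` is a hypothetical method that copies the page name and URL into `MenuItem` objects.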

#171339
Edited, Nov 08, 2016 12:59