Scheduled jobs in the DXC, the Auto Heal policy and architecture
Scheduled jobs are a great way to write code that runs periodically and performs actions on your CMS or Commerce system through the Episerver API: https://world.episerver.com/documentation/developer-guides/CMS/scheduled-jobs/ . I have often used them with great success, but in our latest project we needed to do some heavy CRUD syncing of data from two of our client's large systems into the Episerver Commerce catalog structure.
We chose scheduled jobs because we could use the API and needed to do a few other things as well. We knew the jobs would be intensive, but when we came to test them on the DXC we kept hitting an issue with our largest fresh import of data.
The jobs were getting aborted, so we contacted Episerver support and were informed that Azure Auto Heal ( https://blogs.msdn.microsoft.com/appserviceteam/2017/08/17/proactive-auto-heal/ ) was turned on for the environments. Auto Heal works out whether instances have an issue and restarts those that do. One of the thresholds it checks is memory usage, and our large fresh import was hitting that threshold because of the massive data set we were working with.
There are two options for dealing with this and for architecting jobs that involve heavy processing.
- Request Episerver to disable the Auto Heal policy (we are doing this for now).
- Use separate, dedicated web apps for the jobs. There's a great article on this here: https://world.episerver.com/blogs/Sergey-Vorushilo/Dates/2017/12/scheduled-jobs-setup-in-dxc-service/ . This will allow you to control this policy, keep your jobs from affecting the website, and leave the Auto Heal policy on.
Note: Episerver Support told me that if you have a Commerce site, they can create another web app for this at no extra cost.
So, as a thought, I'd suggest that whenever you use scheduled jobs you consider how much data you are processing and how intensive the jobs are, and that you consider separating jobs out as standard on the DXC to help mitigate these issues.
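As a rough illustration of keeping an import's memory footprint down, the sketch below pushes a large data source through in fixed-size batches instead of loading it all at once. It is written in Python purely for illustration (a real Episerver job would be C# built on the scheduled jobs API), and `batched`, `sync_catalog`, `save_batch` and the batch size of 500 are all hypothetical names and values, not part of any Episerver API. Whether batching alone keeps a job under the Auto Heal memory threshold will depend on the data and on what gets cached along the way.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        # Only one batch is materialised in memory at a time.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def sync_catalog(
    source: Iterable[T],
    save_batch: Callable[[List[T]], None],  # hypothetical: writes one batch to the catalog
    batch_size: int = 500,                  # assumed value, tune for your data
) -> int:
    """Send `source` to `save_batch` one batch at a time; return the item count."""
    total = 0
    for batch in batched(source, batch_size):
        save_batch(batch)
        total += len(batch)
    return total
```

If `source` is a lazy iterator (for example, rows streamed from the external system), only the current batch is held by this code at any point, which is the property that matters when a memory threshold can restart the instance.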
nais! I just love these healing policies. we are too lame and lazy to deal with high memory usage (or actually it was app intent to use as much as possible and troll GC) but there is always an easy way - just restart damn thing and move on.. :)
ICYMI there has been a long-standing bug in ASP.NET which prevents cache trimming from running correctly, see my blog post here: https://world.episerver.com/blogs/Magnus-Rahl/Dates/2017/11/bug-in-aspnet-cache-causes-problems-with-azure-proactive-auto-heal/ . This results in an ever-growing cache when an application runs under memory pressure. Things in Epi are cached with timeouts, invalidated when updated etc., but when churning through big data sets it does depend on the LRU algorithm of the cache trimmer to free up memory. The ASP.NET bug has been fixed in .NET Framework 4.7.2, but that fix is still not rolled out to Azure App Service in all regions (it should be any day now).