Feb 26, 2013

CMS editor server instance crashing main site

 

Recently I was involved in supporting a new large-scale implementation for a global client.

The implementation had gone through full testing and passed with flying colours. There were the usual cosmetic bugs and a few “known bugs” that the client had accepted.

We finally got the go-ahead to switch the new site on. This is where the real fun started.

Once the load balancer changes were made and the site was publicly available, key functionality suffered severe performance problems that ultimately crashed the site. When I say crashing the site, I mean the servers stopped responding.

The configuration of the implementation was to have the following:

  • 1 CMS editor server
  • Front end presentation servers with CMS access switched off.
  • Akamai CDN

It seemed a standard setup and nothing new.

We found that cached objects were being invalidated for data sources from a different provider. After a while the front-end presentation servers would stop responding and cause a failover.

The initial investigation was inconclusive and pointed at Akamai not caching correctly.

The goose chase started and a series of remedial steps were taken:

  1. Monitor the DB
  2. Look at deactivating the many languages.

Monitoring the DB showed masses of deadlocks, and the dreaded SQL query initiated by FindPagesWithCriteria was causing the DB to lock up.

However, this was not actually the issue.

If we think about it, bringing up the CMS editor application was causing the presentation servers and DB to max out.

Thinking further, it seemed logical that the CMS was causing the issue.

To prove the point, we stopped the CMS server. The issue went away immediately and the site responded as expected.

After looking at the implementation, we identified Remote Events as the cause. This is standard built-in EPiServer functionality, and I had never seen it cause an issue before.

We realised that the CMS and front-end instances were configured to listen to sites that were not actually active. (There were two other sites that weren’t switched on.)

So what was happening was this: the CMS was sending events to all the presentation servers, announcing itself and trying to register itself, which caused the presentation servers to invalidate their caches. With users still browsing the site, the servers all went back to repopulate their caches, which caused the DB to max out and the sites to crash.
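To make the effect concrete, here is a minimal sketch in Python of how repeated invalidation events multiply the load on the DB. It is a toy model, not EPiServer code: the class names, the `simulate` function and the numbers are all made up for illustration.

```python
class Database:
    """Toy stand-in for the CMS database; it only counts queries."""

    def __init__(self):
        self.queries = 0

    def load_page(self, page_id):
        self.queries += 1
        return f"content-{page_id}"


class PresentationServer:
    """Toy front-end server with an in-memory page cache (hypothetical)."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def get_page(self, page_id):
        # Only hit the DB when the page is not cached.
        if page_id not in self.cache:
            self.cache[page_id] = self.db.load_page(page_id)
        return self.cache[page_id]

    def on_remote_event(self):
        # Assumption: every incoming event wipes the whole cache.
        self.cache.clear()


def simulate(server_count, page_count, rounds, events_per_round):
    """Return the total number of DB queries over all rounds of traffic."""
    db = Database()
    servers = [PresentationServer(db) for _ in range(server_count)]
    for _ in range(rounds):
        # The CMS broadcasts its events to every presentation server.
        for _ in range(events_per_round):
            for server in servers:
                server.on_remote_event()
        # Users keep browsing: every server serves every page once.
        for server in servers:
            for page_id in range(page_count):
                server.get_page(page_id)
    return db.queries


if __name__ == "__main__":
    print("no events:  ", simulate(4, 50, 10, events_per_round=0))
    print("with events:", simulate(4, 50, 10, events_per_round=1))
```

Without events, each server fills its cache once and the DB is left alone afterwards. With one event per round, every server reloads every page in every round, so the query count grows with the number of events instead of staying flat.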

To fix this, do the following:

  1. Set the scheduler to false for all servers and all sites in episerver.config.
  2. If other sites are not running in IIS, make sure to comment them out in the episerver.config sites section.
  3. Delete the content of the automaticSiteMapping section in episerverFramework.config on all servers. It is recreated automatically on site startup.
  4. Delete table tblsiteConfig; it is recreated on startup.
  5. Make sure no scheduled jobs are active in admin mode.
  6. Restart frontend sites.
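Steps 1 and 2 can be checked with a small script before go-live. The sketch below is in Python and rests on assumptions: the sample XML is a heavily simplified, hypothetical stand-in for the sites section of episerver.config, the `enableScheduler` attribute on `siteSettings` is assumed to be how the scheduler is switched, and `audit_sites` is a made-up helper. Check the element and attribute names against your own config file before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the sites section of episerver.config.
# A real file has many more attributes and usually an XML namespace.
SAMPLE_CONFIG = """
<episerver>
  <sites>
    <site siteId="MainSite">
      <siteSettings enableScheduler="true" />
    </site>
    <site siteId="InactiveSiteA">
      <siteSettings enableScheduler="false" />
    </site>
  </sites>
</episerver>
"""


def local_name(element):
    """Return the tag name without any '{namespace}' prefix."""
    return element.tag.split("}")[-1]


def audit_sites(config_xml, sites_running_in_iis):
    """List (site_id, problem) pairs for sites that break steps 1 or 2."""
    root = ET.fromstring(config_xml.strip())
    findings = []
    for site in root.iter():
        if local_name(site) != "site":
            continue
        site_id = site.get("siteId", "")
        # Step 1: the scheduler should be off for every site.
        for child in site:
            if (local_name(child) == "siteSettings"
                    and child.get("enableScheduler", "").lower() == "true"):
                findings.append((site_id, "scheduler enabled"))
        # Step 2: sites not running in IIS should not be configured.
        if site_id not in sites_running_in_iis:
            findings.append((site_id, "configured but not running in IIS"))
    return findings


if __name__ == "__main__":
    for site_id, problem in audit_sites(SAMPLE_CONFIG, {"MainSite"}):
        print(f"{site_id}: {problem}")
```

The set of site names running in IIS is passed in by hand here. The script only reports what it finds and leaves the actual edits to episerver.config to you.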

 

The point of the above is this: when switching a site to live, make sure that only what is needed is running.
