How A/B testing affects performance

Hi, 

We are currently investigating some performance issues for a client, and it seems the issues only appear once one or more A/B tests are running on the site.

When we set up A/B tests we get a lot of the following exceptions in the log files:

System.InvalidOperationException: ValueFactory attempted to access the Value property of this instance.
   at System.Lazy`1.CreateValue()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Lazy`1.get_Value()
   at EPiServer.Marketing.Testing.Web.TestHandler.EvaluateKpis(Object sender, EventArgs e)
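
For context, this is the exception `Lazy<T>` raises when its value factory reads the same instance's `Value` while that value is still being created. A minimal sketch that reproduces it in isolation follows. The class and field names are hypothetical and unrelated to the module's actual code, which we have not seen in source form:

```csharp
using System;

public static class LazyReentrancyDemo
{
    // Hypothetical stand-in for some lazily initialised state.
    private static Lazy<int> _value;

    // Returns the name of the exception type raised by the re-entrant access.
    public static string Run()
    {
        // The factory reads _value.Value while _value is still being created.
        _value = new Lazy<int>(() => _value.Value + 1);
        try
        {
            return _value.Value.ToString();
        }
        catch (InvalidOperationException ex)
        {
            // With the default thread-safety mode, Lazy<T> detects the
            // re-entrant access and throws instead of recursing forever.
            return ex.GetType().Name;
        }
    }
}
```

If something similar happens inside the module, it would suggest that code running during initialisation ends up triggering the same initialisation again, but that is only a guess based on the stack trace.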

When checking the performance issues in detail, we also found a few database queries that take up to 10-11 seconds to complete when there's heavy load on the site. These queries involve the tblABTest, tblABVariant, tblABKeyValueResult, tblKeyConversionResult, tblABKeyFinancialResult and tblABKeyPerformanceIndicator tables.

Taken together, this leads me to believe that there are some performance challenges with the EPiServer A/B testing module.

Has anyone else seen this?

#180232
Jul 03, 2017 21:42

I decided to run some tests on an empty EPiServer site with the following 3 scenarios:

  • Without EPiServer.Marketing.Testing installed.
  • With EPiServer.Marketing.Testing installed, without any active A/B tests.
  • With an active A/B test running.

We've identified that UrlResolver.GetUrl is one of the culprits in the severe performance degradation we see when using this module, so I decided to use that as a reference. 5 runs of 500 calls to the method gave the following results:

  • Without the plugin:
    • 500 iterations took 65ms
      500 iterations took 71ms
      500 iterations took 68ms
      500 iterations took 74ms
      500 iterations took 81ms
  • With the plugin installed, but no active A/B tests
    • 500 iterations took 123ms
      500 iterations took 101ms
      500 iterations took 98ms
      500 iterations took 108ms
      500 iterations took 122ms
  • With an active A/B test (Landing page, 100% participation)
    • 500 iterations took 123139ms
      500 iterations took 162896ms
      500 iterations took 203619ms
      500 iterations took 244731ms
      500 iterations took 289076ms

Note: Subsequent testing doesn't always replicate these results. I'm not sure why yet, but UrlResolver.GetUrl suddenly works fine (no code changes). Still, the results with active A/B tests are in the 200-230ms range, roughly 3x the results without the plugin installed.

The code used for testing:

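    // Requires: using System.Collections.Generic; using System.Diagnostics;
    //           using EPiServer.ServiceLocation; using EPiServer.Web.Routing;
    var urlResolver = ServiceLocator.Current.GetInstance<UrlResolver>();

    // Five timed runs of 500 GetUrl calls each, all for the same ContentReference.
    var results = new List<string>();
    for (int run = 0; run < 5; run++)
    {
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < 500; i++)
        {
            urlResolver.GetUrl(Model.TestRef);
        }
        watch.Stop();
        results.Add(string.Format("500 iterations took {0}ms", watch.ElapsedMilliseconds));
    }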

So why is this so much slower? SQL Profiler might have the answer.

  • Without any active A/B tests, a single call to UrlResolver.GetUrl() caches all of its results, so the next call to get the URL for the same ContentReference is served from the cache. There is no DB activity at all on subsequent calls.
  • With a single A/B test active (Landing page, 100% participation), a single request generates a total of 19 DB selects on the tables concerning A/B testing and performance indicators. Every subsequent request generates the same amount of DB traffic. In other words: a lot more DB traffic and no caching.
  • The TestHandler and other code related to the Marketing.Testing module runs on every single request to the site, with quite a lot of overhead.

Are we doing something wrong when configuring our sites, or is this plugin simply not usable on production sites?

#180237
Edited, Jul 04, 2017 7:51

I am very curious about this. Could someone at Episerver please answer this? We are about to install this on a multisite solution with 30+ websites, and this might be very bad if we run into performance trouble.

#180242
Jul 04, 2017 9:28

We have filed a support incident with EPiServer on this issue. I'll update the thread as soon as I hear anything from them.

#180265
Jul 04, 2017 15:55

We have gotten a response from EPiServer support on this issue, and they suggest caching all calls to UrlResolver.GetUrl.

Not really a solution, but it might be an acceptable workaround until EPiServer has implemented a proper fix. (I haven't tested it myself yet.)
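
A rough sketch of what such a workaround could look like, assuming a simple in-memory memoisation keyed on the content reference. `CachedUrlLookup` is a hypothetical helper, not an EPiServer API and not something support provided:

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper: remembers the URL returned for each key so the
// underlying (expensive) lookup runs only once per key until Clear() is called.
public sealed class CachedUrlLookup<TKey>
{
    private readonly Func<TKey, string> _resolve;
    private readonly ConcurrentDictionary<TKey, string> _cache =
        new ConcurrentDictionary<TKey, string>();

    public CachedUrlLookup(Func<TKey, string> resolve)
    {
        _resolve = resolve ?? throw new ArgumentNullException(nameof(resolve));
    }

    // Returns the cached URL, calling the underlying lookup on a cache miss.
    // Under contention the lookup may run more than once for the same key.
    public string GetUrl(TKey key)
    {
        return _cache.GetOrAdd(key, _resolve);
    }

    // Drops all cached URLs, e.g. after content is published or moved.
    public void Clear()
    {
        _cache.Clear();
    }
}
```

It could be wired up as `new CachedUrlLookup<ContentReference>(r => urlResolver.GetUrl(r))`. Note that this sketch ignores language and request context, and it needs some invalidation strategy (calling `Clear()` when content changes), so treat it as a starting point only.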

#180358
Jul 07, 2017 14:31

I haven't dug into the A/B testing myself, but I have to ask.

Why does the A/B test module affect the UrlResolver?

Will it return different URLs depending on whether the visitor is in the A or B group?

If that is the case, wouldn't caching really screw up the A/B test result?

(I guess it would mitigate the performance issues if you install the A/B test module without using it, but is that really what you need to solve?)

#180360
Jul 07, 2017 14:54

The A/B test module hooks into the LoadedContent and LoadedChildren events triggered from IContentLoader or anything that implements IContentEvents, which in turn is used by UrlResolver and lots of other places.

If there are active tests, the LoadedContent event handler in EPiServer.Marketing.Testing.Web.TestHandler loads more content, which triggers further calls to the LoadedContent event handler.

The same can happen for LoadedChildren, which may quickly escalate out of control if you have active A/B tests on children and children's children.

These event handlers run quite a bit of logic (checking for cookie data, setting cookies, loading data from the database, saving test data).

Disclaimer: It's not easy to read decompiled code on a Friday afternoon. I might have overlooked something obvious that mitigates some of this. Some caching is done when loading the tests, but there's still a lot more database traffic compared to the same application without the Marketing.Testing NuGet package installed.
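
To illustrate why a handler that loads more content escalates so quickly, here is a toy model. All names are hypothetical and it does not mirror EPiServer's types or the module's real logic. It only counts how often a handler runs when each invocation triggers further loads:

```csharp
using System;

// Toy loader: every Load raises LoadedContent, like a content-loaded event.
public sealed class ToyContentLoader
{
    public event Action<int> LoadedContent;
    public int HandlerCalls { get; private set; }

    public void Load(int contentId)
    {
        var handler = LoadedContent;
        if (handler != null)
        {
            HandlerCalls++;
            handler(contentId);
        }
    }
}

public static class ReentrancyDemo
{
    // Counts handler invocations caused by ONE top-level load when the handler
    // itself loads `extraLoads` more items, nested down to `depth` levels.
    public static int CountHandlerCalls(int extraLoads, int depth)
    {
        var loader = new ToyContentLoader();
        int level = 0;
        loader.LoadedContent += id =>
        {
            if (level >= depth) return; // stop nesting at the chosen depth
            level++;
            for (int i = 0; i < extraLoads; i++)
            {
                loader.Load(id * 10 + i); // re-enters this same handler
            }
            level--;
        };
        loader.Load(1);
        return loader.HandlerCalls;
    }
}
```

In this model, `CountHandlerCalls(0, 3)` gives 1 handler call, `CountHandlerCalls(2, 2)` gives 7, and `CountHandlerCalls(3, 3)` gives 40 for a single top-level load, so the work grows geometrically with nesting depth.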

#180362
Jul 07, 2017 15:29

An update from support: they've reported it to the EPiServer developers and will update me when they get a response.

#180465
Jul 11, 2017 23:57

Got an updated response from EPiServer support who confirmed the bug, and told me that a fix is targeted for release on the 24th of August. :)

#180654
Jul 19, 2017 22:10

Hello. Were there any updates to this - links to the bug fix release notes for example?

#201887
Mar 06, 2019 10:53

I believe this is the link to the bug fix: https://world.episerver.com/documentation/Release-Notes/ReleaseNote/?releaseNoteId=MAR-1096.

It was released on July 24, 2017.

#201912
Mar 06, 2019 22:30

Thank you for the great investigation, @Vegard Helland. Pretty late to the party, but did you have a chance to test with a more recent version to see how much things have improved?

#201915
Mar 07, 2019 8:16

@Quan Mai: It's been a while since I tested this, but we now have the add-on running successfully on the site that we had issues with earlier. There's still a bit of a performance hit when running active A/B tests, but nothing close to as bad as it was earlier, when it actually made the server unresponsive when running tests on a high-load page.

#201983
Mar 11, 2019 12:34

@Vegard: do you have any measurements of the performance hit? That'd be very helpful for us!

#201987
Mar 11, 2019 13:11