Ben Nitti
May 1, 2021
  850
(1 votes)

Optimizing your Optimizely Search & Navigation service for large files

Awhile ago I had a client with an excess of large files. I had increased their upload size limit to 2 GB and many of their documents were between 50 MB; a dozen or so files were between 1 and 2 GB.  Episerver recommends not exceeding the  by default 50 MB maximum request size.

Not surprisingly the indexing job started timing out and required immediate attention. 

I found there were several ways to tweak the the performance by filtering these files from the indexing job. 

I created an initialization module and changed the default batch sizes for the Find service. ContentBatchSize is used for the find index job, MediaBatchSize is for the event-driven indexing on media types. 

    [InitializableModule]
    [ModuleDependency(typeof(IndexingModule))]
    public class FileIndexingConventions : IInitializableModule
    {
        public void Initialize(InitializationEngine context)
        {
            ContentIndexer.Instance.MediaBatchSize = 3;     // Default is 5
            ContentIndexer.Instance.ContentBatchSize = 50;  // Default is 100
        }

        public void Uninitialize(InitializationEngine context)
        {
            throw new NotImplementedException();
        }
    }

I had several ways to filter out these large files. I could filter out IContentMedia from the index entirely or do the same with a custom type for pdfs and zip extensions.

ContentIndexer.Instance.Conventions.ForInstancesOf<MyPdfMediaType>().ShouldIndex(x => false);

Alternatively, I could stop the binary data from being indexed by decorating the propery with the [JsonIgnore] attribute:

    public class MyPdfMediaType : MediaData
    {
        [JsonIgnore]
        public override Blob BinaryData { get; set; }
    }

But since the client wanted to have the file content searchable, I decided only to filter the property when the filesize reached the find service limit. 

ContentIndexer.Instance.Conventions.ForInstancesOf<IContentMedia>().IndexAttachment(x => !IsFileSizeLimitReached(x));

...and for this I used an extention method to check against filesize binary data:

        private static bool IsFileSizeLimitReached(IBinaryStorable binaryContent)
        {
            // Note: 37 MB max. size refers to the base64 encoded file size .
            const int limitKb = 37000;

            try
            {
                var blobByte = (binaryContent.BinaryData as AzureBlob)?.ReadAllBytes() ??
                               (binaryContent.BinaryData as FileBlob)?.ReadAllBytes();

                if (blobByte == null)
                    return false;

                double fileSize = blobByte.Length;

                var isLimitReached = (int)(fileSize / 1024) >= limitKb;

                return isLimitReached;
            }
            catch
            {
                return false;
            }
        }

Once in place I was able to run the job with no exceptions, no timeouts and a happy client!

May 01, 2021

Comments

Please login to comment.
Latest blogs
Preview multiple Visitor Groups directly while browsing your Optimizely site

Visitor groups are great - it's an easy way to add personalization towards market segments to your site. But it does come with it's own set of...

Allan Thraen | Sep 26, 2022 | Syndicated blog

The Report Center is finally back in Optimizely CMS 12

With Episerver.CMS.UI 12.12.0 the Report Center is finally re-introduced in the core product.

Tomas Hensrud Gulla | Sep 26, 2022 | Syndicated blog

Dynamic Route in ASP.NET Core When MapDynamicControllerRoute Does Not Work

Background Creating one of the add-on for Optimizely I had to deal with challenge to register dynamically route for the API controller. Dynamic rou...

valdis | Sep 25, 2022 | Syndicated blog

404 Error on Static Assets Within an Optimizely plugin

Background With the move to CMS 12 and .NET 5/6, developers are now able to build Plugins and Extensions using Razor Class Libraries (RCL).  These...

Mark Stott | Sep 23, 2022