
Ben Nitti
May 1, 2021

Optimizing your Optimizely Search & Navigation service for large files

A while ago I had a client with an excess of large files. I had increased their upload size limit to 2 GB, and many of their documents were upward of 50 MB; a dozen or so files were between 1 and 2 GB. Episerver recommends not exceeding the default maximum request size of 50 MB.

Not surprisingly, the indexing job started timing out and required immediate attention.

I found there were several ways to tweak the performance by filtering these files out of the indexing job.

I created an initialization module and changed the default batch sizes for the Find service. ContentBatchSize is used by the Find indexing job, while MediaBatchSize applies to event-driven indexing of media types.

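    using EPiServer.Find.Cms;
    using EPiServer.Find.Cms.Module;
    using EPiServer.Framework;
    using EPiServer.Framework.Initialization;

    [InitializableModule]
    [ModuleDependency(typeof(IndexingModule))]
    public class FileIndexingConventions : IInitializableModule
    {
        public void Initialize(InitializationEngine context)
        {
            // Smaller batches keep each indexing request well below the size limit.
            ContentIndexer.Instance.MediaBatchSize = 3;     // Default is 5
            ContentIndexer.Instance.ContentBatchSize = 50;  // Default is 100
        }

        public void Uninitialize(InitializationEngine context)
        {
            // Nothing to clean up. Throwing here would raise an exception at shutdown.
        }
    }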

I had several ways to filter out these large files. I could exclude IContentMedia from the index entirely, or do the same for a custom media type that covers PDF and ZIP files.

    ContentIndexer.Instance.Conventions.ForInstancesOf<MyPdfMediaType>().ShouldIndex(x => false);

Alternatively, I could stop the binary data from being indexed by decorating the property with the [JsonIgnore] attribute:

    public class MyPdfMediaType : MediaData
    {
        [JsonIgnore]
        public override Blob BinaryData { get; set; }
    }

But since the client wanted the file content to be searchable, I decided to exclude the attachment only when the file size reached the Find service limit.

    ContentIndexer.Instance.Conventions.ForInstancesOf<IContentMedia>().IndexAttachment(x => !IsFileSizeLimitReached(x));

...and for this I used a helper method that checks the size of the binary data:

    private static bool IsFileSizeLimitReached(IBinaryStorable binaryContent)
    {
        // Note: the 37 MB maximum refers to the base64-encoded file size.
        const int limitKb = 37000;

        try
        {
            // Read the bytes from whichever blob provider backs the file.
            var blobBytes = (binaryContent.BinaryData as AzureBlob)?.ReadAllBytes() ??
                            (binaryContent.BinaryData as FileBlob)?.ReadAllBytes();

            // No readable binary data, so there is nothing to exclude.
            if (blobBytes == null)
                return false;

            // Compare the size in KB against the limit.
            return blobBytes.Length / 1024 >= limitKb;
        }
        catch
        {
            // If the size cannot be determined, fall back to indexing the attachment.
            return false;
        }
    }
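
The helper above reads every file fully into memory just to measure it, which gets expensive for files in the 1 to 2 GB range. As a rough, self-contained sketch (not the code used on the project), the same threshold can be checked against a stream's reported length instead. The names FileSizeSketch, Base64Length and IsOverLimit are hypothetical, and the sketch assumes the blob can hand out a seekable stream. It also shows one plausible reading of the 37 MB figure, which the post does not derive: base64 turns every 3 bytes into 4 characters, so 37,000 KB of raw data becomes roughly 48 MB once encoded, just under a 50 MB request limit.

```csharp
using System;
using System.IO;

public static class FileSizeSketch
{
    // Same raw-size threshold as limitKb in the helper above (37,000 KB).
    public const long RawLimitBytes = 37000L * 1024;

    // Base64 emits 4 characters for every 3 input bytes, rounded up to a full group.
    public static long Base64Length(long rawBytes) => 4 * ((rawBytes + 2) / 3);

    // Compares the stream's reported length to the limit without reading its contents.
    // Assumption: the stream is seekable; otherwise Length is not available.
    public static bool IsOverLimit(Stream stream, long limitBytes = RawLimitBytes)
    {
        if (stream == null || !stream.CanSeek)
            return false; // same fallback as the original helper: index the attachment

        return stream.Length >= limitBytes;
    }

    public static void Main()
    {
        // Encoded size of a file that sits exactly at the raw limit.
        Console.WriteLine(Base64Length(RawLimitBytes)); // 50517336

        // A 1 KB in-memory stream is far below the limit.
        using (var small = new MemoryStream(new byte[1024]))
        {
            Console.WriteLine(IsOverLimit(small)); // False
        }
    }
}
```

Whether the stream returned by a given blob provider reports its length cheaply is worth verifying before relying on this in an indexing convention.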

Once this was in place, I was able to run the job with no exceptions and no timeouts, and the client was happy!
