Our client will be sending nightly (or whenever an update occurs) product import data that we receive as JSON and process in a scheduled job. My question is this: we have computationally heavy tasks to perform in triggers when we index a product, and doing that indexing during import is obviously causing a major slow-down. BUT... while the importer is running, an editor could simultaneously be editing a different single product (outside the mass import), and we'd want the indexing task to run when they save/publish.
What are our options for disabling indexing in this scenario? Disabling for a single publish event would be ideal.
You can add an "Exclude from Index" property that defaults to true so that Find does not index the page. (There are a lot of examples available on this forum; let me know if you need a code snippet.)
Once the importer finishes processing, you can programmatically set this property to false so Find can index the product.
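For reference, the property-based exclusion can be wired up with a Find indexing convention. This is a rough sketch only: `ExcludeFromIndex` is an assumed bool property on your product content type, and the exact type names depend on your solution.

```csharp
using EPiServer.Commerce.Catalog.ContentTypes;
using EPiServer.Find.Cms;

// In an initialization module: tell Find to skip any product
// whose (assumed) ExcludeFromIndex property is set to true.
ContentIndexer.Instance.Conventions
    .ForInstancesOf<ProductContent>()
    .ShouldIndex(x => !x.ExcludeFromIndex);
```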
Thanks @Naveed, but we sometimes have 90k products to import, and we need indexing turned off for the data import operation only, not for those products in general. We'd have to immediately re-publish all 90k with the flag cleared after the import, doubling the work. I was hoping for a flag I could pass to the ContentRepo, for instance, rather than something set on each product.
This is what you need:
EventedIndexingSettings.Instance.EventedIndexingEnabled = false;
EventedIndexingSettings.Instance.ScheduledPageQueueEnabled = false;
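To make sure indexing is always switched back on even if the import fails, these toggles can be wrapped in try/finally inside the scheduled job. A minimal sketch (the import call is a hypothetical placeholder for your own routine):

```csharp
using EPiServer.Find.Cms;

// Disable evented indexing for the duration of the import,
// and guarantee it is re-enabled whatever happens.
EventedIndexingSettings.Instance.EventedIndexingEnabled = false;
EventedIndexingSettings.Instance.ScheduledPageQueueEnabled = false;
try
{
    ImportProducts(); // hypothetical: your bulk import routine
}
finally
{
    EventedIndexingSettings.Instance.EventedIndexingEnabled = true;
    EventedIndexingSettings.Instance.ScheduledPageQueueEnabled = true;
}
```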
Surjit is correct above. However, having done this before, if you're going to be making a large number of updates to the catalogue via IContentRepository, I would also recommend turning off validation using EPiServer.DataAccess.SaveAction.SkipValidation, as described here: https://world.episerver.com/documentation/developer-guides/CMS/Content/Validation/
This can save a lot of processing time so that catalog updating is faster and consumes less resources.
Or even better, use the batch API https://world.episerver.com/blogs/Quan-Mai/Dates/2019/10/new-simple-batch-saving-api-for-commerce/
Thanks for your posts, everyone. I really like the idea of the batch API, but I'm not sure it will work for us for a couple of reasons:
1) We are loading products from a JSON file stored in Azure, and we're notified via Azure Service Bus when the file exists. Our code needs to know how many products within the file were successfully loaded so that we can restart the job from a particular point in the file (we have a restartable job that picks up where it left off).
2) We have quite complicated products (similar to bundles and packages) that can't be created until their child products have been upserted.
It's certainly possible but would need further investigation...
Really appreciate everyone's input — really useful and informative answers.
Can you call an additional endpoint from the service bus just after the file notification but before the import starts?
If so, create an HTTP endpoint that disables automatic indexing (using the code above) and re-enables it when the job ends (whatever the nature of the end).
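A rough sketch of such an endpoint, using an ASP.NET Web API controller. The controller and route names are illustrative, and in a real deployment you would want to secure these actions so only the import pipeline can call them:

```csharp
using System.Web.Http;
using EPiServer.Find.Cms;

// Hypothetical controller exposing on/off switches for evented indexing,
// called by the import pipeline before and after processing the file.
public class IndexingToggleController : ApiController
{
    [HttpPost]
    public IHttpActionResult Disable()
    {
        EventedIndexingSettings.Instance.EventedIndexingEnabled = false;
        EventedIndexingSettings.Instance.ScheduledPageQueueEnabled = false;
        return Ok();
    }

    [HttpPost]
    public IHttpActionResult Enable()
    {
        EventedIndexingSettings.Instance.EventedIndexingEnabled = true;
        EventedIndexingSettings.Instance.ScheduledPageQueueEnabled = true;
        return Ok();
    }
}
```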
I have an import job that processes a .csv file with 2+ million lines each night. We use Azure Table Storage and Azure Queues to compute the delta import and handle queueing.
When processing queue items we use the same approach as @Surjit mentioned above:
// before starting
EPiServer.Find.Cms.EventedIndexingSettings.Instance.EventedIndexingEnabled = false;
// when done
EPiServer.Find.Cms.EventedIndexingSettings.Instance.EventedIndexingEnabled = true;
To speed up the save process we skip validation and keep just the latest version:
var publishAndClearAction = SaveAction.Publish.SetExtendedActionFlag(ExtendedSaveAction.ClearVersions);
_contentRepository.Save(itemToSave, publishAndClearAction | SaveAction.SkipValidation, AccessLevel.NoAccess);