My customer reported missing search hits for specific words on certain pages of their site. What these pages had in common was the large amount of text they contained: MainBody held approximately 20 kB of text (UTF-8).
I spoke with Episerver support, and they think this happens because of the limit set in the field mappings (ignore_above) in Elastic, which is the backbone of S & N. The limit is set to 10 kB per field in the index (and to 8 kB for demo indexes on demo02.find.episerver.net). The ignore_above mapping setting tells Elastic not to index a string that is longer than the configured value, so a search cannot match it. Since the value is 10 kB, no hit is returned for this field.
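To see whether a given page's text would cross such a limit, it can help to measure the field value directly. The following is a minimal sketch, not anything provided by Episerver or Elastic: the helper names `field_size` and `exceeds_limit` are hypothetical, and the limit is passed in as a parameter because the exact number behind "10 kB" should be read from your own index mapping. Note that Elastic documents ignore_above as a character count, so the character count and the UTF-8 byte size are both reported.

```python
def field_size(text: str) -> dict:
    """Return the character count and UTF-8 byte length of a field value."""
    return {"chars": len(text), "utf8_bytes": len(text.encode("utf-8"))}


def exceeds_limit(text: str, ignore_above: int) -> bool:
    """Return True if the value is longer than the given ignore_above limit.

    Assumption: the limit is compared against the character count, as
    described in the Elastic documentation for ignore_above.
    """
    return len(text) > ignore_above


if __name__ == "__main__":
    # Hypothetical example value; replace with the real MainBody text.
    body = "lorem ipsum " * 2000
    print(field_size(body))
    # 10000 is a placeholder; use the ignore_above value from your own mapping.
    print(exceeds_limit(body, ignore_above=10000))
```

Text with many non-ASCII characters can be well under the limit in characters while being much larger in bytes, which is worth keeping in mind when comparing a "kB" figure against the mapping value.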
That seems to be correct, because when I reduce the size of the text I do get hits for this page. But(!) in another solution that also runs EpiServer.Find 13.0.5 this is not an issue, even though the ignore_above value is the same for both solutions. So I am not sure this is the issue after all. Has anybody else experienced this?
How to check the mapping values: https://[findhostname]/[privatekey]/[indexname]/*/_mapping
More on ignore_above from elastic: https://www.elastic.co/guide/en/elasticsearch/reference/current/ignore-above.html
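The mapping response can be long, so a small script makes it easier to compare the ignore_above values between two indexes. This is a rough sketch under assumptions: `find_ignore_above` and `fetch_mapping` are hypothetical helper names, the sample mapping is made up for illustration, and the real structure returned by your index may be nested differently. The walker does not depend on the exact layout; it reports every ignore_above it finds together with the path to it.

```python
import json
from urllib.request import urlopen


def find_ignore_above(node, path=""):
    """Recursively collect (path, ignore_above) pairs from a mapping dict."""
    found = []
    if isinstance(node, dict):
        if "ignore_above" in node:
            found.append((path, node["ignore_above"]))
        for key, child in node.items():
            child_path = f"{path}.{key}" if path else key
            found.extend(find_ignore_above(child, child_path))
    return found


def fetch_mapping(url):
    """Download and parse the JSON mapping from the _mapping URL."""
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # Made-up sample; in practice call fetch_mapping() with the
    # https://[findhostname]/[privatekey]/[indexname]/*/_mapping URL.
    sample = {
        "myindex": {
            "mappings": {
                "properties": {
                    "MainBody": {"type": "keyword", "ignore_above": 10000}
                }
            }
        }
    }
    for field_path, limit in find_ignore_above(sample):
        print(field_path, limit)
```

Running this against both indexes and comparing the output should show whether the limits really are identical for the fields in question.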
I know the ignore_above was higher in earlier versions but it's interesting that you say it's the same for both solutions.
The difference between the two solutions mentioned is that they live on different Epi S & N hosts: es-api01.episerver.com (searching in fields with lots of text works) and es-eu-api03.episerver.net (no hits when a field exceeds the ignore_above limit).
My best guess is that the index with the higher ignore_above value was provisioned before the change, and the one with the lower ignore_above value was set up after the change. The change is part of the templates used when creating a new index.
It looks like it was changed in late 2019, or perhaps early 2020.
Documentation is updated: https://world.episerver.com/documentation/developer-guides/search-navigation/Integration/cms-integration/Indexing/#StringLengthIndexingLimitation