Removed stemming and now Episerver Find can't distinguish between letters with and without diacritical marks

Adil

Hi,

I have a question regarding Episerver Find. My problem is similar to the one in this thread:
https://world.episerver.com/forum/developer-forum/EPiServer-Search/Thread-Container/2017/2/tell-find-to-skip-stemming-for-certain-words/

I want to avoid Episerver Find's stemming because it returns results that don't match the search query. This was the original problem I was trying to solve.
If I pass in the language (in this case Language.Swedish) and search for "mats" (a name), I get results such as "mat" (food), "matfisk" (food fish), "matsvinn" (food waste), "matställen" (dining places), etc.

I solved that by passing in Language.None, and now I get results that match the search query exactly, for instance "mats". But that leads to my current problem: the search can't distinguish between letters with and without diacritical marks.

If you search for "våra" (our), you get results that contain the word "våra" but also "vara" (be), because I have removed the language analysis (stemming).
Is there any way for Episerver Find to distinguish between letters with and without diacritical marks?
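
To make the symptom concrete, here is a minimal Python sketch that is independent of Episerver Find. It assumes, purely for illustration, that the analysis used with Language.None folds diacritics before comparing terms; nothing in this thread confirms how Find does this internally. The helper names (fold_diacritics, folded_match, exact_match) are hypothetical.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Decompose to NFD and drop combining marks, so 'å' becomes 'a'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def folded_match(query: str, term: str) -> bool:
    # Assumed behaviour of a diacritic-folding analyzer: compare folded forms.
    return fold_diacritics(query).casefold() == fold_diacritics(term).casefold()


def exact_match(query: str, term: str) -> bool:
    # Diacritic-sensitive comparison; NFC makes composed and decomposed
    # spellings of the same letter compare equal.
    normalized_query = unicodedata.normalize("NFC", query).casefold()
    normalized_term = unicodedata.normalize("NFC", term).casefold()
    return normalized_query == normalized_term


terms = ["våra", "vara", "vår"]
folded_hits = [t for t in terms if folded_match("våra", t)]
exact_hits = [t for t in terms if exact_match("våra", t)]

print("folded:", folded_hits)
print("exact: ", exact_hits)
```

With folding, "våra" and "vara" collapse to the same term, which is the behaviour described above; a diacritic-sensitive comparison keeps them apart.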

#219211

Edited, Mar 31, 2020 9:20

Ravindra S. Rathore

Hi Adil,

Have you tried this?

Restricting search results to diacritic characters

When you submit a free-text search that contains diacritic characters, Search & Navigation results include both the diacritic and non-diacritic versions of the same character. For instance, the free-text search 'Ånge' returns Ange, Änge, etc.

To have the basic search return results that include only the submitted diacritic characters, follow these steps.

1. When the Search & Navigation index is created, include the language that contains the diacritic character (for example, Swedish).

2. Add that language to the search query. For example:

SearchClient.Instance.Search<PageData>(Language.Swedish)...

More info:

https://world.episerver.com/forum/developer-forum/EPiServer-Search/Thread-Container/2018/10/search-removesignores-diacritics-e-g-229228246/

https://world.episerver.com/documentation/developer-guides/search-navigation/getting-started/adding-search-functionality/

#219216

Edited, Mar 31, 2020 12:05

Adil

I sent the question to Episerver support, and they gave this answer:

Hi Adil,

I can confirm that this is an issue with the Find backend Swedish analyzer.
I've created a ticket to the Find dev team. Usually this can be fixed. I'll keep you updated.

Using Language.None (the standard analyzer) is not recommended. It has other drawbacks as well.

Thanks,
Daniel Dahlin
Team Lead, Application Support EMEA

#223682

Jun 02, 2020 12:04
