We have noticed that some words are registered as synonyms without us adding them manually. One example is "rus" (Norwegian for "intoxication") and "fyll" (Norwegian for "binge drinking", and also for "fill" as in "fill in the form").
Is this documented anywhere, and how do we get rid of it?
Just guessing here, but I think it might be because of the decompound analyzer. Try using the standard synonym analyzer:
.For("my query", query => query.Analyzer = Language.Norwegian.SynonymAnalyzer)
Thanks Per Magne!
I tried your solution, but it made my search use ONLY the "magic" synonym analyzer and not the synonyms defined in EPiServer CMS Find Admin.
I use code similar to the following:
var query = searchClient.UnifiedSearch()
This gives me word decompound synonyms, the manual synonyms I have defined in EPiServer CMS Find Admin, and unwanted synonyms whose origin I can't work out. Examples of unwanted synonyms: searching for "rus" gives results for "fyll", searching for "døv" gives results for "lurer", etc. As a side effect, when I define "resirkulering" as a synonym for "kildesorting", the synonym doesn't work at all.
If I call UnifiedSearch() with Language.None like this:
var query = searchClient.UnifiedSearch(Language.None)
all my custom synonyms work, no strange synonyms, but I lose word decompound synonyms.
We have this issue as well. Some synonyms have magically appeared and we can't get rid of them (and they are not shown in the synonym interface). On top of that, the generated synonyms are ridiculous.
I tried to change the analyzer as mentioned above, but to no avail. There must be an option to turn this off? Our query looks similar to this:
searchResult = _client.Search<ISearchContent>()
Did you find a solution to this?
This causes problems when using auto-boosting as well, since irrelevant synonyms will receive the same boosting.
Guessing this is still an issue? Did anyone find a way to disable "magic" synonyms when using UnifiedSearch?
var lang = Languages.GetSupportedLanguage(ContentLanguage.PreferredCulture); //Norwegian
var query = SearchClient.Instance.UnifiedSearch(lang)
Any solution to this yet? I am searching (unified search) for «seng» (the Norwegian word for «bed») and am getting search results for «bedrifter» (the Norwegian word for «companies»). This makes no sense.
If I turn off synonyms, the problem goes away. But then I lose the synonyms we have defined in the Episerver Find UI. I do not want to lose those synonyms.
I have tried this, but then I lose stemming. When searching for «bedrift», I want the search results to include the plural form «bedrifter».
SearchClient.Instance.UnifiedSearch().For(model.Query, q => q.Analyzer = Language.Norwegian.Analyzer);
There is a built-in synonym list for Norwegian. Since the end of last year there has been a feature request related to this: "FIND-6651 Be able to turn off or manage built-in synonyms". Currently there is no ETA on this, but I've sent an inquiry.
Is there no workaround?
Is this built-in synonym list only used for Unified search? Or «normal» search too?
No workaround, other than disabling the use of synonyms (removing .UsingSynonyms()), which perhaps is not a workaround. The built-in list is used whenever the synonym analyzer is used, which means whenever you call .UsingSynonyms().
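For reference, a minimal sketch of what toggling synonyms looks like in code, assuming the EPiServer.Find package is referenced and an index is configured. SynonymToggleExample, Search and the useSynonyms flag are hypothetical names made up for this illustration. The namespaces in the using directives are what I would expect, but they may differ between Find versions.

```csharp
using EPiServer.Find;
using EPiServer.Find.Framework;      // SearchClient.Instance (assumed namespace)
using EPiServer.Find.UnifiedSearch;  // UnifiedSearchResults (assumed namespace)

// Hypothetical helper, not part of Episerver Find.
public static class SynonymToggleExample
{
    public static UnifiedSearchResults Search(string queryText, bool useSynonyms)
    {
        // Language.Norwegian keeps language-specific analysis such as stemming.
        var query = SearchClient.Instance
            .UnifiedSearch(Language.Norwegian)
            .For(queryText);

        if (useSynonyms)
        {
            // Applies the synonym analyzer: per the reply above, this brings in
            // both the synonyms defined in the Find UI and the built-in list.
            return query.UsingSynonyms().GetResult();
        }

        // Without UsingSynonyms() neither the Find UI synonyms
        // nor the built-in ones are applied.
        return query.GetResult();
    }
}
```

As far as this thread goes, it is all or nothing: there is no switch shown here that keeps the Find UI synonyms while dropping the built-in ones.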
I need the synonyms that are added in the Episerver Find UI, so that is not an option.
Thank you dada - I am having problems with EPiServer.Labs.Find.ImprovedSynonyms.
My code, copied from your example at github:
UnifiedSearchResults results = SearchClient.Instance.UnifiedSearch(Language.English)
Build error: Error CS1061 'IQueriedSearch<ISearchContent, QueryStringQuery>' does not contain a definition for 'UsingSynonymsImproved' and no extension method 'UsingSynonymsImproved' accepting a first argument of type 'IQueriedSearch<ISearchContent, QueryStringQuery>' could be found (are you missing a using directive or an assembly reference?). I have both the reference and the using directive.