Looks like this feature has also been requested in the past. See https://world.episerver.com/forum/developer-forum/Feature-requests/Thread-Container/2017/1/be-able-to-filter-out-stopwords-for-all-search-not-only-morelike/.
Thanks Bob for the reply. I will continue to strip out words with Regex in the meantime, though it's not an ideal solution.
Does anyone at Episerver know whether this feature will be picked up?
Hey Janaka
The MoreLikeThis query has a StopWords method. Perhaps you can look at how it's implemented and reuse the implementation for a standard search?
David
Hi David
Thanks for that. I took a look through this and I can see that the stop words applied to a MoreLikeQuery get sent in the JSON. However, this seems to be quite tightly coupled to the MoreLike query. I tried creating my own query type, but it didn't seem to have any effect on the main typed search.
It looks like it's too internal to the standard typed search to be extended at this point, though my experience here is limited. Probably best left for the Find team.
In the end I was able to come up with a solution that works.
I have created an extension method to remove the stop words from the query using Regex and then apply my standard typed search.
querySearch.For(query.RemoveStopWords()).InField(x => x.DisplayName)
I just ensure that the full search term is tracked.
querySearch.Track(new[] {query});
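The extension method itself isn't posted above, so here is a minimal sketch of what a RemoveStopWords helper could look like. It assumes the stop-word list is just the three English articles discussed in this thread; the class name QueryExtensions and the fallback for queries made up only of stop words are illustrative choices, not part of Episerver Find or the original solution.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QueryExtensions
{
    // Assumed stop-word list: only the English articles mentioned in this thread.
    private static readonly Regex StopWords = new Regex(
        @"\b(?:the|a|an)\b",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Collapses the gaps left behind once stop words are replaced.
    private static readonly Regex ExtraWhitespace = new Regex(
        @"\s+",
        RegexOptions.Compiled);

    public static string RemoveStopWords(this string query)
    {
        if (string.IsNullOrWhiteSpace(query))
        {
            return query;
        }

        // Replace each whole-word match with a space, then normalise whitespace.
        var stripped = StopWords.Replace(query, " ");
        stripped = ExtraWhitespace.Replace(stripped, " ").Trim();

        // If the query consisted only of stop words, search for it unchanged
        // rather than sending an empty query.
        return stripped.Length == 0 ? query : stripped;
    }
}
```

With this sketch, "the hunger games".RemoveStopWords() returns "hunger games". Because the match is on word boundaries, a hyphenated term such as "A-Team" would also lose its leading "A", so the pattern may need tightening for a real catalogue.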
In my free-text search I'm getting a lot of extra results when the search phrase includes an article. The English articles are "the", "a" and "an".
Doing a search for the book "hunger games" produces 12,575 results.
Doing a search for "the hunger games" produces 414,524 results.
While the majority of the top products are the same, there are some differences in the results when the word "the" is included.
So far the only way I can see to trim this is to remove any articles from the query before executing the search. Is there a way for Find to do this through code or configuration? I don't want to use AND instead of OR for the search, as I don't want to restrict the results too much.
The search for my query looks like this: