More like

Use the MoreLike method to find documents whose text content is "like" a given string. This functionality is typically used for, but not limited to, finding related documents or objects.

Examples

A simple example can look like this:

searchResult = client.Search<BlogPost>()
    .MoreLike("guitar")
    .GetResult();

After invoking the MoreLike method, you can customize the search query with several methods. For instance, if you do not have many documents with similar content, you probably want to lower the minimum document frequency. Words that occur in fewer documents than this threshold are ignored; the default is five.

searchResult = client.Search<BlogPost>()
    .MoreLike("guitar")
        .MinimumDocumentFrequency(1)
    .GetResult();

A full list of extension methods for customizing the query follows below. First, though, consider an example of finding documents "related" to a given document. Assuming you indexed two BlogPosts with similar content, you can search for documents similar to the first and expect the second to be returned, using a query such as this:

var firstBlogPost = //Some indexed blog post about guitars
var secondBlogPost = //Another blog post about guitars

searchResult = client.Search<BlogPost>()
    .MoreLike(firstBlogPost.Content)
        .MinimumDocumentFrequency(1)
    .Filter(x => !x.Id.Match(firstBlogPost.Id))
    .GetResult();

📘
Note
When you issue these types of queries, consider caching the results, because they are not likely to change very often. Even if they do, a few minutes' delay might not matter.
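
As a rough sketch of the caching advice above, the related-documents query could be cached for a few minutes. This assumes your version of the client library provides a StaticallyCacheFor extension method that accepts a TimeSpan; that method is not described on this page, so check your API reference before relying on it. The five-minute duration is only an illustrative value.

```csharp
using System; // for TimeSpan

// Sketch only: StaticallyCacheFor is an assumption, not documented on this page.
// client and firstBlogPost are the same objects as in the previous example.
var searchResult = client.Search<BlogPost>()
    .MoreLike(firstBlogPost.Content)
        .MinimumDocumentFrequency(1)
    .Filter(x => !x.Id.Match(firstBlogPost.Id))
    .StaticallyCacheFor(TimeSpan.FromMinutes(5)) // illustrative duration
    .GetResult();
```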

Customize the query

Because the nature of content can differ greatly between indexes and types, it is often a good idea to experiment with the available settings after invoking the MoreLike method. The following methods can be called to customize the query. See also the Elasticsearch guide.

MinimumDocumentFrequency – The minimum number of documents in which a word must occur. Words that occur in fewer than this many docs are ignored. The default is 5.
MaximumDocumentFrequency – The maximum number of documents in which a word may occur. Words that appear in more than this many docs are ignored. The default is unbounded.
PercentTermsToMatch – The percentage of terms to match on. The default is 30 (percent).
MinimumTermFrequency – The frequency below which terms are ignored in the source doc. The default is 2.
MinimumWordLength – The word length below which words are ignored. The default is 0.
MaximumWordLength – The word length above which words are ignored. The default is unbounded (0).
MaximumQueryTerms – The maximum number of query terms included in any generated query. The default is 25.
StopWords – A list of words considered "uninteresting" and ignored.
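
The settings listed above can be combined in a single query. The following is a minimal sketch, assuming that each of these methods takes a single integer argument and can be chained after MoreLike in the same way as MinimumDocumentFrequency in the earlier examples. The values are illustrative only, not recommendations.

```csharp
// Sketch only: integer argument types are assumed by analogy with
// MinimumDocumentFrequency(1); verify the signatures in your client library.
var searchResult = client.Search<BlogPost>()
    .MoreLike(firstBlogPost.Content)
        .MinimumDocumentFrequency(1) // keep words found in at least one document
        .MinimumTermFrequency(1)     // keep words that occur only once in the source text
        .MinimumWordLength(3)        // ignore words shorter than three characters
        .MaximumQueryTerms(10)       // use at most ten terms in the generated query
    .GetResult();
```

Lowering the frequency thresholds tends to help with small indexes or short source texts, where few words reach the defaults of 5 and 2.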
