The built-in EPiServer search functionality described in a non-technical way
I recently got a question about what you can do with the built-in search in EPiServer CMS 6 R2 to optimize the search experience for users. The question originally came from a customer who did not want to buy and implement an external search engine before knowing that they had taken the out-of-the-box functionality as far as possible.
Since search is not one of my stronger areas, I had a look here on World and in the SDK and found technical documentation scattered in several places, but I couldn’t find a good summary. Also, the customer is a web editor, and the documentation in the developer guides is far too technical. Web editors do not talk code.
So, challenge taken, here goes:
The built-in search function in CMS 6 R2 supports:
- free text search on text in pages and files
- exact or partial matching: you can specify whether a result must exactly match the search phrase that was typed in, or whether a word that contains what was typed in also counts as a match (e.g. a match on "plant" when you typed in "ant")
- property-based search (for example, searching for pages of a specific type or pages that have a particular property set to a specific value)
A specific question on that was: in CMS 6, using the basic built-in search functionality, can adding categories and setting them on pages improve the search functionality or experience?
Category is a built-in property, hence it is possible to do a property-based search on it, just like you would when filtering on page type, publish date or something else.
Given that categories are added, you could then either have a fixed category that the search results are filtered by, or build a function where the user can select a category to filter on.
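For readers who do talk code, here is a minimal sketch of the ideas above: exact versus "contains" matching, and filtering the results on a category. It is plain Python over a made-up list of pages, not EPiServer's API. The `Page` class, the `search` function and the sample data are all assumptions made for illustration, and the sketch only handles single-word search phrases.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    # Hypothetical stand-in for a CMS page; not an EPiServer type.
    name: str
    text: str
    categories: set = field(default_factory=set)


def search(pages, phrase, exact=True, category=None):
    """Return names of pages matching a single-word phrase, optionally limited to one category."""
    phrase = phrase.lower()
    hits = []
    for page in pages:
        words = page.text.lower().split()
        if exact:
            matched = phrase in words                  # whole-word match only
        else:
            matched = any(phrase in w for w in words)  # "ant" also matches "plant"
        if matched and (category is None or category in page.categories):
            hits.append(page.name)
    return hits


# Made-up sample pages.
pages = [
    Page("Gardening", "how to water a plant", {"Garden"}),
    Page("Insects", "the ant is a small insect", {"Nature"}),
    Page("Factory", "our new plant opened", {"News"}),
]

print(search(pages, "ant", exact=True))         # only the page containing the word "ant"
print(search(pages, "ant", exact=False))        # also pages containing "plant"
print(search(pages, "plant", category="News"))  # "plant" hits, limited to the News category
```

The category filter is just one more condition on each hit, which is why a fixed category and a user-selected category are equally easy to support in this sketch.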
There is a free "extended" search function available called "EPiServer Search". It is not included in CMS 6 by default but lives in the EPiServer framework. It can be seen in action in the Online Center search (in the global menu). It is possible to add EPiServer Search to your CMS 6 site to extend the search functionality, and from EPiServer 7 it is integrated in the CMS product.
Implementing the functionality in EPiServer Search does require developer resources and does take time, so it is not as "free" as you might think. Implementing EPiServer Find is much easier, and developers love it, but Find is a separately licensed product :).
The following is possible to do when EPiServer Search has been added to the CMS 6 project (or if you have EPiServer 7 CMS):
- The main gain is in performance, since EPiServer Search is an indexed search function. Indexed content is retrieved faster than content that has to be fetched from a database. One of my colleagues has a brilliant way of explaining it: "Looking for non-indexed content is like looking for clean socks in the morning and starting your search in the kitchen, then the bathroom, then the bedroom, then the wardrobe in the bedroom. If the content is indexed you know to go to the wardrobe directly, and you even know which shelf in the wardrobe."
- Global search on all types of content, such as video, images etc. Not only pages and files. (There is one exception: you will not get hits on content in blocks that are added in a Content Area. This is usually not a problem though, since content in a block is often promoting content in a page, and that page will be found when performing the search. Searching in blocks is supported in EPiServer Find.)
- Static facets. Simply put, facets are a type of filter or grouping. Faceted searching means that, after getting the first query from the user, you present a number of groups (facets) of content that the user can limit their search to.
  - Searching on categories and content types
  - Define which facets you want to filter on
- Event driven indexing
  - Instead of crawling through all HTML content on the site and indexing it, you push the content you want to index to the search service. This gives better performance.
- The search can be extended to include content from external systems.
- The search results are filtered based on access rights, so that visitors will only see the content that they have permission to see.
- Built-in filtering to limit the search results to the type of content you are after. For example blog, forum, club etc.
- The search service is pluggable, i.e. can be exchanged for something else. You can replace the built in Lucene indexing with another index without needing to change the code on the client.
- Stemming in English (i.e. treating words formed by adding a prefix, suffix, or plural ending as variants of the same word). An example of stemming would be taking the word “plant” and making additional keywords out of words such as “planter”, “planting”, “plants”, “planted”. If you search on “plant” and the search engine uses stemming, you will get hits for each of the stemmed words.
- Instant Search: The most important information is shown as you type your search phrase and becomes more specific the more you type in. You can see this behavior in, for example, Google Search.
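The sock analogy and the facet idea from the list above can be made concrete with a small sketch. This is a toy index in plain Python, not how EPiServer Search or Lucene is implemented. The names `build_index`, `lookup` and `facet_counts`, as well as the sample documents, are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical sample content: id -> (content type, text).
DOCUMENTS = {
    1: ("page", "clean socks in the wardrobe"),
    2: ("file", "wardrobe assembly instructions"),
    3: ("page", "kitchen opening hours"),
}


def build_index(documents):
    """Build the index once: map each word to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, (_, text) in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def lookup(index, word):
    """Go straight to the right 'shelf' instead of scanning every document."""
    return sorted(index.get(word.lower(), set()))


def facet_counts(documents, doc_ids):
    """Group the hits by content type so the user can narrow the search."""
    return Counter(documents[doc_id][0] for doc_id in doc_ids)


index = build_index(DOCUMENTS)
hits = lookup(index, "wardrobe")
print(hits)                                 # ids of documents mentioning "wardrobe"
print(dict(facet_counts(DOCUMENTS, hits)))  # number of hits per content type
```

The work of reading every document happens once, when the index is built. Each search afterwards is a single lookup, which is where the performance gain comes from, and the facet counts are simply the hits grouped by a property.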
It would be great to have some actual implementation examples, where each example has:
1) a summary of which of the above features were used in the implementation, and
2) a clear description of the functionality that the user experiences, so that it becomes even easier to explain.
I don’t have any such examples at the moment, but if you do, please share them!
Thanks for a great post!
One thing that you haven't mentioned, and that is usually the first thing I mention to clients when asked about the search improvements with EPiServer Search, is that the page search and the file search now share the same search behaviour, as they no longer use different types of indexes. This means that they both support the same syntax, e.g. they both understand quoted phrases. It also means that the relevance ranking between files and pages is useful again.
Thanks Henrik for adding a very valid point that I had overlooked!
I recently got new information about EPiServer Search and have updated this blog post based on it:
EPiServer Search does not support stemming very well, and the process of replacing the indexing service is sparsely documented and very complicated. Hence neither of them really belongs in the list of supported features.