EPiServer Full Text Search Now Available For CMS 6 R2
You may or may not be aware that the EPiServer Relate product ships with a full text search service. I am pleased to announce that this is now being made available to all CMS 6 R2 customers as a separate download here.
The search service is split into 2 parts:
- A client API to push content into the indexing server and obtain search results
- A REST-based web service which wraps Lucene.net to provide the default indexing service implementation.
The idea is that the REST-based web service can be replaced with another implementation wrapping a different indexing engine that supports the push model, without having to change the site code that pushes content and performs search requests.
Getting Started with EPiServer FTS
1. Download the EPiServer Search Windows Installer and run it on the desired machine
2. Deploy the Indexing Service using EPiServer Deployment Center. The default service is implemented as a Windows Communication Foundation service with an SVC file endpoint. The service can be deployed on a new or existing ASP.NET-enabled web site. Note that an existing site in this context does not need to be an EPiServer CMS site, although this works just as well:
Whilst experimenting with it, I would recommend installing the service on the same EPiServer CMS site that will be its client:
You need to provide the path where the indexing service will store its index files. This should not be under a website root. I recommend using the EPiServer CMS site’s VPP folder with an “Index” subfolder. Note that this folder does not need to exist at this time:
You should now be ready to deploy:
When the deployment is finished the index service target website’s configuration file should have been updated with the following element:
<episerver.search.indexingservice>
  <clients>
    <add name="local" description="local" allowLocal="true" readonly="false" />
  </clients>
  <namedIndexes defaultIndex="default">
    <indexes>
      <add name="default" directoryPath="C:\EPiServer\VPP\MyEPiServerSite3\Index" readonly="false" />
    </indexes>
  </namedIndexes>
</episerver.search.indexingservice>
You now need to update the web.config of the EPiServer CMS site that is going to be the client to the installed index service (in the example above, it’s the same site). The file should have an episerver.search element like this:
<episerver.search active="false">
  <namedIndexingServices defaultService="">
    <services>
      <!--<add name="{serviceName}" baseUri="{indexingServiceBaseUri}" accessKey="{accessKey}"/>-->
    </services>
  </namedIndexingServices>
  <searchResultFilter defaultInclude="true">
    <providers />
  </searchResultFilter>
</episerver.search>
The “active” attribute needs to be set to true, the “services” element needs to have a child entry identifying the deployed index service and the defaultService attribute needs to identify the service instance:
<episerver.search active="true">
  <namedIndexingServices defaultService="EPiServerFTS">
    <services>
      <add name="EPiServerFTS" baseUri="http://localhost:17007/IndexingService/IndexingService.svc" accessKey="93CC0599-3D95-4B30-8F25-4B8A21D289A0"/>
    </services>
  </namedIndexingServices>
  <searchResultFilter defaultInclude="true">
    <providers />
  </searchResultFilter>
</episerver.search>
Note that I generated a GUID for the “accessKey” attribute. This value needs to be copied into the indexing service configuration, as the “name” of a client entry, to authorize access from this client:
<episerver.search.indexingservice>
  <clients>
    <add name="93CC0599-3D95-4B30-8F25-4B8A21D289A0" description="local" allowLocal="true" readonly="false" />
  </clients>
  <namedIndexes defaultIndex="default">
    <indexes>
      <add name="default" directoryPath="C:\EPiServer\VPP\MyEPiServerSite3\Index" readonly="false" />
    </indexes>
  </namedIndexes>
</episerver.search.indexingservice>
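A mismatch between the client's accessKey and the service's client name is an easy mistake to make when editing two configuration files by hand. The following is a minimal sketch of a sanity check, not part of EPiServer or the FTS Facade: AccessKeyCheck, NewAccessKey and KeysAreAuthorized are hypothetical names, and the check assumes the key must match the client name exactly, as in the example above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of EPiServer: compares the two config sections shown above.
public static class AccessKeyCheck
{
    // Produces an upper-case GUID string in the same shape as the key used in the example.
    public static string NewAccessKey()
    {
        return Guid.NewGuid().ToString("D").ToUpperInvariant();
    }

    // Returns true when every accessKey found under <episerver.search>
    // also appears as a client name under <episerver.search.indexingservice>.
    // Assumption: the match is exact (ordinal), as in the article's example.
    public static bool KeysAreAuthorized(string clientConfigXml, string serviceConfigXml)
    {
        var clientKeys = XDocument.Parse(clientConfigXml)
            .Descendants("episerver.search")
            .Descendants("services")
            .Elements("add")
            .Select(e => (string)e.Attribute("accessKey"))
            .Where(k => !string.IsNullOrEmpty(k))
            .ToList();

        var authorizedNames = XDocument.Parse(serviceConfigXml)
            .Descendants("episerver.search.indexingservice")
            .Descendants("clients")
            .Elements("add")
            .Select(e => (string)e.Attribute("name"))
            .ToList();

        // No keys at all means nothing is configured, which is treated as a failure.
        return clientKeys.Count > 0 && clientKeys.All(k => authorizedNames.Contains(k));
    }
}
```

Both arguments can be either the bare section or a whole web.config, since the lookup searches descendants rather than assuming a particular root element.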
Populating the index
Your site should now be ready to start feeding data into the index. You can of course write this code from scratch, but for those of you who want a base to get started from, you can download the EPiServer FTS Facade code I have developed. This is available on the EPiServer NuGet feed as EPiServer.Samples.FTSFacade.1.0.nupkg. Those of you using Visual Studio 2008 can get the code here.
The NuGet package includes:
1. A client facade class which abstracts away some of the work you need to do when talking to the EPiServer Search Client API
2. An initialization module which on first run will traverse the CMS page tree in order to push the page’s searchable properties and metadata to the index. The “Documents”, “Global” and “PageFiles” VPP folders are also enumerated to push information about the files stored in them to the index.
3. Event handlers so that whenever a page or file is added, updated or removed, the relevant information is pushed to the index
4. A class of extension methods for the EPiServer.Search.IndexResponseItem class to obtain the CMS page or file that the search result is for and methods to generate URLs, preview and title text for a search result
5. An Online Center Search Provider to provide search results from the index when searching in EPiServer Online Center
6. A “Re-Index” Online Center Gadget which gives Administrators the ability to trigger a re-index of the site’s pages and files in the same way as the initialization module does. Note that you will need to update the web.config file with the name of your assembly in the episerver.shell/publicModules section:
<configuration>
  <episerver.shell>
    <publicModules>
      <add name="ReIndex" resourcePath="~/FTSFacade/Gadgets/ReIndex/">
        <assemblies>
          <add assembly="your assembly name goes here" />
        </assemblies>
      </add>
    </publicModules>
  </episerver.shell>
</configuration>
The next time the site is recompiled and started, the initial indexing should run. Note that indexing is done asynchronously using a persistent queue, so it normally takes a few seconds before you see anything happening in the index folder you specified when you installed the indexing service.
The code installed in the FTSFacade folder can now be modified to suit the needs of your site.
Your site’s search page can now be wired up to the FTSFacade’s CmsSearchHandler.GetSearchResults method:
public SearchResults GetSearchResults(
    string searchTerm,
    bool includeAuthorNamesInSearch,
    bool includeFilesInSearch,
    int page,
    int pageSize);
The SearchResults object returned is defined as follows:
public class SearchResults
{
    public Collection<IndexResponseItem> IndexResponseItems { get; }
    public int TotalHits { get; internal set; }
    public string Version { get; internal set; }
}
You can populate your search results page with the IndexResponseItems. As mentioned above, the FTSFacade includes extension methods for IndexResponseItem to obtain CMS pages and files for the search result.
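When binding IndexResponseItems to a results page, the pager usually needs a page count derived from TotalHits and the page size. Here is a minimal sketch of that arithmetic. SearchPaging, PageCount and ClampPage are hypothetical helper names that are not part of the FTSFacade, and the sketch assumes the page argument is 1-based, as a reader comment further down reports.

```csharp
using System;

// Hypothetical paging helper for a search results page; not part of the FTSFacade.
public static class SearchPaging
{
    // Number of pages needed to show totalHits items at pageSize items per page.
    public static int PageCount(int totalHits, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");
        if (totalHits <= 0) return 0;
        return (totalHits + pageSize - 1) / pageSize; // integer ceiling division
    }

    // Clamps a requested page number (for example from a query string)
    // into the valid range, assuming pages are numbered from 1.
    public static int ClampPage(int requestedPage, int totalHits, int pageSize)
    {
        int lastPage = Math.Max(1, PageCount(totalHits, pageSize));
        return Math.Min(Math.Max(requestedPage, 1), lastPage);
    }
}
```

The clamped value would then be passed as the page argument to GetSearchResults, with TotalHits from the previous response driving the pager links.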
Disclaimer: The code in the FTS Client Facade is provided “as is” and should be thoroughly tested before being deployed in a commercial project.
The following tech-notes contain more information about the EPiServer FTS Client and Service:
Full Text Search Client Configuration
Full Text Search Product Integration
Full Text Search Service Configuration
Happy searching!
Awesome blog post!!
This is really exciting :) Thanks for the info and the nuget feed!
Very cool! Much needed love for the search engine :)
Much love for the client facade, and especially the reindex gadget. Can you build a reindexer for Relate too? :)
The facade zip file seems to be missing in action
(http://world.episerver.com/PagesFiles/104493/FTSFacade.zip)
Also, you should probably make whoever manages to make this work with Composer an EMVP :)
Magnus P: I believe there's a reindexer for Relate in the Relate upgrade scripts. It is described here: http://world.episerver.com/Articles/Items/Migrate-EPiServer-Relate-1-to-EPiServer-Relate-2/
Thanks Kristoffer, I'll look into it.
Fixed the link to the code now!
Hi Paul. Great article. FTS looks really interesting.
I have some additional questions about FTS and the new search functionality provided by EPiServer. I have started a thread here: http://world.episerver.com/Modules/Forum/Pages/thread.aspx?id=43021
Also I can't seem to find any documentation about the EPiServer.Search.Queries.Lucene. You don't happen to know if there is any, do you?
Hi Paul,
I am currently working on an EPiServer Commerce project and am looking for a way to implement relevance on different fields. For instance, the title field should have far more relevance when hit by a search keyword than, let's say, the description field, and should show up higher in the search results.
Lucene provides a way to add relevance (boosting) to fields and to terms, but I can't seem to get my head around it and make it work (where to start?).
I then stumbled upon your blog and wondered if this could be used inside EPiServer Commerce and give me a way to implement the relevance functionality.
Thanks,
--Andre
Hi Paul,
And thanks for a great post.
I have tried to implement this on a site I'm currently developing for a client. I got everything up and running and the index seems to be populated nicely. However, I can't get any results returned from the search. Whatever I try searching for, the SearchResults.IndexResponseItems collection is empty.
I have checked that the Searchable property is set to true and I can see that the text entered into these properties is stored in the index. But when I search for a word that exists in the index, no results are returned. I have also tried searching directly through the FTS client like so:
FuzzyQuery query = new FuzzyQuery(searchTerm, Field.Default, (float)0.5);
return EPiServer.Search.SearchHandler.Instance.GetSearchResults(query, 0, 10);
But without any luck. I know it's difficult to answer with so little information and I guess there could be a number of reasons but any pointers would be much appreciated.
Cheers,
David
This is what I have been looking for. Great post Paul!
Hi Paul
I've got the facade code up and running and have successfully extended it to index some data I store in the Dynamic Data Store. In most cases it works fine, but I have some problems with special characters. For instance, I have a text element containing 'Lucene.Net'. It's only returned when I use an exact phrase search. Search terms like 'Lucene' and 'Lucene*' return no hits.
What is the correct pattern for handling special characters? Anything I should do when setting up the IndexRequestItem? When preparing the query text? In the FTS configuration?
Thanks,
Thomas
Hi Paul,
* It seems that the documentation you have linked to is not 100% accurate / up-to-date. There are at least some mismatches between the config files that ship with the installation and the available settings described in the documentation. Is there any more / newer documentation available?
* Our customer is running in a load-balanced environment. This raises some issues. What is best practice for configuring the search in such an environment? How is this solved in EverWeb?
* Does the new search handle Composer content and Dynamic Content?
* Are multiple named indexes supported? Or is there any other way of differentiating the search results? We want to display an extended search result for some of our users.
Thanks,
Bjørn
Hi Bjørn,
1) That documentation should be up-to-date. I failed to spot the mismatch between the configuration that comes with EPiServer Framework / EPiServer Search installations and the linked documentation. Can you please point out the specific issue, so we can fix or clarify the documentation as appropriate?
2) Having a shared instance that each client is configured to use would be the most straightforward configuration (along the lines of the examples above but "baseUri" on the client side would not point at localhost and on the service side you would explicitly specify which client hosts are allowed rather than relying on "allowLocal" alone)
3) The referenced FTS Facade does not have such functionality included, however it is possible to extend it to index additional content.
4) Unless there is any particular reason why complete separation is required I would suggest just using different search queries.
I see that the GetSearchResults method in the facade Paul blogged about takes a string for search terms (so that you can just pass on the search terms from the browser input field), but if you look at its implementation you'll see that it creates a structure of query expression objects where the search terms string is matched against some particular field and then all the other requirements relating to access rights, that the items specifically are CMS pages or files, etc, are AND:ed with that and passed on down to the search client.
I think the best approach would be to create some variation of this that filters out the right things in your different scenarios.
Regards,
Håkan
Hey David Sandeberg,
I experienced the same problem as you - the FTS would not return any results. I found the problem. Paul's comment on the page-parameter to the method GetSearchResults(...) says "The zero based index of the results page to return", which indicates that you should pass 0 if you want the first page. This is wrong. Passing 1 instead of 0 will give you the search-results.
By the way, if anybody wants to explore the index built by Lucene, I suggest installing Java and downloading Luke v0.9.9.1: http://code.google.com/p/luke/downloads/list
Hi,
Has anyone used Microsoft Dynamics CRM with EPiServer CMS 6 R2?
If so, I am getting some errors with it.
I followed the steps described at the URL below:
http://world.episerver.com/Documentation/Items/Installation-Instructions/EPiServer-Connect/Installation-Instructions---EPiServer-Connect-for-CRM-12/
Still, when I click the Test Connector button it displays:
Content webservice test failed
Metadata webservice test failed.
Can anyone please help figure it out?
Is this not available on nuget anymore?