Performance: How to disable keyword indexing in SP2
EPiServer CMS indexes the content of properties for two purposes: to extract links, and to extract keywords for the internal search engine.
If you don’t use the internal search engine, the server does a lot of unnecessary work and fills tables* in the database with keywords that are never used.
Since EPiServer CMS 5 R2 Service Pack 2 you can disable the keyword indexing by setting:
<site (..) indexingTextEnabled="false" (..) />
*) Affected tables in the database are tblKeyword and tblPageKeyword
Oh, I so wish this feature had been available for earlier versions too (like 4.61)... guess we'll have to wait until we have migrated the site.
Out of curiosity, if we're not using the built-in search, could we just "delete from tblKeyword / tblPageKeyword"? At regular intervals of course.
Does this affect all search-based features? Like the search method in the staging functionality, the searchbox in edit-mode, PageSearch control etc.
Steve: I guess that workaround should work.
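For illustration, the cleanup discussed above could be sketched roughly as follows. This is a minimal, untested sketch: only the table names come from the post, and the order of the statements rests on the assumption that tblPageKeyword references tblKeyword through a foreign key. Try it against a database backup first.

```sql
-- Sketch only: clears the keyword tables named in the post.
-- Assumption: tblPageKeyword rows point at tblKeyword rows,
-- so the referencing table is emptied first.
BEGIN TRANSACTION;

DELETE FROM tblPageKeyword;
DELETE FROM tblKeyword;

COMMIT TRANSACTION;
```

If keyword indexing is still enabled, the tables fill up again as pages are published, which is why this would have to be repeated at regular intervals.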
Jonas: This affects the SearchDataSource control which is used on the Search-template in a default installation, not in edit-mode.
Is there a way to start a reindex of all keywords? This is useful after migrating a site when, for example, the site has EN as the master language and you would like the master language to be NO. Using a tool like this (http://blog.najmanowicz.com/2009/04/06/advanced-language-manipulation-tool-for-episerver/) does not change the LanguageID in tblKeyword. When you then try to search, the SearchDataSource gives you an "Object reference not set to an instance of an object". This is caused by the page security check in public TextSearchResults SearchPages(): because the language branch differs, the page is null and an unhandled exception is thrown.
A very good thing is that the EPiServer code is not obfuscated, so it's possible to debug your way to the problems (even if, in this case, they are my own).
You can try this rather outdated tool:
http://world.episerver.com/FAQ/Items/How-do-I-reindex-all-pages-in-the-database/
Not sure it works in CMS 5, but it's worth a try, and the code should be included.
Afternoon Per, any idea why indexing is returning pages/properties marked as 'Searchable' false?
These are pages within the epi tree that are modules, and as far as I have read, setting 'Searchable' to false should suffice? We certainly do not want the user to navigate to the data/module containers. It seems like post-process filtering is the only way I can resolve it.
Regards
Only properties marked searchable will be indexed, so you may have to republish the page if you change the value after the fact.
Does this also work for CMS 5 (non SP2)? A related question: if we don't use the internal search engine, is it safe to uncheck "Searchable property"? Will this affect performance in any way? Is it possible to change the default value of "Searchable property"?