Infuriating search not indexing all page string fields

Has anybody had any experience of Episerver Search not indexing all the text fields in a page? I have a text field in my SitePageData which stubbornly refuses to show up in any searches. I have tried changing it to an XhtmlString to see if that makes any difference, but it does not. The field is updated by a scheduled task which collects all the text data from any blocks within the page. I know that the field is getting updated, because the function that populates the fields can also, when called with a different parameter, read them back out to a log.

The site is supposed to go live on 30th Dec, so I am a bit worried.

Thanks in advance,

Marshall

#173440
Dec 27, 2016 11:06

Hi Marshall,

So, I assume you are talking about the built-in Episerver Search? It is annoying that it doesn't index everything, but I think that is on purpose because they want people to use Find. Anyway, here is what you do to index custom fields.

First, create an init module:

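    using System;
    using EPiServer.Core;
    using EPiServer.Framework;
    using EPiServer.Framework.Initialization;
    using EPiServer.Search.IndexingService;
    using Lucene.Net.Documents;

    [InitializableModule]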
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class SearchInitialization : IInitializableModule
    {
        public void Initialize(InitializationEngine context)
        {
            IndexingService.DocumentAdding += CustomizeIndexing;
        }

        public void Uninitialize(InitializationEngine context)
        {
            IndexingService.DocumentAdding -= CustomizeIndexing;
        }

        void CustomizeIndexing(object sender, EventArgs e)
        {
            var addUpdateEventArgs = e as AddUpdateEventArgs;

            if(addUpdateEventArgs == null)
            {
                return; //Document is not being added/updated
            }

            var document = addUpdateEventArgs.Document;

            // GetContent (extension method shown below) loads the content item behind this index document
            var examplePage = document.GetContent<IContent>() as SitePageData;

            // Only pages of type SitePageData with a value in the custom field get the extra index field
            if(examplePage != null && examplePage.YourCustomField != null)
            {
                document.Add(new Field("CUSTOM_FIELD", examplePage.YourCustomField, Field.Store.NO, Field.Index.ANALYZED));
            }
        }
    }

Here is the GetContent extension method that the code above uses to load the content item for the document:

    public static class DocumentHelper
    {
        public static T GetContent<T>(this Document document) where T : IContent
        {
            const string fieldName = "EPISERVER_SEARCH_ID";

            var fieldValue = document.Get(fieldName);

            if(string.IsNullOrWhiteSpace(fieldValue))
            {
                throw new NotSupportedException(
                    string.Format("Specified document did not have a '{0}' field value", fieldName));
            }

            var fieldValueFragments = fieldValue.Split('|');

            Guid contentGuid;

            if(!Guid.TryParse(fieldValueFragments[0], out contentGuid))
            {
                throw new NotSupportedException(
                    "Expected first part of ID field to be valid GUID");
            }

            return ServiceLocator.Current.GetInstance<IContentLoader>().Get<T>(contentGuid);
        }
    }

Here is an example of a search query using the custom field:

    public void Search(string q)
    {
        // Only referenced by the commented-out culture filter further down
        var culture = ContentLanguage.PreferredCulture.Name;
        SearchResult = new List<IndexResponseItem>();

        var query = new GroupQuery(LuceneOperator.AND);

        // Only search for pages
        query.QueryExpressions.Add(new ContentQuery<PageData>());

        // Search for keywords in any of the fields specified below (OR condition)
        var keywordsQuery = new GroupQuery(LuceneOperator.OR);

        // Search in default fields
        keywordsQuery.QueryExpressions.Add(new FieldQuery(q));

        // Search in the custom fields
        keywordsQuery.QueryExpressions.Add(new CustomFieldQuery(q, "CUSTOM_FIELD"));

        query.QueryExpressions.Add(keywordsQuery);

        // The access control list query will remove any pages the user doesn't have read access to
        var accessQuery = new AccessControlListQuery();
        accessQuery.AddAclForUser(PrincipalInfo.Current, HttpContext.Current);
        query.QueryExpressions.Add(accessQuery);

        var fieldQueryResult = SearchHandler.Instance.GetSearchResults(query, 1, 40)
            .IndexResponseItems
            //.Where(x =>
            //    (x.Culture.Equals(culture) || string.IsNullOrEmpty(x.Culture))
            //    )
            .ToList();

        SearchResult.AddRange(fieldQueryResult);
    }

Lastly, when you update the field in your scheduled job, update the index:

    var contentSearchHandler = ServiceLocator.Current.GetInstance<ContentSearchHandler>();
    contentSearchHandler.UpdateItem(currentPage as IContent);

I hope this helps!

- John

#173537
Edited, Dec 30, 2016 20:20