Index and search for custom content in EpiServer search

Hi,

I'm trying to index some custom content in EPiServer Search, but I cannot seem to find any good documentation. I have an EPiServer page which renders content based on a routing parameter. I would like to make those "pages" searchable using EPiServer Search. My plan was to make a class which could be indexed and then search for it using the default SearchHandler.

I was planning on creating a scheduled job which reads the data from the remote API, stores it in EPiServer Search and then updates the index on a regular basis, since there will be no events when items are removed. But I don't know if this is the way I should do this, or if it's even possible, even though it's Lucene in the end, so anything should be possible I guess.

And then the search could be something like:

query.QueryExpressions.Add(new ContentQuery());
query.QueryExpressions.Add(new ContentQuery());

(This does not work, since ContentQuery requires IContent, so I'm guessing I should use another approach, but I'm a little confused about how to do this.)

Does anybody have any pointers regarding this?

Br,

Mårten

#186287
Dec 15, 2017 13:46

Unless you actually implement IContent and maybe a custom ContentProvider, there is no real gain in using the EPiServer search handler.

It would be simpler to create your own Lucene index for your custom data.

#186292
Dec 15, 2017 15:18

Maybe you can use EPiServer Find connectors:

https://world.episerver.com/documentation/Items/Developers-Guide/EPiServer-Find/9/DotNET-Client-API/Searching/configuring-find-connectors/

#186301
Dec 15, 2017 16:48

@Erik, if I were to implement IContent, how could I then accomplish what I'm trying to do, i.e. index some custom content?

I could make a custom content provider, but that's not really something that feels necessary at the moment. Would that help me in this case?

Regarding making a custom Lucene index: if I were to do this, would it be possible using the EPiServer Lucene API? And could I also query across multiple indexes when doing the search? That would be simple using Elasticsearch, but I haven't really tried using the search API for something like this before.

@Tahir, thank you for the suggestion. Unfortunately, that link is about EPiServer Find, while I'm using EPiServer Search.

Br

Mårten

#186323
Dec 17, 2017 21:31

Rather simple really, all you need to do is call:

ContentSearchHandler.UpdateItem(IContent)

Custom content provider is more useful if you want to give your editors the ability to link to the custom content when they work with pages etc.

I am not aware of anything in a content provider that EPiServer Search needs in order to index the custom content, although it might turn out to be necessary.

If you want to query the custom content together with episerver content then a custom lucene index isn't the way to go.

#186358
Dec 18, 2017 15:59

I have a solution for searching with different filters (against different properties).

First I create a SearchInitialization module that handles the indexing.

In short: if the content is of type GenericMedia, I index my custom property (Organisation).

using EPiServer.Framework;
using EPiServer.Models.Media;
using EPiServer.Search.IndexingService;
using System;
using System.Collections.Generic;
using Lucene.Net.Documents;
using System.Linq;
using System.Web;
using EPiServer.Business.Search;
using EPiServer.Core;

namespace EPiServer.Business.Initialization
{
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class SearchInitialization : IInitializableModule
    {
        public void Initialize(EPiServer.Framework.Initialization.InitializationEngine context)
        {
            IndexingService.DocumentAdding += CustomizeIndexing;
        }

        // Add custom fields to search document 
        void CustomizeIndexing(object sender, EventArgs e)
        {
            var addUpdateEventArgs = e as AddUpdateEventArgs;

            if (addUpdateEventArgs == null)
            {
                return;
            }

            // Get the document being indexed
            var document = addUpdateEventArgs.Document;

            IContent content;
            try
            {
                content = document.GetContent<IContent>();
            }
            catch (ContentNotFoundException)
            {
                // The content no longer exists, so there is nothing to customize.
                // Any other exception propagates with its original stack trace.
                return;
            }

            // Custom fields for GenericMedia
            if (content.GetOriginalType().Name == "GenericMedia")
            {
                var genericMedia = content as GenericMedia;
                if (genericMedia != null)
                {
                    var organisation = string.IsNullOrWhiteSpace(genericMedia.Organisation) ? "" : genericMedia.Organisation;
                    document.RemoveField("cfOrganisation");
                    document.Add(new Field("cfOrganisation", organisation, Field.Store.YES, Field.Index.ANALYZED));
                }
            }

            return;
        }

        public void Preload(string[] parameters)
        {
        }

        public void Uninitialize(EPiServer.Framework.Initialization.InitializationEngine context)
        {
            IndexingService.DocumentAdding -= CustomizeIndexing;
        }
    }
}

I add a class for handling a CustomFieldQuery:

using EPiServer.Search.Queries;
using EPiServer.Search.Queries.Lucene;
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Web;

namespace EPiServer.Business.Search
{
    public class CustomFieldQuery : IQueryExpression
    {
        public string Field { get; set; }
        public string Expression { get; set; }
        public float? Boost { get; set; }

        public CustomFieldQuery(string queryExpression, string fieldName)
        {
            Expression = queryExpression;
            Field = fieldName;
            Boost = null;
        }

        public CustomFieldQuery(string queryExpression, string fieldName, float boost)
        {
            Expression = queryExpression;
            Field = fieldName;
            Boost = boost;
        }

        public string GetQueryExpression()
        {
            // The invariant culture always uses "." as the decimal separator.
            return string.Format("{0}:({1}{2})",
                Field,
                LuceneHelpers.EscapeParenthesis(Expression),
                Boost.HasValue ? "^" + Boost.Value.ToString(CultureInfo.InvariantCulture) : string.Empty);
        }
    }
}
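
To make it easier to see what string this class hands to Lucene, here is a small stand-alone sketch that mirrors GetQueryExpression without any EPiServer references. The names QueryStringSketch, Build and EscapeParentheses are made up for illustration, and EscapeParentheses is only a stand-in that assumes LuceneHelpers.EscapeParenthesis backslash-escapes parentheses.

```csharp
using System;
using System.Globalization;

public static class QueryStringSketch
{
    // Hypothetical stand-in for LuceneHelpers.EscapeParenthesis (assumed behaviour).
    private static string EscapeParentheses(string value)
    {
        return value.Replace("(", "\\(").Replace(")", "\\)");
    }

    // Builds "field:(expression^boost)", or "field:(expression)" when no boost is given.
    public static string Build(string field, string expression, float? boost)
    {
        var boostPart = boost.HasValue
            ? "^" + boost.Value.ToString(CultureInfo.InvariantCulture)
            : string.Empty;

        return string.Format("{0}:({1}{2})", field, EscapeParentheses(expression), boostPart);
    }

    public static void Main()
    {
        Console.WriteLine(Build("cfOrganisation", "Acme", 2.0f)); // cfOrganisation:(Acme^2)
        Console.WriteLine(Build("cfOrganisation", "Acme", null)); // cfOrganisation:(Acme)
    }
}
```

Note that the boost ends up inside the parentheses, so for a multi-word expression it would, as far as I can tell, apply to the last term only.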

When performing the search I do something like this, using my CustomFieldQuery:

            // Access control
            AccessControlListQuery aclQuery = new AccessControlListQuery();
            aclQuery.AddAclForUser(PrincipalInfo.Current, HttpContext.Current);

            // Culture
            var culture = ContentLanguage.PreferredCulture.Name;

            // Query
            GroupQuery query = new GroupQuery(LuceneOperator.None);

            GroupQuery documentQuery = new GroupQuery(LuceneOperator.AND);
            documentQuery.QueryExpressions.Add(new ContentQuery<GenericMedia>());

            //Freetext
            if (!String.IsNullOrWhiteSpace(Condition.Freetext))
            {
                documentQuery.QueryExpressions.Add(new FieldQuery((Condition.MatchFull ? Condition.Freetext : Condition.Freetext + "*")));
            }

            foreach (var organisation in Condition.Organisation)
            {
                documentQuery.QueryExpressions.Add(new CustomFieldQuery(organisation, "cfOrganisation", 2.0f));
            }

            VirtualPathQuery documentpathQuery = new VirtualPathQuery();
            documentpathQuery.AddContentNodes(Condition.DocumentSearchRoot);
            documentQuery.QueryExpressions.Add(documentpathQuery);
            documentQuery.QueryExpressions.Add(aclQuery);

            query = documentQuery;

            // Gets result
            var searchHandler = ServiceLocator.Current.GetInstance<SearchHandler>();
            SearchResults searchResult = searchHandler.GetSearchResults(query, 1, 100);

The examples are not the complete source code, but they cover the three pieces I'm using (initialization, CustomFieldQuery and the search itself).

#186370
Edited, Dec 18, 2017 17:45

Wow, the first question that pops to mind is:

Didn't your field get indexed if you added the [Searchable] attribute to your GenericMedia.Organisation property?

And didn't you get hits on the Organisation field if you just used ContentSearchHandler.GetSearchResults(searchPhrase, page, pagesize) instead of calling the SearchHandler directly?

Granted, you wouldn't get any boost on your custom field that way, but I see nothing in the question that suggests that would be necessary.

#186432
Dec 19, 2017 15:32

I'm sure there are prettier ways of doing this.

The fields did get indexed when I added the [Searchable] attribute. But the goal was to be able to search for GenericMedia where the Organisation property had the given value (and not other properties).

For example: searching for "test" should not find content with Header=Test, but ONLY content with Organisation=Test.

#186444
Dec 19, 2017 15:44
 
[Searchable(false)]
public string Header { get; set; }

As long as you use your own classes you should be able to use attributes.

I understand you had more advanced goals/demands, but I haven't seen anything from the original poster to indicate he has such complex needs. ;)

#186448
Dec 19, 2017 16:13

True! 

My answer is more complex than it needs to be, looking at the question. Most likely you can skip the initialization part and the CustomFieldQuery. Instead, just use the [Searchable] attribute and pass a GroupQuery to GetSearchResults().
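
A rough sketch of that simpler route, assuming a GenericMedia type with Organisation and Header properties as discussed in this thread. The MediaSearch class and its Find method are made-up names, and the exact attribute placement is an assumption rather than code from my project.

```csharp
using EPiServer.Core;
using EPiServer.DataAnnotations;
using EPiServer.Search;
using EPiServer.Search.Queries;
using EPiServer.Search.Queries.Lucene;
using EPiServer.ServiceLocation;

// Assumed content type: Organisation is searchable, Header is kept out of the index.
[ContentType(DisplayName = "Generic media")]
public class GenericMedia : MediaData
{
    [Searchable]
    public virtual string Organisation { get; set; }

    [Searchable(false)]
    public virtual string Header { get; set; }
}

// Hypothetical helper, named for illustration only.
public static class MediaSearch
{
    public static SearchResults Find(string text, int page, int pageSize)
    {
        // Both expressions must match: the content type and the free text.
        var query = new GroupQuery(LuceneOperator.AND);
        query.QueryExpressions.Add(new ContentQuery<GenericMedia>());
        query.QueryExpressions.Add(new FieldQuery(text));

        var searchHandler = ServiceLocator.Current.GetInstance<SearchHandler>();
        return searchHandler.GetSearchResults(query, page, pageSize);
    }
}
```

The access control and path filters from my longer example are left out here to keep the sketch short; they can be added to the same GroupQuery.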

#186449
Dec 19, 2017 16:21

Thank you so much @Erik and @Peteng! What I ended up doing was implementing a scheduled job that uses the SearchHandler to index via 'IndexRequestItem', which gave me full control over how the content got indexed and how I could retrieve it. When simply inheriting PageData and using the ContentSearchHandler I could get some existing properties like the PageName indexed, but not my custom-defined properties.

To be able to implement IContent (or inherit PageData) and index through the ContentSearchHandler, I think I would need a custom ContentProvider, since it appears that UpdateItem reads data from the ContentRepository when it decides which properties should be indexed. (I might be wrong, but when digging through the EPiServer DLLs it looked that way.) If I were to start over, this would probably be the way I would go, though.
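
For anyone landing here later, this is roughly the shape of the scheduled job, reduced to a minimal sketch. RemoteItem, FetchRemoteItems, the "remote-" id prefix and the "RemoteItem" item type are placeholders invented for this example, and Url is assumed to be an absolute URL; the real job has more error handling.

```csharp
using System;
using System.Collections.Generic;
using EPiServer.PlugIn;
using EPiServer.Scheduler;
using EPiServer.Search;
using EPiServer.ServiceLocation;

// Hypothetical shape of one record returned by the remote API.
public class RemoteItem
{
    public string Id { get; set; }
    public string Title { get; set; }
    public string Body { get; set; }
    public string Url { get; set; } // assumed to be an absolute URL
}

[ScheduledPlugIn(DisplayName = "Index remote items (sketch)")]
public class RemoteItemIndexJob : ScheduledJobBase
{
    public override string Execute()
    {
        var searchHandler = ServiceLocator.Current.GetInstance<SearchHandler>();
        var count = 0;

        foreach (var remote in FetchRemoteItems())
        {
            // One index entry per remote record, keyed by a stable id.
            var item = new IndexRequestItem("remote-" + remote.Id, IndexAction.Update)
            {
                Title = remote.Title,
                DisplayText = remote.Body,
                Uri = new Uri(remote.Url),
                ItemType = "RemoteItem" // placeholder marker to filter on when searching
            };

            searchHandler.UpdateIndex(item);
            count++;
        }

        return "Sent " + count + " remote items to the index.";
    }

    // Placeholder: replace with the actual call to the remote API.
    private static IEnumerable<RemoteItem> FetchRemoteItems()
    {
        yield break;
    }
}
```

Items that have disappeared from the remote API are not handled in this sketch; one option would be to keep track of the ids indexed so far and send an IndexRequestItem with IndexAction.Remove for those that are gone.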

And again, thanks! :)

- Mårten

#186498
Dec 20, 2017 21:57