Include more data on a PageData object


Hi,


First, I'm not using PageTypeBuilder in this project, but I might add it for some specific page types.


A lot of the web pages are a collection of other pages, e.g. teaser pages and, more specifically in this case, tabs.

Right now I'm using the PageIndexer and setting some conventions, like this:

PageIndexer.Instance.Conventions.ForInstancesOf<PageData>()
    .ShouldIndex(page =>
        PageTypesToIndex.Contains(page.PageTypeID) &&
        page["ExcludePageInSearch"] == null);

But how do I include more data in the pages, like tabs and teasers? The client class has IncludeField, so I tried that:

IClient client = Client.CreateFromConfig();

client.Conventions.ForInstancesOf<PageData>()
    .IncludeField(page => page.GetExtraContent());

GetExtraContent() is an extension method which loops through all tab pages and generates a string of all their content. But the content this method returns is not searchable. I can't see the content in 'Explore Index' either.


How can I add more content to a PageData so it's searchable?

 

#63632
Nov 22, 2012 16:02

If you are doing it exactly as in your example, I think the problem is that the second snippet doesn't set the conventions on the SearchClient singleton. The client created with Client.CreateFromConfig() is a separate instance, so the conventions you add to it are not the ones used when indexing.

SearchClient.Instance.Conventions.ForInstancesOf<PageData>()
                .IncludeField(page => page.GetExtraContent());


#63634
Nov 22, 2012 16:10

Wow I feel stupid now :) Of course I have to use the same instance. Now it works.

How should I format the content that I'm passing in? How does the SearchText property work? Should I remove all HTML? Is there a helper for removing all unnecessary content, like tables and headings?

#63636
Nov 22, 2012 16:47

There is a helper extension method for string called StripHtml that should remove the tags for you. That you can run

#63637
Edited, Nov 22, 2012 16:50

Regarding SearchText(): it loops through all properties and filters out those that aren't marked as searchable on the page type. It then sorts the properties, placing those of type string first. Finally it concatenates the rest. In other words, it provides a decent default search text, which could be especially useful for non-PTB and non-v7 sites.
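
To make the steps described above concrete, here is a minimal, self-contained sketch of that kind of logic. It is not the actual EPiServer Find implementation: PropertySketch and SearchTextSketch are hypothetical stand-ins, and joining the values with a space is an assumption about how the text is concatenated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a page property; this is not an EPiServer type.
public sealed class PropertySketch
{
    public PropertySketch(string name, object value, bool isSearchable)
    {
        Name = name;
        Value = value;
        IsSearchable = isSearchable;
    }

    public string Name { get; }
    public object Value { get; }
    public bool IsSearchable { get; }
}

public static class SearchTextSketch
{
    // Filter, order and concatenate, as described in the post above.
    public static string Build(IEnumerable<PropertySketch> properties)
    {
        var parts = properties
            .Where(p => p.IsSearchable && p.Value != null)   // drop non-searchable or empty properties
            .OrderBy(p => p.Value is string ? 0 : 1)         // string values first; OrderBy is a stable sort
            .Select(p => p.Value.ToString())
            .Where(text => !string.IsNullOrWhiteSpace(text));

        // Assumption: values are separated by a single space.
        return string.Join(" ", parts);
    }
}

public static class Program
{
    public static void Main()
    {
        var properties = new[]
        {
            new PropertySketch("Year", 2012, true),
            new PropertySketch("InternalNote", "not indexed", false),
            new PropertySketch("MainBody", "Tabs and teasers", true),
        };

        Console.WriteLine(SearchTextSketch.Build(properties)); // prints "Tabs and teasers 2012"
    }
}
```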

#63643
Nov 22, 2012 22:26

That was my guess too. Is it possible to hook in some code and add extra content to SearchText? Or should I go with my own extension method?

How does relevancy work? Is a term more relevant the earlier it appears in SearchText? Because then it would be nice to be able to put headings, titles, keywords and so on first in the string.

#63644
Nov 22, 2012 23:08

The default SearchText method isn't extensible in the sense that you can hook into it. However, you can easily replace it. If you're using PTB or CMS 7, the easiest way is to add a string property (non-EPiServer) named SearchText to your page types, which will then take precedence. You can also exclude the method and then include your own. In both cases you can choose to "extend" the default SearchText method by invoking it and adding to what it returns before returning the value from your own property or method.

Regarding relevancy, all text in SearchText is equal. If you want to boost some specific text, you can include it in a separate field and search in that field as well (using .InField(...)). Doing that alone probably boosts that text already, but with the InField method you can also give that field a specific boost should you want to.
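
As a rough sketch of what such a query could look like (not tested against any particular Find version): it assumes an InField overload that takes a relative importance as its second argument, which matches the description above but should be checked against your version. The namespaces, the BoostedSearchSketch class and the boost value 2 are likewise assumptions for illustration.

```csharp
using EPiServer.Core;
using EPiServer.Find;
using EPiServer.Find.Cms;
using EPiServer.Find.Framework;

// Hypothetical helper class, for illustration only.
public static class BoostedSearchSketch
{
    public static void BuildBoostedQuery(string query)
    {
        var search = SearchClient.Instance.Search<PageData>()
            .For(query)
            // Assumption: the second argument is the field's relative importance.
            .InField(page => page.PageName, 2)
            // Also search the default search text, without an explicit boost.
            .InField(page => page.SearchText());

        // Execute "search" with whichever result method your project already uses.
    }
}
```

The page name is only an example here. Any string field that is included in the index, such as a separate field holding headings or keywords, could be boosted the same way.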

#63666
Nov 25, 2012 23:11

Joel: Overriding SearchText worked out great!

First I had to exclude the default SearchText method and then include mine. I added a parameter to my method's signature so that it can be told apart from the default one.

SearchClient.Instance.Conventions.ForInstancesOf<PageData>()
    .ExcludeField(page => page.SearchText()) // Exclude the default SearchText
    .IncludeField(page => page.SearchHitTypeName())
    .IncludeField(page => page.SearchHitUrl())
    .IncludeField(page => page.SearchPublishDate())
    .IncludeField(page => page.SearchText(true)) // Include our extended SearchText
    .IncludeField(page => page.SearchTitle())
    .IncludeField(page => page.SearchTypeName())
    .IncludeField(page => page.SearchUpdateDate());

public static string SearchText(this PageData page, bool extended)
{
    StringBuilder content = new StringBuilder();

    content.AppendLine(page.SearchText());

    // Custom content

    return content.ToString().StripHtml();
}

    

#63800
Nov 29, 2012 17:09

Does this approach with IncludeField and ExcludeField still work in Find v12? I can't seem to figure it out with the ContentIndexer.

#170782
Oct 30, 2016 17:37

Okay, figured out it should be:

EPiServer.Find.Framework.SearchClient.Instance.Conventions.ForInstancesOf<PageData>().ExcludeField(x => x.ACL); // with using EPiServer.Find.ClientConventions;

Matt

#170789
Oct 31, 2016 0:45