Per Magne Skuseth
Jan 27, 2016

EPiServer Find: Index blocks in XhtmlString

Many users use blocks in their XhtmlString properties. However, when EPiServer Find indexes an XhtmlString field, the block content is not included in the field value. As a result, users may miss relevant search hits when the term they are searching for resides in a block inside an XhtmlString field.
Below is an example of how you could include the block text content in the XhtmlString field value in the EPiServer Find index.

I have written an extension method for XhtmlStrings that returns a string with both the block text content and the static text content. It loops through each fragment of the property and appends the values using a StringBuilder.

    // The namespaces below are assumed from the standard EPiServer CMS and Find packages.
    using System.Linq;
    using System.Text;
    using EPiServer;
    using EPiServer.Core;
    using EPiServer.Core.Html.StringParsing;
    using EPiServer.Find.Cms;
    using EPiServer.Security;
    using EPiServer.ServiceLocation;

    // Extension methods must live in a static class; the class name is arbitrary.
    public static class XhtmlStringBlockExtensions
    {
        public static string ToBlocksIncludedString(this XhtmlString xhtmlString)
        {
            var sb = new StringBuilder();
            if (xhtmlString != null && xhtmlString.Fragments.Any())
            {
                var contentLoader = ServiceLocator.Current.GetInstance<IContentLoader>();
                foreach (IStringFragment fragment in xhtmlString.Fragments.GetFilteredFragments(PrincipalInfo.AnonymousPrincipal))
                {
                    // The content fragments contain the referenced blocks ...
                    if (fragment is ContentFragment)
                    {
                        var contentFragment = (ContentFragment)fragment;
                        if (!ContentReference.IsNullOrEmpty(contentFragment.ContentLink))
                        {
                            var referencedContent = contentLoader.Get<IContent>(contentFragment.ContentLink);
                            sb.Append(referencedContent.SearchText() + " ");
                        }
                    }
                    else if (fragment is StaticFragment)
                    {
                        // ... and the static fragments contain the static text in the XhtmlString.
                        var staticFragment = (StaticFragment)fragment;
                        sb.Append(staticFragment.InternalFormat + " ");
                    }
                }
            }
            return sb.ToString();
        }
    }
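
One possible hardening of the method above, offered as a sketch rather than a definitive implementation: contentLoader.Get<IContent> may throw when a referenced block can no longer be loaded, which could interrupt indexing of the whole page. The helper below uses IContentLoader.TryGet instead and falls back to an empty string. The class name ContentLoaderSearchTextExtensions and the method name GetSearchTextOrEmpty are hypothetical, and the namespaces are assumed from the standard EPiServer CMS and Find packages.

```csharp
using EPiServer;
using EPiServer.Core;
using EPiServer.Find.Cms;

public static class ContentLoaderSearchTextExtensions
{
    // Hypothetical helper (not part of EPiServer): returns the search text of the
    // referenced content, or an empty string when the reference is empty or the
    // content cannot be loaded.
    public static string GetSearchTextOrEmpty(this IContentLoader contentLoader, ContentReference contentLink)
    {
        if (ContentReference.IsNullOrEmpty(contentLink))
        {
            return string.Empty;
        }

        IContent content;
        return contentLoader.TryGet(contentLink, out content)
            ? content.SearchText()
            : string.Empty;
    }
}
```

Inside the loop, the two lines that load the block and append its text could then be replaced by sb.Append(contentLoader.GetSearchTextOrEmpty(contentFragment.ContentLink) + " ");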

 

Include the value in the SearchText field

If you are using Unified Search, you should add the new value to the SearchText field. For IContent, the SearchText field is a combination of all string-based properties on the current content that have not been marked with [Searchable(false)], and it is automatically added to the index by the standard conventions.
Override the field by adding a property named SearchText to your content type.

    public class StandardPage : SitePageData
    {
        [Searchable(false)]
        public virtual XhtmlString MainBody { get; set; }

        [RemoveHtmlTagsWhenIndexing]
        public string SearchText => this.SearchText() + " " + MainBody.ToBlocksIncludedString();
    }

Make sure that the original XhtmlString property has “Searchable” set to false, or you’ll get both the ToBlocksIncludedString value and the standard value of the XhtmlString property added to the SearchText field.
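
To check that the override works, a quick Unified Search query can be run once the page has been reindexed. The following is a minimal sketch; the class BlockTextSearchCheck and the method PrintHits are hypothetical names used only for illustration, and the namespaces are assumed from the standard EPiServer Find packages.

```csharp
using System;
using EPiServer.Find;
using EPiServer.Find.Framework;
using EPiServer.Find.UnifiedSearch;

public static class BlockTextSearchCheck
{
    // Hypothetical helper for manual verification. Pass a word that only occurs
    // inside a block placed in MainBody; the page should show up among the hits
    // if the block text made it into the SearchText field.
    public static void PrintHits(string searchTerm)
    {
        var results = SearchClient.Instance
            .UnifiedSearchFor(searchTerm)
            .GetResult();

        foreach (var hit in results)
        {
            Console.WriteLine(hit.Title + " -> " + hit.Url);
        }
    }
}
```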

 

Changing the default indexing behavior for all XhtmlStrings

If you always want to include the block content for your XhtmlStrings, you could do this by adding a few conventions in an initializable module. In the example below, the standard value, named “AsViewedByAnonymous” in the index, is replaced by the new value.

    [InitializableModule]
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class FindFieldInitialization : IInitializableModule
    {
        public void Initialize(InitializationEngine context)
        {
            SearchClient.Instance.Conventions.ForType<XhtmlString>().ExcludeField(x => x.AsViewedByAnonymous());
            SearchClient.Instance.Conventions.ForType<XhtmlString>().IncludeField(x => x.ToBlocksIncludedString());
            SearchClient.Instance.Conventions.ForType<XhtmlString>()
                .Field(x => x.ToBlocksIncludedString())
                .Modify(x => x.PropertyName = "AsViewedByAnonymous" + TypeSuffix.String);
        }

        public void Uninitialize(InitializationEngine context) { }
    }

Comments

Jan 27, 2016 09:00 AM

Great!

Jan 28, 2016 10:45 AM

Sweet! I've been looking for a solution for this. Didn't know about the magic

xhtmlString.Fragments.GetFilteredFragments(PrincipalInfo.AnonymousPrincipal)


Hovard Berg Mar 1, 2018 03:35 PM

Nice! 

This code indexes all blocks inside XHTML. We would like to check if the block type should be indexed or not based on the three ways to set this:

https://world.episerver.com/documentation/Items/Developers-Guide/EPiServer-Find/8/Integration/EPiServer-75/Indexing-content-in-a-content-area/

I've tried to get the content type for each block and locate the IndexInContentAreas property, but I can't find it. Does anyone know how to check programmatically whether a content type should be indexed in content areas?

Peter Gustafsson Jun 5, 2018 01:31 PM

@Hovard I made this modification to check for the IndexInContentAreas attribute and property.

var referencedContent = contentLoader.Get<IContent>(contentFragment.ContentLink);

// Check if referencedContent is supposed to be indexed in content areas

// Attribute
var hasIndexInContentAreasAttribute = referencedContent.GetOriginalType().IsDefined(typeof(IndexInContentAreasAttribute));

// Property
var indexInContentAreasPropertyValue = referencedContent.GetOriginalType().GetProperty("IndexInContentAreas") != null ? 
    (bool) referencedContent.GetOriginalType().GetProperty("IndexInContentAreas").GetValue(referencedContent) : false;

if (hasIndexInContentAreasAttribute || indexInContentAreasPropertyValue)
{
    sb.Append(referencedContent.SearchText() + " ");
}



Peter Gustafsson Jun 5, 2018 01:44 PM

Just in case someone else stumbles upon the same issue.

Note! 

referencedContent.SearchText()

Doesn't care about any custom property or extension method that might be implemented for SearchText on a block.

I.e.

public class MyBlock : SiteBlockData
{
    // This and any other string properties will be in SearchText
    public virtual string SomeName { get; set; }

    // Custom SearchText (this will not replace SearchText if this block is in an XhtmlString)
    public string SearchText => "Hello!";
}

A quickfix might be to do something like this

if (referencedContent is MyBlock myBlock)
{
    sb.Append(myBlock.SearchText + " ");
}
else
{
    sb.Append(referencedContent.SearchText() + " ");
}

bushra Aug 4, 2020 11:08 AM

Thank you so much for this great article :)
