
Henrik Fransas
Aug 28, 2014

Using Azure Search as search solution in EPiServer

Microsoft released a new preview search service in Azure a little while ago, and this post is about how to use it on an EPiServer website. I am trying it out because the built-in EPiServer Search does not work when an EPiServer site is deployed to Azure as an Azure Website, and it feels bad not to be able to search.

The Azure Search service is available to everyone with an Azure account, and right now there are two versions/levels of it: Free and Standard. Free is, as it sounds, free of charge and has these limitations:

· Up to 3 indexes

· Up to 10 000 documents

· Up to 50 MB of storage

· No scaling

· Shared environment

Standard currently costs 125 USD per month and has these limitations:

· Up to 50 indexes

· Up to 15 000 000 documents

· Up to 300 GB storage

· Dedicated environment

· Possibility to scale up to max 35 units

Read more about pricing and limitations here: http://azure.microsoft.com/en-us/pricing/details/search/

For most websites 10 000 documents/pages is enough, and the two limitations that can be a problem are the storage and the number of indexes. The storage is enough if your pages do not contain a lot of data, but 50 MB can be consumed pretty fast. Three indexes might sound like a lot, but the Azure Search service does not treat an index the way, for example, EPiServer Find does. In Azure Search an index is a type of object, so if you want to do more advanced searches over different kinds of page types you may need one index per page type, and then three is a very small number.

 

How to use it

You first have to create your search service in the Azure preview portal; read more about how to do that here: http://azure.microsoft.com/en-us/documentation/articles/search-configure/

 

Create index

After that you have to think carefully about what you want to index, since every kind of object is its own index with its own HTTP connection and so on. In this first try I will keep it very simple and only use one index with information from pages in a compressed form. The index definition object looks like this:

var index = new
{
    name = "pages",
    fields = new[]
    {
        new { name = "id", type = "Edm.Int32", key = true },
        new { name = "name", type = "Edm.String", key = false },
        new { name = "linkurl", type = "Edm.String", key = false },
        new { name = "metatitle", type = "Edm.String", key = false },
        new { name = "metadescription ", type = "Edm.String", key = false },
        new { name = "teasertext", type = "Edm.String", key = false },
        new { name = "mainbody", type = "Edm.String", key = false },
        new { name = "contenttypeid", type = "Edm.Int32", key = false },
    }
};

The Azure Search service supports a set of EDM (Entity Data Model) data types for index fields and documents; read more about them here: http://msdn.microsoft.com/en-us/library/azure/dn798938.aspx
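
Creating the index itself is just one more HTTP call against the service. A minimal sketch, assuming the same appSettings keys and api-version as the scheduled job further down (treat it as an illustration rather than the exact code used):

var client = new HttpClient();
client.DefaultRequestHeaders.Add("api-key", ConfigurationManager.AppSettings["AzureSearchServiceApiKey"]);

// POST the anonymous index definition above to the indexes endpoint to create the "pages" index
var response = client.PostAsync(
    ConfigurationManager.AppSettings["AzureSearchServiceRootUrl"] + "/indexes?api-version=2014-07-31-Preview",
    new StringContent(JsonConvert.SerializeObject(index), Encoding.UTF8, "application/json")).Result;

response.EnsureSuccessStatusCode();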

As you can see, this is absolutely not all the information needed for a full-scale EPiServer search, and already now I cannot see this as a fully acceptable replacement for EPiServer Search in Azure, but I will go on just to test it out.

To keep it simple I will not implement logic that listens for the page-published event; instead I use a scheduled job that iterates through all pages and adds them to the index. The code will be on GitHub so you can extend it yourself if you want.

 

Populate index

The best way to do this on a live site is to push updates when an editor presses publish, but for this test I am only going to populate the index with a scheduled job. I created a job that goes through all pages on the site for the page types that are interesting, maps each page to a class, serializes it and sends it to the index. I am doing it per page type because I do not know how the Azure Search service handles a very big request. The scheduled job looks like this:

[ScheduledPlugIn(DisplayName = "[Azure Search] Update index", Description = "Update the index with all published pages")]
public class UpdateAzureSearchService
{
    public static string Execute()
    {
        var totalStopWatch = new Stopwatch();
        totalStopWatch.Start();

        IndexPageType(typeof(StartPage).GetPageType());
        IndexPageType(typeof(ArticlePage).GetPageType());
        IndexPageType(typeof(NewsPage).GetPageType());
        IndexPageType(typeof(ProductPage).GetPageType());
        IndexPageType(typeof(StandardPage).GetPageType());

        totalStopWatch.Stop();

        return string.Format("Azure search service updated. Time taken: {0}", totalStopWatch.Elapsed);
    }

    private static void IndexPageType(ContentType contentType)
    {
        var client = new HttpClient();
        client.DefaultRequestHeaders.Add("api-key", ConfigurationManager.AppSettings["AzureSearchServiceApiKey"]);

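        // Filter away pages the visitor should not see (unpublished, access restricted and so on)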
        var pages = FilterForVisitor.Filter(DataFactory.Instance.GetAllPagesOfCertainPageType(contentType.ID, ContentReference.RootPage));

        if (pages == null) return;

        var pagesToUpdateObject = new UpdateAzureSearch
        {
            value = pages.Select(page => page.MapContentToAzureSearchServiceObject()).ToList()
        };

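        // Post the whole batch of documents as one JSON payload to the docs index endpoint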
        var response = client.PostAsync(ConfigurationManager.AppSettings["AzureSearchServiceRootUrl"] + "/indexes/pages/docs/index?api-version=2014-07-31-Preview", new StringContent(JsonConvert.SerializeObject(pagesToUpdateObject), Encoding.UTF8, "application/json")).Result;

        response.EnsureSuccessStatusCode();
    }
}

GetAllPagesOfCertainPageType is an extension method that looks like this:

public static PageDataCollection GetAllPagesOfCertainPageType(this DataFactory datafactory, int pageTypeDefinitionId, PageReference root)
{
    var propertyCriteriaCollection = new PropertyCriteriaCollection
    {
        new PropertyCriteria
        {
            Condition = CompareCondition.Equal, 
            Name = "PageTypeID", 
            Type = PropertyDataType.PageType, 
            Value = pageTypeDefinitionId.ToString(CultureInfo.InvariantCulture), 
            Required = true
        }
    };

    var pagesOfCorrectType = datafactory.FindPagesWithCriteria(root, propertyCriteriaCollection);

    return pagesOfCorrectType;
}

And MapContentToAzureSearchServiceObject is an extension method that looks like this:

public static AzureSearchServiceObject MapContentToAzureSearchServiceObject(this PageData pageData)
{
    return new AzureSearchServiceObject
    {
        id = pageData.ContentLink.ID.ToString(CultureInfo.InvariantCulture),
        contenttypeid = pageData.ContentTypeID,
        name = pageData.Name,
        mainbody = pageData.GetPropertyValue("MainBody", string.Empty),
        linkurl = pageData.LinkURL,
        metadescription = pageData.GetPropertyValue("MetaDescription", string.Empty),
        metatitle = pageData.GetPropertyValue("MetaTitle", string.Empty),
        teasertext = pageData.GetPropertyValue("TeaserText", string.Empty)
    };
}

I am using a wrapper object called UpdateAzureSearch, which has a single property called value holding a list of AzureSearchServiceObject, because the Azure Search service requires the JSON sent to it to be shaped like that. The objects are defined like this:

public class AzureSearchServiceObject
{
    public string id { get; set; }
    public string name { get; set; }
    public string linkurl { get; set; }
    public string metatitle { get; set; }
    public string metadescription { get; set; }
    public string teasertext { get; set; }
    public string mainbody { get; set; }
    public Int32 contenttypeid { get; set; }
}

public class UpdateAzureSearch
{
    public List<AzureSearchServiceObject> value { get; set; }
}

 

 

Use the index

This test site is created from the Alloy template, which already contains a search page, so all I did was create my own SearchService and remove the one that uses EPiServer Search. I made it as simple as possible and am only doing a plain text search, with no stemming and so on. Azure Search is built on Elasticsearch, so there are a lot of possibilities for more advanced queries, but this is only a proof of concept so I keep it as simple as possible.
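
In practice the service boils down to a single GET against the docs endpoint of the pages index. A minimal sketch of what it could look like, assuming Alloy's SearchContentModel.SearchHit shape (Title, Url, Excerpt) and the same appSettings keys as before; the field mapping here is a guess, so see the downloadable code at the end for the real implementation:

public class AzureSearchService
{
    public IEnumerable<SearchContentModel.SearchHit> Search(string searchText)
    {
        var client = new HttpClient();
        client.DefaultRequestHeaders.Add("api-key", ConfigurationManager.AppSettings["AzureSearchServiceApiKey"]);

        // Plain full text search against the docs endpoint of the "pages" index
        var url = ConfigurationManager.AppSettings["AzureSearchServiceRootUrl"]
                  + "/indexes/pages/docs?search=" + Uri.EscapeDataString(searchText)
                  + "&api-version=2014-07-31-Preview";

        var response = client.GetAsync(url).Result;
        response.EnsureSuccessStatusCode();

        // The hits come back wrapped in a "value" array, so the same wrapper class
        // that is used for indexing can be reused for deserialization
        var result = JsonConvert.DeserializeObject<UpdateAzureSearch>(response.Content.ReadAsStringAsync().Result);

        return result.value.Select(hit => new SearchContentModel.SearchHit
        {
            Title = hit.name,
            Url = hit.linkurl,
            Excerpt = hit.teasertext
        });
    }
}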

I then simplified the search controller so it looks like this:

public class SearchPageController : PageControllerBase<SearchPage>
{
    private readonly AzureSearchService _searchService;

    public SearchPageController(
        AzureSearchService searchService)
    {
        _searchService = searchService;
    }

    [ValidateInput(false)]
    public ViewResult Index(SearchPage currentPage, string q)
    {
        var model = new SearchContentModel(currentPage)
        {
            SearchServiceDisabled = false,
            SearchedQuery = q
        };

        if (!string.IsNullOrWhiteSpace(q))
        {
            var hits = Search(q.Trim()).ToList();
            model.Hits = hits;
            model.NumberOfHits = hits.Count();
        }

        return View(model);
    }

    private IEnumerable<SearchContentModel.SearchHit> Search(string searchText)
    {
        return _searchService.Search(searchText);
    }
}

 

Conclusion

This is a preview service from Azure and some things could be better, like not treating each object type as its own index, but it absolutely works. For simple sites it can be a solution, but for bigger sites it is better to pick up the wallet and pay for EPiServer Find.

If you would like to test it yourself you can download the code from here: https://github.com/hesta96/AzureSearchServicesTest
To make the code work, all you have to do is create your own Azure Search service in the Azure Preview Portal (https://portal.azure.com) and update web.config with your URL and key. The database is included in the project and all blobs are saved in the database, so you should be able to just press F5 after that.
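
For reference, these are the two appSettings entries the code reads; the key names match the lookups in the scheduled job above, and the values below are just placeholders for your own service URL and API key:

<appSettings>
  <!-- Root URL of the search service, e.g. https://yourservice.search.windows.net -->
  <add key="AzureSearchServiceRootUrl" value="https://yourservice.search.windows.net" />
  <!-- Admin API key from the Azure Preview Portal -->
  <add key="AzureSearchServiceApiKey" value="your-api-key" />
</appSettings>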


Comments

Henrik Fransas Aug 28, 2014 02:54 PM

Why is Windows Live Writer so damn hard to use?????

Aug 29, 2014 07:15 AM

Agree, Live Writer sucks and usually makes me say a lot of words that are better left unsaid.

Henrik Fransas Aug 29, 2014 07:18 AM

Linus, it ended up with me having to change the HTML code by hand, and yes, there were some ugly words.
It took nearly the same amount of time to create the blog post and make it look OK as it took to write all the test code that I blogged about...

Johan Kronberg Sep 1, 2014 10:36 AM

Interesting!

Sven Tegelmo Mar 29, 2021 02:10 PM

Interesting. Must say that the price plan differences for Azure Search vs Episerver Find make me wonder if we should go for Azure Search.

Can someone say what features in Find make it worth going for Find?
(The difference/limitation in what an index is for Azure Search is not a problem for us. I think we can have the same properties/fields for all our pages/documents.)
