Paul Gruffydd
Jun 8, 2018

Listing popular content with Profile Store

Every so often, a certain requirement comes up. Sometimes it’s called “Most Popular”, sometimes it’s called “Trending”, and I’ve even seen it called “Hot topics” but, regardless of the name, the requirement is the same: show a listing of the top x most popular pieces of content. On paper, this may sound straightforward, but it hides a bit of a challenge. In order to show the most popular pages, you need to know what content is popular, which generally requires keeping track of page views within a given timeframe.

One approach would be to take advantage of auto boosting within Find to apply a hit boost to a search. This would be pretty straightforward (and pretty efficient); however, the hits used to apply this boost only come from tracked clicks on the results of a Find search. This means that hits from navigation, social sharing, Google searches, etc. won't be counted, so you may miss out on a fair proportion of the hits for a given page.

An alternative approach would be to pull data from an analytics package such as Google Analytics or, as I've chosen in this instance, Episerver Profile Store. The process would be similar whether we were to use GA or Profile Store but this is a good opportunity to take a look at how we can put the Profile Store to good use beyond simply tracking data and looking at it in Insight.

Tracking

The first thing we need to do is track our page views and, within Profile Store, there are many ways we could do this. I could go into detail here but I think that’s been covered fairly comprehensively by David Knipe and Nicola Ayan. Having said that, I’m going to use a slightly different method and use the [PageViewTracking] attribute from the “EPiServer.Tracking.PageView” library available through NuGet. Why? Well, a few reasons. Partly because it’s nice and easy (you just add [PageViewTracking] to your controller action and it does the rest), partly because I couldn’t get the [Tracking()] attribute from Episerver.Tracking.Cms to work, but mostly because we need to track things in a consistent way.

At present, it’s only really Episerver Advance personalisation which requires you to track specific data in the Profile Store and it requires that data in a specific format. As more and more features come to rely on the data in the Profile Store, they too will need this data in a consistent format so, to me at least, it seems reasonable to expect that those features would require data in the same format as Advance to avoid having to raise multiple page view tracking events in slightly different formats on each page request.

Aggregating the data

Now we’ve got data flowing into our Profile Store instance, we can look at how we can use that data to power our listing. Profile Store comes with a REST API for querying data both from individual profiles and from the events which were tracked. In this instance we’re going to use the latter to pull out all recent “epiPageView” events (as tracked by “EPiServer.Tracking.PageView”). First of all though, let’s take a look at the structure of the data we’re requesting.

The basic structure of an event tracked by the attribute mentioned above is as follows, where the top level of this object is common to all events tracked within Profile Store but the contents of the Payload can be any arbitrary data we want to add:

{
    "TrackId": null,
    "DeviceId": "bf4b2611-63f5-4364-899a-7017f6d044b5",
    "EventType": "epiPageView",
    "EventTime": "2018-05-31T15:50:38.4951303Z",
    "Value": "Viewed Start",
    "Scope": "463470c3-3eca-41d3-8b12-3f7f92f62d34",
    "CountryCode": "Localhost",
    "PageUri": "http://localhost:59422/",
    "PageTitle": null,
    "RemoteAddress": "127.0.0.1",
    "Payload": {
        "epi": {
            "contentGuid": "bd437cef-41bd-4ebc-8805-0c20fcf4edcf",
            "language": "en",
            "siteId": "463470c3-3eca-41d3-8b12-3f7f92f62d34",
            "ancestors": [
                "43f936c9-9b23-4ea3-97b2-61c538ad07c9"
            ],
            "recommendationClick": null
        }
    },
    "User": {
        "Name": null,
        "Email": ""
    }
}

When we query for this data we get back an array of these objects plus some totals. I’m going to use RestSharp for the REST requests and Newtonsoft.Json to deserialise my data into a slightly cut-down object representation of the data (shown below):

{
    "Total": 123,
    "Count": 123,
    "Items": [
        {
            "Payload":{
                "epi": {
                    "contentGuid": "00000000-0000-0000-0000-000000000000",
                    "language": "en"
                }
            }
        }
    ]
}
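
As a rough sketch, the cut-down JSON above could map onto classes like the following. Only TrackingObjectResponse, Items, Total, Payload, Epi, ContentGuid and Language are names the scheduled job in this post relies on; the other class names (TrackingEvent, TrackingPayload, EpiPayload) are hypothetical and chosen purely for illustration, so the types in the Gist may well differ.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

// Cut-down response: only the fields needed to tally page views.
public class TrackingObjectResponse
{
    [JsonProperty("Total")]
    public int Total { get; set; }

    [JsonProperty("Count")]
    public int Count { get; set; }

    [JsonProperty("Items")]
    public List<TrackingEvent> Items { get; set; }
}

// Hypothetical name: one tracked event, reduced to its payload.
public class TrackingEvent
{
    [JsonProperty("Payload")]
    public TrackingPayload Payload { get; set; }
}

// Hypothetical name: wrapper for the "epi" section of the payload.
public class TrackingPayload
{
    [JsonProperty("epi")]
    public EpiPayload Epi { get; set; }
}

// Hypothetical name: the content GUID is kept as a string so that a
// missing or null value does not break deserialisation.
public class EpiPayload
{
    [JsonProperty("contentGuid")]
    public string ContentGuid { get; set; }

    [JsonProperty("language")]
    public string Language { get; set; }
}
```

With these in place, JsonConvert.DeserializeObject<TrackingObjectResponse>(json) should give an object whose Items can be looped over to read Payload.Epi.ContentGuid and Payload.Epi.Language.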

In order to get back the list of popular pages, we need to query Profile Store for “epiPageView” hits within a given timeframe and total the results. The basic URL for this query looks something like this:
/api/v1.0/trackevents/?$filter=EventType eq epiPageView and EventTime gt 2018-05-01T00:00:00Z
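
To make the moving parts of that URL explicit, here is a minimal sketch of building the filter for a rolling time window and escaping it for use in a query string. TrackEventQuery, BuildFilter and BuildUrl are hypothetical names; the $top and $skip parameters are the ones used for paging by the scheduled job later in this post. If you add parameters through RestSharp instead, it should take care of the encoding for you.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only.
public static class TrackEventQuery
{
    // Builds the filter for page views newer than (utcNow - recentHours).
    // utcNow is expected to already be a UTC time.
    public static string BuildFilter(DateTime utcNow, int recentHours)
    {
        var from = utcNow.AddHours(-recentHours);
        // Same timestamp shape as the example URL: 2018-05-01T00:00:00Z
        var stamp = from.ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
        return $"EventType eq epiPageView and EventTime gt {stamp}";
    }

    // Escapes the filter and appends the paging parameters.
    public static string BuildUrl(string filter, int top, int skip)
    {
        return "/api/v1.0/trackevents/?$filter=" + Uri.EscapeDataString(filter)
            + "&$top=" + top
            + "&$skip=" + skip;
    }
}
```

For example, BuildFilter(new DateTime(2018, 5, 2, 0, 0, 0, DateTimeKind.Utc), 24) produces the same filter text as the URL above.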

In an ideal world, we could do the aggregation as part of our query but unfortunately this isn’t supported right now so we’re going to have to do it ourselves.
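
Doing the aggregation ourselves boils down to counting events per page and language. The sketch below shows that tally in isolation, using plain tuples in place of the deserialised events; HitTally, Count and Top are hypothetical names, and the "contentGuid_language" key mirrors the one built by the scheduled job in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class HitTally
{
    // Counts events per "contentGuid_language" key.
    public static Dictionary<string, int> Count(IEnumerable<(string ContentGuid, string Language)> events)
    {
        var totals = new Dictionary<string, int>();
        foreach (var e in events)
        {
            var key = $"{e.ContentGuid}_{e.Language}";
            // current stays 0 when the key has not been seen yet
            totals.TryGetValue(key, out var current);
            totals[key] = current + 1;
        }
        return totals;
    }

    // Returns the n keys with the most hits; ties are ordered by key
    // so that the result is deterministic.
    public static List<KeyValuePair<string, int>> Top(Dictionary<string, int> totals, int n)
    {
        return totals
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key, StringComparer.Ordinal)
            .Take(n)
            .ToList();
    }
}
```

Keeping the language in the key means the same page viewed in two languages is counted as two separate entries, which is what lets the listing be filtered by language later on.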

Obviously sifting through thousands of hits is not something we want to do on the fly so we'll use a scheduled job. This scheduled job pulls back the epiPageView events in pages of 1000, tallies up the totals and saves the result to the Dynamic Data Store in a suitable format for future consumption, adding in the PageTypeId to allow us to query by PageType later.

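    using System;
    using System.Collections.Generic;
    using System.Configuration;
    using System.Linq;
    using EPiServer;
    using EPiServer.Core;
    using EPiServer.Data.Dynamic;
    using EPiServer.PlugIn;
    using EPiServer.Scheduler;
    using Newtonsoft.Json;
    using RestSharp;
    // Plus the namespaces for SitePageData, RecentHit and TrackingObjectResponse in your own project

    [ScheduledPlugIn(DisplayName = "Index Recent Page Hits", DefaultEnabled = false, IntervalLength = 1, IntervalType = EPiServer.DataAbstraction.ScheduledIntervalType.Hours)]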
    public class IndexRecentHits : ScheduledJobBase
    {
        //Settings
        private readonly string _apiRootUrl = ConfigurationManager.AppSettings["episerver:profiles.ProfileApiBaseUrl"];
        private readonly string _appKey = ConfigurationManager.AppSettings["episerver:profiles.ProfileApiSubscriptionKey"];
        private readonly string _eventUrl = "/api/v1.0/trackevents/";
        private readonly string _timeWindow = ConfigurationManager.AppSettings["RecentHours"] ?? "24";
        private readonly int _resultsPerPage = 1000;

        private readonly Dictionary<string, int> _recentHits = new Dictionary<string, int>();
        private bool _stopSignaled;

        private readonly DynamicDataStoreFactory _dataStoreFactory;
        private readonly IContentLoader _contentLoader;

        public IndexRecentHits(DynamicDataStoreFactory dataStoreFactory, IContentLoader contentLoader)
        {
            _dataStoreFactory = dataStoreFactory;
            _contentLoader = contentLoader;

            //Set here so the flag is applied when the job is created through constructor injection
            IsStoppable = true;
        }

        /// <summary>
        /// Called when a user clicks on Stop for a manually started job, or when ASP.NET shuts down.
        /// </summary>
        public override void Stop()
        {
            _stopSignaled = true;
        }

        /// <summary>
        /// Called when a scheduled job executes
        /// </summary>
        /// <returns>A status message to be stored in the database log and visible from admin mode</returns>
        public override string Execute()
        {
            //Call OnStatusChanged to periodically notify progress of job for manually started jobs
            OnStatusChanged("Beginning processing of recent hits");

            var totalProcessed = 0;
            var errorCount = 0;

            //Get the recent hit counts
            if (!int.TryParse(_timeWindow, out int recentHours))
            {
                recentHours = 24;
            }
            var fromDate = DateTime.UtcNow.AddHours(-recentHours).ToString("o");

            // Set up the request
            var request = GetTrackingRequest($"EventType eq epiPageView and EventTime gt {fromDate}", _resultsPerPage);

            // Gather the data from Profile Store
            ProcessEventResults(1, request);

            if (_stopSignaled)
            {
                return "Execution was cancelled by user";
            }


            var store = _dataStoreFactory.CreateStore(typeof(RecentHit));
            store.DeleteAll();

            foreach (var hit in _recentHits)
            {
                if (_stopSignaled)
                {
                    return "Execution was cancelled by user";
                }
                try
                {
                    var keyParts = hit.Key.Split('_');
                    var page = _contentLoader.Get<SitePageData>(new Guid(keyParts.FirstOrDefault() ?? Guid.Empty.ToString()));
                    var recentHit = new RecentHit
                    {
                        PageId = page.ContentLink.ID,
                        PageTypeId = page.ContentTypeID,
                        Parents = _contentLoader.GetAncestors(page.ContentLink).Select(x => x.ContentLink.ID).ToArray(),
                        Language = keyParts.LastOrDefault() ?? "en",
                        Hits = hit.Value
                    };
                    store.Save(recentHit);
                }
                catch (Exception)
                {
                    errorCount++;
                }
                totalProcessed++;
                if (totalProcessed % 10 == 0)
                {
                    OnStatusChanged($"Indexed {totalProcessed} of {_recentHits.Count} with {errorCount} errors");
                }
            }

            return $"Reindexed {totalProcessed} pages with {errorCount} errors";
        }

        #region Private Methods
        /// <summary>
        /// Makes a request to ProfileStore and processes results
        /// </summary>
        private void ProcessEventResults(int page, RestRequest request)
        {
            OnStatusChanged($"Fetching hits page {page}");
            if (_stopSignaled)
            {
                return;
            }

            //Handle pagination
            request.AddOrUpdateParameter("$skip", (page - 1) * _resultsPerPage);

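            // Execute the request to get the events matching the filter
            var eventResponseObject = GetTrackingResponse(request);
            if (eventResponseObject?.Items == null)
            {
                //Nothing usable came back (e.g. an empty response body), so stop paging
                return;
            }
            foreach (var result in eventResponseObject.Items)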
            {
                //Add/update the hit count per event
                var key = $"{result.Payload.Epi.ContentGuid}_{result.Payload.Epi.Language}";
                if (_recentHits.ContainsKey(key))
                {
                    _recentHits[key]++;
                }
                else
                {
                    _recentHits.Add(key, 1);
                }
            }

            //Repeat until all pages of results have been processed
            if (eventResponseObject.Total > _resultsPerPage * page)
            {
                ProcessEventResults(page + 1, request);
            }

        }

        /// <summary>
        /// Builds the ProfileStore request
        /// </summary>
        private RestRequest GetTrackingRequest(string filter, int resultsPerPage)
        {
            var req = new RestRequest(_eventUrl, Method.GET);
            req.AddHeader("Ocp-Apim-Subscription-Key", _appKey);

            req.AddParameter("$top", resultsPerPage);
            req.AddParameter("$filter", filter);
            return req;
        }

        /// <summary>
        /// Deserialises the ProfileStore response into an object
        /// </summary>
        private TrackingObjectResponse GetTrackingResponse(RestRequest request)
        {
            var client = new RestClient(_apiRootUrl);
            var getEventResponse = client.Execute(request);
            return JsonConvert.DeserializeObject<TrackingObjectResponse>(getEventResponse.Content);
        }
        #endregion

    }
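
The RecentHit type saved by the job isn't shown in this post. Judging purely by how the job populates it, it might look something like the sketch below; the property names and types are inferred from that code rather than copied from the Gist, and any Dynamic Data Store attributes or identity property the real class carries are left out.

```csharp
// Inferred shape of the record the job writes for each page and language.
public class RecentHit
{
    // ContentLink.ID of the page that was viewed
    public int PageId { get; set; }

    // ContentTypeID of the page, used to filter by page type later
    public int PageTypeId { get; set; }

    // IDs of the page's ancestors, used to restrict a listing to a branch
    public int[] Parents { get; set; }

    // Language part of the "contentGuid_language" key
    public string Language { get; set; }

    // Number of epiPageView events counted within the time window
    public int Hits { get; set; }
}
```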

Putting it all together

So, now we’ve collected the data and put it together into a handy list format, it’s time to put it to use and get the listing onto our site. To do this I’m going to create a couple of helper functions to get the data (one returning typed data, the other returning all data):

public IEnumerable<T> GetPopularPages<T>(ContentReference ancestor, string language, int numberOfResults) where T : PageData
{
    var store = _dataStoreFactory.CreateStore(typeof(RecentHit));
    var contentTypeId = _contentTypeRepository.Load<T>().ID;
    var hits = store.Items<RecentHit>()
        .Where(x => x.Parents.Contains(ancestor.ID) && x.Language.Equals(language) && x.PageTypeId.Equals(contentTypeId))
        .OrderByDescending(x => x.Hits)
        .Take(numberOfResults)
        .ToList();
    var contentRefs = hits.Select(x => new ContentReference(x.PageId));
    return _contentLoader.GetItems(contentRefs, new LoaderOptions() { LanguageLoaderOption.Specific(CultureInfo.GetCultureInfo(language)) }).OfType<T>();
}

public IEnumerable<IContent> GetPopularPages(ContentReference ancestor, string language, int numberOfResults)
{
    var store = _dataStoreFactory.CreateStore(typeof(RecentHit));
    var hits = store.Items<RecentHit>()
        .Where(x => x.Parents.Contains(ancestor.ID) && x.Language.Equals(language))
        .OrderByDescending(x => x.Hits)
        .Take(numberOfResults);
    var contentRefs = hits.Select(x => new ContentReference(x.PageId));
    return _contentLoader.GetItems(contentRefs, new LoaderOptions() { LanguageLoaderOption.Specific(CultureInfo.GetCultureInfo(language)) });
}

From here, it’s just a matter of calling one of these functions from a block or page and rendering the result. In my case, I created a block and added it to the news and events page of the Alloy site, which gave me this…

Image AlloyNewsPopular.png

For those interested, I’ve added the code (including the block) to a Gist on GitHub but do bear in mind that this has been created as a proof-of-concept rather than a battle-hardened, production-ready feature so use it with caution.


Comments

Jun 8, 2018 11:04 AM

Nice write up, thanks for sharing!

K Khan Jun 8, 2018 12:54 PM

Straight to favourites! 

Paul McGann (Netcel) Jun 11, 2018 01:50 PM

Nice work Paul, will definitely be implementing this soon!
