Hey, Mark,
Check out this blogpost by Jeroen:
http://jstemerdink.wordpress.com/2014/04/05/indexing-blocks-with-episerver-search/
Thanks, I'd started down a different approach, I'd got as far as below just trying to figure bits out:
GroupQuery innerQuery = new GroupQuery(LuceneOperator.OR);
innerQuery.QueryExpressions.Add(new ContentQuery<PageData>());
innerQuery.QueryExpressions.Add(new ContentQuery<BlockData>());
GroupQuery query = new GroupQuery(LuceneOperator.AND);
query.QueryExpressions.Add(innerQuery);
query.QueryExpressions.Add(new FieldQuery(Request.QueryString["q"]));
SearchResults results = Locate.SearchHandler().GetSearchResults(query, 1, 100);
List<PageData> pageList = new List<PageData>();
ContentSearchHandler contentSearchHandler = Locate.ContentSearchHandler();
foreach (var hit in results.IndexResponseItems)
{
    IContent content = contentSearchHandler.GetContent<IContent>(hit);
    if (content is PageData)
        pageList.Add(content as PageData);
    if (content is BlockData)
    {
        var references = DataFactory.Instance.GetReferencesToContent(content.ContentLink, false);
        pageList.AddRange(
            references
                .Select(r => DataFactory.Instance.Get<IContent>(r.OwnerID))
                .OfType<PageData>());
    }
}
However, it does put all the load at the point of search, whereas the point of publish may make more sense even if it does inflate the indexes by quite a bit.
Might have to have a try and see how it does. I'm wondering if it'll handle, or can be made to handle, nested blocks, although the only example I can think of where I use those probably doesn't need to be searchable - it's just galleries and carousels I use nested blocks for at the minute.
Well, not much to do but search the tree until you find an IContent of type PageData.
I personally prefer the approach where the load is not on search, since it's done once. In that case, you'd probably need recursion to get to the deepest level of items in a content area. This introduces another concern though: cleanup needs to be done when a block is changed, which I don't see in the link I've sent you.
Actually that's a good point: with the approach in that link, updating a block wouldn't update the indexed data for the page, which is a big issue for shared blocks.
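To make the "search the tree until you hit a page" idea concrete, here is a minimal sketch of the upward walk. It is deliberately not tied to the EPiServer API: OwnerPageFinder, getOwners and isPage are hypothetical names, and content is identified by plain ints. In a real site I'd assume getOwners would wrap something like GetReferencesToContent and isPage would be a "content is PageData" check.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of EPiServer: climbs "is referenced by"
// links from a block until it reaches pages, so nested blocks are handled.
public static class OwnerPageFinder
{
    // getOwners: returns the IDs of the content items that reference the given ID.
    // isPage:    returns true when the given ID belongs to a page.
    public static ISet<int> FindOwningPages(
        int blockId,
        Func<int, IEnumerable<int>> getOwners,
        Func<int, bool> isPage)
    {
        var pages = new HashSet<int>();
        var visited = new HashSet<int> { blockId };
        var pending = new Stack<int>();
        pending.Push(blockId);

        while (pending.Count > 0)
        {
            int current = pending.Pop();
            foreach (int owner in getOwners(current))
            {
                // Skip anything already seen; this also guards against cycles.
                if (!visited.Add(owner)) continue;

                if (isPage(owner))
                    pages.Add(owner);     // reached a page: stop climbing here
                else
                    pending.Push(owner);  // another block: keep climbing
            }
        }
        return pages;
    }
}
```

I've used an explicit stack rather than true recursion so that deeply nested blocks can't overflow the call stack, and the visited set means a block that (directly or indirectly) contains itself won't loop forever.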
I actually think for expediency I may need to pursue my original solution in this instance and hope EPiServer has some wicked good caching logic in that DataFactory (I've not investigated but I'm hoping it doesn't have to go to the DB for each request unless it is invalidated by a publish operation).
You are right, from EPi documentation:
Object Cache
Automatically caches all objects in EPiServer CMS that is being requested from the API, via for example DataFactory (or IContentRepository). The object cache is based on the ASP.NET runtime cache and only read-only objects are stored to enable great performance. Invalidation is handled by an event system (described more the Framework SDK) with support for load balanced servers. The object cache can be used by custom classes, see CacheManager.
This is a good read on the cache mechanisms. It's written for EPi 5 and 6, but I believe the principles are still valid in 7: http://joelabrahamsson.com/how-episerver-cms-caches-pagedata-objects/
I think I'd use the same function as the EPiServer Interface does when you're trying to remove a block.
// contentLink is the ContentReference of the block being looked up.
var contents = new List<ContentData>();
var repository = ServiceLocator.Current.GetInstance<IContentRepository>();
IEnumerable references = repository.GetReferencesToContent(contentLink, false);
foreach (ReferenceInformation reference in references)
{
    // Load the owning content in the language the reference was made in.
    ContentReference ownerContentLink = reference.OwnerID;
    CultureInfo ownerLanguage = reference.OwnerLanguage;
    ILanguageSelector selector = new LanguageSelector(ownerLanguage.Name);
    var content = repository.Get<ContentData>(ownerContentLink, selector);
    // "ResponsibleEditors" is the ContentArea property checked in this example.
    var contentArea = content["ResponsibleEditors"] as ContentArea;
    if (contentArea != null)
    {
        // Keep the owner only if that content area actually contains the block.
        if (contentArea.ContentFragments.Any(fragment => fragment.ContentLink == contentLink))
            contents.Add(content);
    }
}
Essentially I'm wanting to figure out which pages a block is used in.
Why?
Well, it's actually related to some site search functionality. For some concepts in my site where we need n items, where n is not a fixed number, I use blocks to build up the page. Not sure if this is the right approach, but it's familiar from my Sitecore days with placeholders.
Some of these blocks I'd like to contribute to search results for the page. Oh, I'm just using the Lucene search - got to start somewhere!
I can actually get search results from page data and block data and get all the IContent items, but I then want to attribute the block results to the pages that contain them. Don't worry, I plan to use an interface to mark particular blocks as searchable; I don't want every CTA from dragging a page into a content area to be searchable.
So is there a way to figure out where a block is used? Or is there a better way to go about searching page content including block data? The only restriction is that it can't be a solution that requires us to pay any more money - we've no budget for paid options on this project.
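For what it's worth, the marker interface idea I have in mind is roughly the following sketch. All of the names here (ISearchableBlock, TextBlock, CallToActionBlock, SearchableFilter) are made up for illustration, and the block classes are plain stand-ins rather than real BlockData types.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical marker interface: a block opts in to search by implementing it.
public interface ISearchableBlock { }

// Stand-in block types for illustration only (not real EPiServer BlockData classes).
public class TextBlock : ISearchableBlock
{
    public string Body { get; set; }
}

public class CallToActionBlock
{
    public string Label { get; set; }
}

public static class SearchableFilter
{
    // Keeps only the hits whose type carries the marker interface.
    public static List<ISearchableBlock> OnlySearchable(IEnumerable<object> hits)
    {
        return hits.OfType<ISearchableBlock>().ToList();
    }
}
```

The filter would run over the block hits before attributing them to pages, so a CTA block never contributes to a page's results.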
I'm assuming there is a remarkably easy way to do this that I've completely missed, as if you delete a block in use you get an alert telling you where - but I've not stumbled across it yet.