What happens if you don't add UnifiedFile in your init module?
Files and pages should be automatically added to the UnifiedSearchRegistry with sensible defaults for projections.
I still get 33 items of "EPiServer.Web.Hosting.VersioningFile" in my index. But no difference on UnifiedSearchHit objects.
Correction: UnifiedSearchHit objects for files have disappeared.
Like I wrote, I had a similar problem for PageData hits before implementing ISearchContent on my base type, so I might be missing something that happens on app init inside the product assemblies?
Hmm, my guess is that while UnifiedSearch exists in the .NET 3.5/CMS 6 R2 version of the general .NET API (EPiServer.Find.dll), the automatic registration of pages and files that the .NET 4/CMS 7 version of the CMS integration (EPiServer.Find.Cms.dll) performs doesn't exist in the 6 R2 version.
With CMS 7 it should "just work" :)
Try something like this:
.CustomizeProjection(x => x.ProjectUrlFrom<UnifiedFile>(file => GetFileUrl(file.PermanentLinkVirtualPath)))
private static string GetFileUrl(string permanentLinkVirtualPath)
{
    UrlBuilder url = new UrlBuilder(permanentLinkVirtualPath);
    Global.UrlRewriteProvider.ConvertToExternal(url, null, Encoding.UTF8);
    return url.ToString();
}
OK! Having this in my init module now and I can see that the Url prop of UnifiedFile got the correct path:
SearchClient.Instance.Conventions.UnifiedSearchRegistry.Add(typeof(UnifiedFile));
FileIndexer.Instance.Conventions.ShouldIndexVPPConvention = new VisibleInFilemanagerVPPIndexingConvention();
PageIndexer.Instance.Conventions.EnablePageFilesIndexing();
SearchClient.Instance.Conventions.UnifiedSearchRegistry
.ForInstanceOf<UnifiedFile>()
.CustomizeProjection(x => x.ProjectUrlFrom<UnifiedFile>(file => GetFileUrl(file.PermanentLinkVirtualPath)));
Joel: In one of our projects, same product versions as Johan Kronberg mentions but no PTB, we're not getting any titles or URLs. I have registered projections for those properties but we only get a short excerpt.
SearchClient.Instance.Conventions.UnifiedSearchRegistry.Add(typeof(PageData));
SearchClient.Instance.Conventions.UnifiedSearchRegistry
.ForInstanceOf<PageData>()
.CustomizeProjection(x =>
x.ProjectTitleFrom<PageData>(page =>
GetPageTitle(page)));
SearchClient.Instance.Conventions.UnifiedSearchRegistry
.ForInstanceOf<PageData>()
.CustomizeProjection(x =>
x.ProjectUrlFrom<PageData>(page =>
GetPageUrl(page)));
SearchClient.Instance.Conventions.UnifiedSearchRegistry
.ForInstanceOf<PageData>()
.CustomizeProjection(x =>
x.ProjectTypeNameFrom<PageData>(page =>
GetPageTypeName(page)));
Is it possible to filter results from UnifiedSearch? I'm only able to filter on the UnifiedSearchHit object's properties, not the underlying PageData object's properties.
Maybe there is a way to add a filter in the conventions instead? What I would like to do is to exclude some pages (e.g. pages with 'ExcludePageInSearch' set to true).
The page variable will be null in those expressions :(
The expression must "point" to one or several properties which can be retrieved as fields from the index, for instance page => page.PageName. I.e., you could add an extension method for PageData that retrieves a Headline property and configure that to be indexed (included). Then you could do:
page => GetFirstNonEmpty(page.Headline(), page.PageName)
Regarding filtering you can tell the UnifiedSearchRegistry to hold two types of filters for you, one that is always applied for the type and one that is applied when doing public search. I *think* the syntax is ForInstancesOf<PageData>().AlwaysFilter(...).
Other than that you can do:
UnifiedSearchFor("..")
.Filter(x => !x.MatchTypeHierarchy(typeof(PageData)) | ((PageData)x).PageName.Match("Something"))
In regards to what you're saying about "The expression must "point""... How would I go about projecting file.Summary.Title as the title with file.Name as fallback? Tried a couple of variations without any luck.
If page is null, does that mean we can't use the indexer page["PageHeading"] in the expression? And we must use typed properties?
Johan K, create a method that returns the first non-empty string amongst several (FirstNonEmpty(params string[] values)), then file => FirstNonEmpty(file.Summary.Title, file.Name)
Johan P, correct. Find indexes code properties by default. You can instruct it to index other expressions, such as extension methods, but I'm 60% sure you can't tell it to index an indexer expression. Given your context I would personally create a number of extension methods for PageData that map to the property names in ISearchContent (SearchTitle, SearchHitUrl etc.), plus methods for any properties you need to filter on, and then tell Find's client conventions to include them when indexing. Then you won't need to customize projections, as nicely matching values are already in the index.
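For reference, a minimal sketch of such a helper. The class name StringHelpers is just a placeholder I made up; nothing here comes from Find itself, it is plain C#:

```csharp
using System;

public static class StringHelpers
{
    // Returns the first value that is neither null nor whitespace,
    // or null when every candidate is empty.
    public static string FirstNonEmpty(params string[] values)
    {
        if (values == null)
        {
            return null;
        }

        foreach (var value in values)
        {
            if (!string.IsNullOrWhiteSpace(value))
            {
                return value;
            }
        }

        return null;
    }
}
```

If you call it directly on a file object (for example from an extension method) rather than inside a projection expression, file.Summary probably needs a null check before you read file.Summary.Title.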
I took the Extension Method approach as well. Code for reference:
public static string SearchTitle(this UnifiedFile file)
{
    return file.Summary != null && !string.IsNullOrWhiteSpace(file.Summary.Title) ? file.Summary.Title : file.Name;
}
public static string SearchHitUrl(this UnifiedFile file)
{
    var url = new UrlBuilder(file.PermanentLinkVirtualPath);
    Global.UrlRewriteProvider.ConvertToExternal(url, null, Encoding.UTF8);
    return url.ToString();
}
// Index files
SearchClient.Instance.Conventions.UnifiedSearchRegistry.Add(typeof(UnifiedFile));
FileIndexer.Instance.Conventions.ShouldIndexVPPConvention = new VisibleInFilemanagerVPPIndexingConvention();
PageIndexer.Instance.Conventions.EnablePageFilesIndexing();
SearchClient.Instance.Conventions.ForInstancesOf<UnifiedFile>()
.IncludeField(file => file.SearchHitUrl())
.IncludeField(file => file.SearchTitle());
Another related issue: I have a PDF containing a certain word. When I search for this word in the Admin explorer view I get 1 file as result. When doing UnifiedSearchFor with the same phrase on my search page I get no results. Searching on my search page for a phrase in the file name gives me the same file as result. It seems like the file content is not searched (I'm guessing because of the missing hookups described above). How can I get this working using the file content extraction of Find?
In the 6 R2 CMS integration the actual file content is indexed as the return value of an extension method named Attachment (located in EPiServer.Find.Cms.UnifiedFileExtensions) while in the 7 CMS integration this method has been renamed SearchAttachment in order to match what Unified Search needs.
My suggestion would be to:
1. Create an extension for UnifiedFile named SearchAttachment.
public static class MyUnifiedFileExtensions
{
    // Wraps the 6 R2 Attachment extension under the name Unified Search expects.
    public static Attachment SearchAttachment(this UnifiedFile file)
    {
        return file.Attachment();
    }
}
2. Exclude the original Attachment method so you won't have to index potentially large files twice.
3. Include your SearchAttachment method.
Works! Thanks!
Another thing... Which prop name/extension method handles the matching between File Best Bet and UnifiedFile? I think that's why my File Best Bets don't get boosted.
They are matched by the index ID (_id) which is retrieved using an extension method for UnifiedFile named GetIndexId. That method returns an id by creating a hash from either the PermanentLinkVirtualPath property or the VirtualPath property if the first is null.
Are the files returned in the search result but not placed first?
When they are in the result they are not placed first. If a best bet is added for a word that doesn't match the file in the normal result, the file is still not in the result at all.
Hmm. Well, they shouldn't be returned if they don't match the search query (although we could certainly work around that too).
But the fact that they aren't boosted is strange. What you could do is inspect the request (or post it here for me and others to see). Here's how to do that if you're using IIS.
1. Install and run Fiddler.
2. Configure your application pool in IIS to run as your user. Note that this is crucial. The app pool must run as the exact same user as the one running Fiddler.
3. Trigger a search that should result in a best bet being applied.
4. Locate the request in the Fiddler log.
When I analyze the Request I can't see that anything concerning best bets is in the JSON sent to Find...
Could you post the request body here (with anything customer specific removed)? The best bet might not be that obvious :)
Also, could you try passing a language to the Search method if you don't already do that?
OK! Only a list of PTB page types has been removed:
{
"from":0,
"size":10,
"query":{
"filtered":{
"query":{
"query_string":{
"fields":[
"SearchTitle$$string.sv",
"SearchText$$string.sv",
"SearchSummary$$string.sv",
"SearchAttachment$$attachment"
],
"query":"dator"
}
},
"filter":{
"or":[
{
"term":{
"___types":"EPiServer.Find.UnifiedSearch.ISearchContent"
}
},
{
"term":{
"___types":"EPiServer.Web.Hosting.UnifiedFile"
}
}
]
}
}
},
"facets":{
"SearchTypeName":{
"terms":{
"field":"SearchTypeName$$string"
}
},
"SearchHitTypeName":{
"terms":{
"field":"SearchHitTypeName$$string"
}
},
"All":{
"filter":{
"or":[
{
"exists":{
"field":"SearchTitle$$string"
}
},
{
"not":{
"filter":{
"exists":{
"field":"SearchTitle$$string"
}
}
}
}
]
}
}
},
"highlight":{
"fields":{
"SearchTitle$$string.sv":{
"pre_tags":[
"<strong>"
],
"post_tags":[
"</strong>"
],
"number_of_fragments":0
},
"SearchSummary$$string.sv":{
"pre_tags":[
"<strong>"
],
"post_tags":[
"</strong>"
],
"fragment_size":127,
"number_of_fragments":2
},
"SearchText$$string.sv":{
"pre_tags":[
"<strong>"
],
"post_tags":[
"</strong>"
],
"fragment_size":127,
"number_of_fragments":2
},
"SearchAttachment$$attachment":{
"pre_tags":[
"<strong>"
],
"post_tags":[
"</strong>"
],
"fragment_size":127,
"number_of_fragments":2
}
}
},
"fields":[
"___types",
"$type",
"SearchTitle$$string",
"SearchHitUrl$$string",
"SearchTypeName$$string",
"SearchHitTypeName$$string",
"SearchSection$$string",
"SearchSubsection$$string",
"SearchAuthors",
"SearchPublishDate$$date",
"SearchUpdateDate$$date",
"SearchFilename$$string",
"SearchFileExtension$$string",
"SearchGeoLocation$$geo"
],
"script_fields":{
"SearchSummary$$string-cropped-255":{
"script":"ascropped",
"lang":"native",
"params":{
"field":"SearchSummary$$string",
"length":255
}
},
"SearchText$$string-cropped-255":{
"script":"ascropped",
"lang":"native",
"params":{
"field":"SearchText$$string",
"length":255
}
}
}
}
Thanks!
You're right, there's no best bet there. I'm not sure why it's not applied. Unless I'm mistaken, the latest CMS 6 integration has criteria for language and such when adding best bets, like the CMS 7 integration has. Perhaps one of those criteria isn't met.
Let's see if one of the guys at EPi has some idea. Alternatively you could post your code as well and I'll gladly take a look.
This is what I have now. I have tried moving the extension functions around to "chain" in different order.
var query = SearchClient.Instance.UnifiedSearchFor(this.QueryParameter, Language.Swedish)
.TermsFacetFor(x => x.SearchTypeName)
.TermsFacetFor(x => x.SearchHitTypeName)
.FilterFacet("All", x => x.SearchTitle.Exists() | !x.SearchTitle.Exists())
.Skip(pageIndex * this.Pager.PageSize)
.Take(this.Pager.PageSize)
.Track()
.ApplyBestBets();
// Sort
if (!string.IsNullOrWhiteSpace(this.SortParameter))
{
switch (this.SortParameter)
{
case "date":
query = query.OrderByDescending(x => x.SearchPublishDate);
break;
case "title":
query = query.OrderBy(x => x.SearchTitle);
break;
}
}
// Filter facet
if (!string.IsNullOrWhiteSpace(this.TypeNameParameter))
{
query = query.FilterHits(x => x.SearchTypeName.Match(this.TypeNameParameter));
}
if (!string.IsNullOrWhiteSpace(this.HitTypeNameParameter))
{
query = query.FilterHits(x => x.SearchHitTypeName.Match(this.HitTypeNameParameter));
}
// Get results
this.Results = query
.GetResult(
new HitSpecification
{
HighlightTitle = true,
HighlightExcerpt = true,
ExcerptLength = 255,
PreTagForAllHighlights = "<strong>",
PostTagForAllHighlights = "</strong>"
});
I am doing a UnifiedSearchFor and I can see that I get results of type VersioningFile.
For these UnifiedSearchHit objects Url and Title properties are empty.
Search view in Find admin shows the file hits correctly.
I had a similar problem for PageData hits before implementing ISearchContent on my base type.
I have tried projecting the properties in a manner like this without any luck:
In my init module I have: