EPiServer Find - filter UnifiedFile objects based on VPP folder

Hi,


We have a template that searches all EPiServer pages and, additionally, all files stored in VPP.

We use Unified Search for this feature, but now we need to limit the VPP file results to a folder specified by the content editor (the site has several section-specific search templates, and each has a PropertyFolderUrl defined in the search page type). We cannot block indexing of any files with .ShouldIndex in the Find initialization module, because all files still have to be indexed for the global search. So the question is how to filter results by folder on a particular template.

Thanks in advance for any hints.

#72456
Jun 17, 2013 18:11

Hi,

First, you need to create an extension that maps to one of the ISearchContent properties. I would, for instance, use the SearchSection property and create an extension for UnifiedFile that looks like this:

public static string SearchSection(this UnifiedFile file)
{
    return xxx; /* my section logic */
}

Register it using:

SearchClient.Instance.Conventions.ForInstancesOf<UnifiedFile>().IncludeField(x => x.SearchSection());

By doing this, your UnifiedFiles will have a section value that you can later use in your queries:

results = SearchClient.Instance.UnifiedSearchFor("bananas")
.Filter(x => x.SearchSection.Match("MySectionValue"))
.GetResult();

#72471
Jun 18, 2013 11:20

Yes, thanks, it is working! My section logic looked like this (by the way, is there any difference between using UnifiedFile and VersioningFile?):

        public static string SearchSection(this VersioningFile file)
        {
            return file.Parent.VirtualPath;
        }
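
One thing to watch with this approach: as far as I understand, Match compares the whole string, so the value stored from file.Parent.VirtualPath and the value taken from the folder property have to be written the same way (leading and trailing slash, casing). Below is a minimal sketch of a normalization helper, assuming it is applied both when the field is indexed and when the filter value is built. FolderPaths, Normalize and IsInFolder are hypothetical names, not part of EPiServer or Find, and lowercasing the path is my own assumption.

```csharp
using System;

// Hypothetical helper, not part of EPiServer or Find.
public static class FolderPaths
{
    // Brings a virtual path into one canonical form:
    // forward slashes, lower case, leading and trailing slash.
    public static string Normalize(string virtualPath)
    {
        if (string.IsNullOrWhiteSpace(virtualPath))
        {
            return "/";
        }

        var path = virtualPath.Trim().Replace('\\', '/').ToLowerInvariant();
        if (!path.StartsWith("/"))
        {
            path = "/" + path;
        }
        if (!path.EndsWith("/"))
        {
            path += "/";
        }
        return path;
    }

    // True when fileFolder is the selected folder or one of its subfolders.
    // The trailing slash keeps "/docs/" from matching "/documents/".
    public static bool IsInFolder(string fileFolder, string selectedFolder)
    {
        return Normalize(fileFolder)
            .StartsWith(Normalize(selectedFolder), StringComparison.Ordinal);
    }
}
```

With this, the extension would return FolderPaths.Normalize(file.Parent.VirtualPath) and the query would match against FolderPaths.Normalize of the editor's folder value. Note that an exact match only finds files directly in that folder; including subfolders would need a prefix-style comparison like the one in IsInFolder.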


It is important to remember that after registering it you need to run the indexing job from the CMS.

I have one more problem: I recently started receiving the following error when trying to run the indexing job:

EPiServer.Find.Cms.FileIndexer: An exception occured while indexing (Batch): The remote server returned an error: (413) Request Entity Too Large.

There are only a few files and they are not very large (max 2 MB), and after the error the files are missing from the index. When I edit a single file in the file manager (by changing some metadata), it is indexed properly after that. Have you ever had such a problem?

It may be worth mentioning that we are still running on a dev index.

#72504
Jun 19, 2013 9:57

Depending on your VPP type, files may not always have versioning support and therefore not implement VersioningFile. In general, I would say that you should use UnifiedFile as the extension point, as all files must implement it.

As for the error, there is a 5 MB limit per request for dev indices. If no file is larger than 2 MB, you can set the file indexing batch size to 1 to keep this from happening when running the indexing job:

FileIndexer.Instance.FileBatchSize = 1;
ContentIndexer.Instance.FileBatchSize = 1;

#72505
Jun 19, 2013 10:06

Yes, that fixed the failing job:

            FileIndexer.Instance.FileBatchSize = 1;
            ContentIndexer.Instance.FileBatchSize = 1;

 

So then one more question: we currently have a metadata property on VPP files, which is simply a checkbox that allows (or prevents) a file being indexed. In the initialization module we have added a convention like this:

FileIndexer.Instance.Conventions.ForInstancesOf<VersioningFile>().ShouldIndex(x =>
{
    // check whether x.Summary.Dictionary["OurProperty"] contains the correct value and return true/false based on that
});

This works fine when you run the indexing job, and also when you edit a single file and SET our property: the file gets indexed. In the opposite direction, however, when you REMOVE our property, the code is executed and .ShouldIndex is evaluated correctly, but the file is still returned in search results afterwards.

As a workaround, we have tried to delete it manually inside ShouldIndex with code like this:

SearchClient.Instance.Delete<VersioningFile>(x.GetIndexId());

This basically works for the workflow described above but fails again when run from the job. We can catch that exception, but it still looks like something is missing here to avoid exceptions from the server.

#72507
Jun 19, 2013 10:24

ShouldIndex only tells whether the file should be indexed or not. It does not delete files when it returns false (the indexing job will, however, "sync" the index and remove all items not indexed during a run). Doing that delete seems like a good workaround; just remember to catch the error (if the file wasn't in the index to begin with, the server will respond with 404).
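
For the catch itself, here is a rough sketch that swallows only the "not found" case and rethrows everything else. It assumes the failure reaches your code as an exception whose chain contains a System.Net.WebException with the HTTP response attached; I have not verified that for every Find client version. IndexCleanup, IsNotFound and DeleteIfPresent are hypothetical names.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of EPiServer or Find.
public static class IndexCleanup
{
    // True if the exception or any inner exception is a WebException
    // whose response carries HTTP status 404.
    public static bool IsNotFound(Exception ex)
    {
        for (var e = ex; e != null; e = e.InnerException)
        {
            var web = e as WebException;
            var response = web != null ? web.Response as HttpWebResponse : null;
            if (response != null && response.StatusCode == HttpStatusCode.NotFound)
            {
                return true;
            }
        }
        return false;
    }

    // Runs the delete; a 404 is ignored, any other failure is rethrown.
    public static void DeleteIfPresent(Action delete)
    {
        try
        {
            delete();
        }
        catch (Exception ex)
        {
            if (!IsNotFound(ex))
            {
                throw;
            }
        }
    }
}
```

Inside ShouldIndex it would then be called as IndexCleanup.DeleteIfPresent(() => SearchClient.Instance.Delete<VersioningFile>(x.GetIndexId()));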

#72509
Jun 19, 2013 10:43

Yes, this is exactly what I am doing currently. Thanks a lot for your input.

#72510
Jun 19, 2013 10:45