Pages moved to Trash not deleted from the search index

Hi

Episerver Find search results contain items that have already been deleted (i.e. pages in the recycle bin).

My query is

var articles = _client.Search()
    .CurrentlyPublished()
    .ExcludeDeleted()
    .Skip((currentPageNo) * pageSize)
    .Take(pageSize)
    .GetContentResult();

When I add a new article page, it is updated in Episerver Find, but when I delete a page, the TotalMatching count is still the same as before the delete.

Jaanna

#185583
Edited, Nov 25, 2017 17:41

Hi,

Is it only freshly deleted pages that show in search results?

GetContentResult applies some caching automatically, however, this should be cleared "whenever Episerver content is saved or deleted".

You could try testing with GetResult() to see if that works, just to narrow down your investigation.

/Jake

#185694
Nov 30, 2017 2:24

(Late reply so this is probably solved by now?)

Hi

In my experience, Find still keeps pages that are in the trash bin in its index. You can get around this by using FilterForVisitor() in the query.
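A minimal sketch of what adding FilterForVisitor() to the original query might look like. ArticlePage and the ArticleSearch wrapper class are hypothetical names used only for illustration, and the query-time filter only hides trashed pages from results; it does not remove them from the index.

```csharp
using EPiServer.Core;
using EPiServer.Find;
using EPiServer.Find.Cms;

// Hypothetical page type standing in for whatever type the query searches for.
public class ArticlePage : PageData { }

// Hypothetical wrapper class; only the query chain mirrors the original post.
public class ArticleSearch
{
    private readonly IClient _client;

    public ArticleSearch(IClient client)
    {
        _client = client;
    }

    public IContentResult<ArticlePage> GetArticles(int currentPageNo, int pageSize)
    {
        return _client.Search<ArticlePage>()
            .FilterForVisitor()              // filter at query time, as suggested above
            .Skip(currentPageNo * pageSize)
            .Take(pageSize)
            .GetContentResult();
    }
}
```

The TotalMatching value on the returned result can then be compared before and after moving a page to the trash.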

The documentation on this issue varies a lot from version to version. In some cases the recommendation is to add an indexing convention that excludes pages in the trash bin, but to my knowledge that only keeps those pages out of the index when you do a full reindex into a clean index.

Early on, the documentation stated that pages moved to trash were removed from the index, but I've never seen that happen in any Find project I have done.

/Torbjörn

#185903
Dec 06, 2017 10:29

We are experiencing the exact same thing and have gotten to the point where we have put both .ExcludeDeleted() and .FilterForVisitor() in our queries, but we still get back items that are in the trash bin, as well as items that have been removed from the trash bin.

#185941
Dec 06, 2017 20:50

This is how I solved the same issue.
First take a look here: https://world.episerver.com/blogs/Henrik-Fransas/Dates/2015/5/adding-episerver-find-to-alloy---part-2/

Then change the ShouldIndexPageData method to something like this:

private bool ShouldIndexPageData(SolutionPageData page)
{
    var wastedContent = ServiceLocator.Current.GetInstance<IContentLoader>().GetDescendents(ContentReference.WasteBasket).ToList();

    // Check that the page is published, not in the waste basket, and not marked as disable indexing
    var shouldIndex = page.CheckPublishedStatus(PagePublishedStatus.Published)
                      && wastedContent.All(c => c.ID != page.PageLink.ID) // content in the waste basket should not be indexed
                      && !page.DisableIndexing;

    // The page should not be indexed, but in some scenarios it might already be indexed, so try to delete it.
    if (!shouldIndex)
    {
        ContentIndexer.Instance.TryDelete(page, out var result);
    }

    return shouldIndex;
}

Disclaimer: code above is just parts of the full code, might not work as is 

#185952
Dec 07, 2017 9:23

@Erik I solved it in a similar fashion to you.

However, is there a reason you're using the ContentLoader to get all the content in the waste basket instead of checking page.IsDeleted?
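As a rough sketch of that suggestion, the ShouldIndexPageData method posted earlier in the thread could check page.IsDeleted instead of loading every descendant of the waste basket. SolutionPageData and DisableIndexing are taken from that earlier post and the linked blog, the enclosing class name is hypothetical, and like the original this is only part of the full code and might not work as is.

```csharp
using EPiServer.Core;
using EPiServer.Find.Cms;

// Hypothetical container class; SolutionPageData is assumed to be the
// site's base page type with a DisableIndexing property, as in the post above.
public partial class FindIndexingConventions
{
    private bool ShouldIndexPageData(SolutionPageData page)
    {
        // Published, not in the waste basket, and not opted out of indexing
        var shouldIndex = page.CheckPublishedStatus(PagePublishedStatus.Published)
                          && !page.IsDeleted
                          && !page.DisableIndexing;

        // The page may already be in the index, so try to remove it.
        if (!shouldIndex)
        {
            ContentIndexer.Instance.TryDelete(page, out var result);
        }

        return shouldIndex;
    }
}
```

This avoids a GetDescendents call on the waste basket for every page that is evaluated for indexing.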

#186238
Dec 14, 2017 12:25

Thanks Peter, now I've learned something new. Sometimes you don't see the obvious solutions :-)

#186239
Dec 14, 2017 12:53

For us, this came up while we were dealing with thousands of Find-related events being fired this past summer. Our queues were filling up, which we eventually found out was caused by a completely different issue. In the process of troubleshooting, we set <episerver.find.cms disableEventedIndexing="true"/> on our authoring server. So, unfortunately, we caused the issue ourselves by having this set.

#186249
Dec 14, 2017 15:19

If anyone is experiencing this and using Find 13.x, you should be aware that there's a bug.

ContentIndexer.Instance.TryDelete() doesn't function properly.

This is the bug: https://world.episerver.com/support/Bug-list/bug/FIND-4048

The workaround for now is to use the SearchClient instead.

var localizable = content as ILocalizable;
if (localizable != null)
{
    SearchClient.Instance.Delete(content.GetType(), SearchClient.Instance.Conventions.IdConvention.GetId(content), localizable.LanguageRouting(), null);
}
else
{
    SearchClient.Instance.Delete(content.GetType(), SearchClient.Instance.Conventions.IdConvention.GetId(content), null);
}

#199113
Nov 15, 2018 18:45