A short while ago we noticed that our search page was failing on some terms but not on others. The logs showed errors about not being able to cast some ProxyObject to IContentMedia. With the contentId from the log we looked up the content in the CMS, but it turned out to be a form element that wasn't even indexed. After some more poking and prodding, we finally realized those contentIds belonged to another site on the same Find index. The items in question were PDF files, and their index documents are missing a SiteId.
Now to be clear, we currently have two separate sites using the same index. These sites have separate databases, so their contentIds can overlap. But with siteIds this shouldn’t matter, right?
As a short-term solution, we disabled indexing of generic files. We also found a workaround: adding a SiteId property to the GenericFile class.
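To make the collision concrete, here is a toy model in Python. It is not Episerver Find code: `IndexedDoc`, `search`, `load_media` and `stamp_site_id` are made-up names, the ids are invented, and it assumes the query treats index documents without a site id as visible to every site, which is only my guess at what is happening.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class IndexedDoc:
    content_id: int
    type_name: str
    site_id: Optional[str]  # None models an index document without a site id


def search(index: List[IndexedDoc], site_id: str) -> List[IndexedDoc]:
    # Assumption: documents without a site id are returned for every site.
    return [doc for doc in index if doc.site_id in (site_id, None)]


def load_media(db: Dict[int, str], hit: IndexedDoc) -> str:
    # Resolve the hit against the *current* site's database, as the search page does.
    actual_type = db[hit.content_id]
    if actual_type != hit.type_name:
        raise TypeError(
            f"content {hit.content_id} is a {actual_type}, not a {hit.type_name}"
        )
    return actual_type


def stamp_site_id(doc: IndexedDoc, site_id: str) -> IndexedDoc:
    # The workaround: give the document the id of the site that owns it.
    return replace(doc, site_id=site_id)


SITE_A, SITE_B = "site-a", "site-b"

# Site A's own database: id 42 is a form element that is never indexed.
site_a_db = {7: "StandardPage", 42: "FormElement"}

# Shared index: one page from site A, one PDF from site B without a site id.
index = [
    IndexedDoc(7, "StandardPage", SITE_A),
    IndexedDoc(42, "GenericFile", None),
]

# Searching from site A also returns site B's PDF; loading it would hit
# site A's form element with the same id and fail.
hits = search(index, SITE_A)

# With the site id stamped on the PDF, site A no longer sees it.
stamped = [stamp_site_id(d, SITE_B) if d.site_id is None else d for d in index]
hits_after_fix = search(stamped, SITE_A)
```

Calling `load_media(site_a_db, hits[1])` raises a `TypeError`, which is the toy equivalent of the cast error in our logs, while `hits_after_fix` only contains site A's own page.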
Today I read this post from a few months ago (https://world.episerver.com/forum/developer-forum/EPiServer-Search/Thread-Container/2017/7/episerver-find---indexing-global-assets-when-running-multi-site/) and thought it described the correct solution, which is to have a wildcard host set up. But either I don't understand what that means or something else is amiss here. Below is a (redacted) screenshot with the wildcard host configured.
Could somebody explain why Find still doesn't want to include the site id?
I am not sure this will resolve your issue, as you have two different DBs and one Find index (the same site id could be assigned in both). In the multisite example above, there is always a single DB behind the scenes.
Also check [tblSiteConfig] and [tblSiteDefinition] in both DBs: if one DB is a copy of the other, both sites could potentially have the same uniqueid.
I don't want the two sites to share a site id; I just want to make sure that every piece of content has the site id of the site it belongs to. Currently, everything in Media > For All Sites > ... doesn't.