Search (Lucene legacy) results contain unfriendly URLs

I'm seeing strange search results (using the Lucene/legacy search) where the link in some of the results looks like this:

~/link/70e7eef49e884a37b95f149a44915b82.aspx?epslanguage=sv

When I click it, I get a 404 page.

The search results seem to contain multiple versions of the same page (basically the same title, ingress and body, but with slight variations in spelling etc.), so I'm guessing that older versions of the page might have been indexed. But this is just a guess on my part.

I tried deleting all the pages in the trash and reindexing, but the search result is the same, so the indexer is probably not picking up the trash.

I also noticed that the files in the Ref folder (in Index) haven't been updated, but the files in the Main folder have. How does Lucene work with these two folders?

Has anybody seen this error before, where search results show up with an unfriendly URL that doesn't seem to point anywhere?

#255102
Edited, May 19, 2021 10:29

Off the top of my head, this could happen if the content is unpublished. In that case you might get an internal link, which obviously leads to a 404.
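
To check which content item such a link refers to, the GUID can be pulled out of the URL and looked up in the CMS. Below is a minimal Python sketch, assuming the link format is exactly what the single example in this thread shows (`~/link/<32 hex characters>.aspx` with an optional `epslanguage` query parameter). The function name `parse_internal_link` is hypothetical and not part of any Episerver API.

```python
import re
import uuid
from urllib.parse import urlsplit, parse_qs

# Assumption: pattern inferred from the one example URL in this thread.
LINK_PATTERN = re.compile(r"^~/link/([0-9a-fA-F]{32})\.aspx$")


def parse_internal_link(url):
    """Return (GUID, language) for an internal link, or None if the URL does not match."""
    parts = urlsplit(url)
    match = LINK_PATTERN.match(parts.path)
    if match is None:
        return None
    # The 32 hex characters form a GUID without dashes.
    guid = uuid.UUID(hex=match.group(1))
    # "epslanguage" is optional; language is None when it is missing.
    language = parse_qs(parts.query).get("epslanguage", [None])[0]
    return guid, language


if __name__ == "__main__":
    print(parse_internal_link(
        "~/link/70e7eef49e884a37b95f149a44915b82.aspx?epslanguage=sv"
    ))
```

Running this over the URLs in the search result would give a list of GUIDs to compare against the published and unpublished pages in the site.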

#255178
May 20, 2021 8:59

@Scott Reed, any ideas on how to make the indexer skip unpublished pages?

#255185
May 20, 2021 10:27