Wow - I just logged onto the forum to post the exact same question!
+1 I also need to figure this out.
Reflector shows that the class EPiServer.Find.Cms.ContentIndexer is responsible for executing the indexing. In its ReIndex method it filters out pages that have an external link :(
EPiServer.Find.Cms.ContentIndexer.ReIndex:
...
if ((!(content is PageData) || !((PageData) content).LinkURL().IsAbsoluteUrl()) && this.ShouldIndex(content))
{
    list2.Add(content);
}
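To make the effect of that condition concrete, here is a minimal, self-contained sketch of the same boolean shape. It does not use the EPiServer API: PassesFilter and IsAbsoluteUrl are hypothetical stand-ins, and Uri.IsWellFormedUriString is only assumed to behave roughly like the decompiled IsAbsoluteUrl() check.

```csharp
using System;

public static class ReIndexFilterSketch
{
    // Hypothetical stand-in for ((PageData) content).LinkURL().IsAbsoluteUrl().
    public static bool IsAbsoluteUrl(string url)
    {
        return Uri.IsWellFormedUriString(url, UriKind.Absolute);
    }

    // Same shape as the decompiled condition:
    // (!(content is PageData) || !linkUrlIsAbsolute) && ShouldIndex(content)
    public static bool PassesFilter(bool isPage, string linkUrl, bool shouldIndex)
    {
        return (!isPage || !IsAbsoluteUrl(linkUrl)) && shouldIndex;
    }

    public static void Main()
    {
        // Page with a relative (internal) link URL: kept for indexing.
        Console.WriteLine(PassesFilter(true, "/en/about-us/", true));
        // Page with an absolute (external shortcut) URL: dropped.
        Console.WriteLine(PassesFilter(true, "http://google.com", true));
        // Non-page content: the URL check is skipped entirely.
        Console.WriteLine(PassesFilter(false, "http://google.com", true));
    }
}
```

Under these assumptions, only the second case is filtered out, which matches the behaviour described: any page whose link URL is absolute never reaches the list that gets indexed.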
I was thinking about implementing my own version of IContentIndexer that handles ReIndex correctly, but 'Find CMS Indexing Job' does not use dependency injection to retrieve an instance of IContentIndexer; instead it uses the hard-wired ContentIndexer.Instance. So the existing job can't be made to use a custom indexer. I guess I have to write my own indexing job too.
I registered a developer incident about this matter. The fix is merely the removal of one condition, so near and yet so far :/
I created a simple scheduled job as a workaround for now.
// Namespaces are assumed from the types used below; adjust them to your EPiServer and Find versions.
using System;
using System.Collections.Generic;
using System.Linq;
using EPiServer;
using EPiServer.Core;
using EPiServer.Find.Cms;
using EPiServer.ServiceLocation;
using log4net;

[EPiServer.PlugIn.ScheduledPlugIn(
    DisplayName = "Index external shortcut pages",
    Description = @"'Find CMS Indexing Job' doesn't index pages with external shortcut. This does. Execute after 'Find CMS Indexing Job'",
    SortIndex = 10200)]
public class IndexExternalShortcutPagesJob
{
    private static readonly ILog Log = LogManager.GetLogger(typeof (IndexExternalShortcutPagesJob));
    private IContentRepository ContentRepository { get; set; }
    private IContentIndexer Indexer { get; set; }
    private int _total;
    private int _indexed;
    private int _errors;

    // Entry point called by the scheduler; the returned string is shown as the job status.
    public static string Execute()
    {
        // Note! Implement your own user impersonation for scheduled jobs
        //using (new UserImpersonation(Settings.Config.ScheduledJobUser))
        //{
        var contentRepository = ServiceLocator.Current.GetInstance<IContentRepository>();
        var contentIndexer = ContentIndexer.Instance;
        var job = new IndexExternalShortcutPagesJob(contentIndexer, contentRepository);
        return job.Index();
        //}
    }

    private IndexExternalShortcutPagesJob(IContentIndexer contentIndexer, IContentRepository contentRepository)
    {
        Indexer = contentIndexer;
        ContentRepository = contentRepository;
    }

    // Indexes every external shortcut page found under the root and reports the counts.
    private string Index()
    {
        try
        {
            var pages = GetExternalShortcutPages(ContentReference.RootPage).ToList();
            // Each language branch is passed in explicitly, so don't expand languages again.
            var options = new IndexOptions { IndexAllLanguageVersions = false };
            var results = Indexer.Index(pages, options).ToList();
            _total = results.Count;
            _indexed = results.Count(result => result.Ok);
            _errors = _total - _indexed;
        }
        catch (Exception ex)
        {
            _errors++;
            Log.Error(ex);
        }
        return string.Format("Total count: {0}<br>Indexed: {1}<br>Errors: {2}<br>(executed on {3})",
            _total, _indexed, _errors, Environment.MachineName);
    }

    // Breadth-first walk of the page tree, yielding every language branch
    // whose shortcut type is External.
    private IEnumerable<PageData> GetExternalShortcutPages(PageReference startpoint)
    {
        var pages = new Queue<PageData>();
        pages.Enqueue(ContentRepository.Get<PageData>(startpoint, LanguageSelector.MasterLanguage()));
        while (pages.Any())
        {
            var page = pages.Dequeue();
            // Plain foreach, so no ForEach extension method on IEnumerable<T> is required.
            foreach (var child in ContentRepository.GetChildren<PageData>(page.ContentLink, LanguageSelector.MasterLanguage()))
            {
                pages.Enqueue(child);
            }
            var branches = ContentRepository.GetLanguageBranches<PageData>(page.PageLink)
                .Where(branch => branch.LinkType == PageShortcutType.External);
            foreach (var branch in branches)
            {
                yield return branch;
            }
        }
    }
}
Thanks for the solution, jouni.
I tried it out in my application and the external link pages were indexed, but I had an issue with the URL that was returned for the external link results through unified search.
For example:
I had a page in the root of the website called 'example-link' which had a shortcut set to http://google.com.
When the page was returned as a result through unified search, the URL was set to http://google.com/example-link.
I spent a little time in Reflector trying to figure out how this was happening, to no avail, so in the end I created a special page type for external links where the URL is set on a custom field.
This has been registered as a bug: http://world.episerver.com/Support/Bug-list-beta/bug/106128/
Hi,
I want to thank you for reporting this bug. It is now fixed, and if you need the fix right away you can contact developer support. Otherwise it will be part of the patch 5 release.
Hi
Is it normal behaviour that Find CMS Indexing Job doesn't index pages that have an external shortcut set (PageShortcutType.External), and instead removes them from the Find index?
For example, when a shortcut page is edited by a user, it is updated correctly in the Find index. But executing Find CMS Indexing Job removes it from the index! Such contradictory behaviour would be utterly idiotic, so either the problem is in my solution or the indexing job is flawed. Has anyone else experienced the same problem? :)