Canonical with fallback languages

sebp

Hello!

We've recently implemented the standard Optimizely methods (see below) for generating canonical and alternate URLs, and it works pretty well.

@Html.CanonicalLink()
@Html.AlternateLinks()

However, we're wondering why we're not getting links for pages that use a replacement or fallback language.

For instance we have this page about sustainability which exists in Spanish (see below)

When visiting this page in the browser on the Mexican site, it works fine and shows the Spanish fallback/replacement content.

But the alternate/canonical links are not using the es-mx URLs but the es-es URL instead, and the alternate links do not even include the link(s) that have a fallback.

Are we missing something?

#336654

Feb 11, 2025 13:55

Eric Herlitz

We had the same issue and had to write our own logic to handle this.

Try this code:

using System;
using System.Globalization;
using System.Linq;

using EPiServer.Core;
using EPiServer.Web;
using EPiServer.Web.Routing;

namespace yadayada;

public class CanonicalUrlFactory
{
    private readonly IUrlResolver urlResolver;
    private readonly ISiteDefinitionResolver siteDefinitionResolver;

    public CanonicalUrlFactory(
        IUrlResolver urlResolver,
        ISiteDefinitionResolver siteDefinitionResolver
    )
    {
        this.urlResolver = urlResolver ?? throw new ArgumentNullException(nameof(urlResolver));
        this.siteDefinitionResolver = siteDefinitionResolver ?? throw new ArgumentNullException(nameof(siteDefinitionResolver));
    }

    //Modified and copied from http://www.dodavinkeln.se/post/how-to-get-the-external-url-to-content
    //This method also works for MediaData.
    public virtual string Create(
        ContentReference contentLink,
        CultureInfo contentLanguage,
        bool absoluteUrl = true
    )
    {
        var result = this.urlResolver.GetUrl(
            contentLink,
            contentLanguage.Name,
            new VirtualPathArguments
            {
                ContextMode = ContextMode.Default,
                ForceCanonical = absoluteUrl
            });

        // HACK: Temporary fix until GetUrl and ForceCanonical work as expected,
        // i.e. returning an absolute URL even if there is an HTTP context that matches the content's site definition and host.
        if(!absoluteUrl)
        {
            return result;
        }

        if(!Uri.TryCreate(result, UriKind.RelativeOrAbsolute, out var relativeUri))
        {
            return result;
        }

        if(relativeUri.IsAbsoluteUri)
        {
            return result;
        }

        var siteDefinition = this.siteDefinitionResolver.GetByContent(
            contentLink: contentLink,
            fallbackToWildcard: true,
            fallbackToEmpty: true);
        var hosts = siteDefinition.GetHosts(
                language: contentLanguage,
                fallbackToUnmapped: true)
            .ToList();
        var host = hosts.FirstOrDefault(h => h.Type == HostDefinitionType.Primary) ??
                   hosts.FirstOrDefault(h => h.Type == HostDefinitionType.Undefined);
        var baseUri = siteDefinition.SiteUrl;

        // Avoid exception if host is missing, i.e. page without startpage
        if(baseUri == null)
        {
            return string.Empty;
        }

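        if(host != null && host.Name.Equals("*") == false)
        {
            // Try to create a new base URI from the host with the site's URI scheme. Name should be a valid
            // authority, i.e. have a port number if it differs from the URI scheme's default port number.
            // Keep the site URL as the base if the host name does not form a valid URI; writing the
            // TryCreate result straight into baseUri would null it on failure and make new Uri(...) throw.
            if(Uri.TryCreate(siteDefinition.SiteUrl.Scheme + "://" + host.Name, UriKind.Absolute, out var hostUri))
            {
                baseUri = hostUri;
            }
        }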

        var absoluteUri = new Uri(baseUri, relativeUri);

        return absoluteUri.AbsoluteUri;
    }
}
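
For anyone reading along, the last step of the method (building a base URI from the site's scheme plus the host name, then combining it with the relative URL) can be tried in isolation with nothing but System.Uri. This is only a sketch: the class and method names and the example.com / example.mx hosts are made up for illustration and are not part of the Optimizely API.

```csharp
using System;

public static class AbsoluteUrlSketch
{
    // Combines a relative URL with a base built from the site's scheme and a host name.
    // Falls back to the site URL when the host is missing, is the "*" wildcard,
    // or does not form a valid absolute URI.
    public static string ToAbsolute(Uri siteUrl, string hostName, string relativeUrl)
    {
        var baseUri = siteUrl;

        if (!string.IsNullOrEmpty(hostName) && hostName != "*" &&
            Uri.TryCreate(siteUrl.Scheme + "://" + hostName, UriKind.Absolute, out var hostUri))
        {
            baseUri = hostUri;
        }

        // new Uri(base, relative) resolves the relative part against the base URI.
        return new Uri(baseUri, new Uri(relativeUrl, UriKind.Relative)).AbsoluteUri;
    }
}
```

With a site URL of https://www.example.com/ and a language-specific host www.example.mx, a relative URL such as /es-mx/sobre/sustainability/ comes back as https://www.example.mx/es-mx/sobre/sustainability/. With a "*" host it stays on the site URL.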

#336846

Feb 17, 2025 21:10

sebp - Feb 18, 2025 7:14

Thank you Eric, I will give this a go :)

- Feb 24, 2025 9:57

This code seems to be a bit outdated. Instead of using ForceCanonical, you can use ForceAbsolute to always get an absolute URL (if that's what you want).

sebp

It seems this is working out pretty well, so I wrote my own method for doing the same for alternate links. I'm not sure it's the best solution, but it seems to work! Here it is if anyone needs it :)

public Dictionary<string, string> GetAlternateLinks(ContentReference contentReference)
{
    // No alternate links in edit or preview mode (callers must handle null).
    var isEditMode = _contextModeResolver.CurrentMode.EditOrPreview();
    if (isEditMode)
    {
        return null;
    }

    var alternates = new Dictionary<string, string>();
    var allLanguages = _languageBranchRepository.ListEnabled().Select(x => x.Culture);

    var startPage = _contentLoader.Get<StartPage>(ContentReference.StartPage);
    var masterLanguage = startPage.MasterLanguage;

    foreach (var lang in allLanguages)
    {
        var url = GetCanonicalUrl(contentReference, lang);
        if (string.IsNullOrEmpty(url))
        {
            continue;
        }

        // Route the URL back to content and skip it if nothing resolves.
        // The null check comes before IsPublished so null is never passed to it.
        var routedContent = _urlResolver.Route(new UrlBuilder(url));
        if (routedContent == null || ContentReference.IsNullOrEmpty(routedContent.ContentLink))
        {
            continue;
        }

        if (!_publishedStateAssessor.IsPublished(routedContent, PublishedStateCondition.None))
        {
            continue;
        }

        alternates[lang.Name] = url;

        // CultureInfo does not overload ==, so compare by value instead of by reference.
        if (lang.Equals(masterLanguage))
        {
            alternates["x-default"] = url;
        }
    }

    return alternates;
}
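
In case it helps, here is a rough sketch of how the returned dictionary could be turned into link tags for the head section. It uses only the .NET base library. The AlternateLinkRenderer name is made up for this example, and treating null as "render nothing" is an assumption based on the method returning null in edit mode.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text;

public static class AlternateLinkRenderer
{
    // Renders one <link rel="alternate"> tag per entry, where the key is the
    // hreflang value (e.g. "es-mx" or "x-default") and the value is the URL.
    public static string Render(IDictionary<string, string> alternates)
    {
        // GetAlternateLinks returns null in edit/preview mode; render nothing then.
        if (alternates == null)
        {
            return string.Empty;
        }

        var html = new StringBuilder();
        foreach (var pair in alternates)
        {
            html.Append("<link rel=\"alternate\" hreflang=\"")
                .Append(WebUtility.HtmlEncode(pair.Key))
                .Append("\" href=\"")
                .Append(WebUtility.HtmlEncode(pair.Value))
                .Append("\" />\n");
        }

        return html.ToString();
    }
}
```

In a Razor view the result would have to be written out as raw HTML (for example via Html.Raw), since the tags are already encoded here.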

#336906

Edited, Feb 19, 2025 10:49

Eric Herlitz - Feb 19, 2025 11:14

Awesome, I'm sure it will serve you just fine!

Johan Petersson

Canonical links should not contain fallback or replacement languages/URLs. The purpose of the canonical links is to tell search engines where the original content is located so it doesn't index duplicates. Es-es and es-mx are duplicates in this case, hence es-es should be the canonical URL of this page.

If you navigate to /es-es/sobre-xxx/sustainability/, is there an alternate link rendered to /es-mx/sobre-xxx/sustainability/ with hreflang set to 'es-mx'?

#336987

Edited, Feb 24, 2025 9:55

sebp - Mar 07, 2025 6:52

When the client first approached me with their request, my initial response was to suggest checking with our SEO expert to determine what actually needed to be addressed. The feedback I received was that they were using "Screaming Frog," and the software flagged an issue: the canonical tag was set to "es-es" even when accessing the page from "es-mx."

That said, I think your explanation makes the most sense. If the canonical tag is meant to refer to the original content, it absolutely should remain "es-es," even when the page is accessed from "es-mx."

Also, I can confirm that the "es-mx" alternate link does render correctly when viewing the sustainability page in "es-es."

Maxime Messely - Jul 11, 2025 7:22

Hi Johan. From a purely technical standpoint, I do understand why you say that and why Optimizely falls back to its original's language. However, the purpose of canonical links towards a search engine is not solely to say that there is an "original" of the current page somewhere else, but it will actively stop indexing the current page and index the canonical instead. On a website intended for a global audience, it's not great to have a lot of (localized) urls not indexed. Having the alternate hreflang links present does not undo the damage that the canonical does.

And, that's all assuming that the content *is* actually identical. In our case, we had components on the page that render content from external systems based on the context language. Same thing can happen when content blocks are translated but used on a non-translated page. The content will be different, yet Optimizely will still claim the canonical is a different URL because it is only taking the *page* level into account. I will be making my custom implementation for now.

Btw, I LOL'd at the "screaming frog" bit that sebp posted. 😁 Same happened to me.