Johan Björnfot
Nov 15, 2016
  3599
(8 votes)

Internationalized Resource Identifiers (IRIs)

An Internationalized Resource Identifier (IRI) is a network address that contain non ASCII characters as below:

Image IRI.png

EPiServer CMS has previously (prior to 10) only allowed characters in url segments according to RFC 1738 which basically allows ALPHA / DIGIT / '-'/ '_'/ '~' / '.'/ '$'/. 

It is now (from CMS.Core version 10.1.0) however possible to define a custom character set that are used for url segments and simple address. This is done by registering an instance of UrlSegmentOptions with a custom regular expression in IOC container. When an expression is set that allows characters outside RFC 1738 the setting UrlSegementOptions.Encode is recommended to be set to true so that url:s gets properly encoded. Below is an example of how a character set that allows unicode characters in the letter category.

using EPiServer.ServiceLocation;
using EPiServer.Framework.Initialization;
using EPiServer.Framework;
using EPiServer.Web;

namespace EPiServerSite6
{
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class IRIConfigurationModule : IConfigurableModule
    {
        public void ConfigureContainer(ServiceConfigurationContext context)
        {
            context.Services.RemoveAll<UrlSegmentOptions>();
            context.Services.AddSingleton<UrlSegmentOptions>(s => new UrlSegmentOptions
            {
                Encode = true,
                ValidUrlCharacters = @"\p{L}0-9\-_~\.\$"
            });
        }

        public void Initialize(InitializationEngine context)
        {}

        public void Uninitialize(InitializationEngine context)
        {}
    }
}

UrlSegmentOptions also exposes a CharacterMap property where it is possible to define a mapping for unsupported characters, for example 'ö' => 'o'. 

Internationalized Domain Names (IDN)

As explained in IDN and IRI are internationalized domain names registered in its punycode format (a way of representing Unicode characters using only ASCII characters). 

Internationalized domain names should be registered in admin mode under Manage Websites in their punycode format. 

Nov 15, 2016

Comments

Please login to comment.
Latest blogs
Implementing EmbeddedLocalization in Optimizely CMS 12

My previous post on translation (Translating Optimizely CMS 12 UI components) gives an overview of how to implement the FileXmlLocalizationProvider...

Eric Herlitz | Jan 27, 2023 | Syndicated blog

Breaking changes in EPiServer.CMS.TinyMce 4.0.0

After upgrading to the latest version of EPiServer.CMS.TinyMce, the dropdown with formats disappears. Learn how to get it back!

Tomas Hensrud Gulla | Jan 27, 2023 | Syndicated blog

Translating Optimizely CMS 12 UI components

Optimizely CMS 12 have been out for a while now, but still some elements haven't been properly translated resulting in a GUI defaulting to english....

Eric Herlitz | Jan 26, 2023 | Syndicated blog

Image preview in Optimizely CMS12 all properties view

With these simple steps, you can now see an Image and its Metadata, including size and dimensions, when editing an Image property in Optimizely...

Tomas Hensrud Gulla | Jan 26, 2023 | Syndicated blog