Johan Björnfot
Nov 15, 2016

Internationalized Resource Identifiers (IRIs)

An Internationalized Resource Identifier (IRI) is a network address that contains non-ASCII characters, as shown below:

[Image: IRI.png]

Previously (prior to version 10), EPiServer CMS only allowed characters in URL segments according to RFC 1738, which basically allows ALPHA / DIGIT / '-' / '_' / '~' / '.' / '$'.

From CMS.Core version 10.1.0, however, it is possible to define a custom character set that is used for URL segments and simple addresses. This is done by registering an instance of UrlSegmentOptions with a custom regular expression in the IoC container. When the expression allows characters outside RFC 1738, it is recommended to set UrlSegmentOptions.Encode to true so that URLs are properly encoded. Below is an example of how to define a character set that allows Unicode characters in the letter category.

using EPiServer.ServiceLocation;
using EPiServer.Framework.Initialization;
using EPiServer.Framework;
using EPiServer.Web;

namespace EPiServerSite6
{
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class IRIConfigurationModule : IConfigurableModule
    {
        public void ConfigureContainer(ServiceConfigurationContext context)
        {
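            // Replace any existing registration with a custom UrlSegmentOptions instance.
            context.Services.RemoveAll<UrlSegmentOptions>();
            context.Services.AddSingleton<UrlSegmentOptions>(s => new UrlSegmentOptions
            {
                // Recommended when characters outside RFC 1738 are allowed.
                Encode = true,
                // Any Unicode letter (\p{L}), digits and the symbols - _ ~ . $
                ValidUrlCharacters = @"\p{L}0-9\-_~\.\$"
            });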
        }

        public void Initialize(InitializationEngine context)
        {}

        public void Uninitialize(InitializationEngine context)
        {}
    }
}
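
To get a feel for what the expression above admits, the character set can be tried with plain .NET regular expressions outside of the CMS. The sketch below is only an illustration: it assumes the value of ValidUrlCharacters behaves like the body of a character class, which this post does not confirm, and SegmentCharacterDemo and IsValidSegment are hypothetical names. Uri.EscapeDataString is used as a stand-in to show what a percent-encoded segment looks like; it is not necessarily what the CMS calls when Encode is true.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for experimenting with character sets; not part of EPiServer.
public static class SegmentCharacterDemo
{
    // The RFC 1738 style set described earlier: ALPHA / DIGIT / - _ ~ . $
    public const string Rfc1738Characters = @"A-Za-z0-9\-_~\.\$";

    // The set from the example module: any Unicode letter plus digits and the same symbols.
    public const string UnicodeLetterCharacters = @"\p{L}0-9\-_~\.\$";

    // Assumption: the set is wrapped in a character class that must match the whole segment.
    public static bool IsValidSegment(string segment, string validCharacters)
    {
        return Regex.IsMatch(segment, "^[" + validCharacters + "]+$");
    }

    public static void Main()
    {
        Console.WriteLine(IsValidSegment("blåbär", Rfc1738Characters));        // False
        Console.WriteLine(IsValidSegment("blåbär", UnicodeLetterCharacters));  // True
        Console.WriteLine(IsValidSegment("my page", UnicodeLetterCharacters)); // False, space is not in the set

        // Percent-encoded (UTF-8) form of the same segment.
        Console.WriteLine(Uri.EscapeDataString("blåbär"));                     // bl%C3%A5b%C3%A4r
    }
}
```

The encoded form is why Encode matters once letters outside ASCII are allowed: the segment stays readable in the editor, while the URL that goes on the wire only contains ASCII characters.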

UrlSegmentOptions also exposes a CharacterMap property where it is possible to define a mapping for unsupported characters, for example 'ö' => 'o'. 
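
The idea behind such a mapping can be sketched without the CMS. The example below uses a plain Dictionary<char, string> and a hypothetical ApplyMap helper to show the replacement step. The exact type of CharacterMap and the point at which the CMS applies it are not covered in this post, so treat those details as assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical illustration of a character mapping; not part of EPiServer.
public static class CharacterMapDemo
{
    // Replaces every character that has an entry in the map and keeps all others as they are.
    public static string ApplyMap(string segment, IDictionary<char, string> map)
    {
        var result = new StringBuilder(segment.Length);
        foreach (char c in segment)
        {
            string replacement;
            if (map.TryGetValue(c, out replacement))
            {
                result.Append(replacement);
            }
            else
            {
                result.Append(c);
            }
        }
        return result.ToString();
    }

    public static void Main()
    {
        var map = new Dictionary<char, string>
        {
            { 'ö', "o" },
            { 'å', "a" },
            { 'ä', "a" }
        };

        Console.WriteLine(ApplyMap("smörgåsbord", map)); // smorgasbord
    }
}
```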

Internationalized Domain Names (IDN)

As explained in "IDN and IRI", internationalized domain names are registered in their punycode format (a way of representing Unicode characters using only ASCII characters).

Internationalized domain names should be registered in admin mode under Manage Websites in their punycode format. 
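
If the punycode form of a host name is not at hand, one way to obtain it is the IdnMapping class in System.Globalization. The sketch below is a minimal example; bücher.example is a made-up host name used only for illustration, and PunycodeDemo is a hypothetical helper class.

```csharp
using System;
using System.Globalization;

// Hypothetical helper showing conversion between Unicode and punycode host names.
public static class PunycodeDemo
{
    // Converts a Unicode host name to its ASCII (punycode) form.
    public static string ToPunycode(string host)
    {
        return new IdnMapping().GetAscii(host);
    }

    // Converts a punycode host name back to its Unicode form.
    public static string FromPunycode(string host)
    {
        return new IdnMapping().GetUnicode(host);
    }

    public static void Main()
    {
        Console.WriteLine(ToPunycode("bücher.example"));          // xn--bcher-kva.example
        Console.WriteLine(FromPunycode("xn--bcher-kva.example")); // bücher.example
    }
}
```

The value returned by ToPunycode is the form to enter under Manage Websites.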
