Johan Björnfot
Nov 15, 2016
  4822
(8 votes)

Internationalized Resource Identifiers (IRIs)

An Internationalized Resource Identifier (IRI) is a network address that contain non ASCII characters as below:

Image IRI.png

EPiServer CMS has previously (prior to 10) only allowed characters in url segments according to RFC 1738 which basically allows ALPHA / DIGIT / '-'/ '_'/ '~' / '.'/ '$'/. 

It is now (from CMS.Core version 10.1.0) however possible to define a custom character set that are used for url segments and simple address. This is done by registering an instance of UrlSegmentOptions with a custom regular expression in IOC container. When an expression is set that allows characters outside RFC 1738 the setting UrlSegementOptions.Encode is recommended to be set to true so that url:s gets properly encoded. Below is an example of how a character set that allows unicode characters in the letter category.

using EPiServer.ServiceLocation;
using EPiServer.Framework.Initialization;
using EPiServer.Framework;
using EPiServer.Web;

namespace EPiServerSite6
{
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class IRIConfigurationModule : IConfigurableModule
    {
        public void ConfigureContainer(ServiceConfigurationContext context)
        {
            context.Services.RemoveAll<UrlSegmentOptions>();
            context.Services.AddSingleton<UrlSegmentOptions>(s => new UrlSegmentOptions
            {
                Encode = true,
                ValidUrlCharacters = @"\p{L}0-9\-_~\.\$"
            });
        }

        public void Initialize(InitializationEngine context)
        {}

        public void Uninitialize(InitializationEngine context)
        {}
    }
}

UrlSegmentOptions also exposes a CharacterMap property where it is possible to define a mapping for unsupported characters, for example 'ö' => 'o'. 

Internationalized Domain Names (IDN)

As explained in IDN and IRI are internationalized domain names registered in its punycode format (a way of representing Unicode characters using only ASCII characters). 

Internationalized domain names should be registered in admin mode under Manage Websites in their punycode format. 

Nov 15, 2016

Comments

Please login to comment.
Latest blogs
PageCriteriaQueryService builder with Blazor and MudBlazor

This might be a stupid idea but my new years resolution was to do / test more stuff so here goes. This razor component allows users to build and...

Per Nergård (MVP) | Feb 10, 2025

Enhancing Optimizely CMS Multi-Site Architecture with Structured Isolation

The main challenge of building an Optimizely CMS website is to think about its multi site capabilities up front. Making adjustment after the fact c...

David Drouin-Prince | Feb 9, 2025 | Syndicated blog

How to: set access right to folders

Today I stumped upon this question Solution for Handling File Upload Permissions in Episerver CMS 12, and there is a simple solution for that Using...

Quan Mai | Feb 7, 2025 | Syndicated blog

Poking around in the new Visual Builder and the SaaS CMS

Early findings from using a SaaS CMS instance and the new Visual Builder grids.

Johan Kronberg | Feb 7, 2025 | Syndicated blog

Be careful with your (order) notes

This happened a quite ago but only now I have had time to write about it. Once upon a time, I was asked to look into a customer database (in a big...

Quan Mai | Feb 5, 2025 | Syndicated blog

Imagevault download picker

This open source extension enables you to download images as ImageData with ContentReference from the ImageVault picker. It serves as an alternativ...

Luc Gosso (MVP) | Feb 4, 2025 | Syndicated blog