Globalization

Product version:

EPiServer CMS 5 SP1

Document version:

1.0

Document creation date:

07-06-2006

Document last saved:

21-01-2008

Introduction

This technical note outlines some of the technical issues regarding globalized Web sites. For further information about EPiServer CMS 5 globalization support, see the white paper, "Working with Multiple Languages in EPiServer". 

Display Language to Web Site Visitors

How does EPiServer know which language to display to visitors? The short answer is that EPiServer requires the language to be visible in the URL, either in the path or in the domain part of the URL.

The reasons for this are simple:

  • search engines, such as Google, must be able to crawl a Web site and easily separate content
  • users expect to be able to cut and paste a link into an e-mail and send it to a friend so that the friend can click on the link and will always get the same content.

There are also some technical reasons such as output caching in .NET and Web-browser caching on the client that expect a single URL to render the same content to anonymous users.

Types of Language Settings

There are three different language concepts in EPiServer: two that are defined by ASP.NET (culture and uiCulture), and one that is the EPiServer content language. We refer to the ASP.NET culture as "System Language" and to uiCulture as "User Interface Language". All languages and language settings are expressed as cultures (as defined by the .NET CultureInfo object). A typical culture is "en-US", which defines the language as English (en) with the culturally defined specifics for the United States (US). In some cases you may define only the language, such as "sv", which defines the neutral culture Swedish.
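As a rough illustration of the difference between a specific and a neutral culture, the following Python sketch splits a culture name into its language and region parts. It is not the .NET CultureInfo API; the names Culture and parse_culture are hypothetical, and the sketch assumes simple two-part names such as "en-US" (names with script subtags are not handled).

```python
from typing import NamedTuple, Optional


class Culture(NamedTuple):
    """Hypothetical stand-in for a culture name split into its parts."""
    language: str
    region: Optional[str]

    @property
    def is_neutral(self) -> bool:
        # A neutral culture names only a language, with no region part.
        return self.region is None


def parse_culture(name: str) -> Culture:
    """Split a name such as "en-US" into language "en" and region "US"."""
    language, _, region = name.partition("-")
    return Culture(language.lower(), region.upper() if region else None)
```

With this model, parse_culture("en-US") is a specific culture, while parse_culture("sv") is neutral.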

Content Language

This controls which language version of the page content is displayed. It can be a neutral culture or a specific culture.

The preferred content language is determined by the following rules:

  1. If there is a language indicator in the URL (a friendly URL like http://company.com/en/info , or a classic URL like http://company.com/templates/page.aspx?id=23&epslanguage=en), that language is used (en).
  2. If you are in Edit mode and have a language selection from the language selection drop-down list, that language is used.
  3. If you have defined the host name to be associated with a specific language, that language is used (see the web.config section <siteHosts>).
  4. If the request contains a cookie named epslanguage, the language defined by the cookie is used.
  5. If the web.config setting pageUseBrowserLanguagePreferences is true, then the language preference from the Web browser is used.
  6. If nothing else is discovered, use the first enabled language branch as defined in Admin / Language Branches, i.e. this could be viewed as the default language.
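
The order of the rules above can be sketched as a simple "first match wins" function. This is an illustrative Python model only, not EPiServer code; the function name and all parameter names are hypothetical, and each input is assumed to have been extracted from the request beforehand (None means "not present").

```python
from typing import Optional, Sequence


def preferred_content_language(
    url_language: Optional[str],
    edit_mode_language: Optional[str],
    host_language: Optional[str],
    cookie_language: Optional[str],
    browser_language: Optional[str],
    use_browser_preferences: bool,
    enabled_branches: Sequence[str],
) -> str:
    """Return the preferred content language using rules 1 to 6 in order."""
    candidates = [
        url_language,        # rule 1: language indicator in the URL
        edit_mode_language,  # rule 2: Edit mode drop-down selection
        host_language,       # rule 3: host name mapped to a language
        cookie_language,     # rule 4: epslanguage cookie
        # rule 5: browser preference, only if the setting is enabled
        browser_language if use_browser_preferences else None,
    ]
    for candidate in candidates:
        if candidate:
            return candidate
    # rule 6: the first enabled language branch acts as the default
    return enabled_branches[0]
```

Note that the URL always wins over the cookie in this model, which matches the advice later in this note to treat the URL as the primary carrier of language context.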

Note: After the preferred content language has been determined, another step uses it to determine the actual language to display. If the current page does not exist in the preferred content language, a language fallback process is started. The fallback is defined by the page language settings in Edit mode. See Language Selector below.

System Language

The system language determines how listings are sorted, how to format date and time, etc. Since these types of formatting rules are culturally dependent, the system language must not be a neutral culture.

The system language is determined by the following rules:

  1. If not in Edit/Admin, use the Content Language.
  2. If user is logged on and profiles are enabled, use the personalized language selection for this user.
  3. Use the setting (xx-YY) from web.config ( <globalization culture="xx-YY" ... /> ). Note that if culture is set to auto, the language preference from the Web browser is used.

User Interface Language

The user interface language is only used to retrieve localized texts in the ASP.NET application. It defines the majority of the texts in Edit and Admin mode of EPiServer, but for a site visitor it applies only to minor elements such as texts on buttons. For a visitor the majority of information is usually content, which is defined by the content language.

The user interface language is determined by the following rules:

  1. If not in Edit/Admin, use the Content Language
  2. If the user is logged on and profiles are enabled, use the personalized language selection for this user.
  3. Use the setting (xx-YY) from web.config ( <globalization uiCulture="xx-YY" ... /> ). Note that if uiCulture is set to auto, the language preference from the Web browser is used.
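
The system language and the user interface language follow the same three rules and differ only in which web.config attribute they read (culture or uiCulture). The following Python sketch models that shared decision. It is not EPiServer or ASP.NET code; the function and parameter names are hypothetical, and profile_language is assumed to be None when the user is not logged on or profiles are disabled.

```python
from typing import Optional


def resolve_language(
    in_edit_or_admin: bool,
    content_language: str,
    profile_language: Optional[str],
    configured: str,
    browser_language: Optional[str],
) -> Optional[str]:
    """Apply the three rules shared by system and user interface language."""
    # Rule 1: outside Edit/Admin mode the content language is used.
    if not in_edit_or_admin:
        return content_language
    # Rule 2: a personalized language selection takes precedence.
    if profile_language:
        return profile_language
    # Rule 3: fall back to the web.config value; "auto" defers to the browser.
    if configured.lower() == "auto":
        return browser_language
    return configured
```

Call it with the culture value to model the system language and with the uiCulture value to model the user interface language.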

Globalization Scenarios

This chapter outlines some recommended globalization scenarios and describes how to manage them.

Scenario 1 - Global Domain with Multiple Languages

You want all your visitors to go to the official "site.com" address, but need to display different content depending on their language selection.

  1. Set a default language that most of your visitors will understand, normally English for a global site. According to the rules for determining content language, the first enabled language branch (rule 6) is in effect the default language.
  2. Activate language detection based on browser preference (pageUseBrowserLanguagePreferences="true" in web.config).
  3. To implement a language selection option for the site visitors, see code in the Public Templates package.

Test the Configuration

Test the configuration by following the instructions below:

  1. Open Internet Explorer and select Internet Options from the Tools menu. Click Languages in the History group box and select the language preference.
  2. Browse to the site. You should be redirected to the correct language, if your language selection can be matched to a content language.

Scenario 2 - Local Domains Mapped to Languages

This approach is when you want http://site.se to default to Swedish language without having the path in the URL contain the language. This approach requires configuration.

  1. Add the host name configuration, for example:

<site description="Example Site">
    <siteHosts>
        <add name="www.site.se" language="sv" />
        <add name="www.site.no" language="no" />
        <add name="www.site.co.uk" language="en-GB" />
        <add name="*" />
    </siteHosts>
    <!-- remaining site configuration omitted -->
</site>

  2. Add links or flags to the header or start page that link to the other languages using their domain names.

The above example sets the preferred content language to Swedish (sv) for any request to www.site.se, to Norwegian (no) for www.site.no and to British English (en-GB) for www.site.co.uk. Note the last <add> element: it is a "default" for any host name that does not match any other setting. It can contain a language attribute, but when the attribute is left out, as here, you could have a www.site.com that uses browser language preferences as per scenario 1.
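
The host lookup described above can be modeled as follows. This is a minimal Python sketch, not how EPiServer reads web.config; SITE_HOSTS and language_for_host are hypothetical names, and the sketch assumes exact host name matching with "*" as the only wildcard.

```python
from typing import Dict, Optional

# Hypothetical in-memory copy of the <siteHosts> entries from the example.
# None means the entry has no language attribute.
SITE_HOSTS: Dict[str, Optional[str]] = {
    "www.site.se": "sv",
    "www.site.no": "no",
    "www.site.co.uk": "en-GB",
    "*": None,
}


def language_for_host(
    host: str, site_hosts: Dict[str, Optional[str]] = SITE_HOSTS
) -> Optional[str]:
    """Return the language mapped to a host, or the "*" default entry."""
    key = host.lower()
    if key in site_hosts:
        return site_hosts[key]
    # No exact match: use the wildcard entry, which may have no language.
    return site_hosts.get("*")
```

A result of None means that the host does not decide the language, so the remaining content language rules (cookie, browser preference, default branch) apply.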

Scenario 3 - Remember User Preference

If you have a single domain and need the user's language preference to persist, you can set a cookie with the current language selection, so that the user is redirected to the correct language on the next visit. This is also useful if you, for example, redirect the user to another system where the language preference is not retained in the URL. Set a cookie named epslanguage (see the rules for Content Language) to make sure that a returning user gets the same language as before.

Be careful when using cookies and always try to build the Web site based on the concept that language is included in the URL. This will ensure that you never lose language context when the user is navigating your site.

Imagine that you have language cookie "no" and click on a link from a friend that leads to the English site. You should of course continue to use the English site when surfing, so use cookies with care if you actually need them.

Page Language

A page is created on a language branch and the first language created becomes the master language branch for that page. This applies to all pages, globalized or not. The master contains all properties, both properties for that language and common properties. When new languages are created for a page, they will only save properties that are specific for that language.

For the developer, the PageData object for every language contains the common properties from the master version, but they are not editable. If you change a property that is not saved per language, you will get an exception when calling the DataFactory.Save method if the language you are publishing is not the master language for that page.
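
The save rule can be pictured with the following Python sketch. It only models the described behavior and is not the DataFactory.Save implementation; validate_save is a hypothetical helper, and the property names MainBody and SortOrder in the usage note are illustrative assumptions.

```python
from typing import Iterable


def validate_save(
    language: str,
    master_language: str,
    changed: Iterable[str],
    language_specific: Iterable[str],
) -> None:
    """Raise ValueError if a common property changed on a non-master language."""
    if language == master_language:
        # The master language stores both common and language-specific data.
        return
    common_changes = sorted(set(changed) - set(language_specific))
    if common_changes:
        raise ValueError(
            "Common properties can only be changed on the master language: "
            + ", ".join(common_changes)
        )
```

For example, changing a language-specific MainBody on the "sv" version of a page whose master language is "en" passes, while also changing a common SortOrder raises the error.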

Note  Those familiar with the database table tblPage should familiarize themselves with the new table tblPageLanguage, which contains a subset of the metadata from tblPage. You may see that some metadata exists both in tblPage and tblPageLanguage, as the master version will, for backwards compatibility, always store its data in both tables, but it is always tblPageLanguage that is used for loading page content.

Page Properties

Language-specific properties are defined by the administrator/developer on a page type. The metadata (for example PageName) that is language-specific cannot be changed and has been defined by the system. Metadata properties that relate to content are language-specific and metadata properties that relate to navigation are common.

The following properties are defined per language:

  • PagePendingPublish
  • PageWorkStatus
  • PageSaved
  • PageChanged
  • PageCreatedBy
  • PageChangedBy
  • PageCreatedSID
  • PageLanguageBranch
  • PageName
  • PageStartPublish
  • PageStopPublish
  • PageChangedOnPublish
  • PageCreated
  • PageLanguageID
  • PageExternalURL
  • PageURLSegment
  • PageShortcutType
  • PageShortcutLink
  • PageTargetFrame
  • PageLinkURL
  • PageDelayedPublish

You can programmatically check if a property is language-specific by checking IsLanguageSpecific on the PropertyData class (see samples for an example).

Language Branch

A language branch has a unique identifier in the database to handle constraints, but is always exposed in APIs as a language code, for example "en". Language codes must therefore be unique; two language branches cannot use the same language code as the reverse lookup would fail.

Page Language Settings

To help the system know which languages should be used on which part of the site, there are Page Language Settings, which define the languages that are available to the editor when creating new pages. They also define fallback languages and replacement languages. The administrative API for page language settings is EPiServer.DataAbstraction.PageLanguageSetting. The runtime API to read settings with support for inheritance is EPiServer.DataFactory.PageLanguageSettings. Page Language Settings do not restrict the languages that are rendered on the site; they only help EPiServer make intelligent choices based on custom settings.

Language Selector

Languages are selected at runtime using a language selector (EPiServer.Core.LanguageSelector). A new instance of this class is created and passed to most methods in the global instance of DataFactory (Global.EPDataFactory). Custom implementations of EPiServer.Core.ILanguageSelector can be used to get a customized language selection. The language selector uses the Page Language Settings to know, for example, when to fall back from a missing language to another.

A language is considered available by the language selector if it has been published (CurrentPage.PendingPublish is false). The language selector does not check publish dates, so when, for example, a news item expires in one language, it is no longer displayed, and no fallback to another language is applied.
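
The following Python sketch illustrates why an expired language version does not trigger fallback. It is a simplified model, not EPiServer.Core.LanguageSelector; the function names, the dictionary layout and the assumption that fallback languages form a simple ordered list are all hypothetical, and replacement languages are not modeled.

```python
from typing import Dict, Optional, Sequence

Versions = Dict[str, Dict[str, bool]]


def select_language(
    preferred: str, fallbacks: Sequence[str], versions: Versions
) -> Optional[str]:
    """Pick the first published language; publish dates are NOT checked."""
    for language in [preferred, *fallbacks]:
        version = versions.get(language)
        if version is not None and version["published"]:
            return language
    return None


def displayed_language(
    preferred: str, fallbacks: Sequence[str], versions: Versions
) -> Optional[str]:
    """Return the language shown to the visitor, or None if nothing is shown."""
    language = select_language(preferred, fallbacks, versions)
    # Expiry is applied after selection, so an expired version hides the
    # page instead of falling back to another language.
    if language is None or versions[language]["expired"]:
        return None
    return language
```

With a published but expired "sv" version and a live "en" version, select_language("sv", ["en"], ...) still returns "sv", so displayed_language returns None.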

Dynamic Properties

Dynamic properties can also be made available per language. Use this with caution, however, since overuse of dynamic properties may negatively affect performance. Only use dynamic properties for administrative settings that must be made per language.

Dynamic properties do not use Page Language Settings and are always loaded with the same language as the page, so for example a Swedish page will always get Swedish dynamic properties (even if displayed on an English site due to fallback configuration).

Archive Page

A page is archived when an archive page has been set and the "stop publish" date has passed. In the process the "stop publish" date will be cleared. On globalized pages you have multiple stop publish dates, but only the master language "stop publish" dates are checked, and when the move is made to archive page, all "stop publish" dates for all languages are cleared.
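
The archiving rule for globalized pages can be sketched as follows. This Python model is illustrative only and does not reflect the actual archive job; archive_if_due and the dictionary keys are hypothetical, and a "stop publish" date equal to the current time is assumed to count as passed.

```python
from datetime import datetime


def archive_if_due(page: dict, now: datetime) -> bool:
    """Move the page to its archive page if the master stop date has passed."""
    master_stop = page["stop_publish"].get(page["master_language"])
    if page["archive_page"] is None or master_stop is None or master_stop > now:
        # Only the master language "stop publish" date is checked.
        return False
    page["parent"] = page["archive_page"]
    # Clear the "stop publish" dates for all languages, not just the master.
    for language in page["stop_publish"]:
        page["stop_publish"][language] = None
    return True
```

In this model a page whose "sv" version has expired but whose master "en" version has not is left where it is.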

Subscription

When a language version of a page is updated, information is sent to all users who have the updated language as their current preference. The user’s language preference is determined from the property PersonalizedData.Language. If the user has no language preference, updates on the first found language are sent to the user. The user’s language preference can also be edited in Admin mode for the user. This list is, however, filtered on the current language files (XML resource files) in the Lang directory, so make sure that all languages have a resource file.

The subscription mailer reads Page Language Settings and will take replacement and fallback language into account.

Note! The subscription is based on some special predefined properties, for example “EPSUBSCRIBE” and “EPSUBSCRIBE-ROOT”, neither of which is supported as language-specific.

Web Browser Preference

The Web browser sends headers that inform the site of the languages that the user prefers. For example, on a Swedish Internet Explorer on Windows XP the header will contain "sv". If this feature is enabled in web.config/System Settings in Admin mode, EPiServer will try to match this value to a language code that is enabled as a Web site language. An exact match is always preferred. For example, for a Web site visitor with the language preference English New Zealand (EN-NZ), EPiServer first tries to find an exact match, but falls back to English (EN) if that is found instead.
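
The matching of a single browser preference against the enabled languages can be modeled like this. It is a minimal Python sketch under stated assumptions, not the EPiServer matching code; match_browser_language is a hypothetical name, comparison is assumed to be case-insensitive, and handling of several weighted preferences in one header is left out.

```python
from typing import Optional, Sequence


def match_browser_language(
    preference: str, enabled: Sequence[str]
) -> Optional[str]:
    """Match one browser preference: exact match first, then the language part."""
    wanted = preference.lower()
    by_lower = {code.lower(): code for code in enabled}
    if wanted in by_lower:
        return by_lower[wanted]
    # No exact match: fall back to the neutral language, e.g. "en-nz" -> "en".
    neutral = wanted.split("-")[0]
    return by_lower.get(neutral)
```

A visitor preferring "en-NZ" gets "en-NZ" if that code is enabled, "en" if only the neutral code is enabled, and no match otherwise.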

You should always use the ISO language codes, but in order to enable the fallback of user preference you may want to, for example, use language code "EN" for the English version that should be "master"-English.

Searching

The search control PageSearch supports searching on either all languages or a list of defined languages. On templates before EPiServer 4.60, searching is performed by default on all languages and results are displayed in the current language if available. The actual search does not use Page Language Settings as part of the search query, but uses them when selecting which language version of the page should be displayed to the user.

No special treatment is used for files so if you have language-specific files, you must separate them in different directories.

Property Searching (FindPagesWithCriteria)

The property search control PropertySearch and the underlying API DataFactory.FindPagesWithCriteria will by default search on all languages. The hits will, however, use the same language selection process as any other page loading. There are two parameters that you can control:

  1. Search only on a specific language
  2. Pass in a language selector, which will be used when returning the actual pages.

This lets you for example find a property only on a specific language branch.