The trust relationship between the primary domain and the trusted domain failed
Ever since launch, our website built on EPiServer CMS 6 has had occasional occurrences of this issue. From the start I said, and believed, that it wasn't a problem with the EPiServer installation but rather a network- or domain-related issue, as the error message suggests.
We found that removing the machine that showed the error from the domain and then adding it back solved the problem. This happened around every second month or so. It was no biggie.
But the issue started to recur more and more often, and for the last few weeks it has been a major problem for the organization.
We badly needed to fix it, because removing the machine from the domain and adding it back no longer always solved the problem; sometimes we had to do it several times.
The setup in short:
- Load balancer in DMZ
- Public server 1 and 2 in the domain
- Staging/edit server in the domain
- Resource/file server in the domain
The servers, let's call them #1 and #2 (public) and #3 (staging/edit), were all added to two groups on the resource server. These groups had R and/or RW permissions on the VPP folders on that server.
We noticed that the error came after automatic recycling of the app pool.
We googled the issue and found some suggestions:
- Change the cached logons value from 25 to something low (it was unclear on which server, so we changed it on all of them).
- Create dummy groups (WebUsers, WebAdmins) on the application servers. This had already been done at release.
- Clean up the database tables tblWindowsUser, tblWindowsGroup, tblWindowsRelations, make sure residue from old development machines, unknown domains and so on is removed.
- Change the account running the app pool to network service or a domain account - tried both, nothing helped.
- Give "Authenticated Users" permissions on the share - no difference.
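Since it was unclear which server the cached logons value should be changed on, it can help to at least confirm what each server is currently set to. Below is a minimal sketch, assuming the setting lives in the standard Winlogon registry value CachedLogonsCount (stored as a string); the class name CachedLogons and its methods are made up for illustration.

```csharp
using System;
using Microsoft.Win32;

// Hypothetical helper, not part of EPiServer: prints the cached logons setting.
public static class CachedLogons
{
    // Registry key that holds the Winlogon settings on Windows.
    private const string WinlogonKey =
        @"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon";

    // Converts the raw registry value to a number; null when missing or not numeric.
    public static int? Parse(object raw)
    {
        int count;
        return int.TryParse(Convert.ToString(raw), out count) ? count : (int?)null;
    }

    // Reads CachedLogonsCount from the local machine (Windows only).
    public static int? Read()
    {
        return Parse(Registry.GetValue(WinlogonKey, "CachedLogonsCount", null));
    }

    public static void Main()
    {
        int? count = Read();
        Console.WriteLine(count.HasValue
            ? "CachedLogonsCount = " + count.Value
            : "CachedLogonsCount is not set");
    }
}
```

Running this on each server in turn shows whether the change actually took effect everywhere. In our case, changing the value did not help.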
By this time we were seriously considering moving all the files to a Linux server and writing a new VirtualPathProvider.
Then Paul at EPiServer Support sent me a code snippet, which turned out to be the key to the solution!
Support had saved a piece of code from a similar incident (though that one was in a multi-domain environment, with a DMZ and two domains) which tests the actual permissions on a share (attached below).
I had to build this code into the application and release it. To be able to run it, I also had to map all the VPPs to local paths (there are no files there, but it got us past the fatal error; luckily we had load balancing, so I could hide the misbehaving server).
This test code showed that some of the permissions were corrupted, or at least impossible to resolve from SIDs to accounts. But on the resource server everything looked fine and dandy.
I'm in no way a Windows permissions expert, quite the opposite. But I went to the resource server and removed permissions on the share until none of the unresolvable SIDs were left in the report from the code snippet. Then I added the permissions needed: R permissions for #1 and #2, given to the machines directly on the share, bypassing the local groups on the resource server, and of course RW permissions for #3. And everything just started to work!
So, in short, our problem was corrupted permissions on the resource server. Why and how they got corrupted, I don't know. The problem was not visible on the file server itself, only through the remote permission check in the attached code.
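For readers who only want the list of unresolvable SIDs, without the rest of the check, the idea can be sketched roughly as follows. This is not the snippet from support, just a minimal illustration using the standard .NET access control classes; the class name AclSidReport and the method FindUnresolvableSids are invented here, and \\MyServer\MyShare is the same placeholder path as in the attached page.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.AccessControl;
using System.Security.Principal;

// Hypothetical helper: lists SIDs on a folder's ACL that cannot be resolved to accounts.
public static class AclSidReport
{
    public static List<string> FindUnresolvableSids(string path)
    {
        var unresolved = new List<string>();
        DirectorySecurity acl = new DirectoryInfo(path).GetAccessControl();

        // Asking for SecurityIdentifier returns the identities as SIDs,
        // so reading the rules does not involve any name lookup.
        foreach (FileSystemAccessRule rule in
                 acl.GetAccessRules(true, true, typeof(SecurityIdentifier)))
        {
            var sid = (SecurityIdentifier)rule.IdentityReference;
            try
            {
                // The lookup under test: SID -> account name.
                sid.Translate(typeof(NTAccount));
            }
            catch (IdentityNotMappedException)
            {
                // No account could be found for this SID.
                unresolved.Add(sid.Value);
            }
            catch (SystemException ex)
            {
                // Any other lookup failure is reported with its message.
                unresolved.Add(sid.Value + " (" + ex.Message + ")");
            }
        }
        return unresolved;
    }

    public static void Main(string[] args)
    {
        // Placeholder path; pass the real share as the first argument.
        string path = args.Length > 0 ? args[0] : @"\\MyServer\MyShare";
        foreach (string sid in FindUnresolvableSids(path))
        {
            Console.WriteLine("Unresolvable: " + sid);
        }
    }
}
```

To reproduce what the site sees, it has to run on the web server under the same account as the app pool. The same ACL can look perfectly fine when inspected locally on the file server, which is exactly what happened to us.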
Actually, it seems the stack trace pointed at the cause from the beginning. It reads:
[SystemException: The trust relationship between the primary domain and the trusted domain failed.]
   System.Security.Principal.NTAccount.TranslateToSids(IdentityReferenceCollection sourceAccounts, Boolean& someFailed) +1149
   System.Security.Principal.NTAccount.Translate(IdentityReferenceCollection sourceAccounts, Type targetType, Boolean forceSuccess) +52
The error message is a bit confusing because it talks about domains even though all the servers in this scenario are in the same domain. The real clue is that the stack trace ends in TranslateToSids, that is, in translating account names to SIDs.
I think EPiServer is very sensitive to this error. As I could see in the report from the snippet, there were enough permissions to read the files from the share, but it seems EPiServer fails if a single permission in the collection fails to resolve. It would probably be a good idea to look into this problem in a future release.
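One possible way to make such a check less fragile, and I want to stress that this is only a sketch and not how EPiServer works internally, would be to compare SIDs directly instead of going through account names. WindowsPrincipal has an IsInRole overload that takes a SecurityIdentifier, so no name has to be translated at all. The class SidBasedRightsCheck and its methods are invented for this illustration, and the deny handling is simplified (it ignores rule order and inherit-only rules).

```csharp
using System;
using System.IO;
using System.Security.AccessControl;
using System.Security.Principal;

// Hypothetical sketch: works out the current account's rights on a folder using SIDs only.
public static class SidBasedRightsCheck
{
    // True when every bit of the Modify right is present.
    public static bool IncludesModify(FileSystemRights rights)
    {
        return (rights & FileSystemRights.Modify) == FileSystemRights.Modify;
    }

    public static FileSystemRights GetEffectiveRights(string path)
    {
        WindowsIdentity identity = WindowsIdentity.GetCurrent();
        var principal = new WindowsPrincipal(identity);
        FileSystemRights allowed = 0;
        FileSystemRights denied = 0;

        DirectorySecurity acl = new DirectoryInfo(path).GetAccessControl();
        foreach (FileSystemAccessRule rule in
                 acl.GetAccessRules(true, true, typeof(SecurityIdentifier)))
        {
            var sid = (SecurityIdentifier)rule.IdentityReference;

            // Membership is checked against the SID itself, so an entry that
            // cannot be resolved to a name simply does not match.
            bool applies = sid.Equals(identity.User) || principal.IsInRole(sid);
            if (!applies)
            {
                continue;
            }

            if (rule.AccessControlType == AccessControlType.Allow)
            {
                allowed |= rule.FileSystemRights;
            }
            else
            {
                denied |= rule.FileSystemRights;
            }
        }

        // Simplification: any matching deny removes the right.
        return allowed & ~denied;
    }

    public static void Main(string[] args)
    {
        // Placeholder path; pass the real share as the first argument.
        string path = args.Length > 0 ? args[0] : @"\\MyServer\MyShare";
        FileSystemRights rights = GetEffectiveRights(path);
        Console.WriteLine("Effective rights: " + rights);
        Console.WriteLine("Modify: " + IncludesModify(rights));
    }
}
```

With this approach a stale or orphaned entry on the ACL would just be skipped, rather than taking the whole check down with it.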
Resources mentioned:
http://blogs.interakting.co.uk/post/EPiServer-The-trust-relationship-between-the-primary-domain-and-the-trusted-domain-failed.aspx
http://www.techsupportforum.com/forums/f103/disconnecting-from-domain-169618.html
Code originally from Torbjörn Andersson at Logica:
TestTwo.aspx
<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="TestTwo.aspx.cs" Inherits="EPiServer.TestTwo" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" >
<head id="Head1" runat="server">
    <title>Test Windows Authentication</title>
</head>
<body>
    <form id="form1" runat="server">
        <div>
            <asp:Label runat="server" ID="lblDebugInfo" /><br/>
            Physical path to test: <asp:TextBox runat="server" ID="txtPhysPath">\\MyServer\MyShare</asp:TextBox><br/>
            <asp:Button Text="Test" runat="server" ID="cmdTestLogin" OnClick="TestLogin" />
        </div>
    </form>
</body>
</html>
TestTwo.aspx.cs
using System;
using EPiServer.Core;
using System.Security.Principal;
using System.Security.AccessControl;
using System.IO;
using EPiServer.Web.Hosting;
using System.Text;

namespace EPiServer
{
    public partial class TestTwo : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
        }

        // Click handler for the Test button: runs the check and shows the report.
        protected void TestLogin(object sender, EventArgs e)
        {
            lblDebugInfo.Text = GetAndValidatePhysicalPathBase(txtPhysPath.Text);
        }

        // Walks the ACL of the given path and reports, entry by entry,
        // whether the identity could be checked against the running account.
        internal static string GetAndValidatePhysicalPathBase(string physicalPath)
        {
            var sb = new StringBuilder();
            physicalPath = Environment.ExpandEnvironmentVariables(physicalPath);
            var info = new DirectoryInfo(physicalPath);
            var info2 = new DirectoryInfo(GenericHostingEnvironment.ApplicationPhysicalPath);
            if (info.FullName.StartsWith(info2.FullName, StringComparison.OrdinalIgnoreCase))
            {
                throw new EPiServerException(
                    "VirtualPathNativeProvider attribute 'physicalPath' must not refer to a path under the application path");
            }
            if (!info.Exists)
            {
                info.Create();
            }

            var principal = new WindowsPrincipal(WindowsIdentity.GetCurrent());
            sb.Append("Running account = ");
            sb.Append(principal.Identity.Name);
            sb.AppendLine("<br />");
            sb.AppendLine("<br />");

            FileSystemRights rights = 0;
            // Set when a matching Deny rule covers any part of Modify. The original
            // assigned hasFileSystemRights here, but that value was overwritten below.
            bool modifyDenied = false;

            // The identities are requested as NTAccount (account names rather than SIDs).
            foreach (FileSystemAccessRule rule in info.GetAccessControl().GetAccessRules(true, true, typeof (NTAccount)))
            {
                try
                {
                    // IsInRole(string) has to look the name up; a failing lookup throws,
                    // so every rule gets its own try/catch and its own line in the report.
                    if (principal.IsInRole(rule.IdentityReference.Value))
                    {
                        if ((rule.AccessControlType == AccessControlType.Deny) &&
                            ((rule.FileSystemRights & FileSystemRights.Modify) != 0))
                        {
                            modifyDenied = true;
                        }
                        if (rule.AccessControlType == AccessControlType.Allow)
                        {
                            rights |= rule.FileSystemRights;
                        }
                    }
                    sb.Append(rule.IdentityReference.Value);
                    sb.AppendLine("<br />");
                    sb.Append("OK");
                    sb.AppendLine("<br />");
                    sb.AppendLine("<br />");
                }
                catch (Exception e)
                {
                    // Report the failing identity together with the error message.
                    sb.Append(rule.IdentityReference.Value);
                    sb.AppendLine("<br />");
                    sb.Append(e.Message);
                    sb.AppendLine("<br />");
                    sb.AppendLine("<br />");
                }
            }

            bool hasFileSystemRights = !modifyDenied &&
                ((rights & FileSystemRights.Modify) == FileSystemRights.Modify);
            if (!hasFileSystemRights)
            {
                throw new EPiServerException(string.Format(
                    "VirtualPathNativeProvider must have modify rights to '{0}'", info.FullName));
            }
            return sb.ToString();
        }
    }
}
Thanks for sharing Gustaf!