Deane Barker
Sep 1, 2009


Over the last few years, I’ve done four implementations of the Google Mini search appliance.  This is a piece of hardware (a 1U rack mount) that acts as a search crawler and engine.

It crawls your Web site (or whatever else you point it at) 24 hours a day.  You can throw queries at it via a REST interface and get results back as XML.  You can also transform the XML on the device itself and use it to present results directly to the end user, but this is awkward and requires you to duplicate your interface on another machine, which is never fun.
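To make the query-and-parse round trip concrete, here is a minimal sketch in Python. The host name is a placeholder, and the parameter names (`q`, `site`, `client`, `output`) and result element names (`R`, `U`, `T`) are written from memory of the appliance's search protocol, so treat them as assumptions and check them against the search protocol reference. The sample XML is hand-written for illustration, not captured from a device.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_search_url(host, query, collection="default_collection",
                     frontend="default_frontend"):
    """Build a search URL for the appliance (host and defaults are placeholders)."""
    params = {
        "q": query,
        "site": collection,      # collection to search
        "client": frontend,      # front end whose settings apply
        "output": "xml_no_dtd",  # ask for raw XML rather than rendered HTML
    }
    return "http://{0}/search?{1}".format(host, urlencode(params))


def parse_results(xml_text):
    """Return (url, title) pairs from a results document."""
    root = ET.fromstring(xml_text)
    return [(r.findtext("U"), r.findtext("T")) for r in root.iter("R")]


# Hand-written sample for illustration only, not captured from a real device.
SAMPLE = """<GSP><RES>
  <R N="1"><U>http://example.com/news/1</U><T>Deane Saves the World</T></R>
</RES></GSP>"""

print(build_search_url("mini.example.com", "save the world"))
print(parse_results(SAMPLE))
```

Keeping the URL building and the parsing in your own application is what lets you render results in your site's normal templates instead of duplicating the interface on the device.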

The device is quite good for text-heavy search, and retails for $2,995, making it a cheap solution for a lot of situations.

The Mini can do fairly granular searching of META (search protocol reference). Over the years, we’ve figured out that you should stack as many META tags as possible in your pages, because you never know what you’re going to want to search on.  If, for instance, your client wants to isolate a search to just news articles, then it’s helpful to have a META tag in there with the type of content (alternately, you could create a distinct collection in the device, but maintaining these can be tedious).

For another CMS, we developed a control that dumped all sorts of META to the HEAD tag of the page.  We refined this over the years to only run for the Mini, since it got to the point where it was computationally expensive to find and return all this information, and we only needed it for the Mini (we didn’t need it for public search engines, for instance).

For our first EPiServer/Mini integration, we adapted the control a bit, but the functionality is roughly the same – it dumps all sorts of information to META tags, including any properties you might specify.

Register it like this:

<%@ Register TagPrefix="Blend" Namespace="Blend.EPiServer.Controls" Assembly="[insert your assembly name here]" %>

Then put the control in the HEAD tag like this:

<Blend:EPiServerSearchMeta TagNameFormat="MySite.EPiServer.{0}" UserAgentString="gsa" QuerystringCode="OpenSesame" Properties="Title,Summary" runat="server" />

It will only run when the currently executing page is of type TemplatePage (so, only for EPiServer templates that have a content object attached).

The control outputs the following information:

  • The page ID
  • The page type ID
  • The page type name
  • The page name
  • The parent page ID
  • The parent page type ID
  • The parent page type name
  • Every page ID from the current page’s parent back to the start page (in multiple META tags)
  • The depth of the page (the start page is 0, top level pages are 1, etc.)

It looks like this:

<meta name="MySite.EPiServer.PageID" content="9" />
<meta name="MySite.EPiServer.PageTypeID" content="7" />
<meta name="MySite.EPiServer.PageTypeName" content="NewsArticle" />
<meta name="MySite.EPiServer.PageName" content="Deane Saves the World" />
<meta name="MySite.EPiServer.ParentPageID" content="8" />
<meta name="MySite.EPiServer.ParentTypeID" content="5" />
<meta name="MySite.EPiServer.ParentTypeName" content="NewsArchive" />
<meta name="MySite.EPiServer.AncestorID" content="8" />
<meta name="MySite.EPiServer.AncestorID" content="7" />
<meta name="MySite.EPiServer.AncestorID" content="3" />
<meta name="MySite.EPiServer.PageDepth" content="3" />
<meta name="MySite.EPiServer.Category" content="7" />
<meta name="MySite.EPiServer.Category" content="9" />
<meta name="MySite.EPiServer.Category" content="13" />
<meta name="MySite.EPiServer.Category" content="15" />
<meta name="MySite.EPiServer.Category" content="16" />
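The ancestor list and page depth in that output both come from one walk up the page tree. As a rough, language-neutral sketch (in Python rather than the control's C#, and with a hypothetical `parent_of` lookup standing in for whatever the CMS API provides), it might look like this:

```python
def ancestors_and_depth(page_id, parent_of):
    """Walk parent links from page_id up to the start page.

    parent_of maps a page ID to its parent ID; the start page maps to None.
    Returns (ancestor IDs ordered nearest-first, depth).
    """
    ancestors = []
    current = parent_of[page_id]
    while current is not None:
        ancestors.append(current)
        current = parent_of[current]
    # The start page has no ancestors (depth 0); each level below adds one.
    return ancestors, len(ancestors)


# Hypothetical tree matching the sample output: 3 (start) > 7 > 8 > 9.
PARENT_OF = {3: None, 7: 3, 8: 7, 9: 8}

ancestors, depth = ancestors_and_depth(9, PARENT_OF)
print(ancestors, depth)
```

Each ancestor ID would then go out in its own AncestorID tag, which is what makes it possible to restrict a search to everything beneath a given page.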

There are a few control attributes…

TagNameFormat is the format of the “name” attribute of the resulting META tag.  So, in the above example, the Page Type ID of the content will output as:

<meta name="MySite.EPiServer.PageTypeID" content="7" />

Properties is a comma-delimited list of properties you want to dump to META.  Be careful here, obviously – the entire text of the content object is unnecessary and potentially problematic.  The control will simply call ToWebString() on all of them, so make sure this outputs what you want.  Also, if the property is a Category selection, the control will split the IDs up under separate tags.
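The way TagNameFormat and multi-valued properties combine can be sketched as follows. This is an illustration in Python, not the control's actual code: the `meta_tags` helper and its `multi_valued` flag are hypothetical, and it assumes a Category selection serializes to a comma-delimited string of IDs.

```python
from html import escape


def meta_tags(tag_name_format, name, value, multi_valued=False):
    """Render one META tag per value for a single property."""
    # Assumption: a multi-valued property (such as a Category selection)
    # serializes to a comma-delimited string of IDs.
    values = [v.strip() for v in value.split(",")] if multi_valued else [value]
    tag_name = tag_name_format.format(name)  # "{0}" is replaced by the name
    return [
        '<meta name="{0}" content="{1}" />'.format(
            escape(tag_name, quote=True), escape(v, quote=True))
        for v in values if v
    ]


for tag in meta_tags("MySite.EPiServer.{0}", "PageTypeID", "7"):
    print(tag)
for tag in meta_tags("MySite.EPiServer.{0}", "Category", "7,9,13",
                     multi_valued=True):
    print(tag)
```

Escaping the content matters because property values are free text; a stray quote in a title would otherwise break the tag.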

UserAgentString is used to identify the crawler. Enter a value in here that will be unique to the user agent string of your crawler – “gsa” works well for the Mini.  If the control finds this string it will execute, otherwise it will exit without doing anything.

QuerystringCode is a secret code you can use to debug the control.  If this value is found in a querystring argument called “show_meta,” the control will always execute (regardless of the user agent string). This is useful for debugging, so you can see the META it outputs.
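Taken together, UserAgentString and QuerystringCode amount to a small gate in front of the expensive work. A minimal sketch of that decision, in Python for illustration only: the function name is hypothetical, and the case-insensitive substring match is an assumption, since the actual comparison rules live in the control's source.

```python
def should_emit_meta(user_agent, query, user_agent_string, querystring_code):
    """Decide whether to emit the META tags for this request.

    query is a dict of querystring arguments for the current request.
    """
    # Debug override: ?show_meta=<QuerystringCode> forces output.
    if querystring_code and query.get("show_meta") == querystring_code:
        return True
    # Otherwise run only when the crawler's marker appears in the user agent.
    # Assumption: the match is a case-insensitive substring test.
    if not user_agent_string:
        return False
    return user_agent_string.lower() in (user_agent or "").lower()


print(should_emit_meta("gsa-crawler", {}, "gsa", "OpenSesame"))
print(should_emit_meta("Mozilla/5.0", {"show_meta": "OpenSesame"}, "gsa", "OpenSesame"))
print(should_emit_meta("Mozilla/5.0", {}, "gsa", "OpenSesame"))
```

Checking the debug code first means you can view the output from an ordinary browser without spoofing the crawler's user agent.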

Get the Code (.zip file, containing a single .cs file)



Sep 21, 2010 10:32 AM

Awesome stuff Deane. Nice to see you blogging
/ Jacob Khan Sep 21, 2010 10:32 AM

Nice article. Have you ever had the Google Mini indexing a document surfaced on the web via EPiServer SharePoint Connect?

Sep 21, 2010 10:32 AM

Joel: I have not, sorry.
/ Deane Sep 21, 2010 10:32 AM

no worries, thanks anyway.
