Common Pitfalls with Search in Optimizely Graph - and How to Avoid Them
Optimizely Graph offers powerful, flexible search capabilities out of the box, making it a popular choice for headless implementations. However, like any robust system, it comes with nuances that can trip you up if you’re not careful.
In this post, I’ll try to cover common pitfalls developers may encounter when working with search in Optimizely Graph, especially around content duplication, unintentional exposure, and content provider handling - and show you how to avoid them.
Duplicate Content in Search Results
One of the most frustrating issues is seeing the same content item indexed multiple times or search results returning unexpected duplicates.
Incorrect Authentication Mode
Optimizely Graph exposes not only the published versions of your content but also drafts and other unpublished versions. This is particularly useful for building custom on-page editing experiences in headless setups. However, it also introduces a risk: unpublished versions may appear in search results alongside the published one if access controls are not correctly configured.
Recommended Solution
To mitigate this, it's crucial to separate authentication modes between editing and search functionalities:
- On-page editing should use authenticated access to retrieve drafts and work-in-progress content.
- Public search should be limited strictly to published content and operate under a separate, read-only context.
Reusing tokens or authentication flows between these use cases should be strictly avoided.
This may also be a potential security concern if not handled properly - but that’s a topic worthy of a separate article.
For implementation guidance and best practices, refer to the official documentation:
Authentication – Optimizely Graph
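As a minimal sketch of that separation, the two use cases can be given distinct configuration objects so that a preview credential cannot be passed to the search code path by accident. The type and function names are hypothetical, and the endpoint and `Authorization` schemes shown are assumptions to verify against the authentication documentation for your setup.

```typescript
// Assumptions: the endpoint, the "epi-single" scheme for the public single key,
// and a Bearer preview token for editing should be checked against the docs.
// All type and function names here are hypothetical.
type GraphPurpose = "public-search" | "on-page-editing";

interface GraphClientConfig {
  purpose: GraphPurpose;
  endpoint: string;
  headers: Record<string, string>;
}

const GRAPH_ENDPOINT = "https://cg.optimizely.com/content/v2";

// Read-only context for public search: published content only.
function publicSearchConfig(singleKey: string): GraphClientConfig {
  return {
    purpose: "public-search",
    endpoint: GRAPH_ENDPOINT,
    headers: { Authorization: "epi-single " + singleKey },
  };
}

// Authenticated context for on-page editing: may return drafts.
function editingConfig(previewToken: string): GraphClientConfig {
  return {
    purpose: "on-page-editing",
    endpoint: GRAPH_ENDPOINT,
    headers: { Authorization: "Bearer " + previewToken },
  };
}

// Guard for the search code path: refuse any config that could see drafts.
function requirePublicSearch(config: GraphClientConfig): GraphClientConfig {
  if (config.purpose !== "public-search") {
    throw new Error("Search must not run with an editing credential");
  }
  return config;
}
```

Calling `requirePublicSearch` at the top of every search function turns an accidental token reuse into an immediate error instead of a silent leak of drafts.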
Shortcut Pages Misindexed
In Optimizely CMS, pages using Shortcuts (e.g., “Fetch content from another page”, see documentation) are by default indexed as standalone pages, even though they point to other content. This behavior can result in unexpected duplicate entries in your search results, which may negatively affect content relevance and SEO.
Recommended Solution
To prevent this, filter out shortcut pages, either by keeping them out of the index or by excluding them in the search query itself.
A more robust and maintainable solution is to:
- Introduce a custom property like ExcludeFromSearch (boolean).
- Automatically set this property to true for shortcut pages (e.g., in content events).
- Respect this property in your search queries by excluding any pages where ExcludeFromSearch == true.
This approach not only handles shortcut pages gracefully but also gives editors fine-grained control over which pages should be searchable.
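The query side of this could look roughly like the sketch below. It assumes a hypothetical content type `StandardPage` that defines the `ExcludeFromSearch` property, and it assumes the `_fulltext` field and the `notEq` operator behave as expected for your schema. In particular, check how pages with no value set for the property are treated.

```typescript
// Hypothetical helper: builds a GraphQL query for one content type.
// "StandardPage" and the selected fields are placeholders for your own schema.
function buildSearchQuery(contentType: string): string {
  return [
    "query Search($term: String!) {",
    "  " + contentType + "(",
    "    where: {",
    "      _fulltext: { match: $term }",
    // Exclude pages flagged by editors or by the shortcut content event.
    "      ExcludeFromSearch: { notEq: true }",
    "    }",
    "  ) {",
    "    items { Name RelativePath }",
    "  }",
    "}",
  ].join("\n");
}

const standardPageQuery = buildSearchQuery("StandardPage");
```

Keeping the flag check inside one query builder means every search feature picks it up, rather than each caller having to remember it.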
Exposing Technical Content That Shouldn't Be Indexed
It’s not uncommon for non-public content - such as settings pages, container pages, or system-level content - to inadvertently appear in Graph search results. These items were never intended to be publicly accessible and can clutter search experiences or expose internal information.
Common Causes
- Missing exclusion flags on special or restricted content.
- Lack of visibility review during the Graph schema setup.
- Overly broad queries that don’t filter by content purpose or type.
How to Prevent It
Use Graph filters to explicitly exclude:
- Content in designated folders (e.g. /settings, /utility)
- Content with a flag like ExcludeFromSearch = true
- Specific content types not meant for search (e.g. SiteSettingsPage, RedirectPage)
Best Practice
Implement an ExcludeFromSearch property on all content types. This offers low coupling between search logic and CMS structure, and allows editors or developers to easily manage visibility without tightly binding filtering rules to folder paths or content type checks.
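The three exclusion rules listed above can be combined into one reusable `where` fragment that is passed to the query as a variable. This is only a sketch: the function name is hypothetical, and it assumes that `_not` drops items matching any of the listed clauses and that `RelativePath` supports `startsWith` and `ContentType` supports `in` in your schema.

```typescript
type WhereClause = Record<string, unknown>;

interface ExclusionOptions {
  pathPrefixes: string[]; // e.g. "/settings", "/utility"
  contentTypes: string[]; // e.g. "SiteSettingsPage", "RedirectPage"
}

// Hypothetical builder: returns a fragment to merge into a query's `where`.
function buildExclusionWhere(options: ExclusionOptions): WhereClause {
  const excluded: WhereClause[] = [
    // The editor-controlled flag always applies.
    { ExcludeFromSearch: { eq: true } },
  ];
  for (const prefix of options.pathPrefixes) {
    excluded.push({ RelativePath: { startsWith: prefix } });
  }
  if (options.contentTypes.length > 0) {
    excluded.push({ ContentType: { in: options.contentTypes } });
  }
  // Assumption: items matching any clause inside `_not` are left out.
  return { _not: excluded };
}
```

If the flag is adopted everywhere, the path and type lists can shrink over time until only the `ExcludeFromSearch` clause remains.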
Multilingual Content in Optimizely Graph
In most cases, our content exists in multiple languages. However, a basic language filter in a Graph query may unintentionally exclude content that lacks a translation in the currently selected site language.
Example Scenario:
- A user is browsing the site in German ("de").
- The CMS has a fallback language set to English ("en").
- Some product pages have not yet been translated into German.
- A basic Graph filter like Language: { Name: { eq: "de" } } will exclude these untranslated pages, even if fields like the product name are language-agnostic and would otherwise match the search criteria.
Why This Matters:
Depending on your business needs, showing fallback content might be acceptable or preferred. However, even if you choose not to display it, it’s important to make that decision consciously, not by default behavior.
Recommended Approach:
To support fallback languages in search results, leverage the following Graph fields:
- MasterLanguage: Indicates the original language of the content.
- ExistingLanguages: Lists all languages the content has been translated into.
By using these properties, you can intelligently include fallback content in queries, ensuring a better user experience while respecting your localization strategy.
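One possible way to express the fallback for the German/English scenario is a filter that accepts the preferred-language version, or the fallback-language version only when no translation into the preferred language exists. The sketch below builds that filter as a plain object. The function name is hypothetical, and it assumes `ExistingLanguages` can be filtered by `Name` the same way `Language` can, which is worth confirming against your schema.

```typescript
type LanguageWhere = Record<string, unknown>;

// Hypothetical helper: preferred language first, fallback only for
// content that has not been translated into the preferred language.
function languageWithFallback(preferred: string, fallback: string): LanguageWhere {
  if (preferred === fallback) {
    return { Language: { Name: { eq: preferred } } };
  }
  return {
    _or: [
      { Language: { Name: { eq: preferred } } },
      {
        _and: [
          { Language: { Name: { eq: fallback } } },
          // Assumption: this excludes content that already has a translation.
          { _not: [{ ExistingLanguages: { Name: { eq: preferred } } }] },
        ],
      },
    ],
  };
}

const germanWithEnglishFallback = languageWithFallback("de", "en");
```

The same shape works with `MasterLanguage` in place of a fixed fallback if the rule should be "show the original when no translation exists". If the query also restricts results through a locale argument, that argument would presumably need to allow both languages.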
Poor Relevance and Ranking in Search Results
Even with a clean index and a well-designed Graph schema, search results can appear noisy, irrelevant, or misleading if queries aren’t thoughtfully constructed. Common issues like missing field boosting, unfiltered related content, and generic query logic can cause even high-quality content to be buried beneath less relevant matches.
Missing field boosting
By default, all fields are treated equally in Optimizely Graph unless explicitly boosted. This often leads to situations where low-priority fields (e.g., descriptions) match as strongly as high-priority ones (e.g., titles or product names).
Recommendation:
Use field-level boosting to give higher weight to meaningful fields like title, name, or tags. Learn more: Boosting in Optimizely Graph
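To illustrate, a small helper can turn a map of field weights into a set of boosted match clauses. The helper name and the field names `Name` and `MainBody` are placeholders, and the sketch assumes the `boost` option is accepted next to `match` as described in the boosting documentation.

```typescript
interface BoostedClause {
  [field: string]: { match: string; boost: number };
}

// Hypothetical helper: one `match` clause per field, each with its own weight.
function boostedMatch(term: string, weights: Record<string, number>): { _or: BoostedClause[] } {
  const clauses = Object.keys(weights).map((field) => {
    const clause: BoostedClause = {};
    clause[field] = { match: term, boost: weights[field] };
    return clause;
  });
  return { _or: clauses };
}

// A hit in the title-like field counts ten times as much as a body hit.
const mouseSearch = boostedMatch("wireless mouse", { Name: 10, MainBody: 1 });
```

Keeping the weights in one object makes them easy to tune after reviewing real search logs, without touching the query structure.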
Unintended indexing of related content
Product pages or articles often embed related content (e.g., "Related Products" blocks or references). If this content is indexed directly with the main page, it can bleed into the searchable content of the parent - leading to false positives, confusing matches, and degraded search relevance.
Example:
A product page for a wireless mouse might appear in search results for a mechanical keyboard if a related keyboard product is embedded and indexed as part of the same content.
Recommendation:
Ensure related items are either:
- Excluded from the index, or
- Indexed separately and referenced cleanly, rather than merged into the searchable body of the main item.
Access-Controlled Content Leaking into Public Search
In many CMS implementations, certain content such as gated resources, subscriber-only pages, or internal documentation should only be accessible to specific user groups. A critical security risk occurs when this restricted content unintentionally appears in public search results, exposing sensitive or paid information to unauthorized users.
Common Cause:
- Incorrect authentication mode used when querying Optimizely Graph, resulting in unrestricted access to protected content.
How to Prevent It:
To ensure proper access control:
- When authenticating to Optimizely Graph, always include the current user context and their roles/permissions.
- Avoid using system-wide tokens or anonymous access for features that should reflect user-level content restrictions.
Optimizely Graph enforces access control only when explicitly requested via authenticated queries. If omitted, all content, including restricted items, may be returned.
How to retrieve restricted content securely (Optimizely Docs)
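A rough sketch of passing the user context along with a server-side request is shown below. It assumes the `cg-username` and `cg-roles` headers, with roles sent as a comma-separated list, and an `Authorization` value prepared elsewhere from your app key and secret. Verify the header names and format against the linked documentation before relying on them. The function and interface names are hypothetical.

```typescript
interface UserContext {
  username: string;
  roles: string[];
}

// Hypothetical helper for server-side use only: the authorization value is
// derived from a secret and must never be shipped to the browser.
function restrictedSearchHeaders(
  authorization: string,
  user: UserContext
): Record<string, string> {
  return {
    Authorization: authorization,
    // Assumed header names; results should then be limited to content
    // this user and these roles are allowed to read.
    "cg-username": user.username,
    "cg-roles": user.roles.join(","),
  };
}

const subscriberHeaders = restrictedSearchHeaders("Basic <credentials>", {
  username: "jane",
  roles: ["Subscribers", "Everyone"],
});
```

Building the headers from the signed-in user on every request, rather than caching one set of headers, keeps role changes from being ignored.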
Final Thoughts
Optimizely Graph is a powerful tool, but like any flexible system, it needs intentional setup and governance to avoid messy search results and content exposure issues.
By being aware of the common pitfalls and applying proactive filters, schema hygiene, and provider discipline, you’ll set yourself (and your editors!) up for a clean, accurate, and efficient search experience.
Have you run into a tricky Graph issue in production? Let me know - always keen to learn how others are solving these challenges.