Arthur Vander Voort
Feb 18, 2023
  2076
(0 votes)

Optimizely PIM - Data Cleansing in PIM: Save Time by Importing Messy Data

In my previous post, I talked pretty extensively about why setting up data governance is both an obstacle and essential to getting data into the PIM. Another common obstacle is trying to perfect your data prior to importing it to the PIM. You can spend a lot of time in an Excel spreadsheet, trying to find and replace a bunch of bad values across dozens or hundreds of columns. While this is a valid approach, Optimizley PIM also has functionality that you can leverage to cleanse your data directly in the application. This article outlines an approach to speed up your implementations without sacrificing data quality.

MessyDataSample.png

Before we get into it, here are a few caveats:

  • This approach is specifically for what we commonly call 'attribute' data – product data that has a defined list of values. Think of size, color, guage, width, length, material, finish, wattage, and so on.
  • Depending on the number of properties and property values, this can still be a time consuming process. 
  • This approach will not help you clean up data for unique product properties like product description or product title.
  • This approach is only recommended during implementation. You must re-establish strong data governance after cleansing your data to get value from this exercise.

Set up your properties to allow bad data

Properties that ony allow a defined list of values typically make up the bulk of product data. Normally, when data is imported, if a product has a value that is not on the list for this property, the data will not be imported, and the bad data will be isolated. The key to this approach is to intentionally set your properties in a way that allows bad data to be ingested. You must set all of your properties that use a defined list of values to allow ad hoc values. With ad hoc values enabled, no data governance is enforced, and any value will be imported and added to the property's value list.

If you already have all of your properties created, you can easily toggle them to allow ad hoc values:

  1. Filter the property list for properties that use a dropdown control type.
  2. Export properties and choose the option to only included the filtered list.
  3. In the exported file, set 'Allow Adhoc Values' to 'yes' for every row.
  4. Import the properties to the PIM.
SetAdhocs.gif

Import your messy product data

With the properties configured, you can now import your product data. Data governance will not prevent your messy data from being imported for these properties, which is exactly what we want.

  1. Go to imports.
  2. Select your product data file.
  3. Map your data and proceed through the import wizard.
  4. Initiate the product import.

Review & cleanse your data

With the import complete, all of the data, good and bad, has been imported and the property value lists have been updated. This is where the work starts. We are going to review the values for each property and correct any bad ones. To do this, we will filter for the properties we configured to allow ad hoc values and then work our way through the list.

  1. On the property list, apply a filter for properties that use a dropdown control type.
  2. Edit the first property.
  3. Go to the values tab.
  4. Review the list of values for bad data.
  5. For any bad value, click the edit icon and enter the preferred value (for instance, if you want '33 inches to' be '33 in', update this).
  6. When you have cleaned up all the bad values, toggle 'allow ad hoc values' to false and save the property.
  7. Repeat until you are through all properties.
PropertyValueCleanup.gif

When the data was initially imported, any bad values were added to the property list and saved to products that had those values. When you edit the property value and save the property, we will propagate the changes to all products that had the value. This lets you remove duplicate (e.g. 24" & 24 inches) and erroneous values while automatically updating the product data.

Re-establish your data governance to maintain quality

Now that all of our property value lists have been reviewed and the product data is cleansed,  we need to ensure that we don't waste this effort by allowing bad data back into the PIM. In the previous step, we recommended toggling the ad hoc value option off after you finished each property, but make sure you did not miss any. To do this, we can use the export/import approach again similar to when we initially enabled the ad hoc values.

  1. Filter the property list for properties that use a dropdown control type.
  2. Export properties and choose the option to only included the filtered list.
  3. In the exported file, set 'enable ad hoc' to 'no' for every row.
  4. Import the properties to PIM.
Feb 18, 2023

Comments

Please login to comment.
Latest blogs
From Prompting to Production: Optimizely Opal University Cohort and the Future of Agentic MarTech

Most organizations today are still playing with AI. They experiment with prompts, test ideas in isolated chats, and occasionally automate a task or...

Augusto Davalos | Apr 28, 2026

Six Compelling Reasons for Upgrading to CMS 13

Most software updates ask you to keep up. Optimizely CMS 13 asks something different — it asks whether your digital strategy is built for a world...

Muhammad Talha | Apr 28, 2026

Optimizely CMS 13 breaking changes: GetContentTypePropertyDisplayName

When upgrading from CMS 12 to 13, resolving property display names may not work as before. Here’s what changed.

Tomas Hensrud Gulla | Apr 27, 2026 |

Accelerate Optimizely DAM Adoption: Unlocking Business Value with Metadata Bulk Import

Accelerating Optimizely DAM Adoption How a Metadata-Driven Bulk Import Utility Unlocks Real Business Value Executive Summary For enterprises runnin...

Vaibhav | Apr 27, 2026