
Sameer
May 26, 2020

Import assets in bulk and create respective pages in Episerver CMS

Hello guys,

Recently I came across a requirement to import PDF documents from an external drive and create a corresponding publication page for each one in Episerver. I split the requirement into the following tasks:

  1. Import all PDFs into Episerver, keeping the same folder structure as on the external drive
  2. Create an Episerver page for each PDF, in the same structure
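
The traversal behind these two tasks does not depend on the CMS. The following is a minimal, language-neutral sketch in Python, not the author's Episerver code: it walks a root folder, keeps only the PDFs, and records for each one the folder path that has to be mirrored. `ImportItem` and `plan_import` are hypothetical names, and using the file name without its extension as the page name is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ImportItem:
    folders: tuple   # folder names from the drive root down to the PDF's folder
    asset_name: str  # file name to give the imported media asset
    page_name: str   # name to give the matching page (assumption: name minus extension)


def plan_import(root: Path) -> list:
    """List every PDF under `root` with the folder path it must be mirrored into."""
    items = []
    for path in sorted(root.rglob("*")):
        # Match the extension case-insensitively so "REPORT.PDF" is not skipped.
        if path.is_file() and path.suffix.lower() == ".pdf":
            relative = path.relative_to(root)
            # A PDF directly under the root gets an empty folder tuple.
            items.append(ImportItem(relative.parent.parts, path.name, path.stem))
    return items
```

Each `ImportItem` carries enough information to create both the asset and its page, so the two tasks can share a single pass over the drive.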

To start with, I created two properties on the settings page: (1) a reference to the asset folder under which all PDFs are imported, and (2) a container page under which all publication pages are created.

After creating the properties, I built an admin tool to import the PDFs. The import logic itself lives in an import service.
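
One way to structure such an import service is sketched below in Python, with an in-memory tree standing in for the CMS. This is not the author's code: `Node`, `get_or_create`, and `import_pdf` are hypothetical, and in Episerver the lookups and saves would go through the content repository instead. The two roots correspond to the asset folder and the container page from the settings. Linking each page to its PDF through a `linked_asset` field is an assumption, since the post does not say how the two are connected.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a CMS folder, asset, or page."""
    name: str
    kind: str
    children: dict = field(default_factory=dict)
    linked_asset: Optional["Node"] = None


def get_or_create(parent: Node, name: str, kind: str) -> Node:
    """Return the child of this kind and name, creating it only if missing."""
    key = (kind, name)
    if key not in parent.children:
        parent.children[key] = Node(name, kind)
    return parent.children[key]


def import_pdf(asset_root: Node, page_root: Node, folders, file_name: str) -> Node:
    """Mirror one PDF into both trees and return its publication page."""
    asset_parent, page_parent = asset_root, page_root
    for folder in folders:
        # Reuse levels created by earlier PDFs so the structure is built once.
        asset_parent = get_or_create(asset_parent, folder, "folder")
        page_parent = get_or_create(page_parent, folder, "container")
    asset = get_or_create(asset_parent, file_name, "pdf")
    # Assumption: the page is named after the file, minus its extension.
    page = get_or_create(page_parent, file_name.rsplit(".", 1)[0], "publication")
    # Assumption: the page keeps a reference to the asset it was created for.
    page.linked_asset = asset
    return page
```

Because every level goes through `get_or_create`, running the import a second time does not duplicate folders or pages.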

I still have to extend this feature to copy the content of each PDF into its publication page. I will use the iText7 library to read the PDFs.

I used the article below as inspiration:

https://blog.nicolaayan.com/2017/03/episerver-how-to-upload-media-assets-in-bulk/

 

This functionality is very useful when migrating data. We can also extend this feature to other document types, such as Doc files.

Thank you!


Comments

Kane Made It May 28, 2020 02:24 AM

Hi Sameer, thank you for your contribution. It would be nice if you could share the code with us, to reduce typos when applying your findings. The editor supports inserting code with formatting for C#, PHP, JavaScript, etc.

Chandrakant Hadpidkar May 28, 2020 04:35 PM

Hi Sam,

It is a great article 👍. I have one query about the PDF reading functionality: reading text from a file is fairly easy to achieve, but what if the PDF contains a scanned image? How do you handle that?

Have you come across any such situations during the above implementation?

I mean, how do you determine whether a PDF you are reading has readable text or only a scanned image?

Thanks,

Chan

Sameer Jun 1, 2020 01:12 PM

Hi Chan!

Thank you Chan!

Yes, the above PDF component also works for extracting content from scanned PDFs. It extracts all content, even from scanned copies!

//Sameer
