KennyG
Jun 10, 2015

Automatic PDF thumbnails when uploading to EPiServer

Like any other developer or content editor, I'm lazy. Not generally lazy, just when it comes to repetitive tasks. Creating and adding thumbnails for PDF files is one of those tasks. I could manually create a thumbnail, upload it, and associate it with the PDF, but there has got to be a better way. After all, the system already creates thumbnails from full-size images.

Override Thumbnail property

So I did some reading and found that thumbnails are stored as binary data associated with IContentMedia using the Blob property type. I took Johan Bjornfot's article and adapted it to store a thumbnail for the PDF.

    [MediaDescriptor(ExtensionString = "pdf")]
    public class PdfFile : MediaData
    {

        [CultureSpecific]
        [Editable(true)]
        [Display(Name = "Description", Description = "PDF description", GroupName = SystemTabNames.Content, Order = 1)]
        public virtual String Description { get; set; }

        public override Blob Thumbnail
        {
            get { return base.Thumbnail; }
            set { base.Thumbnail = value; }
        }

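        // Set to true once a thumbnail has been generated, so that the publish
        // triggered by saving the thumbnail does not start another render.
        [Editable(false)]
        public virtual bool ProcessedThumb { get; set; }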

    }

Now that I had a place to store it I needed to create it.

Ghostscript.NET to the rescue

I tried several Ghostscript wrapper packages without much luck. Finally I stumbled on Ghostscript.NET, which luckily is available via NuGet. You will also need the native Ghostscript library installed on your server. I adapted the rasterizer sample to fit into the IInitializableModule example from Johan's article.

using System;
using System.Linq;
using EPiServer.Framework;
using EPiServer.Framework.Initialization;
using EPiServer.ServiceLocation;
using EPiServer.Core;
using Models.Media;
using EPiServer;
using EPiServer.Framework.Blobs;
using EPiServer.Web.Routing;
using Ghostscript.NET;
using Ghostscript.NET.Rasterizer;
using EPiServer.DataAccess;
using EPiServer.Security;
using System.IO;


    [InitializableModule]
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class PDFThumbCreatorModule : IInitializableModule
    {
        public void Initialize(InitializationEngine context)
        {
            // IContentEvents exposes the publishing events for all content, including uploaded media.
            var contentEvents = context.Locate.Advanced.GetInstance<IContentEvents>();

            contentEvents.PublishingContent += (sender, args) =>
            {
                var page = args.Content as PdfFile;
                if (page != null && !page.ProcessedThumb)
                {
                    args.Items["ProcessThumb"] = true;
                }
            };
            contentEvents.PublishedContent += (sender, args) =>
            {
                var page = args.Content as PdfFile;
                if (page != null && args.Items["ProcessThumb"] != null)
                {
                    context.Locate.Advanced.GetInstance<PDFThumbCreator>().CreateThumb(page);
                }
            };
        }

        public void Preload(string[] parameters) { }

        public void Uninitialize(InitializationEngine context)
        {
            //Add uninitialization logic
        }
    }
    public class PDFThumbCreator
    {
        private IContentRepository _contentRepository;
        private BlobFactory _blobFactory;
        private UrlResolver _urlResolver;

        private GhostscriptVersionInfo _lastInstalledVersion = null;
        private GhostscriptRasterizer _rasterizer = null;

        public PDFThumbCreator(IContentRepository contentRepository, BlobFactory blobFactory, UrlResolver urlResolver)
        {
            _contentRepository = contentRepository;
            _blobFactory = blobFactory;
            _urlResolver = urlResolver;
        }

        public virtual void CreateThumb(PdfFile pdf)
        {

            pdf = pdf.CreateWritableClone() as PdfFile;
            pdf.Thumbnail = _blobFactory.CreateBlob(Blob.GetContainerIdentifier(pdf.ContentGuid), ".png");

            var pdfUrl = UrlResolver.Current.GetUrl(pdf.ContentLink);
            var absolutePdfUrl = UriSupport.CreateAbsoluteUri(pdfUrl);

            System.Drawing.Image img = null;

            using (var stream = pdf.BinaryData.OpenRead())
            {
                int desired_x_dpi = 96;
                int desired_y_dpi = 96;

                _lastInstalledVersion =
                    GhostscriptVersionInfo.GetLastInstalledVersion(
                            GhostscriptLicense.GPL | GhostscriptLicense.AFPL,
                            GhostscriptLicense.GPL);

                _rasterizer = new GhostscriptRasterizer();


                _rasterizer.Open(stream, _lastInstalledVersion, false);

                img = _rasterizer.GetPage(desired_x_dpi, desired_y_dpi, 1);

                _rasterizer.Close();

            }

            using (var writeStream = pdf.Thumbnail.OpenWrite())
            {
                var imgbytes = ImageToByte2(img);
                writeStream.Write(imgbytes, 0, imgbytes.Length);
            }
            pdf.ProcessedThumb = true;
            _contentRepository.Save(pdf, SaveAction.Publish | SaveAction.ForceCurrentVersion | SaveAction.SkipValidation, AccessLevel.NoAccess);
        }

        public static byte[] ImageToByte2(System.Drawing.Image img)
        {
            byte[] byteArray = new byte[0];
            using (MemoryStream stream = new MemoryStream())
            {
                img.Save(stream, System.Drawing.Imaging.ImageFormat.Png);
                stream.Close();

                byteArray = stream.ToArray();
            }
            return byteArray;
        }
    }

A few things I learned the hard way. The first was to rasterize only the first page. The second was that I needed to close the rasterizer after it created the image, which the sample code didn't seem to do; otherwise I couldn't create any more thumbnails until I killed the process. The third was that merely uploading a PDF counts as publishing it. The thumbnail blob then gets created, the reference is saved back to the PDF object, and the PDF publishes again. I got caught in an endless loop where it generated another thumbnail every time it updated the blob reference. The ProcessedThumb flag is what breaks that loop.
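One way to make the "always close the rasterizer" lesson harder to get wrong is to put the Close call in a finally block, so it runs even when rendering throws. This is only a minimal sketch, assuming the same Ghostscript.NET calls used in the module above (Open, GetPage with separate x and y DPI, Close). PdfFirstPageRenderer and RenderFirstPage are hypothetical names, not part of my original code, and other Ghostscript.NET releases may expose a different GetPage signature, so check it against the version you install.

```csharp
using System.Drawing;
using System.IO;
using Ghostscript.NET;
using Ghostscript.NET.Rasterizer;

// Hypothetical helper, not part of the original module.
public static class PdfFirstPageRenderer
{
    // Renders only page 1 of the PDF and always closes the rasterizer,
    // even if Open or GetPage throws.
    public static Image RenderFirstPage(Stream pdfStream, GhostscriptVersionInfo version, int dpi)
    {
        var rasterizer = new GhostscriptRasterizer();
        try
        {
            // Same overload as in CreateThumb: stream, version info, dllFromMemory = false.
            rasterizer.Open(pdfStream, version, false);

            // Page numbers are 1-based; request the first page only.
            return rasterizer.GetPage(dpi, dpi, 1);
        }
        finally
        {
            // Without this, later thumbnails could not be created until the process was restarted.
            rasterizer.Close();
        }
    }
}
```

CreateThumb could then call RenderFirstPage(stream, _lastInstalledVersion, 96) inside its existing using block instead of managing the rasterizer field itself.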

Now I've got a thumbnail saved, how do I get to it?

You can route directly to a blob property on a content instance by appending the blob property name to the content URL. In this case that is /Thumbnail. I used that in my view like so:

var thumbnailUrl = String.Format("{0}/Thumbnail", UrlResolver.Current.GetUrl(item.ContentLink));
    
<img src="@thumbnailUrl" alt="@item.Name" title="@item.Name" class="image-file" />    

I hope you found this post helpful and maybe learned something from my mistakes.


Comments

valdis, Jun 10, 2015 07:39 PM

This approach looks interesting. I'm most probably even lazier than you :) I would look for a third-party solution that does this, for instance PdfRenderer for ImageResizer. This guy should generate thumbnails on the fly ;)
