Searching for PDF files

We recently finished upgrading from EPiServer 5 to EPiServer 7.5.

In EPiServer 5 we used EPiServer’s SearchDataSource component to search for pages and files, and we also had to install Adobe’s IFilter so that PDF files would be returned as search results.

Now, in EPiServer 7.5, we have a problem searching for PDF files.

As announced in the breaking changes (http://world.episerver.com/Documentation/Items/Upgrading/EPiServer-CMS/75/Breaking-changes/), the SearchDataSource component still searches for pages but no longer for files:

“EPiServer.Web.WebControls.SearchDataSource does not support searching for files anymore

The default support for searching for files in the SearchDataSource has been removed. The SearchService in the Alloy MVC template package can be used as a reference implementation to see how file search can be implemented.”

 

So we decided to use the SearchService, as suggested. But when I search, no PDF files appear in the results. I’m also not sure whether searching PDF files is supported by default in EPiServer at all.

 

Has anyone experienced a similar problem searching PDF files, or does anyone know something that could help?

Thanks

#86979
Jun 05, 2014 14:28

I tried the same thing in the Alloy project: I uploaded a PDF media file and searched for it by name, and it did show up in the results. But when I searched for text from the PDF's content, the file didn't show up.

I tried reindexing the content and installed and set up Adobe’s IFilter, but there were still no results. When I do the same with Word documents or txt files, everything works fine.

How do you search the content of PDF files? Has anyone managed to do this? Is it even possible using EPiServer’s default search and the search service example from the Alloy MVC project?

#87342
Jun 11, 2014 0:16

Did you solve the issue with searching inside PDF documents?

I have the same problem: I can't search for text inside a PDF document. Searching by filename works fine.

#88775
Jul 30, 2014 12:29

We wrote to EPiServer support and, with their answer, managed to solve this problem.

They told us that if PDF content is not being indexed, it sounds like an IFilter installation problem, and suggested uninstalling our current IFilter and installing Acrobat Reader 9 to see if that helps.

That is what I did, and after some time we saw the proper search results.

So try uninstalling all the IFilters you currently have and installing Adobe Reader 9.5.0 - Svenska; that is the only version that works for us.

Good luck :)

#89698
Aug 22, 2014 11:57

Hi,

I would like to add some information about this. This is what you have to do to get it to work:

Make sure that no version of Adobe Reader or Adobe IFilter is installed.

Install Adobe Reader 9.5.0 Svenska ( ftp://ftp.adobe.com/pub/adobe/reader/win/9.x/9.5.0/sv_SE/ )

Install Adobe IFilter 9 ( ftp://ftp.adobe.com/pub/adobe/acrobat/win/9.x/PDFiFilter64installer.zip )

Add "C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin" to the PATH environment variable.

Reboot the computer.
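
A quick way to confirm the PATH step took effect after the reboot is to check whether the iFilter directory appears in the PATH that processes actually see. The following is only a minimal sketch in Python (not something from EPiServer or Adobe): the helper names `_normalize` and `path_contains` are made up for illustration, and the directory is the default install location quoted above, which may differ on your server.

```python
import ntpath
import os

# Directory quoted in the steps above; an assumption if the iFilter was installed elsewhere.
IFILTER_BIN = r"C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin"


def _normalize(entry: str) -> str:
    """Normalize a Windows PATH entry: strip spaces and quotes, drop a
    trailing backslash, and lowercase (Windows paths are case-insensitive)."""
    return ntpath.normcase(ntpath.normpath(entry.strip().strip('"')))


def path_contains(path_value: str, directory: str) -> bool:
    """Return True if `directory` is one of the ';'-separated entries of `path_value`."""
    wanted = _normalize(directory)
    entries = (e for e in path_value.split(";") if e.strip())
    return any(_normalize(e) == wanted for e in entries)


if __name__ == "__main__":
    # Reads the PATH of the current process, so run it on the server after rebooting.
    if path_contains(os.environ.get("PATH", ""), IFILTER_BIN):
        print("iFilter bin directory is on PATH")
    else:
        print("iFilter bin directory is NOT on PATH")
```

Note that this only checks the PATH entry. It does not verify that the IFilter itself is registered or that PDF content is being indexed.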

So far I have only got this to work when the search service runs in the same application as the website and the files are stored locally, not in blob storage elsewhere.

I have an ongoing conversation with EPiServer support about this and will post any results here.

#90354
Sep 08, 2014 10:53

Hi Niklas, 

Have you got an answer from EPiServer support?

Our client is choosing between the out-of-the-box EPiServer.Search and EPiServer Find, and the ability to search in PDF files is an important functional requirement...

#119164
Edited, Mar 23, 2015 13:52

Hi Andrey,

Sorry for not posting the answer here yet! We did not find a solution where you can host the search service as a separate application and still get PDF search to work. Hosting the search within the application works fine, but you will not get any highlighting out of the box. We are currently investigating using our own ElasticSearch server.

#119169
Mar 23, 2015 14:09

Thanks for the reply.

PDF indexing has started to work for me.

Now I am investigating whether faceted search is possible, and I haven't found anything except this blog post.

I am not sure we can say that it is supported by EPiServer.Search.

#119197
Mar 24, 2015 7:27