We recently finished upgrading from EPiServer 5 to EPiServer 7.5.
In EPiServer 5 we used EPiServer's SearchDataSource component to search for pages and files, and we also had to install Adobe's iFilter so that PDF files could show up in the search results.
Now, in EPiServer 7.5, we have a problem searching PDF files.
As announced in the breaking changes (http://world.episerver.com/Documentation/Items/Upgrading/EPiServer-CMS/75/Breaking-changes/), the SearchDataSource component still searches for pages but no longer for files:
“EPiServer.Web.WebControls.SearchDataSource does not support searching for files anymore
The default support for searching for files in the SearchDataSource has been removed. The SearchService in the Alloy MVC template package can be used as a reference implementation to see how file search can be implemented.”
So we decided to use the SearchService, as suggested. But when I search, no PDF files appear in the results, and I'm not sure whether searching PDF files is supported by default in EPiServer at all.
Has anyone experienced a similar problem searching PDF files, or does anyone know anything that could help?
I tried the same thing in the Alloy project: I uploaded a PDF media file and searched for it by name, and it did show up in the results. But when I searched for text that exists in the PDF's content, the PDF file didn't show up.
I tried reindexing the content and installed and set up Adobe's iFilter, but there were still no results in the search. When I do the same with Word .doc or .txt files, everything works fine.
How do you search the content of PDF files? Has anyone managed to do this? Is it even possible with the default EPiServer search and the search service example from the Alloy MVC project?
Did you solve the issue with searching inside PDF documents?
I have the same problem: I can't search for text inside a PDF document. Searching for the filename works fine.
We managed to solve this problem after writing to EPiServer support.
They told us that if PDF content is not being indexed, it sounds like an iFilter installation problem, and suggested uninstalling our current iFilter and installing Acrobat Reader 9 to see if that helps.
That is what I did, and after some time we saw the proper search results.
So try uninstalling all the iFilters you have now and installing Adobe Reader 9.5.0 - Svenska (the Swedish edition); that is the only version that works for us.
Good luck :)
I would like to add some information regarding this. This is what you have to do in order to get it to work.
1. Make sure that no version of Adobe Reader or Adobe iFilter is installed.
2. Install Adobe Reader 9.5.0 Svenska ( ftp://ftp.adobe.com/pub/adobe/reader/win/9.x/9.5.0/sv_SE/ ).
3. Install Adobe IFilter 9 ( ftp://ftp.adobe.com/pub/adobe/acrobat/win/9.x/PDFiFilter64installer.zip ).
4. Add "C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin" to PATH in the environment variables.
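If you want to double-check the PATH step, here is a minimal sketch in Python (not part of EPiServer, just a standalone helper). It only tests whether a PATH string contains the iFilter bin folder quoted above. The function names `normalize` and `path_contains` are made up for this example, and it assumes the usual Windows conventions: entries separated by semicolons, optional quotes, case-insensitive comparison.

```python
import ntpath
import os

# Directory quoted in the steps above; adjust it if your install path differs.
IFILTER_BIN = r"C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin"


def normalize(entry: str) -> str:
    """Normalize one Windows PATH entry so that entries can be compared."""
    cleaned = entry.strip().strip('"')
    # normpath drops a trailing backslash, normcase lowercases the entry.
    return ntpath.normcase(ntpath.normpath(cleaned))


def path_contains(path_value: str, directory: str) -> bool:
    """Return True if the semicolon-separated PATH string lists the directory."""
    target = normalize(directory)
    entries = (e for e in path_value.split(";") if e.strip())
    return any(normalize(e) == target for e in entries)


if __name__ == "__main__":
    # Checks the PATH seen by the current process only.
    found = path_contains(os.environ.get("PATH", ""), IFILTER_BIN)
    print("iFilter bin folder on PATH:", found)
```

Note that this only looks at the PATH of the process running the script, so open a new command prompt after changing the environment variables before running it.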
So far I have only got this to work when the search service runs in the same application as the website and the files are stored locally rather than in a blob storage elsewhere.
I have an ongoing conversation with EPiServer support about this and will post any results here.
Have you got an answer from EPiServer support?
Our client is choosing between the out-of-the-box EPiServer.Search and EPiServer Find, and the ability to search in PDF files is an important functional requirement.
Sorry for not posting the answer here yet! We did not find a solution where the search service is hosted as a separate application and PDF search still works. Hosting the search within the application works fine, but you will not get any highlighting out of the box. We are currently investigating running our own ElasticSearch server.
Thanks for the reply.
PDF indexing started to work for me.
Now I am investigating whether faceted search is possible and haven't found anything except this blog post.
I am not sure we can say that it is supported by EPiServer.Search.