According to "Microsoft Indexing Service" (23-09-2008), EPiServer indexing takes care of indexing the versioning UFS. I can't get it to index PDF files (it probably fails to find and bind to the PDF IFilter). Microsoft Indexing Service itself is working: when I place the same PDF file in a native UFS, a Microsoft Indexing Service search with a SearchDataSource works.
Is the EPiServer Indexing Service supposed to bind to a PDF IFilter? It seems to index .doc and .xls files. I have recreated the index folders in the VPP folders.
Must the IFilter perhaps be present when the EPiServer installation is made?
The EPiServer Indexing Service should find any IFilter when indexing files. The IFilter must be available when the document is being indexed, but not when EPiServer is installed (filters are discovered at runtime when indexing documents). Which version of EPiServer CMS are you using?
I'm using CMS 5 R2 SP1 on Windows 2003 R2, installed from scratch. Installed the PDF IFilter first, from Acrobat Reader 9 (IFilter v 6.0); I have also tried v 5.0.
The IndexingService process loads the filter at startup (checked with Filemon), but no indexing happens. What triggers the indexing process to start: EPiServer or a file system event watcher? Is there a way to log what the EPiServer Indexing Service does?
The service queries the database every minute for files that are not indexed (for example new or changed files uploaded inside edit-mode). If the complete index is deleted from disk it will reindex everything from scratch.
Stop the Windows service; you can then start the indexing service executable from a command prompt with "EPiServer.IndexingService.exe DEBUG" to see what it actually does. That prints all log4net messages to the console.
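To make the polling behaviour described above concrete, here is a minimal sketch in Python. It is not the EPiServer implementation: the dictionary standing in for the database, the names poll_once and run, and the assumption that a file which fails to index simply stays marked as not indexed are all hypothetical. Under that assumption, a file that throws during indexing is picked up again on every pass.

```python
import time
from typing import Callable, Dict, List


def poll_once(files: Dict[str, bool], index_file: Callable[[str], None]) -> List[str]:
    """Index every file not yet marked as indexed; return the names that failed.

    `files` maps a file name to an "indexed" flag and stands in for the
    database query (hypothetical, not the real schema).
    """
    failed: List[str] = []
    for name, indexed in sorted(files.items()):
        if indexed:
            continue
        try:
            index_file(name)
            files[name] = True   # mark as indexed only on success
        except Exception:
            failed.append(name)  # flag stays False, so the next pass retries it
    return failed


def run(files: Dict[str, bool], index_file: Callable[[str], None],
        interval_seconds: float = 60.0, passes: int = 1) -> None:
    """Run a fixed number of polling passes, sleeping between them."""
    for i in range(passes):
        poll_once(files, index_file)
        if i + 1 < passes:
            time.sleep(interval_seconds)
```

With a callback that raises for .pdf files, poll_once leaves the PDF unindexed while the other files are marked as done, which is one way the same item could show up in the log on every run.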
Thanks. I uploaded the good old Drift av EPiServer 4 PDF (244 kB) and this is what the log says:
Scanning configuration 2 for changes
Deleting item medlemmar 2008.pdf from index
Deleted 0 item(s) from index
Deleting item Drift av EpiServer 4.pdf from index
Deleted 0 item(s) from index
Create index document for item Drift av EpiServer 4.pdf
Exception creating index for item Drift av EpiServer 4.pdf - Capacity exceeds maximum capacity.
Parameter name: capacity
Failed to create index document for item Drift av EpiServer 4.pdf
Closing index
Going to sleep
The line "Deleting item medlemmar 2008.pdf from index" repeats on every run (every minute) in all three index configurations.
I think this is a problem that has been fixed after SP1, but we have not been able to reproduce it. Basically, we do not allocate enough memory for some IFilter chunking implementations. Try uninstalling the latest IFilter and installing an older version.
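As a rough illustration of the general idea (not the actual EPiServer fix, which is not shown in this thread), text can be drained from a filter chunk with a fixed-size buffer in a loop instead of sizing one buffer up front. The sketch below is Python, and get_text, read_all_text and make_fake_chunk are hypothetical names; an empty string stands in for the filter's "no more text" signal.

```python
from typing import Callable, List


def read_all_text(get_text: Callable[[int], str], buffer_size: int = 4096) -> str:
    """Collect all text from one chunk by calling get_text until it is empty."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    parts: List[str] = []
    while True:
        piece = get_text(buffer_size)  # returns at most buffer_size characters
        if not piece:                  # empty string: assumed "no more text" signal
            break
        parts.append(piece)
    return "".join(parts)


def make_fake_chunk(text: str) -> Callable[[int], str]:
    """Hypothetical stand-in for a filter chunk that hands out text in pieces."""
    position = 0

    def get_text(max_chars: int) -> str:
        nonlocal position
        piece = text[position:position + max_chars]
        position += len(piece)
        return piece

    return get_text
```

Because the result grows piece by piece, the reader does not need to know in advance how much text a particular filter version returns per chunk.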
I think you need to file a support case to dig any deeper into this if you can't get it to work. We have had several PDF-related cases in the last month, so a reasonable guess is that it is version related, since this has worked in the past.
Did you get anywhere when you tried installing a previous version of the IFilter? We are experiencing similar problems when using IFilter v 6.0.