PDF files and EPiServer indexing.

Hi

According to "Microsoft Indexing Service" (23-09-2008), the EPiServer Indexing Service takes care of indexing the versioning UFS. I can't get it to index PDF files; it probably fails to find and bind to the PDF IFilter. Microsoft Indexing Service itself is working: when I place the same PDF file in a native UFS, a Microsoft Indexing Service search with a SearchDataSource finds it.

Is the EPiServer Indexing Service supposed to bind to a PDF IFilter? It seems to index .doc and .xls files. I have recreated the index folders in the VPP folders.

Does the IFilter perhaps have to be present when EPiServer is installed?

Anyone?

#29106
Apr 06, 2009 16:18

The EPiServer Indexing Service should find any IFilter when indexing files. The IFilter must be available when the document is being indexed, but not when EPiServer is installed, since filters are discovered at runtime when documents are indexed. Which version of EPiServer CMS are you using?

#29115
Apr 07, 2009 12:11

Hi

I'm using CMS 5 R2 SP1 on Windows 2003 R2, installed from scratch. I first installed the PDF IFilter from Acrobat Reader 9 (IFilter v 6.0) and have also tried v 5.0.

The IndexingService process loads the filter at startup (checked with filmon), but nothing gets indexed. What triggers the indexing process to start: EPiServer or a file system event watcher? Is there a way to log what the EPiServer Indexing Service does?

#29119
Apr 07, 2009 14:38

The service queries the database every minute for files that are not yet indexed (for example, new or changed files uploaded in edit mode). If the complete index is deleted from disk, it will reindex everything from scratch.
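
The polling behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the actual EPiServer implementation: `fetch_unindexed` and `index_item` are hypothetical callbacks standing in for the database query and the IFilter-based indexing, and the choice to continue past a failing item is an assumption.

```python
import time
from typing import Callable, Iterable, List


def run_indexing_pass(
    fetch_unindexed: Callable[[], Iterable[str]],
    index_item: Callable[[str], None],
) -> List[str]:
    """Run one pass over the pending items and return the names that failed."""
    failed = []
    for name in fetch_unindexed():
        try:
            index_item(name)
        except Exception:
            # Assumption: one failing item does not abort the pass;
            # record it and continue with the next item.
            failed.append(name)
    return failed


def poll_forever(
    fetch_unindexed: Callable[[], Iterable[str]],
    index_item: Callable[[str], None],
    interval_seconds: float = 60.0,
) -> None:
    """Repeat the pass, sleeping between runs (one minute by default)."""
    while True:
        run_indexing_pass(fetch_unindexed, index_item)
        time.sleep(interval_seconds)
```

In a model like this, a file that fails on every pass would simply show up again in the next run.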

Stop the Windows service, then start the indexing service executable from a command prompt with "EPiServer.IndexingService.exe DEBUG" to see what it actually does. That will print all log4net messages to the console.

#29122
Apr 07, 2009 18:14

Thanks. I uploaded the good old "Drift av EPiServer 4" PDF (244 kB), and this is what the log says:

Scanning configuration 2 for changes
Deleting item medlemmar 2008.pdf from index
Deleted 0 item(s) from index
Deleting item Drift av EpiServer 4.pdf from index
Deleted 0 item(s) from index
Create index document for item Drift av EpiServer 4.pdf
Exception creating index for item Drift av EpiServer 4.pdf - Capacity exceeds maximum capacity.
Parameter name: capacity
Failed to create index document for item Drift av EpiServer 4.pdf
Closing index
Going to sleep

The line "Deleting item medlemmar 2008.pdf from index" repeats on every run (every minute) in all three index configurations.
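
When the console output gets long, a small script can pull out just the items that failed. The following is a minimal sketch in Python that relies only on the message format visible in the log above ("Exception creating index for item NAME - REASON"). It assumes each message sits on a single line, so console output wrapped at 80 columns has to be unwrapped first, and the name `failed_items` is made up for this example.

```python
import re
from typing import Dict

# Matches the failure line format seen in the DEBUG console output.
# The item name is taken up to the first " - " separator.
FAILURE = re.compile(
    r"^Exception creating index for item (?P<item>.+?) - (?P<reason>.+)$"
)


def failed_items(log_text: str) -> Dict[str, str]:
    """Map each failed item name to the reason reported in the log."""
    failures = {}
    for line in log_text.splitlines():
        match = FAILURE.match(line.strip())
        if match:
            failures[match.group("item")] = match.group("reason")
    return failures
```

A file name that itself contains " - " would be cut at the first separator, so treat the result as a quick overview rather than an exact report.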

/Janne 

 

#29124
Apr 08, 2009 8:50

I think this is a problem that has been fixed after SP1, but we have not been able to reproduce it. Basically, we do not allocate enough memory for some IFilter chunking implementations. Try uninstalling the latest IFilter and installing an older version.
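
To illustrate the kind of failure meant here: the sketch below is purely hypothetical Python, not the EPiServer code (which is not shown in this thread). It assumes the error comes from collecting filter output into a text buffer with a fixed ceiling; `collect_text`, `CapacityError` and `max_capacity` are invented names.

```python
from typing import Iterable, List


class CapacityError(ValueError):
    """Raised when the collected text would exceed the allowed maximum."""


def collect_text(chunks: Iterable[str], max_capacity: int) -> str:
    """Concatenate text chunks, refusing to grow past max_capacity characters."""
    parts: List[str] = []
    size = 0
    for chunk in chunks:
        size += len(chunk)
        if size > max_capacity:
            # A filter that hands back more text than the caller planned
            # for ends up here, whatever the document's size on disk.
            raise CapacityError(
                f"capacity {size} exceeds maximum capacity {max_capacity}"
            )
        parts.append(chunk)
    return "".join(parts)
```

Under that assumption, a filter version that returns its text in larger pieces could trigger the error where an older version did not, which would fit the advice to try an older IFilter.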

If you can't get it to work, I think you need to file a support case so that we can dig deeper into this. We have had several PDF-related cases in the last month, so a reasonable guess is that the problem is version related, since this has worked in the past.

 

#29137
Apr 08, 2009 15:53

Hi Jan,

Did you get anywhere when you tried installing a previous version of the IFilter? We are experiencing similar problems when using IFilter v 6.0.

 

Thanks

#32867
Sep 22, 2009 12:01