Indexing an Atom feed

Hi,

I read that if you host an instance of the EPiServer Full Text Indexing Search Service, which appears to have been released recently, you can index content via an Atom feed.

Our MindTouch wiki can generate Atom feeds. Is indexing additional content like this what the service is intended for? That's what I understood from the documentation; I just wanted to make sure I was on the right track.

Has anyone had any experience of doing this?

#51283
May 31, 2011 15:54

The FTS service uses the Atom format for its messages. To get search results, you do an HTTP GET with an Atom-formatted request, and the service returns an Atom feed containing the search results. To push updates to the index, you do an HTTP POST with an Atom-formatted request containing the data you want to add to the index. So can it index your Atom feed? No, not without some code that polls that feed and pushes updates to the FTS service.
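
To make the polling idea a bit more concrete, here is a rough Python sketch of the "read the feed" half. It only parses an Atom feed and hands each entry to a `push` callback; that callback stands in for whatever code builds the Atom-formatted POST to the FTS service, since the exact request format isn't described in this thread. The sample feed, the `urn:wiki:page:` ids and the function names are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace (RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample feed; a real one would be downloaded from the wiki.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Wiki pages</title>
  <entry>
    <id>urn:wiki:page:1</id>
    <title>Getting started</title>
    <updated>2011-05-30T10:00:00Z</updated>
  </entry>
  <entry>
    <id>urn:wiki:page:2</id>
    <title>FAQ</title>
    <updated>2011-05-31T09:30:00Z</updated>
  </entry>
</feed>"""


def parse_atom_entries(feed_xml):
    """Return a list of dicts with the id, title and updated value of each entry."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        entries.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
        })
    return entries


def poll_once(feed_xml, push):
    """Parse one snapshot of the feed and pass every entry to `push`.

    `push` is a placeholder for the code that would POST the entry
    to the search service; it is not part of any real API.
    """
    entries = parse_atom_entries(feed_xml)
    for entry in entries:
        push(entry)
    return len(entries)
```

In a real poller you would fetch the feed on a schedule and only push entries whose `updated` value has changed since the last run.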

#51291
May 31, 2011 22:28

Thanks for your reply, Magnus. I see.

So I need to poll the MindTouch Atom feed and then POST any new updates to the FTS service.

I suppose this might get messy when people start making edits to content in the wiki. I'd need to check that a content item didn't already exist (possibly deleting it if it did) before adding it to FTS; otherwise we'd end up with duplicated entries. That leads me to think about deleted content items. We'd need to handle the delete events somehow and remove the associated content from the index.
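
One way to think about the add/update/delete bookkeeping is to keep the previous snapshot of the feed (entry id mapped to its `updated` timestamp) and compare it with the current one. The sketch below is just that comparison in Python; the function name and the sample ids are hypothetical, and it assumes the feed lists all content. A "recent changes" style feed would not reveal deletions this way.

```python
def diff_snapshots(previous, current):
    """Compare two {entry_id: updated_timestamp} snapshots of a feed.

    Returns three sorted lists: ids to add, ids to re-index, ids to remove.
    """
    added = sorted(i for i in current if i not in previous)
    changed = sorted(
        i for i in current if i in previous and current[i] != previous[i]
    )
    deleted = sorted(i for i in previous if i not in current)
    return added, changed, deleted


# Example: page b was edited, page c was deleted, page d is new.
previous = {"a": "2011-05-30", "b": "2011-05-30", "c": "2011-05-30"}
current = {"a": "2011-05-30", "b": "2011-06-01", "d": "2011-06-01"}
added, changed, deleted = diff_snapshots(previous, current)
```

Entries in `changed` would be replaced in the index rather than added a second time, which avoids the duplicates, and entries in `deleted` are the ones to remove.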

It doesn't look as though FTS supports deleting individual records, so presumably we'd have to wipe the named instance and reload it from a complete Atom feed of all content. We could create many smaller named instances in order to isolate changes, but then I suppose we'd need to search multiple instances and combine the results.
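
For what it's worth, combining results from several instances could look roughly like the Python sketch below. It assumes each instance returns hits as `(score, id)` pairs already sorted by descending score; that shape and the function name are my own assumptions, not something FTS is documented to provide here. Also bear in mind that relevance scores from separate Lucene indexes are computed against different term statistics, so a plain merge by score is only an approximation.

```python
import heapq
from itertools import islice


def merge_results(result_lists, limit=10):
    """Merge per-instance hit lists into one list ordered by descending score.

    Each input list must already be sorted from highest to lowest score.
    """
    merged = heapq.merge(*result_lists, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, limit))


# Hypothetical hits from two named instances.
instance_a = [(0.9, "a1"), (0.4, "a2")]
instance_b = [(0.7, "b1"), (0.1, "b2")]
top_three = merge_results([instance_a, instance_b], limit=3)
```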

MindTouch also uses Lucene for search and has its own index. Can we point FTS directly at this additional index? Similar to the question in the previous paragraph: can we query more than one Lucene index with a single query?

#51326
Jun 01, 2011 16:43

I'm sorry, I don't know whether the query format is specific to the EPiServer implementation of the Lucene service, or whether you could combine multiple such services in some way.

For handling removed items, I have no better suggestion than to look at the Relate templates and assemblies (using your favorite IL decompiler) to see what happens when entities are removed.

#51329
Jun 01, 2011 23:52

Magnus, using an IL decompiler to look at Relate code seems more difficult than actually opening the source files ;)

You're able to update individual index items. There are a bunch of examples; for instance, Relate includes code that listens for CMS page publications and modifications and pushes the page to the index in response to these changes.

The code is here:

Templates\RelatePlus\CmsModules\CmsSearchHandler.cs (which converts the PageData to a searchable IndexItem) and

Templates\RelatePlus\InitializationModules\CmsIntegrationModule.cs (which listens to CMS events and pushes them to the index)

Good luck!

#51632
Jun 17, 2011 16:11

I meant the Community/Common assemblies, but quite possibly I would use a decompiler for the Relate assembly as well, simply by force of habit :)

#51654
Jun 20, 2011 7:45