You can index attachments, meaning files such as Word and PDF documents. For a list of supported formats, see the Apache Tika documentation.
To index attachments using the .NET API, create an instance of a class that has a property of type Attachment (found in the EPiServer.Find namespace). The Attachment constructor takes a single parameter of type Func<FileStream>. A second class, FileAttachment (also in the EPiServer.Find namespace), takes a file path as its constructor parameter instead.
Examples
First, create a class named Document with an Attachment property.
C#
public class Document
{
public string Name { get; set; }
public Attachment Attachment { get; set; }
}
Indexing an instance of the Document class indexes the Word document along with its metadata (Name in this example).
C#
var path = "TestData/Memoirs.docx";
var document = new Document()
{
Name = "My memoirs",
// No semicolon inside an object initializer; the statement ends after the closing brace.
Attachment = new FileAttachment(path)
};
client.Index(document);
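The stream-factory constructor described earlier can be used in place of FileAttachment. The following is a minimal sketch, not a definitive implementation: it assumes an IClient instance is supplied by the caller, and the StreamAttachmentExample class and IndexFromStream method are hypothetical names introduced only for illustration. The Document class is repeated so the block stands on its own.

```csharp
using System.IO;
using EPiServer.Find;

// Same shape as the Document class shown above.
public class Document
{
    public string Name { get; set; }
    public Attachment Attachment { get; set; }
}

// Hypothetical helper class, not part of EPiServer.Find.
public static class StreamAttachmentExample
{
    public static void IndexFromStream(IClient client, string path)
    {
        var document = new Document
        {
            Name = "My memoirs",
            // The constructor receives a delegate, so File.OpenRead runs
            // only when that delegate is invoked, not when it is created.
            Attachment = new Attachment(() => File.OpenRead(path))
        };

        client.Index(document);
    }
}
```

Passing a factory rather than an open stream means the code that builds the Document does not need to open the file itself.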
You can search the indexed Word document. For example, if it contains "Banana," the result variable below would contain a hit.
C#
var result = client.Search<Document>()
.For("Banana").GetResult();
Important note
At this time, a REST API issue causes an exception the first time an instance of a type with an Attachment property (document in this example) is indexed. This happens only on the first indexing call for that type. Subsequent calls work as expected.
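One possible way to work around the first-call exception is to retry the indexing call once. The sketch below is an assumption about how this could be handled, not a documented recommendation: FirstIndexRetry and Run are hypothetical names, and because the exact exception type is not stated here, the helper catches Exception in general. Narrow the catch clause if you know the specific type.

```csharp
using System;

// Hypothetical helper, not part of EPiServer.Find.
public static class FirstIndexRetry
{
    // Runs the indexing action. If it throws, runs it exactly one more time.
    // A failure on the second attempt propagates to the caller.
    public static void Run(Action indexAction)
    {
        if (indexAction == null)
        {
            throw new ArgumentNullException(nameof(indexAction));
        }

        try
        {
            indexAction();
        }
        catch (Exception)
        {
            // Assumption: the failure is the one-time issue described above,
            // so a single retry is expected to succeed.
            indexAction();
        }
    }
}
```

With the client and document from the earlier example, the call would look like FirstIndexRetry.Run(() => client.Index(document));. Retrying only once keeps a genuine, persistent error from being hidden.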