Class TextIndexer
Indexes HTML into keywords.
Inheritance
Inherited Members
Namespace: EPiServer.Core.Html
Assembly: EPiServer.dll
Version: 10.10.4
Syntax
public class TextIndexer
Constructors
TextIndexer()
Declaration
public TextIndexer()
Methods
StripHtml(String, Int32)
Strips all HTML elements from an HTML string and returns the text.
Declaration
public static string StripHtml(string htmlText, int maxTextLengthToReturn)
Parameters
Type | Name | Description |
---|---|---|
System.String | htmlText | An HTML string. |
System.Int32 | maxTextLengthToReturn | Max string length to return. 0 returns all text in the HTML string. |
Returns
Type | Description |
---|---|
System.String | A string with text |
Remarks
Note that maxTextLengthToReturn counts an HTML entity as one character.
If the stripped text is longer than the specified maximum length, the string is truncated and "..." is appended at the end.
The algorithm in StripHtml works as follows:
Remove all tags, replacing each with a space. Replace every whitespace character with a space. Merge consecutive spaces into one space.
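The steps in these remarks can be approximated with a short, self-contained sketch. This is not the EPiServer implementation: the class name HtmlStripSketch and the method Strip are hypothetical, the tag pattern is a simple regular expression, trimming the result is an assumption, and the sketch counts raw characters rather than treating an HTML entity as one character.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of EPiServer. It only approximates
// the documented steps of TextIndexer.StripHtml.
public static class HtmlStripSketch
{
    public static string Strip(string html, int maxLength)
    {
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty;
        }

        // Step 1: remove all tags, replacing each with a space.
        string text = Regex.Replace(html, "<[^>]*>", " ");

        // Steps 2 and 3: turn every whitespace run into a single space.
        // Trimming the ends is an assumption, not documented behavior.
        text = Regex.Replace(text, @"\s+", " ").Trim();

        // A maximum of 0 means "return all text".
        if (maxLength <= 0 || text.Length <= maxLength)
        {
            return text;
        }

        // Truncate and append the "..." marker described in the remarks.
        // Unlike the real method, HTML entities are not counted as one character.
        return text.Substring(0, maxLength) + "...";
    }
}
```

With this sketch, `HtmlStripSketch.Strip("<p>Hello <b>world</b></p>", 0)` yields `Hello world`, and the same input with a maximum length of 5 yields `Hello...`.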
StripHtml(String, Int32, Int32, String)
Strips all HTML elements from an HTML string and returns the text.
Declaration
public static string StripHtml(string htmlText, int maxTextLengthToReturn, int maxWordSize, string moreTextMarker)
Parameters
Type | Name | Description |
---|---|---|
System.String | htmlText | The HTML text. |
System.Int32 | maxTextLengthToReturn | The max text length to return. |
System.Int32 | maxWordSize | Maximum word size used when placing the moreTextMarker at the end of the string. |
System.String | moreTextMarker | The more text marker. |
Returns
Type | Description |
---|---|
System.String | A string without HTML markup. |
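A call to the four-parameter overload might look like the sketch below. It assumes a project that references EPiServer.dll. The class TeaserExample, the method BuildTeaser, and the argument values 150, 20, and "..." are hypothetical and chosen only for illustration. The exact effect of maxWordSize on where the marker is placed is not described on this page, so no output is shown.

```csharp
using EPiServer.Core.Html;

// Hypothetical wrapper for illustration; requires a reference to EPiServer.dll.
public static class TeaserExample
{
    public static string BuildTeaser(string htmlBody)
    {
        // Illustrative values, not recommended defaults:
        // return at most 150 characters, use a maximum word size of 20,
        // and pass "..." as the more-text marker.
        return TextIndexer.StripHtml(htmlBody, 150, 20, "...");
    }
}
```

The two-parameter overload, `TextIndexer.StripHtml(htmlBody, 150)`, covers the simpler case where the default "..." marker described in its remarks is acceptable.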