Uses of Package
org.apache.lucene.analysis.tokenattributes
Package
    Description
org.apache.lucene.analysis.standard
    Fast, general-purpose grammar-based tokenizer StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29.
org.apache.lucene.analysis.tokenattributes
    General-purpose attributes for text analysis.
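To give a feel for what word-break based tokenization does, here is a minimal sketch that uses only the JDK's java.text.BreakIterator. It does not use StandardTokenizer, and the JDK's word-boundary rules may differ from Unicode Standard Annex #29 in details. The class name WordBreakSketch and the method words are hypothetical names chosen for this illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: splits text at the JDK's
// word boundaries and keeps only segments that start with a letter or digit.
class WordBreakSketch {

    static List<String> words(String text) {
        BreakIterator boundaries = BreakIterator.getWordInstance(Locale.ROOT);
        boundaries.setText(text);

        List<String> result = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next();
             end != BreakIterator.DONE;
             start = end, end = boundaries.next()) {
            String segment = text.substring(start, end);
            // Punctuation and whitespace also form segments; drop them here.
            if (Character.isLetterOrDigit(segment.codePointAt(0))) {
                result.add(segment);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world!"));
    }
}
```

The filter on the first code point is a simplification made for this sketch; a grammar-based tokenizer classifies segments by rule rather than by inspecting one character.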
Class
    Description
BytesTermAttribute
    This attribute can be used if you have the raw term bytes to be indexed.
CharTermAttribute
    The term text of a Token.
CharTermAttributeImpl
    Default implementation of CharTermAttribute.
FlagsAttribute
    This attribute can be used to pass different flags down the Tokenizer chain, e.g.
KeywordAttribute
    This attribute can be used to mark a token as a keyword.
OffsetAttribute
    The start and end character offset of a Token.
PackedTokenAttributeImpl
    Default implementation of the common attributes used by Lucene: CharTermAttribute, TypeAttribute, PositionIncrementAttribute, PositionLengthAttribute, OffsetAttribute, TermFrequencyAttribute.
PayloadAttribute
    The payload of a Token.
PayloadAttributeImpl
    Default implementation of PayloadAttribute.
PositionIncrementAttribute
    Determines the position of this token relative to the previous Token in a TokenStream, used in phrase searching.
PositionLengthAttribute
    Determines how many positions this token spans.
SentenceAttribute
    This attribute tracks what sentence a given token belongs to as well as potentially other sentence specific attributes.
TermFrequencyAttribute
    Sets the custom term frequency of a term within one document.
TermToBytesRefAttribute
    This attribute is requested by TermsHashPerField to index the contents.
TypeAttribute
    A Token's lexical type.
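The following toy model illustrates how term text, character offsets, and position increments relate to one another. It is a sketch under stated assumptions and does not use the Lucene API: the class TokenAttributeSketch, its nested Token class, and the tokenize method are hypothetical names, and the whitespace splitting and stop-word handling are simplifications chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical toy model, not Lucene code.
class TokenAttributeSketch {

    // One token with the kinds of values the attributes above describe.
    static final class Token {
        final String term;            // term text
        final int startOffset;        // index of the first character in the input
        final int endOffset;          // index one past the last character
        final int positionIncrement;  // distance from the previously emitted token

        Token(String term, int startOffset, int endOffset, int positionIncrement) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + " [" + startOffset + "," + endOffset + ") +" + positionIncrement;
        }
    }

    // Splits on whitespace, lower-cases each term, and drops stop words.
    // A dropped stop word enlarges the next token's position increment.
    static List<Token> tokenize(String text, Set<String> stopWords) {
        List<Token> tokens = new ArrayList<>();
        int skipped = 0;
        int i = 0;
        while (i < text.length()) {
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (start == i) {
                break; // only trailing whitespace was left
            }
            String term = text.substring(start, i).toLowerCase(Locale.ROOT);
            if (stopWords.contains(term)) {
                skipped++;
                continue;
            }
            tokens.add(new Token(term, start, i, skipped + 1));
            skipped = 0;
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token token : tokenize("The quick brown fox", Set.of("the"))) {
            System.out.println(token);
        }
    }
}
```

In this model the token "quick" gets a position increment of 2 because the stop word before it was removed. Recording the gap this way lets a phrase search still see that a word once sat between the start of the text and "quick".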