public abstract class Analyzer extends Object implements Closeable
 In order to define what analysis is done, subclasses must define their
 TokenStreamComponents in createComponents(String).
 The components are then reused in each call to tokenStream(String, Reader).
 
Simple example:
 Analyzer analyzer = new Analyzer() {
   @Override
   protected TokenStreamComponents createComponents(String fieldName) {
     Tokenizer source = new FooTokenizer();
     TokenStream filter = new FooFilter(source);
     filter = new BarFilter(filter);
     return new TokenStreamComponents(source, filter);
   }
 };
 
 For more examples, see the Analysis package documentation.
 For some concrete implementations bundled with Lucene, look in the analysis modules:
| Modifier and Type | Class and Description |
|---|---|
| static class | Analyzer.ReuseStrategy: Strategy defining how TokenStreamComponents are reused per call to tokenStream(String, java.io.Reader). |
| static class | Analyzer.TokenStreamComponents: This class encapsulates the outer components of a token stream. |
| Modifier and Type | Field and Description |
|---|---|
| static Analyzer.ReuseStrategy | GLOBAL_REUSE_STRATEGY: A predefined Analyzer.ReuseStrategy that reuses the same components for every field. |
| static Analyzer.ReuseStrategy | PER_FIELD_REUSE_STRATEGY: A predefined Analyzer.ReuseStrategy that reuses components per-field by maintaining a Map of TokenStreamComponents per field name. |
| Constructor and Description |
|---|
| Analyzer(): Create a new Analyzer, reusing the same set of components per-thread across calls to tokenStream(String, Reader). |
| Analyzer(Analyzer.ReuseStrategy reuseStrategy): Expert: create a new Analyzer with a custom Analyzer.ReuseStrategy. |
| Modifier and Type | Method and Description |
|---|---|
| void | close(): Frees persistent resources used by this Analyzer. |
| protected abstract Analyzer.TokenStreamComponents | createComponents(String fieldName): Creates a new Analyzer.TokenStreamComponents instance for this analyzer. |
| int | getOffsetGap(String fieldName): Just like getPositionIncrementGap(java.lang.String), except for Token offsets instead. |
| int | getPositionIncrementGap(String fieldName): Invoked before indexing an IndexableField instance if terms have already been added to that field. |
| Analyzer.ReuseStrategy | getReuseStrategy(): Returns the used Analyzer.ReuseStrategy. |
| Version | getVersion(): Return the version of Lucene this analyzer will mimic the behavior of for analysis. |
| protected Reader | initReader(String fieldName, Reader reader): Override this if you want to add a CharFilter chain. |
| void | setVersion(Version v): Set the version of Lucene whose behavior this analyzer should mimic for analysis. |
| TokenStream | tokenStream(String fieldName, Reader reader): Returns a TokenStream suitable for fieldName, tokenizing the contents of reader. |
| TokenStream | tokenStream(String fieldName, String text): Returns a TokenStream suitable for fieldName, tokenizing the contents of text. |
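The two gap methods summarized above control how much space separates multiple values of the same field during indexing, so that phrase queries do not match across value boundaries. The position arithmetic can be sketched without any Lucene types (the gap value 100 below is purely illustrative):

```java
import java.util.Arrays;

// Plain-Java sketch of how a position increment gap spaces out token
// positions when one document field has two values. Not Lucene code;
// it only illustrates the arithmetic described in the method docs.
public class PositionGapSketch {
    static int[] positions(int tokensInFirstValue, int tokensInSecondValue, int gap) {
        int[] pos = new int[tokensInFirstValue + tokensInSecondValue];
        int p = -1;
        // Tokens within one value advance by the usual increment of 1.
        for (int i = 0; i < tokensInFirstValue; i++) pos[i] = ++p;
        // The gap is added before the first token of the next value.
        p += gap;
        for (int i = 0; i < tokensInSecondValue; i++) pos[tokensInFirstValue + i] = ++p;
        return pos;
    }

    public static void main(String[] args) {
        // Two 3-token values with a gap of 100: 0,1,2 then 103,104,105.
        System.out.println(Arrays.toString(positions(3, 3, 100)));
    }
}
```

With a large gap, a phrase query spanning positions 2 and 103 cannot match, which is the usual motivation for returning a big value from getPositionIncrementGap.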
public static final Analyzer.ReuseStrategy GLOBAL_REUSE_STRATEGY
A predefined Analyzer.ReuseStrategy that reuses the same components for every field.

public static final Analyzer.ReuseStrategy PER_FIELD_REUSE_STRATEGY
A predefined Analyzer.ReuseStrategy that reuses components per-field by maintaining a Map of TokenStreamComponents per field name.

public Analyzer()
Create a new Analyzer, reusing the same set of components per-thread across calls to tokenStream(String, Reader).

public Analyzer(Analyzer.ReuseStrategy reuseStrategy)
Expert: create a new Analyzer with a custom Analyzer.ReuseStrategy.

NOTE: if you just want to reuse on a per-field basis, it's easier to use a subclass of AnalyzerWrapper such as PerFieldAnalyzerWrapper instead.
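The difference between the two predefined strategies can be sketched without Lucene. Below, Components, GlobalReuse, and PerFieldReuse are hypothetical stand-ins for Analyzer.TokenStreamComponents, GLOBAL_REUSE_STRATEGY, and PER_FIELD_REUSE_STRATEGY:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Sketch of the two reuse policies: the global strategy hands back one
// cached object for every field, the per-field strategy caches one per
// field name. All names here are illustrative, not Lucene API.
public class ReuseSketch {
    // Hypothetical stand-in for Analyzer.TokenStreamComponents.
    static class Components {}

    interface ReuseStrategy {
        Components get(String fieldName, Supplier<Components> factory);
    }

    static class GlobalReuse implements ReuseStrategy {
        private Components cached;
        public Components get(String fieldName, Supplier<Components> factory) {
            if (cached == null) cached = factory.get();
            return cached; // same instance, regardless of field
        }
    }

    static class PerFieldReuse implements ReuseStrategy {
        private final Map<String, Components> cache = new HashMap<>();
        public Components get(String fieldName, Supplier<Components> factory) {
            return cache.computeIfAbsent(fieldName, f -> factory.get());
        }
    }

    public static void main(String[] args) {
        ReuseStrategy global = new GlobalReuse();
        ReuseStrategy perField = new PerFieldReuse();
        System.out.println(global.get("title", Components::new) == global.get("body", Components::new));
        System.out.println(perField.get("title", Components::new) == perField.get("body", Components::new));
        System.out.println(perField.get("title", Components::new) == perField.get("title", Components::new));
    }
}
```

Per-field reuse matters when different fields need differently configured chains, at the cost of one cached component set per field name.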
protected abstract Analyzer.TokenStreamComponents createComponents(String fieldName)
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
Parameters: fieldName - the name of the fields content passed to the Analyzer.TokenStreamComponents sink as a reader
Returns: the Analyzer.TokenStreamComponents for this analyzer

public final TokenStream tokenStream(String fieldName, Reader reader) throws IOException
Returns a TokenStream suitable for fieldName, tokenizing the contents of reader.
 
 This method uses createComponents(String) to obtain an
 instance of Analyzer.TokenStreamComponents. It returns the sink of the
 components and stores the components internally. Subsequent calls to this
 method will reuse the previously stored components after resetting them
 through Analyzer.TokenStreamComponents.setReader(Reader).
 
 NOTE: After calling this method, the consumer must follow the 
 workflow described in TokenStream to properly consume its contents.
 See the Analysis package documentation for
 some examples demonstrating this.
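That consumer workflow (reset, then incrementToken until exhausted, then end and close) can be sketched stand-alone. MiniTokenStream below is a hypothetical stand-in for a real TokenStream; the real Lucene API exposes term text through attributes rather than a term() method:

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch of the TokenStream consumer contract:
// reset() before the first incrementToken(), end() and close() after the last.
public class ConsumeSketch {
    static class MiniTokenStream {
        private final String text;
        private String[] tokens;
        private String current;
        private int pos;
        MiniTokenStream(String text) { this.text = text; }
        void reset() { tokens = text.trim().split("\\s+"); pos = 0; } // prepare for consumption
        boolean incrementToken() {          // advance to the next token, if any
            if (pos >= tokens.length) return false;
            current = tokens[pos++];
            return true;
        }
        String term() { return current; }   // current term text
        void end() {}                       // a real stream records final offset state here
        void close() {}                     // a real stream releases resources here
    }

    static List<String> consume(MiniTokenStream ts) {
        List<String> out = new ArrayList<>();
        ts.reset();                         // mandatory before the first incrementToken()
        while (ts.incrementToken()) {
            out.add(ts.term());
        }
        ts.end();                           // mandatory after the last incrementToken()
        ts.close();
        return out;
    }

    public static void main(String[] args) {
        System.out.println(consume(new MiniTokenStream("some text to analyze")));
    }
}
```

Skipping reset() or end() on a real TokenStream leaves it in an inconsistent state, which is why the workflow above is called out as mandatory.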
 
 NOTE: If your data is available as a String, use
 tokenStream(String, String) which reuses a StringReader-like
 instance internally.
Parameters:
fieldName - the name of the field the created TokenStream is used for
reader - the reader the stream's source reads from
Throws:
AlreadyClosedException - if the Analyzer is closed
IOException - if an i/o error occurs
See Also: tokenStream(String, String)

public final TokenStream tokenStream(String fieldName, String text) throws IOException
Returns a TokenStream suitable for fieldName, tokenizing the contents of text.
 
 This method uses createComponents(String) to obtain an
 instance of Analyzer.TokenStreamComponents. It returns the sink of the
 components and stores the components internally. Subsequent calls to this
 method will reuse the previously stored components after resetting them
 through Analyzer.TokenStreamComponents.setReader(Reader).
 
 NOTE: After calling this method, the consumer must follow the 
 workflow described in TokenStream to properly consume its contents.
 See the Analysis package documentation for
 some examples demonstrating this.
Parameters:
fieldName - the name of the field the created TokenStream is used for
text - the String the stream's source reads from
Throws:
AlreadyClosedException - if the Analyzer is closed
IOException - if an i/o error occurs (may rarely happen for strings)
See Also: tokenStream(String, Reader)

protected Reader initReader(String fieldName, Reader reader)
Override this if you want to add a CharFilter chain. The default implementation returns reader unchanged.
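A sketch of such an override, using a plain java.io.Reader wrapper in place of a real Lucene CharFilter (LowercaseReader and the per-field choice below are hypothetical, purely to show where the hook sits):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Sketch of the initReader hook: wrap the incoming Reader in a filtering
// Reader before the tokenizer ever sees the characters.
public class InitReaderSketch {
    // Hypothetical stand-in for a Lucene CharFilter implementation.
    static class LowercaseReader extends Reader {
        private final Reader in;
        LowercaseReader(Reader in) { this.in = in; }
        @Override public int read(char[] buf, int off, int len) throws IOException {
            int n = in.read(buf, off, len);
            for (int i = off; i < off + n; i++) buf[i] = Character.toLowerCase(buf[i]);
            return n; // -1 (end of stream) passes through untouched
        }
        @Override public void close() throws IOException { in.close(); }
    }

    // Analogue of Analyzer.initReader(String, Reader): decorate per field name.
    static Reader initReader(String fieldName, Reader reader) {
        return "title".equals(fieldName) ? new LowercaseReader(reader) : reader;
    }

    public static void main(String[] args) throws IOException {
        Reader r = initReader("title", new StringReader("MiXeD Case"));
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = r.read()) != -1) sb.append((char) c);
        System.out.println(sb);
    }
}
```

In real code the wrapper would be one of Lucene's CharFilter subclasses; the point is only that the decoration happens here, before tokenization, so offsets can be corrected by the filter.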
Parameters:
fieldName - IndexableField name being indexed
reader - original Reader

public int getPositionIncrementGap(String fieldName)
Invoked before indexing an IndexableField instance if terms have already been added to that field.
Parameters: fieldName - IndexableField name being indexed
Returns: position increment gap, added to the next token emitted from tokenStream(String,Reader). This value must be >= 0.

public int getOffsetGap(String fieldName)
Just like getPositionIncrementGap(java.lang.String), except for Token offsets instead. By default this returns 1. This method is only called if the field produced at least one token for indexing.
Parameters: fieldName - the field just indexed
Returns: offset gap, added to the next token emitted from tokenStream(String,Reader). This value must be >= 0.

public final Analyzer.ReuseStrategy getReuseStrategy()
Returns the used Analyzer.ReuseStrategy.

public void setVersion(Version v)
Set the version of Lucene whose behavior this analyzer should mimic for analysis.

public Version getVersion()
Return the version of Lucene this analyzer will mimic the behavior of for analysis.

public void close()
Frees persistent resources used by this Analyzer.
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable

Copyright © 2000-2015 Apache Software Foundation. All Rights Reserved.