AnalyzingInfixSuggester (Lucene 4.7.2 API)

org.apache.lucene.search.suggest.analyzing
Class AnalyzingInfixSuggester

java.lang.Object
  org.apache.lucene.search.suggest.Lookup
      org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester

All Implemented Interfaces:: Closeable

Direct Known Subclasses:: BlendedInfixSuggester

public class AnalyzingInfixSuggester
extends Lookup
implements Closeable

Analyzes the input text and then suggests matches based on prefix matches to any tokens in the indexed text. This also highlights the tokens that match.

This just uses an ordinary Lucene index. It supports payloads, and records these as a BinaryDocValues field. Matches are sorted only by the suggest weight; it would be nice to support blended score + weight sort in the future. This suggester is therefore best suited to cases where there is a strong a priori ranking of all the suggestions.
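
The behaviour described above can be modelled in a few lines of plain Java. This is only a sketch, not Lucene code: the class `InfixSuggestSketch`, its `Entry` type, and the `tokenize` helper are hypothetical, and a simple lowercase-and-split step stands in for a real Analyzer. It shows a match on the prefix of any token, with results ordered by weight alone.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Toy model of infix suggestion; hypothetical, no Lucene index involved. */
final class InfixSuggestSketch {

    /** A suggestion with its precomputed (a priori) weight. */
    static final class Entry {
        final String text;
        final long weight;
        Entry(String text, long weight) { this.text = text; this.weight = weight; }
    }

    /** Stand-in for an Analyzer: lowercase, then split on non-alphanumerics. */
    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        for (String t : s.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** Returns up to num entries that have a token starting with prefix, heaviest first. */
    static List<String> suggest(List<Entry> entries, String prefix, int num) {
        String p = prefix.toLowerCase(Locale.ROOT);
        List<Entry> hits = new ArrayList<>();
        for (Entry e : entries) {
            for (String token : tokenize(e.text)) {
                if (token.startsWith(p)) { hits.add(e); break; }
            }
        }
        // Rank by weight only; how well the text matches plays no part.
        hits.sort(Comparator.comparingLong((Entry e) -> e.weight).reversed());
        List<String> out = new ArrayList<>();
        for (Entry e : hits.subList(0, Math.min(num, hits.size()))) out.add(e.text);
        return out;
    }
}
```

Because the ordering ignores match quality, the weights supplied at build time decide entirely which suggestions surface first.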

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.lucene.search.suggest.Lookup
`Lookup.LookupPriorityQueue, Lookup.LookupResult`

Field Summary
`static int`	`DEFAULT_MIN_PREFIX_CHARS` Default minimum number of leading characters before PrefixQuery is used (4).
`protected Analyzer`	`indexAnalyzer` Analyzer used at index time
`protected BinaryDocValues`	`payloadsDV` DocValuesField holding the payloads; null if payloads were not indexed.
`protected Analyzer`	`queryAnalyzer` Analyzer used at search time
`protected IndexSearcher`	`searcher` `IndexSearcher` used for lookups.
`protected static String`	`TEXT_FIELD_NAME` Field name used for the indexed text.
`protected BinaryDocValues`	`textDV` DocValuesField holding each suggestion's text.
`protected NumericDocValues`	`weightsDV` DocValuesField holding each suggestion's weight.

Fields inherited from class org.apache.lucene.search.suggest.Lookup
`CHARSEQUENCE_COMPARATOR`

Constructor Summary
`AnalyzingInfixSuggester(Version matchVersion, File indexPath, Analyzer analyzer)` Create a new instance, loading from a previously built directory, if it exists.
`AnalyzingInfixSuggester(Version matchVersion, File indexPath, Analyzer indexAnalyzer, Analyzer queryAnalyzer, int minPrefixChars)` Create a new instance, loading from a previously built directory, if it exists.

Method Summary
`protected void`	`addNonMatch(StringBuilder sb, String text)` Called while highlighting a single result, to append a non-matching chunk of text from the suggestion to the provided StringBuilder.
`protected void`	`addPrefixMatch(StringBuilder sb, String surface, String analyzed, String prefixToken)` Called while highlighting a single result, to append a matched prefix token to the provided StringBuilder.
`protected void`	`addWholeMatch(StringBuilder sb, String surface, String analyzed)` Called while highlighting a single result, to append the whole matched token to the provided StringBuilder.
`void`	`build(InputIterator iter)` Builds up a new internal `Lookup` representation based on the given `InputIterator`.
`void`	`close()`
`protected List<Lookup.LookupResult>`	`createResults(TopDocs hits, int num, CharSequence charSequence, boolean doHighlight, Set<String> matchedTokens, String prefixToken)` Create the results based on the search hits.
`protected Query`	`finishQuery(BooleanQuery in, boolean allTermsRequired)` Subclass can override this to tweak the Query before searching.
`long`	`getCount()` Get the number of entries the lookup was built with
`protected Directory`	`getDirectory(File path)` Subclass can override to choose a specific `Directory` implementation.
`protected IndexWriterConfig`	`getIndexWriterConfig(Version matchVersion, Analyzer indexAnalyzer)` Override this to customize index settings, e.g. which codec to use.
`protected Query`	`getLastTokenQuery(String token)` This is called if the last token isn't ended (e.g. user did not type a space after it).
`protected FieldType`	`getTextFieldType()` Subclass can override this method to change the field type of the text field, e.g. to change the index options.
`protected Object`	`highlight(String text, Set<String> matchedTokens, String prefixToken)` Override this method to customize the Object representing a single highlighted suggestion; the result is set on each `Lookup.LookupResult.highlightKey` member.
`boolean`	`load(DataInput out)` Discard current lookup data and load it from a previously saved copy.
`List<Lookup.LookupResult>`	`lookup(CharSequence key, boolean onlyMorePopular, int num)` Look up a key and return possible completion for this key.
`List<Lookup.LookupResult>`	`lookup(CharSequence key, int num, boolean allTermsRequired, boolean doHighlight)` Retrieve suggestions, specifying whether all terms must match (`allTermsRequired`) and whether the hits should be highlighted (`doHighlight`).
`long`	`sizeInBytes()` Get the size of the underlying lookup implementation in memory
`boolean`	`store(DataOutput in)` Persist the constructed lookup data to a directory.

Methods inherited from class org.apache.lucene.search.suggest.Lookup
`build, load, store`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

TEXT_FIELD_NAME

protected static final String TEXT_FIELD_NAME

Field name used for the indexed text.

See Also:: Constant Field Values

queryAnalyzer

protected final Analyzer queryAnalyzer

Analyzer used at search time

indexAnalyzer

protected final Analyzer indexAnalyzer

Analyzer used at index time

searcher

protected IndexSearcher searcher

IndexSearcher used for lookups.

payloadsDV

protected BinaryDocValues payloadsDV

DocValuesField holding the payloads; null if payloads were not indexed.

textDV

protected BinaryDocValues textDV

DocValuesField holding each suggestion's text.

weightsDV

protected NumericDocValues weightsDV

DocValuesField holding each suggestion's weight.

DEFAULT_MIN_PREFIX_CHARS

public static final int DEFAULT_MIN_PREFIX_CHARS

Default minimum number of leading characters before PrefixQuery is used (4).

See Also:: Constant Field Values

Constructor Detail

AnalyzingInfixSuggester

public AnalyzingInfixSuggester(Version matchVersion,
                               File indexPath,
                               Analyzer analyzer)
                        throws IOException

Create a new instance, loading from a previously built directory, if it exists.

Throws:: IOException

AnalyzingInfixSuggester

public AnalyzingInfixSuggester(Version matchVersion,
                               File indexPath,
                               Analyzer indexAnalyzer,
                               Analyzer queryAnalyzer,
                               int minPrefixChars)
                        throws IOException

Create a new instance, loading from a previously built directory, if it exists.

Parameters:: minPrefixChars - Minimum number of leading characters before PrefixQuery is used (default 4). Prefixes shorter than this are indexed as character ngrams (increasing index size but making lookups faster).
Throws:: IOException
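
The trade-off behind `minPrefixChars` can be illustrated with a small model in plain Java. This is a sketch under assumptions, not the Lucene implementation: the class `MinPrefixCharsSketch` and its methods are hypothetical, a `HashMap` stands in for the indexed n-grams, and a `TreeSet` stands in for the term dictionary that a prefix query would scan.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the minPrefixChars trade-off; hypothetical, not Lucene code. */
final class MinPrefixCharsSketch {
    private final int minPrefixChars;
    // Direct lookup table for short prefixes: leading n-gram -> tokens starting with it.
    private final Map<String, Set<String>> grams = new HashMap<>();
    // Sorted terms, range-scanned for prefixes of minPrefixChars or more characters.
    private final TreeSet<String> terms = new TreeSet<>();

    MinPrefixCharsSketch(int minPrefixChars) {
        this.minPrefixChars = minPrefixChars;
    }

    void add(String token) {
        terms.add(token);
        // Index every leading n-gram shorter than minPrefixChars (this costs index size).
        for (int len = 1; len < minPrefixChars && len <= token.length(); len++) {
            grams.computeIfAbsent(token.substring(0, len), k -> new TreeSet<>()).add(token);
        }
    }

    Set<String> matches(String prefix) {
        if (prefix.length() < minPrefixChars) {
            // Short prefix: a single exact lookup, no scan over the terms.
            return grams.getOrDefault(prefix, Collections.emptySet());
        }
        // Longer prefix: scan the sorted range of terms that share the prefix.
        return new TreeSet<>(terms.subSet(prefix, prefix + Character.MAX_VALUE));
    }

    /** Number of distinct n-grams stored, i.e. the extra index entries. */
    int indexedGramCount() {
        return grams.size();
    }
}
```

Raising `minPrefixChars` in this model adds more n-gram entries per token but lets more lookups take the direct path, which mirrors the "larger index, faster lookups" note above.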

Method Detail

getIndexWriterConfig

protected IndexWriterConfig getIndexWriterConfig(Version matchVersion,
                                                 Analyzer indexAnalyzer)

Override this to customize index settings, e.g. which codec to use.

getDirectory

protected Directory getDirectory(File path)
                          throws IOException

Subclass can override to choose a specific Directory implementation.

Throws:: IOException

build

public void build(InputIterator iter)
           throws IOException

Description copied from class: Lookup

Builds up a new internal Lookup representation based on the given InputIterator. The implementation might re-sort the data internally.

Specified by:: build in class Lookup

Throws:: IOException

getTextFieldType

protected FieldType getTextFieldType()

Subclass can override this method to change the field type of the text field, e.g. to change the index options.

lookup

public List<Lookup.LookupResult> lookup(CharSequence key,
                                        boolean onlyMorePopular,
                                        int num)

Description copied from class: Lookup

Look up a key and return possible completion for this key.

Specified by:: lookup in class Lookup

Parameters:: key - lookup key. Depending on the implementation this may be a prefix, misspelling, or even infix.; onlyMorePopular - return only more popular results; num - maximum number of results to return
Returns:: a list of possible completions, with their relative weight (e.g. popularity)

getLastTokenQuery

protected Query getLastTokenQuery(String token)
                           throws IOException

This is called if the last token isn't ended (e.g. user did not type a space after it). Return an appropriate Query clause to add to the BooleanQuery.

Throws:: IOException
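
The distinction between an ended and an unended last token can be sketched as follows. The class `LastTokenSketch` and its `ParsedKey` type are hypothetical, and splitting on whitespace is a stand-in for real analysis; the sketch only shows how a trailing space changes which token is treated as a prefix.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of splitting a lookup key into finished tokens plus a trailing prefix. */
final class LastTokenSketch {

    /** Parsed key: tokens the user finished, and the partial last token (or null). */
    static final class ParsedKey {
        final List<String> completeTokens = new ArrayList<>();
        String prefixToken; // stays null when the key ends with whitespace
    }

    static ParsedKey parse(String key) {
        ParsedKey parsed = new ParsedKey();
        String trimmed = key.trim();
        if (trimmed.isEmpty()) {
            return parsed;
        }
        String[] parts = trimmed.split("\\s+");
        // A trailing space means the user has finished typing the last token.
        boolean lastTokenEnded = Character.isWhitespace(key.charAt(key.length() - 1));
        int complete = lastTokenEnded ? parts.length : parts.length - 1;
        for (int i = 0; i < complete; i++) {
            parsed.completeTokens.add(parts[i]);
        }
        if (!lastTokenEnded) {
            // Only this token would need a prefix-style query clause.
            parsed.prefixToken = parts[parts.length - 1];
        }
        return parsed;
    }
}
```

In this model, `"apache luc"` yields one complete token and the prefix `luc`, while `"apache luc "` yields two complete tokens and no prefix.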

lookup

public List<Lookup.LookupResult> lookup(CharSequence key,
                                        int num,
                                        boolean allTermsRequired,
                                        boolean doHighlight)

Retrieve suggestions, specifying whether all terms must match (allTermsRequired) and whether the hits should be highlighted (doHighlight).
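
The effect of `allTermsRequired` can be shown with a minimal plain-Java model. The class `AllTermsRequiredSketch` is hypothetical and matches whole lowercase words only; in Lucene terms the flag plausibly corresponds to required versus optional clauses in the BooleanQuery, but that mapping is an assumption and is not stated on this page.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy model of the allTermsRequired flag; hypothetical, not Lucene code. */
final class AllTermsRequiredSketch {

    /** Returns true if the suggestion matches the query tokens under the chosen mode. */
    static boolean matches(String suggestion, List<String> queryTokens, boolean allTermsRequired) {
        Set<String> tokens = new HashSet<>(
                Arrays.asList(suggestion.toLowerCase(Locale.ROOT).split("\\s+")));
        int found = 0;
        for (String q : queryTokens) {
            if (tokens.contains(q)) {
                found++;
            }
        }
        // Required mode: every query token must be present; otherwise one is enough.
        return allTermsRequired ? found == queryTokens.size() : found > 0;
    }
}
```

Requiring all terms narrows the result set as the user types more words, whereas the optional mode keeps any suggestion that shares at least one token with the query.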

createResults

protected List<Lookup.LookupResult> createResults(TopDocs hits,
                                                  int num,
                                                  CharSequence charSequence,
                                                  boolean doHighlight,
                                                  Set<String> matchedTokens,
                                                  String prefixToken)
                                           throws IOException

Create the results based on the search hits. Can be overridden by subclass to add particular behavior (e.g. weight transformation)

Throws:: IOException - If there are problems reading fields from the underlying Lucene index.

finishQuery

protected Query finishQuery(BooleanQuery in,
                            boolean allTermsRequired)

Subclass can override this to tweak the Query before searching.

highlight

protected Object highlight(String text,
                           Set<String> matchedTokens,
                           String prefixToken)
                    throws IOException

Override this method to customize the Object representing a single highlighted suggestion; the result is set on each Lookup.LookupResult.highlightKey member.

Throws:: IOException

addNonMatch

protected void addNonMatch(StringBuilder sb,
                           String text)

Called while highlighting a single result, to append a non-matching chunk of text from the suggestion to the provided StringBuilder.

Parameters:: sb - The StringBuilder to append to; text - The text chunk to add

addWholeMatch

protected void addWholeMatch(StringBuilder sb,
                             String surface,
                             String analyzed)

Called while highlighting a single result, to append the whole matched token to the provided StringBuilder.

Parameters:: sb - The StringBuilder to append to; surface - The surface form (original) text; analyzed - The analyzed token corresponding to the surface form text

addPrefixMatch

protected void addPrefixMatch(StringBuilder sb,
                              String surface,
                              String analyzed,
                              String prefixToken)

Called while highlighting a single result, to append a matched prefix token to the provided StringBuilder.

Parameters:: sb - The StringBuilder to append to; surface - The fragment of the surface form (indexed during build(org.apache.lucene.search.suggest.InputIterator)) corresponding to this match; analyzed - The analyzed token that matched; prefixToken - The prefix of the token that matched
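
How the three append hooks could cooperate during highlighting is sketched below in plain Java. The class `HighlightHooksSketch` is hypothetical; its hook signatures mirror the ones documented above, but the `<b>` markup, the whitespace splitting, and the lowercasing that stands in for analysis are all assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Set;

/** Toy highlighter built from three append hooks; markup and analysis are assumptions. */
class HighlightHooksSketch {

    protected void addNonMatch(StringBuilder sb, String text) {
        sb.append(text);
    }

    protected void addWholeMatch(StringBuilder sb, String surface, String analyzed) {
        sb.append("<b>").append(surface).append("</b>");
    }

    protected void addPrefixMatch(StringBuilder sb, String surface, String analyzed,
                                  String prefixToken) {
        if (prefixToken.length() >= surface.length()) {
            // The prefix covers the whole surface form, so mark all of it.
            addWholeMatch(sb, surface, analyzed);
            return;
        }
        // Mark only the leading characters that the prefix accounts for.
        sb.append("<b>").append(surface, 0, prefixToken.length()).append("</b>")
          .append(surface, prefixToken.length(), surface.length());
    }

    /** Walks the space-separated words of text and dispatches each to one hook. */
    String highlight(String text, Set<String> matchedTokens, String prefixToken) {
        StringBuilder sb = new StringBuilder();
        String[] words = text.split(" ");
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                addNonMatch(sb, " ");
            }
            String surface = words[i];
            String analyzed = surface.toLowerCase(Locale.ROOT); // stand-in for analysis
            if (matchedTokens.contains(analyzed)) {
                addWholeMatch(sb, surface, analyzed);
            } else if (prefixToken != null && analyzed.startsWith(prefixToken)) {
                addPrefixMatch(sb, surface, analyzed, prefixToken);
            } else {
                addNonMatch(sb, surface);
            }
        }
        return sb.toString();
    }
}
```

A subclass of this sketch could override a single hook, for example to emit different markup for prefix matches, without touching the loop that walks the text.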

store

public boolean store(DataOutput in)
              throws IOException

Description copied from class: Lookup

Persist the constructed lookup data to a directory. Optional operation.

Specified by:: store in class Lookup

Parameters:: in - DataOutput to write the data to.
Returns:: true if successful, false if unsuccessful or not supported.
Throws:: IOException - when fatal IO error occurs.

load

public boolean load(DataInput out)
             throws IOException

Description copied from class: Lookup

Discard current lookup data and load it from a previously saved copy. Optional operation.

Specified by:: load in class Lookup

Parameters:: out - the DataInput to load the lookup data.
Returns:: true if completed successfully, false if unsuccessful or not supported.
Throws:: IOException - when fatal IO error occurs.

close

public void close()
           throws IOException

Specified by:: close in interface Closeable

Throws:: IOException

sizeInBytes

public long sizeInBytes()

Description copied from class: Lookup

Get the size of the underlying lookup implementation in memory

Specified by:: sizeInBytes in class Lookup

Returns:: ram size of the lookup implementation in bytes

getCount

public long getCount()

Description copied from class: Lookup

Get the number of entries the lookup was built with

Specified by:: getCount in class Lookup

Returns:: total number of suggester entries
