Package org.apache.lucene.wordnet

This package uses synonyms defined by WordNet.


Class Summary
AnalyzerUtil Various fulltext analysis utilities avoiding redundant code in several classes.
SynExpand Expand a query by looking up synonyms for every term.
SynLookup Test program to look up synonyms.
SynonymMap Loads the WordNet prolog file wn_s.pl into a thread-safe main-memory hash map that can be used for fast high-frequency lookups of synonyms for any given (lowercase) word string.
SynonymTokenFilter Injects additional tokens for synonyms of token terms fetched from the underlying child stream; the child stream must deliver lowercase tokens for synonyms to be found.
Syns2Index Convert the prolog file wn_s.pl from the WordNet prolog download into a Lucene index suitable for looking up synonyms and performing query expansion (SynExpand.expand(...)).
 

Package org.apache.lucene.wordnet Description

This package uses synonyms defined by WordNet. There are two methods: query expansion and analysis. Both methods first require you to download the WordNet prolog database. Inside this archive is a file named wn_s.pl, which contains the WordNet synonyms.
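To make the role of wn_s.pl concrete, the following is a minimal sketch of how its lines could be grouped into synonym sets. It assumes the usual WordNet prolog layout s(synset_id,w_num,'word',ss_type,sense_number,tag_count). with one fact per line; the class WnPrologSketch and its method are hypothetical helpers for illustration and are not part of this package, and the sample lines are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of org.apache.lucene.wordnet. */
final class WnPrologSketch {
    // Assumed line layout: s(synset_id,w_num,'word',ss_type,sense_number,tag_count).
    // A quote inside a word is assumed to be escaped by doubling it ('').
    // The trailing tag_count field is treated as optional.
    private static final Pattern S_LINE = Pattern.compile(
        "^s\\((\\d+),\\d+,'((?:[^']|'')*)',[a-z],\\d+(?:,\\d+)?\\)\\.$");

    /** Groups the lowercased words of each synset by synset id. */
    static Map<String, List<String>> groupBySynset(Iterable<String> lines) {
        Map<String, List<String>> synsets = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = S_LINE.matcher(line.trim());
            if (!m.matches()) {
                continue; // skip anything that is not an s(...) fact
            }
            String word = m.group(2).replace("''", "'").toLowerCase(Locale.ROOT);
            synsets.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(word);
        }
        return synsets;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "s(100001740,1,'entity',n,1,11).",
            "s(100002137,1,'abstraction',n,6,0).",
            "s(100002137,2,'abstract entity',n,1,0).");
        System.out.println(groupBySynset(sample));
    }
}
```

Words that share a synset id are synonyms of each other, which is the relationship both methods below rely on.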

Query Expansion Method

This method creates a Lucene index that stores the synonyms, which can then be used for query expansion. You normally run Syns2Index once to build the synonym index ("database"), and then call SynExpand.expand(...) to expand a query.

Instructions

  1. Invoke Syns2Index to build a synonym index. It takes two arguments: the path to wn_s.pl from the WordNet download, and the index name.
  2. Update your UI so that it calls SynExpand.expand(...) wherever user queries should be expanded with synonyms.
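The idea behind query expansion can be sketched without Lucene. The example below is only an illustration of the concept: it uses a plain in-memory map where the real method uses the index built by Syns2Index, and the class QueryExpansionSketch and its expand method are hypothetical names that do not reflect the signature of SynExpand.expand(...).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of query expansion; not the SynExpand API. */
final class QueryExpansionSketch {
    /**
     * Returns the lowercased query terms followed by their synonyms,
     * in first-seen order and without duplicates.
     */
    static List<String> expand(String query, Map<String, List<String>> synonyms) {
        Set<String> terms = new LinkedHashSet<>();
        for (String term : query.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        // Original terms come first, then any synonyms found for them.
        Set<String> expanded = new LinkedHashSet<>(terms);
        for (String term : terms) {
            expanded.addAll(synonyms.getOrDefault(term, Collections.<String>emptyList()));
        }
        return new ArrayList<>(expanded);
    }
}
```

In a search application, the expanded terms would typically be combined as optional clauses, so that documents matching a synonym are found as well as those matching the original term.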

Analysis Method

This method injects additional synonym tokens for tokens from a child TokenStream.

Instructions

  1. Create a SynonymMap, passing in the path to wn_s.pl.
  2. Add a SynonymTokenFilter to your analyzer. Note: SynonymTokenFilter should come after LowerCaseFilter, because it expects terms to already be in lowercase.
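The following sketch illustrates token injection and why lowercasing must come first. It works on plain lists and does not use the Lucene TokenStream API; the class SynonymInjectionSketch, the Tok type, and the maxSynonyms parameter are hypothetical names chosen for this example, and emitting synonyms with a position increment of 0 (stacked on the original token) is an assumption about how injected tokens are usually positioned.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of synonym injection; not the SynonymTokenFilter API. */
final class SynonymInjectionSketch {
    /** A term plus its position increment (0 = same position as the previous token). */
    static final class Tok {
        final String term;
        final int posIncr;

        Tok(String term, int posIncr) {
            this.term = term;
            this.posIncr = posIncr;
        }

        @Override
        public String toString() {
            return term + "/" + posIncr;
        }
    }

    /** Emits each token, followed by at most maxSynonyms synonyms at the same position. */
    static List<Tok> inject(List<String> tokens,
                            Map<String, List<String>> synonyms,
                            int maxSynonyms) {
        List<Tok> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(new Tok(token, 1));
            // The lookup is exact, so a token that is not lowercase finds nothing.
            List<String> syns = synonyms.getOrDefault(token, Collections.<String>emptyList());
            for (int i = 0; i < Math.min(maxSynonyms, syns.size()); i++) {
                out.add(new Tok(syns.get(i), 0));
            }
        }
        return out;
    }
}
```

With a map containing big -> [large, great] and maxSynonyms set to 1, the tokens [big, dog] become [big/1, large/0, dog/1], whereas the token Big passes through unchanged because the lookup key is lowercase.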



Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.