public class PathHierarchyTokenizerFactory extends TokenizerFactory
Factory for PathHierarchyTokenizer.
This factory is typically configured for use only in the index Analyzer (or only in the query Analyzer, but never both).
For example, in the configuration below a query for Books/NonFic will match documents indexed with values like Books/NonFic, Books/NonFic/Law, Books/NonFic/Science/Physics, etc. But it will not match documents indexed with values like Books, or Books/Fic...
<fieldType name="descendent_path" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory" />
  </analyzer>
</fieldType>
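The matching behaviour of this index-side configuration can be sketched without Lucene on the classpath. The class below is a hypothetical illustration, not the tokenizer's implementation: `pathPrefixes` imitates the tokens a path-hierarchy tokenizer would be expected to emit for a `/`-delimited value with default settings, and `matches` assumes the query is kept as a single token, as the keyword tokenizer does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the index/query analysis pair above.
public class DescendantPathSketch {

    // Emits one token per path prefix, e.g. "a/b/c" -> [a, a/b, a/b/c].
    // A delimiter at position 0 does not produce an empty leading token.
    public static List<String> pathPrefixes(String path, char delimiter) {
        List<String> tokens = new ArrayList<>();
        int idx = path.indexOf(delimiter);
        while (idx >= 0) {
            if (idx > 0) {
                tokens.add(path.substring(0, idx));
            }
            idx = path.indexOf(delimiter, idx + 1);
        }
        tokens.add(path); // the full path is always the last token
        return tokens;
    }

    // Index side: path prefixes. Query side: the whole query as one token.
    public static boolean matches(String indexedValue, String query) {
        return pathPrefixes(indexedValue, '/').contains(query);
    }

    public static void main(String[] args) {
        // prints [Books, Books/NonFic, Books/NonFic/Law]
        System.out.println(pathPrefixes("Books/NonFic/Law", '/'));
        // prints true: the indexed value is a descendant of the query path
        System.out.println(matches("Books/NonFic/Law", "Books/NonFic"));
        // prints false: Books/Fic never emits the token Books/NonFic
        System.out.println(matches("Books/Fic", "Books/NonFic"));
    }
}
```

Because every ancestor prefix is stored at index time, a single-token query finds the document whenever the query path is one of those prefixes.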
In this example, however, we see the opposite configuration, so that a query for Books/NonFic/Science/Physics would match documents containing Books/NonFic, Books/NonFic/Science, or Books/NonFic/Science/Physics, but not Books/NonFic/Science/Physics/Theory or Books/NonFic/Law.
<fieldType name="descendent_path" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" />
  </analyzer>
</fieldType>
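The query-side configuration can be sketched the same way. Again this is a hypothetical, Lucene-free illustration under the assumption that the indexed value is stored as one token and the query is expanded into its path prefixes; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the reversed index/query analysis pair above.
public class AncestorPathSketch {

    // Emits one token per path prefix, e.g. "a/b/c" -> [a, a/b, a/b/c].
    public static List<String> pathPrefixes(String path, char delimiter) {
        List<String> tokens = new ArrayList<>();
        int idx = path.indexOf(delimiter);
        while (idx >= 0) {
            if (idx > 0) {
                tokens.add(path.substring(0, idx));
            }
            idx = path.indexOf(delimiter, idx + 1);
        }
        tokens.add(path);
        return tokens;
    }

    // Index side: the whole value as one token. Query side: path prefixes.
    public static boolean matches(String indexedValue, String query) {
        return pathPrefixes(query, '/').contains(indexedValue);
    }

    public static void main(String[] args) {
        String query = "Books/NonFic/Science/Physics";
        // prints true: Books/NonFic is one of the query's prefixes
        System.out.println(matches("Books/NonFic", query));
        // prints false: the indexed value is deeper than the query
        System.out.println(matches("Books/NonFic/Science/Physics/Theory", query));
        // prints false: Books/NonFic/Law is on a different branch
        System.out.println(matches("Books/NonFic/Law", query));
    }
}
```

Here the expansion happens at query time, so only documents whose stored path equals the query path or one of its ancestors are found.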
Fields inherited from class AbstractAnalysisFactory: args, luceneMatchVersion
Constructor and Description |
---|
PathHierarchyTokenizerFactory() |
Modifier and Type | Method and Description |
---|---|
Tokenizer | create(Reader input): Creates a TokenStream of the specified input |
void | init(Map<String,String> args): Require a configured pattern |
Methods inherited from class TokenizerFactory: availableTokenizers, forName, lookupClass, reloadTokenizers
Methods inherited from class AbstractAnalysisFactory: assureMatchVersion, getArgs, getBoolean, getBoolean, getInt, getInt, getInt, getLines, getLuceneMatchVersion, getPattern, getSnowballWordSet, getWordSet, setLuceneMatchVersion, splitFileNames
public void init(Map<String,String> args)
Overrides: init in class AbstractAnalysisFactory
public Tokenizer create(Reader input)
Description copied from class: TokenizerFactory
Creates a TokenStream of the specified input
Specified by: create in class TokenizerFactory
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.