Class GraphTokenStreamFiniteStrings

java.lang.Object
org.apache.lucene.util.graph.GraphTokenStreamFiniteStrings

public final class GraphTokenStreamFiniteStrings extends Object
Consumes a TokenStream and creates an Automaton where the transition labels are terms from the TermToBytesRefAttribute. This class also provides helpers to explore the different paths of the Automaton.
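To make the idea of a token graph concrete, the following is a minimal, self-contained sketch that does not use Lucene at all. The class `TokenGraphSketch` and its methods `add`, `termsAt`, `hasSidePath`, and `finiteStrings` are hypothetical names invented for illustration. They only mimic, under simplifying assumptions (an acyclic graph with plain `String` labels), the kind of questions this class answers about an automaton built from a token stream.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a token graph; this is NOT the Lucene implementation. */
final class TokenGraphSketch {
    /** A labelled transition: from --term--> to. */
    static final class Edge {
        final int from;
        final int to;
        final String term;

        Edge(int from, int to, String term) {
            this.from = from;
            this.to = to;
            this.term = term;
        }
    }

    private final List<Edge> edges = new ArrayList<>();

    TokenGraphSketch add(int from, int to, String term) {
        edges.add(new Edge(from, to, term));
        return this;
    }

    /** Terms on the transitions that leave the given state. */
    List<String> termsAt(int state) {
        List<String> terms = new ArrayList<>();
        for (Edge e : edges) {
            if (e.from == state) {
                terms.add(e.term);
            }
        }
        return terms;
    }

    /** Assumption: a side path exists when transitions from a state reach different destinations. */
    boolean hasSidePath(int state) {
        int firstDest = -1;
        for (Edge e : edges) {
            if (e.from != state) {
                continue;
            }
            if (firstDest == -1) {
                firstDest = e.to;
            } else if (e.to != firstDest) {
                return true;
            }
        }
        return false;
    }

    /** Enumerates every path from startState to endState (assumes the graph is acyclic). */
    List<String> finiteStrings(int startState, int endState) {
        List<String> out = new ArrayList<>();
        walk(startState, endState, new ArrayList<String>(), out);
        return out;
    }

    private void walk(int state, int endState, List<String> path, List<String> out) {
        if (state == endState) {
            out.add(String.join(" ", path));
            return;
        }
        for (Edge e : edges) {
            if (e.from == state) {
                path.add(e.term);
                walk(e.to, endState, path, out);
                path.remove(path.size() - 1); // backtrack before trying the next transition
            }
        }
    }

    public static void main(String[] args) {
        // "new york" and its single-token synonym "ny", followed by "city".
        TokenGraphSketch g = new TokenGraphSketch()
            .add(0, 1, "new").add(1, 2, "york").add(0, 2, "ny").add(2, 3, "city");
        System.out.println(g.hasSidePath(0));      // true
        System.out.println(g.finiteStrings(0, 3)); // [new york city, ny city]
    }
}
```

In this sketch, state 0 has a side path because "new" leads to state 1 while "ny" jumps straight to state 2, which is the "new york, ny" situation described for `hasSidePath`.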
  • Constructor Details

  • Method Details

    • hasSidePath

      public boolean hasSidePath(int state)
      Returns whether the provided state is the start of multiple side paths of different lengths (e.g., new york, ny).
    • getTerms

      public List<AttributeSource> getTerms(int state)
      Returns the list of tokens that start at the provided state
    • getTerms

      public Term[] getTerms(String field, int state)
      Returns the terms that start at the provided state, as Term instances for the given field.
    • getFiniteStrings

      public Iterator<TokenStream> getFiniteStrings() throws IOException
      Get all finite strings from the automaton.
      Throws:
      IOException
    • getFiniteStrings

      public Iterator<TokenStream> getFiniteStrings(int startState, int endState)
      Get all finite strings that start at startState and end at endState.
    • articulationPoints

      public int[] articulationPoints()
      Returns the articulation points (or cut vertices) of the graph. See https://en.wikipedia.org/wiki/Biconnected_component
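An articulation point is a state whose removal disconnects the graph, so every path through the graph must pass through it. The sketch below shows one standard way to compute such points, a depth-first search with discovery and low-link values, on a small undirected graph in plain Java. The class `ArticulationPointsSketch` and its methods are hypothetical and written for illustration only. Nothing here is taken from the Lucene source, and treating the token graph as undirected is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical illustration of cut-vertex detection; NOT the Lucene implementation. */
final class ArticulationPointsSketch {
    private final List<List<Integer>> adj = new ArrayList<>();
    private final TreeSet<Integer> points = new TreeSet<>();
    private int[] disc; // discovery time of each state, -1 if unvisited
    private int[] low;  // earliest discovery time reachable from the state's DFS subtree
    private int clock;

    ArticulationPointsSketch(int numStates) {
        for (int i = 0; i < numStates; i++) {
            adj.add(new ArrayList<Integer>());
        }
    }

    /** Adds an undirected edge between states a and b. */
    ArticulationPointsSketch addEdge(int a, int b) {
        adj.get(a).add(b);
        adj.get(b).add(a);
        return this;
    }

    /** Returns the cut vertices in ascending order. */
    int[] articulationPoints() {
        int n = adj.size();
        disc = new int[n];
        low = new int[n];
        Arrays.fill(disc, -1);
        clock = 0;
        points.clear();
        for (int s = 0; s < n; s++) {
            if (disc[s] == -1) {
                dfs(s, -1);
            }
        }
        int[] result = new int[points.size()];
        int i = 0;
        for (int p : points) {
            result[i++] = p;
        }
        return result;
    }

    private void dfs(int u, int parent) {
        disc[u] = low[u] = clock++;
        int children = 0;
        for (int v : adj.get(u)) {
            if (v == parent) {
                continue; // do not walk straight back along the tree edge
            }
            if (disc[v] == -1) {
                children++;
                dfs(v, u);
                low[u] = Math.min(low[u], low[v]);
                // Non-root u is a cut vertex if v's subtree cannot reach above u.
                if (parent != -1 && low[v] >= disc[u]) {
                    points.add(u);
                }
            } else {
                low[u] = Math.min(low[u], disc[v]); // back edge to an ancestor
            }
        }
        // The DFS root is a cut vertex only if it has more than one DFS child.
        if (parent == -1 && children > 1) {
            points.add(u);
        }
    }

    public static void main(String[] args) {
        // States for "new york" / "ny" followed by "city": 0-1, 1-2, 0-2, 2-3.
        ArticulationPointsSketch g = new ArticulationPointsSketch(4)
            .addEdge(0, 1).addEdge(1, 2).addEdge(0, 2).addEdge(2, 3);
        System.out.println(Arrays.toString(g.articulationPoints())); // [2]
    }
}
```

In this example, state 2 is the only cut vertex: both "new york" and "ny" rejoin there before "city", so removing it separates the start of the graph from the end.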