Class SolrStopwordsCarrot2LexicalDataFactory

  • All Implemented Interfaces:
    org.carrot2.text.linguistic.ILexicalDataFactory

    @Bindable
    public class SolrStopwordsCarrot2LexicalDataFactory
    extends Object
    implements org.carrot2.text.linguistic.ILexicalDataFactory
    An implementation of Carrot2's ILexicalDataFactory that adds stop words from a field's StopFilter to the default stop words used in Carrot2, for all languages Carrot2 supports. Completely replacing Carrot2's stop words with Solr's would not make much sense, because clustering needs more aggressive stop word removal than indexing does. In other words, anything that is a stop word during indexing should also be a stop word during clustering, but not the other way round.
    WARNING: This API is experimental and might change in incompatible ways in the next release.
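    The "add, do not replace" rule described above can be pictured as a set union. The following is a minimal sketch only, written with plain Java collections; the class StopWordUnionSketch, its method names, and the sample words are hypothetical and are not part of Solr or Carrot2. It assumes exact, case-sensitive matching, which the real classes may not use.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration: not part of Solr or Carrot2.
final class StopWordUnionSketch {
    // Stand-in for Carrot2's default stop words for one language (assumed data).
    private final Set<String> carrot2Defaults;
    // Stand-in for the words taken from a Solr field's StopFilter (assumed data).
    private final Set<String> solrFieldStopWords;

    StopWordUnionSketch(Set<String> carrot2Defaults, Set<String> solrFieldStopWords) {
        // Defensive copies so later changes to the arguments do not leak in.
        this.carrot2Defaults = new HashSet<>(carrot2Defaults);
        this.solrFieldStopWords = new HashSet<>(solrFieldStopWords);
    }

    // During indexing only the Solr list applies.
    boolean isStopWordForIndexing(String word) {
        return solrFieldStopWords.contains(word);
    }

    // During clustering a word is a stop word if either source lists it,
    // so every indexing stop word is also a clustering stop word.
    boolean isStopWordForClustering(String word) {
        return carrot2Defaults.contains(word) || solrFieldStopWords.contains(word);
    }

    public static void main(String[] args) {
        Set<String> carrot2 = new HashSet<>();
        carrot2.add("the");
        carrot2.add("said");
        Set<String> solr = new HashSet<>();
        solr.add("the");
        solr.add("solr");

        StopWordUnionSketch sketch = new StopWordUnionSketch(carrot2, solr);
        System.out.println(sketch.isStopWordForClustering("solr"));
        System.out.println(sketch.isStopWordForIndexing("said"));
    }
}
```

    In this sketch, "said" is removed during clustering but not during indexing, while "solr" is removed in both, which matches the one-way relationship described above.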
    • Field Detail

      • core

        @Input
        @Attribute(key="solrCore")
        public SolrCore core
      • fieldNames

        @Processing
        @Input
        @Attribute(key="solrFieldNames")
        public Set<String> fieldNames
      • carrot2LexicalDataFactory

        public org.carrot2.text.linguistic.DefaultLexicalDataFactory carrot2LexicalDataFactory
        Carrot2's default lexical resources to use in addition to Solr's stop words.
    • Constructor Detail

      • SolrStopwordsCarrot2LexicalDataFactory

        public SolrStopwordsCarrot2LexicalDataFactory()
    • Method Detail

      • getLexicalData

        public org.carrot2.text.linguistic.ILexicalData getLexicalData(org.carrot2.core.LanguageCode languageCode)
        Specified by:
        getLexicalData in interface org.carrot2.text.linguistic.ILexicalDataFactory