public class TikaLanguageIdentifierUpdateProcessor extends LanguageIdentifierUpdateProcessor
Fields inherited from class LanguageIdentifierUpdateProcessor: allMapFieldsSet, docIdField, enabled, enableMapping, enforceSchema, fallbackFields, fallbackValue, inputFields, langField, langPattern, langsField, langWhitelist, lcMap, mapFields, mapIndividual, mapIndividualFieldsSet, mapKeepOrig, mapLcMap, mapOverwrite, mapPattern, mapReplaceStr, maxFieldValueChars, maxTotalChars, overwrite, schema, threshold, tikaSimilarityPattern
Fields inherited from class UpdateRequestProcessor: next
Fields inherited from interface LangIdParams: DOCID_FIELD_DEFAULT, DOCID_LANGFIELD_DEFAULT, DOCID_LANGSFIELD_DEFAULT, DOCID_PARAM, DOCID_THRESHOLD_DEFAULT, ENFORCE_SCHEMA, FALLBACK, FALLBACK_FIELDS, FIELDS_PARAM, LANG_FIELD, LANG_WHITELIST, LANGS_FIELD, LANGUAGE_ID, LCMAP, MAP_ENABLE, MAP_FL, MAP_INDIVIDUAL, MAP_INDIVIDUAL_FL, MAP_KEEP_ORIG, MAP_LCMAP, MAP_OVERWRITE, MAP_PATTERN, MAP_PATTERN_DEFAULT, MAP_REPLACE, MAP_REPLACE_DEFAULT, MAX_FIELD_VALUE_CHARS, MAX_FIELD_VALUE_CHARS_DEFAULT, MAX_TOTAL_CHARS, MAX_TOTAL_CHARS_DEFAULT, OVERWRITE, THRESHOLD
Constructor and Description |
---|
TikaLanguageIdentifierUpdateProcessor(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) |
Modifier and Type | Method and Description |
---|---|
protected String | concatFields(SolrInputDocument doc): Concatenates content from multiple fields. |
protected List<DetectedLanguage> | detectLanguage(SolrInputDocument doc): Detects language(s) from the document's content. |
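Methods inherited from class LanguageIdentifierUpdateProcessor: getMappedField, isEnabled, normalizeLangCode, process, processAdd, resolveLanguage, resolveLanguage, setEnabled
Methods inherited from class UpdateRequestProcessor: close, doClose, finish, processCommit, processDelete, processMergeIndexes, processRollback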
public TikaLanguageIdentifierUpdateProcessor(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next)
protected List<DetectedLanguage> detectLanguage(SolrInputDocument doc)
detectLanguage in class LanguageIdentifierUpdateProcessor
Parameters: doc - The content to identify
protected String concatFields(SolrInputDocument doc)
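The sketch below illustrates what "concatenates content from multiple fields" can look like in practice. It is not Solr's implementation: a plain Map stands in for SolrInputDocument, the class name ConcatFieldsSketch is hypothetical, and the exact way the limits are applied is an assumption. The parameter names inputFields, maxFieldValueChars and maxTotalChars are borrowed from the inherited fields listed above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-alone illustration; a Map of field name to values stands in for SolrInputDocument. */
class ConcatFieldsSketch {

    /**
     * Joins the values of the given input fields with single spaces.
     * Each value is cut to at most maxFieldValueChars characters, and the
     * result never grows beyond maxTotalChars characters (assumed semantics).
     */
    static String concatFields(Map<String, List<String>> doc,
                               List<String> inputFields,
                               int maxFieldValueChars,
                               int maxTotalChars) {
        StringBuilder sb = new StringBuilder();
        for (String field : inputFields) {
            List<String> values = doc.get(field);
            if (values == null) {
                continue; // field is absent from this document
            }
            for (String value : values) {
                if (sb.length() >= maxTotalChars) {
                    return sb.toString(); // total budget used up
                }
                // Per-value limit.
                String piece = value.length() > maxFieldValueChars
                        ? value.substring(0, maxFieldValueChars)
                        : value;
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                // Overall limit: append only as much as still fits.
                int room = maxTotalChars - sb.length();
                sb.append(piece, 0, Math.min(room, piece.length()));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        doc.put("title", Arrays.asList("Hello world"));
        doc.put("body", Arrays.asList("Bonjour le monde", "extra"));
        List<String> inputFields = Arrays.asList("title", "body", "missing");
        System.out.println(concatFields(doc, inputFields, 7, 100));
    }
}
```

The concatenated string is the kind of input a detectLanguage implementation would then hand to a language detector. Capping both the per-value and the total length keeps detection cost bounded on very large documents.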
Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.