Class IndexRearranger
Rearranging works in 3 steps:
1. Assume all docs in the original index are live and create the rearranged index using the segment selectors.
2. Go through the rearranged index and apply deletes requested by the deletes selector.
3. Reorder the segments to match the order of the selectors and check the validity of the rearranged index.
NB: You can't produce segments that only contain deletes. If you select all documents in a segment for deletion, the entire segment will be discarded.
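The three steps and the note about fully deleted segments can be illustrated with a small toy model. This is only a sketch under stated assumptions: it does not use the Lucene API, a "segment" is modelled as a plain list of doc ids, and the names RearrangeSketch, rearrange and demo are hypothetical, introduced here for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Toy model only: a "segment" is a plain list of doc ids, not a Lucene segment.
// RearrangeSketch, rearrange and demo are hypothetical names, not part of Lucene.
class RearrangeSketch {

    static List<List<Integer>> rearrange(
            List<Integer> allDocs,
            List<IntPredicate> segmentSelectors,
            IntPredicate deleteSelector) {
        List<List<Integer>> segments = new ArrayList<>();

        // Step 1: treat every doc as live; each selector yields one segment.
        for (IntPredicate selector : segmentSelectors) {
            List<Integer> segment = new ArrayList<>();
            for (int doc : allDocs) {
                if (selector.test(doc)) {
                    segment.add(doc);
                }
            }
            segments.add(segment);
        }

        // Step 2: apply the requested deletes to the rearranged segments.
        for (List<Integer> segment : segments) {
            segment.removeIf(doc -> deleteSelector.test(doc));
        }

        // A segment left with no live docs is discarded entirely.
        segments.removeIf(List::isEmpty);

        // Step 3: in this toy model the segments are already in selector order,
        // so there is nothing to reorder.
        return segments;
    }

    static List<List<Integer>> demo() {
        List<Integer> docs = List.of(0, 1, 2, 3, 4, 5);
        List<IntPredicate> selectors = List.of(
                doc -> doc < 3,
                doc -> doc == 3 || doc == 4,
                doc -> doc == 5);
        // Docs 1 and 5 are selected for deletion; doc 5 is alone in its segment.
        return rearrange(docs, selectors, doc -> doc == 1 || doc == 5);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [[0, 2], [3, 4]]
    }
}
```

In the demo, the third selector picks only doc 5, which is also selected for deletion, so that segment disappears from the result instead of surviving as a deletes-only segment.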
Example use case: You are testing search performance after a change to indexing. You can index the same content using the old and new indexers and then rearrange one of them to the shape of the other. Using rearrange will give more accurate measurements, since you will not be introducing noise from index geometry.
TODO: Another possible (faster) approach is to manipulate FlushPolicy and MergePolicy at indexing time to create the small desired segments first and then merge them accordingly. For details, see: https://markmail.org/message/lbtdntclpnocmfuf
WARNING: This API is experimental and might change in incompatible ways in the next release.
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static interface | IndexRearranger.DocumentSelector | Select document within a CodecReader
Field Summary
Fields
Modifier and Type | Field
protected final IndexWriterConfig | config
protected final IndexRearranger.DocumentSelector | deletedDocsSelector
protected final Directory | input
protected final Directory | output
protected final List<IndexRearranger.DocumentSelector> | segmentSelectors
Constructor Summary
Constructors
Constructor | Description
IndexRearranger(Directory input, Directory output, IndexWriterConfig config, List<IndexRearranger.DocumentSelector> segmentSelectors) | Constructor with no deletes to apply
IndexRearranger(Directory input, Directory output, IndexWriterConfig config, List<IndexRearranger.DocumentSelector> segmentSelectors, IndexRearranger.DocumentSelector deletedDocsSelector) | All args constructor
Method Summary
execute()
Field Details
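input
protected final Directory input

output
protected final Directory output

config
protected final IndexWriterConfig config

segmentSelectors
protected final List<IndexRearranger.DocumentSelector> segmentSelectors

deletedDocsSelector
protected final IndexRearranger.DocumentSelector deletedDocsSelector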
Constructor Details
IndexRearranger
public IndexRearranger(Directory input, Directory output, IndexWriterConfig config, List<IndexRearranger.DocumentSelector> segmentSelectors, IndexRearranger.DocumentSelector deletedDocsSelector)
All args constructor
Parameters:
input - input dir
output - output dir
config - index writer config
segmentSelectors - specify which documents are desired in the rearranged index segments; each selector corresponds to one segment
deletedDocsSelector - specify which documents are to be marked for deletion in the rearranged index; this selector should be thread-safe

IndexRearranger
public IndexRearranger(Directory input, Directory output, IndexWriterConfig config, List<IndexRearranger.DocumentSelector> segmentSelectors)
Constructor with no deletes to apply
Method Details
execute
Throws:
Exception