Class CheckIndex
- All Implemented Interfaces: Closeable, AutoCloseable
As this tool checks every byte in the index, on a large index it can take quite a long time to run.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
- Please make a complete backup of your index before using this to exorcise corrupted documents from your index!
-
Nested Class Summary
Modifier and Type, Class, Description
- static class: The marker RuntimeException used by CheckIndex APIs when index integrity failure is detected.
- static class CheckIndex.Options: Run-time configuration options for CheckIndex commands.
- static class CheckIndex.Status: Returned from checkIndex() detailing the health and status of the index.
- static class: Walks the entire N-dimensional points space, verifying that all points fall within the last cell's boundaries.
-
Constructor Summary
Constructor, Description
- CheckIndex(Directory dir): Create a new CheckIndex on the directory.
- CheckIndex(Directory dir, Lock writeLock): Expert: create a CheckIndex on the directory with the specified lock.
-
Method Summary
Modifier and Type, Method, Description
- static boolean assertsOn(): Check whether asserts are enabled or not.
- CheckIndex.Status checkIndex(): Returns a CheckIndex.Status instance detailing the state of the index.
- CheckIndex.Status checkIndex(List<String> onlySegments): Returns a CheckIndex.Status instance detailing the state of the index.
- CheckIndex.Status checkIndex(List<String> onlySegments, ExecutorService executorService): Returns a CheckIndex.Status instance detailing the state of the index.
- void close()
- int doCheck(CheckIndex.Options opts): Actually perform the index check.
- boolean doSlowChecks()
- void exorciseIndex(CheckIndex.Status result): Repairs the index using the previously returned result from checkIndex().
- boolean getChecksumsOnly(): See setChecksumsOnly(boolean).
- boolean getFailFast(): See setFailFast(boolean).
- static void main(String[] args): Command-line interface to check and exorcise corrupt segments from an index.
- static CheckIndex.Options parseOptions(String[] args): Parse command line args into fields.
- void setChecksumsOnly(boolean v): If true, only validate physical integrity for all files.
- void setDoSlowChecks(boolean v): If true, additional slow checks are performed.
- void setFailFast(boolean v): If true, just throw the original exception immediately when corruption is detected, rather than continuing to iterate to other segments looking for more corruption.
- void setInfoStream(PrintStream out): Set infoStream where messages should go.
- void setInfoStream(PrintStream out, boolean verbose): Set infoStream where messages should go.
- void setThreadCount(int tc): Set threadCount used for parallelizing index integrity checking.
- static CheckIndex.Status.DocValuesStatus testDocValues(CodecReader reader, PrintStream infoStream, boolean failFast): Test docvalues.
- static CheckIndex.Status.FieldInfoStatus testFieldInfos(CodecReader reader, PrintStream infoStream, boolean failFast): Test field infos.
- static CheckIndex.Status.FieldNormStatus testFieldNorms(CodecReader reader, PrintStream infoStream, boolean failFast): Test field norms.
- static CheckIndex.Status.LiveDocStatus testLiveDocs(CodecReader reader, PrintStream infoStream, boolean failFast): Test live docs.
- static CheckIndex.Status.PointsStatus testPoints(CodecReader reader, PrintStream infoStream, boolean failFast): Test the points index.
- static CheckIndex.Status.TermIndexStatus testPostings(CodecReader reader, PrintStream infoStream): Test the term index.
- static CheckIndex.Status.TermIndexStatus testPostings(CodecReader reader, PrintStream infoStream, boolean verbose, boolean doSlowChecks, boolean failFast): Test the term index.
- static CheckIndex.Status.IndexSortStatus testSort(CodecReader reader, Sort sort, PrintStream infoStream, boolean failFast): Tests index sort order.
- static CheckIndex.Status.StoredFieldStatus testStoredFields(CodecReader reader, PrintStream infoStream, boolean failFast): Test stored fields.
- static CheckIndex.Status.TermVectorStatus testTermVectors(CodecReader reader, PrintStream infoStream): Test term vectors.
- static CheckIndex.Status.TermVectorStatus testTermVectors(CodecReader reader, PrintStream infoStream, boolean verbose, boolean doSlowChecks, boolean failFast): Test term vectors.
- static CheckIndex.Status.VectorValuesStatus testVectors(CodecReader reader, PrintStream infoStream, boolean failFast): Test the vectors index.
-
Constructor Details
-
CheckIndex
Create a new CheckIndex on the directory.
- Throws:
IOException
-
CheckIndex
Expert: create a CheckIndex on the directory with the specified lock. This should not be used except for unit tests. It exists only to support special tests (such as TestIndexWriterExceptions*) that would otherwise be more complicated to debug if they had to close the writer for each check.
-
-
Method Details
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
-
setDoSlowChecks
public void setDoSlowChecks(boolean v)
If true, additional slow checks are performed. This will likely drastically increase the time it takes to run CheckIndex!
-
doSlowChecks
public boolean doSlowChecks()
-
setFailFast
public void setFailFast(boolean v)
If true, just throw the original exception immediately when corruption is detected, rather than continuing to iterate to other segments looking for more corruption.
-
getFailFast
public boolean getFailFast()
See setFailFast(boolean).
-
getChecksumsOnly
public boolean getChecksumsOnly()
See setChecksumsOnly(boolean).
-
setChecksumsOnly
public void setChecksumsOnly(boolean v)
If true, only validate physical integrity for all files. Note that the returned nested status objects (e.g. storedFieldStatus) will be null.
-
setThreadCount
public void setThreadCount(int tc)
Set threadCount used for parallelizing index integrity checking.
-
setInfoStream
Set infoStream where messages should go. If null, no messages are printed. If verbose is true then more details are printed.
-
setInfoStream
Set infoStream where messages should go. See setInfoStream(PrintStream,boolean).
-
checkIndex
Returns a CheckIndex.Status instance detailing the state of the index.
As this method checks every byte in the index, on a large index it can take quite a long time to run.
WARNING: make sure you only call this when the index is not opened by any writer.
- Throws:
IOException
-
checkIndex
Returns a CheckIndex.Status instance detailing the state of the index.
As this method checks every byte in the specified segments, on a large index it can take quite a long time to run.
- Parameters:
onlySegments - list of specific segment names to check
- Throws:
IOException
-
checkIndex
public CheckIndex.Status checkIndex(List<String> onlySegments, ExecutorService executorService) throws IOException
Returns a CheckIndex.Status instance detailing the state of the index.
This method allows the caller to pass in a customized ExecutorService to speed up the check.
WARNING: make sure you only call this when the index is not opened by any writer.
- Throws:
IOException
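The general pattern of handing a caller-owned ExecutorService to a checking routine can be sketched without Lucene. The following is a toy model only, not CheckIndex's implementation: the class ParallelSegmentCheckSketch, the method checkSegment, and the "_bad" naming rule are hypothetical stand-ins for the real per-segment work. It also assumes that the caller who creates the pool stays responsible for shutting it down.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSegmentCheckSketch {

    // Hypothetical stand-in for per-segment verification: a segment is
    // reported as clean unless its name ends with "_bad".
    static boolean checkSegment(String segmentName) {
        return !segmentName.endsWith("_bad");
    }

    // Submits one task per segment to the caller-supplied executor and
    // collects the results in the original segment order.
    static Map<String, Boolean> checkAll(List<String> segments, ExecutorService executor) {
        List<Future<Boolean>> futures = new ArrayList<>();
        for (String name : segments) {
            Callable<Boolean> task = () -> checkSegment(name);
            futures.add(executor.submit(task));
        }
        Map<String, Boolean> results = new LinkedHashMap<>();
        try {
            for (int i = 0; i < segments.size(); i++) {
                results.put(segments.get(i), futures.get(i).get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for checks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a segment check failed", e.getCause());
        }
        return results;
    }

    // The caller creates the pool, passes it in, and shuts it down afterwards.
    static Map<String, Boolean> checkWithOwnPool(List<String> segments, int threadCount) {
        ExecutorService executor = Executors.newFixedThreadPool(threadCount);
        try {
            return checkAll(segments, executor);
        } finally {
            executor.shutdown();
        }
    }
}
```

With the real API, the same lifecycle would presumably apply: create the ExecutorService, pass it to checkIndex(onlySegments, executorService), and shut it down in a finally block.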
-
testSort
public static CheckIndex.Status.IndexSortStatus testSort(CodecReader reader, Sort sort, PrintStream infoStream, boolean failFast) throws IOException
Tests index sort order.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
testLiveDocs
public static CheckIndex.Status.LiveDocStatus testLiveDocs(CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException
Test live docs.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
testFieldInfos
public static CheckIndex.Status.FieldInfoStatus testFieldInfos(CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException
Test field infos.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
testFieldNorms
public static CheckIndex.Status.FieldNormStatus testFieldNorms(CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException
Test field norms.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
testPostings
public static CheckIndex.Status.TermIndexStatus testPostings(CodecReader reader, PrintStream infoStream) throws IOException
Test the term index.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
testPostings
public static CheckIndex.Status.TermIndexStatus testPostings(CodecReader reader, PrintStream infoStream, boolean verbose, boolean doSlowChecks, boolean failFast) throws IOException
Test the term index.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
testPoints
public static CheckIndex.Status.PointsStatus testPoints(CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException
Test the points index.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
testVectors
public static CheckIndex.Status.VectorValuesStatus testVectors(CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException
Test the vectors index.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
testStoredFields
public static CheckIndex.Status.StoredFieldStatus testStoredFields(CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException
Test stored fields.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
testDocValues
public static CheckIndex.Status.DocValuesStatus testDocValues(CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException
Test docvalues.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
testTermVectors
public static CheckIndex.Status.TermVectorStatus testTermVectors(CodecReader reader, PrintStream infoStream) throws IOException
Test term vectors.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
testTermVectors
public static CheckIndex.Status.TermVectorStatus testTermVectors(CodecReader reader, PrintStream infoStream, boolean verbose, boolean doSlowChecks, boolean failFast) throws IOException
Test term vectors.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
exorciseIndex
Repairs the index using the previously returned result from checkIndex(). Note that this does not remove any of the unreferenced files after it's done; you must separately open an IndexWriter, which deletes unreferenced files when it's created.
WARNING: this writes a new segments file into the index, effectively removing all documents in broken segments from the index. BE CAREFUL.
- Throws:
IOException
-
assertsOn
public static boolean assertsOn()
Check whether asserts are enabled or not.
- Returns:
- true iff asserts are enabled
-
main
Command-line interface to check and exorcise corrupt segments from an index.
Run it like this:
java -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex pathToIndex [-exorcise] [-verbose] [-segment X] [-segment Y]
- -exorcise: actually write a new segments_N file, removing any problematic segments. *LOSES DATA*
- -segment X: only check the specified segment(s). This can be specified multiple times, to check more than one segment: -segment _2 -segment _a. You can't use this with the -exorcise option.
WARNING: -exorcise should only be used on an emergency basis as it will cause documents (perhaps many) to be permanently removed from the index. Always make a backup copy of your index before running this! Do not run this tool on an index that is actively being written to. You have been warned!
Run without -exorcise, this tool will open the index, report version information and report any exceptions it hits and what action it would take if -exorcise were specified. With -exorcise, this tool will remove any segments that have issues and write a new segments_N file. This means all documents contained in the affected segments will be removed.
This tool exits with exit code 1 if the index cannot be opened or has any corruption, else 0.
- Throws:
IOException
InterruptedException
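The command line described above can also be assembled and launched from Java. The following is a minimal sketch and not part of Lucene: the class CheckIndexCommand, its methods, and the classpath parameter are hypothetical, and the -cp option is an assumption about how the Lucene jar is made visible to the child JVM. The builder rejects the documented invalid combination of -segment with -exorcise, and run returns the exit code of the child process (0 for a clean index, 1 otherwise, per the description above).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class CheckIndexCommand {

    // Builds the argument list for the documented CheckIndex command line.
    // 'classpath' is an assumption: it must contain the Lucene core classes.
    static List<String> buildCommand(String classpath, String pathToIndex,
                                     List<String> segments, boolean exorcise) {
        if (exorcise && !segments.isEmpty()) {
            // The documentation states -segment cannot be used with -exorcise.
            throw new IllegalArgumentException("-segment cannot be combined with -exorcise");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(classpath);
        cmd.add("-ea:org.apache.lucene..."); // enable asserts for Lucene packages
        cmd.add("org.apache.lucene.index.CheckIndex");
        cmd.add(pathToIndex);
        if (exorcise) {
            cmd.add("-exorcise"); // LOSES DATA: back up the index first
        }
        for (String segment : segments) {
            cmd.add("-segment");
            cmd.add(segment);
        }
        return cmd;
    }

    // Runs the command, forwarding its output to this process's console,
    // and returns the child's exit code.
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }
}
```

A read-only check of two segments would then be built with buildCommand(classpath, pathToIndex, List.of("_2", "_a"), false); keeping exorcise as an explicit boolean makes the destructive mode hard to enable by accident.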
-
parseOptions
Parse command line args into fields.
- Parameters:
args - The command line arguments
- Returns:
- An Options struct
- Throws:
IllegalArgumentException - if any of the CLI args are invalid
-
doCheck
Actually perform the index check.
- Parameters:
opts - The options to use for this check
- Returns:
- 0 iff the index is clean, 1 otherwise
- Throws:
IOException
InterruptedException
-