org.apache.lucene.index
Class CheckIndex

java.lang.Object
  extended by org.apache.lucene.index.CheckIndex

public class CheckIndex
extends Object

Basic tool and API to check the health of an index and write a new segments file that removes references to problematic segments.

Because this tool checks every byte in the index, it can take a long time to run on a large index.

WARNING: This API is experimental and might change in incompatible ways in the next release.
Please make a complete backup of your index before using this to fix your index!

Nested Class Summary
static class CheckIndex.Status
          Returned from checkIndex() detailing the health and status of the index.
 
Constructor Summary
CheckIndex(Directory dir)
          Create a new CheckIndex on the directory.
 
Method Summary
 CheckIndex.Status checkIndex()
          Returns a CheckIndex.Status instance detailing the state of the index.
 CheckIndex.Status checkIndex(List<String> onlySegments)
          Returns a CheckIndex.Status instance detailing the state of the index.
 void fixIndex(CheckIndex.Status result)
          Repairs the index using the result previously returned from checkIndex().
 boolean getCrossCheckTermVectors()
          See setCrossCheckTermVectors(boolean).
static void main(String[] args)
          Command-line interface to check and fix an index.
 void setCrossCheckTermVectors(boolean v)
          If true, term vectors are compared against postings to make sure they are the same.
 void setInfoStream(PrintStream out)
          Set infoStream where messages should go.
 void setInfoStream(PrintStream out, boolean verbose)
          Set infoStream where messages should go.
static CheckIndex.Status.DocValuesStatus testDocValues(AtomicReader reader, PrintStream infoStream)
          Test docvalues.
static CheckIndex.Status.FieldNormStatus testFieldNorms(AtomicReader reader, PrintStream infoStream)
          Test field norms.
static CheckIndex.Status.TermIndexStatus testPostings(AtomicReader reader, PrintStream infoStream)
          Test the term index.
static CheckIndex.Status.TermIndexStatus testPostings(AtomicReader reader, PrintStream infoStream, boolean verbose)
          Test the term index.
static CheckIndex.Status.StoredFieldStatus testStoredFields(AtomicReader reader, PrintStream infoStream)
          Test stored fields.
static CheckIndex.Status.TermVectorStatus testTermVectors(AtomicReader reader, PrintStream infoStream)
          Test term vectors.
static CheckIndex.Status.TermVectorStatus testTermVectors(AtomicReader reader, PrintStream infoStream, boolean verbose, boolean crossCheckTermVectors)
          Test term vectors.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CheckIndex

public CheckIndex(Directory dir)
Create a new CheckIndex on the directory.

Method Detail

setCrossCheckTermVectors

public void setCrossCheckTermVectors(boolean v)
If true, term vectors are compared against postings to make sure they are the same. This will likely drastically increase the time it takes to run CheckIndex!


getCrossCheckTermVectors

public boolean getCrossCheckTermVectors()
See setCrossCheckTermVectors(boolean).


setInfoStream

public void setInfoStream(PrintStream out,
                          boolean verbose)
Set infoStream where messages should go. If null, no messages are printed. If verbose is true then more details are printed.


setInfoStream

public void setInfoStream(PrintStream out)
Set infoStream where messages should go. See setInfoStream(PrintStream,boolean).
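
When the diagnostic messages need to end up in a log or a report rather than on the console, one option is to hand setInfoStream a PrintStream backed by an in-memory buffer. The following is a minimal sketch using only java.io; the class name InfoStreamCapture and its methods are hypothetical helpers, not part of Lucene, and the sketch uses the platform default charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

/** Hypothetical helper: collects everything written to a PrintStream in memory. */
public class InfoStreamCapture {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    // autoflush=true so text reaches the buffer on each println
    private final PrintStream stream = new PrintStream(buffer, true);

    /** The stream to pass to setInfoStream(PrintStream) or setInfoStream(PrintStream, boolean). */
    public PrintStream stream() {
        return stream;
    }

    /** Everything written so far, decoded with the platform default charset. */
    public String text() {
        stream.flush();
        return buffer.toString();
    }
}
```

Assuming a CheckIndex instance named checker, the capture would be wired in as checker.setInfoStream(capture.stream(), true) before calling checkIndex(), and capture.text() read afterwards.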


checkIndex

public CheckIndex.Status checkIndex()
                             throws IOException
Returns a CheckIndex.Status instance detailing the state of the index.

As this method checks every byte in the index, on a large index it can take quite a long time to run.

WARNING: make sure you only call this when the index is not opened by any writer.

Throws:
IOException

checkIndex

public CheckIndex.Status checkIndex(List<String> onlySegments)
                             throws IOException
Returns a CheckIndex.Status instance detailing the state of the index.

As this method checks every byte in the specified segments, on a large index it can take quite a long time to run.

WARNING: make sure you only call this when the index is not opened by any writer.

Parameters:
onlySegments - list of specific segment names to check

Throws:
IOException

testFieldNorms

public static CheckIndex.Status.FieldNormStatus testFieldNorms(AtomicReader reader,
                                                               PrintStream infoStream)
Test field norms.

WARNING: This API is experimental and might change in incompatible ways in the next release.

testPostings

public static CheckIndex.Status.TermIndexStatus testPostings(AtomicReader reader,
                                                             PrintStream infoStream)
Test the term index.

WARNING: This API is experimental and might change in incompatible ways in the next release.

testPostings

public static CheckIndex.Status.TermIndexStatus testPostings(AtomicReader reader,
                                                             PrintStream infoStream,
                                                             boolean verbose)
Test the term index.

WARNING: This API is experimental and might change in incompatible ways in the next release.

testStoredFields

public static CheckIndex.Status.StoredFieldStatus testStoredFields(AtomicReader reader,
                                                                   PrintStream infoStream)
Test stored fields.

WARNING: This API is experimental and might change in incompatible ways in the next release.

testDocValues

public static CheckIndex.Status.DocValuesStatus testDocValues(AtomicReader reader,
                                                              PrintStream infoStream)
Test docvalues.

WARNING: This API is experimental and might change in incompatible ways in the next release.

testTermVectors

public static CheckIndex.Status.TermVectorStatus testTermVectors(AtomicReader reader,
                                                                 PrintStream infoStream)
Test term vectors.

WARNING: This API is experimental and might change in incompatible ways in the next release.

testTermVectors

public static CheckIndex.Status.TermVectorStatus testTermVectors(AtomicReader reader,
                                                                 PrintStream infoStream,
                                                                 boolean verbose,
                                                                 boolean crossCheckTermVectors)
Test term vectors.

WARNING: This API is experimental and might change in incompatible ways in the next release.

fixIndex

public void fixIndex(CheckIndex.Status result)
              throws IOException
Repairs the index using the result previously returned from checkIndex(). Note that this does not remove any unreferenced files when it finishes; you must separately open an IndexWriter, which deletes unreferenced files when it is created.

WARNING: this writes a new segments file into the index, effectively removing all documents in broken segments from the index. BE CAREFUL.

WARNING: Make sure you only call this when the index is not opened by any writer.
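
A check-then-repair sequence might look like the sketch below. It assumes the Lucene 4.x API that this page documents (FSDirectory.open(File) and the public fields of CheckIndex.Status such as clean, missingSegments, numBadSegments and totLoseDocCount). The class name CheckThenFix, the method checkAndMaybeFix and the allowFix flag are hypothetical names chosen for illustration. Back up the index and make sure no writer has it open before running anything like this.

```java
import java.io.File;
import java.io.IOException;

import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

/** Illustrative sketch only: checks an index and optionally drops broken segments. */
public class CheckThenFix {

    /**
     * Returns true if the index was clean when checked. If it was not clean
     * and allowFix is true, broken segments are removed via fixIndex.
     */
    public static boolean checkAndMaybeFix(File indexPath, boolean allowFix) throws IOException {
        Directory dir = FSDirectory.open(indexPath);
        try {
            CheckIndex checker = new CheckIndex(dir);
            checker.setInfoStream(System.out);

            // Full check of every segment; may take a long time on a large index.
            CheckIndex.Status status = checker.checkIndex();
            if (status.clean) {
                return true;
            }

            // If the segments file itself could not be read there is nothing to repair from.
            if (status.missingSegments || status.cantOpenSegments || status.missingSegmentVersion) {
                System.out.println("Segments file is missing or unreadable; not attempting a fix.");
                return false;
            }

            System.out.println(status.numBadSegments + " broken segment(s); "
                    + status.totLoseDocCount + " document(s) would be lost by a fix.");

            if (allowFix) {
                // Writes a new segments file that no longer references the broken segments.
                checker.fixIndex(status);
            }
            return false;
        } finally {
            dir.close();
        }
    }

    public static void main(String[] args) throws IOException {
        boolean clean = checkAndMaybeFix(new File(args[0]), false);
        System.out.println(clean ? "Index is clean." : "Index has problems.");
    }
}
```

After a fix, unreferenced files remain on disk until an IndexWriter is opened on the directory, as noted above.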

Throws:
IOException

main

public static void main(String[] args)
                 throws IOException,
                        InterruptedException
Command-line interface to check and fix an index.

Run it like this:

    java -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex pathToIndex [-fix] [-verbose] [-segment X] [-segment Y]
    

WARNING: -fix should only be used on an emergency basis as it will cause documents (perhaps many) to be permanently removed from the index. Always make a backup copy of your index before running this! Do not run this tool on an index that is actively being written to. You have been warned!

When run without -fix, this tool opens the index, reports version information, and reports any exceptions it hits along with the action it would take if -fix were specified. With -fix, this tool removes any segments that have issues and writes a new segments_N file. This means all documents contained in the affected segments will be removed.

This tool exits with exit code 1 if the index cannot be opened or has any corruption, else 0.
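
For callers that prefer to run the tool in a separate JVM, the command line and exit code described above can be driven from Java with ProcessBuilder. This is a rough sketch: the class CheckIndexCli, its methods and the classpath parameter are hypothetical, and the sketch declines to combine -fix with -segment on the assumption that a repair should follow a full check.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical wrapper that launches the CheckIndex command-line tool in a child JVM. */
public class CheckIndexCli {

    /** Builds the argument list for the command line shown above. */
    public static List<String> buildCommand(String classpath, String indexPath,
                                            boolean fix, boolean verbose,
                                            List<String> segments) {
        if (fix && !segments.isEmpty()) {
            // Assumption of this sketch: only repair after checking the whole index.
            throw new IllegalArgumentException("do not combine -fix with -segment");
        }
        List<String> cmd = new ArrayList<String>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(classpath);                       // e.g. path to the Lucene core jar
        cmd.add("-ea:org.apache.lucene...");      // enable assertions in Lucene packages
        cmd.add("org.apache.lucene.index.CheckIndex");
        cmd.add(indexPath);
        if (fix) {
            cmd.add("-fix");
        }
        if (verbose) {
            cmd.add("-verbose");
        }
        for (String segment : segments) {
            cmd.add("-segment");
            cmd.add(segment);
        }
        return cmd;
    }

    /** Runs the command, forwarding its output to this process, and returns the exit code. */
    public static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }

    /** Exit code 0 means the index opened and no corruption was found; 1 means otherwise. */
    public static boolean isHealthy(int exitCode) {
        return exitCode == 0;
    }
}
```

A caller would typically pass the result of run(buildCommand(...)) to isHealthy and alert or stop a deployment when it returns false.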

Throws:
IOException
InterruptedException


Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.