Class BlockReader
- java.lang.Object
-
- org.apache.lucene.index.TermsEnum
-
- org.apache.lucene.index.BaseTermsEnum
-
- org.apache.lucene.codecs.uniformsplit.BlockReader
-
- All Implemented Interfaces:
Accountable
,BytesRefIterator
- Direct Known Subclasses:
IntersectBlockReader
,STBlockReader
public class BlockReader extends BaseTermsEnum implements Accountable
Seeks the block corresponding to a given term, reads the block bytes, and scans the block terms.
Reads the block fully into blockReadBuffer, then scans the block terms in memory. The details region is lazily decoded with termStatesReadBuffer, which shares the same byte array as blockReadBuffer. See BlockWriter and BlockLine for the block format.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.index.TermsEnum
TermsEnum.SeekStatus
-
-
Field Summary
Fields (modifier and type, field, description):
protected BlockDecoder blockDecoder
protected int blockFirstLineStart
    Offset of the start of the first line of the current block (just after the header), relative to the block start.
protected BlockHeader blockHeader
    Current block header.
protected BlockHeader.Serializer blockHeaderReader
protected IndexInput blockInput
    IndexInput on the block file.
protected BlockLine blockLine
    Current block line.
protected BlockLine.Serializer blockLineReader
protected ByteArrayDataInput blockReadBuffer
    In-memory read buffer for the current block.
protected long blockStartFP
    Current block start file pointer, absolute in the block file.
protected IndexDictionary.Browser dictionaryBrowser
    Holds the IndexDictionary.Browser once loaded.
protected IndexDictionary.BrowserSupplier dictionaryBrowserSupplier
    IndexDictionary.Browser supplier for lazy loading.
protected FieldMetadata fieldMetadata
protected BytesRefBuilder forcedTerm
    Set when seekExact(BytesRef, TermState) is called.
protected int lineIndexInBlock
    Current line index in the block.
protected PostingsReaderBase postingsReader
protected BytesRef scratchBlockBytes
protected BlockLine scratchBlockLine
protected BlockTermState scratchTermState
protected BlockTermState termState
    Current block line details.
protected boolean termStateForced
    Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState).
protected DeltaBaseTermStateSerializer termStateSerializer
protected ByteArrayDataInput termStatesReadBuffer
    In-memory read buffer for the details region of the current block.
-
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
-
Constructor Summary
Constructors (modifier, constructor, description):
protected BlockReader(IndexDictionary.BrowserSupplier dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder)
-
Method Summary
Methods (modifier and type, method, description):
protected void clearTermState()
protected int compareToMiddleAndJump(BytesRef searchedTerm)
    Compares the searched term to the middle term of the block.
protected BlockHeader.Serializer createBlockHeaderSerializer()
protected BlockLine.Serializer createBlockLineSerializer()
protected DeltaBaseTermStateSerializer createDeltaBaseTermStateSerializer()
protected BytesRef decodeBlockBytesIfNeeded(int numBlockBytes)
int docFreq()
protected IndexDictionary.Browser getOrCreateDictionaryBrowser()
ImpactsEnum impacts(int flags)
protected void initializeBlockReadLazily()
protected void initializeHeader(BytesRef searchedTerm, long targetBlockStartFP)
    Reads and sets blockHeader.
protected boolean isBeyondLastTerm(BytesRef searchedTerm, long blockStartFP)
    Indicates whether the searched term is beyond the last term of the field.
protected boolean isCurrentTerm(BytesRef searchedTerm)
protected CorruptIndexException newCorruptIndexException(String msg, Long fp)
BytesRef next()
protected BytesRef nextTerm()
    Moves to the next term line and reads it; the line may be in the next block.
long ord()
PostingsEnum postings(PostingsEnum reuse, int flags)
long ramBytesUsed()
protected BlockHeader readHeader()
    Reads the block header.
protected BlockLine readLineInBlock()
    Reads the current block line.
protected BlockTermState readTermState()
    Reads the BlockTermState on the current line.
protected BlockTermState readTermStateIfNotRead()
    Reads the BlockTermState if it is not already set.
TermsEnum.SeekStatus seekCeil(BytesRef searchedTerm)
void seekExact(long ord)
    Not supported.
boolean seekExact(BytesRef searchedTerm)
void seekExact(BytesRef term, TermState state)
    Positions this BlockReader without re-seeking the term dictionary.
protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm)
    Seeks to the provided term in this block.
protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm, long blockStartFP)
    Seeks to the provided term in the block starting at the provided file pointer.
BytesRef term()
TermState termState()
long totalTermFreq()
-
Methods inherited from class org.apache.lucene.index.BaseTermsEnum
attributes
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
-
-
-
Field Detail
-
blockInput
protected IndexInput blockInput
IndexInput on the block file.
-
postingsReader
protected final PostingsReaderBase postingsReader
-
fieldMetadata
protected final FieldMetadata fieldMetadata
-
blockDecoder
protected final BlockDecoder blockDecoder
-
blockHeaderReader
protected BlockHeader.Serializer blockHeaderReader
-
blockLineReader
protected BlockLine.Serializer blockLineReader
-
blockReadBuffer
protected ByteArrayDataInput blockReadBuffer
In-memory read buffer for the current block.
-
termStatesReadBuffer
protected ByteArrayDataInput termStatesReadBuffer
In-memory read buffer for the details region of the current block. It shares the same byte array as blockReadBuffer, with a different position.
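The sharing described above can be pictured with a minimal sketch. It uses java.nio.ByteBuffer as a stand-in for ByteArrayDataInput, and the class name, method name, and region offset are hypothetical; the real block layout is defined by BlockWriter and BlockLine.

```java
import java.nio.ByteBuffer;

// Stand-in for the two read buffers: both views wrap the same byte array,
// but each one keeps its own read position. Names and offsets are hypothetical.
class SharedBlockBufferSketch {

    /** Returns {first byte read by the terms view, first byte read by the details view}. */
    static int[] readBothRegions(byte[] blockBytes, int detailsRegionStart) {
        ByteBuffer termsView = ByteBuffer.wrap(blockBytes); // plays the role of blockReadBuffer
        ByteBuffer detailsView = termsView.duplicate();     // plays the role of termStatesReadBuffer
        detailsView.position(detailsRegionStart);           // independent cursor over the same bytes
        int termByte = termsView.get();                     // does not move detailsView
        int detailsByte = detailsView.get();                // does not move termsView
        return new int[] {termByte, detailsByte};
    }

    public static void main(String[] args) {
        byte[] block = {10, 11, 12, 90, 91};
        int[] result = readBothRegions(block, 3);
        System.out.println(result[0] + " " + result[1]);
    }
}
```

No bytes are copied: the details region can be decoded later, and only if needed, without re-reading the block from the file.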
-
termStateSerializer
protected DeltaBaseTermStateSerializer termStateSerializer
-
dictionaryBrowserSupplier
protected final IndexDictionary.BrowserSupplier dictionaryBrowserSupplier
IndexDictionary.Browser supplier for lazy loading.
-
dictionaryBrowser
protected IndexDictionary.Browser dictionaryBrowser
Holds the IndexDictionary.Browser once loaded.
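The supplier-plus-holder pair is a lazy-loading pattern, which might be sketched as follows. This assumes a plain java.util.function.Supplier; the actual BrowserSupplier type can throw IOException, and the class and method names here are hypothetical.

```java
import java.util.function.Supplier;

// Lazy-loading sketch: the supplier is invoked at most once, on first use,
// and the result is then held in a field. Names are hypothetical.
class LazyBrowserSketch<B> {
    private final Supplier<B> browserSupplier; // plays the role of dictionaryBrowserSupplier
    private B browser;                         // plays the role of dictionaryBrowser

    LazyBrowserSketch(Supplier<B> browserSupplier) {
        this.browserSupplier = browserSupplier;
    }

    /** Loads the browser on the first call and returns the held instance afterwards. */
    B getOrCreateBrowser() {
        if (browser == null) {
            browser = browserSupplier.get();
        }
        return browser;
    }

    public static void main(String[] args) {
        int[] loads = {0};
        LazyBrowserSketch<String> lazy = new LazyBrowserSketch<>(() -> {
            loads[0]++;
            return "browser";
        });
        lazy.getOrCreateBrowser();
        lazy.getOrCreateBrowser();
        System.out.println("loads = " + loads[0]);
    }
}
```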
-
blockStartFP
protected long blockStartFP
Current block start file pointer, absolute in the block file.
-
blockHeader
protected BlockHeader blockHeader
Current block header.
-
blockLine
protected BlockLine blockLine
Current block line.
-
termState
protected BlockTermState termState
Current block line details.
-
blockFirstLineStart
protected int blockFirstLineStart
Offset of the start of the first line of the current block (just after the header), relative to the block start.
-
lineIndexInBlock
protected int lineIndexInBlock
Current line index in the block.
-
termStateForced
protected boolean termStateForced
Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState).
- See Also:
forcedTerm
-
forcedTerm
protected BytesRefBuilder forcedTerm
Set when seekExact(BytesRef, TermState) is called.
This optimizes the use case where the caller first calls seekExact(BytesRef, TermState) and then postings(PostingsEnum, int). In this case we do not access the terms block file (no seek); we go directly to the postings file, because we already have the TermState with the file pointers to the postings file.
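The optimization described for forcedTerm can be illustrated with a small stand-alone model. It is only a sketch under stated assumptions: terms are plain strings, the TermState is reduced to a single hypothetical postings file pointer, and a counter simulates reads of the terms block file. None of these names come from the Lucene API.

```java
import java.util.Arrays;

// Model of the "forced term" path: seekExact(term, state) records the term and
// its state without touching the block; the block is read only if next() is called.
class ForcedTermSketch {
    private final String[] blockTerms; // sorted terms of one simulated block
    int blockReads = 0;                // counts simulated reads of the terms block file
    private String forcedTerm;
    private long forcedPostingsFP;     // hypothetical stand-in for the TermState content
    private boolean termStateForced;
    private int lineIndex = -1;

    ForcedTermSketch(String... sortedTerms) {
        this.blockTerms = sortedTerms;
    }

    void seekExact(String term, long postingsFP) {
        forcedTerm = term;
        forcedPostingsFP = postingsFP;
        termStateForced = true;        // no block read here
    }

    /** Uses the forced state directly; the terms block is not read. */
    long postingsFilePointer() {
        if (!termStateForced) {
            throw new IllegalStateException("this sketch only covers the forced case");
        }
        return forcedPostingsFP;
    }

    /** Reads the block lazily if a term was forced, then moves to the following term. */
    String next() {
        if (termStateForced) {
            blockReads++;              // the lazy block read happens only now
            int found = Arrays.binarySearch(blockTerms, forcedTerm);
            lineIndex = found >= 0 ? found : -found - 2;
            termStateForced = false;
        }
        lineIndex++;
        return lineIndex < blockTerms.length ? blockTerms[lineIndex] : null;
    }
}
```

For example, after `seekExact("b", 42L)` on a block holding "a", "b", "c", calling `postingsFilePointer()` leaves `blockReads` at 0, while a later `next()` performs the single simulated block read and moves to "c".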
-
scratchBlockBytes
protected BytesRef scratchBlockBytes
-
scratchTermState
protected final BlockTermState scratchTermState
-
scratchBlockLine
protected BlockLine scratchBlockLine
-
-
Constructor Detail
-
BlockReader
protected BlockReader(IndexDictionary.BrowserSupplier dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder) throws IOException
- Parameters:
dictionaryBrowserSupplier - to load the IndexDictionary.Browser lazily in seekCeil(BytesRef).
blockDecoder - Optional block decoder, may be null if none. It can be used for decompression or decryption.
- Throws:
IOException
-
-
Method Detail
-
seekCeil
public TermsEnum.SeekStatus seekCeil(BytesRef searchedTerm) throws IOException
- Specified by:
seekCeil in class TermsEnum
- Throws:
IOException
-
seekExact
public boolean seekExact(BytesRef searchedTerm) throws IOException
- Overrides:
seekExact in class BaseTermsEnum
- Throws:
IOException
-
isCurrentTerm
protected boolean isCurrentTerm(BytesRef searchedTerm)
-
isBeyondLastTerm
protected boolean isBeyondLastTerm(BytesRef searchedTerm, long blockStartFP)
Indicates whether the searched term is beyond the last term of the field.
- Parameters:
blockStartFP - The current block start file pointer.
-
seekInBlock
protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm, long blockStartFP) throws IOException
Seeks to the provided term in the block starting at the provided file pointer. Does not exceed the block.
- Throws:
IOException
-
seekInBlock
protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm) throws IOException
Seeks to the provided term in this block.
Does not exceed this block; TermsEnum.SeekStatus.END is returned if the searched term follows the block.
Compares the line terms with the searchedTerm, taking advantage of the incremental encoding properties.
Scans the terms linearly. Updates the current block line with the current term.
- Throws:
IOException
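The linear scan over incrementally encoded lines can be sketched as below. This is an illustration under assumptions, not the actual implementation: each hypothetical line stores the length of the prefix shared with the previous term plus a suffix, terms are ASCII strings rather than BytesRef, and the sketch only rebuilds each term before comparing (it does not reproduce any comparison shortcut the real code may take).

```java
import java.util.Arrays;
import java.util.List;

class SeekInBlockSketch {
    // Local stand-in for TermsEnum.SeekStatus.
    enum Status { FOUND, NOT_FOUND, END }

    /** Hypothetical block line: prefix length shared with the previous term, plus the remaining suffix. */
    static final class Line {
        final int sharedPrefixLength;
        final String suffix;

        Line(int sharedPrefixLength, String suffix) {
            this.sharedPrefixLength = sharedPrefixLength;
            this.suffix = suffix;
        }
    }

    /** Scans the lines in order and never looks past the end of the block. */
    static Status seekInBlock(List<Line> lines, String searchedTerm) {
        StringBuilder current = new StringBuilder();
        for (Line line : lines) {
            current.setLength(line.sharedPrefixLength); // keep the prefix shared with the previous term
            current.append(line.suffix);                // append this line's suffix
            int cmp = current.toString().compareTo(searchedTerm);
            if (cmp == 0) {
                return Status.FOUND;                    // positioned on the searched term
            }
            if (cmp > 0) {
                return Status.NOT_FOUND;                // positioned on the first greater term
            }
        }
        return Status.END;                              // searched term follows the whole block
    }

    public static void main(String[] args) {
        List<Line> block = Arrays.asList(new Line(0, "apple"), new Line(4, "y"), new Line(0, "banana"));
        System.out.println(seekInBlock(block, "apply"));
    }
}
```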
-
compareToMiddleAndJump
protected int compareToMiddleAndJump(BytesRef searchedTerm) throws IOException
Compares the searched term to the middle term of the block. If the searched term is lexicographically equal to or after the middle term, jumps directly to the second half of the block.
- Returns:
- The comparison between the searched term and the middle term.
- Throws:
IOException
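A minimal sketch of the middle-term jump follows, assuming the block is modeled as a sorted String array and that the middle line is simply the one at index length / 2. How the middle line is actually located in the block bytes is not described here, so that part is hypothetical.

```java
// Model of the jump: one comparison against the middle term decides whether the
// subsequent linear scan starts at the first line or at the middle line.
class MiddleJumpSketch {
    int lineIndexInBlock = 0; // index of the line where the linear scan will start

    /** Returns the comparison result; jumps to the middle line when searchedTerm >= middle term. */
    int compareToMiddleAndJump(String[] sortedTerms, String searchedTerm) {
        int middle = sortedTerms.length / 2;          // hypothetical choice of the middle line
        int cmp = searchedTerm.compareTo(sortedTerms[middle]);
        if (cmp >= 0) {
            lineIndexInBlock = middle;                // skip the first half of the block
        }
        return cmp;
    }

    public static void main(String[] args) {
        String[] terms = {"a", "c", "e", "g", "i"};
        MiddleJumpSketch sketch = new MiddleJumpSketch();
        int cmp = sketch.compareToMiddleAndJump(terms, "g");
        System.out.println((cmp > 0) + " " + sketch.lineIndexInBlock);
    }
}
```

With this shape, a scan for a term in the second half compares against roughly half as many lines, at the cost of one extra comparison for terms in the first half.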
-
readLineInBlock
protected BlockLine readLineInBlock() throws IOException
Reads the current block line. Sets blockLine and increments lineIndexInBlock.
- Returns:
- The BlockLine; or null if there are no more lines in the block.
- Throws:
IOException
-
seekExact
public void seekExact(BytesRef term, TermState state)
Positions this BlockReader without re-seeking the term dictionary.
The block containing the term is not read by this method. It will be read lazily only if needed, for example if next() is called. Calling postings(org.apache.lucene.index.PostingsEnum, int) after this method does not require the block to be read.
- Overrides:
seekExact in class BaseTermsEnum
-
seekExact
public void seekExact(long ord)
Not supported.
-
next
public BytesRef next() throws IOException
- Specified by:
next in interface BytesRefIterator
- Throws:
IOException
-
nextTerm
protected BytesRef nextTerm() throws IOException
Moves to the next term line and reads it; the line may be in the next block. The term details are not read yet. They will be read only when needed with readTermStateIfNotRead().
- Returns:
- The read term bytes; or null if there are no more terms for the field.
- Throws:
IOException
-
initializeHeader
protected void initializeHeader(BytesRef searchedTerm, long targetBlockStartFP) throws IOException
Reads and sets blockHeader. Sets it to null if there are no more blocks for the field.
- Parameters:
searchedTerm - The searched term; or null if none.
targetBlockStartFP - The file pointer of the block to read.
- Throws:
IOException
-
initializeBlockReadLazily
protected void initializeBlockReadLazily() throws IOException
- Throws:
IOException
-
createBlockHeaderSerializer
protected BlockHeader.Serializer createBlockHeaderSerializer()
-
createBlockLineSerializer
protected BlockLine.Serializer createBlockLineSerializer()
-
createDeltaBaseTermStateSerializer
protected DeltaBaseTermStateSerializer createDeltaBaseTermStateSerializer()
-
readHeader
protected BlockHeader readHeader() throws IOException
Reads the block header. Sets blockHeader.
- Returns:
- The block header; or null if there are no more blocks for the field.
- Throws:
IOException
-
decodeBlockBytesIfNeeded
protected BytesRef decodeBlockBytesIfNeeded(int numBlockBytes) throws IOException
- Throws:
IOException
-
readTermStateIfNotRead
protected BlockTermState readTermStateIfNotRead() throws IOException
Reads the BlockTermState if it is not already set. Sets termState.
- Throws:
IOException
-
readTermState
protected BlockTermState readTermState() throws IOException
Reads the BlockTermState on the current line. Sets termState.
An overriding method may return null if there is no BlockTermState (in this case the extending class must support a null termState).
- Returns:
- The BlockTermState; or null if none.
- Throws:
IOException
-
docFreq
public int docFreq() throws IOException
- Specified by:
docFreq in class TermsEnum
- Throws:
IOException
-
totalTermFreq
public long totalTermFreq() throws IOException
- Specified by:
totalTermFreq in class TermsEnum
- Throws:
IOException
-
termState
public TermState termState() throws IOException
- Overrides:
termState in class BaseTermsEnum
- Throws:
IOException
-
postings
public PostingsEnum postings(PostingsEnum reuse, int flags) throws IOException
- Specified by:
postings in class TermsEnum
- Throws:
IOException
-
impacts
public ImpactsEnum impacts(int flags) throws IOException
- Specified by:
impacts in class TermsEnum
- Throws:
IOException
-
ramBytesUsed
public long ramBytesUsed()
- Specified by:
ramBytesUsed in interface Accountable
-
getOrCreateDictionaryBrowser
protected IndexDictionary.Browser getOrCreateDictionaryBrowser() throws IOException
- Throws:
IOException
-
clearTermState
protected void clearTermState()
-
newCorruptIndexException
protected CorruptIndexException newCorruptIndexException(String msg, Long fp)
-
-