Class BlockReader
- All Implemented Interfaces:
Accountable, BytesRefIterator
- Direct Known Subclasses:
IntersectBlockReader, STBlockReader
Reads the block fully into blockReadBuffer, then scans the block terms in memory. The details region is decoded lazily with termStatesReadBuffer, which shares the same byte array as blockReadBuffer. See BlockWriter and BlockLine for the block format.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
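The buffer sharing described above can be pictured with a minimal sketch. It uses java.nio.ByteBuffer as a stand-in for Lucene's ByteArrayDataInput, and the block layout (three term bytes followed by two details bytes) is invented for illustration; it is not the actual block format.

```java
import java.nio.ByteBuffer;

final class SharedBlockBufferSketch {
    /**
     * Reads one byte through each of two views over the same array.
     * Returns {first term byte, first details byte, terms position, details position}.
     */
    static int[] readThroughTwoViews(byte[] block, int detailsOffset) {
        // Stand-in for blockReadBuffer: positioned at the start of the block.
        ByteBuffer termsView = ByteBuffer.wrap(block);
        // Stand-in for termStatesReadBuffer: same bytes, independent position.
        ByteBuffer detailsView = termsView.duplicate();
        detailsView.position(detailsOffset);

        int firstTermByte = termsView.get();
        int firstDetailsByte = detailsView.get();
        return new int[] {
            firstTermByte, firstDetailsByte, termsView.position(), detailsView.position()
        };
    }
}
```

Because both views wrap one array, the block is read from the file once, and the details region costs nothing until something actually reads from the second view.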
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.index.TermsEnum
TermsEnum.SeekStatus
-
Field Summary
Modifier and Type | Field | Description
protected final BlockDecoder | blockDecoder |
protected int | blockFirstLineStart | Offset of the start of the first line of the current block (just after the header), relative to the block start.
protected BlockHeader | blockHeader | Current block header.
protected BlockHeader.Serializer | blockHeaderReader |
protected IndexInput | blockInput | IndexInput on the block file.
protected BlockLine | blockLine | Current block line.
protected BlockLine.Serializer | blockLineReader |
protected ByteArrayDataInput | blockReadBuffer | In-memory read buffer for the current block.
protected long | blockStartFP | Current block start file pointer, absolute in the block file.
protected IndexDictionary.Browser | dictionaryBrowser | Holds the IndexDictionary.Browser once loaded.
protected final IndexDictionary.BrowserSupplier | dictionaryBrowserSupplier | IndexDictionary.Browser supplier for lazy loading.
protected final FieldMetadata | fieldMetadata |
protected BytesRefBuilder | forcedTerm | Set when seekExact(BytesRef, TermState) is called.
protected int | lineIndexInBlock | Current line index in the block.
protected final PostingsReaderBase | postingsReader |
protected BytesRef | scratchBlockBytes |
protected BlockLine | scratchBlockLine |
protected final BlockTermState | scratchTermState |
protected BlockTermState | termState | Current block line details.
protected boolean | termStateForced | Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState).
protected DeltaBaseTermStateSerializer | termStateSerializer |
protected ByteArrayDataInput | termStatesReadBuffer | In-memory read buffer for the details region of the current block.
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
Constructor Summary
Modifier | Constructor | Description
protected | BlockReader(IndexDictionary.BrowserSupplier dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder) |
-
Method Summary
Modifier and TypeMethodDescriptionprotected void
protected int
compareToMiddleAndJump
(BytesRef searchedTerm) Compares the searched term to the middle term of the block.protected BlockHeader.Serializer
protected BlockLine.Serializer
protected DeltaBaseTermStateSerializer
protected BytesRef
decodeBlockBytesIfNeeded
(int numBlockBytes) int
docFreq()
protected IndexDictionary.Browser
impacts
(int flags) protected void
protected void
initializeHeader
(BytesRef searchedTerm, long targetBlockStartFP) Reads and setsblockHeader
.protected boolean
isBeyondLastTerm
(BytesRef searchedTerm, long blockStartFP) Indicates whether the searched term is beyond the last term of the field.protected boolean
isCurrentTerm
(BytesRef searchedTerm) protected CorruptIndexException
newCorruptIndexException
(String msg, Long fp) next()
protected BytesRef
nextTerm()
Moves to the next term line and reads it, it may be in the next block.long
ord()
postings
(PostingsEnum reuse, int flags) long
protected BlockHeader
Reads the block header.protected BlockLine
Reads the current block line.protected BlockTermState
Reads theBlockTermState
on the current line.protected BlockTermState
Reads theBlockTermState
if it is not already set.void
seekExact
(long ord) Not supported.boolean
void
Positions thisBlockReader
without re-seeking the term dictionary.protected TermsEnum.SeekStatus
seekInBlock
(BytesRef searchedTerm) Seeks to the provided term in this block.protected TermsEnum.SeekStatus
seekInBlock
(BytesRef searchedTerm, long blockStartFP) Seeks to the provided term in the block starting at the provided file pointer.term()
long
Methods inherited from class org.apache.lucene.index.BaseTermsEnum
attributes
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
Field Details
-
blockInput
IndexInput on the block file.
-
postingsReader
-
fieldMetadata
-
blockDecoder
-
blockHeaderReader
-
blockLineReader
-
blockReadBuffer
In-memory read buffer for the current block.
-
termStatesReadBuffer
In-memory read buffer for the details region of the current block. It shares the same byte array as blockReadBuffer, but with a different position.
-
termStateSerializer
-
dictionaryBrowserSupplier
IndexDictionary.Browser supplier for lazy loading.
-
dictionaryBrowser
Holds the IndexDictionary.Browser once loaded.
-
blockStartFP
protected long blockStartFP
Current block start file pointer, absolute in the block file.
-
blockHeader
Current block header.
-
blockLine
Current block line.
-
termState
Current block line details.
-
blockFirstLineStart
protected int blockFirstLineStart
Offset of the start of the first line of the current block (just after the header), relative to the block start.
-
lineIndexInBlock
protected int lineIndexInBlock
Current line index in the block.
-
termStateForced
protected boolean termStateForced
Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState).
- See Also:
-
forcedTerm
Set when seekExact(BytesRef, TermState) is called.
This optimizes the use case where the caller first calls seekExact(BytesRef, TermState) and then postings(PostingsEnum, int). In this case we do not access the terms block file (we do not seek); we go directly to the postings file, because we already have the TermState with the file pointers into the postings file.
-
scratchBlockBytes
-
scratchTermState
-
scratchBlockLine
-
-
Constructor Details
-
BlockReader
protected BlockReader(IndexDictionary.BrowserSupplier dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder) throws IOException
- Parameters:
dictionaryBrowserSupplier - Supplier used to load the IndexDictionary.Browser lazily in seekCeil(BytesRef).
blockDecoder - Optional block decoder; may be null if none. It can be used for decompression or decryption.
- Throws:
IOException
-
-
Method Details
-
seekCeil
- Specified by:
seekCeil in class TermsEnum
- Throws:
IOException
-
seekExact
- Overrides:
seekExact in class BaseTermsEnum
- Throws:
IOException
-
isCurrentTerm
-
isBeyondLastTerm
Indicates whether the searched term is beyond the last term of the field.
- Parameters:
blockStartFP - The current block start file pointer.
-
seekInBlock
protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm, long blockStartFP) throws IOException
Seeks to the provided term in the block starting at the provided file pointer. Does not go beyond that block.
- Throws:
IOException
-
seekInBlock
Seeks to the provided term in this block.
Does not go beyond this block; TermsEnum.SeekStatus.END is returned if the searched term follows the block.
Compares the line terms with the searchedTerm, taking advantage of the incremental encoding properties.
Scans the terms linearly. Updates the current block line with the current term.
- Throws:
IOException
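As an illustration of the linear scan over incrementally encoded lines, here is a minimal sketch. The Line and Result types are hypothetical stand-ins (not BlockLine or the real serializer), terms are plain strings rather than BytesRef, and for simplicity each term is rebuilt and compared in full instead of exploiting the shared prefix during comparison.

```java
final class SeekInBlockSketch {
    enum Status { FOUND, NOT_FOUND, END }

    /** Hypothetical line: length of the prefix shared with the previous term, plus the new suffix. */
    static final class Line {
        final int sharedPrefixLength;
        final String suffix;

        Line(int sharedPrefixLength, String suffix) {
            this.sharedPrefixLength = sharedPrefixLength;
            this.suffix = suffix;
        }
    }

    /** Outcome of the scan: the status and the term the scan stopped on (null for END). */
    static final class Result {
        final Status status;
        final String currentTerm;

        Result(Status status, String currentTerm) {
            this.status = status;
            this.currentTerm = currentTerm;
        }
    }

    /** Scans the lines in order and never looks past the end of the given block. */
    static Result seekInBlock(Line[] lines, String searchedTerm) {
        StringBuilder term = new StringBuilder();
        for (Line line : lines) {
            // Incremental decoding: keep the shared prefix, then append the suffix.
            term.setLength(line.sharedPrefixLength);
            term.append(line.suffix);
            int cmp = term.toString().compareTo(searchedTerm);
            if (cmp == 0) {
                return new Result(Status.FOUND, term.toString());
            }
            if (cmp > 0) {
                // First term after the searched one: the scan stops here.
                return new Result(Status.NOT_FOUND, term.toString());
            }
        }
        // Every term in the block precedes the searched term.
        return new Result(Status.END, null);
    }
}
```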
-
compareToMiddleAndJump
Compares the searched term to the middle term of the block. If the searched term is lexicographically equal to or after the middle term, jumps directly to the second half of the block.
- Returns:
The comparison between the searched term and the middle term.
- Throws:
IOException
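The idea of the middle-term comparison can be sketched as follows. This is a simplified model under stated assumptions: the block is a sorted array of whole strings (so any position can be read directly), and the method name startIndexFor is hypothetical. It only shows how a single comparison lets the linear scan skip the first half of the block.

```java
final class MiddleJumpSketch {
    /**
     * Returns the index at which a linear scan should start:
     * the middle of the block if the searched term is equal to or after
     * the middle term, otherwise the start of the block.
     */
    static int startIndexFor(String[] sortedTerms, String searchedTerm) {
        int middle = sortedTerms.length / 2;
        int cmp = searchedTerm.compareTo(sortedTerms[middle]);
        return cmp >= 0 ? middle : 0;
    }
}
```

With one comparison, the scan that follows visits at most about half of the lines when the searched term lies in the second half.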
-
readLineInBlock
Reads the current block line. Sets blockLine and increments lineIndexInBlock.
- Returns:
The BlockLine; or null if there are no more lines in the block.
- Throws:
IOException
-
seekExact
Positions this BlockReader without re-seeking the term dictionary.
The block containing the term is not read by this method. It will be read lazily only if needed, for example if next() is called. Calling postings(org.apache.lucene.index.PostingsEnum, int) after this method does not require the block to be read.
- Overrides:
seekExact in class BaseTermsEnum
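The lazy behavior described here can be modeled with a small toy class. Everything in it is hypothetical: terms are strings, the "block file" is an in-memory sorted array, and a long stands in for the TermState's postings file pointer. It only illustrates that a forced state answers the postings lookup without a block read, while next() triggers one.

```java
import java.util.Arrays;

final class LazyBlockReadSketch {
    private final String[] termsOnDisk; // pretend this lives in the block file
    private String[] loadedBlock;       // null until a block read is actually needed
    private String forcedTerm;
    private long forcedPostingsPointer;
    int blockReads;                     // counts simulated block reads

    LazyBlockReadSketch(String[] sortedTerms) {
        this.termsOnDisk = sortedTerms;
    }

    /** Records the term and its state; does not touch the block. */
    void seekExact(String term, long postingsPointer) {
        forcedTerm = term;
        forcedPostingsPointer = postingsPointer;
        loadedBlock = null;
    }

    /** Answers from the forced state only, so no block read happens. */
    long postingsPointer() {
        return forcedPostingsPointer;
    }

    /** Needs the neighboring terms, so the block is loaded on first use. */
    String next() {
        if (loadedBlock == null) {
            loadedBlock = termsOnDisk.clone();
            blockReads++;
        }
        int index = Arrays.binarySearch(loadedBlock, forcedTerm);
        int nextIndex = index >= 0 ? index + 1 : -index - 1;
        if (nextIndex >= loadedBlock.length) {
            return null;
        }
        forcedTerm = loadedBlock[nextIndex];
        return forcedTerm;
    }
}
```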
-
seekExact
public void seekExact(long ord)
Not supported.
-
next
- Specified by:
next in interface BytesRefIterator
- Throws:
IOException
-
nextTerm
Moves to the next term line and reads it; the line may be in the next block. The term details are not read yet. They will be read only when needed, with readTermStateIfNotRead().
- Returns:
The read term bytes; or null if there are no more terms for the field.
- Throws:
IOException
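A minimal sketch of iteration that crosses block boundaries, assuming the blocks are already in memory as arrays of strings. The class and its fields are hypothetical; the real method reads headers and lines from the block file.

```java
final class NextTermSketch {
    private final String[][] blocks;
    private int blockIndex;
    private int lineIndexInBlock;

    NextTermSketch(String[][] blocks) {
        this.blocks = blocks;
    }

    /** Returns the next term, moving to the next block when the current one is exhausted. */
    String nextTerm() {
        while (blockIndex < blocks.length) {
            if (lineIndexInBlock < blocks[blockIndex].length) {
                return blocks[blockIndex][lineIndexInBlock++];
            }
            // Current block exhausted: continue with the first line of the next block.
            blockIndex++;
            lineIndexInBlock = 0;
        }
        return null; // no more terms for the field
    }
}
```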
-
initializeHeader
Reads and sets blockHeader. Sets it to null if there are no more blocks for the field.
- Parameters:
searchedTerm - The searched term; or null if none.
targetBlockStartFP - The file pointer of the block to read.
- Throws:
IOException
-
initializeBlockReadLazily
- Throws:
IOException
-
createBlockHeaderSerializer
-
createBlockLineSerializer
-
createDeltaBaseTermStateSerializer
-
readHeader
Reads the block header. Sets blockHeader.
- Returns:
The block header; or null if there are no more blocks for the field.
- Throws:
IOException
-
decodeBlockBytesIfNeeded
- Throws:
IOException
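The method name and the constructor's note on blockDecoder (optional, may be null, usable for decompression or decryption) suggest a "decode only if a decoder is configured" pattern. The sketch below shows that pattern with a hypothetical ToyBlockDecoder interface over byte arrays and a toy XOR transform; it is not Lucene's BlockDecoder signature.

```java
final class OptionalDecoderSketch {
    /** Hypothetical decoder: turns encoded block bytes into plain block bytes. */
    interface ToyBlockDecoder {
        byte[] decode(byte[] encodedBlock);
    }

    /** Returns the bytes unchanged when no decoder is configured. */
    static byte[] decodeIfNeeded(byte[] blockBytes, ToyBlockDecoder decoder) {
        return decoder == null ? blockBytes : decoder.decode(blockBytes);
    }

    /** Toy "decryption": XOR every byte with a fixed key. */
    static ToyBlockDecoder xorDecoder(byte key) {
        return encoded -> {
            byte[] decoded = new byte[encoded.length];
            for (int i = 0; i < encoded.length; i++) {
                decoded[i] = (byte) (encoded[i] ^ key);
            }
            return decoded;
        };
    }
}
```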
-
readTermStateIfNotRead
Reads the BlockTermState if it is not already set. Sets termState.
- Throws:
IOException
-
readTermState
Reads the BlockTermState on the current line. Sets termState.
An overriding method may return null if there is no BlockTermState (in this case the extending class must support a null termState).
- Returns:
The BlockTermState; or null if none.
- Throws:
IOException
-
term
-
ord
public long ord() -
docFreq
- Specified by:
docFreq in class TermsEnum
- Throws:
IOException
-
totalTermFreq
- Specified by:
totalTermFreq in class TermsEnum
- Throws:
IOException
-
termState
- Overrides:
termState in class BaseTermsEnum
- Throws:
IOException
-
postings
- Specified by:
postings in class TermsEnum
- Throws:
IOException
-
impacts
- Specified by:
impacts in class TermsEnum
- Throws:
IOException
-
ramBytesUsed
public long ramBytesUsed()
- Specified by:
ramBytesUsed in interface Accountable
-
getOrCreateDictionaryBrowser
- Throws:
IOException
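Given the dictionaryBrowserSupplier and dictionaryBrowser fields ("supplier for lazy loading", "holds the browser once loaded"), a get-or-create method of this kind typically follows the pattern sketched below. The sketch uses java.util.function.Supplier as a stand-in for IndexDictionary.BrowserSupplier (which, unlike Supplier, can involve I/O), and the class name is hypothetical.

```java
import java.util.function.Supplier;

final class LazyBrowserSketch<T> {
    private final Supplier<T> supplier; // stand-in for dictionaryBrowserSupplier
    private T browser;                  // stand-in for dictionaryBrowser; null until first use

    LazyBrowserSketch(Supplier<T> supplier) {
        this.supplier = supplier;
    }

    /** Loads the browser on the first call and reuses it afterwards. */
    T getOrCreate() {
        if (browser == null) {
            browser = supplier.get();
        }
        return browser;
    }
}
```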
-
clearTermState
protected void clearTermState()
-
newCorruptIndexException
-