Package org.apache.lucene.document
The logical representation of a Document for indexing and searching.
The document package provides the user level logical representation of content to be indexed and searched. The package also provides utilities for working with Documents and IndexableFields.
Document and IndexableField
A Document is a collection of IndexableFields. An IndexableField is a logical representation of a user's content that needs to be indexed or stored.
IndexableFields have a number of properties that tell Lucene how to treat the content (such as indexed, tokenized, stored, etc.). See the Field implementation of IndexableField for specifics on these properties.
Note: it is common to refer to Documents having Fields, even though technically they have IndexableFields.
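As an illustration of the relationship described above, the following is a minimal sketch of building a Document from a few Field implementations listed in this package. The field names ("id", "title", "body"), the class name DocumentSketch and the helper buildDocument are hypothetical and chosen only for this example; it assumes lucene-core is on the classpath.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;

// Hypothetical example class; not part of Lucene.
class DocumentSketch {

    // Builds a Document holding three fields with different treatment.
    static Document buildDocument(String id, String title, String body) {
        Document doc = new Document();
        // Indexed as a single token (not tokenized) and stored.
        doc.add(new StringField("id", id, Field.Store.YES));
        // Indexed, tokenized and stored.
        doc.add(new TextField("title", title, Field.Store.YES));
        // Indexed and tokenized, but not stored.
        doc.add(new TextField("body", body, Field.Store.NO));
        return doc;
    }

    public static void main(String[] args) {
        Document doc = buildDocument("doc-1", "Lucene in brief",
                "Documents are collections of fields.");
        // Document.get returns the string value of the first field with that name.
        System.out.println(doc.get("title"));
        System.out.println(doc.getFields().size());
    }
}
```

Whether a value is stored (Field.Store.YES or Field.Store.NO) is independent of whether it is indexed; here the body is searchable but would not be returned with search results.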
Working with Documents
First and foremost, a Document is something created by the user application. It is your job to create Documents based on the content of the files you are working with in your application (Word, txt, PDF, Excel or any other format). How this is done is completely up to you. That being said, there are many tools available in other projects that can simplify the process of taking a file and converting it into a Lucene Document.
DateTools is a utility class to make dates and times searchable. IntPoint, LongPoint, FloatPoint and DoublePoint enable indexing of numeric values (and also dates) for fast range queries using PointRangeQuery.
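The utilities mentioned above might be combined roughly as follows. This is a sketch, not a recommended schema: the field names ("attendees", "timestamp", "day"), the class name NumericAndDateSketch and its helper methods are hypothetical, and lucene-core is assumed to be on the classpath. A point field is indexed but not stored, so a separate StoredField is added for the value that should be retrievable.

```java
import java.util.Date;

import org.apache.lucene.document.DateTools;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.IntPoint;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.search.Query;

// Hypothetical example class; not part of Lucene.
class NumericAndDateSketch {

    static Document buildEvent(int attendees, Date when) {
        Document doc = new Document();
        // Indexed for range queries; point fields are not stored.
        doc.add(new IntPoint("attendees", attendees));
        // Stored copy so the value can be retrieved with the document.
        doc.add(new StoredField("attendees", attendees));
        // A date indexed as a long (milliseconds since the epoch).
        doc.add(new LongPoint("timestamp", when.getTime()));
        // The same date as a searchable string at day resolution.
        String day = DateTools.dateToString(when, DateTools.Resolution.DAY);
        doc.add(new StringField("day", day, Field.Store.YES));
        return doc;
    }

    // Range query over the "attendees" point field, bounds inclusive.
    static Query attendeesBetween(int min, int max) {
        return IntPoint.newRangeQuery("attendees", min, max);
    }

    public static void main(String[] args) {
        Document doc = buildEvent(120, new Date(0L));
        System.out.println(doc.get("day"));
        System.out.println(attendeesBetween(100, 200));
    }
}
```

Indexing a date as a LongPoint suits range queries, while the DateTools string form suits exact matching at a chosen granularity; which one to use depends on the queries the application needs.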
Class Summary
Class - Description
BinaryDocValuesField - Field that stores a per-document BytesRef value.
BinaryPoint - An indexed binary field for fast range filters.
DateTools - Provides support for converting dates to strings and vice-versa.
Document - Documents are the unit of indexing and search.
DocumentStoredFieldVisitor - A StoredFieldVisitor that creates a Document from stored fields.
DoubleDocValuesField - Syntactic sugar for encoding doubles as NumericDocValues via Double.doubleToRawLongBits(double).
DoublePoint - An indexed double field for fast range filters.
DoubleRange - An indexed Double Range field.
DoubleRangeDocValuesField - DocValues field for DoubleRange.
FeatureField - Field that can be used to store static scoring factors into documents.
Field - Expert: directly create a field for a document.
FieldType - Describes the properties of a field.
FloatDocValuesField - Syntactic sugar for encoding floats as NumericDocValues via Float.floatToRawIntBits(float).
FloatPoint - An indexed float field for fast range filters.
FloatRange - An indexed Float Range field.
FloatRangeDocValuesField - DocValues field for FloatRange.
IntPoint - An indexed int field for fast range filters.
IntRange - An indexed Integer Range field.
IntRangeDocValuesField - DocValues field for IntRange.
LatLonDocValuesField - A per-document location field.
LatLonDocValuesPointInPolygonQuery - Polygon query for LatLonDocValuesField.
LatLonPoint - An indexed location field.
LatLonShape - A geo shape utility class for indexing and searching GIS geometries whose vertices are latitude, longitude values (in decimal degrees).
LongPoint - An indexed long field for fast range filters.
LongRange - An indexed Long Range field.
LongRangeDocValuesField - DocValues field for LongRange.
NumericDocValuesField - Field that stores a per-document long value for scoring, sorting or value retrieval.
ShapeField - A base shape utility class used for both LatLon (spherical) and XY (cartesian) shape fields.
ShapeField.DecodedTriangle - Represents an encoded triangle using ShapeField.decodeTriangle(byte[], DecodedTriangle).
ShapeField.Triangle - Polygons are decomposed into tessellated triangles using Tessellator; these triangles are encoded and inserted as separate indexed POINT fields.
SortedDocValuesField - Field that stores a per-document BytesRef value, indexed for sorting.
SortedNumericDocValuesField - Field that stores per-document long values for scoring, sorting or value retrieval.
SortedSetDocValuesField - Field that stores a set of per-document BytesRef values, indexed for faceting, grouping, joining.
StoredField - A field whose value is stored so that IndexSearcher.doc(int) and IndexReader.document() will return the field and its value.
StringField - A field that is indexed but not tokenized: the entire String value is indexed as a single token.
TextField - A field that is indexed and tokenized, without term vectors.
XYDocValuesField - A per-document location field.
XYDocValuesPointInGeometryQuery - XYGeometry query for XYDocValuesField.
XYPointField - An indexed XY position field.
XYShape - A cartesian shape utility class for indexing and searching geometries whose vertices are unitless x, y values.
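The DoubleDocValuesField and FloatDocValuesField entries above describe encoding floating-point values as long doc values via Double.doubleToRawLongBits(double) and Float.floatToRawIntBits(float). The sketch below illustrates only that bit-level round trip using the JDK; the class and method names are hypothetical, and it is not Lucene's implementation.

```java
// Hypothetical example class; uses only the JDK, not Lucene.
class DocValuesEncodingSketch {

    // Encode a double as the long holding its raw IEEE 754 bits.
    static long encodeDouble(double value) {
        return Double.doubleToRawLongBits(value);
    }

    // Recover the double from its raw bits.
    static double decodeDouble(long bits) {
        return Double.longBitsToDouble(bits);
    }

    // Encode a float as its raw int bits, widened to long.
    static long encodeFloat(float value) {
        return (long) Float.floatToRawIntBits(value);
    }

    // Recover the float by narrowing back to int bits.
    static float decodeFloat(long bits) {
        return Float.intBitsToFloat((int) bits);
    }

    public static void main(String[] args) {
        long bits = encodeDouble(3.14);
        System.out.println(decodeDouble(bits) == 3.14);
        System.out.println(decodeFloat(encodeFloat(-2.5f)) == -2.5f);
    }
}
```

A reader of such doc values would need to apply the matching decode step, since the stored long is the bit pattern rather than a rounded numeric value.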
Enum Summary
Enum - Description
DateTools.Resolution - Specifies the time granularity.
Field.Store - Specifies whether and how a field should be stored.
ShapeField.DecodedTriangle.TYPE - Type of triangle.
ShapeField.QueryRelation - Query relation types.