Policy for producing a smaller index from an input index, by examining its terms
and removing from the index some or all of their data as follows:
all terms of a certain field - see
all data of a certain term - see
all positions of a certain term in a certain document - see #pruneAllPositions(TermPositions, Term)
some positions of a certain term in a certain document - see #pruneSomePositions(int, int, Term)
For many types of queries, the pruned, smaller index would return nearly
identical top-N results compared with the original index, but with increased performance.
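
The four levels of pruning listed above can be illustrated with a minimal, self-contained sketch. It does not use the Lucene API: the nested-map index (field -> term -> document id -> positions), the class name PruningSketch, the method prune, and the parameters droppedFields, droppedTerms, minPositions and maxPositions are all hypothetical, chosen only to show the order in which the decisions could be made. The sketch also assumes that removing all positions of a term in a document removes that document's posting for the term, and that "some positions" means keeping only the leading ones.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PruningSketch {

    /**
     * Toy pruning pass over a nested-map index (field -> term -> docId -> positions).
     * This is an illustration only, not Lucene's data model or API.
     */
    static Map<String, Map<String, Map<Integer, List<Integer>>>> prune(
            Map<String, Map<String, Map<Integer, List<Integer>>>> index,
            Set<String> droppedFields,   // hypothetical: fields removed entirely
            Set<String> droppedTerms,    // hypothetical: terms removed entirely
            int minPositions,            // postings with fewer positions are removed
            int maxPositions) {          // longer position lists are truncated
        Map<String, Map<String, Map<Integer, List<Integer>>>> out = new HashMap<>();
        for (Map.Entry<String, Map<String, Map<Integer, List<Integer>>>> f : index.entrySet()) {
            // Level 1: drop all terms of a certain field.
            if (droppedFields.contains(f.getKey())) continue;
            Map<String, Map<Integer, List<Integer>>> terms = new HashMap<>();
            for (Map.Entry<String, Map<Integer, List<Integer>>> t : f.getValue().entrySet()) {
                // Level 2: drop all data of a certain term.
                if (droppedTerms.contains(t.getKey())) continue;
                Map<Integer, List<Integer>> docs = new HashMap<>();
                for (Map.Entry<Integer, List<Integer>> d : t.getValue().entrySet()) {
                    List<Integer> positions = d.getValue();
                    // Level 3: drop all positions of this term in this document.
                    if (positions.size() < minPositions) continue;
                    // Level 4: keep only some (here, the leading) positions.
                    int keep = Math.min(positions.size(), maxPositions);
                    docs.put(d.getKey(), new ArrayList<>(positions.subList(0, keep)));
                }
                if (!docs.isEmpty()) terms.put(t.getKey(), docs);
            }
            if (!terms.isEmpty()) out.put(f.getKey(), terms);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Map<Integer, List<Integer>>>> index = Map.of(
            "body", Map.of(
                "lucene", Map.of(1, List.of(0, 4, 9), 2, List.of(3)),
                "the", Map.of(1, List.of(1, 2))),
            "debug", Map.of("x", Map.of(1, List.of(0))));
        // Drop the "debug" field and the term "the"; require at least 2 positions
        // per posting and keep at most 2 of them.
        System.out.println(prune(index, Set.of("debug"), Set.of("the"), 2, 2));
        // prints {body={lucene={1=[0, 4]}}}
    }
}
```

The coarser checks run first so that a dropped field or term never needs its postings inspected. A real policy would base these decisions on term statistics or scores rather than on fixed sets and thresholds.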