public abstract class CharacterUtils extends Object
CharacterUtils provides a unified interface to Character-related operations to implement backwards compatible character operations based on a Version instance.
Modifier and Type | Class and Description |
---|---|
static class | CharacterUtils.CharacterBuffer A simple IO buffer to use with fill(CharacterBuffer, Reader). |
Constructor and Description |
---|
CharacterUtils() |
Modifier and Type | Method and Description |
---|---|
abstract int | codePointAt(char[] chars, int offset, int limit) Returns the code point at the given index of the char array where only elements with index less than the limit are used. |
abstract int | codePointAt(CharSequence seq, int offset) Returns the code point at the given index of the CharSequence. |
abstract int | codePointCount(CharSequence seq) Return the number of characters in seq. |
boolean | fill(CharacterUtils.CharacterBuffer buffer, Reader reader) Convenience method which calls fill(buffer, reader, buffer.buffer.length). |
abstract boolean | fill(CharacterUtils.CharacterBuffer buffer, Reader reader, int numChars) Fills the CharacterUtils.CharacterBuffer with characters read from the given Reader. |
static CharacterUtils | getInstance() Returns a CharacterUtils implementation. |
static CharacterUtils | getJava4Instance() Deprecated. Only for n-gram backwards compat |
static CharacterUtils.CharacterBuffer | newCharacterBuffer(int bufferSize) Creates a new CharacterUtils.CharacterBuffer and allocates a char[] of the given bufferSize. |
abstract int | offsetByCodePoints(char[] buf, int start, int count, int index, int offset) Return the index within buf[start:start+count] which is offset code points away from index. |
int | toChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff) Converts a sequence of Unicode code points to a sequence of Java characters. |
int | toCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff) Converts a sequence of Java characters to a sequence of Unicode code points. |
void | toLowerCase(char[] buffer, int offset, int limit) Converts each Unicode code point to lowercase via Character.toLowerCase(int) starting at the given offset. |
void | toUpperCase(char[] buffer, int offset, int limit) Converts each Unicode code point to uppercase via Character.toUpperCase(int) starting at the given offset. |
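The code point accessors summarized above can be illustrated with a small sketch. It uses only the JDK and assumes that the default implementation behaves like the corresponding java.lang.Character methods; the class name CodePointAccessSketch is hypothetical and this is not the library's own code.

```java
// Hypothetical JDK-only sketch; assumes the documented accessors behave
// like the matching java.lang.Character methods.
class CodePointAccessSketch {

    // Code point starting at offset, looking no further than limit (exclusive).
    static int codePointAt(char[] chars, int offset, int limit) {
        return Character.codePointAt(chars, offset, limit);
    }

    // Code point starting at offset in a character sequence.
    static int codePointAt(CharSequence seq, int offset) {
        return Character.codePointAt(seq, offset);
    }

    // Number of code points in the whole sequence.
    static int codePointCount(CharSequence seq) {
        return Character.codePointCount(seq, 0, seq.length());
    }

    // Index within buf[start:start+count] that lies `offset` code points away from `index`.
    static int offsetByCodePoints(char[] buf, int start, int count, int index, int offset) {
        return Character.offsetByCodePoints(buf, start, count, index, offset);
    }

    public static void main(String[] args) {
        // 'a', U+10400 (one surrogate pair), 'b': 4 chars, 3 code points.
        String text = "a\uD801\uDC00b";
        char[] chars = text.toCharArray();

        System.out.println(Integer.toHexString(codePointAt(chars, 1, chars.length))); // 10400
        // With limit 2 the low surrogate is out of range, so only the high surrogate is returned.
        System.out.println(Integer.toHexString(codePointAt(chars, 1, 2)));            // d801
        System.out.println(codePointCount(text));                                     // 3
        System.out.println(offsetByCodePoints(chars, 0, chars.length, 0, 2));         // 3
    }
}
```

The second call shows why the limit parameter matters: a surrogate pair that straddles the limit is not combined into one code point.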
public static CharacterUtils getInstance()
Returns a CharacterUtils implementation.
Returns: a CharacterUtils implementation according to the given Version instance.
@Deprecated public static CharacterUtils getJava4Instance()
public abstract int codePointAt(CharSequence seq, int offset)
Returns the code point at the given index of the CharSequence.
Parameters:
seq - a character sequence
offset - the offset to the char values in the character sequence to be converted
Throws:
NullPointerException - if the sequence is null.
IndexOutOfBoundsException - if the value offset is negative or not less than the length of the character sequence.
public abstract int codePointAt(char[] chars, int offset, int limit)
Returns the code point at the given index of the char array where only elements with index less than the limit are used.
Parameters:
chars - a character array
offset - the offset to the char values in the chars array to be converted
limit - the index after the last element that should be used to calculate the code point.
Throws:
NullPointerException - if the array is null.
IndexOutOfBoundsException - if the value offset is negative or not less than the length of the char array.
public abstract int codePointCount(CharSequence seq)
Return the number of characters in seq.
public static CharacterUtils.CharacterBuffer newCharacterBuffer(int bufferSize)
Creates a new CharacterUtils.CharacterBuffer and allocates a char[] of the given bufferSize.
Parameters:
bufferSize - the internal char buffer size, must be >= 2
Returns: a new CharacterUtils.CharacterBuffer instance.
public final void toLowerCase(char[] buffer, int offset, int limit)
Converts each Unicode code point to lowercase via Character.toLowerCase(int) starting at the given offset.
Parameters:
buffer - the char buffer to lowercase
offset - the offset to start at
limit - the max char in the buffer to lowercase
public final void toUpperCase(char[] buffer, int offset, int limit)
Converts each Unicode code point to uppercase via Character.toUpperCase(int) starting at the given offset.
Parameters:
buffer - the char buffer to uppercase
offset - the offset to start at
limit - the max char in the buffer to uppercase
public final int toCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff)
Converts a sequence of Java characters to a sequence of Unicode code points.
public final int toChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff)
Converts a sequence of Unicode code points to a sequence of Java characters.
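The conversions between Java chars and Unicode code points, and the per-code-point case conversion, can be sketched with the JDK alone. This is a hedged illustration, not the library's code: the class name CodePointConversionSketch is hypothetical, the return values are assumed to be the number of elements written to dest, and the in-place lowercasing assumes a lowercased code point occupies as many chars as the original.

```java
// Hypothetical JDK-only sketch of char <-> code point conversion.
class CodePointConversionSketch {

    // Decodes srcLen chars starting at srcOff into code points.
    // Assumption: returns the number of code points written to dest.
    static int toCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff) {
        if (srcLen < 0) {
            throw new IllegalArgumentException("srcLen must be >= 0");
        }
        int end = srcOff + srcLen;
        int written = 0;
        for (int i = srcOff; i < end; ) {
            int cp = Character.codePointAt(src, i, end);
            dest[destOff + written++] = cp;
            i += Character.charCount(cp); // 1 for BMP, 2 for a surrogate pair
        }
        return written;
    }

    // Encodes srcLen code points starting at srcOff into chars.
    // Assumption: returns the number of chars written to dest.
    static int toChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff) {
        if (srcLen < 0) {
            throw new IllegalArgumentException("srcLen must be >= 0");
        }
        int written = 0;
        for (int i = 0; i < srcLen; i++) {
            written += Character.toChars(src[srcOff + i], dest, destOff + written);
        }
        return written;
    }

    // Lowercases buffer[offset, limit) in place, one code point at a time.
    // Assumption: the lowercase form has the same char count as the original.
    static void toLowerCase(char[] buffer, int offset, int limit) {
        for (int i = offset; i < limit; ) {
            int cp = Character.codePointAt(buffer, i, limit);
            i += Character.toChars(Character.toLowerCase(cp), buffer, i);
        }
    }

    public static void main(String[] args) {
        // 'A' followed by U+10400 (DESERET CAPITAL LETTER LONG I), stored as a surrogate pair.
        char[] chars = "A\uD801\uDC00".toCharArray();
        int[] codePoints = new int[chars.length];

        int count = toCodePoints(chars, 0, chars.length, codePoints, 0);
        System.out.println(count); // 2

        toLowerCase(chars, 0, chars.length);
        System.out.println(Integer.toHexString(Character.codePointAt(chars, 1))); // 10428

        char[] roundTrip = new char[3];
        System.out.println(toChars(codePoints, 0, count, roundTrip, 0)); // 3
    }
}
```

Working per code point rather than per char is what keeps supplementary characters intact: lowercasing the two surrogates separately would leave U+10400 unchanged.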
public abstract boolean fill(CharacterUtils.CharacterBuffer buffer, Reader reader, int numChars) throws IOException
Fills the CharacterUtils.CharacterBuffer with characters read from the given Reader. This method tries to read numChars characters into the CharacterUtils.CharacterBuffer; each call to fill will start filling the buffer from offset 0 up to numChars. In case code points can span across 2 Java characters, this method may only fill numChars - 1 characters in order not to split in the middle of a surrogate pair, even if there are remaining characters in the Reader.
This method guarantees that the given CharacterUtils.CharacterBuffer will never contain a high surrogate character as the last element in the buffer unless it is the last available character in the reader. In other words, high and low surrogate pairs will always be preserved across buffer borders.
A return value of false means that this method call exhausted the reader, but there may be some characters which have been read, which can be verified by checking whether buffer.getLength() > 0.
Parameters:
buffer - the buffer to fill.
reader - the reader to read characters from.
numChars - the number of chars to read
Returns: false if and only if reader.read returned -1 while trying to fill the buffer
Throws:
IOException - if the reader throws an IOException.
public final boolean fill(CharacterUtils.CharacterBuffer buffer, Reader reader) throws IOException
Convenience method which calls fill(buffer, reader, buffer.buffer.length).
Throws:
IOException
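The fill contract described above (never end a buffer on a high surrogate unless the reader is exhausted, and return false once reader.read returns -1) can be sketched as follows. This is one possible way to meet the contract using only the JDK, not the library's implementation: the class SurrogateSafeFillSketch, its Buf type, the pendingHigh field and the chunks helper are all hypothetical, and rejecting numChars below 2 is an assumption modelled on the documented minimum buffer size.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical JDK-only sketch of a surrogate-safe buffer fill.
class SurrogateSafeFillSketch {

    // Hypothetical stand-in for CharacterUtils.CharacterBuffer.
    static final class Buf {
        final char[] chars;
        int length;       // number of valid chars after the last fill
        char pendingHigh; // high surrogate held back from the previous fill, 0 if none

        Buf(int size) {
            if (size < 2) {
                throw new IllegalArgumentException("buffer size must be >= 2");
            }
            chars = new char[size];
        }
    }

    // Returns false if and only if reader.read returned -1 during this call.
    static boolean fill(Buf buf, Reader reader, int numChars) throws IOException {
        if (numChars < 2 || numChars > buf.chars.length) {
            throw new IllegalArgumentException("numChars must be in [2, buffer length]");
        }
        int pos = 0;
        if (buf.pendingHigh != 0) {
            // Re-install the high surrogate that was held back last time.
            buf.chars[pos++] = buf.pendingHigh;
            buf.pendingHigh = 0;
        }
        boolean exhausted = false;
        while (pos < numChars) {
            int n = reader.read(buf.chars, pos, numChars - pos);
            if (n == -1) {
                exhausted = true;
                break;
            }
            pos += n;
        }
        if (!exhausted && Character.isHighSurrogate(buf.chars[pos - 1])) {
            // More input may follow: hold the high surrogate back so the pair is not split.
            buf.pendingHigh = buf.chars[--pos];
        }
        buf.length = pos;
        return !exhausted;
    }

    // Reads the whole input through the buffer and returns the content of each non-empty fill.
    static List<String> chunks(String input, int bufferSize) {
        Buf buf = new Buf(bufferSize);
        List<String> out = new ArrayList<>();
        try (Reader reader = new StringReader(input)) {
            boolean more;
            do {
                more = fill(buf, reader, bufferSize);
                if (buf.length > 0) { // a false return may still have delivered characters
                    out.add(new String(buf.chars, 0, buf.length));
                }
            } while (more);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // With a 3-char buffer the first fill would end on a high surrogate,
        // so it delivers only "ab"; the pair arrives whole in the second chunk.
        List<String> result = chunks("ab\uD83D\uDE00cd", 3);
        System.out.println(result.size()); // 3
    }
}
```

The loop in chunks mirrors the documented usage: keep calling fill while it returns true, and check the buffer length after a false return because the final call may still have read characters.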
public abstract int offsetByCodePoints(char[] buf, int start, int count, int index, int offset)
Return the index within buf[start:start+count] which is offset code points away from index.
Copyright © 2000-2015 Apache Software Foundation. All Rights Reserved.