org.apache.lucene.util
Class UnicodeUtil

java.lang.Object
  extended by org.apache.lucene.util.UnicodeUtil

public final class UnicodeUtil
extends Object

Encodes Java's UTF-16 char[] into a UTF-8 byte[] without always allocating a new byte[], as String.getBytes("UTF-8") does. It also converts UTF-8 bytes back into UTF-16 chars.
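The buffer-reuse idea can be sketched with the JDK alone. This is a minimal illustration, not the Lucene implementation: the class ReusableUtf8Sketch and its fields bytes and length are hypothetical stand-ins for a result object that keeps its byte[] between calls. Substituting U+FFFD for unpaired surrogates is likewise an assumption of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical reusable result holder; a sketch, not the Lucene class. */
final class ReusableUtf8Sketch {
    byte[] bytes = new byte[16]; // grown on demand, then reused across calls
    int length;                  // number of valid bytes in 'bytes'

    /** Encodes source[offset, offset + len) as UTF-8 into the reused buffer. */
    void encode(char[] source, int offset, int len) {
        int max = len * 3; // worst case: 3 bytes per UTF-16 code unit
        if (bytes.length < max) {
            bytes = new byte[max];
        }
        int out = 0;
        int end = offset + len;
        for (int i = offset; i < end; i++) {
            char c = source[i];
            int code = c;
            if (Character.isHighSurrogate(c) && i + 1 < end
                    && Character.isLowSurrogate(source[i + 1])) {
                code = Character.toCodePoint(c, source[++i]); // valid pair
            } else if (Character.isSurrogate(c)) {
                code = 0xFFFD; // unpaired surrogate: substitute U+FFFD (assumption)
            }
            if (code < 0x80) {
                bytes[out++] = (byte) code;
            } else if (code < 0x800) {
                bytes[out++] = (byte) (0xC0 | (code >> 6));
                bytes[out++] = (byte) (0x80 | (code & 0x3F));
            } else if (code < 0x10000) {
                bytes[out++] = (byte) (0xE0 | (code >> 12));
                bytes[out++] = (byte) (0x80 | ((code >> 6) & 0x3F));
                bytes[out++] = (byte) (0x80 | (code & 0x3F));
            } else {
                bytes[out++] = (byte) (0xF0 | (code >> 18));
                bytes[out++] = (byte) (0x80 | ((code >> 12) & 0x3F));
                bytes[out++] = (byte) (0x80 | ((code >> 6) & 0x3F));
                bytes[out++] = (byte) (0x80 | (code & 0x3F));
            }
        }
        length = out;
    }

    public static void main(String[] args) {
        ReusableUtf8Sketch r = new ReusableUtf8Sketch();
        for (String s : new String[] {"h\u00e9llo", "abc"}) {
            char[] chars = s.toCharArray();
            r.encode(chars, 0, chars.length);
            byte[] expected = s.getBytes(StandardCharsets.UTF_8); // allocates on every call
            System.out.println(Arrays.equals(Arrays.copyOf(r.bytes, r.length), expected));
        }
    }
}
```

Only the first length bytes of the buffer are meaningful after a call; anything beyond that is left over from earlier, longer inputs.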

WARNING: This API is new and experimental and may suddenly change.


Nested Class Summary
static class UnicodeUtil.UTF16Result
           
static class UnicodeUtil.UTF8Result
           
 
Field Summary
static int UNI_REPLACEMENT_CHAR
           
static int UNI_SUR_HIGH_END
           
static int UNI_SUR_HIGH_START
           
static int UNI_SUR_LOW_END
           
static int UNI_SUR_LOW_START
           
 
Constructor Summary
UnicodeUtil()
           
 
Method Summary
static void UTF16toUTF8(char[] source, int offset, int length, UnicodeUtil.UTF8Result result)
          Encode characters from a char[] source, starting at offset for length chars.
static void UTF16toUTF8(char[] source, int offset, UnicodeUtil.UTF8Result result)
          Encode characters from a char[] source, starting at offset and stopping when the character 0xffff is seen.
static void UTF16toUTF8(String s, int offset, int length, UnicodeUtil.UTF8Result result)
          Encode characters from this String, starting at offset for length characters.
static void UTF8toUTF16(byte[] utf8, int offset, int length, UnicodeUtil.UTF16Result result)
          Convert UTF8 bytes into UTF16 characters.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

UNI_SUR_HIGH_START

public static final int UNI_SUR_HIGH_START
See Also:
Constant Field Values

UNI_SUR_HIGH_END

public static final int UNI_SUR_HIGH_END
See Also:
Constant Field Values

UNI_SUR_LOW_START

public static final int UNI_SUR_LOW_START
See Also:
Constant Field Values

UNI_SUR_LOW_END

public static final int UNI_SUR_LOW_END
See Also:
Constant Field Values

UNI_REPLACEMENT_CHAR

public static final int UNI_REPLACEMENT_CHAR
See Also:
Constant Field Values
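The field names suggest the UTF-16 surrogate range boundaries and the Unicode replacement character. The sketch below shows how such boundaries are typically used to combine a surrogate pair into one code point. The constants in it are local stand-ins holding the standard Unicode values; they are assumed, not read from UnicodeUtil, and the class SurrogateSketch is hypothetical.

```java
/** Sketch of surrogate handling with standard Unicode boundary values. */
final class SurrogateSketch {
    // Standard Unicode values, assumed to correspond to the similarly named fields.
    static final int SUR_HIGH_START = 0xD800;
    static final int SUR_HIGH_END = 0xDBFF;
    static final int SUR_LOW_START = 0xDC00;
    static final int SUR_LOW_END = 0xDFFF;
    static final int REPLACEMENT_CHAR = 0xFFFD;

    /** Returns the code point starting at s[i], or U+FFFD for an unpaired surrogate. */
    static int codePointAt(char[] s, int i) {
        int c = s[i];
        if (c >= SUR_HIGH_START && c <= SUR_HIGH_END) {
            if (i + 1 < s.length) {
                int low = s[i + 1];
                if (low >= SUR_LOW_START && low <= SUR_LOW_END) {
                    // Each half of the pair contributes 10 bits above 0x10000.
                    return 0x10000 + ((c - SUR_HIGH_START) << 10) + (low - SUR_LOW_START);
                }
            }
            return REPLACEMENT_CHAR; // high surrogate with no low surrogate after it
        }
        if (c >= SUR_LOW_START && c <= SUR_LOW_END) {
            return REPLACEMENT_CHAR; // low surrogate with no high surrogate before it
        }
        return c; // ordinary BMP character
    }

    public static void main(String[] args) {
        char[] pair = {'\uD83D', '\uDE00'};
        System.out.println(Integer.toHexString(codePointAt(pair, 0)));
    }
}
```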
Constructor Detail

UnicodeUtil

public UnicodeUtil()
Method Detail

UTF16toUTF8

public static void UTF16toUTF8(char[] source,
                               int offset,
                               UnicodeUtil.UTF8Result result)
Encode characters from a char[] source, starting at offset and stopping when the character 0xffff is seen. The method returns nothing; the encoded bytes and their count are stored in result.
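The terminator convention can be illustrated as follows. 0xFFFF is a Unicode noncharacter, so it should not occur in ordinary text, which is what makes it usable as an end marker. The sketch only shows how the end of the input is located; the class and method names are hypothetical, and it assumes the caller guarantees that a terminator is present.

```java
/** Sketch of locating the end of a 0xFFFF-terminated char[] region. */
final class SentinelScanSketch {
    static final char END = '\uFFFF'; // Unicode noncharacter used as terminator

    /**
     * Returns the number of chars from offset up to, but not including, the
     * first 0xFFFF. Throws ArrayIndexOutOfBoundsException if none is found.
     */
    static int terminatedLength(char[] source, int offset) {
        int i = offset;
        while (source[i] != END) {
            i++;
        }
        return i - offset;
    }

    public static void main(String[] args) {
        char[] buffer = {'a', 'b', 'c', END, 'x'};
        int len = terminatedLength(buffer, 0);
        System.out.println(new String(buffer, 0, len));
    }
}
```

A caller that already knows the length can skip the scan and use the overload that takes an explicit length.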


UTF16toUTF8

public static void UTF16toUTF8(char[] source,
                               int offset,
                               int length,
                               UnicodeUtil.UTF8Result result)
Encode characters from a char[] source, starting at offset for length chars. The method returns nothing; the encoded bytes and their count are stored in result.


UTF16toUTF8

public static void UTF16toUTF8(String s,
                               int offset,
                               int length,
                               UnicodeUtil.UTF8Result result)
Encode characters from the String s, starting at offset for length characters. The method returns nothing; the encoded bytes and their count are stored in result.


UTF8toUTF16

public static void UTF8toUTF16(byte[] utf8,
                               int offset,
                               int length,
                               UnicodeUtil.UTF16Result result)
Convert UTF8 bytes into UTF16 characters. If offset is non-zero, conversion starts at that position in utf8, and the characters produced by the previous call for the bytes before offset are reused instead of being decoded again.
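One way such prefix reuse could work is sketched below. This is an assumption about the general technique, not a description of the Lucene code: the holder IncrementalUtf8DecodeSketch and its fields chars, offsets and length are hypothetical. The sketch assumes well-formed UTF-8, an offset that falls on a character boundary, and bytes before offset that are identical to those of the previous call.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical reusable decoder state; a sketch, not the Lucene class. */
final class IncrementalUtf8DecodeSketch {
    char[] chars = new char[16];
    // offsets[b] = index in 'chars' of the character that starts at utf8 byte b
    int[] offsets = new int[17];
    int length; // number of valid chars in 'chars'

    /** Decodes utf8[offset, offset + len), keeping chars already decoded before offset. */
    void decode(byte[] utf8, int offset, int len) {
        int end = offset + len;
        // A UTF-8 sequence never yields more chars than it has bytes.
        if (chars.length < end) {
            chars = Arrays.copyOf(chars, end);
        }
        if (offsets.length < end + 1) {
            offsets = Arrays.copyOf(offsets, end + 1);
        }
        int out = (offset == 0) ? 0 : offsets[offset]; // resume after the reused prefix
        int i = offset;
        while (i < end) {
            offsets[i] = out;
            int b = utf8[i] & 0xFF;
            int code;
            if (b < 0x80) {
                code = b;
                i += 1;
            } else if (b < 0xE0) {
                code = ((b & 0x1F) << 6) | (utf8[i + 1] & 0x3F);
                i += 2;
            } else if (b < 0xF0) {
                code = ((b & 0x0F) << 12) | ((utf8[i + 1] & 0x3F) << 6) | (utf8[i + 2] & 0x3F);
                i += 3;
            } else {
                code = ((b & 0x07) << 18) | ((utf8[i + 1] & 0x3F) << 12)
                        | ((utf8[i + 2] & 0x3F) << 6) | (utf8[i + 3] & 0x3F);
                i += 4;
            }
            out += Character.toChars(code, chars, out); // writes 1 or 2 chars
        }
        offsets[end] = out;
        length = out;
    }

    public static void main(String[] args) {
        IncrementalUtf8DecodeSketch d = new IncrementalUtf8DecodeSketch();
        byte[] first = "caf\u00e9s".getBytes(StandardCharsets.UTF_8);
        byte[] second = "caf\u00e9 au lait".getBytes(StandardCharsets.UTF_8);
        d.decode(first, 0, first.length);
        // The first 5 bytes ("caf" plus the 2-byte e-acute) are shared, so start at byte 5.
        d.decode(second, 5, second.length - 5);
        System.out.println(new String(d.chars, 0, d.length));
    }
}
```

This pattern pays off when consecutive inputs share long prefixes, since only the differing suffix is decoded on each call.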



Copyright © 2000-2010 Apache Software Foundation. All Rights Reserved.