TUM CCSM Commons

edu.tum.cs.commons.filesystem
Enum EByteOrderMark

java.lang.Object
  extended by java.lang.Enum<EByteOrderMark>
      extended by edu.tum.cs.commons.filesystem.EByteOrderMark
All Implemented Interfaces:
java.io.Serializable, java.lang.Comparable<EByteOrderMark>

public enum EByteOrderMark
extends java.lang.Enum<EByteOrderMark>

Enumeration of the UTF byte order marks (BOMs). The actual values are taken from http://unicode.org/faq/utf_bom.html.

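The constants of this enum are ordered so that a BOM that is a prefix of another BOM comes after it, i.e. the UTF-32 constants precede the UTF-16 constants (the UTF-16LE BOM FF FE is a prefix of the UTF-32LE BOM FF FE 00 00). This way the start of the data can be checked against the BOMs in declaration order, and the first match is the correct one.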
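
The ordering rule can be illustrated with a small stand-alone sketch. The enum `BomOrderSketch` and its methods `isPrefixOf` and `firstMatch` are hypothetical names made up for this illustration and are not part of this library; the byte values are those listed in the Unicode FAQ, not read from the library source.

```java
/** Hypothetical stand-in enum, used only to illustrate the ordering rule. */
enum BomOrderSketch {
    // Longer BOMs come first: FF FE (UTF-16LE) is a prefix of FF FE 00 00 (UTF-32LE).
    UTF_32BE(0x00, 0x00, 0xFE, 0xFF),
    UTF_32LE(0xFF, 0xFE, 0x00, 0x00),
    UTF_16BE(0xFE, 0xFF),
    UTF_16LE(0xFF, 0xFE),
    UTF_8_BOM(0xEF, 0xBB, 0xBF);

    private final byte[] bom;

    BomOrderSketch(int... values) {
        bom = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            bom[i] = (byte) values[i];
        }
    }

    /** Returns true if the given data starts with this constant's BOM. */
    boolean isPrefixOf(byte[] data) {
        if (data.length < bom.length) {
            return false;
        }
        for (int i = 0; i < bom.length; i++) {
            if (data[i] != bom[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns the first constant, in declaration order, whose BOM matches, or null. */
    static BomOrderSketch firstMatch(byte[] data) {
        for (BomOrderSketch candidate : values()) {
            if (candidate.isPrefixOf(data)) {
                return candidate;
            }
        }
        return null;
    }
}
```

If the UTF-16 constants were declared first in this sketch, data starting with FF FE 00 00 would be reported as UTF-16LE, because the shorter BOM would match before the longer one was tried.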

Version:
$Rev: 29722 $
Author:
hummelb, $Author: deissenb $
Rating:
GREEN Hash: 2AAB6CBCE60BACE98E4803B711962593

Enum Constant Summary
UTF_16BE
          UTF-16 with big endian encoding.
UTF_16LE
          UTF-16 with little endian encoding.
UTF_32BE
          UTF-32 with big endian encoding.
UTF_32LE
          UTF-32 with little endian encoding.
UTF_8_BOM
          UTF-8.
 
Field Summary
static int MAX_BOM_LENGTH
          The maximal length of a BOM.
 
Method Summary
static EByteOrderMark determineBOM(byte[] data)
          This method checks the start of the provided data array to find a BOM.
 byte[] getBOM()
          Returns the byte order mark.
 int getBOMLength()
          Returns the size of the BOM in bytes.
 java.lang.String getEncoding()
          Returns the encoding.
static EByteOrderMark valueOf(java.lang.String name)
          Returns the enum constant of this type with the specified name.
static EByteOrderMark[] values()
          Returns an array containing the constants of this enum type, in the order they are declared.
 
Methods inherited from class java.lang.Enum
clone, compareTo, equals, finalize, getDeclaringClass, hashCode, name, ordinal, toString, valueOf
 
Methods inherited from class java.lang.Object
getClass, notify, notifyAll, wait, wait, wait
 

Enum Constant Detail

UTF_32BE

public static final EByteOrderMark UTF_32BE
UTF-32 with big endian encoding.


UTF_32LE

public static final EByteOrderMark UTF_32LE
UTF-32 with little endian encoding.


UTF_16BE

public static final EByteOrderMark UTF_16BE
UTF-16 with big endian encoding.


UTF_16LE

public static final EByteOrderMark UTF_16LE
UTF-16 with little endian encoding.


UTF_8_BOM

public static final EByteOrderMark UTF_8_BOM
UTF-8. Note that for UTF-8 the endianness is not relevant and that the BOM is optional.

Field Detail

MAX_BOM_LENGTH

public static final int MAX_BOM_LENGTH
The maximal length of a BOM.

See Also:
Constant Field Values
Method Detail

values

public static EByteOrderMark[] values()
Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
for (EByteOrderMark c : EByteOrderMark.values())
    System.out.println(c);

Returns:
an array containing the constants of this enum type, in the order they are declared

valueOf

public static EByteOrderMark valueOf(java.lang.String name)
Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)

Parameters:
name - the name of the enum constant to be returned.
Returns:
the enum constant with the specified name
Throws:
java.lang.IllegalArgumentException - if this enum type has no constant with the specified name
java.lang.NullPointerException - if the argument is null

getEncoding

public java.lang.String getEncoding()
Returns the encoding.


getBOM

public byte[] getBOM()
Returns the byte order mark. This returns a copy, so the array may be modified.
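
The copy semantics described above can be sketched as follows. This is only an illustration of the defensive-copy idea under the assumption that the BOM is held in a private array; `BomCopySketch` is a hypothetical class, and the actual implementation may differ.

```java
/** Hypothetical class illustrating a getter that hands out a copy of its array. */
final class BomCopySketch {
    // UTF-8 BOM bytes as listed in the Unicode FAQ.
    private static final byte[] UTF_8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    private BomCopySketch() {
        // Utility class, not meant to be instantiated.
    }

    /** Returns a fresh copy, so callers cannot alter the stored bytes. */
    static byte[] getBOM() {
        return UTF_8_BOM.clone();
    }
}
```

Because each call returns a new array, a caller that overwrites the returned bytes does not affect later calls.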


getBOMLength

public int getBOMLength()
Returns the size of the BOM in bytes.


determineBOM

public static EByteOrderMark determineBOM(byte[] data)
Checks the start of the provided data array for a BOM. If a BOM is found, the corresponding enum value is returned; otherwise null is returned. If possible, the provided data should be at least 4 bytes long, since otherwise the encoding might not be detected correctly. However, the method also works with shorter arrays (e.g. if a file consists of only 3 bytes).
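
A caller reading from a stream first has to collect the leading bytes. The following is a minimal sketch of that step using only the standard library; `BomHeadSketch` and `readHead` are hypothetical names and not part of this library. The returned array, which may be shorter than 4 bytes for very small inputs, is what would then be passed to determineBOM.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper that collects the first bytes of a stream for BOM detection. */
final class BomHeadSketch {

    private BomHeadSketch() {
        // Utility class, not meant to be instantiated.
    }

    /**
     * Reads up to maxLength bytes from the stream. The result is shorter
     * than maxLength only if the stream ends early.
     */
    static byte[] readHead(InputStream in, int maxLength) throws IOException {
        byte[] buffer = new byte[maxLength];
        int filled = 0;
        while (filled < maxLength) {
            // A single read may return fewer bytes than requested, so loop.
            int n = in.read(buffer, filled, maxLength - filled);
            if (n < 0) {
                break; // end of stream
            }
            filled += n;
        }
        return Arrays.copyOf(buffer, filled);
    }
}
```

Passing 4 as maxLength follows the recommendation above; the constant MAX_BOM_LENGTH appears intended for this purpose, although its value is not stated on this page.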


TUM CCSM Commons - 2.7