Encyclopedia > Byte Order Mark

  Article Content

Byte Order Mark

A Byte Order Mark (BOM) is the character at code point FEFF (ZERO-WIDTH NO-BREAK SPACE), when that character is used to denote the Endianness of an encoded string of UCS/Unicode characters.

A BOM can be used to indicate that unlabeled text is UTF-16 or UTF-8 encoded, as well as indicating the byte-order of UTF-16 text, whether labeled or not.

In UTF-16, a BOM is expressed as the 8-bit byte sequence FE FF at the beginning of the encoded string, to indicate that the encoded characters that follow it use big-endian byte order; or it is expressed as the byte sequence FF FE to indicate little-endian order.

UTF-8 text can also use a BOM, although this is rare, since UTF-8 prescribes a fixed byte order, and since UTF-8 is often assumed or implicit, so it doesn't need a signature. The UTF-8 representation of the BOM is the byte sequence EF BB BF.

External Links



All Wikipedia text is available under the terms of the GNU Free Documentation License

 
  Search Encyclopedia

Search over one million articles, find something about almost anything!
 
 
  
  Featured Article
Dennis Gabor

...     Contents Dennis Gabor Dennis Gabor (Gábor Dénes) (1900-1979) was a Hungarian physicist. He invented holography in 1947, for ...

 
 
 
This page was created in 41 ms