Unicode is a character encoding standard that attempts to assign a single unique integer, called a code point, to every character of every written human language. Code points range from 0 to 10FFFF hexadecimal (over 1.1 million code points) and correspond to those of the Universal Character Set (standard ISO 10646). The first 65,536 code points (0-FFFF hexadecimal) form a subset known as the Basic Multilingual Plane, or BMP, which is sufficient for most text in modern languages; the main exceptions are the less common characters of the character-rich Chinese, Japanese and Korean scripts, which lie outside the BMP. UCS-2 is a fixed-width 16-bit representation of every code point in the BMP. The first 256 code points (0-FF hexadecimal) match those of ISO 8859-1, often called Latin-1, the most popular 8-bit character encoding in the Western world. The first 128 code points (0-7F hexadecimal, 0-127 decimal) match ASCII (American Standard Code for Information Interchange), a 7-bit code whose printable characters occupy the range 20 to 7E hexadecimal (32-126 decimal). ASCII has traditionally been used for plain text encoding in English.
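
The nesting of these ranges (ASCII inside Latin-1, Latin-1 inside the BMP, the BMP inside the full code space) can be illustrated with a short Python sketch. The function names `classify_code_point` and `ucs2_bytes` are hypothetical helpers invented for this illustration, and the big-endian byte order in `ucs2_bytes` is an assumption chosen for readability rather than something the text specifies.

```python
def classify_code_point(cp: int) -> str:
    """Return the smallest range described above that contains cp."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    if cp <= 0x7F:
        return "ASCII"          # first 128 code points
    if cp <= 0xFF:
        return "Latin-1"        # first 256 code points
    if cp <= 0xFFFF:
        return "BMP"            # first 65,536 code points
    return "supplementary"      # everything beyond the BMP


def ucs2_bytes(ch: str) -> bytes:
    """Encode one BMP character as a single 16-bit unit, UCS-2 style.

    Big-endian byte order is assumed here purely for illustration.
    """
    cp = ord(ch)
    if cp > 0xFFFF:
        raise ValueError("UCS-2 cannot represent code points beyond the BMP")
    return cp.to_bytes(2, "big")


if __name__ == "__main__":
    # One sample character from each range, written as escapes.
    for ch in ["A", "\u00e9", "\u4e2d", "\U0001F600"]:
        print(f"U+{ord(ch):04X}", classify_code_point(ord(ch)))
```

Because every range starts at 0, a single chain of upper-bound comparisons is enough to find the narrowest range a code point belongs to. The last sample character lies outside the BMP, so passing it to `ucs2_bytes` raises `ValueError`.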