Unicode is a universal character encoding standard designed to consistently and uniquely encode the characters used in written languages throughout the world. It is capable of representing most of the world's written languages.

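The Unicode standard assigns each character a numeric value, called a code point, which is conventionally written in hexadecimal.

For example, the value 0x0041 (written U+0041 in Unicode notation) represents the letter A.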
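
The mapping between the value 0x0041 and the letter A can be checked directly in Java. The following is a minimal sketch; the class name CodePointDemo and the helper toUPlus are illustrative names, not part of any library.

```java
class CodePointDemo {
    // Formats a code point in the conventional U+XXXX notation.
    static String toUPlus(int codePoint) {
        return String.format("U+%04X", codePoint);
    }

    public static void main(String[] args) {
        char fromEscape = '\u0041';       // Unicode escape for the value 0x0041
        char fromNumber = (char) 0x0041;  // the same value written as a hex integer

        System.out.println(fromEscape);               // A
        System.out.println(fromEscape == fromNumber); // true
        System.out.println((int) 'A');                // 65 (decimal form of 0x0041)
        System.out.println(toUPlus('A'));             // U+0041
    }
}
```

Both literals produce the same char, because the escape and the cast are two ways of writing the same numeric value.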

  • Problems with older character sets

    Before Unicode, each language or region had its own character encoding standard. This caused two problems:

    • A particular code value corresponds to different letters in the various language standards.

    • The encodings for languages with large character sets have variable length. Some common characters are encoded as single bytes, while others require two or more bytes.
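
    Both problems can be reproduced with Java's charset API. The sketch below decodes the same byte under two legacy encodings, then measures how many bytes a few characters occupy. It assumes the runtime provides the ISO-8859-5 (Cyrillic) charset, which is common but not guaranteed by the Java specification. UTF-8 is used for the length check only because it is always available; it stands in for the legacy multi-byte encodings described above. The class and method names are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LegacyEncodingDemo {
    // Decodes a single byte value using the given charset.
    static String decodeByte(int value, Charset charset) {
        return new String(new byte[] { (byte) value }, charset);
    }

    // Returns how many bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        // Assumption: ISO-8859-5 is installed; forName throws
        // UnsupportedCharsetException if it is not.
        Charset cyrillic = Charset.forName("ISO-8859-5");

        // Problem 1: one byte value, two different letters.
        String asLatin = decodeByte(0xE9, latin1);      // U+00E9, Latin small e with acute
        String asCyrillic = decodeByte(0xE9, cyrillic); // U+0449, Cyrillic small shcha
        System.out.println(asLatin.equals(asCyrillic)); // false

        // Problem 2: characters need different numbers of bytes.
        System.out.println(encodedLength("A", StandardCharsets.UTF_8));      // 1
        System.out.println(encodedLength("\u00E9", StandardCharsets.UTF_8)); // 2
        System.out.println(encodedLength("\u20AC", StandardCharsets.UTF_8)); // 3
    }
}
```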

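  • Solution to the above problems

    To solve these problems, a new standard, Unicode, was developed. When Java was designed, every Unicode character fit in 16 bits, so Java's char type is 2 bytes wide. Unicode has since grown beyond 16 bits (code points now run up to U+10FFFF), so a Java char holds one UTF-16 code unit, and a character above U+FFFF is stored as a pair of chars. The range of a single char is: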

    • lowest value: \u0000

    • highest value: \uFFFF
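
    The char range can be read from the constants in java.lang.Character. The sketch below also shows what happens to a character above \uFFFF, using U+1F600 as an arbitrarily chosen example. The class name CharRangeDemo and the helper charsNeeded are illustrative, and Character.BYTES requires Java 8 or later.

```java
class CharRangeDemo {
    // Number of Java chars needed to store the given code point (1 or 2).
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        System.out.println((int) Character.MIN_VALUE); // 0     (lowest char value)
        System.out.println((int) Character.MAX_VALUE); // 65535 (0xFFFF, highest char value)
        System.out.println(Character.BYTES);           // 2     (size of a char in bytes)

        // A code point above 0xFFFF does not fit in one char.
        int codePoint = 0x1F600;
        String text = new String(Character.toChars(codePoint));
        System.out.println(charsNeeded(codePoint));               // 2
        System.out.println(text.length());                        // 2 chars (a surrogate pair)
        System.out.println(text.codePointCount(0, text.length())); // 1 character
    }
}
```

    Code that must handle such characters usually works with int code points (for example via String.codePointAt) rather than individual char values.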

  • Why does Java use Unicode?

    For a computer system to store text in a form that humans can read, there must be a code that maps characters to numbers.

    The central objective of Unicode is to unify the different language encoding schemes, and so avoid confusion among computer systems that use limited encoding standards such as ASCII and EBCDIC.
