Binary codes are used to represent characters because computers use binary data to store and process information. Characters are transformed into a series of 1s and 0s, which can be easily stored and processed by computers. All of the characters that a computer can use are called a character set.
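As a minimal sketch of this idea, the Python snippet below turns each character of a string into its binary code. The function name to_binary and the default width of 8 bits are assumptions chosen for illustration; they do not come from any particular standard.

```python
def to_binary(text, bits=8):
    """Return each character's numeric code as a fixed-width binary string."""
    return [format(ord(ch), f"0{bits}b") for ch in text]


# 'H' has code 72 and 'i' has code 105
print(to_binary("Hi"))  # ['01001000', '01101001']
```

Here ord() looks up the number assigned to a character, and format() writes that number out as 1s and 0s.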
A character set is a standardised collection of characters that includes letters, numbers, symbols, and punctuation. The characters in a character set determine what a computer can represent and display. Each character in the set is stored as a binary code.
There are 3 main character sets:
ASCII - 7 bit (128 possible characters)
Extended ASCII - 8 bit (256 possible characters)
Unicode - 16 bit in its original form (65,536 possible characters)
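The character counts above follow from the number of bits: n bits give 2 to the power n different patterns. A short sketch of that arithmetic, where character_count is a hypothetical helper name used only for this example:

```python
def character_count(bits):
    """Number of distinct binary patterns available with the given number of bits."""
    return 2 ** bits


for name, bits in [("ASCII", 7), ("Extended ASCII", 8), ("Unicode (16 bit)", 16)]:
    print(f"{name}: {bits} bits -> {character_count(bits):,} possible characters")
```

Each extra bit doubles the number of patterns, which is why moving from 7 bits to 8 bits takes the total from 128 to 256.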
ASCII uses seven bits, giving a character set of 128 characters. The characters are listed in a table, called the ASCII table. The 128 characters include:
33 control codes (codes 0-31, plus code 127 for delete)
26 upper case letters
26 lower case letters
32 punctuation marks and symbols, plus the space character
10 numeric digits (0-9)
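One way to check a breakdown like this is to count the codes directly. The sketch below is only an illustration: the category names and the category function are assumptions, code 127 (delete) is counted as a control code, and the space character is grouped with punctuation and symbols.

```python
import string
from collections import Counter


def category(code):
    """Place one ASCII code (0-127) into a broad category."""
    ch = chr(code)
    if code < 32 or code == 127:
        return "control code"
    if ch in string.ascii_uppercase:
        return "upper case letter"
    if ch in string.ascii_lowercase:
        return "lower case letter"
    if ch in string.digits:
        return "numeric digit"
    return "punctuation, symbol or space"


counts = Counter(category(code) for code in range(128))
for name, number in counts.items():
    print(f"{number:>3} {name}")
print(f"{sum(counts.values()):>3} in total")
```

Counted this way, the categories add up to exactly 128, which matches the number of patterns available in seven bits.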
Example of an ASCII character set
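A few rows of the ASCII table can also be produced in code. This is a small sketch rather than a full table; ascii_rows is a hypothetical helper, and the range 65 to 69 is chosen only because it covers the letters A to E.

```python
def ascii_rows(start, stop):
    """Return (decimal code, 7-bit binary, character) for codes start..stop-1."""
    return [(code, format(code, "07b"), chr(code)) for code in range(start, stop)]


# Print the rows for A to E: decimal code, binary code, character
for code, bits, ch in ascii_rows(65, 70):
    print(f"{code:>3}  {bits}  {ch}")
```

The first row printed is 65, 1000001, A, showing the decimal code, its 7-bit binary form, and the character it stands for.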
Extended ASCII uses one more bit than standard ASCII. Standard ASCII represents each character with 7 bits, which allows 128 characters. This is not enough for applications that need more symbols or characters from other languages. Using 8 bits doubles the number of available codes to 256, and the extra 128 codes are used for additional characters such as special symbols, accented letters, and characters from other languages.
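The effect of the extra bit can be seen by encoding an accented letter. This sketch assumes Latin-1 (ISO 8859-1) as the 8-bit extension, which is only one of several extended ASCII code pages; encode_8bit and in_ascii are hypothetical helper names.

```python
def encode_8bit(ch):
    """Return the 8-bit pattern for one character in Latin-1 (an assumed 8-bit set)."""
    (code,) = ch.encode("latin-1")
    return format(code, "08b")


def in_ascii(ch):
    """True if the character has a code in 7-bit ASCII."""
    try:
        ch.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(encode_8bit("A"), in_ascii("A"))  # 01000001 True
print(encode_8bit("é"), in_ascii("é"))  # 11101001 False
```

The leading bit is 0 for every standard ASCII character and 1 for the 128 characters added by the extension.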
Unicode was originally designed as a 16-bit character set, providing support for 65,536 characters. The modern standard has grown beyond 16 bits and has room for over a million characters, covering most of the world's written languages as well as a wide range of symbols and special characters. This means that Unicode can be used to represent text in almost any written language, making it a widely adopted standard for the internationalisation and localisation of software applications.
The first 128 Unicode characters are the same as ASCII (and the first 256 match the Latin-1 extended set), which provides backward compatibility.
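Python strings are Unicode, so this compatibility can be checked directly. The sketch below uses a hypothetical helper, code_point, to show each character's Unicode number in the usual U+ notation alongside a 16-bit binary pattern.

```python
def code_point(ch):
    """Return a character's Unicode code point in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


# 'A' has the same code (65) in ASCII and in Unicode
print("A".encode("ascii")[0] == ord("A"))  # True

# A Latin letter, an accented letter, a currency symbol and a Chinese character
for ch in ["A", "é", "€", "你"]:
    print(code_point(ch), format(ord(ch), "016b"))
```

The four characters shown all fit within 16 bits; characters added to Unicode later, such as many emoji, have code points above 65,535 and need more than 16 bits.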