Data Representation · 3 question types
Past paper frequency (2018 to 2024)
This topic accounts for approximately 5% of your exam marks.
ASCII/Unicode, sound sampling and pixel/colour depth appear regularly as 2 to 4 mark questions.
Unicode was first published in 1991 to fix the limitations of ASCII, chiefly that ASCII's 7 bits allow only 128 characters. The key change is that Unicode can use far more bits per character.
| Feature | Unicode |
|---|---|
| Bits per character | Depends on the encoding: UTF-8 uses 8 to 32 bits, UTF-16 uses 16 or 32 bits, and UTF-32 always uses 32 bits |
| Number of characters | Well over 2¹⁶ = 65 536: the Unicode code space has room for 1 114 112 code points |
| Covers | Every major writing system in the world: Latin, Greek, Cyrillic, Chinese, Japanese, Korean, Arabic, Hebrew, Hindi, Thai, and many more. Also includes mathematical symbols, technical symbols, and emoji |
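How many bits a character actually occupies depends on which Unicode encoding stores it. The short Python sketch below illustrates this using the standard `str.encode` method; the helper name `bits_per_char` and the sample characters are chosen here purely for illustration.

```python
# Count the bits one character needs in three common Unicode encodings.
def bits_per_char(ch: str) -> dict:
    """Return the number of bits used to store `ch` in each encoding."""
    return {
        # The "-le" variants stop Python from prepending a byte-order mark,
        # so only the character itself is counted.
        "utf-8": len(ch.encode("utf-8")) * 8,
        "utf-16": len(ch.encode("utf-16-le")) * 8,
        "utf-32": len(ch.encode("utf-32-le")) * 8,
    }

# Sample characters: Latin A, e-acute, a Chinese character, an emoji.
for ch in ["A", "\u00e9", "\u4e2d", "\U0001F600"]:
    print(ch, hex(ord(ch)), bits_per_char(ch))
```

In this sketch the letter A needs 8 bits in UTF-8 but 16 in UTF-16, while the emoji needs 32 bits in all three encodings.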
A nice piece of design: the first 128 Unicode code points (0 to 127) are identical to ASCII. In the UTF-8 encoding these characters are also stored as the same single byte, so any text written in plain ASCII is already valid UTF-8. This makes Unicode backwards-compatible with ASCII for English text.
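The compatibility with ASCII can be checked in a few lines of Python. This is a minimal sketch using only built-in functions; the sample string and variable names are illustrative.

```python
text = "Hello, World!"

# ord() returns the Unicode code point of a character.
# For ASCII characters this is the same number as the ASCII code.
code_points = [ord(c) for c in text]

ascii_bytes = text.encode("ascii")   # the text stored as ASCII
utf8_bytes = text.encode("utf-8")    # the same text stored as UTF-8

print(code_points[:5])                   # [72, 101, 108, 108, 111]
print(ascii_bytes == utf8_bytes)         # True: identical bytes
print(list(ascii_bytes) == code_points)  # True: byte values equal code points

# UTF-16 keeps the same code points but stores them in 16-bit units,
# so its bytes do not match the ASCII bytes.
print(text.encode("utf-16-le") == ascii_bytes)  # False
```

The last line shows that the byte-for-byte match is a property of UTF-8 specifically: the code points are shared by every Unicode encoding, but only UTF-8 stores them exactly as ASCII does.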