Information Coding Systems Flashcards
(21 cards)
What is a character?
Single unit of textual information
What is a character set?
A mapping from a set of characters to the binary numbers (character codes) that a computer system uses to represent them
What is a character code?
A UNIQUE integer representation of a character that is interpreted by a computer
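A minimal sketch of the character/character-code relationship, using Python's built-in `ord` and `chr`. The choice of "A" is only an example.

```python
# Look up the character code (an integer) for a character, and back again.
code = ord("A")              # character -> integer code
char = chr(code)             # integer code -> character
bits = format(code, "07b")   # the same code written as 7 binary digits

print(code, char, bits)      # 65 A 1000001
```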
What is ASCII?
A character set created to encode alphanumeric characters (English language) as well as special characters (punctuation symbols + non printable control codes)
Uses 7 bits to store each character
In 8-bit systems, the extra bit can be used as a parity bit
What is Unicode?
- A character set developed with the aim of representing every possible character.
- It uses up to 32 bits (4 bytes) to represent each character, depending on the encoding
- First 128 characters are the same as ASCII making it backwards-compatible
- UTF-8 and UTF-16 use variable numbers of bytes to represent characters
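A short sketch of the variable-length point for UTF-8, using Python's built-in `str.encode`. The sample characters and the helper name `utf8_length` are chosen here for illustration only.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes UTF-8 needs to store one character."""
    return len(ch.encode("utf-8"))

samples = {
    "A": "A",                    # U+0041, also in ASCII
    "e-acute": "\u00e9",         # U+00E9
    "euro sign": "\u20ac",       # U+20AC
    "emoji": "\U0001F600",       # U+1F600
}
lengths = {name: utf8_length(ch) for name, ch in samples.items()}
print(lengths)  # {'A': 1, 'e-acute': 2, 'euro sign': 3, 'emoji': 4}

# Backwards compatibility: an ASCII character has the same byte in UTF-8.
same_as_ascii = "A".encode("utf-8") == "A".encode("ascii")
print(same_as_ascii)  # True
```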
Advantages of parity bits?
- Simplest implementation of error detection
- Very little overhead as it only involves one additional bit.
- Can make use of spare bit in ASCII
How is a parity bit used by a computer?
- The sender generates a parity bit, set to 1 or 0 so that the total number of 1s satisfies the type of parity in use (odd or even)
- The data, including the parity bit, is sent to the receiver
- The receiver counts the number of 1s
- If the number of 1s doesn’t match the expected parity (odd or even), corruption is detected and the data is usually retransmitted
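The steps above can be sketched as follows, assuming even parity and bits held as a string of "0"/"1" characters. The function names are made up for this example.

```python
def add_even_parity(data_bits: str) -> str:
    """Append a parity bit so the total number of 1s is even."""
    parity = data_bits.count("1") % 2
    return data_bits + str(parity)

def passes_even_parity(received: str) -> bool:
    """Receiver's check: count the 1s and test for an even total."""
    return received.count("1") % 2 == 0

def flip(bits: str, i: int) -> str:
    """Simulate corruption by inverting the bit at position i."""
    flipped = "1" if bits[i] == "0" else "0"
    return bits[:i] + flipped + bits[i + 1:]

sent = add_even_parity("1000001")      # 7-bit ASCII for 'A'
one_error = flip(sent, 0)
two_errors = flip(one_error, 1)

print(sent)                            # 10000010
print(passes_even_parity(sent))        # True
print(passes_even_parity(one_error))   # False: corruption detected
print(passes_even_parity(two_errors))  # True: two flips go unnoticed
```

The last line shows why an even number of corrupted bits is not detected: the count of 1s is still even.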
What were the limitations of ASCII?
7 bits per character, so it can't represent more than 128 characters, mostly those used in English
Limited usefulness in countries whose languages use different characters/symbols
Extended ASCII was introduced, using the full 8 bits to represent up to 256 characters
Still not enough to represent all the symbols used by different languages around the world
Extended ASCII has different versions for different languages
Documents are unreadable if a text-reading application uses the incorrect character encoding
What are the advantages of the parity bit?
- Relatively easy to implement
- Does not require significant extra data to be included in the transmission or retrieval
What are the disadvantages of the parity bit?
- If an even number of bits are corrupted, parity is not affected and the corruption is not detected
- Not possible to identify which bit has been corrupted
- Data correction not possible
What is majority voting?
Each bit is sent multiple times (an odd number of copies, at least 3). The receiver checks each group of copies; if they are not all the same, it takes the value that appears most often as the correct value of the bit
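A minimal sketch of majority voting, assuming 3 copies of each bit and bits held as a string. The names `encode` and `decode` are illustrative.

```python
def encode(bits: str, copies: int = 3) -> str:
    """Send every bit `copies` times in a row."""
    return "".join(bit * copies for bit in bits)

def decode(received: str, copies: int = 3) -> str:
    """For each group of copies, keep whichever value occurs most often."""
    out = []
    for i in range(0, len(received), copies):
        group = received[i:i + copies]
        out.append("1" if group.count("1") > copies // 2 else "0")
    return "".join(out)

sent = encode("101")
print(sent)                      # 111000111

one_flip = "110000111"           # one bit corrupted in the first group
print(decode(one_flip))          # 101 (corrected)

two_flips = "100000111"          # two bits corrupted in the first group
print(decode(two_flips))         # 001 (wrongly "corrected")
```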
What are the advantages of majority voting?
- Allows for error correction as it knows exactly where the error occurred
- No need for retransmission
What are the disadvantages of majority voting?
- At least triples the amount of data that is sent
- Not entirely reliable: if 2 of the 3 bits in a group are corrupted, the receiver "corrects" the bit to the wrong value
What is a check digit?
An extra digit appended to a number to confirm that the number has been transmitted or retrieved correctly
An algorithm is applied to the number and the result is compared to the check digit
If the values do not match, an error has occurred and the number is retransmitted/rescanned
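A sketch of the idea using a deliberately simple, made-up scheme: the check digit is chosen so that all the digits add up to a multiple of 10. This is an assumption for illustration, not a real standard; schemes used in practice typically weight the digits so that swapped digits are also caught.

```python
def check_digit(number: str) -> int:
    """Digit that makes the sum of all digits a multiple of 10."""
    total = sum(int(d) for d in number)
    return (10 - total % 10) % 10

def append_check_digit(number: str) -> str:
    return number + str(check_digit(number))

def is_valid(full_number: str) -> bool:
    """Re-apply the algorithm and compare with the digit that was sent."""
    body, digit = full_number[:-1], int(full_number[-1])
    return check_digit(body) == digit

code = append_check_digit("12345")
print(code)                 # 123455
print(is_valid(code))       # True
print(is_valid("123465"))   # False: one digit was changed
```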
What is a checksum?
Similar to a check digit, except that a larger value is transmitted with the data rather than a single digit
The checksum is derived by applying an algorithm to the data before transmission or storage
The same algorithm is applied to the received data; if the result matches the checksum, the data is assumed to be unchanged
What makes a good checksum or check digit algorithm?
It produces a very different value for even a small change to the original data, so that corruption is obvious
What are the uses of a checksum?
- Confirm integrity of whole files or applications
- The owner/publisher provides a checksum so that users can check that a file/application downloaded correctly and is not corrupted
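A sketch of checking a download against a published value. It assumes the publisher lists a SHA-256 digest (a common choice, but an assumption here), and the byte strings stand in for real file contents.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Digest of the data as a 64-character hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

original = b"installer contents v1.0"        # hypothetical file contents
published = sha256_hex(original)             # value the publisher would list

downloaded_ok = b"installer contents v1.0"
downloaded_bad = b"installer contents v1.1"  # a single character differs

print(sha256_hex(downloaded_ok) == published)    # True
print(sha256_hex(downloaded_bad) == published)   # False
```

Printing the two digests side by side shows that a one-character change produces a completely different value.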
How can data become corrupted?
- Cosmic rays
- Magnets changing magnetic media
- Scratches on a CD
- Cables become damaged
How does a checksum work?
- Checksum is calculated from the binary data before it is transmitted using a checksum algorithm
- Checksum is recalculated when the packet is received
- If the checksum received in the packet matches the recalculated checksum, the data is assumed to be correct. If it doesn’t match, the data is corrupted
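The three steps can be sketched as below, assuming CRC-32 (via Python's `zlib.crc32`) as the checksum algorithm. The "packet" here is just a pair of values, and the function names are illustrative.

```python
import zlib

def make_packet(data: bytes) -> tuple:
    """Sender: calculate the checksum and send it alongside the data."""
    return data, zlib.crc32(data)

def is_intact(data: bytes, checksum: int) -> bool:
    """Receiver: recalculate the checksum and compare with the one received."""
    return zlib.crc32(data) == checksum

data, checksum = make_packet(b"hello world")
print(is_intact(data, checksum))             # True

corrupted = b"hellp world"                   # one byte changed in transit
print(is_intact(corrupted, checksum))        # False
```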
Decimal digit character representations
- Character representations of decimal numbers take up more space to store and more bandwidth to transmit, as they require 7 or 8 bits per digit
- It is not possible to perform mathematical operations on the character representation of digits without casting them to a numeric type, so it is important to cast first
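A small sketch of both points, using the value 255 as an arbitrary example and assuming 8 bits per character.

```python
text = "255"          # digits stored as characters
number = int(text)    # cast to a numeric type

# Without casting, "+" joins the characters instead of adding the values.
print(text + text)        # 255255
print(number + number)    # 510

# Storage: one 8-bit character per digit vs. a pure binary value.
bits_as_text = len(text.encode("ascii")) * 8
bits_as_number = number.bit_length()
print(bits_as_text, bits_as_number)   # 24 8
```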
Why is error checking important?
Data can be corrupted during transmission or storage; error checking detects this so that corrupt data is not used and can be retransmitted or corrected