Data Rep - Coding Systems Flashcards
(46 cards)
ASCII Full Form
American Standard Code for Information Interchange
Limitations of ASCII
- Standard ASCII uses 7 bits, giving only 128 characters (256 in 8-bit extended ASCII), which is not enough to represent every character, number and symbol.
- Because it was designed for English, it does not support the characters used by most other languages.
What are the two common encodings of Unicode?
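- UTF-8: variable width, using 1 to 4 bytes (8-bit units) per character; the first 128 code points match ASCII.
- UTF-16: variable width, using 2 or 4 bytes (16-bit units) per character; a single 16-bit unit gives 65,536 values.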
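As a rough illustration of how the two encodings differ in size, the sketch below counts the bytes Python produces for a few sample characters. The helper name `encoded_lengths` and the choice of characters are assumptions made for this example only.

```python
def encoded_lengths(ch):
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    # "utf-16-le" is used so that no byte-order mark is included in the count.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))

for ch in ["A", "é", "€", "😀"]:
    utf8_len, utf16_len = encoded_lengths(ch)
    print(f"U+{ord(ch):04X}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

Characters in the ASCII range take one byte in UTF-8 but two in UTF-16, while characters outside the first 65,536 code points take four bytes in both.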
What is a parity bit?
- Standard ASCII characters only use 7 bits.
- The MSB (the 8th bit) can be used as a parity bit.
- This is a method of detecting errors during data transmission.
- However, it does not identify all errors: if an even number of bits are flipped, the parity still matches and the error goes undetected.
What is one cause for the corruption of data being transmitted?
- Data is sent on carrier waves, and slight variations in the frequency can mean that a 0 is misinterpreted as a 1 (or vice versa), making the received data unreliable.
- Depending on the nature of the data, even a single wrong bit could be critical and make the data unusable.
How do parity bits work?
- The sender counts the number of 1s in each byte before sending and sets the parity bit accordingly.
- Even parity: the parity bit is set to 1 or 0 so that the total number of 1s in the byte (including the parity bit) is even.
- Odd parity: the parity bit is set to 1 or 0 so that the total number of 1s in the byte (including the parity bit) is odd.
- The received data is then checked, and if the number of 1s is still even/odd as agreed, the data is assumed to have been received correctly.
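The even-parity steps above can be sketched as follows. This is a minimal illustration, assuming bytes are held as strings of "0" and "1" with the parity bit in the MSB position; the function names are hypothetical.

```python
def add_even_parity(seven_bits):
    """Return 8 bits: a parity bit (MSB) followed by the 7 data bits."""
    # Set the parity bit to 1 only if the data holds an odd number of 1s.
    parity_bit = "1" if seven_bits.count("1") % 2 == 1 else "0"
    return parity_bit + seven_bits

def passes_even_parity(byte):
    """Receiver's check: the total number of 1s must be even."""
    return byte.count("1") % 2 == 0

def flip_bit(byte, index):
    """Simulate a transmission error by inverting one bit."""
    flipped = "1" if byte[index] == "0" else "0"
    return byte[:index] + flipped + byte[index + 1:]

sent = add_even_parity("1000011")        # 7-bit ASCII for "C"
one_error = flip_bit(sent, 7)            # a single bit corrupted
two_errors = flip_bit(one_error, 6)      # a second bit corrupted

print(sent, passes_even_parity(sent))
print(one_error, passes_even_parity(one_error))
print(two_errors, passes_even_parity(two_errors))
```

The single error fails the check, but the double error passes it, which is why parity does not identify all errors.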
Majority Voting
- A method of identifying errors in transmission.
- Each bit is sent multiple times (an odd number of times, so there is always a majority).
- If the repeated bits aren’t the same, then majority voting checks which bit occurs most frequently and assumes this to be the correct bit.
- For example, 000 111 110 010 would be 0110, where majority voting occurs for the last 2 bits.
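The example above can be reproduced with a short sketch. It assumes the received groups are separated by spaces and that each bit was sent three times; `majority_decode` is a name chosen for this illustration.

```python
def majority_decode(received, repeats=3):
    """Decode space-separated groups of repeated bits by majority vote."""
    decoded = []
    for group in received.split():
        if len(group) != repeats:
            raise ValueError(f"expected {repeats} bits per group, got {group!r}")
        # The bit that appears in more than half of the copies wins.
        decoded.append("1" if group.count("1") > repeats // 2 else "0")
    return "".join(decoded)

print(majority_decode("000 111 110 010"))  # prints 0110
```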
Advantage and disadvantage of Majority Voting
- The data does not have to be requested again, as majority voting decides the most likely correct bit.
- The volume of bits being transmitted is much larger, increasing time for data transmission.
Check Digits
- A form of redundancy check used for error detection on identification numbers, such as an ISBN-10 number used on books.
- This uses a process called modulo-11:
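- Each digit of the original code is multiplied by a weight (2, 3, 4… starting from the rightmost digit), and the products are added together. The total is divided by 11, and the remainder is subtracted from 11; the result is the check digit.
- The number 23045 becomes 230456 (10 + 12 + 0 + 15 + 12 = 49; 49 mod 11 = 5; 11 - 5 = 6).
- To validate, the check digit (with a weight of 1) is added to the weighted sum of the other digits, and the total is divided by 11. If there is no remainder, the check digit is correct.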
- If not, the data is resent.
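A minimal sketch of the modulo-11 process described above. The function names are hypothetical, and two conventions are assumed from ISBN-10 rather than taken from the cards: a check value of 10 is written as "X", and a value of 11 is treated as 0.

```python
def weighted_sum(digits):
    """Weight the digits 2, 3, 4... starting from the rightmost digit."""
    return sum(int(d) * w for w, d in enumerate(reversed(digits), start=2))

def mod11_check_digit(code):
    """Calculate the check digit for a string of decimal digits."""
    check = (11 - weighted_sum(code) % 11) % 11   # 11 wraps round to 0
    return "X" if check == 10 else str(check)

def is_valid(full_code):
    """Validate a code whose final character is the check digit."""
    body, check = full_code[:-1], full_code[-1]
    check_value = 10 if check == "X" else int(check)
    # The check digit has a weight of 1, so it is added on directly.
    return (weighted_sum(body) + check_value) % 11 == 0

print("23045" + mod11_check_digit("23045"))  # prints 230456
print(is_valid("230456"))                    # prints True
print(is_valid("230546"))                    # prints False (two digits swapped)
```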
Colour Depth
- The amount of memory allocated to each pixel, in bits.
- If 24 bits were allocated to each pixel, it would give you 2^24 combinations or 16,777,216 different colours.
- 24 bits is the most common colour depth, with 8 bits allocated to each primary colour (red, green and blue).
How do you calculate the total storage used by a bitmap image?
- Resolution (width x height in pixels) x Colour Depth (answer in bits)
- 1920 x 1080 x 24 = 49,766,400 bits
- 49,766,400/8 = 6,220,800 bytes (B)
- 6,220,800/1,000,000 = 6.2208 megabytes (MB)
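The same calculation as a short sketch, using 1 MB = 1,000,000 bytes as in the working above. The function names are illustrative, and metadata is ignored.

```python
def colours_available(colour_depth_bits):
    """Number of distinct colours a given colour depth can represent."""
    return 2 ** colour_depth_bits

def bitmap_size_bits(width_px, height_px, colour_depth_bits):
    """Storage for the pixel data only, in bits."""
    return width_px * height_px * colour_depth_bits

size_bits = bitmap_size_bits(1920, 1080, 24)
size_bytes = size_bits // 8
size_mb = size_bytes / 1_000_000

print(colours_available(24))          # prints 16777216
print(size_bits, size_bytes, size_mb) # prints 49766400 6220800 6.2208
```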
State what metadata is, and include 4 examples
- Data about data: information that describes a file, such as an image.
- Includes information such as:
- File type (png, jpeg…)
- Resolution
- Colour Depth
- Location picture was taken
- Date and time picture was taken
State what an ADC is and describe how it works
- Analogue to digital converter, converts analogue signals to digital bit patterns.
- Measures the amplitude of an analogue sound signal at regular intervals and records each value as a bit pattern. This is called sampling, and the number of samples taken per second is the sampling rate (measured in hertz).
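To make the sampling idea concrete, here is a rough software sketch of what an ADC does. It assumes the analogue input is a pure sine wave with amplitude between -1 and 1; a real ADC measures a voltage, and the function name and parameters are made up for this example.

```python
import math

def sample_signal(frequency_hz, sampling_rate_hz, duration_s, resolution_bits):
    """Sample a sine wave and return each sample as a bit pattern."""
    levels = 2 ** resolution_bits              # number of discrete values available
    n_samples = int(sampling_rate_hz * duration_s)
    samples = []
    for n in range(n_samples):
        t = n / sampling_rate_hz               # time at which this sample is taken
        amplitude = math.sin(2 * math.pi * frequency_hz * t)
        # Map the range -1..1 onto the nearest whole level in 0..levels-1.
        level = round((amplitude + 1) / 2 * (levels - 1))
        samples.append(format(level, f"0{resolution_bits}b"))
    return samples

# A 1 Hz wave sampled 8 times per second for 1 second, with 3 bits per sample.
patterns = sample_signal(1, 8, 1, 3)
print(patterns)
```

Raising `sampling_rate_hz` produces more samples per second, and raising `resolution_bits` gives each sample more possible levels.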
DAC
- Digital to analogue converter, converts digital bit patterns which represent sound into analogue signals.
- Reads a bit pattern representing an analogue signal, and outputs it as an alternating analogue electrical signal.
What is MIDI? How does it ‘record’ sound?
- Musical Instrument Digital Interface, a standard (protocol) for exchanging musical performance data between digital instruments and computers, rather than a recording of the sound wave itself.
- Stores sound as a series of ‘event messages’.
- Each event message is a series of instructions used to recreate a piece of music. They contain information such as:
- The duration of a note.
- The instrument with which a note is played.
- How loud a note is.
- If a note should be sustained.
Analogue vs Digital data
- Analogue data is continuous, meaning it can take any value at any point in time.
- Digital data is discrete, so it can only take one of a fixed set of values at any given time.
- Analogue data can change value as frequently as required; digital data can only change at specified time intervals.
How do computers represent sound?
- As a sequence of sound samples, each stored as a bit pattern representing a discrete value.
Define sampling resolution, and the drawback to a high sampling resolution
- The number of bits allocated to each sample.
- A higher sampling resolution results in higher-quality audio, but an increased file size.
Calculating the size of a sound file (including the units of each variable)
- Duration of Sample (seconds) x Sampling Rate (hertz) x Sample Resolution (bits)
- A minute-long, 44 kHz audio file with a sample resolution of 24 bits:
60 x 44,000 x 24 = 63,360,000 bits
63,360,000 / 8 = 7,920,000 bytes = 7.92 MB
- Additional metadata can add to this total file size.
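The formula above as a small sketch, assuming a single (mono) channel and no metadata; the function name is illustrative.

```python
def sound_file_size_bits(duration_s, sampling_rate_hz, resolution_bits):
    """Size of the raw sample data in bits (one channel, no metadata)."""
    return duration_s * sampling_rate_hz * resolution_bits

audio_bits = sound_file_size_bits(60, 44_000, 24)
audio_bytes = audio_bits // 8
audio_mb = audio_bytes / 1_000_000

print(audio_bits, audio_bytes, audio_mb)  # prints 63360000 7920000 7.92
```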
The Nyquist Theorem, and its implication on the average sampling rate for audio
- The sampling rate of a digital audio file must be at least twice the highest frequency in the sound in order to represent the sound accurately.
- Hence, we often use about 44 kHz (44.1 kHz for CD audio) to sample sounds, as it is just above twice the upper limit of human hearing, roughly 20 kHz.
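A short sketch of what the theorem implies. The first function gives the minimum sampling rate for a given highest frequency. The second uses the standard aliasing relationship, which the cards do not cover, to estimate the frequency a pure tone would appear to have after sampling. Both function names are hypothetical.

```python
def minimum_sampling_rate(highest_frequency_hz):
    """Nyquist: sample at least twice as fast as the highest frequency."""
    return 2 * highest_frequency_hz

def apparent_frequency(tone_hz, sampling_rate_hz):
    """Frequency a pure tone appears to have once it has been sampled."""
    # Tones above half the sampling rate fold back to a lower frequency.
    nearest_multiple = sampling_rate_hz * round(tone_hz / sampling_rate_hz)
    return abs(tone_hz - nearest_multiple)

print(minimum_sampling_rate(20_000))       # prints 40000
print(apparent_frequency(10_000, 44_000))  # prints 10000 (captured correctly)
print(apparent_frequency(30_000, 44_000))  # prints 14000 (misrepresented)
```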
Advantages and Disadvantages of MIDI
- It allows for easy manipulation of music (e.g. the duration of a note can be altered), without a loss of quality.
- MIDI files are smaller in size than sampled audio files, and are lossless.
- MIDI cannot be used for storing speech, and sometimes results in a less realistic sound than sampled recordings.
Lossless Compression
- A data compression technique that employs an algorithm to reduce the size of a file without permanently discarding any of the original data.
- The original data can be perfectly reconstructed from the compressed data.
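The round trip can be demonstrated with Python's built-in `zlib` module. The cards do not name a particular algorithm, so zlib is used here only as one example of a lossless method, and the sample data is invented.

```python
import zlib

# Highly repetitive data compresses well.
original = b"AAAAAAAABBBBBBBBCCCCCCCC" * 100

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print(restored == original)  # prints True: nothing was lost
```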
Lossy Compression
- A data compression technique that permanently discards non-essential data from a file, reducing the accuracy of the data in exchange for a significant decrease in file size.
- Data removed from the original file is non-recoverable.
Why do we use data compression techniques?
- To reduce file sizes.
- This reduces the storage requirements of a file.
- This makes it quicker to transmit the data in these files.