1.3.1 Compression, Encryption, Hashing Flashcards
(20 cards)
Compression
- process used to reduce file size so that a file takes up less storage space or transfers faster over a network
- increases the number of files that can be transferred in a given time
- allows faster download
- downloading a compressed file is faster than downloading the full version
Lossy compression
- reduces file size while also removing some information
- slightly reduces quality but significantly reduces file size [1], e.g. images can tolerate some reduction in quality
- the loss of quality shows as e.g. more pixelated images or less clear audio recordings
- suitable for images, audio and video
- not suitable for text documents as important info may be lost and text may become unreadable
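A minimal sketch of why lossy compression cannot be undone, assuming 8-bit greyscale pixel values. The helpers `quantise` and `restore` are hypothetical names chosen for illustration; real image formats use far more sophisticated techniques.

```python
def quantise(pixels, bits_kept=4):
    """Keep only the top `bits_kept` bits of each 8-bit value (lossy)."""
    shift = 8 - bits_kept
    return [p >> shift for p in pixels]

def restore(codes, bits_kept=4):
    """Scale stored codes back to the 0-255 range; the dropped bits are gone."""
    shift = 8 - bits_kept
    return [c << shift for c in codes]

original = [200, 201, 203, 90, 95, 17]
codes = quantise(original)   # each value now needs 4 bits instead of 8
approx = restore(codes)      # [192, 192, 192, 80, 80, 16]: close, but not the original
```

The restored values are near the originals, which is acceptable for a picture, but the same treatment applied to character codes would change the letters in a text document.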
Lossless compression
- reduces file size without losing any information maintaining the original data [1]
- the original file can be recovered/recreated when it is uncompressed
- suitable for executable files and documents or vector style images, cartoons and logos
- maintains data integrity
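As an illustration, Python's standard `zlib` module performs lossless compression, so a round trip gives back exactly the original bytes. The sample text is an assumption chosen because it is highly repetitive.

```python
import zlib

text = b"the quick brown fox jumps over the lazy dog. " * 50
compressed = zlib.compress(text)
restored = zlib.decompress(compressed)

print(len(text), len(compressed))  # the compressed form is much smaller here
print(restored == text)            # True: nothing was lost
```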
Lossless compression types
- run length encoding
- dictionary encoding
Run length encoding
- a method of lossless compression
- Condenses identical elements to a single occurrence with its count next to it
- used in bitmap images to condense sequences of the same colour
- relies on consecutive pieces of data being the same so more effective when data has a lot of repetition
- doesn’t offer a great reduction in file size if there is little repetition
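A minimal sketch of run length encoding, assuming a row of a bitmap is represented as a list of single-character colour codes. The function names `rle_encode` and `rle_decode` are hypothetical.

```python
def rle_encode(data):
    """Collapse each run of identical items into an (item, count) pair."""
    runs = []
    for item in data:
        if runs and runs[-1][0] == item:
            runs[-1][1] += 1          # same as the previous item: extend the run
        else:
            runs.append([item, 1])    # different item: start a new run
    return [(item, count) for item, count in runs]

def rle_decode(runs):
    """Expand (item, count) pairs back into the original sequence."""
    out = []
    for item, count in runs:
        out.extend([item] * count)
    return out

row = list("WWWWWBBBWW")
encoded = rle_encode(row)   # [('W', 5), ('B', 3), ('W', 2)]
```

With little repetition the method does not help: `rle_encode(list("ABC"))` gives three pairs for three items, which is larger than the input.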
Dictionary encoding
- a method of lossless compression
- that replaces frequently occurring pieces of data with a shorter, unique code/index
- compressed data is stored alongside a dictionary
- the dictionary matches frequently occurring data to an index/code
- original data can be restored using the dictionary
- this method is effective for both text and binary data
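A simplified sketch of dictionary encoding for text. As a simplifying assumption it gives every distinct word an index, not only the frequently occurring ones, and the names `dict_encode` and `dict_decode` are hypothetical.

```python
def dict_encode(text):
    """Replace each word with its index in a dictionary built on the fly."""
    dictionary = []   # index -> word (stored alongside the compressed data)
    index_of = {}     # word -> index (used only while encoding)
    codes = []
    for word in text.split(" "):
        if word not in index_of:
            index_of[word] = len(dictionary)
            dictionary.append(word)
        codes.append(index_of[word])
    return dictionary, codes

def dict_decode(dictionary, codes):
    """Restore the original text by looking each index up in the dictionary."""
    return " ".join(dictionary[i] for i in codes)

message = "ask not what your country can do for you ask what you can do for your country"
dictionary, codes = dict_encode(message)
```

Here 17 words are stored as 17 small indexes plus a 9-entry dictionary, and `dict_decode(dictionary, codes)` rebuilds the message exactly.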
Encryption
- is used to keep data secure from unauthorised access when it’s being transmitted
- is used to convert readable data into an unreadable format
- uses keys, which are secret values fed into an encryption algorithm to scramble and unscramble data
Symmetric encryption
- is a method of encryption where the same key is used for both encryption and decryption of data [1]
- requires both parties to have a copy of the key [1]
- the sender encrypts information using the shared secret key before transmission
- the receiver uses the same key to decrypt the data
- if the key is intercepted, any communications can also be intercepted
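A toy illustration of the "same key both ways" idea, using a repeating-key XOR. This is an assumption made purely for demonstration: it is not a secure cipher, and the function name `xor_cipher` is hypothetical.

```python
import secrets

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the key, repeating the key as needed (toy cipher)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = secrets.token_bytes(16)                  # secret both parties must hold
ciphertext = xor_cipher(b"meet at noon", key)  # sender encrypts
plaintext = xor_cipher(ciphertext, key)        # receiver decrypts with the same key
```

Anyone who intercepts `key` can run the same function on any intercepted ciphertext, which is why the key has to be shared securely.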
Symmetric encryption advantages / suitability
- usually faster, making it efficient for encrypting large amounts of (bulk) data
- easier to implement as it uses a single key for both encryption and decryption; well suited to cases where the same person encrypts and decrypts, e.g. when backing up
- less resource intensive
Symmetric encryption cons
- loss or interception of the key means any data encrypted with it can be decrypted by whoever holds the key
- so the key must be shared securely with the other party
Asymmetric encryption
- uses 2 keys, a public key for encryption and a private key for decryption [1]
- public keys are shared openly allowing anyone to encrypt data [1]
- the private key is kept securely/locally on the receiver’s side and is never shared [1]
- use case: used when exchanging confidential data/secret communications, e.g. credit card details over the internet
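A toy sketch of the public/private key idea using RSA with tiny primes. RSA is one asymmetric scheme chosen here as an assumption; the numbers are far too small to be secure and real systems use a vetted library. The three-argument `pow` with a negative exponent needs Python 3.8 or later.

```python
# Toy RSA: illustrative only.
p, q = 61, 53                  # two secret primes
n = p * q                      # 3233, shared by both keys
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent: the public key is (e, n)
d = pow(e, -1, phi)            # private exponent: the private key is (d, n)

message = 65                       # a number smaller than n
ciphertext = pow(message, e, n)    # anyone can encrypt with the public key
recovered = pow(ciphertext, d, n)  # only the private key holder can decrypt
```

Knowing `e` and `n` lets anyone encrypt, but decryption needs `d`, which never leaves the receiver.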
Asymmetric encryption pros
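- use of a separate private key for decryption makes it more secure, as the decryption key never has to be transmitted
- if the public key is intercepted, data won’t be compromised, because the public key cannot be used to decrypt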
Hashing
- the process in which an input (e.g. a string of characters) is turned into a fixed-size value (called a hash) using a hash function
- even a slight change in the input message produces a totally different hash value [1]
- unlike encryption the hash function can’t be reversed to form the input message. [1]
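The fixed output size and the sensitivity to small changes can be seen with Python's standard `hashlib` module. SHA-256 is used here as an assumption; the cards do not name a particular hash function, and `sha256_hex` is a hypothetical helper.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

a = sha256_hex("hello")
b = sha256_hex("hellp")          # one character changed
c = sha256_hex("hello" * 1000)   # a much longer input

print(len(a), len(b), len(c))    # 64 64 64: the output size is fixed
print(a == b)                    # False: the two hashes differ
```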
Advantage of storing password/data as a hash
- hashing is useful for storing passwords: a password entered by a user can be hashed and checked against the stored hash value to see if it’s correct [1]
- only the hash values are stored, not the actual passwords themselves, so a successful hacker would only gain access to hash values, which can’t be reversed to obtain the passwords [1]
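A minimal sketch of the check described above, assuming SHA-256 as the hash function and hypothetical helpers `hash_password` and `check_login`. Real systems typically also add a per-user salt and use a deliberately slow hashing function, which this sketch leaves out.

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Return the hash that would be stored instead of the password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

stored_hash = hash_password("correct horse")   # only this is saved at sign-up

def check_login(attempt: str) -> bool:
    # Hash the attempt and compare hashes; the plain password is never stored.
    return hmac.compare_digest(hash_password(attempt), stored_hash)
```

`check_login("correct horse")` returns `True` and any other attempt returns `False`, yet the stored value reveals nothing directly about the password.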
Hash tables
- hash tables are data structures that store key-value pairs, using a hash function to compute an index for fast data retrieval
- they offer average-case O(1) time complexity for insertion, deletion and lookup
- collisions are handled using techniques like chaining or linear probing
Hash table: data insertion
- compute the hash of the key using a hash function
- map the hash to an index (e.g. hash mod table size) and place the key-value pair at that index
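The two insertion steps can be sketched as follows. The hash function `simple_hash` (sum of character codes) and the table size of 8 are assumptions chosen to keep the arithmetic easy to follow, and collisions are deliberately ignored here.

```python
def simple_hash(key: str) -> int:
    """Hypothetical hash function: the sum of the key's character codes."""
    return sum(ord(ch) for ch in key)

TABLE_SIZE = 8
table = [None] * TABLE_SIZE

def insert(key: str, value) -> int:
    index = simple_hash(key) % TABLE_SIZE   # map the hash to a slot
    table[index] = (key, value)             # no collision handling in this sketch
    return index

slot = insert("cat", 3)   # 99 + 97 + 116 = 312, and 312 % 8 = 0
```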
Characteristic of good hash function
- low chance of collisions [1]
- quick to calculate (as lots of data needs to be hashed) [1]
- provides an output that is smaller than its input, so quicker to compare hashes than original data [1]
Collision
When two keys/inputs produce the same hash
Collision resolution
- chaining: items that hash to the same index are stored together in a linked list at that index
- linear probing: the algorithm checks the next slots in sequence until an empty one is found
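A minimal sketch of both techniques, assuming a hypothetical hash function `simple_hash` (sum of character codes) and a table of 8 slots. A Python list stands in for the linked list used in chaining, and updating an existing key is left out for brevity.

```python
def simple_hash(key: str) -> int:
    """Hypothetical hash function: the sum of the key's character codes."""
    return sum(ord(ch) for ch in key)

SIZE = 8

# Chaining: each slot holds a list of (key, value) pairs.
chained = [[] for _ in range(SIZE)]

def chain_insert(key, value):
    chained[simple_hash(key) % SIZE].append((key, value))

# Linear probing: on a collision, step to the next slot, wrapping round.
probed = [None] * SIZE

def probe_insert(key, value):
    index = simple_hash(key) % SIZE
    for step in range(SIZE):
        slot = (index + step) % SIZE
        if probed[slot] is None:
            probed[slot] = (key, value)
            return slot
    raise RuntimeError("table is full")

# "cat" and "act" contain the same letters, so they collide under simple_hash.
chain_insert("cat", 1)
chain_insert("act", 2)
first = probe_insert("cat", 1)    # lands in slot 0
second = probe_insert("act", 2)   # slot 0 is taken, so it moves on to slot 1
```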
Asymmetric encryption cons
- it is slower than symmetric encryption
- uses more resources
- if the private key is lost, there is no way to decrypt the info