UNIT 4: Exchanging Data Flashcards

(106 cards)

1
Q

Capturing data

A

digital (by card)
manual - filling out forms

2
Q

capturing data - automated methods

A
  • smart card readers
  • barcode readers
  • scanners
  • optical character recognition (OCR)
  • optical mark recognition (OMR)
  • magnetic ink character recognition (MICR)
  • sensors
3
Q

Inputting data

A

once data has been collected it can be transferred to a database

  • automatically using the DBMS software
  • by typing it into a customised form
  • importing it from a spreadsheet or file
  • using EDI (electronic data interchange) - this is used to transfer data between one computer system and another
4
Q

What does EDI stand for

A

Electronic data interchange

5
Q

EDI

A

Electronic data interchange is the computer-to-computer exchange of documents such as purchase orders, invoices and shipping documents between 2 companies or business partners

replaces post, email or fax

all documents must be in a standard format so that the computer can understand them

EDI translation software may be used to translate the EDI format so the data can be input directly to a company database

6
Q

Transaction processing

A

in the context of databases a single logical operation is defined as a transaction

it may consist of several operations; for example, a customer order may consist of several order lines:

  • all must be processed
  • quantity of each product adjusted on the stock file
  • credit card details checked
  • payment accepted or rejected
7
Q

what does ACID stand for

A

Atomicity, consistency, isolation, durability

8
Q

ACID

A

set of properties to ensure that the integrity of the database is maintained under all circumstances

guarantees that transactions are processed reliably

9
Q

ATOMICITY

A

this property requires that a transaction is processed in its entirety or not at all

in any situation, including power cuts or hard disk crashes, it is not possible to process only part of a transaction
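As a rough illustration of atomicity, the sketch below processes an order with Python's built-in sqlite3 module. The table names, the CHECK constraint and the place_order helper are assumptions made for this example, not part of the original card. The second order line asks for more stock than exists, so the whole order is rolled back, including the first line that had already succeeded.

```python
import sqlite3

# In-memory database with a hypothetical stock table and order-line table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (productID TEXT PRIMARY KEY, "
    "quantity INTEGER CHECK (quantity >= 0))"
)
conn.execute("CREATE TABLE orderLine (productID TEXT, quantity INTEGER)")
conn.execute("INSERT INTO stock VALUES ('A345', 5)")
conn.commit()


def place_order(lines):
    """Process every order line, or none of them."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            for product_id, qty in lines:
                conn.execute(
                    "INSERT INTO orderLine VALUES (?, ?)", (product_id, qty)
                )
                conn.execute(
                    "UPDATE stock SET quantity = quantity - ? WHERE productID = ?",
                    (qty, product_id),
                )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the whole transaction was undone.
        return False


# The first line is fine, the second would take the stock below zero.
ok = place_order([("A345", 2), ("A345", 10)])
remaining = conn.execute("SELECT quantity FROM stock").fetchone()[0]
line_count = conn.execute("SELECT COUNT(*) FROM orderLine").fetchone()[0]
```

After the failed order the stock level and the order-line table are exactly as they were before the transaction started.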

10
Q

CONSISTENCY

A

This property ensures that no transaction can violate any of the defined validation rules

referential integrity, specified when the database is set up, will always be upheld

11
Q

what is referential integrity

A

Referential integrity in database management ensures consistency between related tables by maintaining valid relationships between primary and foreign keys. It means that a foreign key in one table (the “child” table) must reference a valid primary key in another table (the “parent” table). This prevents inconsistencies that can arise from orphaned records or mismatched data.

12
Q

ISOLATION

A

the isolation property ensures that concurrent execution of transactions leads to the same result as if transactions were processed one after the other

This is crucial in a multi-user database

13
Q

DURABILITY

A

this ensures that once a transaction has been committed, it will remain so, even in the event of a power cut

as each part of a transaction is completed, it is held in a buffer on disk until all elements of the transaction are completed

only then will the changes to the database tables be made

14
Q

Potential problems with multi-user databases

A

allowing multiple users to simultaneously access a database could potentially cause one of the updates to be lost

for example

  • when an item is to be updated, the entire block in which the record is located is read into the user’s own local memory at the workstation
  • when the record is saved the block is rewritten to the file server
15
Q

Record locking

A

prevents simultaneous access to objects in a database in order to prevent updates being lost or inconsistencies in the data arising

User A copies the record to memory and changes X's address (9.15 to 9.22)
User B copies the record and changes X's balance (9.17 to 9.20)

when A saves the update, it overwrites what B saved, so B's changes are lost

with record locking, a record is locked when a user retrieves it for editing or updating

anyone else attempting to retrieve it is denied access until the transaction is completed or cancelled.
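The lost update and the locking fix can be sketched in a few lines of Python. This is only a simulation under assumptions: the record contents, the LockedRecord class and its method names are invented for the example, and a real DBMS handles locking internally.

```python
# Hypothetical record for customer X held on the file server.
server_record = {"address": "1 Old Road", "balance": 100}

# Without locking: A (9.15) and B (9.17) each copy the whole record.
copy_a = dict(server_record)
copy_b = dict(server_record)

copy_b["balance"] = 150              # B changes the balance...
server_record = dict(copy_b)         # ...and saves at 9.20
copy_a["address"] = "2 New Street"   # A changes the address...
server_record = dict(copy_a)         # ...and saves at 9.22, overwriting B

lost_update_balance = server_record["balance"]  # B's change has gone


class LockedRecord:
    """A record that only one user at a time may retrieve for editing."""

    def __init__(self, data):
        self.data = dict(data)
        self.locked_by = None

    def retrieve(self, user):
        if self.locked_by is not None:
            raise PermissionError(f"record is locked by {self.locked_by}")
        self.locked_by = user
        return dict(self.data)

    def save(self, user, new_data):
        if self.locked_by != user:
            raise PermissionError("only the lock holder may save")
        self.data = dict(new_data)
        self.locked_by = None  # release the lock when the update completes


# With locking: B is denied access until A has finished.
record = LockedRecord({"address": "1 Old Road", "balance": 100})
a = record.retrieve("A")
try:
    record.retrieve("B")
    b_was_blocked = False
except PermissionError:
    b_was_blocked = True
a["address"] = "2 New Street"
record.save("A", a)

b = record.retrieve("B")  # B now sees A's saved change
b["balance"] = 150
record.save("B", b)
```

With the lock in place both changes survive, because B works from the version that already contains A's update.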

16
Q

Problems with record locking

A

if 2 users are attempting to update two records, a situation can arise in which neither can proceed, known as deadlock

Ken is attempting to make a transfer from customer A's account to customer B's account

meanwhile Paul is attempting to make a transfer from customer B's account to customer A's account

  • each user holds a lock on the record the other needs, so both keep waiting
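One way to picture a deadlock is as a cycle of users each waiting for the next. The sketch below checks a "waits for" mapping for such a cycle. The find_deadlock function and the assumption that each user waits for at most one other user are made up for this illustration; they are not how any particular DBMS is implemented.

```python
def find_deadlock(waits_for):
    """Return the users forming a waiting cycle, or None if there is none.

    waits_for maps each waiting user to the user holding the lock they need.
    """
    for start in waits_for:
        seen = [start]
        current = waits_for.get(start)
        while current is not None:
            if current == start:
                return seen          # we are back where we started: deadlock
            if current in seen:
                break                # a cycle that does not include start
            seen.append(current)
            current = waits_for.get(current)
    return None


# Ken holds account A and waits for Paul; Paul holds account B and waits for Ken.
cycle = find_deadlock({"Ken": "Paul", "Paul": "Ken"})

# If Paul is not waiting for anyone, Ken will eventually get the lock.
no_cycle = find_deadlock({"Ken": "Paul"})
```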
17
Q

Serialisation

A

the database management system (DBMS) must prevent such situations from arising

serialisation ensures that transactions do not overlap in time and therefore cannot interfere with each other or lead to updates being lost

18
Q

what does DBMS stand for

A

database management system

19
Q

serialisation techniques include:

A
  • timestamp ordering
  • commitment ordering
20
Q

Timestamp ordering

A

every object in the database has a read timestamp and a write timestamp

these are updated whenever an object is read or written

when a user tries to save an update, if the read timestamp isn't the same as when they started the transaction, the DBMS knows another user has accessed the same object

User A - reads at 9.05, tries to save at 9.10
User B - reads at 9.06, tries to save at 9.10

A can't save the update (the read timestamp is now 9.06, not 9.05)
B can
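The A and B example can be traced with a small simulation. This is a simplified sketch of the idea on the card, not a full timestamp-ordering protocol; the TimestampedObject class and its method names are assumptions for illustration, and times are written as strings to match the card.

```python
class TimestampedObject:
    """A database object carrying a read timestamp and a write timestamp."""

    def __init__(self, value):
        self.value = value
        self.read_ts = None
        self.write_ts = None

    def read(self, now):
        self.read_ts = now           # updated whenever the object is read
        return self.value, now       # the reader remembers the timestamp it saw

    def write(self, new_value, seen_read_ts, now):
        if self.read_ts != seen_read_ts:
            return False             # someone else has read the object since
        self.value = new_value
        self.write_ts = now          # updated whenever the object is written
        return True


obj = TimestampedObject(100)
_, a_seen = obj.read(now="9.05")   # user A starts a transaction
_, b_seen = obj.read(now="9.06")   # user B reads the same object

a_saved = obj.write(120, a_seen, now="9.10")  # rejected: read_ts is now 9.06
b_saved = obj.write(150, b_seen, now="9.10")  # accepted
```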

21
Q

Commitment ordering

A

this is another serialisation technique to ensure that no transactions are lost if two clients are simultaneously trying to update a record

transactions are ordered in terms of their dependencies on one another as well as the time they were initiated

  • it can be used to prevent deadlock by blocking one request until another is completed
22
Q

Redundancy

A

many organisations cannot afford to have their computer systems go down for even a short time

so many organisations have built-in redundancy in their computer systems

duplicate hardware, located in different geographical areas, mirrors every transaction that takes place on the main system

if the main system fails, the backup system automatically takes over

23
Q

example of referential integrity

A

Referential integrity ensures that relationships between tables in a database are valid and consistent.
A good example is a hotel booking system where a booking table references a room table. Referential integrity ensures that every booking entry references an existing room ID in the room table, preventing “orphaned” bookings that point to non-existent rooms.
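The hotel example can be tried out with Python's sqlite3 module. The table and column names below are assumptions chosen to match the description, and SQLite only enforces foreign keys once the foreign_keys pragma is switched on. This is a sketch rather than a definitive schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys
conn.execute("CREATE TABLE room (roomID INTEGER PRIMARY KEY, roomType TEXT)")
conn.execute(
    """CREATE TABLE booking (
        bookingID INTEGER PRIMARY KEY,
        roomID INTEGER NOT NULL REFERENCES room(roomID),
        guestName TEXT)"""
)
conn.execute("INSERT INTO room VALUES (101, 'double')")
conn.execute("INSERT INTO booking VALUES (1, 101, 'Ada')")


def try_sql(sql):
    """Run a statement and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# A booking for a room that does not exist would be an orphaned record.
orphan_booking = try_sql("INSERT INTO booking VALUES (2, 999, 'Bob')")

# Deleting a room that still has bookings would orphan those bookings.
delete_parent = try_sql("DELETE FROM room WHERE roomID = 101")
```

Both statements are refused, so every booking continues to point at a room that exists.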

24
Q

Data transfer and storage

data is constantly being moved around….

A

Data is constantly being moved around systems and networks

  • transfer is usually high speed and accurate
  • as distances get longer, transfer is slower and more susceptible to interference
  • storage space can be limited
25
Reducing data requirements
text, image and sound data can be significantly reduced in size
26
reducing the amount of data to send or store ensures that
  • data is sent more quickly
  • less bandwidth is used, as transfer limits may apply
  • buffering on audio and video streams is less likely to occur
  • less storage is required
27
name the 2 types of compression
lossy and lossless
28
lossy
non-essential data is permanently removed, for example, different shades of the same colour in an image or frequencies of sound outside the range of human hearing
29
lossless
patterns in the data are spotted and summarised in a shorter format without permanently removing any information
30
lossy compression with a JPG
  • removes data permanently to reduce file size
  • the image is reconstructed without the missing data
  • the result is more pixelated and more blurry
31
lossy compression with an MP3
  • removes sounds in the frequency ranges that we can't easily hear or that least affect the perceived playback quality
  • quieter notes played at the same time as louder sounds are removed
32
lossless compression
  • works by recording patterns in the data rather than the data itself
  • using this pattern information, the original file can be replicated exactly without any loss of data
  • the reduction in file size is less than for lossy compression
  • used for compressing text files or software code, where no data can be lost
33
what does RLE stand for
Run Length Encoding
34
RLE
a basic method of compression that summarises consecutive runs of the same data; it works well with image and sound data, where the same value may be repeated many times
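A minimal sketch of run length encoding in Python is shown below. The function names and the sample string are made up for the example, and the size calculation assumes each run is stored as one byte for the count and one byte for the value.

```python
def rle_encode(data):
    """Summarise consecutive repeats as (count, value) pairs."""
    runs = []
    for item in data:
        if runs and runs[-1][1] == item:
            # Same value as the previous run: extend that run by one.
            runs[-1] = (runs[-1][0] + 1, item)
        else:
            # A new value starts a new run of length 1.
            runs.append((1, item))
    return runs


def rle_decode(runs):
    """Rebuild the original string exactly from the (count, value) pairs."""
    return "".join(item * count for count, item in runs)


original = "AAAABBBCCD"
encoded = rle_encode(original)      # [(4, 'A'), (3, 'B'), (2, 'C'), (1, 'D')]
decoded = rle_decode(encoded)

# Assumed storage: 1 byte for each count plus 1 byte for each value.
size_in_bytes = len(encoded) * 2    # 8 bytes instead of 10
```

Because decoding gives back the original exactly, RLE is a lossless technique. Note that data with few repeats can end up larger after encoding.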
35
RLE of sounds
a sound recording could have many thousands of samples taken every second (typically 44,100); RLE records one example of the sample and how many times it is consecutively repeated
36
dictionary compression
  • spots regularly occurring data and stores it separately in a dictionary
  • a reference to the entry in the dictionary is stored in the main file, thereby reducing the original data stored
  • even though the dictionary produces additional overheads, the space saving outweighs this problem
37
compressing larger volumes
in a text document each letter could be stored as an ASCII code of 8 bits, so a five-letter word takes 40 bits; the word could be added to a dictionary and assigned the binary code 01 (2 bits), which is a reduction of 38 bits for each occurrence
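The 38-bit saving can be checked with a short calculation, followed by a toy word-level dictionary coder. The word "hello", the sample sentence and all function names are assumptions for illustration only; real dictionary coders work on byte patterns rather than whole words.

```python
def bits_saved(word, code_bits, occurrences=1):
    """Bits saved by replacing an 8-bit-per-letter word with a short code."""
    original_bits = len(word) * 8
    return (original_bits - code_bits) * occurrences


# A five-letter word (40 bits) replaced by the 2-bit code 01.
saving_per_occurrence = bits_saved("hello", code_bits=2)


def compress(text):
    """Store each distinct word once and replace words with references."""
    dictionary = []
    references = []
    for word in text.split(" "):
        if word not in dictionary:
            dictionary.append(word)
        references.append(dictionary.index(word))
    return dictionary, references


def decompress(dictionary, references):
    """Look each reference up in the dictionary to rebuild the text."""
    return " ".join(dictionary[i] for i in references)


text = "the cat and the dog and the bird"
dictionary, references = compress(text)
restored = decompress(dictionary, references)
```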
38
lossy - MP4
records the differences between successive picture frames of a video rather than each entire frame
39
lossy and lossless with different types of files
MP3, MP4, JPG - lossy
ZIP - lossless
40
how do you find the RLE compressed file size in bytes
count up all the runs and multiply by 2 (each run is stored as a count and a value, assuming one byte for each)
41
Encryption
a way of making sure data cannot be understood if you don't possess the means to decrypt it
42
how encryption works
  • the plaintext of a message is encrypted using a cipher algorithm and key into equivalent ciphertext
  • when received, the ciphertext is decrypted back to plaintext using the same or a different key
  • two methods at opposite ends of the security spectrum are the Caesar cipher and the Vernam cipher
  • Caesar offers a low level of security, Vernam a high level
43
Caesar Cipher
just shifts the letters of the alphabet by a fixed amount; the most basic and most insecure cipher
44
Brute force attack - encryption
attempts every possible key to decrypt ciphertext until one works
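Because a Caesar cipher has only 26 possible shifts, a brute force attack is trivial. The sketch below shows a Caesar shift and then simply tries every key; the function name and the sample message are assumptions for the example.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar(text, shift):
    """Shift each letter by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # spaces and punctuation are left unchanged
    return "".join(result)


ciphertext = caesar("HELLO WORLD", 3)   # "KHOOR ZRUOG"
plaintext = caesar(ciphertext, -3)      # shifting back decrypts

# Brute force: try all 26 keys and look for the one that reads as English.
candidates = [caesar(ciphertext, -key) for key in range(26)]
```

The real message is always somewhere in the 26 candidates, which is why the cipher is so insecure.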
45
Frequency analysis
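letters are not used equally often - E is the most common in English - so comparing letter frequencies in the ciphertext with those of the language can reveal the key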
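A rough sketch of frequency analysis against a Caesar cipher: assume the most common ciphertext letter stands for E and work out the shift from that. The function name and the sample ciphertext (a message shifted by 3) are made up for the example, and the guess is only reliable when the message is long enough for E to dominate.

```python
from collections import Counter


def guess_caesar_shift(ciphertext):
    """Guess the shift by assuming the most frequent letter represents E."""
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    most_common_letter = Counter(letters).most_common(1)[0][0]
    return (ord(most_common_letter) - ord("E")) % 26


# "MEET ME HERE NEXT WEEK" encrypted with a shift of 3.
ciphertext = "PHHW PH KHUH QHAW ZHHN"
guessed_shift = guess_caesar_shift(ciphertext)
```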
46
Vernam Cipher
the Vernam cipher, whose encryption key is known as the one-time pad, is the only cipher proven to be unbreakable
the key must be:
  • a truly random sequence, greater than or equal in length to the plaintext, and only ever used once
  • shared with the recipient by hand, independently of the message, and destroyed immediately after use
47
Decoding
encryption and decryption of the message is performed bit by bit using an exclusive or (XOR) operation with the shared key
48
what is XOR
(eXclusive OR) A Boolean logic operation that is widely used in cryptography as well as in generating parity bits for error checking and fault tolerance. XOR compares two input bits and generates one output bit. The logic is simple. If the bits are the same, the result is 0. If the bits are different, the result is 1. In the Vernam cipher, plaintext XOR key = ciphertext, and ciphertext XOR key = plaintext.
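The XOR step can be sketched in Python as below. The function name and the message are assumptions for the example. The key here comes from the operating system's random source via the secrets module, which is not a truly random physical source, so this only illustrates the mechanics rather than a genuine one-time pad.

```python
import secrets


def xor_bytes(data, key):
    """XOR each byte of data with the matching byte of the key."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(plaintext))   # stand-in for a one-time pad

ciphertext = xor_bytes(plaintext, key)      # plaintext XOR key
recovered = xor_bytes(ciphertext, key)      # ciphertext XOR key

# Single-byte check: 00001111 XOR 00000101 = 00001010
example = xor_bytes(b"\x0f", b"\x05")
```

Applying the same key twice returns the original message, which is why the same operation is used to encrypt and decrypt.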
49
one time pad
  • the one-time pad must be truly random, generated from a physical and unpredictable phenomenon
  • sources may include atmospheric noise, radioactive decay, the movements of a mouse or snapshots of a lava lamp
  • a truly random key will render any frequency analysis useless, as the ciphertext would have a uniform distribution
  • computer-generated random sequences aren't actually random
50
algorithmic security
  • ciphers are based on computational security
  • keys are determined using a computer algorithm
  • a key derived from an algorithm can also be unpicked
  • given enough ciphertext, computer power and time, any key (except the one-time pad) can be determined and the message cracked
51
symmetric encryption / private key encryption
  • symmetric encryption is also known as private key encryption
  • the same key is used to encrypt and decrypt data
  • this means that the key must also be transferred to the recipient
  • the key can be intercepted easily, which is an obvious security problem
52
asymmetric encryption
  • this uses 2 separate but related keys
  • one key, known as the public key, is made public so that others wishing to send you data can use this key to encrypt it
  • the public key cannot decrypt the data
  • a private key, known only to you, is used to decrypt the data
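To make the two related keys concrete, here is a toy example using RSA, which is one well-known asymmetric scheme; the card does not name a specific algorithm, so this choice is an assumption. The primes are tiny and purely illustrative, and real systems use a vetted library with keys thousands of bits long.

```python
# Toy RSA key generation with deliberately tiny primes (illustration only).
p, q = 61, 53
n = p * q                    # 3233, part of both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)

public_key = (e, n)          # published for anyone to use
private_key = (d, n)         # known only to the recipient


def encrypt(message, key):
    """Anyone can encrypt a number smaller than n with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, key):
    """Only the holder of the private key can reverse the encryption."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
```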
53
use of hashing
  • a hashing function provides a mapping between an arbitrary-length input and a usually fixed-length or smaller output
  • it is one way: you can't get back the original
  • this is useful for storing PINs and passwords so that they can't be read by a hacker
  • to verify a user's password, the software applies the hash function to the user input and compares the hashed result with the one stored
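The verification step can be sketched with Python's hashlib. The use of PBKDF2 with a random salt, the iteration count and the function names are assumptions for this example; the card only says that the input is hashed and compared with the stored hash.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """Derive a fixed-length hash from a password and a salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# At sign-up: store the salt and the hash, never the password itself.
salt = os.urandom(16)
stored_hash = hash_password("1234", salt)


def verify(attempt):
    """Hash the attempt the same way and compare it with the stored hash."""
    return hmac.compare_digest(hash_password(attempt, salt), stored_hash)


correct = verify("1234")
wrong = verify("0000")
```

The stored hash is always 32 bytes here regardless of how long the password is, and there is no function that turns it back into the password.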
54
FLAT FILE or a simple database
the simplest kind of database is a flat file, consisting of information about a single entity
55
what is an entity
An entity is a category of object, person, event or thing of interest about which data needs to be recorded; for example, you might hold data about club members or concert venues
56
Database design
most databases hold data about several entities
57
writing an entity description
each entity in the database has attributes
Customer(custID, title, firstname, surname, email)
Product(productID, title, subject, level, price)
58
another name for entity identifier
in a relational database, the identifier is known as the primary key
59
entity identifier (primary key)
it is underlined in the entity description, for example custID in Customer(custID, title, firstname, surname, email)
if there is no natural attribute for a primary key, one should be introduced
60
Composite primary key
Composite keys (sometimes called compound keys) are keys that are made up of more than one field. If the cinema system was extended to record bookings, two more tables would be required:
Customer (CustomerId, FirstName, LastName, PhoneNumber)
Booking (CustomerId, ShowingId, BookingDate, NumSeats)
The Customer table has a single field CustomerId which is designated as the primary key. The Booking table uses CustomerId together with ShowingId as a composite primary key. This is needed because each field on its own is not unique in the Booking table, but the combination of the two is guaranteed to be unique.
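The Booking table can be tried out in SQLite via Python. The column types and the sample rows are assumptions for this sketch. The same customer or the same showing may appear more than once, but the same pair may not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Booking (
        CustomerId INTEGER,
        ShowingId INTEGER,
        BookingDate TEXT,
        NumSeats INTEGER,
        PRIMARY KEY (CustomerId, ShowingId))"""
)

# Customer 1 appears twice and showing 10 appears twice: both are allowed.
conn.execute("INSERT INTO Booking VALUES (1, 10, '2024-05-01', 2)")
conn.execute("INSERT INTO Booking VALUES (1, 11, '2024-05-01', 4)")
conn.execute("INSERT INTO Booking VALUES (2, 10, '2024-05-02', 1)")

# Repeating the combination (1, 10) breaks the composite primary key.
try:
    conn.execute("INSERT INTO Booking VALUES (1, 10, '2024-05-03', 3)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Booking").fetchone()[0]
```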
61
secondary key
a secondary key is a column (or set of columns) in a database table that is not the primary key, but which can be used to retrieve data quickly and efficiently based on its values
  • the primary key field is automatically indexed so that any particular record can be found very quickly
  • in some databases, searches may often need to be made on other fields
  • in the product table, Product(productID, title, subject, level, price), if searches often need to be made on title or subject, either or both of these fields could be defined as a secondary key
  • they would then be indexed for faster lookups
62
what does E-R stand for
Entity-relationship (as in entity-relationship diagrams)
63
there are 3 types of E-R relationship, name them
one to one - husband and wife
one to many - school to pupil
many to many - actor and film
64
E-R diagrams
one to one - a plain line
one to many - a line with a crow's foot at the "many" end (-------<)
many to many - a line with a crow's foot at both ends (>-------<)
65
E-R diagram
multiple relationships between entities
66
Database structure
  • each entity is represented by a table
  • tables in a relational database are commonly referred to as relations
  • a database contains one or more relations
  • a relation has rows, each row containing one record
  • the columns in the relation each contain one field (attribute) belonging to the records
67
Creating a relationship
to create a relationship between Customer and Subscription, we need to include custID in the entity description of Subscription
productID also needs to be included in the entity description of Subscription
custID and productID are foreign keys in Subscription, shown in italics
Subscription(subID, startDate, endDate, *custID*, *productID*)
68
foreign key definition
a foreign key is an attribute that creates a join between two tables (relations); it is the primary key of one relation, included as an attribute in the other
69
referential integrity
it means that no foreign key in one table can reference a non-existent record in a related table; for example, you can't delete a Student without deleting their results first
70
relational database design
  • in a relational database, data is held in tables, also known as relations
  • one row in the table holds one record
  • each column represents one attribute
  • each relation should hold data about a single entity
71
what is normalisation
normalisation is the process of organising a database to reduce data duplication and improve data accuracy and consistency
  • achieved by applying a set of guidelines (normal forms), each with specific rules and requirements
  • enhances database efficiency and maintainability
  • improves data integrity
  • tables should be organised so that data is not duplicated in the same table or in different tables
  • the structure should allow complex queries to be made
  • there are 3 stages in normalisation: 1NF, 2NF and 3NF
72
1NF
a table is in first normal form if it contains no repeating attributes or groups of attributes
  • all attributes must be atomic: a single attribute cannot consist of 2 data items such as firstname and surname, as this would make it difficult or impossible to sort on surname
summary of 1NF:
  • each record has a primary key
  • data is atomic
  • no repeating groups of attributes
73
2NF
a table is in 2NF if it is in first normal form and contains no partial dependencies
  • a partial dependency can only occur if the primary key is a composite key (made up of multiple attributes)
  • it occurs when a non-key (non-prime) attribute is functionally dependent on only part of the key, rather than the entire key
74
what's a non-prime / non-key attribute
A non-prime attribute is an attribute in a relational database table that is not part of any candidate key. In simpler terms, it's an attribute that doesn't help uniquely identify a row in the table
75
3NF
"All attributes are dependent on the key, the whole key and nothing but the key"
a table is in 3NF if it is in 2NF and contains no non-key (transitive) dependencies
76
what is a non-key dependency
In the context of database normalization, a non-key dependency occurs when a non-primary key attribute (also known as a non-key attribute) depends on another non-primary key attribute, which in turn depends on the primary key.
77
advantages of normalisation
  • it is easier to maintain and change a normalised database
  • there is no unnecessary duplication of data
  • data integrity is maintained
  • if a person changes address, the update needs to be made only once, to a single table
  • having smaller tables with fewer fields means faster searches and savings in storage
78
data integrity meaning
means that there is no possibility of having 2 different addresses (or any other attribute) for a person or item in the database
79
what does SQL stand for
Structured Query Language
80
what type of language is SQL ?
a declarative language used for querying and updating tables in a relational database; it can also be used to create tables
81
SELECT...FROM...WHERE
SELECT list of fields to be displayed
FROM list of the table or tables the data will come from
WHERE list of search criteria
ORDER BY list of the fields that the data is to be sorted on (ASC or DESC; the default is ASCending order)
example:
SELECT productID, productName, subject, price
FROM tblProduct
WHERE level = 4
ORDER BY productName
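The example query can be run against a small SQLite database from Python. The column types and the four sample products are made up for this sketch; only the query itself comes from the card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE tblProduct (
        productID TEXT PRIMARY KEY,
        productName TEXT,
        subject TEXT,
        level INTEGER,
        price REAL)"""
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO tblProduct VALUES (?, ?, ?, ?, ?)",
    [
        ("P1", "Physics Revision", "Physics", 4, 9.99),
        ("P2", "Computing Basics", "Computing", 4, 12.50),
        ("P3", "Art History", "Art", 3, 8.00),
        ("P4", "Computer Science", "Computing", 4, 15.00),
    ],
)

rows = conn.execute(
    """SELECT productID, productName, subject, price
       FROM tblProduct
       WHERE level = 4
       ORDER BY productName"""
).fetchall()

names_in_order = [row[1] for row in rows]
```

Only the three level 4 products are returned, sorted by name in the default ascending order.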
82
using a wild card
SELECT * FROM tblProduct WHERE subject LIKE 'Comp%'
LIKE is used to search for a pattern; % is the wildcard in standard SQL (some systems, such as Access, use * instead)
83
other operators in the WHERE clause
BETWEEN - between an inclusive range (BETWEEN 5 AND 10)
IN - specify multiple possible values for a column
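LIKE, BETWEEN and IN can be compared side by side on a small SQLite table. The table contents and the ids helper are assumptions for this sketch, and % is used as the wildcard because that is what SQLite expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblProduct (productID TEXT PRIMARY KEY, subject TEXT, price REAL)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO tblProduct VALUES (?, ?, ?)",
    [
        ("P1", "Physics", 9.99),
        ("P2", "Computing", 12.50),
        ("P3", "Art", 8.00),
        ("P4", "Computing", 15.00),
    ],
)


def ids(where_clause):
    """Return the product IDs matching a WHERE clause, in ID order."""
    sql = f"SELECT productID FROM tblProduct WHERE {where_clause} ORDER BY productID"
    return [row[0] for row in conn.execute(sql)]


like_ids = ids("subject LIKE 'Comp%'")          # pattern match
between_ids = ids("price BETWEEN 5 AND 10")     # inclusive range
in_ids = ids("subject IN ('Art', 'Physics')")   # list of allowed values
```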
84
operators in the WHERE clause
= <> < > >= <= AND, OR, NOT
85
use of a semicolon in SQL
some database systems require a semicolon at the end of each SQL statement (standard way to separate each SQL statement) DO NOT PUT A SEMICOLON AT THE END OF EACH LINE (NOT OCR PRACTICE)
86
using SQL, you can combine...
data from 2 or more tables by specifying the links between the tables
87
Attributes from linked tables
When you are selecting attributes from linked tables, if the attribute name occurs in more than one table, you should specify the table name. If the attribute name occurs in only one table, specifying the table name is optional.
88
using the JOIN keyword
data from 2 linked tables can be extracted using the JOIN keyword (an alternative to linking the tables in the WHERE clause)
SELECT list of fields
FROM tblTeam
JOIN tblPlayer ON tblTeam.teamID = tblPlayer.teamID
WHERE tblTeam.teamName = "Binham"
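A runnable version of the join, using SQLite from Python. The columns other than teamID and teamName, and all of the sample rows, are assumptions for this sketch. The team name is passed as a parameter rather than typed into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblTeam (teamID INTEGER PRIMARY KEY, teamName TEXT)")
conn.execute(
    "CREATE TABLE tblPlayer (playerID INTEGER PRIMARY KEY, "
    "playerName TEXT, teamID INTEGER)"
)
# Hypothetical sample data.
conn.executemany("INSERT INTO tblTeam VALUES (?, ?)", [(1, "Binham"), (2, "Holt")])
conn.executemany(
    "INSERT INTO tblPlayer VALUES (?, ?, ?)",
    [(1, "Sam", 1), (2, "Jo", 1), (3, "Alex", 2)],
)

# teamID occurs in both tables, so it is always prefixed with the table name.
binham_players = conn.execute(
    """SELECT tblPlayer.playerName, tblTeam.teamName
       FROM tblTeam
       JOIN tblPlayer ON tblTeam.teamID = tblPlayer.teamID
       WHERE tblTeam.teamName = ?
       ORDER BY tblPlayer.playerName""",
    ("Binham",),
).fetchall()
```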
89
creating a new table using SQL
CREATE TABLE tblProduct
(
ProductID CHAR(4) NOT NULL PRIMARY KEY,
Description VARCHAR(20) NOT NULL,
Price CURRENCY
)
90
Name the common data types
CHAR(n), VARCHAR(n), BOOLEAN, INTEGER (INT), FLOAT, DATE, TIME, CURRENCY
91
CHAR(n)
Character string of fixed length n
92
VARCHAR(n)
Character string variable length, max. n
93
BOOLEAN
TRUE or FALSE
94
INTEGER
a whole number (no fractional part)
95
FLOAT
number with a floating decimal point
96
DATE
day, month, year values
97
TIME
hour, minute, second
98
CURRENCY
formats numbers in the currency used in your region
99
altering table structures
the ALTER TABLE statement is used to add, delete or modify columns in an existing table
ALTER TABLE tblProduct ADD QtyInStock INTEGER
100
deleting a column and changing a data type
to delete a column:
ALTER TABLE tblProduct DROP COLUMN QtyInStock
to change the data type of a column:
ALTER TABLE tblProduct MODIFY COLUMN Description VARCHAR(30) NOT NULL
101
inserting data using SQL
the INSERT INTO statement is used to insert a new record into a table
INSERT INTO Product (ProductID, Description, Price)
VALUES ("A345", "Pink Rabbit", 7.50)
102
Updating data using SQL
the UPDATE statement is used to update a record in a table
UPDATE Product
SET Description = "Blue Rabbit", Price = 8.25
WHERE ProductID = "A345"
103
Deleting a record using SQL
DELETE FROM Product WHERE ProductID = "A345"
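The create, insert, update and delete statements from the last few cards can be run in sequence with SQLite. This is a sketch under assumptions: SQLite has no dedicated CURRENCY type and simply treats that column as numeric, and the values are passed as parameters rather than written in double quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Product (
        ProductID CHAR(4) NOT NULL PRIMARY KEY,
        Description VARCHAR(20) NOT NULL,
        Price CURRENCY)"""
)

# INSERT INTO: add a new record.
conn.execute(
    "INSERT INTO Product (ProductID, Description, Price) VALUES (?, ?, ?)",
    ("A345", "Pink Rabbit", 7.50),
)

# UPDATE: change fields of the record identified by its primary key.
conn.execute(
    "UPDATE Product SET Description = ?, Price = ? WHERE ProductID = ?",
    ("Blue Rabbit", 8.25, "A345"),
)
after_update = conn.execute(
    "SELECT ProductID, Description, Price FROM Product"
).fetchone()

# DELETE FROM: remove the record.
conn.execute("DELETE FROM Product WHERE ProductID = ?", ("A345",))
after_delete = conn.execute("SELECT COUNT(*) FROM Product").fetchone()[0]
```

Without the WHERE clause, UPDATE and DELETE would affect every record in the table.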
104
what is a database?
In A Level Computer Science, a database is an organised collection of data
105
what is an attribute
In databases, an attribute is a characteristic or property of an entity that describes it. In simpler terms, it's a column in a table that holds data values. For example, a "Student" entity might have attributes like "name", "age", and "roll number".
106
atomicity in normalisation
Each column in a table must contain single, indivisible values