Databases & Big Data Flashcards

1
Q

What is an entity?

A

A thing (such as a person, object or concept) about which data is stored in a database.

2
Q

What are attributes?

A

Characteristics or other information about entities.

3
Q

What is an entity identifier?

A

An attribute that uniquely identifies each instance of an entity.

4
Q

What is an entity description?

A

Describes how information about an entity is stored in a table: the entity name together with its attributes.

5
Q

What are Relational Databases?

A

Databases in which tables can be related to one another, linked through common attributes.

6
Q

What is a primary key in databases?

A

An attribute that provides a unique identifier for every entity (record) within a table.

7
Q

What is a foreign key?

A

An attribute within one table that is the primary key of another table.

8
Q

What is a composite primary key?

A

Primary key formed by a combination of attributes.
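
The three kinds of key described in the cards above can be sketched with Python's built-in sqlite3 module. The student, course and enrolment tables and their columns are hypothetical names chosen for illustration only.

```python
import sqlite3

# Hypothetical schema: all table and column names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,                      -- primary key
    name       TEXT NOT NULL
);
CREATE TABLE course (
    course_id  INTEGER PRIMARY KEY,                      -- primary key
    title      TEXT NOT NULL
);
CREATE TABLE enrolment (
    student_id INTEGER REFERENCES student(student_id),  -- foreign key
    course_id  INTEGER REFERENCES course(course_id),    -- foreign key
    PRIMARY KEY (student_id, course_id)                  -- composite primary key
);
""")

conn.execute("INSERT INTO student VALUES (1, 'Ada')")
conn.execute("INSERT INTO course VALUES (10, 'Databases')")
conn.execute("INSERT INTO enrolment VALUES (1, 10)")


def try_insert(sql):
    """Return True if the insert is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Same (student_id, course_id) pair again: breaks the composite primary key.
duplicate_ok = try_insert("INSERT INTO enrolment VALUES (1, 10)")
# No student with id 2 exists: breaks the foreign key.
orphan_ok = try_insert("INSERT INTO enrolment VALUES (2, 10)")
```

Both extra inserts are rejected, so the enrolment table still holds only the one valid row.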

9
Q

Why are databases normalised?

A

To allow efficient storage and querying without compromising the integrity of the data.
Normalisation ensures there is no redundant or repeated data.

10
Q

What is first normal form?

A

The first stage of normalisation.

The table contains no repeating attributes or groups of attributes.
Data is atomic - no column holds more than one value in a single row.
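
As a rough illustration of atomic values, the sketch below takes a made-up table whose subjects column holds several values and splits it into one row per value. The data and the function name to_first_normal_form are hypothetical.

```python
# Hypothetical unnormalised rows: the "subjects" column holds several values.
unnormalised = [
    {"student_id": 1, "name": "Ada", "subjects": "Maths, Physics"},
    {"student_id": 2, "name": "Alan", "subjects": "Maths"},
]


def to_first_normal_form(rows):
    """Split the multi-valued column so each row holds one atomic value."""
    result = []
    for row in rows:
        for subject in row["subjects"].split(","):
            result.append({
                "student_id": row["student_id"],
                "name": row["name"],
                "subject": subject.strip(),
            })
    return result


first_nf = to_first_normal_form(unnormalised)
```

The name is now repeated across rows for the same student, which is the kind of redundancy the later normal forms remove.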

11
Q

What is second normal form?

A

The table is in first normal form and partial key dependencies have been removed.
- No attribute depends on only part of a composite key; each depends on the whole key.
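
A small sketch of removing a partial key dependency, using invented data. The table is assumed to have the composite key (student_id, course_id), and course_title is assumed to depend on course_id alone.

```python
# Hypothetical table with composite key (student_id, course_id).
# course_title depends only on course_id, which is a partial key dependency.
enrolments = [
    {"student_id": 1, "course_id": 10, "course_title": "Databases", "grade": "A"},
    {"student_id": 2, "course_id": 10, "course_title": "Databases", "grade": "B"},
    {"student_id": 1, "course_id": 20, "course_title": "Networks", "grade": "C"},
]

# Move the partially dependent attribute into its own table keyed by course_id.
courses = {row["course_id"]: row["course_title"] for row in enrolments}

# What remains (the grade) depends on the whole composite key.
grades = {(row["student_id"], row["course_id"]): row["grade"] for row in enrolments}
```

Each course title is now stored once rather than once per enrolment.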

12
Q

What is third normal form?

A

The table is in second normal form and has no non-key dependencies - no non-key attribute depends on another non-key attribute.
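
A sketch of removing a non-key dependency, again with invented data. Here tutor_room is assumed to depend on tutor_id, which is not the key of the student table.

```python
# Hypothetical table keyed by student_id.
# tutor_room depends on tutor_id, a non-key attribute, not on student_id.
students = [
    {"student_id": 1, "name": "Ada", "tutor_id": "T1", "tutor_room": "B12"},
    {"student_id": 2, "name": "Alan", "tutor_id": "T1", "tutor_room": "B12"},
    {"student_id": 3, "name": "Grace", "tutor_id": "T2", "tutor_room": "C03"},
]

# Move the dependent attribute into a tutor table keyed by tutor_id.
tutors = {row["tutor_id"]: row["tutor_room"] for row in students}

# The student table keeps tutor_id as a foreign key into the tutor table.
students_3nf = [
    {"student_id": r["student_id"], "name": r["name"], "tutor_id": r["tutor_id"]}
    for r in students
]
```

If a tutor changes room, only one entry in the tutor table needs updating.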

13
Q

What are client-server databases?

A

A database held on a server that allows simultaneous access by multiple clients.

14
Q

What is concurrent access?

A

When two users attempt to access the same fields at the same time.
This can result in database updates being lost.
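
How an update gets lost can be shown with a deliberately simplified, single-threaded sketch. The stock record and the two users are hypothetical, and the interleaving is written out by hand rather than produced by real concurrency.

```python
# Hypothetical record shared by two users.
record = {"stock": 10}

# Both users read the same value before either has written.
user_a_read = record["stock"]
user_b_read = record["stock"]

# User A sells 3 items and writes the result back.
record["stock"] = user_a_read - 3

# User B sells 2 items, still working from the stale value read earlier.
record["stock"] = user_b_read - 2

lost_update_result = record["stock"]  # 8, although 10 - 3 - 2 should give 5
```

User A's update has been overwritten, so the stored value no longer reflects both sales.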

15
Q

How can concurrent access issues be managed?

A

Record locks.
Serialisation.
Timestamp / commitment ordering.

16
Q

What are record locks?

A

A record is locked when a user accesses it and unlocked once that user has finished using it.
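
One way to picture a record lock is with a threading.Lock from Python's standard library. This is only an in-memory analogy, not how any particular database implements locking, and the LockedRecord class is a hypothetical name.

```python
import threading


class LockedRecord:
    """A record that only one user at a time may read and update."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()

    def update(self, change):
        # The record is locked for the whole read-modify-write and is
        # unlocked automatically when the with-block ends.
        with self._lock:
            current = self.value
            self.value = current + change


stock = LockedRecord(10)

# Five users each try to take one item at the same time.
users = [threading.Thread(target=stock.update, args=(-1,)) for _ in range(5)]
for user in users:
    user.start()
for user in users:
    user.join()

final_stock = stock.value
```

Because each update holds the lock while it reads and writes, none of the five updates is lost.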

17
Q

What is serialisation?

A

Requests from other users are placed in a queue. When the first user has finished, the next command in the queue is executed.
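
The queueing idea can be sketched with a deque from Python's standard library. The users and stock changes are invented, and a real database would do far more than this.

```python
from collections import deque

# Hypothetical shared record.
queued_record = {"stock": 10}

# Requests wait in a first-in, first-out queue as (user, change) pairs.
requests = deque([("user_a", -3), ("user_b", -2), ("user_c", 4)])

history = []
while requests:
    # Only the request at the front of the queue runs; the rest wait.
    user, change = requests.popleft()
    queued_record["stock"] += change
    history.append((user, queued_record["stock"]))
```

Each command sees the result of the one before it, so no update is overwritten.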

18
Q

What is timestamp / commitment ordering?

A

Timestamp - Commands are executed in order of the timestamps on the requests sent.
Commitment - An algorithm works out the optimum order in which to execute commands, taking account of the impact of each command, to minimise issues.
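
Timestamp ordering alone can be sketched as a sort on the time each request was sent. The requests below are made up, and commitment ordering is not attempted here because the card does not specify its algorithm.

```python
# Hypothetical requests as (timestamp, user, new_value), listed in arrival order.
arrived = [
    (1005, "user_b", "blue"),
    (1001, "user_a", "red"),
    (1003, "user_c", "green"),
]

field = None
applied = []

# Execute in order of the timestamp on each request, not arrival order.
for timestamp, user, new_value in sorted(arrived):
    field = new_value
    applied.append(user)
```

The request with the latest timestamp is applied last, so its value is the one left in the field.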

19
Q

What is Big Data?

A

A term for data that won’t fit in the usual containers, such as a single server or a conventional relational database.

20
Q

What are the 3 defining features of Big Data?

A

Volume - Too much data for a conventional hard disk drive or a single server, so it has to be spread over multiple servers.

Velocity - Data on the servers is created and modified rapidly.

Variety - Data held on the servers consists of many different types, such as binary and multimedia.

21
Q

What is the main problem plaguing Big Data?

A

Lack of structure, rather than the sheer volume of data.

22
Q

How is machine learning being used in Big Data?

A

The unstructured nature of Big Data makes it hard to extract useful information, so machine learning is used to discern patterns in the data.

23
Q

What is Functional Programming?

A

A programming paradigm that offers a solution to the problem of processing data across multiple machines.

24
Q

How does functional programming work?

A

Programs are stateless and use immutable data structures.
Supports higher-order functions - functions that take other functions as inputs or return them as outputs.
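
A minimal sketch of these ideas in Python, which is not a purely functional language but supports the style. The names apply_to_all and make_scaler are hypothetical.

```python
from functools import reduce

# An immutable data structure: a tuple cannot be changed after creation.
readings = (3, 8, 5, 12)


def apply_to_all(func, values):
    """Higher-order function: takes a function as an input."""
    return tuple(map(func, values))


def make_scaler(factor):
    """Higher-order function: returns a function as its output."""
    return lambda x: x * factor


# No state is modified; each step builds a new value from the old one.
doubled = apply_to_all(make_scaler(2), readings)
total = reduce(lambda acc, x: acc + x, doubled, 0)
```

Because nothing is mutated, each element could in principle be processed on a different machine without the results interfering.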

25
Q

What is the fact-based model for representing data?

A

A way of storing Big Data as immutable facts, since Big Data doesn’t store well in tables and columns.
Using immutable data removes the risk of data being lost through human error and removes the need for an index, because new data is simply appended to the dataset as it is created.

26
Q

How does the fact-based model work to store data?

A

Each piece of information is stored as an immutable fact.
Each fact is stored with a timestamp, which allows the computer to use the most recent information.
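
A rough sketch of a fact store, assuming each fact is a (timestamp, entity, attribute, value) tuple. That layout and the function names record_fact and latest are assumptions made for illustration.

```python
# Append-only list of immutable facts: (timestamp, entity, attribute, value).
facts = []


def record_fact(timestamp, entity, attribute, value):
    """Append a new fact; existing facts are never edited or deleted."""
    facts.append((timestamp, entity, attribute, value))


def latest(entity, attribute):
    """Return the value from the most recent matching fact, or None."""
    matching = [f for f in facts if f[1] == entity and f[2] == attribute]
    if not matching:
        return None
    return max(matching, key=lambda f: f[0])[3]


record_fact(1, "alice", "city", "Leeds")
record_fact(3, "bob", "city", "Bath")
record_fact(5, "alice", "city", "York")  # a newer fact, the old one is kept

current_city = latest("alice", "city")
```

The earlier fact about Leeds stays in the dataset, so the history is preserved while queries use the newest timestamp.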

27
Q

How is Big Data represented using Graph Schema?

A

Graph Schema - uses graphs of nodes and edges to represent the structure of a dataset graphically. Nodes are entities and edges are relationships.
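
A small sketch of a graph held as nodes and edges in plain Python. The people, company and relationship labels are invented for illustration.

```python
# Nodes are entities, each with its own properties.
nodes = {
    "alice": {"type": "Person"},
    "bob": {"type": "Person"},
    "acme": {"type": "Company"},
}

# Edges are relationships, stored as (source, relationship, target).
edges = [
    ("alice", "FRIEND_OF", "bob"),
    ("alice", "WORKS_FOR", "acme"),
    ("bob", "WORKS_FOR", "acme"),
]


def related(node, relationship):
    """Follow edges of one relationship type out of the given node."""
    return [target for source, rel, target in edges
            if source == node and rel == relationship]


alice_employers = related("alice", "WORKS_FOR")
```

New kinds of entity or relationship can be added as extra nodes and edges without redesigning a table structure.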