06 Database Systems Flashcards

1
Q

What is a data warehouse?

A

A data warehouse is a large collection of archived data from multiple sources used for decision making. Huge quantities of data are stored in a consistent order to make interrogation more productive. In summary, a data warehouse is a huge database specifically structured for information access and reporting.

2
Q

What is a large collection of archived data from multiple sources used for decision making?

A

A data warehouse.

3
Q

What is data mining?

A

Data mining is the ‘interrogation’ of a data warehouse to help the organisation make decisions. It involves drilling down into the structure of the data to discover meaningful patterns. Users can then identify trends over time and can test theories for accuracy.
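As a rough illustration of identifying trends over time, the sketch below groups a few made-up sales rows by product and labels each product's monthly series. The `sales` data and the helper names `monthly_totals` and `trend` are hypothetical; a real data mining tool would work on far larger volumes and use more sophisticated techniques.

```python
from collections import defaultdict

# Hypothetical extract from a data warehouse: (month, product, units_sold).
sales = [
    ("2018-01", "kettle", 40),
    ("2018-02", "kettle", 55),
    ("2018-03", "kettle", 70),
    ("2018-01", "toaster", 30),
    ("2018-02", "toaster", 25),
    ("2018-03", "toaster", 20),
]


def monthly_totals(rows):
    """Return {product: [units per month, in month order]}."""
    totals = defaultdict(dict)
    for month, product, units in rows:
        totals[product][month] = totals[product].get(month, 0) + units
    return {p: [by_month[m] for m in sorted(by_month)] for p, by_month in totals.items()}


def trend(series):
    """Label a series by comparing each month with the one before it."""
    pairs = list(zip(series, series[1:]))
    if all(later > earlier for earlier, later in pairs):
        return "rising"
    if all(later < earlier for earlier, later in pairs):
        return "falling"
    return "mixed"


trends = {product: trend(series) for product, series in monthly_totals(sales).items()}
print(trends)
```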

4
Q

What is the ‘interrogation’ of a data warehouse to help the organisation make decisions?

A

Data mining.

5
Q

Name advantages to a company of using a data warehouse. (The advantages are similar to those of data mining.)

A
  • Helps identify customers likely to buy a certain product, who can then be targeted with a mail shot.
  • The organisation can make comparisons with competitors.
  • Can run ‘what if’ queries for modelling exercises.
  • Can predict future sales.
  • Can find the best locations for new shops.
  • Can analyse sales patterns.
  • Can test the results of analysis for plausibility.
  • Can create reports to help decision making.
6
Q

What is data independence?

A

Data should be independent of the underlying database structure. For example, a business could have a new application built on its existing data without having to change the structure of, or relationships between, the data tables.

7
Q

What is data redundancy?

A

Data is redundant when it is duplicated unnecessarily. For example, a customer’s address should only be stored in one place, so it can be updated easily.
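A minimal sketch of the address example, using Python's built-in `sqlite3` module. The table names, column names and sample rows are made up for illustration. Because the address is held only in the `customer` table, a single update is seen by every order that refers to that customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT NOT NULL          -- stored once, here only
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        item        TEXT NOT NULL          -- no copy of the address
    );
    INSERT INTO customer VALUES (1, 'A. Jones', '1 Old Road');
    INSERT INTO customer_order VALUES (10, 1, 'kettle');
    INSERT INTO customer_order VALUES (11, 1, 'toaster');
""")

# One update in one place changes the address seen by every order.
conn.execute("UPDATE customer SET address = '2 New Street' WHERE customer_id = 1")

rows = conn.execute("""
    SELECT o.order_id, c.address
    FROM customer_order AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```

Had the address been copied into each order row, every copy would need updating and a missed one would leave the data inconsistent.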

8
Q

The principle that data should be independent of the underlying database structure is known as what?

A

Data independence

9
Q

Unnecessary duplication of data is known as what?

A

Data redundancy

10
Q

What is data consistency?

A

Data must be consistent as it moves from input to processing to output. For instance, if the designer decides the date should be in the format 12/04/18, it shouldn’t sometimes appear as 12th April 2018.
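One way to keep dates consistent is to convert every input to a single format before it is stored or displayed. The sketch below is an illustration only: the function name `normalise_date` and the list of accepted input formats are assumptions, and month names are parsed as English, which relies on Python's default locale.

```python
import re
from datetime import datetime

DISPLAY_FORMAT = "%d/%m/%y"                          # the designer's chosen format
INPUT_FORMATS = ("%d/%m/%y", "%d %B %Y", "%Y-%m-%d")  # assumed formats users might type


def normalise_date(text):
    """Return the date in DD/MM/YY form, whichever accepted format it arrived in."""
    # Drop ordinal suffixes so "12th April 2018" becomes "12 April 2018".
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)\b", r"\1", text.strip())
    for fmt in INPUT_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime(DISPLAY_FORMAT)
        except ValueError:
            continue  # try the next accepted format
    raise ValueError(f"Unrecognised date: {text!r}")


print(normalise_date("12th April 2018"))
print(normalise_date("12/04/18"))
```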

11
Q

The requirement that data must be consistent as it moves from input to processing to output is known as what?

A

Data consistency

12
Q

What is data integrity?

A

Data integrity is the correctness of data. If the value is updated in one place, it should be the same updated value that appears in all the applications built on top of that database.

13
Q

What term describes the correctness of data?

A

Data integrity.

14
Q

A hospital unit uses a relational database. A patient is allocated to a ward and to a physiotherapist. One table in this database is structured as follows: tblWard (WardID, WardName, Capacity, FreeBeds) Name two other tables you could expect to see in this database.

A

Something like the following, where [underlined] marks the primary key:

tblpatient (patientID[underlined], firstname, surname, dateofbirth, bloodtype, address, nextofkin, wardID, physioID)

tblphysio (physioID[underlined], firstname, surname, phone)
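The three tables could be sketched in SQLite as follows. This is an illustration under assumptions: the column types and sample rows are invented, the patient table is trimmed to a few columns for brevity, and `wardID` and `physioID` are treated as foreign keys linking each patient to a ward and a physiotherapist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE tblWard (
        WardID   INTEGER PRIMARY KEY,
        WardName TEXT NOT NULL,
        Capacity INTEGER NOT NULL,
        FreeBeds INTEGER NOT NULL
    );
    CREATE TABLE tblphysio (
        physioID  INTEGER PRIMARY KEY,
        firstname TEXT NOT NULL,
        surname   TEXT NOT NULL,
        phone     TEXT
    );
    CREATE TABLE tblpatient (
        patientID INTEGER PRIMARY KEY,
        firstname TEXT NOT NULL,
        surname   TEXT NOT NULL,
        wardID    INTEGER NOT NULL REFERENCES tblWard(WardID),
        physioID  INTEGER NOT NULL REFERENCES tblphysio(physioID)
    );
    INSERT INTO tblWard VALUES (1, 'Ward A', 20, 5);
    INSERT INTO tblphysio VALUES (1, 'Pat', 'Smith', '01234 567890');
    INSERT INTO tblpatient VALUES (1, 'Alex', 'Brown', 1, 1);
""")

# A patient cannot be allocated to a ward that does not exist.
try:
    conn.execute("INSERT INTO tblpatient VALUES (2, 'Sam', 'Lee', 99, 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Join the three tables to see each patient's ward and physiotherapist.
rows = conn.execute("""
    SELECT p.surname, w.WardName, ph.surname
    FROM tblpatient AS p
    JOIN tblWard AS w ON w.WardID = p.wardID
    JOIN tblphysio AS ph ON ph.physioID = p.physioID
""").fetchall()
print(rejected, rows)
```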

15
Q

What is a distributed database?

A

A distributed database has data stored on a number of computers at different locations, but appears as one logical database to the user. Users can then access data as if it were held centrally in a single source.

16
Q

What has data stored on a number of computers at different locations, but appears as one logical database to the user?

A

A distributed database.

17
Q

List advantages of a relational database over a flat file spreadsheet approach.

A
  • Data will be stored in one place and not be redundant. For example, the customer’s phone number will only be stored in one field of one table.
  • Data will be independent of the database structure. For example, you could create an application for a shop to enter its sales using the same data tables it already uses for marketing.
  • Data will be consistent from input to processing to output, so dates should always remain in the same format, such as DD/MM/YY.
  • Data is more secure because you can set different access levels for different users. A receptionist would be able to see some of the data, whereas a manager would be able to see all of it.
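The access-level point can be sketched in a few lines. The roles, field names and the `visible_fields` helper below are hypothetical; in a real relational database this would normally be handled by the DBMS itself, for example with user permissions or views, rather than in application code.

```python
# Hypothetical access levels: which fields each role is allowed to read.
ROLE_FIELDS = {
    "receptionist": {"name", "phone"},
    "manager": {"name", "phone", "address", "balance"},
}


def visible_fields(record, role):
    """Return only the fields of a record that the given role may see."""
    allowed = ROLE_FIELDS.get(role, set())  # unknown roles see nothing
    return {field: value for field, value in record.items() if field in allowed}


customer = {
    "name": "A. Jones",
    "phone": "01234 567890",
    "address": "1 Old Road",
    "balance": 120.50,
}

print(visible_fields(customer, "receptionist"))
print(visible_fields(customer, "manager"))
```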
18
Q

Helping to identify customers likely to buy a certain product, who can then be targeted with a mail shot, is an advantage of what?

A

A data warehouse.

19
Q

Allowing the organisation to make comparisons with competitors is an advantage of what?

A

A data warehouse.

20
Q

Being able to run ‘what if’ queries for modelling exercises is an advantage of what?

A

A data warehouse.

21
Q

Being able to predict future sales is an advantage of what?

A

A data warehouse.

22
Q

Being able to find the best locations for new shops is an advantage of what?

A

A data warehouse.

23
Q

Being able to analyse sales patterns is an advantage of what?

A

A data warehouse.

24
Q

Being able to test the results of analysis for plausibility is an advantage of what?

A

A data warehouse.

25
Q

Being able to create reports to help decision making is an advantage of what?

A

A data warehouse.

26
Q

What are the advantages of data mining to an organisation?

A
  • Identify customers likely to buy a certain product, who can then be targeted with a mail shot
  • Make comparisons with competitors
  • Run ‘what if’ queries for modelling exercises
  • Predict future sales
  • Find best locations for new shops
  • Analyse sales patterns
  • Test the results of analysis for plausibility
  • Create reports to help decision making
27
Q

Identifying a list of customers likely to buy a certain product is an advantage of what?

A

Data mining to an organisation

28
Q

Making comparisons with competitors is an advantage of what?

A

Data mining to an organisation

29
Q

Running ‘what if’ queries for modelling exercises is an advantage of what?

A

Data mining to an organisation

30
Q

Predicting future sales is an advantage of what?

A

Data mining to an organisation

31
Q

Finding the best locations for new shops is an advantage of what?

A

Data mining to an organisation

32
Q

Analysing sales patterns is an advantage of what?

A

Data mining to an organisation

33
Q

Testing the results of analysis for plausibility is an advantage of what?

A

Data mining to an organisation

34
Q

Creating reports to help decision making is an advantage of what?

A

Data mining to an organisation

35
Q

What are the advantages to a large company of having a distributed database?

A
  • Less network traffic. Data is held closer to where it is processed. This means there is less chance of bottlenecks in the network and data queries will run more quickly.
  • There is no ‘single point of failure’. If a network connection is lost, it is likely that data will still be available to users.
  • There is improved security because data is not all held in one place that could be vulnerable to attack or accidental loss.
  • Resilience: some data is replicated, so applications will continue to run if one copy of the data is lost.
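The resilience point can be illustrated with a toy model. The `Site` class, the site names and the `read_from_any` helper below are hypothetical; a real distributed DBMS handles replication and failover internally, so this only sketches the idea that a replicated value stays readable when one copy is unavailable.

```python
class Site:
    """A toy database site holding a replica of some data."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.online = True

    def read(self, key):
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.data[key]


def read_from_any(sites, key):
    """Try each replica in turn and return (site name, value) from the first that answers."""
    for site in sites:
        try:
            return site.name, site.read(key)
        except ConnectionError:
            continue  # this copy is unavailable, so try the next one
    raise ConnectionError("no replica available")


stock = {"kettle": 12}
london = Site("London", stock)
leeds = Site("Leeds", stock)    # replica of the same data

london.online = False           # simulate losing one site
result = read_from_any([london, leeds], "kettle")
print(result)
```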
36
Q

Less network traffic is an advantage of what?

A

A large company having a distributed database

37
Q

No ‘single point of failure’ is an advantage of what?

A

A large company having a distributed database

38
Q

Improved security is an advantage of what?

A

A large company having a distributed database

39
Q

Resilience is an advantage of what?

A

A large company having a distributed database

40
Q

What are the disadvantages of using distributed databases?

A
  • They are more complex, so they require experienced staff to manage and maintain them.
  • Data must be synchronised across multiple locations. This can lead to corruption, especially when trying to restore after a crash.
  • Transfer of data across the network could be vulnerable to attack or loss, unless encryption is used.
  • Password security must be enforced at all locations. If an employee leaves, their account must be deleted on all systems.
  • All locations must be protected from viruses with antivirus software and an employee code of conduct.
  • Data may become inaccessible if a critical server fails.
41
Q

What are more complex and require experienced staff to manage and maintain?

A

Distributed databases.

42
Q

Data must be synchronised across multiple locations, which can lead to corruption. This is a disadvantage of what?

A

Distributed databases.

43
Q

Data transferred across the network being vulnerable to attack or loss is a disadvantage of what?

A

Distributed databases.

44
Q

Having to enforce password security at all locations, and to delete a departing employee’s account on all systems, is a disadvantage of what?

A

Distributed databases.

45
Q

Having to protect all locations from viruses with antivirus software and an employee code of conduct is a disadvantage of what?

A

Distributed databases.

46
Q

Data possibly becoming inaccessible if a critical server fails is a disadvantage of what?

A

Distributed databases.

47
Q

What is data normalisation?

A

A staged mathematical process that removes repeated groups of data.

A staged mathematical process that removes data duplication and inconsistencies
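As a small illustration of the first stage only, the sketch below removes a repeating group (several items held in one field) by splitting a flat table into two tables. The sample rows and the function name `remove_repeating_groups` are made up; full normalisation continues through further stages that are not shown here.

```python
# Hypothetical flat-file rows: the "items" field holds a repeating group.
flat_orders = [
    {"order_id": 1, "customer": "A. Jones", "items": "kettle, toaster"},
    {"order_id": 2, "customer": "B. Patel", "items": "iron"},
]


def remove_repeating_groups(rows):
    """Split flat rows into an orders table and an order-items table."""
    orders = [{"order_id": r["order_id"], "customer": r["customer"]} for r in rows]
    order_items = [
        {"order_id": r["order_id"], "item": item.strip()}
        for r in rows
        for item in r["items"].split(",")
    ]
    return orders, order_items


orders, order_items = remove_repeating_groups(flat_orders)
print(orders)
print(order_items)
```

Each row of `order_items` now holds exactly one item, linked back to its order by `order_id`.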

48
Q

A staged mathematical process that removes repeated groups of data is known as what?

A

Data normalisation.

49
Q

Name security weaknesses associated with a distributed database and list solutions.

A
  • Data is vulnerable to hacking while it is transferred between locations over a network. A solution to prevent this is encryption.
  • Data is stored in multiple locations and some local sites may not be as secure as central headquarters. Therefore, usernames and passwords should always be required and passwords should be changed regularly.