Data Life Cycle & Environment Flashcards

Week 2.1 (39 cards)

1
Q

5 stages of the data life cycle

A
  1. data collection
  2. data storage
  3. data processing
  4. data analysis
  5. data disposal
2
Q

describe data collection stage

A
  • gathering raw data from various sources
  • importance of accurate and relevant data collection
  • potential sources of data: primary and secondary
3
Q

challenges of data collection

A

quality, accuracy, completeness, and ethical/regulatory considerations

4
Q

describe data storage

A
  • securely holding active data in physical or cloud storage
  • types
  • security measures: encryption, access controls
  • compliance considerations for sensitive data: GDPR, DPA 2018
5
Q

describe data processing

A
  • transforming raw data into usable formats
  • processing techniques
  • ensuring data quality and standardization
6
Q

describe data analysis

A
  • applying statistical, machine learning, or visualisation techniques to gain insights
  • types of analysis: descriptive, diagnostic, predictive, prescriptive
  • highlight the value of insights for decision-making
7
Q

describe data disposal stage

A
  • securely deleting or archiving data that is no longer needed
  • methods: deletion, anonymization, archiving
  • data archived for historical, legal, or regulatory reasons is moved to long-term storage
8
Q

challenges of data disposal

A

ensuring data is permanently removed to prevent unauthorised access

9
Q

describe a file-based system

A
  • a system where data is stored in files on a computer and managed through specific application programs
  • organised into separate files
  • each file is independent and accessed by specific programs
  • data is stored in simple logical formats, such as sequential files
  • records are read in sequential order
  • simple to design and implement
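
As a minimal sketch of sequential access in a file-based system, the snippet below scans a flat file record by record until it finds a match. The file name, the `id,name` record layout, and the helper `find_record` are all hypothetical, chosen only for illustration.

```python
import os
import tempfile


def find_record(path, wanted_id):
    """Scan the file from the top, one record per line, until the id matches."""
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            record_id, name = line.rstrip("\n").split(",", 1)
            if record_id == wanted_id:
                return line_no, name
    return None  # reached the end of the file without a match


# Hypothetical data file owned by a single application program.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "students.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("s1,Ada\ns2,Grace\ns3,Alan\n")
    result = find_record(path, "s3")
    missing = find_record(path, "s9")

print(result)
print(missing)
```

Every lookup reads from the start of the file, and the record layout is hard-coded in the program, which is one way to see the data dependence limitation in practice.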
10
Q

use of file-based system

A

small, straightforward tasks

11
Q

5 limitations of file-based systems

A
  1. separation and isolation of data
  2. duplication of data
  3. data dependence
  4. incompatible file formats
  5. fixed queries/proliferation of application programs
12
Q

2 advantages of CSV

A
  1. human readability
  2. broad compatibility
13
Q

2 disadvantages of CSV

A
  1. no hierarchical structure (flat rows and columns only)
  2. no native data types (every value is read as text)
14
Q

2 advantages of JSON

A
  1. hierarchical structure
  2. native data types (strings, numbers, booleans, null)
15
Q

2 disadvantages of JSON

A
  1. larger file size
  2. parsing overhead
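
The CSV and JSON trade-offs on cards 12 to 15 can be seen in a short sketch using Python's standard `csv` and `json` modules. The sample record is hypothetical and exists only for illustration.

```python
import csv
import io
import json

# Hypothetical sample record, written once as CSV and once as JSON.
csv_text = "name,age,active\nAda,36,true\n"
json_text = '{"name": "Ada", "age": 36, "active": true, "address": {"city": "London"}}'

# CSV: every field comes back as a string; the format carries no type information.
csv_row = next(csv.DictReader(io.StringIO(csv_text)))

# JSON: numbers, booleans, and nested objects keep their types and structure.
json_row = json.loads(json_text)

print(type(csv_row["age"]).__name__, type(json_row["age"]).__name__)
print(json_row["address"]["city"])
```

The JSON text is longer than the CSV text for the same record because the keys are repeated in every object, which is where the larger file size comes from.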
16
Q

define databases

A
  • a shared collection of logically related data and its description, designed to meet the information needs of an organisation
  • all data items are integrated into a larger repository
17
Q

define database management system (DBMS)

A
  • software that manages databases, providing tools for storage, retrieval and data management
  • DBMS interacts with the application programs and the database
18
Q

describe database application programs

A
  • an application that interacts with the database by issuing requests (typically SQL queries) to the DBMS
  • users interact with the database through database application programs
  • each action carried out on the database is a transaction
  • to prevent interference between operations on the database, all transactions possess the ACID properties
19
Q

atomicity

A

a transaction must be performed in its entirety or not performed at all

20
Q

consistency

A

a transaction must transform the database from one consistent state to another consistent state

21
Q

isolation

A

transactions execute independently of one another

22
Q

durability

A

the effects of a successful transaction are permanently recorded in the database
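
The ACID properties on cards 19 to 22 can be illustrated with a small sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the balances are hypothetical; the sketch mainly shows atomicity (a failed transfer leaves no partial update) and consistency (the `CHECK` constraint is never violated).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule: no negative balances
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the credit to dst was rolled back along with the failed debit


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the failed transfer the credit to `bob` runs first and is then undone by the rollback, so the database moves only between consistent states.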

23
Q

advantages of database application programs

A
  • control of data redundancy
  • data consistency
  • sharing of data becomes easier across entire organisation
  • data integrity
  • improved security
  • enforcement of standards
  • data accessibility and responsiveness
  • increased productivity
  • improved maintenance through data independence
  • increased concurrency
  • improved backup and recovery services
24
Q

disadvantages of database application programs

A
  • requires specialised knowledge
  • dependency on centralised systems
  • indexing overhead
  • space overhead
25
Q

5 components in DBMS environment

A
  1. hardware
  2. software
  3. data
  4. procedures
  5. people
26
Q

define database environment

A

a collective system of components that governs the collection, management, and use of data
27
Q

describe use of hardware in DBMS environment

A
  • used for storing and accessing the database and for running the DBMS and application programs
  • the DBMS and application programs can run on a range of machines
28
Q

describe use of software in DBMS environment

A
  • collection of programs
  • each DBMS has its own tools and database access language for query processing and for generating reports and forms
29
Q

describe use of data in DBMS environment

A
  • bridge between machine and human components
  • contains both operational data and metadata
30
Q

describe use of procedures in DBMS environment

A
  • instructions and rules that govern the design and use of the database
  • can be used for data validation, access control, or to reduce network traffic between clients and the DBMS servers
31
Q

describe roles of people in DBMS environment

A
  1. data and database administrators
  2. database designers
  3. application developers
  4. end-users
32
Q

types of databases

A
  1. relational
  2. NoSQL
  3. hierarchical
  4. object-oriented (OOP)
33
Q

define relational databases

A

data stored in tables, with relationships between tables defined using primary and foreign keys
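
A minimal sketch of primary and foreign keys, again using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their rows are hypothetical. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(id), "  # foreign key
    "item TEXT NOT NULL)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (10, 1, 'keyboard')")

# The relationship is followed by joining on the key columns.
rows = conn.execute(
    "SELECT c.name, o.item FROM orders o JOIN customers c ON c.id = o.customer_id"
).fetchall()

# An order that points at a non-existent customer is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (11, 99, 'mouse')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rows, rejected)
```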
34
Q

define NoSQL databases

A
  • store unstructured or semi-structured data
  • no fixed schema
  • structure depends on how the data is organised
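
To make "no fixed schema" concrete, the sketch below models a document collection as a plain Python list of dicts. It is an illustration of the idea only, not the API of any particular NoSQL product; the documents and the `find` helper are hypothetical.

```python
# Two documents in the same collection with completely different fields.
collection = [
    {"_id": 1, "type": "post", "author": "ada", "tags": ["intro", "data"]},
    {"_id": 2, "type": "sensor", "device": "thermo-7", "reading": {"temp_c": 21.5}},
]


def find(docs, **criteria):
    """Return documents whose fields match every criterion; missing fields never match."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


sensors = find(collection, type="sensor")
by_ada = find(collection, author="ada")
print(sensors[0]["reading"]["temp_c"], by_ada[0]["_id"])
```

No table definition had to change to store the second document, which is the flexibility a fixed relational schema does not offer.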
35
Q

typical uses of NoSQL databases

A
  • social networks
  • IoT
  • real-time analytics
36
Q

describe hierarchical databases

A

data organised in a tree-like structure with parent-child relationships
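
The tree-like, parent-child organisation can be sketched with nested Python dicts. The organisational chart and the `path_to` helper are hypothetical; the point is that a record is reached by walking down from the root.

```python
# Each node has exactly one parent (implicit in the nesting) and any number of children.
org = {
    "name": "CEO",
    "children": [
        {"name": "CTO", "children": [{"name": "Engineer", "children": []}]},
        {"name": "CFO", "children": []},
    ],
}


def path_to(node, target, trail=()):
    """Depth-first search that returns the root-to-target path, or None if absent."""
    trail = trail + (node["name"],)
    if node["name"] == target:
        return list(trail)
    for child in node["children"]:
        found = path_to(child, target, trail)
        if found:
            return found
    return None


engineer_path = path_to(org, "Engineer")
missing_path = path_to(org, "COO")
print(engineer_path, missing_path)
```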
37
Q

typical uses of hierarchical databases

A
  • legacy systems
  • organisational charts
38
Q

describe OOP databases

A

stores data as objects
39
Q

typical uses of OOP databases

A
  • multimedia systems
  • simulations