M6 U3 - Data Management - Q2 Flashcards Preview


Flashcards in M6 U3 - Data Management - Q2 Deck (35)
1
Q

What phases are involved in data understanding? (2)

A
  • data acquisition (aka data gathering)
  • data preparation
2
Q

What is data acquisition?

A

Also known as data gathering, it involves gathering data from different sources and transforming the data into formats that are suitable for analytic solution development.

3
Q

What happens when the requirements phase is completed?

A

The data science team begins data acquisition (also called data gathering).

4
Q

What’s data wrangling? (3 actions, 3 results)

A

The process of gathering, selecting, and transforming data to ensure that it is usable, free of noise, and as unbiased as possible, so that it meets the defined analytic objectives.

5
Q

What steps are involved in data wrangling? (3)

A
  • Checking for missing values
  • Identifying outliers
  • Formatting the data.
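
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, field names, and helper functions (`find_missing`, `find_outliers`, `format_row`) are hypothetical, and the 1.5 × IQR rule is just one common way to flag outliers, not something the course material prescribes.

```python
from statistics import quantiles

# Hypothetical raw records; field names and values are made up for illustration.
records = [
    {"name": " Alice ", "age": "34", "income": "52000"},
    {"name": "bob", "age": "", "income": "49000"},
    {"name": "Carol", "age": "29", "income": "51000"},
    {"name": "dave", "age": "41", "income": "950000"},
    {"name": "Eve", "age": "38", "income": "50500"},
]


def find_missing(rows):
    """Step 1: return (row_index, field) pairs whose value is empty or None."""
    return [
        (i, field)
        for i, row in enumerate(rows)
        for field, value in row.items()
        if value is None or str(value).strip() == ""
    ]


def find_outliers(values):
    """Step 2: flag values more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


def format_row(row):
    """Step 3: trim and title-case names; turn numeric strings into ints."""
    def to_int(text):
        text = str(text).strip()
        return int(text) if text else None  # keep missing values as None

    return {
        "name": row["name"].strip().title(),
        "age": to_int(row["age"]),
        "income": to_int(row["income"]),
    }


missing = find_missing(records)
clean = [format_row(r) for r in records]
outliers = find_outliers([r["income"] for r in clean if r["income"] is not None])

print("missing:", missing)
print("outliers:", outliers)
```

In a real project these checks would more likely be done with a data-frame library, and what to do with a missing value or an outlier (drop, impute, keep) depends on the analytic objectives.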
6
Q

What is data management?

  • It’s an organization’s way of ________________ (4) data.
  • Makes sure that the data housed within an organization is ______________ (2)
A
  • It’s an organization’s way of acquiring, storing, securing and processing data.
  • Makes sure that the data housed within an organization is accessible and accurate
7
Q

What group(s) are responsible for data management?

A
  • It is managed by the IT team in an organization.
  • Business users also participate.
8
Q

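Which body of knowledge defines the 11 knowledge areas for data management? List the areas.

A
  • Data Management Body of Knowledge (DAMA-DMBOK)
  • The areas: Data Governance; Data Architecture; Data Modeling & Design; Data Storage & Operations; Data Security; Data Integration & Interoperability; Documents & Content; Reference & Master Data; Data Warehousing & Business Intelligence; Metadata; Data Quality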
9
Q

Who is involved in the data management process?

A
  • Multiple departments
10
Q

Who’s responsible for designing an organization’s data management framework?

A

Data architects

11
Q

Data Governance

A
  • Defines how data is accessed and managed within an organization.
  • Covers planning, oversight, and control over the management of data and the use of data and data-related resources.
12
Q

Data Architecture

A

the overall structure of data and data-related resources as an integral part of the enterprise architecture

13
Q

Data Modeling & Design

A

analysis, design, building, testing, and maintenance (was Data Development in the DAMA-DMBOK 1st edition)

14
Q

Data Storage & Operations

A

deploying and managing storage for structured physical data assets (was Data Operations in the DAMA-DMBOK 1st edition)

15
Q

Data Security

A

ensuring privacy, confidentiality and appropriate access

16
Q

Data Integration & Interoperability

A

acquisition, extraction, transformation, movement, delivery, replication, federation, virtualization and operational support (a Knowledge Area new in DMBOK2)

17
Q

Documents & Content

A

storing, protecting, indexing, and enabling access to data found in unstructured sources (electronic files and physical records), and making this data available for integration and interoperability with structured (database) data.

18
Q

Describe Reference & Master Data (idea (1), how (2), and results (2))

A

Idea: Managing shared data

How:

  • Standardizing data definitions
  • Standardizing the use of data values

Results:

  • Reduce redundancy
  • Ensure better data quality
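
As a rough illustration of "standardizing the use of data values", the sketch below maps free-text country entries onto canonical codes. The reference table `COUNTRY_REFERENCE`, the function `standardize_country`, and the sample values are all hypothetical; a real master data setup would maintain such a table centrally rather than in code.

```python
# Hypothetical reference table: canonical code -> known spellings (lower-case).
COUNTRY_REFERENCE = {
    "US": {"us", "usa", "united states", "u.s."},
    "GB": {"gb", "uk", "united kingdom"},
}


def standardize_country(raw):
    """Map a free-text country value to its canonical code, or None if unknown."""
    key = raw.strip().lower()
    for code, variants in COUNTRY_REFERENCE.items():
        if key in variants:
            return code
    return None  # unknown values are surfaced rather than guessed


raw_values = ["USA", " united states", "UK", "U.S.", "Atlantis"]
standardized = [standardize_country(v) for v in raw_values]

# Fewer distinct values after standardization means less redundancy.
distinct_before = len({v.strip().lower() for v in raw_values})
distinct_after = len({c for c in standardized if c is not None})

print(standardized)
print(distinct_before, "->", distinct_after)
```

Returning `None` for an unrecognized value is a design choice in this sketch: it makes data quality problems visible instead of hiding them behind a default.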
19
Q

Data Warehousing & Business Intelligence

A

managing analytical data processing and enabling access to decision support data for reporting and analysis.

20
Q

Metadata

A

collecting, categorizing, maintaining, integrating, controlling, managing, and delivering metadata.

21
Q

The Data Quality area involves what? (3 actions on 1, 1 action on another)

A
  • Defining, monitoring, and maintaining data integrity
  • Improving data quality.
22
Q

Your client has an established data management structure in place. This means that:

A

Your client regards their data as a resource that should be reliable, and should be kept secure.

Correct: Data management ensures that a company is mindful of the security, integrity, and overall quality of their data and data infrastructure.

23
Q

When setting analytic objectives, it is good practice to define a data statement. This ensures that you have assessed a business’s data science readiness. Why is data gathering not conducted during the data science readiness assessment?

A

Data gathering is best conducted after all analytic requirements are gathered.

Correct: Although analytic objectives have been set prior to this stage, it is important to define the analytic and business requirements to ensure that the right questions are answered and the right data is gathered.

24
Q

What does data governance impact?

A
  • High level: Impacts the decisions that can be made from the data available to the organization.
  • Has positive implications for the quality, security, and integrity of data
25
Q

Who should have a data governance strategy?

A

Any organization that stores and utilizes data

26
Q

What are some benefits of data governance? (4)

A
  • Accessibility: Provides a reliable and consistent view of enterprise-wide data
  • Quality improvement: Ensures that there is a plan for improving the quality of data
  • Reduces silos: Maps the location of data in the enterprise, which reduces data silos
  • Improves data management overall
27
Q

What group(s) administer data governance?

A

Data management team

28
Q

In the context of data, what is a stakeholder?

A

An individual or group that makes, or is affected by, data-driven decisions within an organization

29
Q

What are data governance best practices for organizations? (3)

A
  • Data has integrity
  • Data-related decisions and controls are transparent and can be audited
  • Data is unbiased
30
Q

What is unbiased data?

A

General idea: the data represents all members of the population that an organization could serve.

This best practice will positively influence the development of ethical models and algorithms for analytic solutions.

31
Q

Is the data science project team involved in setting up a data governance framework for a client?

A

No

32
Q

The Data Management Body of Knowledge is a useful resource to a business/organization because:

A

It provides data management best practices to organizations and professionals looking to create and manage their data infrastructure.

33
Q

What do Data Architects do?

A
  • Handle the development of a data management plan
  • Work with other IT team members, especially data engineers, to ensure that data is collected and stored securely
34
Q

What’s the Data Management Body of Knowledge (DAMA-DMBOK)?

A
  • Provides a best-practices guide for data management professionals and organizations with a data infrastructure
  • Highlights eleven (11) areas of knowledge that guide a data management professional in ensuring that an organization’s data infrastructure follows best practices
35
Q

Data management is an organization’s way of doing what with/to data? (4)

A
  • Acquiring
  • Storing
  • Securing
  • Processing