Data Resource Management Flashcards

1
Q

the practices for achieving consistent access to and delivery of data across the spectrum of data subject areas and data structure types in the enterprise

A

data management

2
Q

processed, organized, and structured data

A

information

3
Q

often used to query tables

A

Structured Query Language (SQL)

4
Q

concerned with data policies, data procedures, access control, backup and recovery, and data classification standards to govern business-critical data

A

data governance

5
Q

looks for anomalies and meaningful patterns in data

A

Data discovery, or data mining

6
Q

a broad range of tools and practices that aim to support better strategic business decision-making and even claim to predict the future

A

business intelligence (BI)

7
Q

store large amounts of unstructured data in their raw form and allow for flexible analysis

A

data lakes

8
Q

well-thought-out collections of computer files that are storehouses of data for use by managers in making decisions

A

databases

9
Q

are where a database holds data

A

tables

10
Q

rows

A

records

11
Q

columns

A

fields

12
Q

questioned

A

queried

13
Q

application software used to create a collection of related files consisting of records of data separated into fields, which can be queried to produce populations of information

A

Database management system (DBMS)

14
Q

A field in a database table that uniquely identifies a record in the table

A

Primary key

15
Q

A field in a database table that provides a link between two tables in a relational database

A

Foreign key

16
Q

The organization or layout of a database that defines the tables, fields and constraints, keys, and integrity of the database

A

Schema

17
Q

data that is collected from all over the internet and other data sources

A

big data

18
Q

4 Vs of Big Data

A
  1. volume
  2. variety
  3. veracity
  4. velocity
19
Q

amount of data

A

Volume

20
Q

form of the data

A

Variety

21
Q

quality of data

A

Veracity

22
Q

speed at which the data is created

A

Velocity

23
Q

the examination of huge sets of data to find patterns, connections, outliers, and hidden relationships

A

data mining (data discovery)

24
Q

resides in fixed formats

A

structured data

25
**unorganized** data that cannot be easily read or processed by a computer because it is not stored in rows and columns like traditional data tables
Unstructured data
26
Data that **can be converted into structured data** with a lot of work
Semi-structured data
27
**set of software** that allows businesses to **gather** a large amount of **data** and use it to make business decisions based on what they find
Data mining tools
28
are **used to consolidate disparate data** in a **central location**
Data warehouses
29
are **one trillion tera**bytes of data
Yottabytes
30
**one thousand giga**bytes
terabyte
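The two size cards above can be checked with a little arithmetic. This is a minimal sketch that assumes decimal (SI) units, where each step up is a factor of 1,000; the `BYTES_PER_UNIT` table and `convert` helper are illustrative names, not part of the deck.

```python
# Decimal (SI) byte units: each step up multiplies by 1,000.
# Assumption: the cards mean decimal units, not binary ones (1 TiB = 1,024 GiB).
BYTES_PER_UNIT = {
    "kilobyte": 10**3,
    "megabyte": 10**6,
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}

def convert(value, from_unit, to_unit):
    """Convert a quantity between two units listed in BYTES_PER_UNIT."""
    return value * BYTES_PER_UNIT[from_unit] / BYTES_PER_UNIT[to_unit]

terabyte_in_gigabytes = convert(1, "terabyte", "gigabyte")    # one thousand
yottabyte_in_terabytes = convert(1, "yottabyte", "terabyte")  # one trillion
```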
31
are **smaller** and the systems are designed to **support the needs** of that **specific department**
Data mart data sets
32
two main big data tools
ETL & Hadoop
33
What does ETL stand for?
Extract, Transform, Load
34
**describes tools** that are used to **standardize data across systems** and allow the data to be queried | data integration process (type of software)
ETL
35
a **tool** that **lets you ask your data questions** that in turn lead to answers and assist in making decisions
Querying
36
Where are data often extracted from? | Big Data Tools
CRM or ERP systems
37
What is the next step after finding out where the data is coming from? | Big Data Tools
extract
38
What is the next step after extracting data? | Big Data Tools
Transform
39
may involve removing decimals and dollar signs from financial transactions so it will fit into the structured data table | Big Data Tools
transform
40
What is the next step after transforming data? | Big Data Tools
Load
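The extract, transform, and load steps on the cards above can be sketched end to end. This is a hedged illustration only: the `RAW` export, the `transactions` table, and the three helper functions are hypothetical, and storing amounts as whole cents is one possible reading of "removing decimals and dollar signs".

```python
import csv
import io
import sqlite3
from decimal import Decimal

# Hypothetical raw export, standing in for data pulled from a CRM or ERP system.
RAW = 'id,amount\n1,$19.99\n2,"$1,250.00"\n'

def extract(text):
    """Extract: read the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: strip dollar signs and separators, store whole cents."""
    cleaned = []
    for row in rows:
        amount = row["amount"].replace("$", "").replace(",", "")
        cents = int(Decimal(amount) * 100)  # Decimal avoids float rounding
        cleaned.append((int(row["id"]), cents))
    return cleaned

def load(rows):
    """Load: insert the cleaned rows into a structured table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions (id INTEGER PRIMARY KEY, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    return conn

conn = load(transform(extract(RAW)))
total_cents = conn.execute("SELECT SUM(amount_cents) FROM transactions").fetchone()[0]
```

Real ETL software does far more (scheduling, many source formats, error handling); the sketch only shows the order of the three steps.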
41
The second main big data tool set | Big Data Tools
Hadoop
42
an infrastructure for storing and processing large sets of data across multiple servers | Big Data Tools (type of software)
Hadoop
43
designed to handle unstructured and semi-structured data, which traditional databases may struggle with | Big Data Tools
Hadoop
44
uses a distributed file system that allows files to be stored on multiple servers | Big Data Tools
Hadoop
45
An alternative to Hadoop that has recently seen wider adoption
Apache Spark
46
the first step in data output
Sourcing data
47
the most widely used standard computer language for relational databases, as it allows a programmer to manipulate and query data
SQL
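A short example of SQL querying and manipulating data, run through Python's built-in `sqlite3` module. The `sales` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Manipulate data: create a table and insert a few made-up rows.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("West", 250.0), ("East", 50.0)],
)

# Query data: total sales per region.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Query data: only the rows above a threshold.
large_sales = conn.execute(
    "SELECT region, amount FROM sales WHERE amount > ?", (75.0,)
).fetchall()
```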
48
One commonly used tool for data output is software called...
Tableau
49
produces interactive data visualizations focused on business intelligence
Tableau
50
helps with simplifying raw data into information using different formats such as graphs, charts, and numerical analysis
Tableau
51
Which type of data is typically associated with social media posts?
unstructured data
52
What does the term variety refer to in the context of generating and collecting big data?
**Forms** of data
53
Which restriction applies to data located in the primary field of a database?
Each key must be **unique**
54
Which tool can a data analyst use to collect, process, and analyze unstructured data for storage in a company’s data warehouse? | type of data process (software)
Extract, Transform, Load (ETL) software
55
Which process can a data analyst use to identify useful patterns and hidden relationships in a large set of social media data?
Data mining
56
Which software tool is appropriate for a business analyst to use when creating visualizations to present social media data as business intelligence to an executive team?
Tableau
57
a business intelligence software used for creating interactive and visually appealing dashboards and visualizations that can be used to present data insights.
Tableau
58
Which approach should the company use to reduce the time associated with data management and better support the needs of individual departments, given that only specific departments are using 20% of the company’s data warehouse capacity?
Using a data mart
59
a smaller and **more targeted version of a data warehouse**, designed to meet the specific needs of a department or business unit
Data Mart
60
can be defined as **acquiring** data, **ensuring** the data are valid, and then **storing** and **processing** the **data** into usable information for a business
Data management processes
61
used to describe the process of **transforming data** into an accurate, clean, and **error-free form**
scrubbing the data
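As a rough sketch of what scrubbing can involve, the function below trims whitespace, standardizes case, and drops records with a missing or duplicate email. The record layout and the `scrub` helper are assumptions made for this example; real scrubbing rules depend on the data.

```python
def scrub(records):
    """Return cleaned records: trimmed, consistently cased, no missing or duplicate emails."""
    seen = set()
    clean = []
    for rec in records:
        email = (rec.get("email") or "").strip().lower()
        if not email:       # drop records with a missing email
            continue
        if email in seen:   # drop duplicates of an email already kept
            continue
        seen.add(email)
        name = (rec.get("name") or "").strip().title()
        clean.append({"name": name, "email": email})
    return clean

# Made-up input with stray spaces, mixed case, a duplicate, and a missing value.
raw_records = [
    {"name": " ada LOVELACE ", "email": "Ada@Example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "No Email", "email": None},
]
cleaned = scrub(raw_records)
```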
62
Three Steps for Collecting Data
  1. **determine** **purpose** and reason **for obtaining data**
  2. **develop business-related questions**
  3. **determine tools to acquire** data
63
having a **good plan** for organization and **ensuring** the **integrity** of your **data**
master data management (MDM)
64
a **methodology** or process used to define, organize, and **manage** all the **data** of an organization that provides a reference for decision-making
Master data management (MDM)
65
is **managing** the availability, integrity, and security of the data to ensure that the data remain high quality and **valid for data analytics**
Data Governance
66
Clean data starts when the database is created by including **database field (column) controls**
validity checks
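One way to picture field-level validity checks is a SQL `CHECK` constraint, shown here through Python's built-in `sqlite3` module. The `employees` table, its columns, and the specific rules are hypothetical; the email pattern is deliberately crude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL CHECK (email LIKE '%_@_%'),  -- must contain an @ with text on both sides
        age         INTEGER CHECK (age BETWEEN 16 AND 120)     -- must fall in a plausible range
    )
    """
)

def try_insert(row):
    """Return True if the database accepts the row, False if a check rejects it."""
    try:
        conn.execute("INSERT INTO employees VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

valid_row = try_insert((1, "ada@example.com", 30))
bad_email = try_insert((2, "not-an-email", 30))
bad_age = try_insert((3, "bob@example.com", 200))
```

Because the rules live in the table definition, bad values are rejected at entry rather than cleaned up later.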
67
**requires** the **whole organization to buy into** being stakeholders of the data, not just the database administrators or the programmers or the executives
Data governance
68
measures the **gain or loss** generated by intelligent data management relative to the amount of money invested
return on investment (ROI)
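The card does not give a formula, so the sketch below assumes the common textbook form, ROI = (returns - invested) / invested. The function name and the dollar figures are made up for illustration.

```python
def roi(returns, invested):
    """ROI as a fraction of the amount invested (assumed textbook formula)."""
    if invested == 0:
        raise ValueError("invested must be non-zero")
    return (returns - invested) / invested

gain_case = roi(returns=65_000, invested=50_000)  # positive: a gain
loss_case = roi(returns=40_000, invested=50_000)  # negative: a loss
```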
69
What is the purpose of data governance in an organization?
Manage and improve the quality of data
70
manage and improve the quality of data across an organization, ensuring it is accurate, complete, and consistent
data governance
71
involves establishing policies, procedures, and controls to ensure data integrity and reliability
data governance
72
Which function is included within the scope of data governance?
Maintaining updated data
73
The term that encompasses **patterns**, **correlations**, and **hidden data relationships**
data relationships
74
methodology of **reviewing** raw **data** using **qualitative** and **quantitative methods**
data analytics
75
It **looks for patterns** and hidden information to exploit **for enhanced productivity and business success**
data analytics
76
Benefits of Data Management
  * find data relationships
  * predictive analytics
  * business intelligence
  * data analysis
77
helps organizations make better decisions
business intelligence
78
a **database technology** that has been optimized for **querying and reporting**, instead of processing transactions | type of processing
Online Analytical Processing (OLAP)
79
are designed to speed up the retrieval of data
OLAP databases
80
is **applying statistics** and logic techniques to **define, illustrate, and evaluate data**
Data analysis
81
attempts to **make sense of** an organization's **collected data**, turn those data into useful information, and validate the organization's future decisions
data analysis
82
enables you to **sift through** large sets of **data** and **identify** the most common and **most important topics** in an easy, fast, and scalable way
Topic analytics
83
is the process of **extracting information from written sources** such as websites, e-books, and emails and inserting the data into a database to evaluate and interpret relevance or to understand customers' feedback on products and services
Text analytics (text mining)
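A very small sketch of the idea behind the topic and text analytics cards: pull words out of written feedback and count the most frequent ones. Real tools use much richer language processing; the `STOP_WORDS` list, the `top_terms` helper, and the feedback strings are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny hand-picked stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "is", "and", "was", "but", "to", "of", "it", "i"}

def top_terms(texts, n=3):
    """Count non-stop-words across all texts and return the n most common."""
    words = []
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOP_WORDS:
                words.append(word)
    return Counter(words).most_common(n)

# Made-up customer feedback.
feedback = [
    "The battery is great",
    "Battery died fast and the screen cracked",
    "Great screen, great battery",
]
common = top_terms(feedback, n=3)
```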
84
attempts to **make connections** between data so organizations can try to **predict future trends** that may give them a competitive advantage
Business analytics
85
**builds on predictive analysis** to make decisions about future industries and marketplaces | forms of business analytics
Prescriptive analytics
86
**attempts to reveal future patterns** in a marketplace, essentially trying to predict the future by looking for data correlations between one thing and any other things that pertain to it | forms of business analytics
Predictive analytics
87
defines past data you already have that **can be grouped into significant pieces**, like a department's sales results, and starts to reveal trends | forms of business analytics
Descriptive analytics
88
**looks at an organization's internal data**, analyzes external conditions like supply abundance, and **endorses the best action** | forms of business analytics
decision analytics
89
A company wants to improve its marketing strategies by analyzing customer data. What is the purpose of data mining in this context?
To identify patterns and correlations in the data
90
A data analyst wants to analyze social media posts to discover patterns in customer behavior and sentiments. What type of analytics is suitable for this task?
Text analytics
91
As the volume of data continues to grow exponentially, businesses **face the challenge of managing diverse data types** (structured, semi-structured, and unstructured) and processing them in real time | Challenges of Data Analytics and Business Intelligence
Handling the volume, variety, and velocity of data
92
Modern data analytics and business intelligence solutions increasingly **rely on AI and machine learning algorithms** to extract insights, make predictions, and automate decision-making | Challenges of Data Analytics and Business Intelligence
Incorporating AI and machine learning
93
As data volume and complexity grow, businesses must **ensure** that their **data** analytics and business intelligence **solutions** are **scalable**, **capable** of handling increased workloads and adapting to evolving needs | Challenges of Data Analytics and Business Intelligence
Scalability
94
With increasing **data protection regulations**, such as GDPR and CCPA, businesses must ensure they **handle data securely** and **comply** with relevant **legislation** | Challenges of Data Analytics and Business Intelligence
Data privacy and compliance
95
Effective data analytics and business intelligence initiatives **require collaboration** between different stakeholders, including data scientists, analysts, IT professionals, and business users | Challenges of Data Analytics and Business Intelligence
Collaboration and communication
96
**Empowering business users** with self-service analytics tools and **easy access to data** can improve decision-making across the organization | Challenges of Data Analytics and Business Intelligence
Democratizing data access
97
**addresses** the **intangible values of data loss** or a **decrease in operating efficiencies**
qualitative ROI
98
where businesses **implement processes to protect the actual data** from getting stolen or tampered with in the database computers
Data level security
99
**encrypting the data** so that only those with authorized access can decrypt it
encryption
100
**protecting the hardware** that the database resides on and other communications equipment from malicious software that tries to enter the system
System level security
101
starts with **log-on IDs and passwords** but can go further in verification to restrict the user from visiting unauthorized websites or downloading from untrusted sources
User-level security
102
Which level of security protects the hardware that a database resides on?
System level security
103
What is the meaning of return on investment (ROI) based on qualitative investments in an organization?
Earnings from investments in intangible assets that are difficult to quantify but result in positive outcomes
104
basic processing technique used to determine counts of information from a database
OLAP
105
will reduce redundancy in the database
normalization
106
results in putting data into a consistent structure
scrubbing the data
107
are not located on a physical server within a corporation | type of database
cloud databases
108
is the field in a database that links two tables together
foreign key
109
the process of retrieving data to load into the database
extraction
110
Which process should a data analyst use to remove missing, misplaced, or duplicate data from a dataset?
Normalization
111
is the process of removing redundancies in data
normalization
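To make "removing redundancies" concrete, the sketch below splits a flat list of orders, where customer details repeat on every row, into a customer table and an order table linked by an ID. The data and the `normalize` helper are hypothetical.

```python
# Flat data: the customer's name and city repeat on every order row.
flat_orders = [
    {"order_id": 1, "customer": "Ada", "city": "London", "total": 25.0},
    {"order_id": 2, "customer": "Ada", "city": "London", "total": 40.0},
    {"order_id": 3, "customer": "Grace", "city": "New York", "total": 15.0},
]

def normalize(rows):
    """Split flat rows into a customer table and an order table."""
    customer_ids = {}  # (name, city) -> customer_id
    orders = []
    for row in rows:
        key = (row["customer"], row["city"])
        # Assign the next ID the first time a customer is seen.
        customer_id = customer_ids.setdefault(key, len(customer_ids) + 1)
        orders.append(
            {"order_id": row["order_id"], "customer_id": customer_id, "total": row["total"]}
        )
    customers = [
        {"customer_id": cid, "name": name, "city": city}
        for (name, city), cid in customer_ids.items()
    ]
    return customers, orders

customers, orders = normalize(flat_orders)
```

Each customer's details are now stored once, and the `customer_id` in each order plays the role of a foreign key.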
112
provide a visual representation of data, making it easy to see patterns, trends, and relationships
spreadsheets
113
used for tracking inventory, project management, budgeting, and various other tasks that require data management
spreadsheets
114
the average value of a dataset
Mean
115
can be used to calculate the mean of a range of values
AVERAGE function
116
middle value of a dataset
Median
117
can be used to calculate the median of a range of values
MEDIAN function
118
is the value that occurs most frequently in a dataset
Mode
119
used to calculate the mode of a range of values
MODE function
120
a measure of the amount of variation or dispersion in a dataset
Standard Deviation
121
can be used to calculate the standard deviation of a range of values
STDEV function
122
the lowest and highest values in a dataset
Minimum and Maximum
123
can be used to calculate the minimum and maximum values of a range of values
MIN and MAX functions
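For comparison with the spreadsheet functions on the cards above, the same summary statistics can be computed with Python's standard `statistics` module. The sample values are made up; `statistics.stdev` is the sample standard deviation, which corresponds to Excel's STDEV rather than STDEVP.

```python
import statistics

values = [2, 4, 4, 5, 7, 8]  # made-up sample data

summary = {
    "mean": statistics.mean(values),      # like AVERAGE
    "median": statistics.median(values),  # like MEDIAN
    "mode": statistics.mode(values),      # like MODE
    "stdev": statistics.stdev(values),    # sample standard deviation, like STDEV
    "min": min(values),                   # like MIN
    "max": max(values),                   # like MAX
}
```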
124
What describes an argument in an Excel "IF" statement?
A value used to determine the outcome
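Excel's IF takes three arguments: a logical test, a value to return if the test is true, and a value to return if it is false. The Python function below is only an analogy to show how each argument shapes the outcome; `excel_if` is a hypothetical name.

```python
def excel_if(logical_test, value_if_true, value_if_false):
    """Mimic the shape of Excel's IF(logical_test, value_if_true, value_if_false)."""
    return value_if_true if logical_test else value_if_false

# Comparable to =IF(A1>=60, "Pass", "Fail") with A1 holding the score.
score = 72
result = excel_if(score >= 60, "Pass", "Fail")
```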
125
Why is it important to conduct data hygiene practices?
Because data become decayed and outdated
126
An analyst uses software to analyze data in a company’s data warehouse and produce information presented in understandable charts and graphs on a dashboard. This information is used to inform decisions in the organization. Which software is used to conduct this data mining for presentation of the information?
Business intelligence
127
is used to simplify raw data into different formats that can be understood using graphs, charts, and numerical analyses
Business intelligence software
128
Which term describes the field that provides a link between two tables in a relational database?
Foreign key
129
is a field in a table that links to the primary key in a different table in a database
foreign key
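The primary key and foreign key cards can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables are hypothetical, and SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute(
    "CREATE TABLE customers ("
    " customer_id INTEGER PRIMARY KEY,"  # primary key: unique for every record
    " name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(customer_id),"  # foreign key
    " total REAL NOT NULL)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")

def insert_order(order_id, customer_id, total):
    """Try to insert an order and report whether the database accepted it."""
    try:
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (order_id, customer_id, total)
        )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

first = insert_order(10, 1, 25.0)     # customer 1 exists
orphan = insert_order(11, 99, 5.0)    # no customer 99, so the foreign key fails
duplicate = insert_order(10, 1, 9.0)  # order_id 10 is already used as a primary key
```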
130
Which level of security is required to protect the hardware that supports a database?
system-level
131
A data analyst is using the ETL process to enter data into a company's relational database. The data contain many redundancies. Which process transforms the data into an accurate, clean, and error-free form?
normalization
132
the process of removing redundancies in data; it can be part of the "transform" step of ETL
normalization