Glossary of terms Flashcards

(229 cards)

1
Q

A/B Testing

A

The process of testing two variations of the same web page to determine which page is more successful at attracting user traffic and generating revenue

2
Q

Access Control

A

Features such as password protection, user permissions, and encryption that are used to protect a spreadsheet

3
Q

Accuracy

A

The degree to which the data conforms to the actual entity being measured or described

4
Q

Action-oriented questions

A

A question whose answers lead to change

5
Q

Administrative metadata

A

Metadata that indicates the technical source of a digital asset

6
Q

Agenda

A

A list of scheduled appointments

7
Q

Algorithm

A

A process or set of rules followed for a specific task

8
Q

Analytical skills

A

Qualities and characteristics associated with using facts to solve problems

9
Q

Attribute

A

A characteristic or quality of data used to label a column in a table.

10
Q

Audio file

A

Digitized audio storage usually in an MP3, AAC, or other compressed format

11
Q

AVERAGE

A

A spreadsheet function that returns an average of the values from a selected range

12
Q

Bad data source

A

A data source that is not reliable, original, comprehensive, current, and cited (ROCCC)

13
Q

Bias

A

A conscious or subconscious preference in favor of or against a person, group of people, or a thing

14
Q

Big data

A

Large, complex datasets typically involving long periods of time, which enable data analysts to address far-reaching business problems

15
Q

Boolean data

A

A data type with only two possible values, usually true or false

16
Q

Borders

A

Lines that can be added around two or more cells on a spreadsheet

17
Q

Business task

A

The question or problem data analysis resolves for a business

18
Q

CASE

A

A SQL statement that returns records that meet conditions by including an if/then statement in a query

19
Q

CAST

A

A SQL function that converts data from one datatype to another

20
Q

Cell reference

A

A cell or a range of cells in a worksheet typically used in formulas and functions

21
Q

Changelog

A

A file containing a chronologically ordered list of modifications made to a project

22
Q

Clean data

A

Data that is complete, correct, and relevant to the problem being solved

23
Q

Cloud

A

A place to keep data online, rather than a computer hard drive.

24
Q

COALESCE

A

A SQL function that returns the first non-null value in a list
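
As a rough illustration of the three SQL terms defined so far (CASE, CAST, and COALESCE), the sketch below runs one query against an in-memory SQLite database using Python's standard sqlite3 module. The `customers` table, its columns, and the threshold of 10 purchases are hypothetical and chosen only for this example; other SQL dialects may differ in detail.

```python
import sqlite3

# Hypothetical table for illustration only; none of these names come from the glossary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, nickname TEXT, purchases TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Ana", None, "12"), ("Ben", "Benny", "3")],
)

rows = conn.execute(
    """
    SELECT
        COALESCE(nickname, name) AS display_name,      -- first non-null value in the list
        CAST(purchases AS INTEGER) AS purchase_count,  -- text converted to an integer
        CASE
            WHEN CAST(purchases AS INTEGER) >= 10 THEN 'frequent'
            ELSE 'occasional'
        END AS segment                                 -- if/then logic inside the query
    FROM customers
    ORDER BY name
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

Ana has no nickname, so COALESCE falls back to her name; Ben's nickname is used because it is the first non-null value.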

25
Compatibility
How well two or more datasets are able to work together
26
Completeness
The degree to which the data contains all desired components or measures
27
CONCAT
A SQL function that adds strings together to create new text strings that can be used as unique keys
28
CONCATENATE
A spreadsheet function that joins together two or more text strings
29
Conditional formatting
A spreadsheet tool that changes how cells appear when values meet specific conditions
30
Confidence interval
A range of values that conveys how likely a statistical estimate reflects the population
31
Confidence level
The probability that a sample size accurately reflects the greater population
32
Confirmation bias
The tendency to search for or interpret information in a way that confirms pre-existing beliefs
33
Consent
The aspect of data ethics that presumes an individual's right to know how and why their personal data will be used before agreeing to provide it
34
Consistency
The degree to which data is repeatable from a different point of entry or collection
35
Context
The condition in which something exists or happens
36
Continuous data
Data that is measured and can have almost any numeric value
37
Cookie
A small file stored on a computer that contains information about its users
38
COUNT
A spreadsheet function that counts the number of cells in a range that contain numeric values
39
COUNTIF
A spreadsheet function that returns the number of cells that match a specified value
40
Cross-field validation
A process that ensures certain conditions for multiple data fields are satisfied
41
CSV (comma-separated values) file
A delimited text file that uses a comma to separate values
42
Currency
The aspect of data ethics that presumes individuals should be aware of financial transactions resulting from the use of their personal data and the scale of those transactions
43
Dashboard
A tool that monitors live, incoming data
44
Data
A collection of facts
45
Data analysis
The collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making.
46
Data analysis process
The six phases of ask, prepare, process, analyze, share, and act whose purpose is to gain insights that drive informed decision-making
47
Data analyst
Someone who collects, transforms, and organizes data in order to drive informed decision-making
48
Data analytics
The science of data
49
Data anonymization
The process of protecting people's private or sensitive data by eliminating identifying information
50
Data bias
When a preference in favor of or against a person, group of people, or thing systematically skews data analysis results in a certain direction
51
Data constraints
The criteria that determine whether a piece of data is clean and valid
52
Data design
How information is organized
53
Data-driven decision-making
Using facts to guide business strategy.
54
Data ecosystem
The various elements that interact with one another in order to produce, manage, store, organize, analyze, and share data
55
Data element
A piece of information in a dataset
56
Data engineer
A professional who transforms data into a useful format for analysis and gives it a reliable infrastructure
57
Data ethics
Well-founded standards of right and wrong that dictate how data is collected, shared, and used
58
Data governance
A process for ensuring the formal management of a company's data assets
59
Data-inspired decision-making
The process of exploring different data sources to find out what they have in common
60
Data integrity
The accuracy, completeness, consistency, and trustworthiness of data throughout its life cycle
61
Data interoperability
A key factor leading to the successful use of open data among companies and governments
62
Data life cycle
The sequence of stages that data experiences, which include plan, capture, manage, analyze, archive, and destroy
63
Data manipulation
The process of changing data to make it more organized and easier to read
64
Data mapping
The process of matching fields from one data source to another
65
Data merging
The process of combining two or more datasets into a single dataset
66
Data model
A tool for organizing data elements and how they relate to one another
67
Data privacy
Preserving a data subject's information any time a data transaction occurs
68
Data range
Numerical values that fall between predefined maximum and minimum values
69
Data replication
The process of storing data in multiple locations
70
Data science
A field of study that uses raw data to create new ways of modeling and understanding the unknown
71
Data security
Protecting data from unauthorized access or corruption by adopting safety measures
72
Data strategy
The management of the people, processes, and tools used in data analysis
73
Data transfer
The process of copying data from a storage device to computer memory or from one computer to another
74
Data type
An attribute that describes a piece of data based on its values, its programming language, or the operations it can perform
75
Data validation
A tool for checking the accuracy and quality of data
76
Data visualization
The graphical representation of data
77
Data warehousing specialist
A professional who develops processes and procedures to effectively store and organize data
78
Database
A collection of data stored in a computer system
79
Dataset
A collection of data that can be manipulated or analyzed as one unit
80
DATEDIF
A spreadsheet function that calculates the number of days, months, or years between two dates
81
Delimiter
A character that indicates the beginning or end of a data item
82
Descriptive metadata
Metadata that describes a piece of data and can be used to identify it at a later point in time
83
Digital photo
An electronic or computer-based image usually in BMP or JPG format
84
Dirty data
Data that is incomplete, incorrect, or irrelevant to the problem to be solved
85
Discrete data
Data that is counted and has a limited number of values
86
DISTINCT
A keyword that is added to a SQL SELECT statement to retrieve only non-duplicate entries
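
A minimal sketch of DISTINCT, again using Python's sqlite3 module with an in-memory database. The `orders` table and its contents are made up for the example.

```python
import sqlite3

# Hypothetical data: Ana appears twice, so her name is a duplicate entry.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT)")
conn.executemany("INSERT INTO orders VALUES (?)", [("Ana",), ("Ben",), ("Ana",)])

# Without DISTINCT, every row is returned, duplicates included.
all_rows = conn.execute("SELECT customer FROM orders ORDER BY customer").fetchall()

# With DISTINCT, each customer value is returned only once.
unique_rows = conn.execute(
    "SELECT DISTINCT customer FROM orders ORDER BY customer"
).fetchall()
conn.close()

print(all_rows)
print(unique_rows)
```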
87
Duplicate data
Any record that inadvertently shares data with another record
88
Equation
A calculation of data that involves addition, subtraction, multiplication, or division (also called math expression)
89
Estimated response rate
The average number of people who typically complete a survey
90
Ethics
Well-founded standards of right and wrong that prescribe what humans ought to do, usually in terms of rights, obligations, benefits to society, fairness, or specific virtues
91
Experimenter bias
The tendency for different people to observe things differently (also called observer bias)
92
External data
Data that lives, and is generated, outside of an organization
93
Fairness
A quality of data that does not create or reinforce bias
94
Field
A single piece of information from a row or column of a spreadsheet; in a data table, typically a column in the table
95
Field length
A tool for determining how many characters can be keyed into a spreadsheet field
96
Fill handle
A box in the lower-right-hand corner of a selected spreadsheet cell that can be dragged through neighboring cells in order to continue an instruction
97
Filtering
The process of showing only the data that meets specified criteria while hiding the rest
98
Find and replace
A tool that finds a specified search term and replaces it with something else
99
First-party data
Data collected by an individual or group using their own resources
100
Float
A number that contains a decimal
101
Foreign key
A field within a database table that is a primary key in another table (Refer to primary key)
102
Formula
A set of instructions used to perform a calculation using the data in a spreadsheet
103
FROM
The section of a query that indicates where the selected data comes from
104
Function
A preset command that automatically performs a specified process or task using the data in a spreadsheet
105
Gap analysis
A method for examining and evaluating the current state of a process in order to identify opportunities for improvement in the future.
106
General Data Protection Regulation of the European Union (GDPR)
A European Union regulation created to help protect people and their data
107
Geolocation
The geographical location of a person or device by means of digital information
108
Good data source
A data source that is reliable, original, comprehensive, current, and cited (ROCCC)
109
Header
The first row in a spreadsheet that labels the type of data in each column
110
Hypothesis testing
A process to determine if a survey or experiment has meaningful results
111
Incomplete data
Data that is missing important fields
112
Inconsistent data
Data that uses different formats to represent the same thing
113
Incorrect/inaccurate data
Data that is complete but inaccurate
114
Internal data
Data that lives within a company's own systems
115
Interpretation bias
The tendency to interpret ambiguous situations in a positive or negative way
116
Leading question
A question that steers people towards a certain response
117
LEFT
A function that returns a set number of characters from the left side of a text string
118
LEN
A function that returns the length of a text string by counting the number of characters it contains
119
Length
The number of characters in a text string
120
Long data
A dataset in which each row is a one-time point per subject, so each subject has data in multiple rows
121
Mandatory
A data value that cannot be left blank or empty
122
Margin of error
The maximum amount that the sample results are expected to differ from those of the actual population
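
One common way to compute a margin of error, shown here only as a sketch: for a sample proportion, under the normal approximation, it is the z-score for the chosen confidence level times the standard error. The glossary does not give a formula, so the formula, the function names, and the default z-score of 1.96 (roughly a 95% confidence level) are all assumptions of this example.

```python
import math


def margin_of_error(sample_proportion: float, sample_size: int, z_score: float = 1.96) -> float:
    """Margin of error for a proportion under the normal approximation (an assumption)."""
    standard_error = math.sqrt(sample_proportion * (1 - sample_proportion) / sample_size)
    return z_score * standard_error


def confidence_interval(sample_proportion: float, sample_size: int, z_score: float = 1.96):
    """Return (low, high): the sample proportion plus or minus the margin of error."""
    moe = margin_of_error(sample_proportion, sample_size, z_score)
    return (sample_proportion - moe, sample_proportion + moe)


# Example: 60% of 400 hypothetical survey respondents answered "yes".
low, high = confidence_interval(0.6, 400)
print(f"margin of error: {margin_of_error(0.6, 400):.3f}")
print(f"confidence interval: {low:.3f} to {high:.3f}")
```

A larger sample size shrinks the standard error, which is why the margin of error narrows as more responses are collected.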
123
Math expression
A calculation that involves addition, subtraction, multiplication, or division (also called an equation)
124
MAX
A spreadsheet function that returns the largest numeric value from a range of cells
125
Measurable question
A question whose answers can be quantified and assessed
126
Mentor
Someone who shares knowledge, skills, and experience to help another grow both professionally and personally
127
Merger
An agreement that unites two organizations into a single new one
128
Metadata
Data about data
129
Metadata repository
A database created to store metadata
130
Metric
A single, quantifiable type of data that is used for measurement
131
Metric goal
A measurable goal set by a company and evaluated using metrics
132
MID
A function that returns a segment from the middle of a text string
133
MIN
A spreadsheet function that returns the smallest numeric value from a range of cells
134
Naming conventions
Consistent guidelines that describe the content, creation date, and version of a file in its name
135
Networking
Building relationships by meeting people both in-person and online
136
Nominal data
A type of qualitative data that is categorized without a set order
137
Normalized database
A database in which only related data is stored in each table
138
Notebook
An interactive, editable programming environment for creating data reports and showcasing data skills
139
Null
An indication that a value does not exist in a dataset
140
Observation
The attributes that describe a piece of data contained in a row of a table.
141
Observer bias
The tendency for different people to observe things differently (also called experimenter bias)
142
Open data
Data that is available to the public
143
Openness
The aspect of data ethics that promotes the free access, usage, and sharing of data
144
Operator
A symbol that names the operation or calculation to be performed
145
Order of operations
Using parentheses to group together spreadsheet values in order to clarify the order in which operations should be performed.
146
Ordinal data
Qualitative data with a set order or scale
147
Outdated data
Any data that has been superseded by newer and more accurate information
148
Ownership
The aspect of data ethics that presumes individuals own the raw data they provide and have primary control over its usage, processing, and sharing
149
Pivot chart
A chart created from the fields in a pivot table
150
Pivot table
A data summarization tool used to sort, reorganize, group, count, total, or average data
151
Pixel
In digital imaging, a small area of illumination on a display screen that, when combined with other adjacent areas, forms a digital image
152
Population
In data analytics, all possible data values in a dataset
153
Primary key
An identifier in a database that references a column in which each value is unique (Refer to foreign key)
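
The relationship between a primary key and a foreign key can be sketched with two small tables. This example uses Python's sqlite3 module; the `customers` and `orders` tables are hypothetical. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`, which is a SQLite-specific detail rather than part of the definitions above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# customer_id is the primary key of customers: each value must be unique.
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")

# orders.customer_id is a foreign key: it refers to the primary key of customers.
conn.execute(
    """
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers (customer_id),
        total REAL
    )
    """
)
conn.execute("INSERT INTO customers VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")

# The shared key lets the two tables be connected in a query.
joined = conn.execute(
    "SELECT o.order_id, c.name FROM orders AS o "
    "JOIN customers AS c ON o.customer_id = c.customer_id"
).fetchall()

# An order pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (101, 99, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
conn.close()

print(joined, orphan_rejected)
```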
154
Problem domain
The area of analysis that encompasses every activity affecting or affected by a problem.
155
Problem types
The various problems that data analysts encounter, including categorizing things, discovering connections, finding patterns, identifying themes, making predictions, and spotting something unusual.
156
Qualitative data
A subjective and explanatory measure of a quality or characteristic
157
Quantitative data
A specific and objective measure, such as a number, quantity, or range
158
Query
A request for data or information from a database
159
Query language
A computer programming language used to communicate with a database
160
Random sampling
A way of selecting a sample from a population so that every possible type of the sample has an equal chance of being chosen
161
Range
A collection of two or more cells in a spreadsheet
162
Record
A collection of related data in a data table, usually synonymous with row
163
Redundancy
When the same piece of data is stored in two or more places
164
Reframing
Restating a problem or challenge, then redirecting it toward a potential resolution
165
Regular expression (RegEx)
A rule that says the values in a table must match a prescribed pattern
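
As an illustration of checking values against a prescribed pattern, the sketch below uses Python's standard `re` module. The rule itself (a code must be exactly five digits) and the sample values are invented for this example.

```python
import re

# Hypothetical rule: a postal-style code must be exactly five digits.
FIVE_DIGITS = re.compile(r"\d{5}")


def matches_pattern(value: str) -> bool:
    """Return True only if the whole value matches the five-digit pattern."""
    return FIVE_DIGITS.fullmatch(value) is not None


values = ["30301", "3030", "3O301"]  # the last one contains the letter O, not a zero
invalid = [value for value in values if not matches_pattern(value)]
print(invalid)
```

Using `fullmatch` rather than `search` matters here: the rule applies to the entire value, not to any five digits found somewhere inside it.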
166
Relational database
A database that contains a series of tables that can be connected to form relationships
167
Relevant question
A question that has significance to the problem to be solved.
168
Analytical thinking
The process of identifying and defining a problem, then solving it by using data in an organized, step-by-step manner.
169
Remove duplicates
A spreadsheet tool that automatically searches for and eliminates duplicate entries from a spreadsheet
170
Report
A static collection of data periodically given to stakeholders
171
Return on investment (ROI)
A formula that uses metrics of investment and profit to evaluate the success of an investment
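
The glossary does not spell the formula out. One widely used form, assumed here, divides net profit by the cost of the investment. The function name and the sample figures are hypothetical.

```python
def return_on_investment(net_profit: float, investment_cost: float) -> float:
    """ROI as net profit divided by investment cost (one common form, assumed here)."""
    if investment_cost == 0:
        raise ValueError("investment_cost must be non-zero")
    return net_profit / investment_cost


# Example: a 2,000 investment that produced 500 in net profit.
roi = return_on_investment(500, 2000)
print(f"ROI: {roi:.0%}")
```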
172
Revenue
The total amount of income generated by the sale of goods or services
173
RIGHT
A function that returns a set number of characters from the right side of a text string
174
Root cause
The reason why a problem occurs
175
Sample
In data analytics, a segment of a population that is representative of the entire population
176
Sampling bias
Overrepresenting or underrepresenting certain members of a population as a result of working with a sample that is not representative of the population as a whole
177
Schema
A way of describing how something, such as data, is organized
178
Scope of work (SOW)
An agreed-upon outline of the tasks to be performed during a project
179
Second-party data
Data collected by a group directly from its audience and then sold
180
SELECT
The section of a query that indicates the subset of a dataset
181
Small data
Small, specific data points typically involving a short period of time, which are useful for making day-to-day decisions
182
SMART methodology
A tool for determining a question's effectiveness based on whether it is specific, measurable, action-oriented, relevant, and time-bound
183
Social media
Websites and applications through which users create and share content or participate in social networking
184
Soft skills
Nontechnical traits and behaviors that relate to how people work
185
Sorting
The process of arranging data into a meaningful order to make it easier to understand, analyze, and visualize
186
Specific question
A question that is simple, significant, and focused on a single topic or a few closely related ideas.
187
Split
A function that divides text around a specified character and puts each fragment into a new, separate cell
188
Sponsor
A professional advocate who is committed to moving the career of another
189
Spreadsheet
A digital worksheet
190
SQL
(Refer to Structured Query Language)
191
Stakeholders
People who invest time and resources into a project and are interested in its outcome
192
Statistical power
The probability that a test of significance will recognize an effect that is present
193
Statistical significance
The probability that sample results are not due to random chance
194
String data type
A sequence of characters and punctuation that contains textual information (Refer to Text data type)
195
Structural metadata
Metadata that indicates how a piece of data is organized and whether it is part of one or more than one data collection
196
Structured data
Data organized in a certain format such as rows and columns
197
Structured Query Language
A computer programming language used to communicate with a database
198
Structured thinking
The process of recognizing the current problem or situation, organizing available information, revealing gaps and opportunities, and identifying options
199
SUBSTR
A SQL function that extracts a substring from a string variable
200
Substring
A smaller subset of a text string
201
SUM
A spreadsheet function that adds the values of a selected range of cells
202
Syntax
The predetermined structure of a language that includes all required words, symbols, and punctuation, as well as their proper placement
203
Technical mindset
The ability to break things down into smaller pieces and work with them in an orderly and logical way
204
Text data type
A sequence of characters and punctuation that contains textual information (also called string data type)
205
Text string
A group of characters within a cell, most often composed of letters
206
Third-party data
Data provided from outside sources who didn't collect it directly
207
Time-bound question
A question that specifies a timeframe to be used
208
Transaction transparency
The aspect of data ethics that presumes all data-processing activities and algorithms should be explainable and understood by the individual who provides the data
209
Transferable skills
Skills and qualities that can transfer from one job or industry to another
210
TRIM
A function that removes leading, trailing, and repeated spaces in data
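
The behavior described above can be approximated in Python as follows. This is a sketch of the idea, not the spreadsheet function itself, and the helper name `trim` is chosen only for the example.

```python
def trim(text: str) -> str:
    """Remove leading and trailing spaces and collapse repeated spaces into one."""
    # Splitting on a single space yields empty strings wherever spaces repeat;
    # dropping those and rejoining leaves exactly one space between words.
    return " ".join(part for part in text.split(" ") if part)


cleaned = trim("  data   analyst ")
print(repr(cleaned))
```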
211
Turnover rate
The rate at which employees voluntarily leave a company
212
Typecasting
Converting data from one type to another
213
Unbiased sampling
When the sample of the population being measured is representative of the population as a whole
214
Unfair question
A question that makes assumptions or is difficult to answer honestly
215
Unique
A value that can't have a duplicate
216
United States Census Bureau
An agency in the U.S. Department of Commerce that serves as the nation's leading provider of quality data about its people and economy
217
Unstructured data
Data that is not organized in any easily identifiable manner
218
Validity
The degree to which the data conforms to constraints when it is input, collected, or created
219
Verification
A process to confirm that a data-cleaning effort was well executed and the resulting data is accurate and reliable
220
Video file
A collection of images, audio files, and other data usually encoded in a compressed format such as MP4, M4V, MOV, AVI, or FLV
221
Visualization
(Refer to data visualization)
222
VLOOKUP
A spreadsheet function that vertically searches for a certain value in a column to return a corresponding piece of information
223
WHERE
The section of a query that specifies criteria that the requested data must meet
224
Wide data
A dataset in which every data subject has a single row with multiple columns to hold the values of various attributes of the subject
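
To make the contrast between long data and wide data concrete, here is a small sketch in plain Python that reshapes one into the other. The countries, years, population figures, and the `long_to_wide` helper are all hypothetical.

```python
# Long data: each row is one time point per subject, so each subject spans several rows.
long_rows = [
    {"country": "A", "year": 2020, "population": 10},
    {"country": "A", "year": 2021, "population": 11},
    {"country": "B", "year": 2020, "population": 20},
    {"country": "B", "year": 2021, "population": 21},
]


def long_to_wide(rows, subject_key, time_key, value_key):
    """Collapse long rows into one row per subject, with one column per time point."""
    wide = {}
    for row in rows:
        subject = wide.setdefault(row[subject_key], {subject_key: row[subject_key]})
        subject[f"{value_key}_{row[time_key]}"] = row[value_key]
    return list(wide.values())


# Wide data: each subject has a single row with one column per year.
wide_rows = long_to_wide(long_rows, "country", "year", "population")
for row in wide_rows:
    print(row)
```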
225
World Health Organization
An organization whose primary role is to direct and coordinate international health within the United Nations system
226
COUNTA
A spreadsheet function that counts the total number of values within a specified range
227
ORDER BY
A SQL clause that sorts results returned in a query
228
LIMIT
A SQL clause that specifies the maximum number of records returned in a query
229
ROUND
A SQL function that returns a number rounded to a certain number of decimal places
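
Several of the SQL terms in this glossary (SELECT, FROM, WHERE, ORDER BY, LIMIT, and ROUND) can appear together in a single query. The sketch below runs one against an in-memory SQLite database through Python's sqlite3 module; the `sales` table and its values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.46), ("south", 99.94), ("east", 55.52), ("west", 2.25)],
)

top_two = conn.execute(
    """
    SELECT region, ROUND(amount, 1) AS rounded_amount  -- columns to return, rounded to 1 decimal
    FROM sales                                         -- where the data comes from
    WHERE amount > 5                                   -- criteria each record must meet
    ORDER BY amount DESC                               -- sort largest first
    LIMIT 2                                            -- return at most two records
    """
).fetchall()
conn.close()

for region, rounded_amount in top_two:
    print(region, rounded_amount)
```

The "west" row is excluded by WHERE, and "north" is cut off by LIMIT after sorting.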