Mid Term Flashcards

(66 cards)

1
Q

Define noisy data

A

Containing errors or outliers

2
Q

Tabular form

A

Data has rows and columns

3
Q

Define variable

A

a storage mechanism for a particular identifier, which contains information referred to as a value

4
Q

Define randomization

A

the practice of using chance methods to assign participants to experimental conditions, without bias and without using any knowledge about the participants

5
Q

WHERE

A

Filters the results to the rows that meet a specified condition (e.g., age = 35)

6
Q

What is business analytics

A

The use of data to gain insights that maximize business outcomes

7
Q

3 steps of getting data ready for analysis

A

clean, structure, integrate

8
Q

5 stages of business analytics

A
  • data wrangling
  • descriptive analytics
  • predictive analytics
  • prescriptive analytics
  • storytelling
9
Q

data wrangling

A

wrestling with data to get it in a more structured format that is useful for analytics

10
Q

Data integration

A

connecting two sources of data to offer more insights than each source would yield separately

11
Q

predictive analytics

A

The practice of interpreting data to predict the likelihood of future business outcomes

12
Q

Prescriptive analytics

A

the use of optimization techniques to advise businesses on what they should do

13
Q

Spreadsheet tool

A

an interactive software application for structuring, transforming, analyzing, and storing data in rows and columns

14
Q

Programming

A

The process of solving a problem using computer algorithms

15
Q

Programming language

A

a formal set of instructions that can be used to produce various kinds of output

16
Q

open-source programming tools

A

programming tools that are made freely available, often developed by and for the community

17
Q

What are two well-known open-source programming tools?

A

R and Python

18
Q

Programming code

A

a collection of statements written in a particular programming language

19
Q

Record

A

row in a spreadsheet
stores a person’s or object’s response over a number of fields

20
Q

Fields

A

column in a spreadsheet
stores the info unit we have about each record (e.g. a person’s age, income, etc.)

21
Q

Integer

A

a variable that contains whole numbers (numbers without decimal points)

22
Q

Programming tool

A

a software package that allows for the execution of programming code

23
Q

Big data

A

large sets of both structured and unstructured data

24
Q

Relational database

A

A means of storing information in related tables of rows and columns so that it can be retrieved with queries.

25
Non-relational database
a database that does not store data in analysis-ready tables; instead it may be document-based or use a variety of other storage strategies
26
Hadoop
an open-source software framework that stores and processes large amounts of data
27
Document Databases
a database that pairs a key with a complex data structure
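As a rough illustration of "a key paired with a complex data structure", the sketch below uses a plain Python dictionary as a stand-in for a document store. The keys, field names, and values are all invented for the example; a real document database adds storage, indexing, and querying on top of this idea.

```python
import json

# Hypothetical in-memory "document store": each key maps to one nested document.
store = {
    "customer:1": {
        "name": "Ana",
        "orders": [{"item": "pen", "qty": 2}],
        "tags": ["vip"],
    },
    # Documents under different keys do not need the same fields.
    "customer:2": {"name": "Ben", "orders": []},
}

# Look up a document by its key, then drill into the nested structure.
doc = store["customer:1"]
first_item = doc["orders"][0]["item"]
print(first_item)  # pen

# Documents are commonly exchanged as JSON text.
print(json.dumps(store["customer:2"]))
```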
28
Scientific method
A set of techniques used for investigating phenomena commonly based on reasoning applied to the evidence of empirical data
29
controlled experiment
type of experiment in which a hypothesis is tested by manipulating one independent variable, the only factor allowed to change, and looking for the resulting changes in a dependent variable
30
Overconfidence
tendency to think too highly of one's expertise
31
What does S.T.P stand for
Segmentation, Targeting, Positioning
32
Wide-column Stores
a database that uses a column-oriented data structure, similar to an inverted table that has multiple attributes per key
33
Graph databases
databases that store data as a graph of nodes (entities) and the relationships between them
34
multi-model database
A database that can support multiple data models against a single, integrated backend
35
NoSQL
"Not only SQL" (Structured Query Language); able to retrieve data from relational databases as well as from non-relational data sources
36
SQL
Structured Query Language; used to retrieve and manage data in a relational database
37
SQL vs. NoSQL
- SQL manages relational databases, while NoSQL manages non-relational databases.
- NoSQL can handle large volumes of rapidly changing structured, semi-structured, and unstructured data.
38
primary key
a column that uniquely identifies a row in the table
39
unique key
a constraint indicating that a column (or its index) cannot accept duplicate entries
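A minimal sketch of how primary and unique keys behave, using Python's built-in sqlite3 module. The `users` table, its columns, and the `try_insert` helper are hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical table: id is the primary key, email has a unique key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
conn.execute("INSERT INTO users VALUES (1, 'ana@example.com')")

def try_insert(user_id, email):
    """Return 'ok' if the row is accepted, 'rejected' if a key is violated."""
    try:
        conn.execute("INSERT INTO users VALUES (?, ?)", (user_id, email))
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert(1, "ben@example.com"),  # duplicate primary key
    try_insert(2, "ana@example.com"),  # duplicate unique email
    try_insert(2, "ben@example.com"),  # both values are new
]
print(results)  # ['rejected', 'rejected', 'ok']
```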
40
database management system
software that interacts with end users and applications to store and manage structured data
41
Table function
command that manages and changes tables within the database
42
Query function
SQL commands that ask questions of the data
43
SELECT
gathers specific data from a table
44
FROM
Establishes which table the data is gathered from
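The SELECT, FROM, and WHERE cards can be tried together in one query. This is a small sketch using Python's built-in sqlite3 module; the `customers` table and its rows are invented for illustration, and ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical table and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, age INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Ana", 35, "TX"), ("Ben", 42, "CA"), ("Cy", 35, "CA")],
)

# SELECT picks the columns, FROM names the table, WHERE filters the rows.
rows = conn.execute(
    "SELECT name, state FROM customers WHERE age = 35 ORDER BY name"
).fetchall()
print(rows)  # [('Ana', 'TX'), ('Cy', 'CA')]
```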
45
What are nodes
entities in a graph database such as people, accounts, firms, etc
46
cloud
storing and accessing data and programs over the internet instead of a local computer
47
GROUP BY
Groups rows that share a value so that results can be summarized per group (e.g., GROUP BY state)
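A short sketch of GROUP BY in action, again with Python's sqlite3 module. The `orders` table and its amounts are made up; COUNT and SUM are standard SQL aggregate functions that are computed once per group.

```python
import sqlite3

# Hypothetical orders table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (state TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("CA", 10.0), ("TX", 5.0), ("CA", 20.0)],
)

# One output row per state: number of orders and total amount.
totals = conn.execute(
    "SELECT state, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY state ORDER BY state"
).fetchall()
print(totals)  # [('CA', 2, 30.0), ('TX', 1, 5.0)]
```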
48
Join command
temporarily combines two tables for the query
49
Inner Join
Returns only the rows that have matching records in both tables
50
Left join
All rows from the left table, plus the matching rows from the right table
51
Right join
All rows from the right table, plus the matching rows from the left table
52
Full join
All rows from both tables, matched where possible
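The four join types can be compared side by side on two tiny tables. This sketch uses Python's sqlite3 module with invented `customers` and `orders` tables. Because older SQLite versions lack RIGHT JOIN and FULL JOIN, the right join is written here as a left join with the tables swapped, and the full join is emulated by combining the two results; that workaround is an assumption of this example, not how every database does it.

```python
import sqlite3

# Hypothetical tables: Ben has no order, and one order has no customer row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, item TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "pen"), (3, "ink")])

def run(sql):
    """Run a query and return all result rows."""
    return conn.execute(sql).fetchall()

inner = run("SELECT c.name, o.item FROM customers c "
            "INNER JOIN orders o ON c.id = o.customer_id")
left = run("SELECT c.name, o.item FROM customers c "
           "LEFT JOIN orders o ON c.id = o.customer_id ORDER BY c.id")
# Right join, written as a left join with the table order swapped.
right = run("SELECT c.name, o.item FROM orders o "
            "LEFT JOIN customers c ON c.id = o.customer_id "
            "ORDER BY o.customer_id")
# Full join, emulated as the union of the left and right results.
full = set(left) | set(right)

print(inner)  # [('Ana', 'pen')]
print(left)   # [('Ana', 'pen'), ('Ben', None)]
print(right)  # [('Ana', 'pen'), (None, 'ink')]
```

Missing matches show up as `None` (SQL NULL) on the side that had no matching record.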
53
What is a dendrogram
a diagram illustrating a hierarchy of clusters
54
Psychographic segmentation
Segmenting people by their feelings about a product category.
55
A centroid
the center of mass of a geometric object of uniform density; in cluster analysis, the mean position of the observations in a cluster
56
Cluster analysis
a technique for grouping people so that those in the same group are more like one another compared with those in other groups
57
data mining
The process of finding patterns in large data sets
58
Segmentation
Used in marketing to divide the total population of customers into smaller, relatively homogeneous groups.
59
targeting
Identifying which segment(s) to pursue and appealing uniquely to particular segments of customers
60
Positioning
A business strategy that establishes the way a customer perceives a product or firm relative to the rest of the marketplace
61
k-means cluster analysis
iterative technique that seeks to allocate each observation to the cluster whose centroid is closest to it
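The iteration can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: 2-D points, hand-picked starting centroids, and squared Euclidean distance. The `kmeans` function and the sample points are invented for the example; statistical packages use more careful initialization.

```python
def kmeans(points, centroids, max_iter=100):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster with the nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its points
        # (an empty cluster keeps its old centroid).
        new_centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical data: two visibly separate groups of three points each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(clusters[0])  # [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
print(clusters[1])  # [(8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
```

Each final centroid is simply the average of the points assigned to its cluster.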
62
behavioral data
a highly valued source of segmentation; includes usage rates and patterns for a product or category
63
A/B testing
a method for testing the effectiveness of a business effort via a controlled experiment that tests two or more conditions before exposure to the broader marketplace
64
Sample size
The number of participants needed for all conditions of the A/B test or other experiment
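A sketch of how randomization might split a sample across the conditions of an A/B test. The `assign_conditions` function, the participant labels, and the fixed seed are assumptions made for the example; the seed only makes the shuffle repeatable.

```python
import random

def assign_conditions(participants, conditions=("A", "B"), seed=0):
    """Randomly assign participants to conditions in near-equal groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)  # chance alone decides the order
    # Deal the shuffled participants round-robin so group sizes
    # differ by at most one.
    groups = {c: [] for c in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

# Hypothetical sample of 10 participants across both conditions.
participants = [f"user{i}" for i in range(10)]
groups = assign_conditions(participants)
print(len(groups["A"]), len(groups["B"]))  # 5 5
```

The sample size here is 10: it counts the participants in every condition combined, not per condition.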
65
qualitative
a type of data that uses words, photos, or graphs
66
quantitative
a type of data that uses numbers