EDA Flashcards

(64 cards)

1
Q

Result from making observations either on a single variable or simultaneously on two or more variables

A

Data

2
Q

Types of data

A

Primary data
Secondary data
Categorical data
Continuous data

3
Q

Collected fresh and for the first time and thus happen to be original in character

A

Primary data

4
Q

Data which have been collected by someone else and which already have been passed through a statistical analysis

A

Secondary data

5
Q

A variable type with two or more categories that takes on one of a limited number of possible values

A

Categorical data

6
Q

Data that can be measured on a continuous scale and can take any value within a range

A

Continuous data

7
Q

Process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes

A

Data collection

8
Q

Process of using diverse analytical methods to review data and arrive at relevant conclusions

A

Data interpretation

9
Q

Process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making

A

Data analysis

10
Q

Steps of Data Analysis

A

Defining the question
Collecting the data
Cleaning the data
Analyzing the data
Sharing your results
Embracing failure
Summary

11
Q

Methods of Data collection

A

Population
Sample

12
Q

The entire collection of individuals or objects about which information is desired is called the ____ of interest

A

Population

13
Q

A subset of the population, selected for study in some prescribed manner

A

Sample
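
One "prescribed manner" of selecting a sample is simple random sampling. The sketch below is only an illustration: the population of 100 ID numbers, the sample size, and the seed are all made-up assumptions.

```python
import random

# Hypothetical population: ID numbers of 100 individuals.
population = list(range(1, 101))

# Fixed seed so the draw is reproducible (an assumption for illustration).
rng = random.Random(42)

# Simple random sample of 10 units, drawn without replacement.
sample = rng.sample(population, 10)
print(sample)
```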

14
Q

Steps in data gathering

A

Measurement
Representation
Statistical tools to analyze data

15
Q

Data gathering under measurement

A

Construct
Measurement
Response
Edited response

16
Q

Data gathering under Representation

A

Target population
Sampling frame
Sample/respondents
Post-survey adjustments

17
Q

Data gathering under statistical tools to analyze data

A

Statistical tests
Some advanced modeling techniques
Some bias reduction techniques
Regression
Discrete choice

18
Q

Three Major Methods of Data Collection

A

Mail
Telephone
Face to face survey

19
Q

Types of Questionnaire Survey

A

Stated preference survey
Revealed preference survey

20
Q

Types of stated choice experiment

A

Conjoint analysis
Contingent valuation method

21
Q

Typically asks participants to choose one alternative from a set of hypothetical alternatives whose attributes are set by the researcher

A

Conjoint analysis

22
Q

Typically asks participants to state the (monetary) value of some public (non-market) good

A

Contingent valuation method

23
Q

The group of elements for which the survey investigator wants to make inferences by using the sample statistics

A

Target population

24
Q

Lists/procedures intended to identify all elements of a target population or a set of units who are potentially selected as respondents

A

Sampling frame

25
Sample selected from a sampling frame
Sample
26
Sample who successfully answered
Respondents
27
Two types of Sample Non-response
Item nonresponse
Unit nonresponse
28
Respondent refuses to answer one or more survey questions
Item nonresponse
29
Respondent refuses to take the survey at all
Unit nonresponse
30
Handling non-response data by simply excluding the data having item-nonresponse
Procedures with completely recorded units
31
Handling non-response data by excluding the item-missing data and handling the impact by adjusting the weights
Weighting procedures
32
Handling non-response data in a way that the missing values are filled in and the resultant completed data are analyzed by standard methods
Imputation-based procedures
33
Handling non-response data by defining a model for the observed data with a certain missing-data mechanism
Model-based imputation procedures
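
The first and third approaches above (keeping only completely recorded units, and imputation) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `ages` list is made-up data, `None` stands for an item nonresponse, and mean imputation is just one simple imputation rule among many.

```python
from statistics import mean

# Hypothetical survey answers; None marks an item nonresponse.
ages = [34, None, 29, 41, None, 36]

# Completely recorded units: drop every unit with a missing value.
complete_cases = [a for a in ages if a is not None]

# Mean imputation: fill each gap with the mean of the observed values,
# then analyze the completed data with standard methods.
observed_mean = mean(complete_cases)
imputed = [a if a is not None else observed_mean for a in ages]

print(complete_cases)
print(imputed)
```

Dropping units shrinks the sample, while mean imputation keeps every unit but understates the variability of the data.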
34
Function that gives the frequency of the different possible values
Distribution
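
As a rough illustration, an empirical distribution can be tabulated by counting how often each value occurs. The `values` list below is made-up data.

```python
from collections import Counter

# Hypothetical observations of a single variable.
values = [2, 3, 3, 5, 2, 3]

# Absolute frequency of each distinct value.
frequency = Counter(values)

# Relative frequency (share of observations) of each distinct value.
relative = {value: count / len(values) for value, count in frequency.items()}

print(frequency)
print(relative)
```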
35
Examples of data visualization for a Single Variable
Histogram
Boxplot
36
Examples of data visualization for Multiple Variables: Categorical-continuous data
Point
Histogram
Boxplot
37
Examples of data visualization for Multiple Variables: Continuous-continuous data
Scatterplot
Heatplot
38
In Yi = B0 + B1 X1 + ui, what are B0 and B1?
Parameters (B0 is the intercept, B1 is the slope)
39
In Yi = B0 + B1 X1 + ui, what is ui?
Error term
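
To make the roles of B0, B1, and ui concrete, the sketch below estimates the two parameters by ordinary least squares and computes the residuals, which are the sample estimates of the error term. The data points are made up, and ordinary least squares is assumed as the estimation method.

```python
# Hypothetical data: one explanatory variable and one response.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares estimates of the slope (B1) and intercept (B0).
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residuals: observed Y minus fitted Y, the estimated error terms.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

print(b0, b1)
print(residuals)
```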
40
Two types of linear model
Regression analysis
Linear regression
41
Form of predictive modelling technique that investigates the relationship between a dependent (target) variable and independent variable(s) (predictors)
Regression analysis
42
The supervised machine learning model that finds the best-fit straight line between the independent and dependent variables, i.e. it finds the linear relationship between them
Linear regression
43
Probabilities of occurrence of different possible outcomes
Probability distribution
44
A "bell-shaped" distribution
Normal Distribution
45
A "discrete-probability" distribution
Poisson distribution
46
Confidence interval is calculated from what?
Standard Error
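
A minimal sketch of that calculation for a sample mean follows. The data are made up, and the multiplier 1.96 assumes a normal approximation for a 95% interval; for a sample this small a t critical value would normally be used instead.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of measurements.
sample = [12, 15, 11, 14, 13, 16, 10, 13]

n = len(sample)
sample_mean = mean(sample)

# Standard error of the mean: sample standard deviation / sqrt(n).
standard_error = stdev(sample) / sqrt(n)

# Approximate 95% confidence interval (normal approximation, z = 1.96).
z = 1.96
lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error

print(lower, upper)
```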
47
Is more intuitive and has clear quantitative implications
Confidence interval
48
Very popular in the older, traditional style of reporting results
p-value
49
The probability of wrongly rejecting a correct null hypothesis
p-value
50
The 95% confidence interval does not intersect the zero line
p-value
51
A statistical measure that expresses the extent to which two variables are linearly related (meaning they change together at a constant rate)
Correlation
52
Correlation does not necessarily mean what?
Causation
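
The usual measure of linear correlation is the Pearson coefficient. Below is a small sketch of it; the function name `pearson_r` and the data are illustrative assumptions. A value near +1 or -1 describes how closely two variables move together along a line, and says nothing about which one, if either, causes the other.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: y rises with x, y_falling falls with x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
y_falling = [10, 8, 6, 4, 2]

r_positive = pearson_r(x, y)
r_negative = pearson_r(x, y_falling)
print(r_positive, r_negative)
```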
53
Refers to the claim that a set of observed data are not the result of chance but can instead be attributed to a specific cause. It is a way to tell you if your test results are solid.
Statistical Significance
54
Process of determining the magnitude of statistical variates at some future point of time.
Prediction
55
The process of using correlations between variables to hypothesize about future events and outcomes
Prediction
56
Used to model the relationship between two continuous variables
Simple Linear Regression
57
When to use Simple Linear Regression
Positive relationship
Negative relationship
Linear relationship
Curvilinear relationship
58
Used to model the relationship between a continuous response variable and continuous or categorical explanatory variables
Multiple Linear Regression
59
A statistical test that is used to compare the means of two groups
T-test
60
It is often used in hypothesis testing to determine whether a process or treatment influences the population of interest, or whether two groups are different from one another
T-test
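
As an illustration, the test statistic for comparing two group means can be computed by hand. The sketch assumes Welch's form of the two-sample t statistic (unequal variances allowed) and uses made-up data. Turning the statistic into a p-value needs the t distribution, which is left out here.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements from two groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

# Difference in sample means.
mean_diff = mean(group_a) - mean(group_b)

# Standard error of the difference (Welch's form).
se_diff = sqrt(variance(group_a) / len(group_a)
               + variance(group_b) / len(group_b))

# t statistic: difference in means measured in standard errors.
t_stat = mean_diff / se_diff
print(t_stat)
```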
61
A statistical method for testing for differences in the means of three or more groups
One-way ANOVA
62
Meaning of ANOVA
Analysis of Variance
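
The name comes from how the test works: it compares the variance between group means with the variance within groups. A minimal sketch of the one-way F statistic on made-up data follows; the p-value, which needs the F distribution, is omitted.

```python
# Hypothetical measurements from three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

all_values = [v for g in groups for v in g]
grand_mean = sum(all_values) / len(all_values)

# Between-group sum of squares: spread of group means around the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: spread of values around their own group mean.
ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F statistic: ratio of the between-group to the within-group mean square.
f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f_stat)
```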
63
Test that measures how a model compares to actual observed data
Chi-square test
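
One common form is the goodness-of-fit statistic, which sums the squared gaps between observed and model-expected counts, each scaled by the expected count. The counts below are made up, and the equal expected counts are an assumed model.

```python
# Hypothetical observed counts in four categories.
observed = [18, 22, 20, 40]

# Counts the model expects (here: all categories equally likely).
expected = [25, 25, 25, 25]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom for a goodness-of-fit test with fixed expected counts.
degrees_of_freedom = len(observed) - 1

print(chi_square, degrees_of_freedom)
```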
64