EDA Flashcards

1
Q

Result from making observations either on a single variable or simultaneously on two or more variables

A

Data

2
Q

Types of data

A

Primary data
Secondary data
Categorical data
Continuous data

3
Q

Collected fresh and for the first time and thus happen to be original in character

A

Primary data

4
Q

Data that have been collected by someone else and have already been passed through a statistical analysis

A

Secondary data

5
Q

A variable type with two or more categories; it takes on one of a limited number of possible values

A

Categorical data

6
Q

Data that can be measured on an infinite scale

A

Continuous data

7
Q

Process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes

A

Data collection

8
Q

Process of using diverse analytical methods to review data and arrive at relevant conclusions

A

Data interpretation

9
Q

Process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making

A

Data analysis

10
Q

Steps of Data Analysis

A

Defining the question
Collecting the data
Cleaning the data
Analyzing the data
Sharing your results
Embracing failure
Summary

11
Q

Methods of Data collection

A

Population
Sample

12
Q

The entire collection of individuals or objects about which information is desired is called the ____ of interest

A

Population

13
Q

A subset of the population, selected for study in some prescribed manner

A

Sample
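
A minimal sketch of drawing a sample from a population. The card only says "some prescribed manner"; simple random sampling without replacement is assumed here as one such manner, and the population of ID numbers is hypothetical.

```python
import random

# Hypothetical population: ID numbers 1..1000 (illustrative only).
population = list(range(1, 1001))

rng = random.Random(42)  # fixed seed so the draw is repeatable

# Simple random sample without replacement: every unit is equally likely.
sample = rng.sample(population, k=50)

print(len(sample), "units drawn from", len(population))
```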

14
Q

Steps in data gathering

A

Measurement
Representation
Statistical tools to analyze data

15
Q

Data gathering under measurement

A

Construct
Measurement
Response
Edited response

16
Q

Data gathering under Representation

A

Target population
Sampling frame
Sample/respondents
Post-survey adjustments

17
Q

Data gathering under statistical tools to analyze data

A

Statistical tests
Some advanced modeling techniques
Some bias reduction techniques
Regression
Discrete choice

18
Q

Three Major Methods of Data Collection

A

Mail
Telephone
Face to face survey

19
Q

Types of Questionnaire Survey

A

Stated preference survey
Revealed preference survey

20
Q

Types of stated choice experiment

A

Conjoint analysis
Contingent valuation method

21
Q

Typically asks participants to choose one alternative from a set of hypothetical alternatives, where the attributes of the alternatives are set by the researcher

A

Conjoint analysis

22
Q

Typically asks participants to state the (monetary) value of some public (non-market) good.

A

Contingent valuation method

23
Q

The group of elements for which the survey investigator wants to make inferences by using the sample statistics

A

Target population

24
Q

Lists/procedures intended to identify all elements of a target population, or a set of units that may potentially be selected as respondents

A

Sampling frame

25
Q

Sample selected from a sampling frame

A

Sample

26
Q

Sample members who successfully answered

A

Respondents

27
Q

Two types of Sample Non-response

A

Item nonresponse
Unit nonresponse

28
Q

Respondent refuses to answer one or more survey questions

A

Item nonresponse

29
Q

Respondent refuses to take the survey at all

A

Unit nonresponse

30
Q

Handling non-response data by simply excluding the units that have item nonresponse

A

Procedures with completely recorded units
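
A minimal sketch of this idea (often called complete-case analysis or listwise deletion). The survey records and the use of `None` as the missing-item marker are assumptions made for illustration.

```python
# Hypothetical survey records; None marks an item the respondent skipped.
records = [
    {"age": 34, "income": 52000},
    {"age": 41, "income": None},
    {"age": None, "income": 38000},
    {"age": 29, "income": 45000},
]

# Keep only the units in which every item was recorded.
complete = [r for r in records if all(v is not None for v in r.values())]

print(len(complete), "of", len(records), "units kept")
```

The cost is visible even in this toy case: half of the units are discarded.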

31
Q

Handling non-response data by excluding the item-missing data and handling the impact by changing the weights

A

Weighting procedures

32
Q

Handling non-response data in a way that the missing values are filled in and the resultant completed data are analyzed by standard methods

A

Imputation-based procedures
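
A minimal sketch, assuming mean imputation as the fill-in rule. The card does not name a specific rule, so this is only one simple choice, and the income values are hypothetical.

```python
from statistics import mean

# Hypothetical item values; None marks a missing response.
incomes = [52000, None, 38000, 45000]

# Fill each missing value with the mean of the observed values.
observed = [x for x in incomes if x is not None]
fill = mean(observed)
completed = [fill if x is None else x for x in incomes]

# The completed data can now go through standard methods.
print(completed, mean(completed))
```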

33
Q

Handling non-response data by defining a model for the observed data with a certain missing-data mechanism

A

Model-based imputation procedures

34
Q

Function that gives the frequency of different possible values

A

Distribution
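
A minimal sketch of an empirical distribution: counting how often each value occurs and converting the counts to relative frequencies. The data values are made up for illustration.

```python
from collections import Counter

# Hypothetical observations of a single variable.
values = [2, 3, 3, 5, 2, 3]

freq = Counter(values)  # value -> number of occurrences
rel = {v: c / len(values) for v, c in freq.items()}  # value -> share of total

for v in sorted(freq):
    print(v, freq[v], round(rel[v], 3))
```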

35
Q

Examples of data visualization for a single variable

A

Histogram
Boxplot

36
Q

Examples of data visualization under Categorical-continuous data of Multiple Variable

A

Point
Histogram
Boxplot

37
Q

Examples of data visualization under Continuous-continuous data of Multiple Variable

A

Scatterplot
Heatplot

38
Q

Yi=B0 + B1 X1 + ui

What are B0 and B1

A

parameters

39
Q

Yi=B0 + B1 X1 + ui

What is ui

A

error term
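
A minimal sketch of estimating B0 and B1 by ordinary least squares, assuming a single explanatory variable. The data are hypothetical; the residuals are the sample counterparts of the error term ui.

```python
from statistics import mean

# Hypothetical observations (x = explanatory variable, y = response).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar, y_bar = mean(x), mean(y)

# Closed-form least-squares estimates of the slope (B1) and intercept (B0).
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residuals: what the fitted line leaves unexplained.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(round(b0, 4), round(b1, 4))
```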

40
Q

Two types of linear model

A

Regression analysis
Linear Regression

41
Q

Form of predictive modelling technique that investigates the relationship between a dependent (target) variable and one or more independent (predictor) variables.

A

Regression analysis

42
Q

The supervised machine learning model that finds the best-fit line between the independent and dependent variables, i.e., it finds the linear relationship between them

A

Linear regression

43
Q

Probabilities of occurrence of different possible outcomes

A

Probability distribution

44
Q

A “bell-shaped” distribution

A

Normal Distribution

45
Q

A “discrete-probability” distribution

A

Poisson distribution
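
A minimal sketch of the Poisson probability mass function, P(K = k) = exp(-lam) * lam^k / k!. The function name `poisson_pmf` and the rate of 2.0 are illustrative choices.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Probabilities are defined only on the counts 0, 1, 2, ... (discrete).
for k in range(5):
    print(k, round(poisson_pmf(k, 2.0), 4))
```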

46
Q

Confidence interval is calculated from what?

A

Standard Error
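
A minimal sketch of a 95% confidence interval for a mean built from the standard error. It assumes the normal approximation (critical value 1.96); for a small sample like this one a t critical value would usually be preferred. The data are hypothetical.

```python
import math
from statistics import mean, stdev

# Hypothetical measurements.
data = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]

n = len(data)
center = mean(data)
se = stdev(data) / math.sqrt(n)  # standard error of the mean

z = 1.96  # assumption: normal approximation for a 95% interval
ci = (center - z * se, center + z * se)

print(ci)
```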

47
Q

Is more intuitive and has clear quantitative implications

A

Confidence interval

48
Q

Very popular in the old style

A

p-value

49
Q

The probability of wrongly rejecting a correct null hypothesis

A

p-value

50
Q

The 95% confidence interval does not intersect the zero line

A

p-value

51
Q

A statistical measure that expresses the extent to which two variables are linearly related (meaning they change together at a constant rate)

A

Correlation
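
A minimal sketch, assuming the Pearson correlation coefficient is the measure meant here (the card does not name it). The helper name `pearson_r` and the data are illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    s_xx = sum((a - mx) ** 2 for a in x)
    s_yy = sum((b - my) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Perfectly linear data give r = +1 (rising) or r = -1 (falling).
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```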

52
Q

Correlation does not necessarily mean what?

A

Causation

53
Q
Q

Process of determining the magnitude of statistical variates at some future point of time.

A

Prediction

55
Q

The process of using correlations between variables to hypothesize about future events and outcomes

A

Prediction

56
Q

Used to model the relationship between two continuous variables

A

Simple Linear Regression

57
Q

When to use Simple Linear Regression

A

Positive relationship
Negative relationship
Linear relationship
Curvilinear relationship

58
Q

Used to model the relationship between a continuous response variable and continuous or categorical explanatory variables

A

Multiple Linear Regression

59
Q

A statistical test that is used to compare the means of two groups

A

T-test

60
Q

It is often used in hypothesis testing to determine whether a process or treatment influences the population of interest, or whether two groups are different from one another

A

T-test
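
A minimal sketch of the two-sample t statistic, assuming Welch's form (unequal variances); the cards do not say which variant is meant. The groups are hypothetical, and the sketch stops at the statistic: turning it into a p-value needs the t distribution.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Difference in means divided by its estimated standard error."""
    va = variance(a) / len(a)  # sample variance of group a, scaled by n
    vb = variance(b) / len(b)
    return (mean(a) - mean(b)) / math.sqrt(va + vb)

# Hypothetical treatment and control measurements.
treated = [5.1, 4.9, 5.3, 5.0]
control = [4.2, 4.4, 4.1, 4.5]

print(welch_t(treated, control))
```

A statistic far from zero suggests that the two group means differ.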

61
Q

A statistical method for testing for differences in the means of three or more groups

A

One-way ANOVA
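
A minimal sketch of the F statistic behind this test: variation between group means divided by variation within groups. The helper name `one_way_f` and the three groups are illustrative.

```python
from statistics import mean

def one_way_f(groups):
    """F = mean square between groups / mean square within groups."""
    k = len(groups)
    all_values = [v for g in groups for v in g]
    n = len(all_values)
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical measurements for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_f(groups)
print(f_stat)
```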

62
Q

Meaning of ANOVA

A

Analysis of Variance

63
Q

Test that measures how a model compares to actual observed data

A

Chi-square test
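
A minimal sketch of the chi-square statistic, assuming the goodness-of-fit form: the sum over categories of (observed - expected)^2 / expected. The counts are hypothetical.

```python
def chi_square_stat(observed, expected):
    """Sum of squared gaps between data and model, scaled by the model."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: what was seen vs. what the model predicts.
observed = [18, 22, 20]
expected = [20, 20, 20]

stat = chi_square_stat(observed, expected)
print(stat)
```

A statistic near zero means the observed counts sit close to what the model expects.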

64
Q
A