Week 3 - Research and Measurement Flashcards

1
Q

why do research and analysis?

A

in order to make the right decision

2
Q

does all data and analysis have value?

A

NO - only if they help us make a decision

raw data has very little value

3
Q

in hypothesis testing, when do you make a prediction?

A

prior to testing (a priori)

4
Q

what is the purpose of marketing research?

A

to inform business decision making (as opposed to scientific research, for instance)

5
Q

what do you call raw data once it has been analysed?

A

interpreted data, ie, information

6
Q

how should a decision maker be involved?

A

understand enough to know what’s reliable
tell the research team which questions to answer
potentially make predictions
project manage perhaps
be able to think like a researcher

7
Q

how should a researcher be involved?

A

convert questions/predictions into testable hypotheses
conduct the applicable research
present results in a way to answer the original question
communicate information clearly - reduce the complex

8
Q

how should administrators be involved?

A

understand sufficiently to

1) find common ground
2) engage throughout the process

9
Q

what is inferential statistics?

A

statistical analysis to infer or estimate something about a population from a sample

based on probability

10
Q

what are the properties of data?

A

assignment
assignment and order
assignment, order, and distance
assignment, order, distance, and origin

11
Q

what is the minimal requirement for raw data to be analysed?

A

must be able to place into categories (at least assignment)
can have:
assignment, order, distance, and origin

12
Q

what is assignment for data?

A

groupings

eg, color, gender, state

13
Q

what is order for data?

A
data points that can be ordered
eg, birth order, class rank, placement in race
14
Q

what is distance for data?

A

ability to understand how far apart data points are from each other
eg, one person has 100%, another has 80%, distance is 20 percentage points

15
Q

what is origin for data?

A

an unambiguous starting point or point of comparison
eg, zero is the lowest grade, 2018 is the current year

allows measurement of distance between data points AND vs origin
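
A minimal Python sketch of why origin matters, assuming temperature as the example (the readings and the helper name `to_kelvin` are illustrative, not from the course). Distances survive a change of origin, but ratios only make sense when zero is a true starting point.

```python
import math

def to_kelvin(celsius):
    """Shift the origin from the arbitrary Celsius zero to absolute zero."""
    return celsius + 273.15

cool, warm = 10.0, 20.0  # illustrative readings in degrees Celsius

# Distance is the same whichever origin is used.
distance_c = warm - cool
distance_k = to_kelvin(warm) - to_kelvin(cool)

# Ratios depend on the origin: 20 is "twice" 10 only on the Celsius scale.
ratio_c = warm / cool
ratio_k = to_kelvin(warm) / to_kelvin(cool)

print(distance_c, round(distance_k, 6))
print(ratio_c, round(ratio_k, 3))
```

With a true origin (Kelvin), the warmer reading is only a few percent higher, which is why "twice as much" claims need the origin property.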

16
Q

what are the four classifications of data?

A

non-metric

  • nominal
  • ordinal

metric

  • interval
  • ratio
17
Q

what is nominal data classification?

A

nonmetric = nonparametric tests
assignment only
central tendency is only mode (most frequently occurring)

eg, most of these m&ms are blue

18
Q

what is ordinal data classification?

A

nonmetric
assignment and order
central tendency is only mode or median

eg, shortest to tallest height

19
Q

what is interval data classification?

A

metric = parametric data analysis available
assignment, order, and distance
(considered continuous because distance between points is measurable)
central tendency: mean, median, and mode (all three)

eg, what is the average temperature in degrees Celsius (zero is arbitrary, so there is no true origin)

20
Q

what is ratio data classification?

A

metric
assignment, order, distance, and origin
continuous
all central tendencies (mean, median, and mode)

eg, weight, income, consumption over years (zero means none)
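
The four classification cards can be summarised as a small lookup. This is only a sketch in Python: the dictionary names and the helper `is_metric` are made up for illustration, and the contents simply restate the cards above.

```python
# Properties each classification has (cumulative, as in the cards above).
PROPERTIES = {
    "nominal": ["assignment"],
    "ordinal": ["assignment", "order"],
    "interval": ["assignment", "order", "distance"],
    "ratio": ["assignment", "order", "distance", "origin"],
}

# Measures of central tendency each classification supports.
CENTRAL_TENDENCY = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean"],
}

def is_metric(level):
    """Metric data needs at least the distance property."""
    return "distance" in PROPERTIES[level]

for level in PROPERTIES:
    kind = "metric" if is_metric(level) else "non-metric"
    print(level, kind, CENTRAL_TENDENCY[level])
```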

21
Q

what are descriptive statistics?

A

a quantitative approach to identifying characteristics about a respondent pool
not a testing method

who answered our questions? what is the make up of our data overall?

22
Q

what tools does descriptive statistics use?

A

central tendencies (mean, median, mode)
percentages
measures of dispersion
frequency distributions

23
Q

when and how can you use mode?

A

any data with the assignment property (nominal and above)

what’s the most common?

24
Q

how do you use median?

A

any data with an order property
what’s in the middle if you count from each side?
if there are two middle values, average them to get the answer

25
how do you use the mean?
only if the data has the distance property (interval or ratio)
average the group: add up the values and divide by how many there are
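
The mode, median, and mean cards can be tried out with Python's standard `statistics` module. The values below are made up for illustration.

```python
import statistics

favourite_colours = ["blue", "red", "blue", "green", "blue"]  # nominal: mode only
test_scores = [60, 80, 90, 100]  # metric: mode, median, and mean all apply

most_common_colour = statistics.mode(favourite_colours)
middle_score = statistics.median(test_scores)  # even count: averages 80 and 90
average_score = statistics.mean(test_scores)   # (60 + 80 + 90 + 100) / 4

print(most_common_colour, middle_score, average_score)  # blue 85.0 82.5
```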
26
what is a percentage?
a frequency, expressed as a fraction of 100
27
what is a range?
the distance between the smallest and largest numbers in the data
28
how do you measure standard deviation?
roughly, the typical difference between data points and the mean (formally, the square root of the average squared difference from the mean)
how similar are the numbers on average?
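
A short Python sketch of range and standard deviation on made-up values. It uses the population formula (divide by the number of values), which is an assumption here; `statistics.stdev` divides by one less for samples. The plain average absolute difference is shown alongside to make clear it is a related but different number.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data
mean = statistics.mean(values)

# Range: largest minus smallest.
value_range = max(values) - min(values)

# Average absolute difference from the mean (the intuitive reading).
mean_abs_dev = sum(abs(v - mean) for v in values) / len(values)

# Standard deviation: square root of the average squared difference.
variance = sum((v - mean) ** 2 for v in values) / len(values)
std_dev = math.sqrt(variance)

print(value_range, mean_abs_dev, std_dev)  # 7 1.5 2.0
```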
29
how do you measure frequency distribution?
visualise the distribution of the data - say with a bar chart
mode is just picking the tallest bar
can be applied to nominal data
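
A frequency distribution with percentages, sketched in Python with `collections.Counter` on made-up nominal data. The text "bars" stand in for a bar chart.

```python
from collections import Counter

sweet_colours = ["blue", "red", "blue", "green", "blue", "red"]  # nominal data

frequencies = Counter(sweet_colours)
total = sum(frequencies.values())

# Each row: category, a text bar, the count, and the percentage of the total.
for colour, count in frequencies.most_common():
    percentage = 100 * count / total
    print(f"{colour:<6}{'#' * count:<4}{count}  {percentage:.1f}%")

# The mode is simply the tallest bar.
mode_colour, mode_count = frequencies.most_common(1)[0]
```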
30
what is the difference between census and sample studies?
census = entire population
sample = part of the population
inferential statistics help when you can't perform a whole census
31
when is a census study better than a sample study?
any time you can do a census study
but often it isn't reasonably possible
32
what is a population parameter?
population parameter = true fact based on 100% observation (census)
statistic = estimate based on a sample
33
what are the pros and cons of sampling?
pros
  • lower cost
  • easier and faster data handling
cons
  • higher error rate
  • errors can drive bad decision making
34
why are sample-based estimates useful?
probability distributions make estimates predictable: they tell us how far a sample estimate is likely to be from the true value
35
how much does sample drawing matter?
it's THE most critical part - an error here can lead to skewing or bias
36
how can you draw a sample?
probability - researcher has no role in drawing (eg, random sample)
nonprobability - researcher does have a role (eg, convenience sampling of people nearby)
37
what is probability sampling?
researcher plays no role in building the sample
generally near random
every data point has a similar, but not necessarily exactly equal, chance of being selected
38
what is nonprobability sampling?
researcher does play a role in selection
convenience sample is very common - stopping people at the subway for instance
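
A minimal Python sketch contrasting the two ways of drawing a sample. The population is hypothetical: 1,000 households listed by distance from the researcher, where the measured value is assumed to grow with distance, so the nearest households are not representative.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population, ordered by distance from the researcher.
population = list(range(1000))
parameter = statistics.mean(population)  # true population mean (census)

# Probability sample: chance decides who is included.
random_sample = rng.sample(population, 50)

# Nonprobability (convenience) sample: the researcher takes the 50 nearest.
convenience_sample = population[:50]

random_error = abs(statistics.mean(random_sample) - parameter)
convenience_error = abs(statistics.mean(convenience_sample) - parameter)

print(parameter, random_error, convenience_error)
```

The random sample's statistic should land far closer to the population parameter than the convenience sample's, which is the bias the cards warn about.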
39
why does error occur in statistical inference?
because a sample is not a census
while it is in theory representative, reality can often differ
40
what are the two types of errors found in statistical inferences?
sampling error - nonrepresentative sample
nonsampling error - systemic and/or random error not associated with the manner of drawing the sample
41
when should sampling error be suspected?
probability sampling (random) - lowest risk of sampling error, but VERY rarely 100% followed (think - completion bias)
non-probability sampling (selected) - high risk of error, must assume at least a certain level of error (hence statistical significance)
42
when should nonsampling error be suspected?
any time you don't have a full census
even if the sample is random, if it isn't complete (ie, a census) we can never be 100% sure of conclusions
43
what is the null hypothesis?
the assumption that there is no difference between compared populations
eg, people who take this medicine are no better off than people who don't
the null hypothesis is generally assumed true until the evidence is strong enough to reject it
44
what is a Type 1 error?
telling a man he's pregnant when he isn't
rejecting the null hypothesis when it's actually true (a false positive)
45
what is a Type 2 error?
telling a man he's not a man when he really is
accepting (failing to reject) the null hypothesis when it is actually false (a false negative)
normally a type 2 error is the safer of the two
46
can you decrease the likelihood of type 1 or 2 errors?
yes, by selecting significance levels
but decreasing the risk of type 1 increases the risk of type 2
choose your adventure
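
The trade-off can be made concrete with a small Python sketch. It assumes a one-sided z-test with a known standard deviation. The function name `type2_risk` and the default effect size, sigma, and sample size are illustrative choices, not values from the course.

```python
from math import sqrt
from statistics import NormalDist

def type2_risk(alpha, effect=0.5, sigma=1.0, n=16):
    """Type 2 risk for a one-sided z-test of H0: mean = 0 when the true mean is `effect`."""
    standard_normal = NormalDist()
    critical_z = standard_normal.inv_cdf(1 - alpha)  # reject H0 above this z
    shift = effect * sqrt(n) / sigma                 # centre of z when H0 is false
    return standard_normal.cdf(critical_z - shift)   # chance of failing to reject

# A stricter significance level (lower type 1 risk) raises the type 2 risk.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha:.2f}  type 2 risk={type2_risk(alpha):.3f}")
```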
47
what are the two categories of data collection?
primary data
secondary data
48
what is secondary data?
collected for a purpose other than this research project
eg, UN data
49
what is primary data?
collected specifically for our hypotheses
50
what are the pros/cons of secondary data?
pros
  • available, already there
  • price, might be cheap or even free
cons
  • relevancy, might not fit needs
  • accuracy, why was it collected, what standards were in place?
51
what is big data?
normally secondary data
passively collected
both structured and unstructured
can test hypotheses, but can't verify cause/effect
52
how is primary data collected?
questioning - survey, interview (might not be answered honestly)
observing - watching, documenting (more honest answers, but harder to understand the why)
observation can be of a person or of a company (eg, keyword analysis of company legal policies)
53
how can you establish causality?
only through experimentation
must be very careful not to communicate correlation as causality
54
what three factors are required to prove causality?
evidence of statistical association
temporal ordering
control for competing hypotheses
55
how do you prove causality - evidence of statistical association?
necessary, but insufficient for causality
56
how do you prove causality - temporal ordering?
must prove that A came before B
eg, fire trucks arrived after the fire started, not before
57
how do you prove causality - control for competing hypotheses?
look for unmeasured or unobserved alternative hypotheses
randomise away errors through probability sampling and experiment design
eg, churches and liquor stores increase in parallel, but even with temporal ordering, neither causes the other
reality: population growth caused both
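
The churches and liquor stores example can be simulated in Python. All numbers below are simulated, not real data: 500 hypothetical towns where both counts are assumed to depend only on population plus random noise. Controlling for population here means correlating what is left after a straight-line fit on population, which is one simple way to do it.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [rng.uniform(1_000, 100_000) for _ in range(500)]
churches = [p / 1000 + rng.gauss(0, 5) for p in population]
liquor_stores = [p / 800 + rng.gauss(0, 5) for p in population]

raw_r = pearson(churches, liquor_stores)
controlled_r = pearson(residuals(churches, population),
                       residuals(liquor_stores, population))

print(round(raw_r, 2), round(controlled_r, 2))
```

The raw correlation should be strong, while the correlation after controlling for population should be close to zero: the association exists, but the competing hypothesis explains it.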