Midterm Flashcards

(75 cards)

1
Q

What are the 3 major components of statistics?

A
  1. Planning and design of analyses
  2. Descriptive statistics
  3. Inferential statistics
2
Q

Descriptive Statistics:

A

Descriptive statistics organize, summarize, and communicate large amounts of numerical information.

3
Q

Inferential Statistics:

A

Inferential statistics draw conclusions about larger populations based on smaller samples of that population by using the rules of probability to test hypotheses and make decisions.

4
Q

What is the difference between a population and a sample?

A

Population - the entire group of individuals we want information about.
Sample - the subset of individuals in the population from whom we actually collect data in order to learn about the population as a whole.

5
Q

What is the difference between discrete and continuous observations?

A

Discrete observations are those that can take on only certain numbers (e.g., whole numbers, such as 1). Two types of variables, nominal and ordinal, can only be discrete.

Continuous observations are those that can take on all possible numbers in a range (e.g., 1.68792). Two types of variables can be continuous (although both can also be discrete in some cases): interval and ratio. These are scale variables.

6
Q

Nominal variable

A

Discrete; Nominal variables use numbers simply to give names to scores.

7
Q

Ordinal variable

A

Discrete; Ordinal variables are rank-ordered.

8
Q

Interval variable

A

Continuous; Interval variables are those in which the distances between numerical values are assumed to be equal.
• Temperature is often an interval variable

9
Q

Ratio variable

A

Continuous; Ratio variables are those that meet the criteria for interval variables but also have a meaningful zero point.
• Time always implies a meaningful zero point

10
Q

what are the roles of independent, dependent and confounding variables in statistics?

A

Independent variables can be manipulated or observed by the experimenter, and they have at least two levels, or conditions.
Dependent variables are outcomes in response to changes or differences in the independent variable.
Confounding variables systematically vary with the independent variable, so we cannot logically determine which variable may have influenced the dependent variable.

11
Q

what is the difference between comparative and correlational analyses?

A

comparative: comparing data from 2 groups; a between-groups research design is used; involves random assignment to conditions
correlational: comparing data within one group; correlational research examines associations where random assignment is not possible and variables are not manipulated.

12
Q

what is the difference between experimental and observational studies?

A

In an observational study, the researcher changes nothing and simply records what is observed. In an experimental study, the researcher manipulates an independent variable, typically by comparing a control group with an experimental (treatment) group.

13
Q

What is the difference between a between-groups design and a within-groups design?

A

between-groups: an experiment in which participants experience one and only one level of the independent variable. A control group is compared to an experimental group in this design.

within-groups: an experiment in which all participants in the study experience the different levels of the independent variable (e.g., an experiment that compares the same group of people before and after they experience a level of an independent variable).

14
Q

what is the importance of randomization?

A

To control for confounding variables. Most experiments have either a between-groups design or a within-groups design to try to minimize the effects of confounding variables.

15
Q

What is HARKing?

A

“hypothesizing after the results are known,” whereby researchers alter their hypotheses to match their findings.

16
Q

positively skewed distribution

A

A distribution that is positively skewed has a tail in a positive direction (to the right), indicating more extreme scores above the center. It sometimes results from a floor effect.

17
Q

negatively skewed distribution

A

A distribution that is negatively skewed has a tail in a negative direction (to the left), indicating more extreme scores below the center. It sometimes results from a ceiling effect.

18
Q

normal distribution

A

a distribution that is unimodal, symmetrical and bell-shaped

19
Q

floor effect

A

scores are constrained and cannot be below a certain number

20
Q

ceiling effect

A

scores are constrained and cannot be above a certain number.

21
Q

histogram

A

displays bars of different heights indicating the frequency of each value (or interval) that the variable can take on.

22
Q

bar chart

A

used to compare two or more categories of a nominal or ordinal independent variable with respect to a scale dependent variable

23
Q

what is a boxplot and how is it interpreted?

A

x

24
Q

what is a violin plot and how is it interpreted?

A

x

25
what is a frequency polygon and how is it interpreted?
x
26
5 ways a graph can lie
1. biased scale 2. sneaky sample 3. interpolation 4. extrapolation 5. inaccurate values
27
biased scale
Uses scaling to skew results
28
line graph
a type of chart which displays information as a series of data points called 'markers' connected by straight line segments
29
interpolation
Assumes that values between two data points follow the same pattern as the data points themselves
30
extrapolation
Assumes that the pattern in the data will continue indefinitely beyond the observed data points
31
inaccurate values
Uses scaling to distort portions of the data
32
If there is one scale variable (with frequencies), what chart should you use to display your data?
a histogram
33
If there is one scale independent variable and one scale dependent variable, what chart should you use to display your data?
a scatterplot or line graph
34
If there is one nominal or ordinal independent variable and one scale dependent variable what chart should you use to display your data?
a bar graph (but consider using a Pareto chart if the independent variable has many levels).
35
If there are two or more nominal or ordinal independent variables and one scale dependent variable, what chart should you use to display your data?
a bar graph
38
what is a random sample?
every member of the population has an equal chance of being selected into the study; Random samples are almost never used in the social sciences because it is difficult to access the whole population from which to select the sample.
39
what is a convenience sample?
uses participants who are readily available, such as college students.
40
what is a random sample?
every member of the population has an equal chance of being selected into the study; Random samples are almost never used in the social sciences because it is difficult to access the whole population from which to select the sample. However, random assignment is frequently used
41
Generalizability
refers to researchers’ ability to apply findings from one sample or in one context to other samples or contexts. This principle is also called external validity; Can be improved with replication
42
what is wrong with convenience samples?
A convenience sample might not represent the larger population, so its generalizability is low: the findings may not apply to anyone outside the sample. One type, the volunteer sample, can be especially problematic when conducted online because researchers have no control over who takes part.
43
drawback of volunteer sample/crowd sourcing
On the one hand, only with the Internet can we have so many participants and data points. On the other hand, we must be cautious because, as summarized by a journalist, these researchers explained that “collecting data through crowdsourcing means researchers have no control over who is playing.”
45
null hypotheses
a statement that postulates that there is no difference between populations or that the difference is in a direction opposite to that anticipated by the researcher
46
internal validity vs external validity
Internal and external validity are concepts that reflect whether or not the results of a study are trustworthy and meaningful. While internal validity relates to how well a study is conducted (its structure), external validity relates to how applicable the findings are to the real world
47
the law of large numbers
states that as a sample size grows, its mean gets closer to the average of the whole population
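A minimal simulation sketch of this idea, assuming a hypothetical population of uniform random numbers between 0 and 1 (true mean 0.5). The function name `sample_mean` and the fixed seed are illustrative choices, not part of the course material.

```python
import random


def sample_mean(sample_size, seed=0):
    """Mean of `sample_size` draws from a uniform(0, 1) population (true mean 0.5)."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    draws = [rng.random() for _ in range(sample_size)]
    return sum(draws) / sample_size


# As the sample size grows, the sample mean tends to settle near 0.5.
for n in (10, 1_000, 100_000):
    print(n, round(sample_mean(n), 4))
```

Small samples can land noticeably away from 0.5, while the largest sample should sit very close to it.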
48
type 1 error and consequences of it
when we reject the null hypothesis but the null hypothesis is correct. Many researchers consider the consequences of a Type I error to be particularly detrimental because people often take action based on a mistaken finding/ mistaken rejection of the null hypothesis.
49
type 2 error and consequences of it
when we fail to reject the null hypothesis but the null hypothesis is false; A failure to reject the null hypothesis typically results in a failure to take action—for instance, a research intervention is not performed or a diagnosis is not given—which is generally less dangerous than incorrectly rejecting the null hypothesis.
50
the role of null and alternative hypotheses before a study and after it
Before the study: Researchers develop two hypotheses: a null hypothesis, which theorizes that there is no average difference between levels of an independent variable in the population, and an alternate hypothesis, which theorizes that there is an average difference of some kind in the population. After the study: Researchers can draw one of two conclusions: They can reject the null hypothesis and conclude that they have supported the alternate hypothesis, or they can fail to reject the null hypothesis and conclude that they have not supported the alternate hypothesis.
52
standardization
a way to convert individual scores from different normal distributions to a shared normal distribution with a known mean, standard deviation and percentiles
53
why is standardization important?
Allows fair comparisons: when raw scores come from two different scales, converting both scores to z scores lets them be compared directly.
54
z score
the number of standard deviations a particular score is from the mean; a standardized version of the raw scores based on the population. When working with a sample mean rather than an individual score, the z statistic uses a distribution of means instead of a distribution of scores.
55
to calculate z score:
z = (X - μ) / σ, where μ is the mean of the population and σ is the standard deviation of the population
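The formula on this card (raw score minus the population mean, divided by the population standard deviation) can be sketched as a small function. The name `z_score` and the example numbers are hypothetical, chosen only for illustration.

```python
def z_score(x, mu, sigma):
    """Number of population standard deviations that raw score x lies from the mean."""
    if sigma <= 0:
        raise ValueError("the population standard deviation must be positive")
    return (x - mu) / sigma


# Hypothetical example: a score of 70 in a population with mean 60 and SD 5.
print(z_score(70, 60, 5))
```

A positive result means the score is above the mean, and a negative result means it is below.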
56
to calculate percentile:
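Once you have the z score, look it up in the z table and read off the percentage. A percentile is always the percentage of scores falling below the score; for example, at the 90th percentile, 90% of the scores in the distribution are less than X.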
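Instead of a printed z table, the same lookup can be approximated in code. This sketch assumes the scores are normally distributed and uses `NormalDist` from Python's standard `statistics` module; the function name `percentile_from_z` is illustrative.

```python
from statistics import NormalDist


def percentile_from_z(z):
    """Percentage of a normal distribution that falls below the given z score."""
    return NormalDist().cdf(z) * 100  # cdf gives the proportion below z


# A z score of 0 sits at the mean, so half of the distribution lies below it.
print(round(percentile_from_z(0.0), 1))
print(round(percentile_from_z(1.2816), 1))
```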
57
to transform a z score to raw score
Step 1: Multiply the z score by the standard deviation of the population. Step 2: Add the mean of the population to this product.
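The two steps on this card can be written as one line of arithmetic. The function name `raw_from_z` and the example values are hypothetical.

```python
def raw_from_z(z, mu, sigma):
    """Convert a z score back to a raw score."""
    product = z * sigma   # Step 1: multiply the z score by the population SD
    return product + mu   # Step 2: add the population mean


# Hypothetical example: z = 2 in a population with mean 60 and SD 5.
print(raw_from_z(2.0, 60, 5))
```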
58
The z distribution
a normal distribution of standardized scores.
59
The standard normal distribution
a normal distribution of z scores.
60
the central limit theorem
demonstrates that a distribution made up of the means of many samples (rather than individual scores) approximates a normal curve, even if the underlying population is not normally distributed.
62
the central limit theorem demonstrates two important principles:
Repeated sampling approximates a normal curve, even when the original population is not normally distributed. A distribution of means is less variable than a distribution of individual scores.
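Both principles can be checked with a short simulation. This is a sketch under stated assumptions: the population is a hypothetical, strongly skewed exponential distribution with mean 1 and standard deviation 1, and the sample size (30), number of samples (5000), and seed are arbitrary illustrative choices.

```python
import random
import statistics


def simulate_sample_means(sample_size, num_samples, seed=0):
    """Draw many samples from a skewed population and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


means = simulate_sample_means(sample_size=30, num_samples=5000)
mean_of_means = statistics.fmean(means)  # should be close to the population mean, 1
sd_of_means = statistics.stdev(means)    # should be well below the population SD, 1

print(round(mean_of_means, 3), round(sd_of_means, 3))
```

Plotting `means` as a histogram should look far more symmetric and bell-shaped than the skewed population the samples came from.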
63
A distribution of means
a distribution composed of many means that are calculated from all possible samples of a given size, all taken from the same population; less variable than a distribution of individual scores.
64
the standard error of the mean
The standard deviation of the distribution of means (SE). It is related to the standard deviation of the population (σ) through the square root of the sample size (n): SE = σ / √n
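A minimal sketch of this formula, with the function name `standard_error` and the example values chosen only for illustration:

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the distribution of means for samples of size n."""
    if n <= 0:
        raise ValueError("the sample size must be positive")
    return sigma / math.sqrt(n)


# Hypothetical example: population SD of 15, so larger samples give a smaller SE.
print(standard_error(15, 25))
print(standard_error(15, 100))
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.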
65
3 components of the central limit theorem
1. The mean of the population (μ) and the mean of the sampling distribution of means are identical. 2. The standard deviation of the population (σ) is related to the standard deviation of the distribution of sample means by: standard error of the mean = standard deviation of the population / square root of the sample size (SE = σ / √n). 3. For large n, the shape of the sampling distribution of means becomes approximately normal.
66
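relationship between the mean of the population and the mean of the distribution of sample means
they are identical (the mean of any single sample will usually differ somewhat from the population mean)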
67
the empirical rule
states that for a normal distribution, nearly all of the data fall within three standard deviations of the mean: about 68% within one standard deviation, 95% within two, and 99.7% within three (the 68-95-99.7 rule)
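The 68-95-99.7 figures can be recovered from the standard normal curve. This sketch uses `NormalDist` from Python's standard `statistics` module; the function name `percent_within` is illustrative.

```python
from statistics import NormalDist


def percent_within(k):
    """Percentage of a normal distribution within k standard deviations of the mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return (std_normal.cdf(k) - std_normal.cdf(-k)) * 100


for k in (1, 2, 3):
    print(k, round(percent_within(k), 2))
```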
68
Benefits of sketching the normal curve:
Stays clear in memory; minimizes errors; Practical reference; Condenses the information
69
Calculating a Score Below the Mean
Step 1: Convert the raw score to a z score. Step 2: Calculate the percentile, the percentage above, and the percentage at least as extreme for the negative z score for Manuel’s height.
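A sketch of these two steps for a score below the mean. The numbers here (a height of 61 in a population with mean 67 and standard deviation 3) are hypothetical and are not the values from the Manuel example; the function name `below_mean_summary` is also illustrative, and the scores are assumed to be normally distributed.

```python
from statistics import NormalDist


def below_mean_summary(x, mu, sigma):
    """Return z, percentile, percentage above, and percentage at least as extreme."""
    z = (x - mu) / sigma                          # Step 1: raw score to z score
    std_normal = NormalDist()
    percentile = std_normal.cdf(z) * 100          # percentage of scores below x
    above = 100 - percentile                      # percentage of scores above x
    # "At least as extreme" counts both tails: below -|z| and above +|z|.
    at_least_as_extreme = 2 * std_normal.cdf(-abs(z)) * 100
    return z, percentile, above, at_least_as_extreme


z, percentile, above, extreme = below_mean_summary(61, 67, 3)
print(z, round(percentile, 2), round(above, 2), round(extreme, 2))
```

For a negative z score the percentile is below 50, and the two-tailed percentage is twice the percentile.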
70
Assumptions and Steps of Hypothesis Testing
Requirements to conduct analyses. Assumption: a characteristic about a population that we are sampling; necessary for accurate inferences
71
Parametric Versus Nonparametric Tests
Parametric tests: Inferential statistical test based on assumptions about a population. Nonparametric tests: Inferential statistical test not based on assumptions about the population.
72
The Six Steps of Hypothesis Testing
Six basic steps used with each type of hypothesis test:
Step 1: Identify the populations, distribution, and assumptions, and then choose the appropriate hypothesis test.
Step 2: State the null and research hypotheses, in both words and symbolic notation.
Step 3: Determine the characteristics of the comparison distribution.
Step 4: Determine the critical values, or cutoffs, that indicate the points beyond which we will reject the null hypothesis.
Step 5: Calculate the test statistic.
Step 6: Decide whether to reject or fail to reject the null hypothesis.
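As one possible illustration of steps 3 through 6, here is a sketch of a two-tailed z test for a sample mean. It assumes the population mean and standard deviation are known and that the distribution of means is normal; the function name `z_test`, the alpha level of 0.05, and the example numbers are all hypothetical. Steps 1 and 2 (choosing the test and stating the hypotheses) happen before any arithmetic and appear only as comments.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, mu, sigma, n, alpha=0.05):
    """Two-tailed z test; returns (z statistic, critical value, reject decision)."""
    # Steps 1-2 (assumed done): z test chosen; null: no mean difference from mu.
    se = sigma / math.sqrt(n)                        # Step 3: comparison distribution
    critical = NormalDist().inv_cdf(1 - alpha / 2)   # Step 4: cutoff, about 1.96
    z = (sample_mean - mu) / se                      # Step 5: test statistic
    reject_null = abs(z) > critical                  # Step 6: decision
    return z, critical, reject_null


# Hypothetical data: sample of 36 with mean 105; population mean 100, SD 15.
print(z_test(105, 100, 15, 36))
```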
73
What Does “Statistically Significant” Mean?
A finding is statistically significant if the data differ from what would be expected by chance if there were, in fact, no actual difference. Note: “Statistically significant” does not necessarily mean that the finding is important or meaningful.
74
One-Tailed Versus Two-Tailed Hypothesis Tests
One-tailed test: Research hypothesis is directional, positing either a mean decrease or a mean increase in the dependent variable, but not both, as a result of the independent variable. Two-tailed test: Research hypothesis does not indicate a direction of the mean difference or a change in the dependent variable, but merely indicates that there will be a mean difference.
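The practical difference shows up in the critical values. This sketch assumes a z distribution and a hypothetical alpha level of 0.05; the function name `critical_values` is illustrative. A one-tailed test puts all of alpha in one tail, while a two-tailed test splits it between both tails.

```python
from statistics import NormalDist


def critical_values(alpha=0.05):
    """Critical z values for an upper one-tailed test and for a two-tailed test."""
    std_normal = NormalDist()
    one_tailed_upper = std_normal.inv_cdf(1 - alpha)   # all of alpha in one tail
    two_tailed = (
        std_normal.inv_cdf(alpha / 2),                 # lower cutoff
        std_normal.inv_cdf(1 - alpha / 2),             # upper cutoff
    )
    return one_tailed_upper, two_tailed


one_tailed, (lower, upper) = critical_values()
print(round(one_tailed, 3), round(lower, 3), round(upper, 3))
```

The one-tailed cutoff is less extreme than the two-tailed cutoffs, which is why a directional hypothesis is easier to support in the predicted direction.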
75
p-hacking
p-hacking is the use of questionable research practices to increase the chances of achieving a statistically significant result.