Midterm Flashcards Preview

Flashcards in Midterm Deck (67):
1

research

disciplined inquiry into questions and theories

2

statistics

organizing numbers and data

3

quantitative research

stats (organizing numbers and data) + disseminating results

4

wheel of science

theory > hypothesis > observations > empirical generalizations

5

descriptive vs inferential statistics

descriptive: what is going on in the data? can be bivariate or multivariate
inferential: generalizing data to population

6

independent & dependent variables

independent variables lead to dependent variables
x = what is doing the predicting, y = what is being predicted

7

discrete vs continuous variables

whole number measurements vs fractional measurements

8

nominal vs ordinal vs interval vs ratio

categories vs ranked variables vs numbers without true zero vs numbers with true zero

9

percentages and proportions

about conceptualizing data
proportions are (f/n), percentages are (f/n) x 100, where f = frequency (number of cases in the category) and n = total number of cases
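
A minimal sketch of the two formulas, assuming f counts the cases in one category and n counts all cases. The function names and the numbers are made up for illustration.

```python
def proportion(f, n):
    """Proportion of cases in a category: f cases out of n total cases."""
    return f / n


def percentage(f, n):
    """Percentage of cases in a category: the proportion times 100."""
    return proportion(f, n) * 100


# Hypothetical data: 15 of 60 respondents fall in the category of interest.
p = proportion(15, 60)
pct = percentage(15, 60)
print(p, pct)
```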

10

good graphs are....

theoretically motivated, easy to understand, useful

11

central tendency

the most typical/common/central score. describes data, makes certain characteristics easy to understand

12

mean and median and mode are all the same when...

the distribution is symmetrical and unimodal, e.g. a normal curve

13

dispersion

how much variation is in the scores? when there is less dispersion, the curve is taller and narrower, and when there is more dispersion, the curve is flatter and wider

14

variation ratio

simple measure of statistical dispersion in nominal distributions; it is the simplest measure of qualitative variation.

v = 1 - fm/n, where fm = the number of cases in the mode, and n = total number of cases
i.e. the proportion of cases not in a modal category
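
The formula can be sketched in a few lines. The category labels below are hypothetical, and the helper name `variation_ratio` is only illustrative.

```python
from collections import Counter


def variation_ratio(values):
    """v = 1 - fm/n: the proportion of cases NOT in the modal category."""
    counts = Counter(values)
    fm = max(counts.values())  # number of cases in the modal category
    n = len(values)            # total number of cases
    return 1 - fm / n


# Hypothetical nominal data: category A is the mode with 5 of 10 cases.
labels = ["A", "A", "A", "B", "B", "C", "A", "B", "A", "C"]
v = variation_ratio(labels)
print(v)
```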

15

determining median in even number of cases

average of the two middle scores

16

interquartile range

the distance between 3rd and 1st quartile i.e. middle 50%
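
A small sketch using Python's standard library, with hypothetical scores. Quartile conventions differ slightly between textbooks and software, so the default method of `statistics.quantiles` is an assumption here and may not match hand calculations on every dataset.

```python
from statistics import quantiles

# Hypothetical scores; quantiles() sorts the data internally.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1  # distance between the 3rd and 1st quartile
print(q1, q2, q3, iqr)
```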

17

all scores ________ the mean

all scores cancel out around the mean (the deviations from the mean sum to zero)

18

mean is the point of ________

mean is the point of minimized variation (the sum of squared deviations is smaller around the mean than around any other value)
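
Both properties of the mean can be checked on a small set of hypothetical scores. This is only an illustration with made-up numbers, not a proof.

```python
# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 5, 10]
mean = sum(scores) / len(scores)

# Property 1: deviations from the mean cancel out (they sum to zero).
deviation_sum = sum(x - mean for x in scores)


def sum_of_squares(center):
    """Sum of squared deviations of the scores around a chosen center."""
    return sum((x - center) ** 2 for x in scores)


# Property 2: squared deviations are smallest around the mean.
print(deviation_sum)
print(sum_of_squares(4), sum_of_squares(mean), sum_of_squares(6))
```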

19

when there is positive skew, x-bar is ____ relative to the median

when there is positive skew, x-bar is greater than the median

20

when there is negative skew, x-bar is ____ relative to the median

when there is negative skew, x-bar is less than the median

21

when there is no skew, x-bar is ____ relative to the median

when there is no skew, x-bar is equal to the median

22

when there is a positive skew, the shape of the curve is...

when there is positive skew, the shape of the curve is stretched out towards the right, with the "lump" being further to the left.

23

when there is a negative skew, the shape of the curve is..

when there is a negative skew, the shape of the curve is stretched out towards the left, with the "lump" being further to the right

24

standard deviation

roughly, the average distance of the scores from the mean
more precisely, the square root of the average squared deviation from the mean
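
A step-by-step sketch of the calculation on hypothetical scores. It assumes the version of the formula that divides by N; some courses divide by N - 1 when working with a sample, so check which one your formula sheet uses.

```python
import math


def standard_deviation(scores):
    """Square root of the average squared deviation from the mean (divides by N)."""
    n = len(scores)
    mean = sum(scores) / n
    squared_deviations = [(x - mean) ** 2 for x in scores]
    variance = sum(squared_deviations) / n
    return math.sqrt(variance)


# Hypothetical scores: mean = 5, squared deviations sum to 32, variance = 4.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
sd = standard_deviation(scores)
print(sd)
```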

25

box plots

the box indicates the middle 50% of the sample. The lower boundary of the box represents the first quartile (i.e. the point that 25% of the sample lies below) and the upper boundary of the box represents the third quartile (i.e. the point that 75% of the sample lies below). The line through the box indicates the median. The whiskers typically extend to the most extreme scores within 1.5 x IQR of the box; scores beyond the whiskers are often plotted individually as outliers.

26

normal curve

theoretical, bell shaped, unimodal, symmetrical, mode/mean/median are equal

27

+/- 1 standard deviation captures __% of the sample

+/- 1 standard deviation captures 68.26% of the sample

28

+/- 2 standard deviations captures __% of the sample

+/- 2 standard deviations captures 95.44% of the sample

29

+/- 3 standard deviations captures __% of the sample

+/- 3 standard deviations captures 99.72% of the sample

30

z-score

z-score is a position along the normal curve; it indicates the number of standard deviations a score falls above or below the mean. e.g. a z-score of 1 means that the data point is 1 standard deviation above the mean
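
A short sketch of the z-score formula, z = (score - mean) / standard deviation, plus the area under the normal curve within a given number of standard deviations. The score, mean, and standard deviation are hypothetical. Areas computed directly can differ from rounded z-table values by about 0.01 percentage points.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Number of standard deviations that x falls above (+) or below (-) the mean."""
    return (x - mean) / sd


# Hypothetical test: a score of 85 when the mean is 70 and the SD is 10.
z = z_score(85, 70, 10)

std_normal = NormalDist()  # standard normal curve: mean 0, standard deviation 1


def area_within(k):
    """Share of the normal curve lying within +/- k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


print(z)
for k in (1, 2, 3):
    print(k, round(area_within(k) * 100, 2))
```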

31

population and parameter are analogous with...

population and parameter are analogous with sample and statistic.
in other words, statistics are characteristics of the sample, and parameters are characteristics of the population

32

EPSEM

equal probability of selection method

33

sampling distribution

theoretical concept that links the sample to the population. The sampling distribution is normal in shape, its mean is equal to the population mean, and its standard deviation (the standard error) is equal to the population standard deviation/sqrt(N).

The sampling distribution represents the distribution of the point estimates based on samples of a fixed size from a certain population.

34

law of large numbers

the larger the sample, the closer the sample statistic (e.g. the sample mean or proportion) gets to the true population value.

The law of large numbers is a principle of probability according to which the frequencies of events with the same likelihood of occurrence even out, given enough trials or instances.

So if you flip 10 coins, you may get 90% heads and 10% tails, but if you flip 100 coins, you're more likely to get closer to 50% heads and 50% tails. The proportion of heads after n flips will almost surely converge to 1/2 as n approaches infinity.
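
The coin-flip example can be simulated. This is a sketch with an arbitrary random seed; the exact proportions depend on the seed, but the share of heads should drift toward 0.5 as the number of flips grows.

```python
import random

random.seed(325)  # arbitrary seed so the run is repeatable


def proportion_heads(n_flips):
    """Flip a fair coin n_flips times and return the share of heads."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


results = {n: proportion_heads(n) for n in (10, 100, 10_000, 100_000)}
for n, share in results.items():
    print(n, share)
```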

35

central limit theorem

The central limit theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normally distributed. This will hold true regardless of whether the source population is normal or skewed, provided the sample size is sufficiently large (usually n > 30).

the average of your sample means will be the population mean

36

standard error

standard deviation of the sampling distribution
e.g. plotting the means of 50 samples of 10 would give you an approximately normal curve; the standard deviation of that curve is the standard error
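
A simulation sketch of the idea: draw many samples from a population, record each sample mean, and compare the spread of those means with the population standard deviation divided by the square root of the sample size. The population here is hypothetical and deliberately skewed, and the seed, sample size, and number of samples are arbitrary choices.

```python
import math
import random
from statistics import mean, pstdev

random.seed(325)  # arbitrary seed so the run is repeatable

# Hypothetical, positively skewed population of scores.
population = [random.expovariate(1.0) for _ in range(20_000)]
mu = mean(population)
sigma = pstdev(population)

n = 50             # size of each sample
n_samples = 2_000  # number of samples drawn

# Draw repeated samples (with replacement) and record each sample mean.
sample_means = [mean(random.choices(population, k=n)) for _ in range(n_samples)]

observed_se = pstdev(sample_means)     # SD of the simulated sampling distribution
theoretical_se = sigma / math.sqrt(n)  # population SD / sqrt(n)

print(round(mu, 3), round(mean(sample_means), 3))
print(round(theoretical_se, 3), round(observed_se, 3))
```

The two numbers on each printed line should be close: the average of the sample means sits near the population mean, and the spread of the sample means sits near the theoretical standard error.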

37

point estimate

a single statistic used to infer info about the population
e.g. taking the mean of the heights of a sample of students and inferring the mean of the heights of all students from the sample mean

38

criteria for choosing estimators

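-bias: an estimator is unbiased if the mean of its sampling distribution is equal to the population value of interest.
-efficiency: the more tightly the sampling distribution is clustered around its mean (i.e. the smaller the standard error), the more efficient the estimator.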

39

z-score for a 95% confidence interval

1.96

40

alpha

the probability of error you are willing to accept; how certain do you want to be?
e.g. alpha = 0.05 means a confidence level of 95%
every alpha has a z-score associated with it
e.g. alpha = 0.05 (split across two tails) has a z-score of +/- 1.96

41

constructing confidence intervals for means

(1) set the alpha
(2) find the z-score associated with that alpha
(3) use formula for confidence intervals with sample means
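
The three steps can be sketched as follows, assuming the interval is sample mean +/- z x (s / sqrt(n)). Some textbooks use sqrt(n - 1) in the denominator when s comes from the sample, so treat the exact formula as an assumption. The sample values are hypothetical.

```python
import math
from statistics import NormalDist


def confidence_interval_mean(sample_mean, s, n, alpha=0.05):
    """Confidence interval for a mean: sample_mean +/- z * (s / sqrt(n))."""
    # Steps 1 and 2: alpha is split across both tails, then converted to a z-score.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: plug into the confidence interval formula.
    standard_error = s / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample: mean 100, standard deviation 15, n = 225.
lower, upper = confidence_interval_mean(100, 15, 225, alpha=0.05)
print(round(lower, 2), round(upper, 2))
```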

42

the bigger the sample the _____ the width of the confidence interval because _______.

the bigger the sample the smaller the width of the confidence interval because standard error is smaller.

43

what would you do to increase the confidence interval?

decrease the alpha (i.e. raise the confidence level), e.g. instead of alpha = 0.05 (95% CI), set alpha to 0.01 (99% CI).

44

confidence interval _____ as confidence level _____.

confidence interval widens as confidence level increases.

45

Null hypothesis vs alternative hypothesis

null hypothesis (H0) always says there is no significant difference. alternative hypothesis (HA) says there is a significant difference. We always start by assuming that the null is true, and only reject it if the evidence is strong enough.

46

what is a hypothesis test?

-make a hypothesis
-use z-score formula to determine probability of getting the observed difference: "this difference is statistically significant at the alpha = 0.05 level."
-trying to identify statistically significant differences that didn't occur by chance

47

5 step model of hypothesis testing: one sample case

(1) make assumptions -level of measurement is interval ratio, sampling distribution is normal (basically n > 120)
(2) state null hypothesis
(3) select sampling distribution and establish a critical region
(4) compute the test statistic and compare it to the critical region
(5) make decision and interpret the results, either rejecting the null or failing to reject the null
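
A rough sketch of steps 3 to 5 for a two-tailed, one-sample z test, assuming z = (sample mean - population mean) / (s / sqrt(n)). The function name and all numbers are hypothetical, and your course formula may use sqrt(n - 1) instead.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, s, n, alpha=0.05):
    """Return (z obtained, z critical, reject null?) for a two-tailed test."""
    # Step 3: the critical region starts at +/- z_critical.
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 4: compute the test statistic.
    z_obtained = (sample_mean - pop_mean) / (s / math.sqrt(n))
    # Step 5: reject the null if the statistic falls in the critical region.
    return z_obtained, z_critical, abs(z_obtained) > z_critical


# Hypothetical data: sample mean 103 vs. population mean 100, s = 15, n = 225.
z_obt, z_crit, reject = one_sample_z_test(103, 100, 15, 225)
print(round(z_obt, 2), round(z_crit, 2), reject)
```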

48

one-tailed vs two-tailed test

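one-tailed = "significantly less/more"; at alpha = 0.05 the critical value is +1.65 or -1.65, depending on the predicted direction
two-tailed = "significantly different"; at alpha = 0.05 the critical values are +/- 1.96
one-tailed is stronger in the sense that it is easier to reject the null, but only in the predicted direction.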

49

alpha levels affect what in hypothesis testing?

critical region
larger alpha = larger critical region (critical z-score closer to the mean); smaller alpha = smaller critical region
e.g. alpha = 0.05, critical region +/- 1.96; alpha = 0.10, critical region +/- 1.65
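
The link between alpha and the critical z-score can be computed directly from the standard normal curve. This sketch assumes a two-tailed test splits alpha evenly across both tails and a one-tailed test puts all of alpha in one tail; the helper name is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal curve: mean 0, standard deviation 1


def critical_z(alpha, tails=2):
    """Critical z-score that leaves alpha in the tail(s) of the normal curve."""
    return std_normal.inv_cdf(1 - alpha / tails)


# Columns: alpha, two-tailed critical z, one-tailed critical z.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(critical_z(alpha, tails=2), 3), round(critical_z(alpha, tails=1), 3))
```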

50

type I error

rejecting a true null hypothesis. aka alpha error. this happens when the difference occurred by random chance but you claimed that it was statistically significant. you can reduce the risk of type I error by decreasing the alpha, e.g. setting alpha = 0.01 instead of 0.05, i.e. saying you want to be 99% sure instead of 95% sure that a difference is statistically significant.

51

type II error

failing to reject a false null hypothesis. aka beta error. this happens when the difference was real but you claimed that it was not statistically significant and happened by random chance. you can reduce the risk of type II error by increasing the alpha, e.g. setting alpha = 0.05 instead of 0.01, i.e. saying you want to be 95% sure instead of 99% sure.

52

degrees of freedom

(n-1)

53

student's t distribution

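used for smaller samples (n < 120) when the population standard deviation is unknown. the student's t distribution is flatter and wider than the z-distribution, with more area in the tails, and it approaches the z-distribution as the degrees of freedom increase.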

54

two sample test of means for large samples

(1) make assumptions - the samples must be independent random samples i.e. mutually exclusive; interval ratio measurements; sampling distribution is normal (basically n > 120)
(2) State the null hypothesis
(3) select sampling distribution and establish critical region
(4) compute the test statistic and compare it to the critical region
(5) make decision and interpret results
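
A sketch of the test statistic for two large samples, assuming z = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2). Textbooks vary on whether n or n - 1 goes in the denominators, so the formula here is an assumption, and the sample values are hypothetical.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """z statistic for the difference between two independent sample means."""
    se_difference = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / se_difference


# Hypothetical samples: group 1 (mean 75, s 10, n 200), group 2 (mean 72, s 12, n 180).
z_obtained = two_sample_z(75, 10, 200, 72, 12, 180)
z_critical = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tailed, alpha = 0.05
reject_null = abs(z_obtained) > z_critical
print(round(z_obtained, 3), round(z_critical, 2), reject_null)
```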

55

two sample test of means for small samples

(1) make assumptions - the samples must be independent random samples i.e. mutually exclusive; interval ratio measurements; population variances are equal (as long as the 2 samples are approximately the same size, we can make this assumption); sampling distribution is normal (because we're using small samples, we have to add the previous assumption in order to make this one)
(2) State the null hypothesis
(3) select sampling distribution and establish critical region
(4) compute the test statistic and compare it to the critical region
(5) make decision and interpret results

56

two sample test for proportions

(1) make assumptions - the samples must be independent random samples i.e. mutually exclusive; nominal measurements; sampling distribution is normal (basically n > 120)
(2) State the null hypothesis
(3) select sampling distribution and establish critical region
(4) compute the test statistic and compare it to the critical region
(5) make decision and interpret results
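
A sketch of the test statistic for two sample proportions, assuming the usual pooled estimate of the population proportion under the null hypothesis. The function name and the sample values are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent sample proportions."""
    # Pooled proportion: the best estimate if the null (no difference) is true.
    p_pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se_difference = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se_difference


# Hypothetical samples: 60% of 200 cases vs. 50% of 200 cases.
z_obtained = two_proportion_z(0.60, 200, 0.50, 200)
z_critical = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tailed, alpha = 0.05
reject_null = abs(z_obtained) > z_critical
print(round(z_obtained, 3), round(z_critical, 2), reject_null)
```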

57

significance vs importance

differences that are otherwise trivial or uninteresting may be statistically significant. Significance only says that a difference observed in the sample is unlikely to be due to chance (i.e. it probably exists in the population too); it doesn't say whether the difference is important. The substantive importance is up for interpretation.

58

test statistics get ____ as n gets ____.

test statistics (like z and t) get larger as n gets larger, so the p-value gets smaller.
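
The effect of sample size can be seen by holding a small difference fixed and changing only n. This sketch assumes the one-sample statistic z = difference / (s / sqrt(n)); the half-point difference and the standard deviation of 15 are hypothetical.

```python
import math


def z_for_difference(difference, s, n):
    """z statistic for a fixed mean difference at a given sample size."""
    return difference / (s / math.sqrt(n))


# Same half-point difference, same standard deviation, growing sample size.
for n in (100, 10_000, 1_000_000):
    print(n, round(z_for_difference(0.5, 15, n), 2))
```

With n = 100 the statistic falls short of 1.96, while the larger samples push the same trivial difference well past it.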

59

confidence interval vs two sample test

when you're using the two-sample test, you're taking both estimates of the means and both standard deviations into account at once. So it is possible for the two confidence intervals (error bars) to overlap while the difference is still statistically significant.

60

what is the variance of the standard normal curve (the z distribution)?

1 (mean = 0, standard deviation = 1)

61

population values can be estimated with...

sample values

62

what is a point estimate?

the use of sample data to calculate a single value (known as a statistic) which is to serve as a "best guess" or "best estimate" of an unknown (fixed or random) population parameter

63

which sample statistics are unbiased?

means and proportions

64

what is efficiency?

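how tightly the sampling distribution is clustered around its mean. In practice it comes down to sample size: the larger the sample, the smaller the standard error and the more efficient the estimate.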

65

The (larger/smaller) the sample size, the (higher/lower) the value of the standard deviation of the sampling distribution.

larger, lower

66

The (larger or smaller) the sample size, the more tightly clustered the sample outcomes will be around the mean of the sampling distribution.

larger

67

difference between point estimates and interval estimates

point estimate: we estimate the population value is the same as the sample statistic
interval estimate: we construct a confidence interval, a range of values within which we estimate the population value falls