Ch 12 - Data-Based and Statistical Reasoning Flashcards

1
Q

measures of central tendency

A

values that describe the middle of a sample; how the "middle" is defined differs by measure (mean, median, or mode)
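
As a minimal sketch, three common measures of central tendency can be computed with Python's standard `statistics` module. The data set is made up purely for illustration.

```python
import statistics

# Hypothetical sample, chosen only for illustration
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # arithmetic average: 30 / 6
median = statistics.median(data)  # average of the two middle values: (3 + 5) / 2
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)
```

Each measure gives a different "middle" for the same data, which is why the definition matters.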

2
Q

mode

A

number that appears most often in a set of data

3
Q

normal distribution

A

symmetric, bell-shaped distribution in which the mean, median, and mode are equal and lie at the center of the distribution

4
Q

standard distribution

A

normal distribution with a mean of zero and a standard deviation of one (also called the standard normal distribution)

5
Q

skewed distribution

A

contains a tail on one side or the other of the data set. Negatively skewed: the tail is to the left and the mean is lower than the median. Positively skewed: the tail is to the right and the mean is higher than the median
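
A quick way to see the mean/median relationship is to compare the two on small skewed samples. The sketch below uses made-up data; the second list is simply a mirror image of the first.

```python
import statistics

# Hypothetical samples: one long tail to the right, one to the left
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
left_skewed = [1, 17, 18, 18, 18, 19, 19, 20]

for name, data in (("positive", right_skewed), ("negative", left_skewed)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    # The mean is pulled toward the tail; the median barely moves
    print(name, "skew: mean =", mean, "median =", median)
```

In the positively skewed sample the mean (4.75) sits above the median (3); in the negatively skewed sample the mean (16.25) sits below the median (18).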

6
Q

bimodal

A

distribution containing two peaks; it does not have to have two modes (if one peak is slightly higher than the other, there is strictly only one mode)

7
Q

range

A

difference between data set’s largest and smallest values

8
Q

interquartile range

A

related to the median and the first and third quartiles; found by subtracting the value of the first quartile from the value of the third quartile (IQR = Q3 - Q1)

9
Q

quartiles

A

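include the median (Q2); divide data, when placed in ascending order, into groups that each comprise one-fourth of the entire set. To find the first quartile, multiply n (the number of data points) by 1/4; if the result is a whole number, Q1 is the mean of the value at that position and the value at the next position; if it is not a whole number, round up and take the value at that position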
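
The position rule can be sketched in a few lines of Python. The function name `quartile` and the sample data are hypothetical, and the handling of a non-whole position (round up and take that value) follows a common textbook convention that is treated as an assumption here; statistical software often uses other interpolation rules and may give slightly different quartiles.

```python
def quartile(sorted_data, k):
    """Return the k-th quartile (k = 1, 2, or 3) of data already in ascending order."""
    n = len(sorted_data)
    numerator = n * k              # the position is n * k / 4, counted from 1
    if numerator % 4 == 0:
        pos = numerator // 4       # whole-number position
        # Mean of the value at this position and the value at the next position
        return (sorted_data[pos - 1] + sorted_data[pos]) / 2
    pos = -(-numerator // 4)       # assumption: round a fractional position up
    return sorted_data[pos - 1]

data = sorted([7, 1, 13, 3, 9, 15, 5, 11])   # hypothetical sample, n = 8
q1, q2, q3 = (quartile(data, k) for k in (1, 2, 3))
iqr = q3 - q1                                # interquartile range
print(q1, q2, q3, iqr)
```

For this sample the positions are 2, 4, and 6, so each quartile is the mean of two neighboring values: Q1 = 4, Q2 = 8, Q3 = 12, and the interquartile range is 8.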

10
Q

example of using interquartile range to determine outliers

A

find the interquartile range (IQR = third quartile - first quartile). Multiply the IQR by 1.5 and add the result to the third quartile; any value above this number is an outlier. Multiply the IQR by 1.5 and subtract the result from the first quartile; any value below this number is an outlier
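
The procedure can be sketched as a short Python function. The name `iqr_outliers` and the data are hypothetical; the quartile values are passed in directly (here Q1 = 4 and Q3 = 12 are assumed to have been found beforehand).

```python
def iqr_outliers(data, q1, q3):
    """Return values lying more than 1.5 * IQR beyond the first or third quartile."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr   # anything below this is an outlier
    upper_fence = q3 + 1.5 * iqr   # anything above this is an outlier
    return [x for x in data if x < lower_fence or x > upper_fence]

# Hypothetical sample with assumed quartiles Q1 = 4 and Q3 = 12
data = [1, 3, 5, 7, 9, 11, 13, 40]
outliers = iqr_outliers(data, q1=4, q3=12)
print(outliers)
```

With an IQR of 8 the cutoffs are -8 and 24, so only 40 is flagged.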

11
Q

standard deviation

A

calculated by taking the difference between each data point and the mean, squaring each difference, dividing the sum of these squared values by the number of points in the data set minus one (n - 1), and taking the square root of the result
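
The calculation can be written out step by step and checked against `statistics.stdev`, which also divides by n - 1. The helper name `sample_sd` and the data are hypothetical.

```python
import math
import statistics

def sample_sd(data):
    """Sample standard deviation, following the steps on the card."""
    mean = sum(data) / len(data)
    squared_diffs = [(x - mean) ** 2 for x in data]   # (point - mean) squared
    variance = sum(squared_diffs) / (len(data) - 1)   # divide by n - 1
    return math.sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample with mean 5
sd = sample_sd(data)
print(sd, statistics.stdev(data))  # the two values should agree
```

Here the squared differences sum to 32, so the result is the square root of 32/7.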

12
Q

determining outlier via standard deviation

A

after the standard deviation is determined, any value that falls more than three standard deviations from the mean is an outlier
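
A minimal sketch of this rule, using a hypothetical helper `sd_outliers` and made-up data. Note that in a very small sample a single extreme value inflates the standard deviation so much that it can never lie three standard deviations out, so the example uses 20 points.

```python
import statistics

def sd_outliers(data, threshold=3):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)   # sample standard deviation (n - 1)
    return [x for x in data if abs(x - mean) > threshold * sd]

# Hypothetical sample: 19 values near 10 plus one extreme value
data = [9, 10, 11] * 6 + [10, 30]
print(sd_outliers(data))
```

The mean is 11 and the standard deviation is about 4.5, so 30 lies more than three standard deviations away and is the only value flagged.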

13
Q

standard deviation and normal distribution

A

in a normal distribution, approximately 68% of data points fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% (often rounded to 99%) within three standard deviations
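
These percentages can be checked numerically with `statistics.NormalDist` (Python 3.8+). The helper name `fraction_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(fraction_within(k), 4))
```

The printed fractions are roughly 0.6827, 0.9545, and 0.9973.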

14
Q

independent events in probability

A

events that have no effect on one another

15
Q

dependent events in probability

A

events that have an impact on one another, such that the order in which they occur changes the probability

16
Q

mutually exclusive outcomes

A

cannot occur at the same time; probability of them occurring together is 0%

17
Q

exhaustive outcomes

A

a group of outcomes that is all inclusive so that there are no other possible outcomes

18
Q

calculating independent probability

A

P(A and B) = P(A) x P(B): the probability of the first event times the probability of the second event equals the probability that both will occur

19
Q

probability of at least one of two events occurring

A

P(A or B) = P(A) + P(B) - P(A and B)
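
As a worked example of both probability rules, consider two independent events: a fair coin landing heads and a fair die showing a six. The scenario is made up for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

p_heads = Fraction(1, 2)   # P(A): fair coin lands heads
p_six = Fraction(1, 6)     # P(B): fair die shows a six

# Independent events: probability that both occur
p_both = p_heads * p_six

# Probability that at least one of the two occurs
p_at_least_one = p_heads + p_six - p_both

print(p_both, p_at_least_one)
```

Subtracting P(A and B) prevents the case where both events happen from being counted twice.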

20
Q

hypothesis testing

A

begins with an idea about what may be different between two populations

21
Q

null hypothesis

A

always a hypothesis of equivalence; says that two populations are equal, or that a single population can be described by a parameter equal to a given value; when it can be rejected because the p-value is less than the significance level (alpha), the results are statistically significant

22
Q

alternative hypothesis

A

may be nondirectional (the populations are not equal) or directional (for example, the mean of population A is greater than the mean of population B)

23
Q

z- or t-tests

A

most common hypothesis tests; rely on the standard distribution or the closely related t-distribution

24
Q

test statistic

A

calculated from the data and compared to a table to determine the likelihood that the statistic was obtained by random chance (under the assumption that the null hypothesis is true)

25
Q

type I error

A

incorrectly rejecting the null hypothesis, meaning we report a difference between two populations when one does not actually exist; its probability is alpha, the level of risk we are willing to accept for making this error

26
Q

type II error

A

we incorrectly fail to reject the null hypothesis; we determine there is no difference between two populations when one actually does exist; probability of this type of error is sometimes symbolized by beta

27
Q

power

A

probability of correctly rejecting the null hypothesis and reporting a difference between populations when one does exist; equal to 1 - beta, where beta is the probability of a type II error

28
Q

confidence

A

probability of correctly failing to reject the null hypothesis and reporting that two populations are equal when they actually are; equal to 1 - alpha

29
Q

confidence intervals

A

the reverse of hypothesis testing; determine a range of values from the sample mean and standard deviation. Rather than finding a p-value, we begin with a desired confidence level and use a table to find its corresponding z- or t-score. We then multiply this score by the standard deviation and create a range by subtracting the result from, and adding it to, the mean
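
A minimal sketch of a 95% confidence interval using a z-score. All numbers are hypothetical. One assumption to flag: for an interval around a sample mean, this sketch multiplies the z-score by the standard deviation of the mean (the standard error, sd / sqrt(n)), which is the usual choice; the z-score comes from `statistics.NormalDist` rather than a printed table.

```python
import math
from statistics import NormalDist

# Hypothetical sample summary
sample_mean = 100
sample_sd = 15
n = 25
confidence = 0.95

# z-score leaving (1 - confidence) / 2 in each tail, about 1.96 for 95%
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Assumption: use the standard error of the mean as the spread
standard_error = sample_sd / math.sqrt(n)
margin = z * standard_error

low, high = sample_mean - margin, sample_mean + margin
print(round(low, 2), round(high, 2))
```

With these numbers the interval runs from about 94.12 to 105.88.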

30
Q

p-value

A

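probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed; found by comparing the test statistic to a table, then compared to the significance level (alpha)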
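
To tie the test statistic, p-value, and significance level together, here is a sketch of a two-sided one-sample z-test. The scenario and every number in it are hypothetical, and `statistics.NormalDist` stands in for the lookup table.

```python
import math
from statistics import NormalDist

# Hypothetical setup: null hypothesis says the population mean is 100
null_mean = 100
population_sd = 15      # assumed known, which is what makes a z-test appropriate
sample_mean = 103
n = 100
alpha = 0.05            # significance level

# Test statistic: how many standard errors the sample mean is from the null mean
z = (sample_mean - null_mean) / (population_sd / math.sqrt(n))

# Two-sided p-value: probability of a statistic at least this extreme under the null
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_null = p_value < alpha
print(z, round(p_value, 4), reject_null)
```

Here z is 2.0 and the p-value is about 0.0455, which is below alpha, so the null hypothesis would be rejected.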

31
Q

slope

A

m = rise/run = Δy/Δx = (y2 - y1)/(x2 - x1)
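
The slope formula translates directly into code. The function name `slope` and the two points are hypothetical.

```python
def slope(p1, p2):
    """Slope of the line through points p1 = (x1, y1) and p2 = (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    # rise over run; raises ZeroDivisionError for a vertical line (x1 == x2)
    return (y2 - y1) / (x2 - x1)

m = slope((1, 2), (3, 8))   # rise = 6, run = 2
print(m)
```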