statistics exam 2 Flashcards Preview

Flashcards in statistics exam 2 Deck (63):
1

association

values of one variable tend to occur with certain values of another variable; detected when the conditional distributions differ from the marginal distribution and from each other.

2

bias

a condition where the mean of the statistic values differs from the parameter that the statistic estimates

3

bivariate data

data collected on two variables for each individual in a study.

4

central limit theorem

the name of the statement telling us that the sampling distribution of x bar is approximately normal whenever the sample is large and random.

5

conditional distribution

the distribution of the values in a single row (or a single column) of a two-way table.

6

control chart

a statistical tool for monitoring the input or output of a process

7

control limits

u - 3*sigma/sqrt(n) and u + 3*sigma/sqrt(n); used to detect out-of-control signals in a control chart.
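
As a rough illustration of the control limits on this card, the sketch below computes the lower and upper limits from a population mean, a population standard deviation, and a sample size. The function name `control_limits` and the example numbers are made up for illustration.

```python
import math

def control_limits(mu, sigma, n):
    """Return (lower, upper) three-sigma control limits for means of samples of size n."""
    margin = 3 * sigma / math.sqrt(n)  # three standard deviations of x bar
    return mu - margin, mu + margin

# Hypothetical process: mean 100, standard deviation 8, samples of size 16.
lower, upper = control_limits(mu=100.0, sigma=8.0, n=16)
print(lower, upper)
```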

8

correlation coefficient

a measure of the strength and direction of the linear relationship between two quantitative variables.

9

disjoint events

events that cannot occur simultaneously

10

distribution of a variable

a list of the possible values of a variable together with the frequency of each value (probabilities can be given instead of frequencies)

11

event

a single outcome or a combination of outcomes from a random phenomenon

12

extrapolation

predicting a Y value using a value of X that is outside of the range of X values used to obtain the regression equation. This prediction could be very far off.

13

inference

using the value of a sample statistic to draw conclusions about a population parameter.

14

influential observation

an observation that substantially alters the values of slope and y intercept in the regression equation when it is included in the computations.

15

law of large numbers

the fact that the average (x bar) of observed values in a sample gets closer and closer to the population mean, u, as the sample size increases.
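
One way to see this behavior is a small simulation. The sketch below assumes a fair six-sided die (population mean 3.5) and a fixed random seed; both are choices made for this example, and `running_means` is a hypothetical helper name.

```python
import random

def running_means(n_rolls, seed=0):
    """Simulate fair die rolls and return the running average after each roll."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    total = 0
    means = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # one roll of a fair die
        means.append(total / i)     # x bar after i rolls
    return means

means = running_means(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    # The running average should drift toward 3.5 as n grows.
    print(n, round(means[n - 1], 4))
```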

16

laws of probability

the basis for hypothesis testing and confidence interval estimation

17

least squares

a method for finding the equation of a line that minimizes the sum of squared residuals.

18

least squares regression line

the line with the smallest sum of squared residuals
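
The least squares line for one explanatory variable has a standard closed form. The sketch below applies it to a small made-up data set and checks that nudging the slope or intercept makes the sum of squared residuals larger. The function names and data are illustrative assumptions.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the line passes through (x bar, y bar)
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of (observed y - predicted y)^2 for the line y hat = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]  # made-up explanatory values
ys = [2, 4, 5, 4, 5]  # made-up response values
slope, intercept = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)
print(slope, intercept, best)
# Any other line gives a larger sum of squared residuals:
print(sum_squared_residuals(xs, ys, slope + 0.1, intercept) > best)
```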

19

lurking variable

a variable that is not measured but explains association between two variables that are measured.

20

marginal distribution

the distribution of the values in the "total" row (or the "total" column) of a two-way table

21

mean of the sampling distribution of x bar

the mean of all the sample means (x bars) from all possible samples of size n from a population; equals u

22

u

the mean of the population

23

no association

a condition where values of one variable occur independent of values of another variable; detected when the conditionals of a two-way table equal the marginal distribution (and each other)
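
The comparison of conditional and marginal distributions can be sketched in a few lines. The tables below are made-up counts, and the helper names (`row_conditionals`, `column_marginal`, `has_association`) are hypothetical.

```python
def row_conditionals(table):
    """Conditional distribution of the column variable within each row."""
    return [[count / sum(row) for count in row] for row in table]

def column_marginal(table):
    """Marginal distribution of the column variable (the 'total' row)."""
    totals = [sum(col) for col in zip(*table)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]

def has_association(table, tol=1e-9):
    """True if any row's conditional distribution differs from the marginal."""
    marginal = column_marginal(table)
    return any(
        abs(c - m) > tol
        for row in row_conditionals(table)
        for c, m in zip(row, marginal)
    )

no_assoc = [[20, 30], [40, 60]]  # every row splits 40% / 60%
assoc = [[30, 20], [40, 60]]     # rows split 60/40 and 40/60
print(has_association(no_assoc), has_association(assoc))
```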

24

out-of-control process

a process that shows one sample mean more than three standard deviations of x bar away from the center line, or 9 sample means in a row on the same side of the center line (all above or all below).
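
A minimal sketch of checking both signals on a sequence of sample means follows. It assumes the center line is the population mean, that the run of 9 must fall on the same side of the center line, and that a mean exactly on the center line breaks a run. The function name and data are made up for illustration.

```python
import math

def out_of_control_signals(sample_means, mu, sigma, n, run_length=9):
    """Return a list of (index, reason) signals for a sequence of sample means."""
    margin = 3 * sigma / math.sqrt(n)  # three standard deviations of x bar
    signals = []
    run_side = 0   # +1 above the center line, -1 below, 0 exactly on it
    run_count = 0
    for i, xbar in enumerate(sample_means):
        if abs(xbar - mu) > margin:
            signals.append((i, "outside control limits"))
        side = (xbar > mu) - (xbar < mu)
        if side != 0 and side == run_side:
            run_count += 1
        else:
            run_side = side
            run_count = 1 if side != 0 else 0
        if run_count >= run_length:
            signals.append((i, "run of %d on one side" % run_length))
    return signals

# Hypothetical process: mean 100, standard deviation 8, samples of size 16.
means = [101, 99, 107, 101, 102, 101, 103, 101, 102, 101, 101]
signals = out_of_control_signals(means, mu=100, sigma=8, n=16)
print(signals)
```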

25

outlier

an observation that falls outside the overall pattern of the data set

26

parameter

a characteristic of a population that is usually unknown; this could be mean, median, proportion, standard deviation computed on all the data from the population; a parameter does not have variability

27

parameter symbols

u, sigma, and p (mean of population, standard deviation of population, proportion of a population)

28

positive association

high values of one variable tend to associate with high values of another variable.

29

probability of an outcome

a measure of the proportion of times an outcome occurs in a very long series of repetitions that gives us an indication of the likelihood of the outcome.

30

process

sequence of operations used in production, manufacturing, etc.

31

process in statistical control

a process whose inputs and outputs exhibit only natural variation when observed over time

32

quality control chart

a chart plotting the means, x bar, of regular samples of size n against time; this chart is used to assess whether the process is in control.

33

quantitative bivariate

the type of data required for regression analysis

34

r

the symbol for correlation coefficient

35

r squared

the percentage of total variation in the response variable, y, that is explained by the regression equation; in other words, the percentage of total variation in the response variable, y, that is explained by the explanatory variable, X.
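
To make the "explained variation" reading concrete, the sketch below computes r and r squared on a small made-up data set, with r squared obtained as 1 minus the unexplained variation over the total variation. For a least squares line with one explanatory variable, this should agree with squaring r. The variable names and data are illustrative assumptions.

```python
import math

xs = [1, 2, 3, 4, 5]  # made-up explanatory values
ys = [2, 4, 5, 4, 5]  # made-up response values

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # total variation in Y
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

r = sxy / math.sqrt(sxx * syy)  # correlation coefficient

slope = sxy / sxx
intercept = y_bar - slope * x_bar
# Unexplained variation: the sum of squared residuals about the fitted line.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1 - sse / syy  # proportion of total variation explained
print(r, r_squared, r ** 2)
```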

36

random

describes a phenomenon whose individual outcomes are uncertain but whose outcomes follow a regular distribution in the long run.

37

regression equation

a formula for a line that models a linear relationship between two quantitative variables

38

residual

the observed y minus the predicted y; denoted y-yhat

39

residual plot

a diagnostic plot of the residuals versus the explanatory variable, used to assess how well the regression line fits the data; complete scatter in a shoebox pattern is good, whereas a megaphone pattern denotes unequal variance in the Y's across levels of X, and curvature in the form of a smile or a frown denotes that a linear model is not the best fit for the data.

40

sample mean, x bar

the random variable described by the sampling distribution of x bar

41

sample space

the list of all possible outcomes of a random phenomenon

42

sampling distribution

a distribution of a statistic; a list of all the possible values of a statistic together with the frequency (or probability) of each value

43

sampling distribution of x bar

a list of all the possible values for x bar together with the frequency (or probability) of each value; in other words, the distribution of all x bar's from all possible samples

44

sampling variability

the variability of sample results from one sample to the next; something we must measure in order to effectively do inference

45

scatterplot

a two dimensional plot used to examine strength of relationship between two variables as well as direction and type of relationship.

46

Simpson's paradox

a condition where the percentages reverse when a third (lurking) variable is ignored; in other words, a condition leading to misinterpretation of the direction of association between two variables caused by ignoring a third variable that is associated with both of the reported variables.
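
A small numeric example can make the reversal visible. The counts below are invented for illustration: treatment A has the higher success rate within each group, yet treatment B has the higher rate once the groups are pooled, because the group sizes are very unbalanced.

```python
# (successes, trials) for two hypothetical treatments in two hypothetical groups.
counts = {
    "group 1": {"A": (9, 10), "B": (80, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}

# Success rate for each treatment within each group.
within = {
    group: {t: s / n for t, (s, n) in row.items()}
    for group, row in counts.items()
}

# Success rate for each treatment when the grouping variable is ignored.
overall = {}
for t in ("A", "B"):
    successes = sum(counts[g][t][0] for g in counts)
    trials = sum(counts[g][t][1] for g in counts)
    overall[t] = successes / trials

print(within)   # A is ahead in both groups
print(overall)  # B is ahead after pooling
```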

47

simulation

using random numbers to imitate chance behavior

48

slope

a measure of the average change in the response variable for every one unit increase in the explanatory or independent variable

49

standard deviation (s)

a measure of the variability of data in a sample about x bar.

50

standard deviation of x bar, also called the standard deviation of the sampling distribution of x bar

a measure of the variability of the values of the statistic x bar about u; a measure of the variability of the sampling distribution of x bar; in other words, the "average" amount that the statistic, x bar, deviates from its associated parameter. Computed as sigma/sqrt(n).
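
The formula can be checked by simulation. The sketch below assumes a Uniform(0, 1) population (mean 0.5, standard deviation sqrt(1/12)), samples of size 25, and a fixed seed; these are all choices made for this example, and `simulate_sample_means` is a hypothetical helper.

```python
import math
import random
import statistics

def simulate_sample_means(n, n_samples, seed=0):
    """Draw n_samples random samples of size n from Uniform(0, 1); return their means."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    return [sum(rng.random() for _ in range(n)) / n for _ in range(n_samples)]

n = 25
xbars = simulate_sample_means(n=n, n_samples=5000)

sigma = math.sqrt(1 / 12)             # population standard deviation of Uniform(0, 1)
theoretical = sigma / math.sqrt(n)    # standard deviation of x bar
observed = statistics.pstdev(xbars)   # spread of the simulated sample means
mean_of_means = statistics.fmean(xbars)

# The two standard deviations should be close, and the mean of the
# sample means should be close to the population mean of 0.5.
print(theoretical, observed, mean_of_means)
```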

51

statistic

a number computed from sample data (without any knowledge of the value of a parameter) used to estimate the value of the parameter.

52

statistic symbols

x bar, s, p hat (mean of sample, standard deviation of sample, proportion of sample)

53

statistical process control

a procedure used to check a process at regular intervals to detect problems and correct them before they become serious.

54

sum of squared residuals (or error)

the residuals are squared and added; denoted SSE.

55

total variation in Y

the sum of the squared deviations of the Y observations about their mean, y bar

56

two-way table

a table containing counts for two categorical variables. It has r rows and c columns

57

unbiased

a condition where the mean of the statistic values equals the parameter that the statistic estimates

58

unexplained variation

the sum of squared residuals

59

X

the symbol for explanatory variable

60

x bar chart

a plot of sample means over time used to assess whether a process is in control

61

Y

the symbol for response variable

62

y hat

the symbol for predicted y

63

z-score

a measure of the number of standard deviations of a value or observation from the mean.
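
As a short worked example, a z-score is the observation minus the mean, divided by the standard deviation. The function name and the numbers below are made up for illustration.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev

# Hypothetical observation of 130 from a distribution with mean 100 and standard deviation 15.
z = z_score(130, mean=100, std_dev=15)
print(z)  # 2.0, i.e. two standard deviations above the mean
```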