Definitions Flashcards

(109 cards)

1
Q

measurement

A
  • Assigning numbers or codes to aspects of objects or events according to rules
  • positioning observations along a numerical continuum
  • classifying observations into categories
2
Q

Observation

A

Unit upon which measurement is made

3
Q

Variable

A

Measurable characteristic that varies among persons, places, or objects

4
Q

Nominal measurements

A

Observation variable that has two or more categories with no intrinsic ordering to the categories. Nonparametric.

Examples: sex, blood type

aka. Categorical variable, attribute variable, qualitative variables

5
Q

Ordinal measurements

A

Observation variable that has categories that can be put into rank order. Differs from interval measurements b/c the space b/w values is not equal. Nonparametric.

Examples: stage of cancer (on a point scale); economic status (low, med, high)

6
Q

Quantitative measurements

A

Observation variables lie along a meaningful numeric scale.

  • Interval = equal spacing b/w values, but no absolute zero (e.g. Fahrenheit, Celsius)
  • Ratio = values have an absolute zero and can be added (e.g. age, body weight, Kelvin)

aka. ratio/interval measurement, numeric variable, scale variable, continuous variable.

7
Q

Surveys

A

Type of study used to quantify population characteristics. Relies on the “sampling” rule of statistics b/c data for the entire population is rarely available.

8
Q

Simple Random Sample (SRS)

A

Randomly sample the population to collect data so that:

1) each population member has the same probability of being selected into the sample
2) selection of any individ. into the sample does not bias the selection of another individ.
aka. sampling independence

9
Q

Cautions

A

Samples that tend to over- or under-represent certain segments of the pop can bias survey results.

10
Q

Undercoverage

A

Type of sample caution. Occurs when some groups in the source pop are left out or underrepresented. Will undermine achieving equal selection probabilities.

11
Q

Volunteer Bias

A

Type of sample caution. Occurs b/c self-selected participants of a survey are atypical of the pop. Ex. web survey volunteers may hold a particular viewpoint causing them to participate.

12
Q

Nonresponse Bias

A

Type of sample caution. Occurs when a large % of individs refuse to participate in a survey; nonresponders differ from responders, which skews survey results.

13
Q

Probability Sample

A

Each member of the pop has a known probability of being selected. Includes SRS, stratified random samples, cluster samples, and multistage sampling.

14
Q

Stratified random sample

A

Draws an independent SRS from each homogeneous “group” or “stratum.” Ex. divide the pop into age groups.

15
Q

Cluster samples

A

Randomly selects large units (clusters) consisting of smaller subunits. Ex. randomly select household addresses, then study all individs in each cluster.

16
Q

Comparative study

A

Learn the relationship b/w an explanatory variable and a response variable. Compares a group exposed vs. not exposed to the explanatory factor.

  • two types: Experimental and Non-Experimental (observational)
17
Q

Experimental studies

A

Investigator assigns exposure to one group and not the other

18
Q

Nonexperimental Studies

A

Investigator classifies groups as exposed or nonexposed w/o intervention. aka. observational studies

19
Q

Explanatory Variable (IV)

A

Treatment or exposure that explains or predicts change in the response variable.

aka. (IV) Independent variable

20
Q

Response Variable (DV)

A

Outcome or response being investigated.

aka. (DV) Dependent variable.

21
Q

Lurking variables

A

Extraneous factors

22
Q

Confounding Variables

A

Distortion in an association b/w the explanatory variable and response variable caused by the influence of extraneous factors.

23
Q

Factors

A

Explanatory variables in experiments

24
Q

Treatment

A

Specific set of factors applied to subject

25
Interaction
Factors in combination produce effects that could not be predicted by looking at the effect of the factors separately.
26
Trials
Experiments involving human subjects. Two types: Controlled and Randomized Controlled
27
Randomized controlled trial
Treatment assignment is based on chance. Helps sort out effects of the treatment from those of lurking variables.
28
Equipoise
Balanced doubt about benefits and risks
29
Discrete variable
Finite number of values b/w any 2 points
30
Continuous variable
infinite number of values b/w 2 points
31
Shape (graph)
Configuration of data points as they appear on a graph. Described in terms of:
* Skewness: whether the shape mirrors itself (symmetry)
* Modality: number of peaks
* Kurtosis: “peakedness” of the distribution
32
Location (graph)
Distribution summarized by its center (central tendency):
* Mean: center of the distribution; the “arithmetic avg.” is the distribution's balancing point
* Median
* Mode
33
Depth of data Point
Corresponds to its rank from either the top or bottom of an ordered list of values.
34
Spread (graph)
Refers to the distribution/variability of data points. Measures of spread:
* Range
* Quartiles
* Std. Dev.
* Variance
35
Class intervals
Group data into intervals with equal or unequal spacing before tallying frequencies. Endpoint convention ensures each observation falls within exactly one interval:
* include the left boundary and exclude the right, or
* include the right boundary and exclude the left
36
Relative Frequency
Proportion: frequency count divided by the total. Expressed as a %.
37
Cumulative Frequency
Proportion that falls in or below a certain level. Equation: add consecutive relative frequencies. Expressed as a %.
38
Bar Chart
Displays frequencies with bars whose heights correspond to the frequencies. Best for categorical variables.
39
Histogram
Bar chart with adjoining bars over class intervals. Best for quantitative variables.
40
Descriptive Statistics
Set of observations that describe the characteristics of a sample. Ex: central tendency (mean, median, mode), variability (Std. Dev., variance, range, quartiles)
41
Inferential Statistics
Set of statistical techniques that provide predictions about the population based on info from a sample of the pop.
42
Univariate Statistics
Involve one variable at a time (i.e. age, height, weight)
43
Bivariate statistics
Involve two variables of the sample examined simultaneously (pre/post test)
44
Multivariate Statistics
Involve 2 or more variables in the same analysis
45
Stemplot
graphical technique that organizes data in a histogram-like display
46
mean
Arithmetic average of data VALUES. Balancing point of a set. Highly susceptible to outliers and skew. Formulas: sample: x̅ = (ΣX)/n; population: µ = (ΣX)/N. Functions: 1) predict an individ. value drawn at random from the sample, 2) predict a value drawn at random from the pop. \* Best paired with Std. Dev. for symmetrical distributions
47
Median
Midpoint of a distribution in CASES. More ROBUST (resilient to outliers and skew). Formula: put the values in order, calculate the depth (n+1)/2, count in that many places to the midpoint. \* Best paired with IQR for asymmetrical distributions. Always Q2, the 50th percentile
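A quick stdlib Python sketch of the mean/median contrast in the two cards above (hypothetical data; the single extreme value drags the mean but barely moves the median):

```python
import statistics

# Hypothetical data with one extreme outlier
data = [30, 32, 35, 36, 38, 40, 300]

mean = statistics.mean(data)      # balancing point; dragged up by 300
median = statistics.median(data)  # midpoint of the ordered cases; robust

print(mean, median)  # the outlier pulls the mean far above the median
```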
48
Mode
Most frequently occurring value in a data set. Useful only in large sets with repeating values.
49
Variability
Measure of spread. Fundamental interest of behavioral scientists.
50
Range
Measures the spread of a distribution; the simplest measure of variability. Range = maximum − minimum. Limitations: known to be biased and highly unstable; increases w/ sample size. \*Should always be supplemented with another measure of spread.
51
Quartile
Intuitive way to describe variability by dividing the data set into 4 segments:
- Q0 (min) = 0%
- Q1 (lower hinge) = 25%
- Q2 (median) = 50%
- Q3 (upper hinge) = 75%
- Q4 (max) = 100%
Find the MEDIAN to identify quartiles.
52
Hinges
Points where the ordered array “folds” upon itself (Q1 and Q3).
53
Interquartile Range
Summary measure of spread that captures the middle 50% of data points in a set.
* 5-point summary (Q0-Q4)
* IQR = Q3 − Q1, where Q1 is the median b/w Q0 and Q2, Q3 is the median b/w Q2 and Q4, and Q2 is the overall median
* Not sensitive to extreme values
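The quartile/IQR cards can be sketched in Python (stdlib only; `five_point_summary` is an illustrative helper name, and the hinge convention here includes the median in both halves when n is odd, per Tukey's hinges):

```python
import statistics

def five_point_summary(values):
    """Q0-Q4 by the hinge method: Q2 is the overall median; Q1 and Q3
    are the medians of the lower and upper halves."""
    v = sorted(values)
    n = len(v)
    q2 = statistics.median(v)
    half = (n + 1) // 2  # halves include the median when n is odd
    q1 = statistics.median(v[:half])
    q3 = statistics.median(v[n - half:])
    return v[0], q1, q2, q3, v[-1]

data = [1, 3, 4, 7, 8, 9, 12]  # hypothetical values
q0, q1, q2, q3, q4 = five_point_summary(data)
print(q3 - q1)  # IQR captures the middle 50% of the data
```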
54
Box-and-Whiskers plot
Displays the five-point summary and “potential outliers” in graphical form. aka. box plot. The box spans the IQR.
55
Fences
Lower = Q1 − (1.5)IQR; Upper = Q3 + (1.5)IQR. Values below the lower fence are “lower outside values”; values above the upper fence are “upper outside values.” The smallest value inside the lower fence is the “lower inside value”; the largest value inside the upper fence is the “upper inside value.”
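A minimal sketch of the fence rule (Python stdlib; `fences` is an illustrative helper and the data are hypothetical):

```python
def fences(q1, q3):
    """Lower = Q1 - 1.5*IQR; Upper = Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical quartiles; values outside the fences are "outside values"
lower, upper = fences(q1=3.5, q3=8.5)
data = [1, 3, 4, 7, 8, 9, 12, 30]
outside = [x for x in data if x < lower or x > upper]
print(lower, upper, outside)  # -4.0 16.0 [30]
```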
56
Variance
Common measure of spread. Population: σ² = SS/N; Sample: S² = SS/(n−1)\*. SS = Sum of Squared deviations. \*Subtract 1 from n to force a larger variance and SD (makes it an unbiased estimate)
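The two divisors (N vs. n−1) can be sketched in Python (stdlib only; helper names are illustrative):

```python
def sum_of_squares(values):
    """SS: sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

def population_variance(values):  # sigma^2 = SS / N
    return sum_of_squares(values) / len(values)

def sample_variance(values):      # S^2 = SS / (n - 1), the unbiased estimate
    return sum_of_squares(values) / (len(values) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
print(population_variance(data))  # 4.0
print(sample_variance(data))      # slightly larger, since we divide by n - 1
```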
57
Variability
* Always present an average with its variability so as not to misrepresent the data. * Two data sets can have the same average but different variability.
58
Standard Deviation
Common measure of spread. Unbiased estimate for samples (good scientists are CONSERVATIVE!) Formula: square root of the variance. * Sensitive to outliers and skew * Useful for making comparisons * The smaller the SD, the more HOMOGENEOUS the set
59
Chebyshev's Rule
For data sets: at least 3/4 of the data points lie within two std. devs. of the mean.
60
Normal Rule
For data sets: applies only to distributions with a particular NORMAL shape.
* 68.3% of data points lie within mean ± 1 std. dev.
* 95.4% of data points lie within mean ± 2 std. devs.
* 99.7% of data points lie within mean ± 3 std. devs.
aka. the 68-95-99.7 rule
Properties of the Normal curve: * Symmetrical * Unimodal * Bell-shaped * Mean, median, and mode are equal
61
Symmetrical vs. Asymmetrical Distribution
Symmetrical: Mean = Median. Asymmetrical: Mean ≠ Median
* Positive skew: Mean \> Median
* Negative skew: Mean \< Median
62
Sum of Squares
Each data point's deviation from the data set mean, squared, then all summed. aka. SS = Σ(X − X̄)². Computational formula: SS = ΣX² − (ΣX)²/n: 1) sum the data points, square the total, then divide by n; 2) square each data point and then sum; 3) subtract the value of (1) from (2). \*Mathematically the same as above; needed for SPSS.
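A quick Python check that the definitional and computational SS formulas agree (illustrative data):

```python
def ss_definitional(values):
    """SS as the sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

def ss_computational(values):
    """SS = sum(x^2) - (sum(x))^2 / n, algebraically identical."""
    return sum(x * x for x in values) - sum(values) ** 2 / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
print(ss_definitional(data), ss_computational(data))  # 32.0 32.0
```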
63
Probability
Proportion of times an event is expected to occur. Between 0 (never) and 1 (always). Founded on relative frequencies.
64
Probability: random variable
Numerical quantity that takes on different values depending on chance
65
Probability: population
set of all possible outcomes for a random variable
66
Probability: Event
An outcome or set of outcomes for a random variable
67
Probability: Discrete random variables
Countable set of possible outcomes; fractional units not possible. Ex. the # of leukemia cases in the US in 1995; the # of successes in n independent treatments.
68
Probability: Continuous Random variable
Outcome quantities with an unbroken continuum of possible values. Ex. the amount of time it takes to complete a task; the weight or height of a newborn.
69
4 Properties of probability functions
1) Range of probabilities: individ. probs are never less than 0 and never more than 1. 0 ≤ Pr(A) ≤ 1
2) Total probability: probs in the sample space must sum to 1. Pr(S) = 1
3) Complements: the prob of a complement equals 1 minus the prob of the event. Pr(Ā) = 1 − Pr(A)
4) Disjoint events: events are disjoint if they cannot occur concurrently. Pr(A or B) = Pr(A) + Pr(B)
70
Z score
States the number of std. devs. by which the original score lies above or below the mean of a normal curve. Formula: z = (x − x̅)/s. - The z distribution is aka. the standard Normal curve: mean = 0, s = 1. - Method to interpret a raw score; takes into account the mean value and variability of the set of raw scores.
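The z-score formula as a small Python sketch (stdlib only; scores are hypothetical):

```python
import statistics

def z_score(x, values):
    """z = (x - mean) / s: std. devs. above (+) or below (-) the mean."""
    return (x - statistics.mean(values)) / statistics.stdev(values)

scores = [70, 75, 80, 85, 90]  # hypothetical raw scores
print(round(z_score(90, scores), 2))  # top score sits above the mean
print(round(z_score(80, scores), 2))  # the mean itself has z = 0
```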
71
Types of scores
- Raw score (x): individual observed scores on measured variables. - Deviation score (x − x̅) - Standard score (z)
72
Normal Curve
- Bell shape, symmetrical, unimodal. - Same Mean, Median, and Mode - precise relationship b/w area under curve and Std. Dev.
73
Law of Probability
Statistical framework that allows researchers to determine how likely it is that research findings based on sample data are VALID. Probability: the proportion of times an event is expected to occur in the population; ranges from 0 to 1.
74
Inference
Act of using data in a sample to make generalizations about its population. Goals: * hypothesis testing * estimate value of population parameters
75
Statistical Population
Entire collection of values that conclusions are drawn from.
76
Hypothetical Population
Infinitely large population of potential values that could ensue from a study.
77
Parameters vs. statistics
Parameter: numerical characteristic of a statistical population (population level). Statistic: value calculated in a sample (sample level). - Use different symbols (e.g. µ, σ vs. x̅, s for the mean and std. dev.) Cycle: Parameter --\> random selection --\> Statistic --\> statistical inference --\> Parameter
78
Sampling distribution of a mean
The hypothetical distribution of means from all possible samples of size n taken from the same population. Characteristics: * follows the Central Limit Theorem * unbiased estimator of the population mean * sample means are less variable than individ. observations (square root law)
79
Central Limit Theorem
Sampling distribution of x̅ tends toward Normality even when the underlying population is not Normal. The distribution also gets narrower as sample size increases.
80
Standard error of the mean (SE)
Standard deviation of x̅. Formula: SEx̅ = σ/√n. Law of large numbers: as an SRS gets larger and larger, its sample mean x̅ gets closer and closer to the true value of the pop. mean.
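The square root law in the SE formula can be seen in two lines of Python (hypothetical σ and n):

```python
import math

def standard_error(sigma, n):
    """SE of the mean = sigma / sqrt(n) (the square root law)."""
    return sigma / math.sqrt(n)

# Quadrupling the sample size halves the standard error
print(standard_error(10, 25))   # 2.0
print(standard_error(10, 100))  # 1.0
```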
81
Null hypothesis
Statement of NO difference. H0: µ = “some number”
Reject H0: if H0 is true → Type I error (α); if H0 is false → correct decision
Fail to reject H0: if H0 is true → correct decision; if H0 is false → Type II error (ß)
Alpha: * probability of a Type I error * the chance you are willing to take of mistakenly rejecting a true null hypothesis
Beta: * probability of a Type II error * the chance you are willing to take of mistakenly accepting a false null hypothesis
82
Alternative hypothesis
Statement that claims a difference from the null hypothesis. Ha: µ \< or µ \> “some number” --\> one-sided z-test. Ha: µ ≠ “some number” --\> two-sided z-test
83
Zstat
Statistical distance of the sample mean x̅ from the hypothesized value µ0; provides the weight of evidence for or against H0. zstat = (x̅ − µ0)/SEx̅
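A sketch of the zstat with a two-sided p-value, using only the stdlib (`one_sample_z` is a hypothetical helper; the IQ-style numbers are illustrative):

```python
from statistics import NormalDist

def one_sample_z(xbar, mu0, sigma, n):
    """zstat = (xbar - mu0) / SE, with SE = sigma / sqrt(n);
    two-sided p-value from the standard Normal curve."""
    se = sigma / n ** 0.5
    z = (xbar - mu0) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Is a sample mean of 103 unusual if mu0 = 100, sigma = 15, n = 100?
z, p = one_sample_z(xbar=103, mu0=100, sigma=15, n=100)
print(round(z, 2), round(p, 4))  # 2.0 0.0455
```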
84
Point Estimation
* Provides a single estimate of the parameter * No info regarding probability of accuracy; best “guesstimate”
85
Central Limit Theorem
If the population is not Normal, the distribution of sample means approaches a Normal distribution as the sample size gets larger.
86
Hypothesis Testing Steps
1. Define hypotheses: H0 and Ha. 2. Test statistic: calculate the SE and the z/t stat. 3. Determine the P-value from the z/t stat. 4. Decide at the significance level: compare the P-value to α. Statistically significant or not? 5. State the conclusion
87
Interval Estimation
Provides a range of values (CI) that seeks to capture the parameter: a confidence interval between two limit values.
88
t-Test
Testing statistical hypotheses about µ when 1) σ is unknown 2) the sample size is small (n \< 30)
89
Degrees of Freedom (df)
Value indicating the # of independent pieces of info a sample can provide for purposes of statistical inference.
90
Determining CI for µ
x̅ ± t(α/2) \* SE. The mean difference should fall between the upper and lower bounds. Ex. 90% CI --\> α = .10 --\> α/2 = .05 --\> cumulative probability (1 − .05) = .95. Look up in the t table: df and P = .95
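A minimal sketch of the t interval (stdlib only; the data are hypothetical, and the critical value 1.833 is assumed to come from a t table for df = 9 at 90% confidence):

```python
import statistics

def t_confidence_interval(values, t_crit):
    """CI for the pop. mean: xbar ± t * SE, SE = s / sqrt(n).
    t_crit must be looked up in a t table for df = n - 1
    and the chosen confidence level."""
    n = len(values)
    xbar = statistics.mean(values)
    se = statistics.stdev(values) / n ** 0.5
    return xbar - t_crit * se, xbar + t_crit * se

# Hypothetical sample of 10; 90% CI -> df = 9 -> t ≈ 1.833 (from a t table)
data = [98, 100, 102, 97, 103, 99, 101, 100, 98, 102]
low, high = t_confidence_interval(data, t_crit=1.833)
print(low, high)  # interval centered on the sample mean
```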
91
Single Sample
Reflects the experience of a single group. NO control group, but results are compared to norms or expected values.
92
Paired Sample
Uses data from two samples in which each data point in the first sample is matched to a data point in the 2nd sample. Ex. pre- and post-samples from the same subject
93
Independent Samples t-Test
Use when comparing two samples in order to draw inferences about group differences in the population. * Two levels of a nominal variable; the dependent variable approximates interval-scale characteristics. E.g. DV = # of TV hrs; IV = males vs. females * Assumption of equal variances * The std. dev. of the sampling distribution is the standard error of the difference.
94
Independent Samples
Uses two samples from separate populations. Data points are unrelated. Ex. experimental study with treatment and control groups
95
ANOVA
One-way analysis of variance * compares 3 or more groups defined by one factor * variation in the response is analyzed to understand group differences; used in place of multiple independent t-tests * H0: µ1 = µ2 = ... = µk. Ex: patients assigned to three treatment groups and measured on stress score (DV) in reaction to treatment (IV)
96
Mean Square Between (MSB) | (ANOVA)
Quantifies the variance of group means around the grand mean. MSB = SSB/dfB. SSB = Σ nᵢ(x̅ᵢ − grand x̅)² --\> (group mean − grand mean)² × group n, summed over groups. - Measures variability between the groups relative to the grand mean.
97
Mean Square Within (MSW) ANOVA
Quantifies the variability of data points in a group around that group's mean. MSW = SSW/dfW. SSW = Σ(x − group x̅)² --\> (individual point − group mean)², summed over all points in all groups. - Measures variability within each data group.
98
F-statistic (ANOVA)
* Ratio of MSB to MSW: Fstat = MSB/MSW * A large F-stat suggests the observed mean differences are NOT merely due to random noise. * When converting an F-stat to a P-value, the df are: numerator dfB / denominator dfW
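The MSB, MSW, and F cards combine into a short Python sketch (stdlib only; `one_way_anova_f` is an illustrative helper and the three groups are toy data):

```python
def one_way_anova_f(*groups):
    """F = MSB / MSW for a one-way ANOVA, per the cards above."""
    k = len(groups)
    all_points = [x for g in groups for x in g]
    n_total = len(all_points)
    grand_mean = sum(all_points) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # SSB: (group mean - grand mean)^2, weighted by group n, summed
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # SSW: (point - its group mean)^2, summed over every point
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    msb = ssb / (k - 1)        # dfB = k - 1
    msw = ssw / (n_total - k)  # dfW = N - k
    return msb / msw

f = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(round(f, 4))  # 13.0
```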
99
Levene Test
Tests whether variances can be assumed equal. Use when comparing two or more groups (samples). H0: σ1² = σ2² = σ3². Accept the null when the p-value is greater than α.
100
Correlation Coefficient (r)
Strength of a linear relationship. −1 ≤ r ≤ 1
Strength: * Close to ±1: all points fall near a line with a consistent slope * Close to 0: lack of linear correlation
Direction: * Upward slope = positive number * Downward slope = negative number
3 r's: * metric...
101
Coefficient of determination (r2)
Statistic that quantifies the proportion of variance in Y explained by X. Expressed by converting r² to a %: x% of the variance of Y is explained by X.
102
Simple Regression Line
Expresses a functional relationship b/w X and Y by fitting a line to the observed data. * Observed y = predicted y + residual * Residual = observed y − predicted y. Least-squares regression line: drawn to minimize the sum of squared residuals. **Formula: ŷ = a + bx;** ŷ = predicted y, a = intercept of the regression line at the Y axis, b = slope coefficient. b = r(sy/sx); a = ȳ − b(x̄). Notes: * Not robust * b shows the relationship b/w X and Y in the same units as measured; r is a unit-free measure of strength * X must be the IV; Y must be the DV
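The least-squares formulas b = r(sy/sx) and a = ȳ − b(x̄) can be sketched directly (stdlib only; `least_squares` is an illustrative helper fit to perfectly linear toy data):

```python
import statistics

def least_squares(xs, ys):
    """Least-squares line ŷ = a + bx via b = r(sy/sx), a = ybar - b*xbar."""
    n = len(xs)
    xbar, ybar = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    # Pearson r from the sum of cross-products of deviations
    r = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)
    b = r * sy / sx       # slope
    a = ybar - b * xbar   # intercept
    return a, b, r

# Toy data lying exactly on y = 2x: slope 2, intercept 0, r = 1
a, b, r = least_squares([1, 2, 3, 4], [2, 4, 6, 8])
```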
103
Confidence Interval for Population Slope
Hypotheses: * H0: B = 0 * Ha: B ≠ 0. t-stat = b/SEb. CI formula: b ± t(n−2, 1−α/2) \* SEb * If 0 is captured in the CI for the population slope, the result is NOT sig.
104
Multiple Regression
Addresses multiple explanatory variables (IVs) in relation to a response variable (DV). IMPROVES prediction by using two or more variables to predict a dependent variable. Formula: Y′ = a + b1X1 + b2X2 ...
105
Kurtosis
Refers to the “peakedness” of a distribution. * Leptokurtic: narrow peak * Platykurtic: flat peak (plateau)
106
Chi-Squared Test
* Measure of association b/w 2 nominal variables * The magnitude of the Pearson chi-square reflects the amount of discrepancy between observed and expected frequencies. * Makes no assumptions about the shape of the distribution or the homogeneity of variances. Formula: χ² = Σ (Observed − Expected)²/Expected
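The chi-square formula as a one-line Python sketch (stdlib only; the observed/expected counts are a hypothetical 1x4 table):

```python
def chi_square_stat(observed, expected):
    """Pearson chi-square: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical 1x4 table with equal expected counts
stat = chi_square_stat(observed=[10, 20, 30, 40], expected=[25, 25, 25, 25])
print(stat)  # 20.0
```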
107
PARAMETRIC VERSUS NONPARAMETRIC STATISTICS
* Use nonparametric stats when: * the parametric assumptions cannot be justified: normal distribution, equal variances, etc. * the data as gathered are measured on nominal or ordinal scales
108
Properties of Sampling distribution
* The mean of a sampling distribution of means will be the same as the mean of the scores in the population (µ). * Central Limit Theorem * Allows us to determine the probability that a particular sample obtained will be unrepresentative.
109
One-Sample Z Test
* Used to compare a sample mean to a (hypothesized) population mean and determine how likely (chance) it is that the sample came from that population. * Compare the probability associated with statistical results (i.e. probability of chance) with a predetermined alpha level.