Definitions Flashcards

(109 cards)

1
Q

measurement

A
  • Assigning numbers or codes to aspects of objects or events according to rules
  • positioning observations along a numerical continuum
  • classifying observations into categories
2
Q

Observation

A

Unit upon which measurement is made

3
Q

Variable

A

Measurable characteristic that varies among persons, places, or objects

4
Q

Nominal measurements

A

Observation variable that has two or more categories with no intrinsic ordering to the categories. Nonparametric.

Examples: sex, blood type

aka. Categorical variable, attribute variable, qualitative variables

5
Q

Ordinal measurements

A

Observation variable that has categories that can be put into rank order. Differs from interval measurements b/c the space b/w values is not equal. Nonparametric.

Examples: stage of cancer (on a point scale); economic status (low, med, high)

6
Q

Quantitative measurements

A

Observation variables lie along a meaningful numeric scale.

  • Interval = equal spacing b/w values, but no absolute zero (e.g. Fahrenheit, Celsius)
  • Ratio = values have an absolute zero and can be added (e.g. age, body weight, Kelvin)

aka. ratio/interval measurement, numeric variable, scale variable, continuous variable.

7
Q

Surveys

A

Type of study used to quantify population characteristics. Relies on the “sampling” rule of statistics b/c data for the entire population is rarely available.

8
Q

Simple Random Sample (SRS)

A

Randomly sample the population to collect data so that:

1) each population member has the same probability of being selected into the sample
2) selection of any individ. into the sample does not bias the selection of another individ.
aka. sampling independence

9
Q

Cautions

A

Samples that tend to over- or under-represent certain segments of the pop can bias survey results.

10
Q

Undercoverage

A

Type of sample caution. Occurs when some groups in the source pop are left out or underrepresented. Will undermine achieving equal selection probabilities.

11
Q

Volunteer Bias

A

Type of sample caution. Occurs b/c self-selected participants of a survey are atypical of the pop. Ex. web survey volunteers may hold a particular viewpoint causing them to participate.

12
Q

Nonresponse Bias

A

Type of sample caution. Occurs when a large % of individs refuse to participate in a survey; nonresponders differ from responders, which skews survey results.

13
Q

Probability Sample

A

Each member of the pop has a known probability of being selected. Includes SRS, stratified random samples, cluster samples, and multistage sampling.

14
Q

Stratified random sample

A

Draws an independent SRS from each homogeneous “group” or “stratum.” Ex. divide the pop into age groups.

15
Q

Cluster samples

A

Randomly selects large units (clusters) consisting of smaller subunits. Ex. randomly select household addresses, then study all individs in each cluster.

16
Q

Comparative study

A

Learn the relationship b/w an explanatory variable and a response variable. Compares a group exposed vs. not exposed to the explanatory factor.

  • two types: Experimental and Non-Experimental (observational)
17
Q

Experimental studies

A

Investigator assigns exposure to one group and not the other

18
Q

Nonexperimental Studies

A

Investigator classifies groups as exposed or nonexposed w/o intervention. aka. observational studies

19
Q

Explanatory Variable (IV)

A

Treatment or exposure that explains or predicts change in the response variable.

aka. (IV) Independent variable

20
Q

Response Variable (DV)

A

Outcome or response being investigated.

aka. (DV) Dependent variable.

21
Q

Lurking variables

A

Extraneous factors

22
Q

Confounding Variables

A

Distortion in an association b/w the explanatory variable and response variable caused by the influence of extraneous factors.

23
Q

Factors

A

Explanatory variables in experiments

24
Q

Treatment

A

Specific set of factors applied to subject

25
Interaction
Factors in combination produce effects that could not be predicted by looking at the effect of the factors separately.
26
Trials
Experiments involving human subjects. Two types: Controlled and Randomized Controlled
27
Randomized controlled trial
Treatment assignment is based on chance. Helps sort out effects of the treatment from those of lurking variables.
28
Equipoise
Balanced doubt about benefits and risks
29
Discrete variable
Finite number of values b/w any 2 points
30
Continuous variable
infinite number of values b/w 2 points
31
Shape (graph)
Configuration of data points as they appear on a graph. Described in terms of:
* Skewness: whether the shape mirrors itself (symmetry)
* Modality: number of peaks
* Kurtosis: “peakedness” of the distribution
32
Location (graph)
Distribution summarized by its center (central tendency):
* Mean: center of the distribution; the “arithmetic avg.” is the distribution's balancing point
* Median
* Mode
33
Depth of data Point
Corresponds to its rank from either the top or bottom of an ordered list of values.
34
Spread (graph)
Refers to the distribution/variability of data points. Measures of spread:
* Range
* Quartiles
* Std. Dev.
* Variance
35
Class intervals
Group data into intervals with equal or unequal spacing before tallying frequencies. Endpoint convention ensures each observation falls within exactly one interval:
* include the left boundary and exclude the right, or
* include the right boundary and exclude the left
36
Relative Frequency
Proportion: frequency count divided by the total. Expressed as a %.
37
Cumulative Frequency
Proportion that falls in or below a certain level. Equation: add consecutive relative frequencies. Expressed as a %.
38
Bar Chart
Displays frequencies with bars whose heights correspond to the frequencies. Best for categorical variables.
39
Histogram
Bar chart with adjoining bars over class intervals. Best for quantitative variables.
40
Descriptive Statistics
Set of observations that describe the characteristics of a sample. Ex: central tendency (mean, median, mode), variability (Std. Dev., variance, range, quartiles)
41
Inferential Statistics
Set of statistical techniques that provide predictions about the population based on info from a sample of the pop.
42
Univariate Statistics
Involve one variable at a time (i.e. age, height, weight)
43
Bivariate statistics
Involve two variables of the sample examined simultaneously (pre/post test)
44
Multivariate Statistics
Involve 2 or more variables in the same analysis
45
Stemplot
graphical technique that organizes data in a histogram-like display
46
mean
Arithmetic average of data VALUES. Balancing point of a set. Highly susceptible to outliers and skew. Formulas: sample: x̅ = (ΣX)/n; population: µ = (ΣX)/N. Functions: 1) predict an individ. value drawn at random from the sample, 2) predict a value drawn at random from the pop. \* Best paired with Std. Dev. for symmetrical distributions
47
Median
Midpoint of a distribution in CASES. More ROBUST (resilient to outliers and skew). Formula: put the values in order, calculate the depth (n+1)/2, count in that many places to the midpoint. \* Best paired with IQR for asymmetrical distributions. Always Q2, the 50th percentile
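A quick stdlib Python sketch of the mean/median contrast in the two cards above (hypothetical data; the single extreme value drags the mean but barely moves the median):

```python
import statistics

# Hypothetical data with one extreme outlier
data = [30, 32, 35, 36, 38, 40, 300]

mean = statistics.mean(data)      # balancing point; dragged up by 300
median = statistics.median(data)  # midpoint of the ordered cases; robust

print(mean, median)  # the outlier pulls the mean far above the median
```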
48
Mode
Most frequently occurring value in a data set. Useful only in large sets with repeating values.
49
Variability
Measure of spread. Fundamental interest of behavioral scientists.
50
Range
Measures the spread of a distribution; the simplest measure of variability. Range = maximum − minimum. Limitations: known to be biased and highly unstable; increases w/ sample size. \*Should always be supplemented with another measure of spread.
51
Quartile
Intuitive way to describe variability by dividing the data set into 4 segments:
- Q0 (min) = 0%
- Q1 (lower hinge) = 25%
- Q2 (median) = 50%
- Q3 (upper hinge) = 75%
- Q4 (max) = 100%
Find the MEDIAN to identify quartiles.
52
Hinges
Points where the ordered array “folds” upon itself (Q1 and Q3).
53
Interquartile Range
Summary measure of spread that captures the middle 50% of data points in a set.
* 5-point summary (Q0-Q4)
* IQR = Q3 − Q1, where Q1 is the median b/w Q0 and Q2, Q3 is the median b/w Q2 and Q4, and Q2 is the overall median
* Not sensitive to extreme values
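The quartile/IQR cards can be sketched in Python (stdlib only; `five_point_summary` is an illustrative helper name, and the hinge convention here includes the median in both halves when n is odd, per Tukey's hinges):

```python
import statistics

def five_point_summary(values):
    """Q0-Q4 by the hinge method: Q2 is the overall median; Q1 and Q3
    are the medians of the lower and upper halves."""
    v = sorted(values)
    n = len(v)
    q2 = statistics.median(v)
    half = (n + 1) // 2  # halves include the median when n is odd
    q1 = statistics.median(v[:half])
    q3 = statistics.median(v[n - half:])
    return v[0], q1, q2, q3, v[-1]

data = [1, 3, 4, 7, 8, 9, 12]  # hypothetical values
q0, q1, q2, q3, q4 = five_point_summary(data)
print(q3 - q1)  # IQR captures the middle 50% of the data
```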
54
Box-and-Whiskers plot
Displays the five-point summary and “potential outliers” in graphical form. aka. box plot. The box spans the IQR.
55
Fences
Lower = Q1 − (1.5)IQR; Upper = Q3 + (1.5)IQR. Values below the lower fence are “lower outside values”; values above the upper fence are “upper outside values.” The smallest value inside the lower fence is the “lower inside value”; the largest value inside the upper fence is the “upper inside value.”
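A minimal sketch of the fence rule (Python stdlib; `fences` is an illustrative helper and the data are hypothetical):

```python
def fences(q1, q3):
    """Lower = Q1 - 1.5*IQR; Upper = Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical quartiles; values outside the fences are "outside values"
lower, upper = fences(q1=3.5, q3=8.5)
data = [1, 3, 4, 7, 8, 9, 12, 30]
outside = [x for x in data if x < lower or x > upper]
print(lower, upper, outside)  # -4.0 16.0 [30]
```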
56
Variance
Common measure of spread. Population: σ² = SS/N; Sample: S² = SS/(n−1)\*. SS = Sum of Squared deviations. \*Subtract 1 from n to force a larger variance and SD (makes it an unbiased estimate)
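The two divisors (N vs. n−1) can be sketched in Python (stdlib only; helper names are illustrative):

```python
def sum_of_squares(values):
    """SS: sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

def population_variance(values):  # sigma^2 = SS / N
    return sum_of_squares(values) / len(values)

def sample_variance(values):      # S^2 = SS / (n - 1), the unbiased estimate
    return sum_of_squares(values) / (len(values) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
print(population_variance(data))  # 4.0
print(sample_variance(data))      # slightly larger, since we divide by n - 1
```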
57
Variability
* Always present an average with its variability so as not to misrepresent the data. * Two data sets can have the same average but different variability.
58
Standard Deviation
Common measure of spread. Unbiased estimate for samples (good scientists are CONSERVATIVE!) Formula: square root of the variance. * Sensitive to outliers and skew * Useful for making comparisons * The smaller the SD, the more HOMOGENEOUS the set
59
Chebyshev's Rule
For data sets: at least 3/4 of the data points lie within two std. devs. of the mean.
60
Normal Rule
For data sets: applies only to distributions with a particular NORMAL shape.
* 68.3% of data points lie within mean ± 1 std. dev.
* 95.4% of data points lie within mean ± 2 std. devs.
* 99.7% of data points lie within mean ± 3 std. devs.
aka. the 68-95-99.7 rule
Properties of the Normal curve: * Symmetrical * Unimodal * Bell-shaped * Mean, median, and mode are equal
61
Symmetrical vs. Asymmetrical Distribution
Symmetrical: Mean = Median. Asymmetrical: Mean ≠ Median
* Positive skew: Mean \> Median
* Negative skew: Mean \< Median
62
Sum of Squares
Each data point's deviation from the data set mean, squared, then all summed. aka. SS = Σ(X − X̄)². Computational formula: SS = ΣX² − (ΣX)²/n: 1) sum the data points, square the total, then divide by n; 2) square each data point and then sum; 3) subtract the value of (1) from (2). \*Mathematically the same as above; needed for SPSS.
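A quick Python check that the definitional and computational SS formulas agree (illustrative data):

```python
def ss_definitional(values):
    """SS as the sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

def ss_computational(values):
    """SS = sum(x^2) - (sum(x))^2 / n, algebraically identical."""
    return sum(x * x for x in values) - sum(values) ** 2 / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
print(ss_definitional(data), ss_computational(data))  # 32.0 32.0
```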
63
Probability
Proportion of times an event is expected to occur. Between 0 (never) and 1 (always). Founded on relative frequencies.
64
Probability: random variable
Numerical quantity that takes on different values depending on chance
65
Probability: population
set of all possible outcomes for a random variable
66
Probability: Event
An outcome or set of outcomes for a random variable
67
Probability: Discrete random variables
Countable set of possible outcomes; fractional units not possible. Ex. the # of leukemia cases in the US in 1995; the # of successes in n independent treatments.
68
Probability: Continuous Random variable
Outcome quantities with an unbroken continuum of possible values. Ex. the amount of time it takes to complete a task; the weight or height of a newborn.
69
4 Properties of probability functions
1) Range of probabilities: individ. probs are never less than 0 and never more than 1. 0 ≤ Pr(A) ≤ 1
2) Total probability: probs in the sample space must sum to 1. Pr(S) = 1
3) Complements: the prob of a complement equals 1 minus the prob of the event. Pr(Ā) = 1 − Pr(A)
4) Disjoint events: events are disjoint if they cannot occur concurrently. Pr(A or B) = Pr(A) + Pr(B)
70
Z score
States the number of std. devs. by which the original score lies above or below the mean of a normal curve. Formula: z = (x − x̅)/s. - The z distribution is aka. the standard Normal curve: mean = 0, s = 1. - Method to interpret a raw score; takes into account the mean value and variability of the set of raw scores.
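The z-score formula as a small Python sketch (stdlib only; scores are hypothetical):

```python
import statistics

def z_score(x, values):
    """z = (x - mean) / s: std. devs. above (+) or below (-) the mean."""
    return (x - statistics.mean(values)) / statistics.stdev(values)

scores = [70, 75, 80, 85, 90]  # hypothetical raw scores
print(round(z_score(90, scores), 2))  # top score sits above the mean
print(round(z_score(80, scores), 2))  # the mean itself has z = 0
```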
71
Types of scores
- Raw score (x): individual observed scores on measured variables. - Deviation score (x − x̅) - Standard score (z)
72
Normal Curve
- Bell shape, symmetrical, unimodal. - Same Mean, Median, and Mode - precise relationship b/w area under curve and Std. Dev.
73
Law of Probability
Statistical framework that allows researchers to determine how likely it is that research findings based on sample data are VALID. Probability: the proportion of times an event is expected to occur in the population; ranges from 0 to 1.
74
Inference
Act of using data in a sample to make generalizations about its population. Goals: * hypothesis testing * estimate value of population parameters
75
Statistical Population
Entire collection of values that conclusions are drawn from.
76
Hypothetical Population
Infinitely large population of potential values that could ensue from a study.
77
Parameters vs. statistics
Parameter: numerical characteristic of a statistical population (population level). Statistic: value calculated in a sample (sample level). - Use different symbols (e.g. µ, σ vs. x̅, s for the mean and std. dev.) Cycle: Parameter --\> random selection --\> Statistic --\> statistical inference --\> Parameter
78
Sampling distribution of a mean
The hypothetical distribution of means from all possible samples of size n taken from the same population. Characteristics: * follows the Central Limit Theorem * unbiased estimator of the population mean * sample means are less variable than individ. observations (square root law)
79
Central Limit Theorem
Sampling distribution of x̅ tends toward Normality even when the underlying population is not Normal. The distribution also gets narrower as sample size increases.
80
Standard error of the mean (SE)
Standard deviation of x̅. Formula: SEx̅ = σ/√n. Law of large numbers: as an SRS gets larger and larger, its sample mean x̅ gets closer and closer to the true value of the pop. mean.
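The square root law in the SE formula can be seen in two lines of Python (hypothetical σ and n):

```python
import math

def standard_error(sigma, n):
    """SE of the mean = sigma / sqrt(n) (the square root law)."""
    return sigma / math.sqrt(n)

# Quadrupling the sample size halves the standard error
print(standard_error(10, 25))   # 2.0
print(standard_error(10, 100))  # 1.0
```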
81
Null hypothesis
Statement of NO difference. H0: µ = “some number”
Reject H0: if H0 is true → Type I error (α); if H0 is false → correct decision
Fail to reject H0: if H0 is true → correct decision; if H0 is false → Type II error (ß)
Alpha: * probability of a Type I error * the chance you are willing to take of mistakenly rejecting a true null hypothesis
Beta: * probability of a Type II error * the chance you are willing to take of mistakenly accepting a false null hypothesis
82
Alternative hypothesis
Statement that claims a difference from the null hypothesis. Ha: µ \< or µ \> “some number” --\> one-sided z-test. Ha: µ ≠ “some number” --\> two-sided z-test
83
Zstat
Statistical distance of the sample mean x̅ from the hypothesized value µ0; provides the weight of evidence for or against H0. zstat = (x̅ − µ0)/SEx̅
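A sketch of the zstat with a two-sided p-value, using only the stdlib (`one_sample_z` is a hypothetical helper; the IQ-style numbers are illustrative):

```python
from statistics import NormalDist

def one_sample_z(xbar, mu0, sigma, n):
    """zstat = (xbar - mu0) / SE, with SE = sigma / sqrt(n);
    two-sided p-value from the standard Normal curve."""
    se = sigma / n ** 0.5
    z = (xbar - mu0) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Is a sample mean of 103 unusual if mu0 = 100, sigma = 15, n = 100?
z, p = one_sample_z(xbar=103, mu0=100, sigma=15, n=100)
print(round(z, 2), round(p, 4))  # 2.0 0.0455
```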
84
Point Estimation
* Provides a single estimate of the parameter * No info regarding probability of accuracy; best “guesstimate”
85
Central Limit Theorem
If the population is not Normal, the distribution of sample means approaches a Normal distribution as the sample size gets larger.
86
Hypothesis Testing Steps
1. Define hypotheses: H0 and Ha. 2. Test statistic: calculate the SE and the z/t stat. 3. Determine the P-value from the z/t stat. 4. Decide at the significance level: compare the P-value to α. Statistically significant or not? 5. State the conclusion
87
Interval Estimation
Provides a range of values (CI) that seeks to capture the parameter: a confidence interval between two limit values.
88
t-Test
Testing statistical hypotheses about µ when 1) σ is unknown 2) the sample size is small (n \< 30)
89
Degrees of Freedom (df)
Value indicating the # of independent pieces of info a sample can provide for purposes of statistical inference.
90
Determining CI for µ
x̅ ± t(α/2) \* SE. The mean difference should fall between the upper and lower bounds. Ex. 90% CI --\> α = .10 --\> α/2 = .05 --\> cumulative probability (1 − .05) = .95. Look up in the t table: df and P = .95
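A minimal sketch of the t interval (stdlib only; the data are hypothetical, and the critical value 1.833 is assumed to come from a t table for df = 9 at 90% confidence):

```python
import statistics

def t_confidence_interval(values, t_crit):
    """CI for the pop. mean: xbar ± t * SE, SE = s / sqrt(n).
    t_crit must be looked up in a t table for df = n - 1
    and the chosen confidence level."""
    n = len(values)
    xbar = statistics.mean(values)
    se = statistics.stdev(values) / n ** 0.5
    return xbar - t_crit * se, xbar + t_crit * se

# Hypothetical sample of 10; 90% CI -> df = 9 -> t ≈ 1.833 (from a t table)
data = [98, 100, 102, 97, 103, 99, 101, 100, 98, 102]
low, high = t_confidence_interval(data, t_crit=1.833)
print(low, high)  # interval centered on the sample mean
```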
91
Single Sample
Reflects the experience of a single group. NO control group, but results are compared to norms or expected values.
92
Paired Sample
Uses data from two samples in which each data point in the first sample is matched to a data point in the 2nd sample. Ex. pre- and post-samples from the same subject
93
Independent Samples t-Test
Use when comparing two samples in order to draw inferences about group differences in the population. * Two levels of a nominal variable; the dependent variable approximates interval-scale characteristics. E.g. DV = # of TV hrs; IV = males vs. females * Assumption of equal variances * The std. dev. of the sampling distribution is the standard error of the difference.
94
Independent Samples
Uses two samples from separate populations. Data points are unrelated. Ex. experimental study with treatment and control groups
95
ANOVA
One-way analysis of variance * compares 3 or more groups defined by one factor * variation in the response is analyzed to understand group differences; used in place of multiple independent t-tests * H0: µ1 = µ2 = ... = µk. Ex: patients assigned to three treatment groups and measured on stress score (DV) in reaction to treatment (IV)
96
Mean Square Between (MSB) | (ANOVA)
Quantifies the variance of group means around the grand mean. MSB = SSB/dfB. SSB = Σ nᵢ(x̅ᵢ − grand x̅)² --\> (group mean − grand mean)² × group n, summed over groups. - Measures variability between the groups relative to the grand mean.
97
Mean Square Within (MSW) ANOVA
Quantifies the variability of data points in a group around that group's mean. MSW = SSW/dfW. SSW = Σ(x − group x̅)² --\> (individual point − group mean)², summed over all points in all groups. - Measures variability within each data group.
98
F-statistic (ANOVA)
* Ratio of MSB to MSW: Fstat = MSB/MSW * A large F-stat suggests the observed mean differences are NOT merely due to random noise. * When converting an F-stat to a P-value, the df are: numerator dfB / denominator dfW
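The MSB, MSW, and F cards combine into a short Python sketch (stdlib only; `one_way_anova_f` is an illustrative helper and the three groups are toy data):

```python
def one_way_anova_f(*groups):
    """F = MSB / MSW for a one-way ANOVA, per the cards above."""
    k = len(groups)
    all_points = [x for g in groups for x in g]
    n_total = len(all_points)
    grand_mean = sum(all_points) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # SSB: (group mean - grand mean)^2, weighted by group n, summed
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # SSW: (point - its group mean)^2, summed over every point
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    msb = ssb / (k - 1)        # dfB = k - 1
    msw = ssw / (n_total - k)  # dfW = N - k
    return msb / msw

f = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(round(f, 4))  # 13.0
```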
99
Levene Test
Tests whether variances can be assumed equal. Use when comparing two or more groups (samples). H0: σ1² = σ2² = σ3². Accept the null when the p-value is greater than α.
100
Correlation Coefficient (r)
Strength of a linear relationship. −1 ≤ r ≤ 1
Strength: * Close to ±1: all points fall near a line with a consistent slope * Close to 0: lack of linear correlation
Direction: * Upward slope = positive number * Downward slope = negative number
3 r's: * metric...
101
Coefficient of determination (r2)
Statistic that quantifies the proportion of variance in Y explained by X. Expressed by converting r² to a %: x% of the variance of Y is explained by X.
102
Simple Regression Line
Expresses a functional relationship b/w X and Y by fitting a line to the observed data. * Observed y = predicted y + residual * Residual = observed y − predicted y. Least-squares regression line: drawn to minimize the sum of squared residuals. **Formula: ŷ = a + bx;** ŷ = predicted y, a = intercept of the regression line at the Y axis, b = slope coefficient. b = r(sy/sx); a = ȳ − b(x̄). Notes: * Not robust * b shows the relationship b/w X and Y in the same units as measured; r is a unit-free measure of strength * X must be the IV; Y must be the DV
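The least-squares formulas b = r(sy/sx) and a = ȳ − b(x̄) can be sketched directly (stdlib only; `least_squares` is an illustrative helper fit to perfectly linear toy data):

```python
import statistics

def least_squares(xs, ys):
    """Least-squares line ŷ = a + bx via b = r(sy/sx), a = ybar - b*xbar."""
    n = len(xs)
    xbar, ybar = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    # Pearson r from the sum of cross-products of deviations
    r = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)
    b = r * sy / sx       # slope
    a = ybar - b * xbar   # intercept
    return a, b, r

# Toy data lying exactly on y = 2x: slope 2, intercept 0, r = 1
a, b, r = least_squares([1, 2, 3, 4], [2, 4, 6, 8])
```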
103
Confidence Interval for Population Slope
Hypotheses: * H0: B = 0 * Ha: B ≠ 0. t-stat = b/SEb. CI formula: b ± t(n−2, 1−α/2) \* SEb * If 0 is captured in the CI for the population slope, the result is NOT sig.
104
Multiple Regression
Addresses multiple explanatory variables (IVs) in relation to a response variable (DV). IMPROVES prediction by using two or more variables to predict a dependent variable. Formula: Y′ = a + b1X1 + b2X2 ...
105
Kurtosis
Refers to the “peakedness” of a distribution. * Leptokurtic: narrow peak * Platykurtic: flat peak (plateau)
106
Chi-Squared Test
* Measure of association b/w 2 nominal variables * The magnitude of the Pearson chi-square reflects the amount of discrepancy between observed and expected frequencies. * Makes no assumptions about the shape of the distribution or the homogeneity of variances. Formula: χ² = Σ (Observed − Expected)²/Expected
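The chi-square formula as a one-line Python sketch (stdlib only; the observed/expected counts are a hypothetical 1x4 table):

```python
def chi_square_stat(observed, expected):
    """Pearson chi-square: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical 1x4 table with equal expected counts
stat = chi_square_stat(observed=[10, 20, 30, 40], expected=[25, 25, 25, 25])
print(stat)  # 20.0
```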
107
PARAMETRIC VERSUS NONPARAMETRIC STATISTICS
* Use nonparametric stats when: * the parametric assumptions cannot be justified: normal distribution, equal variances, etc. * the data as gathered are measured on nominal or ordinal scales
108
Properties of Sampling distribution
* The mean of a sampling distribution of means will be the same as the mean of the scores in the population (µ). * Central Limit Theorem * Allows us to determine the probability that a particular sample obtained will be unrepresentative.
109
One-Sample Z Test
* Used to compare a sample mean to a (hypothesized) population mean and determine how likely (chance) it is that the sample came from that population. * Compare the probability associated with statistical results (i.e. probability of chance) with a predetermined alpha level.