Statistics Flashcards

1
Q

Also called a categorical variable. Involves simple classification: we do not need to count or measure to distinguish one item from another, and the categories are mutually exclusive.

A

Nominal

2
Q

The only scale of measurement that is discrete only.

A

Nominal

3
Q

The only scale of measurement that is continuous alone, or that can have 0.5 as its smallest unit.

A

Ordinal

4
Q

Cases are ranked or ordered. Values represent position in a group, where the order matters but not the difference between values.

A

Ordinal

5
Q

It uses intervals equal in amount of measurement, where the difference between two values is meaningful.

A

Interval

6
Q

Similar to interval but includes a true zero point and relative proportions on the scale make sense.

A

Ratio

7
Q

Which among the scales of measurement are parametric and which are non-parametric?

A

Parametric: Interval & Ratio
Non-parametric: Nominal & Ordinal

8
Q

What are the 4 scales of measurement?

A

Nominal
Ordinal
Interval
Ratio

9
Q

Refers to the analysis of data from an entire population, merely using numbers to describe a known data set.

A

Descriptive Statistics

10
Q

Value in a group of values which is the most typical for the group, or the score around which all the scores are evenly clustered. The average or midmost score.

A

Measures of Central Tendency

11
Q

What are the measures of central tendency?

A

Mean
Median
Mode

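As a quick sketch, the three measures above can be computed with Python's standard library; the scores below are hypothetical:

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # hypothetical set of test scores

mean = statistics.mean(scores)      # arithmetic average: 30 / 6 = 5.0
median = statistics.median(scores)  # middle score: (3 + 5) / 2 = 4.0
mode = statistics.mode(scores)      # most frequent score: 3
```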
12
Q

The average/arithmetic mean. Sum of a set of measurements divided by the number of measurements in the set. Data is interval only.

A

Mean

13
Q

Central value of a set of values such that half the observations fall above it and half below it. The middle score in the distribution. Uses ordinal and interval data.

A

Median

14
Q

Modal value of a set. Most frequently occurring value. For grouped data, it is the midpoint of the class interval with the largest frequency. Uses nominal, ordinal, and interval data.

A

Mode

15
Q

Measures of how much or how little the rest of the values tend to vary around the central or typical value. Variation or error.

A

Measures of variability/Dispersion

16
Q

What are the measures of variability/dispersion?

A

Standard deviation
Variance
Range

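A minimal Python sketch of the three measures, on hypothetical scores; note that the variance is the square of the standard deviation:

```python
import statistics

scores = [4, 8, 6, 5, 3, 10]  # hypothetical interval-level scores

sd = statistics.pstdev(scores)           # population standard deviation
variance = statistics.pvariance(scores)  # equals (SD)^2
data_range = max(scores) - min(scores)   # largest minus smallest = 7
```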
17
Q

What level of data do all measures of variability/dispersion use?

A

Interval (some books include ratio)

18
Q

Square root of the variance. Shows the spread of the distribution of measurements.

A

Standard deviation

19
Q

(SD)²

A

Variance

20
Q

Simplest measure of variation. Difference between the largest and smallest measurement.

A

Range

21
Q

Used to describe the position of a particular observation in relation to the rest of the data set.

A

Measures of Location

22
Q

In measures of location, the pth percentile of a data set is a value such that at least p percent of the observations take on this value or less and at least _ percent of the observations take on this value or more.

A

100-p

23
Q

What are the measures of location?

A

Percentiles
Quartiles
Deciles
Frequency Distribution

24
Q

Percentage of the total number of observations that are less than the given value. Identifies the point below which a specific percentage of the cases fall.

A

Percentiles

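Percentiles (and the quartiles and deciles built from them) can be sketched with NumPy; the scores are hypothetical and NumPy's default linear interpolation is assumed:

```python
import numpy as np

scores = np.array([15, 20, 35, 40, 50])  # hypothetical scores

q1, q2, q3 = np.percentile(scores, [25, 50, 75])  # quartiles: 4-part cut points
d4 = np.percentile(scores, 40)                    # 4th decile: 10-part cut point
```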
25
The data can be divided into 4 parts instead of two. This is what the cut points are called.
Quartiles
26
The data can be divided into 10 parts instead of two or four. This is what the cut points are called.
Deciles
27
A classification of data that may help in understanding important features of the data; it may be graphically presented in the form of a histogram, polygon, etc.
Frequency Distribution
28
This measure of location presents 2 elements: a set of categories that make up the original measurement scale, and a record of the frequency, or number of individuals, in each category.
Frequency Distribution
29
All measures of location use ordinal, interval, and ratio levels of data except _, which uses all levels of data.
Frequency Distribution
30
Measurement of the extent to which pairs of related values on 2 variables tend to change together; gives measure of the extent to which one variable can be predicted from values on the other variable.
Measures of correlation.
31
If one variable increases with the other, the correlation is positive (near _). If the relationship is inverse, it is a negative correlation (near _). A lack of correlation is signified by a value close to _.
+1 -1 0
32
What are the measures of correlation?
Pearson's Product-Moment Correlation (r)
Spearman's Rho Rank-Order
Kendall's Coefficient of Concordance (W)
Point-Biserial Coefficient (rpb)
Phi or Fourfold Coefficient
Lambda
33
A measure of correlation for 2 groups, using interval level of data. Data must be in the form of related pairs of scores. The higher the r , the higher the correlation.
Pearson's Product Moment Correlation (r)
34
A measure of correlation for 2 groups, using the ordinal level of data. Data must be in the form of related pairs of scores and is used for n ≤ 30. Easy to calculate but non-parametric.
Spearman's Rho Rank-order
35
A measure of correlation for ≥ 3 groups, using the ordinal level of data. Data must be ≥ 3 sets of ranks. Easy to calculate but non-parametric.
Kendall's Coefficient of Concordance (W)
36
A measure of correlation for 2 groups, using one continuous variable and one dichotomous nominal variable.
Point-Biserial Coefficient rpb
37
A measure of correlation for 2 groups, using 2 dichotomous nominal variables.
Phi or Fourfold Coefficient
38
A measure of correlation for ≥ 2 groups, using nominal (dependent/independent) levels of data. It is also known as Guttman's Coefficient of Predictability. Gives an indication of the reduction of errors made in a prediction scheme.
Lambda
39
A non-parametric measure of the agreement between two rankings.
Tau Coefficient
40
Tests for statistical dependence.
Kendall's Tau Coefficient
41
An index of interrater reliability of ordinal data.
Coefficient of Concordance (W)
42
Refers to methods that use sample data to make inferences or draw conclusions about a population.
Inferential statistics
43
What are the inferential statistics tests?
Z-test of one sample mean
T-test
44
Variations of the t-test
Independent samples
Dependent samples
Proportions/Percentages
Variances
2 correlation coefficients
45
What level of data do all tests for inferential statistics use?
Interval
46
A measure for inferential statistics for 1 group, with N ≥ 30. Used to test whether a population parameter is significantly different from some hypothesized value.
Z-test of one sample mean
47
A measure for inferential statistics when n< 30
T-test
48
This kind of t-test is for 2 groups. It assesses whether the means of 2 groups are statistically different from each other.
Independent samples
49
This kind of t-test is for 1 group. It is used when the subjects making up the 2 samples are matched on some variable before being put in the 2 groups, or when the 2 groups are the same subjects administered a pretest and posttest.
Dependent samples
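The two main t-test variations above can be sketched with SciPy on hypothetical small samples (n < 30):

```python
from scipy import stats

# independent samples: two separate groups
group_a = [23, 25, 28, 30, 32]
group_b = [20, 22, 24, 27, 29]
t_ind, p_ind = stats.ttest_ind(group_a, group_b)

# dependent (paired) samples: same subjects, pretest vs. posttest
pre = [10, 12, 14, 16, 18]
post = [12, 15, 15, 18, 21]
t_dep, p_dep = stats.ttest_rel(pre, post)
```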
50
This kind of t-test is for 1 group. It is used to test the hypothesis that an observed proportion is equal to a pre-specified proportion.
Proportions/Percentages
51
This kind of t-test uses the F test to compare whether they are equal or unequal.
Variances
52
This kind of t-test is for 2 groups. It is used to assess the significance of the difference between two correlation coefficients found in 2 independent samples.
2 correlation coefficients
53
It is used for problems of predicting one variable from a knowledge of another or possibly several other variables. It is always the regression of the predicted value on the known variable.
Regression Equation
54
What are the regression equations?
Linear regression of y on x
Linear regression of x on y
Standard error of estimate (SEE)
55
Standard deviation of errors of prediction. An indication of the variability about the regression line in the population wherein predictions are being made.
Standard error of estimate (SEE)
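A sketch of the linear regression of y on x and the SEE, using hypothetical data and SciPy's linregress:

```python
import math
from scipy import stats

x = [1, 2, 3, 4, 5]            # known variable
y = [2.1, 3.9, 6.2, 8.0, 9.8]  # variable to be predicted

res = stats.linregress(x, y)   # linear regression of y on x
predicted = [res.intercept + res.slope * xi for xi in x]
r_squared = res.rvalue ** 2    # coefficient of determination

# SEE: standard deviation of the errors of prediction (n - 2 in the denominator)
n = len(x)
see = math.sqrt(sum((yi - pi) ** 2 for yi, pi in zip(y, predicted)) / (n - 2))
```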
56
Between ANOVA and t-tests, which organizes and directs the analysis and allows easier interpretation of the results?
ANOVA
57
Performing repeated t-tests increases the probability of _?
Type I error
58
ANOVA needs to be followed by what test?
Post hoc test
59
What does the post hoc test determine?
Which groups differ from each other.
60
We should not conduct a post hoc test unless the null hypothesis is _?
Rejected.
61
A test designed for a situation with equal sample size per group, but can be adapted to unequal sample sizes as well.
Tukey's (Honestly Significant Difference or HSD) Test
62
Descriptive measure of the utility of the regression equation for making predictions; it is the square of the linear correlation coefficient.
Coefficient of Determination
63
In determining the coefficient of determination, the nearer the value is to _, the more useful the regression equation is in making predictions.
1
64
Used to test the significance of the differences among means obtained from independent samples (parametric test), where more than 2 conditions are used, or even when several independent variables are involved.
Analysis of Variance (ANOVA)
65
What are the types of ANOVA?
One-Way ANOVA
Two-Way ANOVA
Three-Way ANOVA
66
Tests whether 2 or more samples were drawn from the same population by comparing means, or whether data from several groups have a common mean. There is 1 IV and 1 DV, with an interval level of data.
One-Way ANOVA
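A minimal SciPy sketch of a one-way ANOVA, with one IV (group) at three levels and one interval-level DV; the scores are hypothetical:

```python
from scipy import stats

group1 = [85, 86, 88, 75, 78]  # hypothetical interval-level scores
group2 = [91, 92, 93, 85, 87]
group3 = [79, 78, 88, 94, 92]

f_stat, p_value = stats.f_oneway(group1, group2, group3)
```

If the null is rejected, a post hoc test (e.g. Tukey's HSD or Scheffe's) would then identify which groups differ.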
67
It tests the hypothesis that the means of 2 variables (factors) from 2 or more groups (2 IV, 1 DV) are equal (drawn from populations with the same mean).
Two-way ANOVA
68
It has a similar purpose to the other kinds of ANOVA, except that the groups here have 3 categories of defining characteristics. It must have 3 IVs and 1 DV.
Three-Way ANOVA
69
It corrects alpha not just for all pair-wise or simple comparisons of means, but also for complex comparisons (contrast of more than 2 at a time) of means.
Scheffe's Test
70
The most popular of the post hoc procedures, most flexible and most conservative, but the least statistically powerful procedure.
Scheffe's Test
71
A versatile formula; data must be presented in frequencies. It is categorized as a non-parametric test but can also be used as a parametric test.
Chi-square
72
The 2 chi-square tests
Goodness of Fit
Independence
73
Also called one-sample or one-variable chi-square. Involves 1 variable of ≥ 2 categories. It compares the distribution of measures for deviation from a hypothesized distribution. Nominal level of data.
Goodness of Fit
74
A chi-square test that involves 2 variables consisting of ≥ 2 categories. It determines whether the 2 variables are related or not. Reveals only the relationship but not the magnitude of the relationship.
Independence
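Both chi-square tests can be sketched with SciPy on hypothetical frequency data:

```python
from scipy import stats

# goodness of fit: 1 variable, observed counts vs. an equal-frequency hypothesis
observed = [18, 22, 20, 20]
chi2_gof, p_gof = stats.chisquare(observed)

# independence: 2 nominal variables in a 2x2 contingency table
table = [[30, 10],
         [20, 40]]
chi2_ind, p_ind, dof, expected = stats.chi2_contingency(table)
```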
75
Parametric Test or Non-Parametric test? Random selection of subjects from a normal population with equal variances.
Parametric Tests
76
Parametric Test or Non-Parametric test? Whether the groups or samples to be compared are independent samples or correlated.
Both
77
Parametric Test or Non-Parametric test? Whether the number of groups to be compared is ≥ 2.
Non-parametric Test
78
Parametric Test or Non-Parametric test? More power, higher power efficiency.
Parametric Test
79
Parametric Test or Non-Parametric test? Simple and easier to calculate.
Non-parametric Test
80
Parametric Test or Non-Parametric test? No need to meet data requirements at all.
Non-parametric Test
81
What are some non-parametric tests?
Median Test
Fisher's Sign Test
Wilcoxon Rank Sum Test
Mann-Whitney (U) Test
Wilcoxon Signed Ranks Test (T)
Kruskal-Wallis H Test
Friedman Rank
82
Non-parametric test that compares the medians of 2 independent samples (uncorrelated). Only considers the number of cases above and below the median. Presented in ordinal data.
Median Test
83
Non-parametric test that compares 2 correlated samples by obtaining the differences between each pair of observations. Considers the signs of the differences between paired observations rather than their sizes.
Fisher's Sign Test
84
What level of data do non-parametric tests use?
Ordinal
85
Non-parametric test that is used for comparing 2 independent samples using rank data.
Wilcoxon Rank Sum Test
86
Non-parametric test that is used with independently drawn random samples, the sizes of which need not be the same.
Mann-Whitney (U) Test
87
Non-parametric test that is used for correlated samples; the difference, d, between each pair is calculated (data subjected to computation).
Wilcoxon Signed Ranks Test (T)
88
Non-parametric test that is used to test whether or not a group of independent samples is from the same or different populations. Compares 3 or more independent samples with respect to an ordinal variable.
Kruskal-Wallis H Test
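The Mann-Whitney U (2 independent samples) and Kruskal-Wallis H (3 or more) tests can be sketched with SciPy on hypothetical rank data:

```python
from scipy import stats

sample_a = [1, 4, 2, 5, 3]      # hypothetical ranks
sample_b = [6, 9, 7, 10, 8]
sample_c = [11, 13, 12, 15, 14]

u_stat, p_u = stats.mannwhitneyu(sample_a, sample_b, alternative="two-sided")
h_stat, p_h = stats.kruskal(sample_a, sample_b, sample_c)
```

Here the samples do not overlap at all, so U comes out 0 (no pair where sample_a outranks sample_b).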
89
Non-parametric test that is used to test whether or not the data are from the same sample under 3 different conditions.
Friedman Rank
90
The act of assigning numbers or symbols to characteristics of things according to rules.
Measurement
91
A set of numbers or symbols whose properties model empirical properties of the objects to which the numbers are assigned.
Scale
92
Permits classification and rank ordering on some characteristic. Has no absolute zero point.
Ordinal scales
93
A set of test scores arrayed for recording or study.
Distribution
94
A straightforward, unmodified accounting of performance that is usually numerical.
Raw score
95
All scores are listed alongside the number of times each score occurred.
Simple Frequency Distribution
96
Class intervals replace the actual test scores.
Grouped frequency distribution
97
A graph with vertical lines drawn at the true limits of each test score or class interval forming a series of contiguous rectangles.
Histogram
98
Expressed by a continuous line connecting the points where test scores or class intervals (X-axis) meet frequencies (Y-axis).
Frequency Polygon
99
If the distribution is normal, the mean is the most appropriate measure of central tendency for what level of data?
Interval or ratio
100
There are two scores that occur with the highest frequency. It is theoretically possible for this distribution to have two modes that fall at the high or low ends of the distribution.
Bimodal distribution
101
An indication of how scores in a distribution are scattered or dispersed.
Variability
102
Statistics that describe the amount of variation in a distribution.
Measures of variability
103
It is a measure of variability equal to the difference between Q3 and Q1.
Interquartile range
104
It is equal to the interquartile range divided by 2.
Semi-interquartile range
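A quick NumPy sketch of both ranges on hypothetical scores:

```python
import numpy as np

scores = np.array([2, 4, 6, 8, 10, 12, 14, 16])  # hypothetical scores

q1, q3 = np.percentile(scores, [25, 75])
iqr = q3 - q1        # interquartile range: Q3 - Q1
semi_iqr = iqr / 2   # semi-interquartile range
```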
105
Quarter refers to an _.
Interval
106
The dividing points between the 4 quarters in the distribution. It refers to a specific point.
Quartiles
107
Q2 and the _ are exactly the same.
Median
108
In a perfectly symmetrical distribution, Q1 and Q3 will be exactly the same distance from the _.
Median
109
To obtain this, all the deviation scores (taken as absolute values) are summed and divided by the total number of scores.
Average deviation
110
A measure of variability equal to the square root of the variance.
Standard deviation
111
It is equal to the arithmetic mean of the squares of the differences between the scores in a distribution and their mean.
Variance
112
It is the nature and extent to which symmetry is absent.
Skewness
113
Relatively few of the scores fall at the high end of the distribution.
Positive skew
114
Relatively few of the scores fall at the low end of the distribution.
Negative skew
115
Refers to the steepness of a distribution in its center.
Kurtosis
116
What are the 3 general types of curves and what does it mean?
Platykurtic - relatively flat
Leptokurtic - relatively peaked
Mesokurtic - somewhere in the middle
117
Distributions that have _ kurtosis have a high peak and fatter tails compared to a normal distribution.
High
118
Distributions that have _ kurtosis values have a rounded peak and thinner tails.
Lower
119
The development of the concept of a normal curve began in the middle of the 18th century with the work of _ and later the Marquis de Laplace.
Abraham DeMoivre
120
Through the early 19th century, scientists referred to the Normal curve as the _.
Laplace-Gaussian curve
121
He was credited with being the first to refer to the curve as the normal curve.
Karl Pearson
122
The distribution of the normal curve ranges from _ to _.
Negative infinity to positive infinity.
123
The normal curve is perfectly symmetrical with _ skewness.
No skewness
124
A raw score that has been converted from one scale to another scale where the latter scale has some arbitrarily set mean and standard deviation.
Standard scores
125
These scores are more easily interpretable than raw scores.
Standard scores
126
It results from the conversion of a raw score into a number indicating how many standard deviation units the raw score is below or above the mean of the distribution.
Z scores or zero plus or minus one scale
127
The T score was devised by W. A. McCall, who named it in honor of his professor _.
E. L. Thorndike
128
This standard score system is composed of a scale that ranges from 5 standard deviations below the mean to 5 standard deviations above the mean. Mean = 50; SD = 10.
T scores or fifty plus or minus ten scale
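A sketch of converting hypothetical raw scores to z scores and then T scores (T = 50 + 10z):

```python
import statistics

raw = [50, 60, 70, 80, 90]   # hypothetical raw scores
mean = statistics.mean(raw)  # 70
sd = statistics.pstdev(raw)  # population standard deviation

z_scores = [(x - mean) / sd for x in raw]   # mean 0, sd 1
t_scores = [50 + 10 * z for z in z_scores]  # mean 50, sd 10
```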
129
A standard score which has a mean of 5 and an SD of 2.
Stanine
130
The 5th stanine indicates performance in the _.
Average range
131
It is an expression of the degree and direction of correspondence between two things.
Correlation
132
A number that provides us with the index of the strength of the relationship between two things.
Coefficient of Correlation (r)
133
The meaning of a correlation coefficient is interpreted by its _ and _.
Sign and magnitude
134
The two ways to describe a perfect correlation between two variables are as either _ or _.
+1 -1
135
Magnitude is a number anywhere between _ and _.
+1 -1
136
He devised the Pearson r .
Karl Pearson
137
Can be the statistical tool of choice when the relationship between the variables is linear and when the two variables being correlated are continuous.
Pearson r/ Pearson Correlation Coefficient/ Pearson Product-moment Coefficient of Correlation
138
The Spearman Rho was developed by _.
Charles Spearman
139
A measure of correlation that is frequently used when the sample size is small (fewer than 30 pairs of measurement) and when both sets of measurements are in ordinal form.
Spearman Rho
140
A simple graphic of the coordinate points for the values of the X-variable (horizontal axis) and the Y-variable (vertical axis). They are useful because they provide a quick indication of the direction and magnitude of the relationship between the 2 variables and also reveals the presence of curvilinearity.
Scatterplot
141
An eyeball gauge of how curved a graph is.
Curvilinearity
142
An extremely atypical point located at a relatively long distance from the rest of the coordinate points in a scatter plot.
Outlier
143
A statistic useful in describing sources of test score variability.
Variance
144
Variance from true differences.
True variance
145
Variance from irrelevant random sources.
Error variance
146
The greater the proportion of the total variance that is true variance, the more _ the test.
Reliable
147
Refers to all the factors associated with the process of measuring some variable other than the variable being measured.
Measurement error
148
A source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process.
Random Error
149
A source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured.
Systematic error
150
Give a source of variance during test construction.
Item sampling/Content sampling
151
A validity coefficient that is used when correlating self-rankings of performance.
Spearman Rho Rank-order Correlation