L7 - Introduction to inferential statistics Flashcards
(37 cards)
Measure of central tendency (location)
Mean: average value
Median: exact middle value
Mode: most frequent value
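A minimal sketch of the three measures using Python's standard `statistics` module. The `cups` data is hypothetical and chosen so that one extreme value pulls the mean above the median.

```python
import statistics

# Hypothetical sample: cups of coffee per day reported by nine respondents.
cups = [1, 2, 2, 3, 3, 3, 4, 5, 13]

mean_cups = statistics.mean(cups)      # arithmetic average
median_cups = statistics.median(cups)  # middle value of the sorted data
mode_cups = statistics.mode(cups)      # most frequent value

print("mean:", mean_cups, "median:", median_cups, "mode:", mode_cups)
```

The single respondent who reports 13 cups raises the mean but leaves the median and mode unchanged.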
Measure of dispersion (variability / spread)
Range and standard deviation
Range
- The spread of the data: the distance between the minimum and maximum values of the variable.
- Can be used to describe the variability of open-ended questions (respondents define the range by their answers).
Standard deviation
Describes the average distance of the values in the distribution from the mean.
> Indicates how useful the mean is as a typical value.
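A small sketch comparing two hypothetical sets of scores that share the same mean but differ in spread. `value_range` is an illustrative helper, and `statistics.stdev` uses the sample formula (dividing by n - 1).

```python
import statistics

# Two hypothetical sets of ratings with the same mean (5).
scores_a = [4, 5, 5, 5, 6]
scores_b = [1, 3, 5, 7, 9]

def value_range(values):
    """Distance between the minimum and maximum values."""
    return max(values) - min(values)

for name, scores in (("A", scores_a), ("B", scores_b)):
    print(
        name,
        "mean:", statistics.mean(scores),
        "range:", value_range(scores),
        "std dev:", round(statistics.stdev(scores), 3),
    )
```

The mean is a better typical value for set A, where the standard deviation is small, than for set B.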
Role of Descriptive analysis
+ Provide summary measures of typical or average values
+ Present data in a digestible format
+ Provide preliminary insights about the distribution of values for each variable
+ Help detect errors in the coding process
Population (Malhotra, 2010)
the complete set of individuals or objects of interest
Sample (Malhotra, 2010)
a subset of population from which information is gathered
Parameter (Malhotra, 2010)
- true value of a variable
- a fixed value that refers to the population and is unknown
> It is the same from sample to sample
Sample statistic (Malhotra, 2010)
- value of a variable that is estimated from a sample.
- it is hoped to be close to the parameter of the population of which the sample is a subset.
Point estimate (Malhotra, 2010)
a single value that is obtained from sample data and is used as the best guess of the corresponding population parameter
> It differs from sample to sample
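A short simulation sketch of why the parameter is fixed while the point estimate differs from sample to sample. The population (mean 50, standard deviation 10), the sample size of 30, and the seed are all assumptions made for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical population of 10,000 values.
population = [random.gauss(50, 10) for _ in range(10_000)]
parameter = statistics.mean(population)  # fixed; unknown in practice

# Five independent samples of 30 each give five different point estimates.
estimates = [
    statistics.mean(random.sample(population, 30)) for _ in range(5)
]

print("population mean:", round(parameter, 2))
print("sample means:   ", [round(e, 2) for e in estimates])
```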
Confidence interval
a range into which the true population parameter will fall, assuming a given level of confidence.
CI = sample statistic ± k × standard error
Standard error parameter (k)
the number of standard errors to place on either side of the estimate, set by the desired level of confidence (ex: k = 1.96 for a 95% CI)
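A minimal sketch of the CI formula for a sample mean. It assumes k is taken from the standard normal distribution, which matches the k = 1.96 example; with small samples a t-based k would give a wider interval. The `sample` data and the helper names are hypothetical.

```python
import math
import statistics

def critical_k(confidence):
    """Number of standard errors for a two-sided interval (normal-based)."""
    return statistics.NormalDist().inv_cdf(0.5 + confidence / 2)

def confidence_interval(values, confidence=0.95):
    """CI = sample statistic +/- k * standard error, for the mean."""
    sample_mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    k = critical_k(confidence)
    return sample_mean - k * standard_error, sample_mean + k * standard_error

# Hypothetical sample of 12 observations.
sample = [48, 52, 50, 47, 55, 51, 49, 53, 50, 46, 54, 51]
lower, upper = confidence_interval(sample)
print("k:", round(critical_k(0.95), 2))
print("95% CI:", round(lower, 2), "to", round(upper, 2))
```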
Hypothesis (Hair, 2017)
an unproven supposition that tentatively explains certain facts or phenomena. It is developed prior to data collection.
> Tests are designed to disprove the null hypothesis.
Null hypothesis
If the null hypothesis is not rejected, we do not have to change the status quo. Failing to reject it does not prove it; we conclude only that it may be true.
Steps in hypothesis testing (slide)
1) Formulate the hypothesis
2) Decide on the test and test statistic
3) Select a significance level
4) Statistical decision (reject or do not reject the null hypothesis)
5) Conclusion
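The five steps can be sketched with a one-sample t-test. The ratings, the hypothesized mean of 3.0, and alpha = 0.05 are assumptions for illustration; the critical value is copied from a standard t table (two-tailed, df = 9) rather than computed.

```python
import math
import statistics

# Hypothetical data: satisfaction ratings from 10 respondents.
sample = [3.1, 3.6, 2.9, 3.8, 3.4, 3.3, 3.9, 3.5, 3.0, 3.5]

# 1) Formulate the hypothesis: H0: mu = 3.0, H1: mu != 3.0 (two-tailed).
hypothesized_mean = 3.0

# 2) Decide on the test and test statistic: one-sample t-test.
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
t_statistic = (statistics.mean(sample) - hypothesized_mean) / standard_error

# 3) Select a significance level: alpha = 0.05.
#    Critical value from a t table (two-tailed, alpha = 0.05, df = 9).
critical_t = 2.262

# 4) Statistical decision: reject H0 if |t| exceeds the critical value.
reject_h0 = abs(t_statistic) > critical_t

# 5) Conclusion, stated in terms of the research question.
if reject_h0:
    print("Reject H0: the mean rating differs from 3.0.")
else:
    print("Do not reject H0: no evidence the mean rating differs from 3.0.")
```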
Test the hypothesis based on 4 factors:
- Type of hypothesis
- Number of variables
- Scale of measurement
- Distribution assumptions
Three types of hypothesis
- Specific population characteristics
- Contrasts / Comparisons
- Associations / Relationships
2 types of Distribution assumptions
Parametric (interval scale, normal bell-shaped distribution) and nonparametric (nominal and ordinal scales) types of statistics.
Appropriate statistics by type of scale: measure of location, measure of spread, and statistical technique
- Nominal: mode, none, Chi-square
- Ordinal: median, percentile, Chi-square
- Interval: mean, standard deviation, t-test and ANOVA
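The mapping above can be written as a simple lookup table. This is only a restatement of the card as a data structure; the names `APPROPRIATE_STATISTICS` and `statistics_for` are hypothetical.

```python
# Scale of measurement -> appropriate statistics, as listed on the card.
APPROPRIATE_STATISTICS = {
    "nominal": {
        "location": "mode",
        "spread": None,  # no measure of spread for nominal data
        "technique": "Chi-square",
    },
    "ordinal": {
        "location": "median",
        "spread": "percentile",
        "technique": "Chi-square",
    },
    "interval": {
        "location": "mean",
        "spread": "standard deviation",
        "technique": "t-test and ANOVA",
    },
}

def statistics_for(scale):
    """Look up the appropriate statistics for a scale of measurement."""
    return APPROPRIATE_STATISTICS[scale.lower()]

print(statistics_for("Ordinal"))
```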
Comparing means with Independent vs. Related samples
- Means are from independent samples (ex: coffee consumption of female vs. male respondents).
- Means are from related samples (ex: coffee consumption vs. milk tea consumption of the same female respondents). Since the sample is the same, it is called a paired sample.
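A sketch of how the t statistic is computed differently in the two cases. All data is hypothetical. The independent-samples version assumes equal variances (pooled variance), and the paired version works on the per-respondent differences.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """t statistic for two independent samples (pooled variance assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error

def paired_t(first, second):
    """t statistic for related samples, based on pairwise differences."""
    differences = [x - y for x, y in zip(first, second)]
    standard_error = statistics.stdev(differences) / math.sqrt(len(differences))
    return statistics.mean(differences) / standard_error

# Hypothetical cups per day.
female_coffee = [2, 3, 1, 4, 2]
male_coffee = [3, 4, 2, 5, 4]        # different respondents: independent
female_milk_tea = [1, 2, 1, 2, 1]    # same respondents as female_coffee: paired

t_independent = independent_t(female_coffee, male_coffee)
t_paired = paired_t(female_coffee, female_milk_tea)
print("independent t:", round(t_independent, 3))
print("paired t:     ", round(t_paired, 3))
```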
Test statistic
- serves as the decision maker, since the decision to reject or not reject Ho depends on its magnitude (how close the sample comes to Ho)
- ex: the t statistic in the t-test, a univariate hypothesis test using the t distribution, which is used when the population standard deviation is unknown and the sample size is small.
Frequency distribution (Malhotra, 2013; Hair, 2017)
- a mathematical distribution whose objective is to obtain a count of the number of responses associated with different values of one variable and to express these counts in percentage terms.
- descriptive statistics are used to accomplish this task.
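A minimal sketch of a frequency distribution for one variable: a count per value, also expressed as a percentage. The `responses` data (ratings on a 1 to 5 scale) and the helper name are hypothetical.

```python
from collections import Counter

# Hypothetical ratings on a 1-5 scale from ten respondents.
responses = [1, 2, 2, 3, 3, 3, 4, 4, 5, 3]

def frequency_distribution(values):
    """Map each distinct value to (count, percentage of all responses)."""
    counts = Counter(values)
    total = len(values)
    return {
        value: (count, 100 * count / total)
        for value, count in sorted(counts.items())
    }

table = frequency_distribution(responses)
for value, (count, percent) in table.items():
    print(value, count, f"{percent:.1f}%")
```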
Role of frequency distribution (Malhotra, 2013)
- Determine the extent of item nonresponse.
- Indicate the extent of illegitimate responses.
- Detect outlier cases with extreme value.
- Indicate the shape of the empirical distribution of the variable. By constructing a histogram, we can examine whether the observed distribution is consistent with the assumed distribution.
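A sketch of using counts to screen raw data for the first two problems listed above. It assumes a question coded 1 to 5, with `None` standing for a missing answer; the data and names are hypothetical.

```python
from collections import Counter

VALID_CODES = {1, 2, 3, 4, 5}  # assumed legitimate codes for the question

# Hypothetical raw answers; None marks item nonresponse, 9 is a coding error.
raw_answers = [4, 5, None, 3, 9, 4, None, 2, 5, 4]

counts = Counter(raw_answers)

# Extent of item nonresponse.
nonresponse_count = counts[None]

# Extent of illegitimate responses (codes outside the valid set).
illegitimate = {
    value: count
    for value, count in counts.items()
    if value is not None and value not in VALID_CODES
}

print("nonresponse:", nonresponse_count, "of", len(raw_answers))
print("illegitimate codes:", illegitimate)
```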
One-tailed and two-tailed test differences
- A test is one-tailed when the alternative hypothesis is expressed directionally (< or >).
- A test is two-tailed when the alternative hypothesis is not expressed directionally (≠).
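A sketch of how the choice of tails changes the p-value for the same test statistic. It assumes a z statistic with a standard normal reference distribution, and that for the one-tailed case the observed effect lies in the hypothesized direction. The value z = 1.8 is hypothetical.

```python
from statistics import NormalDist

def p_value(z, tails):
    """p-value for a z statistic; tails is 1 (directional) or 2."""
    tail_area = 1 - NormalDist().cdf(abs(z))
    return tail_area if tails == 1 else 2 * tail_area

z_statistic = 1.8   # hypothetical test statistic
alpha = 0.05

p_one_tailed = p_value(z_statistic, tails=1)
p_two_tailed = p_value(z_statistic, tails=2)

print("one-tailed p:", round(p_one_tailed, 4), "reject:", p_one_tailed < alpha)
print("two-tailed p:", round(p_two_tailed, 4), "reject:", p_two_tailed < alpha)
```

With the same data, the directional test rejects Ho at the 0.05 level while the non-directional test does not, because the two-tailed p-value is twice the one-tailed one.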