An RV (random variable) represents a numerical value associated with each outcome of a probability experiment. The value of “X” is determined by chance (random), so it is subject to some form of uncertainty.
As an experiment is repeated over and over, the empirical probability of an event approaches the theoretical (actual) probability of the event.
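A quick way to see this is to simulate it. The sketch below assumes a fair coin (theoretical probability of heads = 0.5) and uses a hypothetical helper, `empirical_probability`, with a fixed seed so the run is repeatable.

```python
import random

def empirical_probability(n_trials, seed=0):
    """Fraction of heads in n_trials simulated flips of a fair coin."""
    rng = random.Random(seed)  # fixed seed: same result on every run
    heads = sum(rng.random() < 0.5 for _ in range(n_trials))
    return heads / n_trials

# The empirical probability should drift toward 0.5 as n grows.
for n in (10, 100, 10_000, 100_000):
    print(n, empirical_probability(n))
```

Small runs can land far from 0.5; the large runs should sit very close to it.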
The expected value is the mean of the random variable over an infinite number of repetitions of the experiment (samples). For a discrete RV: E(X) = Σ xP(x), the sum of every value of x multiplied by that x’s probability.
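As a worked example of Σ xP(x), here is a minimal sketch for a fair six-sided die (an assumed example, not from the notes). The distribution is stored as a dict {x: P(x)}, and `expected_value` is a hypothetical helper name.

```python
from fractions import Fraction

def expected_value(dist):
    """Sum of x * P(x) over a discrete distribution given as {x: P(x)}."""
    return sum(x * p for x, p in dist.items())

# Fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(die))  # 7/2, i.e. 3.5
```

Using `Fraction` keeps the arithmetic exact; floats would work too, with the usual rounding.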
The probability distribution of a discrete variable contains info about ALL the statistical properties of X. For example, once the probability distribution is known, the expectation of any function of X can be calculated, which gives the expected value, variance, and standard deviation.
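To illustrate, the sketch below computes the mean, variance, and standard deviation from nothing but the distribution, using E[g(X)] = Σ g(x)P(x). The fair die and the helper name `expectation` are assumptions made for the example.

```python
import math
from fractions import Fraction

def expectation(dist, g=lambda x: x):
    """E[g(X)] = sum of g(x) * P(x) for a distribution given as {x: P(x)}."""
    return sum(g(x) * p for x, p in dist.items())

die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = expectation(die)                                  # 7/2
variance = expectation(die, lambda x: (x - mean) ** 2)   # 35/12
sd = math.sqrt(variance)                                 # about 1.708

print(mean, variance, sd)
```

The variance is just the expectation of the function g(x) = (x - mean)^2, and the standard deviation is its square root.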
There is an infinite continuum of possible values for x for a continuous RV “X”. The probability of X being exactly equal to a particular value is zero: Pr{X = x} = 0. So instead the probability distribution of a continuous variable is defined by the probability of the RV being less than or equal to a particular value: Pr{X ≤ x}.
Normal distribution with mean of 0 and standard deviation of 1.
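For a continuous RV, probabilities come from Pr{X ≤ x}, and the probability of an interval is a difference of two such values. A minimal sketch for the standard normal, using the standard identity Pr{Z ≤ z} = 0.5 * (1 + erf(z / 2^0.5)); the function names are hypothetical.

```python
import math

def std_normal_cdf(z):
    """Pr{Z <= z} for a normal distribution with mean 0 and s.d. 1."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_probability(a, b):
    """Pr{a < Z <= b}, computed as the difference of two CDF values."""
    return std_normal_cdf(b) - std_normal_cdf(a)

print(std_normal_cdf(0.0))               # 0.5
print(interval_probability(-1.0, 1.0))   # about 0.6827
```

The second result is the familiar "about 68% within one standard deviation" figure.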
mean: μ(x-bar) = μ
variance: σ^2(x-bar) = σ^2/n
s. d.: σ(x-bar) = σ/n^0.5 (also known as “standard error of the mean”)
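These three results can be checked by simulation. The sketch below assumes a normal population with mean 0 and σ = 10, draws many samples of size n = 25, and compares the observed s.d. of the sample means with σ/n^0.5 = 2. The function name, parameter values, and seed are all assumptions made for the example.

```python
import math
import random
import statistics

def sd_of_sample_means(sigma, n, n_samples=5000, seed=1):
    """Observed s.d. of x-bar across n_samples simulated samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [
        statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)

sigma, n = 10.0, 25
print("theory:   ", sigma / math.sqrt(n))        # 2.0
print("simulated:", sd_of_sample_means(sigma, n))
```

The simulated value should land close to 2.0, and it shrinks as n grows.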
SEM: The standard deviation of the sampling distribution of the sample means: σ(x-bar) = σ/n^0.5
SEM quantifies the precision of the mean; it is a measure of how far your sample mean is likely to be from the true population mean. SEM is expressed in the same units as the data.
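In practice σ is usually unknown, so the SEM is estimated from the sample by using s in place of σ. That substitution is an assumption of this sketch, as are the made-up data values and the helper name `standard_error`.

```python
import math
import statistics

def standard_error(sample):
    """Estimated SEM: sample s.d. (n - 1 denominator) divided by n^0.5."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only

print(statistics.mean(data))   # 5
print(standard_error(data))    # about 0.756, same units as the data
```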
There are many t-distributions; the particular form of the t-distribution is determined by its degrees of freedom (df = n - 1, i.e. sample size minus one). The more degrees of freedom, the closer a t-distribution is to a normal distribution; w/ infinite df, the t-distribution is the same as the standard normal distribution.
t = [x-bar - μ] / [s / n^0.5], where x-bar is the sample mean, μ is the population mean, s is the standard deviation of the sample, n is the sample size.
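The formula translates directly into code. In this sketch the data and the hypothesized population mean μ = 4 are made up for illustration, and `t_statistic` is a hypothetical helper; it returns the t value together with df = n - 1.

```python
import math
import statistics

def t_statistic(sample, mu):
    """Return (t, df) where t = (x-bar - mu) / (s / n^0.5) and df = n - 1."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample s.d., n - 1 denominator
    return (x_bar - mu) / (s / math.sqrt(n)), n - 1

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only
t, df = t_statistic(data, mu=4)

print(t, df)  # t is about 1.323 with df = 7
```

The t value would then be compared against the t-distribution with 7 degrees of freedom.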
The t-distribution is used when sample sizes are small and/or the SD of the population is unknown. The t-distribution can be used w/ any statistic having a bell-shaped distribution (approximately normal).