Data

collections of facts

Population

a well-defined collection of objects

Census

When desired information is available for all objects in the population

Sample

a subset of the population

Variable

any characteristic whose value may change from one object to another

univariate data set

consists of observations on a single variable. For example, we might determine the type of transmission, automatic (A) or manual (M), on each of ten automobiles recently purchased at a certain dealership, resulting in the categorical data set: M A A A M A A M A A

bivariate data

when observations are made on each of two variables

Multivariate data

when observations are made on more than one variable

descriptive statistics

Methods for summarizing and describing important features of the data, e.g., a graph or a mean

Inferential statistics

Techniques for generalizing from a sample to a population

hypothetical population

a conceptual population consisting of all possible measurements that might be made under similar experimental conditions

confidence interval or interval estimate

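An interval of plausible values for a population characteristic, such as the population mean, reported with a stated level of confidence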

lower prediction bound

A value that a single future observation is predicted to exceed, with a stated level of confidence

The relationship between probability and inferential statistics

probability reasons from the population to the sample (deductive reasoning), whereas inferential statistics reasons from the sample to the population (inductive reasoning)

Enumerative studies

interest is focused on a finite, identifiable, unchanging collection of individuals or objects that make up a population

Sampling frame

a listing of the individuals or objects to be sampled

Analytic study

A study that is not enumerative in nature

simple random sample

A sample selected so that any particular subset of the specified size (e.g., a sample of size 100) has the same chance of being selected
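
A minimal sketch of drawing a simple random sample, assuming a hypothetical sampling frame of 1,000 numbered units (the frame, the seed, and the variable names are illustrative and not from the source). Python's `random.sample` draws without replacement, so every subset of the requested size is equally likely:

```python
import random

# Hypothetical sampling frame: ID numbers for 1,000 population units.
frame = list(range(1, 1001))

# A fixed seed is used only so that the sketch is reproducible.
rng = random.Random(42)

# Draw 100 distinct units without replacement.
sample = rng.sample(frame, k=100)

print(len(sample))  # 100
```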

stratified sampling

entails separating the population units into nonoverlapping groups and taking a sample from each one
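
One way this might look in code, assuming three hypothetical strata and an assumed sample size of 10 per stratum (neither comes from the source). Each stratum is sampled separately with a simple random sample:

```python
import random

# Hypothetical population units, already separated into nonoverlapping groups.
strata = {
    "freshman": [f"F{i}" for i in range(200)],
    "sophomore": [f"S{i}" for i in range(150)],
    "junior": [f"J{i}" for i in range(100)],
}
per_stratum = 10  # assumed number of units to take from each stratum

rng = random.Random(0)  # fixed seed only for reproducibility

# Take a separate simple random sample from every stratum.
stratified_sample = {
    name: rng.sample(units, k=per_stratum) for name, units in strata.items()
}

for name, units in stratified_sample.items():
    print(name, len(units))
```

Taking the same number from each stratum is only one possible design; the per-stratum size could instead be chosen in proportion to the size of each stratum.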

Sample size

The number of observations in a single sample, often denoted by "n"

Truncating

To shorten every number in a data set by dropping the same trailing digits, without rounding
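
A short sketch contrasting truncation with rounding, assuming "truncating" here means dropping trailing digits. The three values are hypothetical:

```python
import math

values = [64.7, 35.2, 98.9]  # hypothetical measurements

# Truncating drops the decimal digit outright.
truncated = [math.trunc(v) for v in values]

# Rounding moves each value to the nearest integer instead.
rounded = [round(v) for v in values]

print(truncated)  # [64, 35, 98]
print(rounded)    # [65, 35, 99]
```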

Dot plot

an attractive summary of numerical data when the data set is reasonably small or there are relatively few distinct data values. Each observation is represented by a dot above the corresponding location on a horizontal measurement scale. When a value occurs more than once, there is a dot for each occurrence, and these dots are stacked vertically. As with a stem-and-leaf display, a dot plot gives information about location, spread, extremes, and gaps.
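
A rough text-only sketch of the stacking idea, assuming small single-digit integer data (the data set and the `text_dotplot` helper are hypothetical). Each `*` is one observation, and repeated values stack vertically above the scale:

```python
from collections import Counter

def text_dotplot(data):
    """Return a text dot plot for single-digit integer data."""
    counts = Counter(data)
    scale = range(min(data), max(data) + 1)
    rows = []
    # Build rows from the tallest stack down to the first dot.
    for level in range(max(counts.values()), 0, -1):
        row = "".join("* " if counts[v] >= level else "  " for v in scale)
        rows.append(row.rstrip())
    # The last row is the horizontal measurement scale.
    rows.append(" ".join(str(v) for v in scale))
    return "\n".join(rows)

data = [1, 2, 2, 3, 3, 3, 5]  # hypothetical observations
print(text_dotplot(data))
```

The empty column above 4 shows how a gap in the data appears in the display.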

Discrete

A numerical variable is called this if its set of possible values either is finite or else can be listed in an infinite sequence

Continuous

A numerical variable is called this if its possible values consist of an entire interval on the number line.

Frequency

the number of times that a value occurs in the data set

Relative frequency

the fraction or proportion of times the value occurs (number of times the value occurs / number of observations in the data set)

Frequency distribution

a tabulation of the frequencies and/or relative frequencies
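
As an illustration, the frequencies and relative frequencies for the ten-car transmission data from the univariate data set card (M A A A M A A M A A) could be tabulated as follows. The variable names are illustrative:

```python
from collections import Counter

# Transmission type (A = automatic, M = manual) for ten automobiles.
data = "M A A A M A A M A A".split()
n = len(data)

# Frequency: number of times each value occurs.
frequency = Counter(data)

# Relative frequency: frequency divided by the number of observations.
relative_frequency = {value: count / n for value, count in frequency.items()}

for value in sorted(frequency):
    print(value, frequency[value], relative_frequency[value])
# A 7 0.7
# M 3 0.3
```

The relative frequencies always sum to 1, apart from floating-point rounding.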

unimodal histogram

A histogram that rises to a single peak and then declines

bimodal histogram

A histogram that has two different peaks

multimodal

A histogram with more than two peaks

When is a histogram symmetric?

if the left half is a mirror image of the right half

When is a unimodal histogram positively skewed?

if the right or upper tail is stretched out compared with the left or lower tail

When is a unimodal histogram negatively skewed?

if the left or lower tail is stretched out compared with the right or upper tail

qualitative

Categorical

Mean

The arithmetic average of the set. Often referred to as the sample mean and represented by x̄.
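
A minimal sketch of the calculation, using a small hypothetical sample: add up the observations and divide by the sample size n.

```python
sample = [2, 4, 4, 6, 9]  # hypothetical observations

n = len(sample)          # sample size
x_bar = sum(sample) / n  # sample mean: sum of the observations divided by n

print(x_bar)  # 5.0
```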

point estimate

a single number that is our “best” guess

Population mean

The average of all values in the population. Denoted as μ. When there are N values in the population (a finite population), then μ= sum of the N population values/N.

median

the middle value once the observations are ordered from smallest to largest; when the number of observations is even, the average of the two middle values. The sample median is denoted as x̃
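
A sketch of the calculation on hypothetical data, assuming the usual convention that an even number of observations uses the average of the two middle values. The `sample_median` helper is illustrative:

```python
def sample_median(observations):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(observations)  # order from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the single middle value.
        return ordered[mid]
    # Even n: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(sample_median([3, 1, 2]))     # 2
print(sample_median([4, 1, 3, 2]))  # 2.5
```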

Range

the difference between the largest and smallest sample values

population median

a middle value in the population. Denoted as μ̃

deviations from the mean

Obtained by subtracting x̄ from each of the n sample observations. The average deviation is always zero.
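
A quick numerical check on a hypothetical sample: subtracting x̄ from each observation gives deviations that sum to zero, so their average is zero as well.

```python
sample = [2, 4, 4, 6, 9]  # hypothetical observations

n = len(sample)
x_bar = sum(sample) / n  # sample mean, 5.0 for this data

# Deviation of each observation from the mean.
deviations = [x - x_bar for x in sample]

print(deviations)            # [-3.0, -1.0, -1.0, 1.0, 4.0]
print(sum(deviations) / n)   # 0.0
```

With less convenient data the sum may differ from zero by a tiny floating-point error.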

sample variance