17. Basic Statistics Flashcards Preview

Flashcards in 17. Basic Statistics Deck (16)

Central Limit Theorem P.282

The distribution of sample averages will tend toward a normal distribution as the sample size, n, approaches infinity.
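
A small simulation can make this card concrete. The sketch below is illustrative only: the uniform(0, 1) population, the number of trials, and the function name `sample_means` are assumptions, not part of the original deck. As n grows, the averages cluster more tightly around the population mean of 0.5.

```python
import random
import statistics

def sample_means(n, trials=2000, seed=0):
    """Return `trials` averages, each computed from n draws of a uniform(0, 1) population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)]

# Print n, the mean of the sample averages, and their spread.
for n in (1, 5, 30):
    means = sample_means(n)
    print(n, round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```

The spread of the averages should shrink roughly in proportion to 1/√n, and a histogram of `means` for n = 30 should look approximately bell-shaped.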


Inferential statistics P.283

Uses sample data to draw statistical conclusions about the population from which the sample was drawn.


Confidence Intervals P. 283

Provides boundaries for an unknown parameter of a population with a specified degree of confidence that the parameter falls within the interval.
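
As a rough illustration, the sketch below computes a confidence interval for a population mean. It assumes a large-sample normal (z) approximation with the sample standard deviation standing in for the population value; the function name and the sample data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(data, confidence=0.95):
    """Approximate CI for the mean: x-bar +/- z * s / sqrt(n) (assumes a z approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    center = fmean(data)
    half_width = z * stdev(data) / sqrt(len(data))
    return center - half_width, center + half_width

data = [10, 12, 11, 13, 12, 11, 10, 12]  # made-up measurements
print(mean_confidence_interval(data, 0.95))
print(mean_confidence_interval(data, 0.99))
```

A higher confidence level gives a wider interval. For small samples a t-based interval would normally be preferred over this z-based sketch.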


Hypothesis testing P.283

Also called a test of significance. Determines whether an observed result could plausibly have occurred by chance.


Descriptive statistics P.284

Collection of tools and techniques for displaying and summarizing data.


Three common ways for quantifying the centrality of a population or sample P. 285

1. Mean
2. Median - Middle value of an ordered data set
3. Mode - Most frequently found
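
The three measures can be computed with Python's standard `statistics` module. This is a minimal sketch; the helper name `centrality` and the data set are made up for illustration.

```python
import statistics

def centrality(data):
    """Return (mean, median, mode) for a list of numbers."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)

data = [2, 3, 3, 5, 7, 10]
mean, median, mode = centrality(data)
print(mean, median, mode)  # mean 5, median 4.0 (average of 3 and 5), mode 3
```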


Useful descriptive statistics P.287

- Kurtosis, measures the degree to which a set of data is peaked or flat.
- Skewness, measures the degree to which a set of data is not symmetrical.
- Maximum
- Minimum
- First quartile (Q1)
- Second quartile (Q2), median
- Third quartile (Q3)
- Fourth quartile (Q4), Maximum value. Not computed
- Interquartile range (IQR) = Q3 - Q1
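
The card does not give formulas for skewness or kurtosis. The sketch below assumes the common moment-based (population) definitions, with kurtosis reported as excess kurtosis (0 for a normal distribution); other texts and software use slightly different sample corrections. The function name is hypothetical.

```python
import statistics

def shape_statistics(data):
    """Min, max, moment-based skewness, and excess kurtosis (assumed definitions)."""
    n = len(data)
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return {
        "min": min(data),
        "max": max(data),
        "skewness": m3 / m2 ** 1.5,          # 0 for symmetric data
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # negative means flatter than normal
    }

print(shape_statistics([1, 2, 3, 4, 5]))   # symmetric data
print(shape_statistics([1, 1, 1, 2, 10]))  # long right tail, positive skewness
```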


Determining Quartiles P.290

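Q1: w = (n + 1)/4
If w is not an integer, write it as "y.z", where y is the integer part and z is the decimal fraction of w (for example, w = 2.25 gives y = 2 and z = 0.25).
Q1 = X_y + z(X_(y+1) - X_y)
X_j = jth observation of the ordered list of sample data.

Q3: w = 3(n + 1)/4
Q3 = X_y + z(X_(y+1) - X_y), using the y and z obtained from this w.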
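
The position-and-interpolate procedure on this card, w = (n + 1)/4 for Q1 and w = 3(n + 1)/4 for Q3, can be sketched as follows. The function name and the guards for very small samples are assumptions added for illustration.

```python
def quartile(data, q):
    """Quartile q (1 or 3) using position w = q(n + 1)/4 with linear interpolation."""
    xs = sorted(data)        # X_1 ... X_n as an ordered list
    n = len(xs)
    w = q * (n + 1) / 4
    y = int(w)               # integer part of w
    z = w - y                # decimal fraction of w
    if y < 1:                # assumed guard: position falls before the first observation
        return xs[0]
    if y >= n:               # assumed guard: position falls at or past the last observation
        return xs[-1]
    # X_y is xs[y - 1] because Python lists are 0-indexed.
    return xs[y - 1] + z * (xs[y] - xs[y - 1])

data = [1, 2, 3, 4, 5, 6, 7, 8]
q1, q3 = quartile(data, 1), quartile(data, 3)
print(q1, q3, q3 - q1)  # Q1 = 2.25, Q3 = 6.75, IQR = 4.5
```

Here n = 8, so w = 2.25 for Q1 (y = 2, z = 0.25) and w = 6.75 for Q3 (y = 6, z = 0.75). Other quartile conventions exist and can give slightly different values.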


Frequency distribution P.293

A tabular display of data in mutually exclusive and collectively exhaustive classes or intervals that summarizes the number of occurrences in each class.

Class / Class Interval (Cell) - range in which data in frequency distribution table are placed or binned
Class limits - upper and lower limit
Class mark - average value of the class limits
Class boundaries - values that separate adjacent classes so the classes stay mutually exclusive with no gaps (carried to one more decimal place than the original raw data, typically ending in 5)
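
To show how these terms fit together, here is a minimal sketch that builds a frequency table for integer-valued data. The function name, the equal class widths, and the sample data are assumptions for illustration; the half-unit boundaries follow the "one more decimal place, ending in 5" convention.

```python
def frequency_table(data, first_lower, width, num_classes):
    """Build equal-width classes for integer data and count occurrences in each."""
    rows = []
    for i in range(num_classes):
        lower = first_lower + i * width
        upper = lower + width - 1  # upper class limit for integer data
        rows.append({
            "limits": (lower, upper),
            "mark": (lower + upper) / 2,               # average of the class limits
            "boundaries": (lower - 0.5, upper + 0.5),  # halfway to the adjacent class
            "frequency": sum(lower <= x <= upper for x in data),
        })
    return rows

data = [12, 15, 21, 27, 29, 33]
for row in frequency_table(data, first_lower=10, width=10, num_classes=3):
    print(row)
```

With class limits 10-19, 20-29, and 30-39, the boundaries are 9.5, 19.5, 29.5, and 39.5, so every observation falls in exactly one class.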


Histogram P.297

A pictorial representation of data based on a frequency distribution. A histogram need not have equal class widths.
Adv.- immediate visual feedback.
Dis. - groups data into classes (raw data lost)


Histogram Interval calculation P.297

- 2 to the k rule: choose k as the first (smallest) integer such that 2^k > n, where n = number of data points
- Sturges's formula: k = 1 + 3.3 log10(n)
- Square root rule: k = √n
- Rice's rule: k = 2n^(1/3)
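
The four rules can be compared side by side. In the sketch below, n is the number of data points and k the number of classes; rounding the last three results up to a whole number is an assumption, since the card does not say how to round. The function name is hypothetical.

```python
import math

def class_counts(n):
    """Suggested number of histogram classes k for n data points under each rule."""
    k = 0
    while 2 ** k <= n:  # smallest integer k with 2^k > n
        k += 1
    return {
        "2^k rule": k,
        "Sturges": math.ceil(1 + 3.3 * math.log10(n)),  # rounded up (assumption)
        "square root": math.ceil(math.sqrt(n)),         # rounded up (assumption)
        "Rice": math.ceil(2 * n ** (1 / 3)),            # rounded up (assumption)
    }

print(class_counts(100))
```

For n = 100 the rules give 7, 8, 10, and 10 classes respectively, which shows that they are rough starting points and not exact prescriptions.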


Box-and-Whisker Diagrams P. 298

Graphical display of the high and low values of the data, the quartiles, the median, outliers, and the interquartile range.


Scatter Diagrams P.301

Plot of two variables, one on the y-axis and the other on the x-axis. Allows visual examination of the points for patterns to determine whether the variables show any relationship.


Normal Probability Plots P.303

Uses special graph paper on which a random sample from a normally distributed population plots as an approximately straight line. AKA Rankit, QQ, Quantile, PP Plots.

Probability (mean rank plotting position) = Rank / (N+1), where Rank is the position of the observation in the ordered sample
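
Without special paper, the same plot can be built by pairing each ordered observation with a normal quantile. The sketch below reads the card's formula as the mean rank plotting position, rank / (N + 1); that reading, the function name, and the sample data are assumptions.

```python
from statistics import NormalDist

def normal_plot_points(sample):
    """Return (theoretical normal quantile, observed value) pairs for plotting."""
    xs = sorted(sample)
    n = len(xs)
    points = []
    for rank, x in enumerate(xs, start=1):
        p = rank / (n + 1)             # assumed plotting position: rank / (N + 1)
        z = NormalDist().inv_cdf(p)    # standard normal quantile for that probability
        points.append((z, x))
    return points

for z, x in normal_plot_points([5.1, 4.8, 5.6, 5.0, 5.3]):
    print(round(z, 3), x)
```

If the sample comes from a normal population, plotting x against z should give points that lie close to a straight line.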


Descriptive (enumerative) Statistics P. 325

Studies that use numerical and graphical techniques to present data in an understandable format and summarize the information revealed in a data set.


Inferential (analytical) Statistics P.325

Studies that analyze data from a sample to infer properties of the population from which the sample was drawn.