15. Data Collection Flashcards Preview

Flashcards in 15. Data Collection Deck (27):

Qualitative Data P.227

Data based on descriptive information. Usually collected in a free-form manner.


Quantitative Data P.228

Data that can be measured, verified, and manipulated, also known as numerical data.


Discrete Data P.228

Count data; sometimes called categorical or attribute data.


Continuous Data P.228

Data that exist on an interval, or on several intervals. Also called variable data. Variable data can be transformed into attribute data, but the reverse is not true.
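
A minimal sketch of the one-way transformation from variable to attribute data: continuous measurements are reduced to pass/fail labels by comparing them with specification limits. The function name, the readings, and the limits are hypothetical values chosen for illustration.

```python
def to_attribute(measurements, lower_spec, upper_spec):
    """Convert continuous measurements to pass/fail attribute data."""
    return ["pass" if lower_spec <= x <= upper_spec else "fail" for x in measurements]

# Hypothetical shaft diameters in mm, with hypothetical spec limits 9.95 to 10.05.
diameters = [10.01, 9.93, 10.04, 10.07]
labels = to_attribute(diameters, 9.95, 10.05)
print(labels)  # ['pass', 'fail', 'pass', 'fail']
```

The labels cannot be turned back into the original diameters, which is why the reverse transformation is not possible.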


Measurement Scales P.229

Nominal, ordinal, interval, and ratio.

Nominal Scales P.229

Classify data into categories with no order implied.


Ordinal Scales P.229

Refer to positions in a series, where order is important but precise differences between values aren't defined. Example: bright, brighter, brightest.

Sometimes collected as discrete data but manipulated as continuous data and analyzed with parametric tests.


Interval Scales P.230

Scales have meaningful differences but no absolute zero, so ratios aren't useful.


Ratio Scales P.230

Scales have meaningful differences, and an absolute zero exists.


Sampling concepts P.231

-Representative sample (randomly drawn from the population)
-Bias (1. some members are less likely to be included than others, 2. nonrandom collection)
-Accuracy (how close the sample statistic is to the population value)
-Precision (how close estimates from different samples are to each other)
-Margin of error (maximum expected difference between the sample estimate and the population value)
-Sampling frame (complete list of the members of the target population)
-Strata (mutually exclusive segments of the population)
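
As a rough illustration of the difference between accuracy and precision, the sketch below measures accuracy as the distance between one sample estimate and the population value, and precision as the spread of estimates from several samples. The function names and the numbers are hypothetical, and using the sample standard deviation as the measure of spread is an assumption.

```python
import statistics

def accuracy_error(estimate, population_value):
    """Accuracy: how far one sample estimate is from the population value."""
    return abs(estimate - population_value)

def precision_spread(estimates):
    """Precision: how close estimates from different samples are to each other.
    Assumption: spread is measured with the sample standard deviation."""
    return statistics.stdev(estimates)

# Hypothetical population mean and means from four separate samples.
population_mean = 50.0
sample_means = [49.0, 51.0, 50.0, 50.0]

print(accuracy_error(sample_means[0], population_mean))  # 1.0
print(precision_spread(sample_means))
```

A smaller spread means better precision, even if every estimate is far from the population value (poor accuracy).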


Types of sampling P.232

Random, stratified, systematic, and block sampling.

Random sampling P.233

Every member of the population has an equal probability of being chosen and is selected independently of every other member.
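
A minimal sketch of drawing a random sample with Python's standard library. The function name and the population are hypothetical, and sampling without replacement is an assumption (the card does not say which variant is meant).

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; each member is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical population of 100 numbered units.
units = list(range(1, 101))
print(simple_random_sample(units, 5, seed=42))
```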


Stratified sampling P.233

Each member is assigned to exactly one stratum; the strata are mutually exclusive and collectively exhaustive. Stratification variables should create a heterogeneous set of strata.
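
One possible sketch of stratified sampling: members are grouped into strata, then a random sample is drawn from each stratum. Proportional allocation (the same fraction from every stratum) is an assumption, and the names `stratified_sample`, `stratum_of`, and the shift data are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(members, stratum_of, fraction, seed=None):
    """Group members into strata, then sample the same fraction from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in members:
        # Each member lands in exactly one stratum (mutually exclusive, exhaustive).
        strata[stratum_of(member)].append(member)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))  # at least one member per stratum
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical population: (shift, unit id) pairs, stratified by shift.
population = [("day", i) for i in range(10)] + [("night", i) for i in range(20)]
chosen = stratified_sample(population, lambda m: m[0], 0.5, seed=7)
print(len(chosen))  # 15
```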


Systematic sampling P.234

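Sampling from an ordered population at a specified sampling interval, i (also called interval sampling).

i = N/n, where N is the population size and n is the sample size.
Selected members: k, k+i, k+2i, ..., k+(n-1)i, where k is the starting member (1 ≤ k ≤ i).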
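
The selection rule can be sketched as follows. Choosing the start k at random and rounding the interval N/n down to a whole number are assumptions; the function name and the population are hypothetical. The code uses zero-based list positions, so k runs from 0 to i-1 rather than 1 to i.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick every i-th member of an ordered population, starting from a random k."""
    N = len(population)
    if not 1 <= n <= N:
        raise ValueError("sample size must be between 1 and the population size")
    i = N // n                     # sampling interval, N/n rounded down
    k = random.Random(seed).randrange(i)  # zero-based random start in [0, i)
    return [population[k + j * i] for j in range(n)]  # k, k+i, ..., k+(n-1)i

# Hypothetical ordered population of 100 numbered units, sample size 10.
units = list(range(1, 101))
print(systematic_sample(units, 10, seed=3))
```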


Block sampling P.235

Non-probability sampling, also called judgment block sampling. Once a block is defined, the balance of the block is automatically chosen.


Factors influencing sample size P.236

-Precision and accuracy required
-Resources available
-Nature of the study (time, geographical)
-Sampling design adopted
-Size and characteristics of the population
-Confidence and power required
-Methods of obtaining the sample


Determining sample size P.236

-As many as possible
-For correlational, experimental design, and causal-comparative studies: 30 per group.
-Population <100: sample the entire population
-Population ~500: sample 50% of the population
-Population ~1500: sample 20% of the population
-Population >5000: sample 400 members


Sample size calculation when the population is a known quantity P.237

n = N / (1 + N e^2)

N = population size
n = sample size
e = margin of error (approximately 1/√n when N is large)
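
A short worked sketch, reading the formula as n = N / (1 + N e^2). Rounding the result up to the next whole member is an assumption, and the function name and the example figures are hypothetical.

```python
import math

def sample_size(N, e):
    """Sample size for a known population N and margin of error e (as a proportion)."""
    if N <= 0 or e <= 0:
        raise ValueError("N and e must be positive")
    # Assumption: round up so the margin of error is not exceeded.
    return math.ceil(N / (1 + N * e ** 2))

# Hypothetical population of 1000 with a 5% margin of error:
# 1000 / (1 + 1000 * 0.0025) = 1000 / 3.5 = 285.7...
print(sample_size(1000, 0.05))  # 286
```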


Good sampling design P.237

-Samples may be drawn randomly
-Sample is representative of the population
-Design is economical and efficient
-Sufficiently easy to conduct
-Standard error is minimal
-Bias is minimal
-Statistical assumptions are not violated
-Inferences can be made from the sample with the desired level of confidence
-Selection criteria are objective.


Data Collection: Operational Definition P.239

1. A clear, concise, and unambiguous statement that provides a unified understanding of the data for all involved before the data are collected or the metric is developed.
2. Defines the formula that will be used and each term used in the formula.
3. Provides an interpretation of the metric, such as up is good.


Common causes of poor data accuracy P.241

-Multiple points of entry
-Limited use of data validation
-Batching input versus real-time input/ synchronization
-Multiple means of data correction access to upstream systems
-Unclear directions
-Lack of training
-Ambiguous terminology
-Manual versus automated means of entry
-Order of calculations
-Calculation errors
-Inadequate measurement systems
-Units of measure not defined
-Failing to use the proper measurement system
-Field inconsistencies/ truncation
-Similarities of characters (0/O)
-Copying errors


Useful data collection techniques P.243

-Walk the process first
-Make the data collection process simple
-Define collection points
-Use check sheets and checklists
-Bridge computer gaps
-Minimize "other" (category)
-Limit options
-Establish the rules
-Address timing and sequencing
-Time-stamp data
-Red-Tag a unit of product or service
-Chart data accuracy
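
Several of these techniques (check sheets, limited options, time-stamped data) can be combined in a small electronic check sheet. This is only a sketch: the category list, the function names, and the choice of UTC timestamps are all hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical limited option list; there is deliberately no "other" category.
ALLOWED = ("scratch", "dent", "misalignment")

def record(log, category, when=None):
    """Append one time-stamped observation; reject categories outside the list."""
    if category not in ALLOWED:
        raise ValueError(f"unknown category: {category!r}")
    log.append((when or datetime.now(timezone.utc), category))

def tally(log):
    """Check-sheet style count of observations per category."""
    return Counter(category for _, category in log)

log = []
for observed in ["dent", "scratch", "dent"]:
    record(log, observed)
print(tally(log)["dent"])  # 2
```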


Common data collection points

-End of process
-In process
-Points of convergence
-Points of divergence
-Across functional boundaries (hand-off points)


Data cleaning P.245

The process of detecting, correcting, and possibly removing inaccurate data from a data set (usually through human means).
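
Although cleaning is usually done by people, a small automated pass can support the review. The sketch below detects non-numeric readings, corrects a letter O typed in place of a zero, and sets aside anything that still cannot be parsed. The function name and the readings are hypothetical.

```python
def clean_readings(raw):
    """Return (cleaned, rejected): fix O/0 mix-ups, set aside values that still fail."""
    cleaned, rejected = [], []
    for value in raw:
        # Correct: strip stray spaces and replace the letter O with the digit 0.
        candidate = value.strip().replace("O", "0").replace("o", "0")
        try:
            cleaned.append(float(candidate))
        except ValueError:
            # Detect: keep the original value for manual review rather than guessing.
            rejected.append(value)
    return cleaned, rejected

print(clean_readings(["10.5", "1O.2", " 9.8 ", "n/a"]))
# ([10.5, 10.2, 9.8], ['n/a'])
```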


Coding the data P.256

-Helpful to code data to simplify the recording process
-Sometimes useful to code data by using an algebraic transformation
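
One way to code data with an algebraic transformation is to subtract a base value and multiply by a scale factor, so that long decimals become small whole numbers that are easier to record. The base, scale, readings, and function names below are hypothetical.

```python
def code_value(x, base, scale):
    """Code a reading as a small integer: (x - base) * scale, rounded."""
    return round((x - base) * scale)

def decode_value(c, base, scale):
    """Reverse the coding to recover the original reading."""
    return base + c / scale

# Hypothetical thickness readings in inches, coded as ten-thousandths above 0.5000.
readings = [0.5012, 0.5009, 0.5015]
coded = [code_value(x, 0.5000, 10000) for x in readings]
print(coded)  # [12, 9, 15]
```

Unlike converting variable data to attribute data, this coding is reversible as long as the base and scale are recorded with the data.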


Collecting seasonal data P.247

-Understand the nuances of the data before undertaking a major data collection effort.


Data collection strategies P.249

1. Observational studies
2. Monitoring techniques
3. Process capability
4. Measurement assessment
5. Sampling
6. Design of experiments
7. Complementary data and information