Mid-term Flashcards
(43 cards)
What is data
The facts and figures collected, analyzed, and summarized for presentation and interpretation.
What is quantitative data
Data are considered quantitative if they are numeric and arithmetic operations, such as addition, subtraction, multiplication, and division, can be performed on them.
What is categorical data
If arithmetic operations cannot be performed on the data, they are categorical data.
The data are separated into groups or classes, like the regions of a Venn diagram.
Variable
A variable is a characteristic or a quantity of interest that can take on different values.
What is variation
Variation is the difference in a variable measured over observations.
What is cross-sectional data
Cross-sectional data are data collected from several entities at the same, or approximately the same, point in time.
What is time series data
Time series data are data collected over several time periods.
What is random sampling
Give an example
Random sampling is a sampling method that allows gathering a representative sample from the population data.
Example: age, education, location, income
Sample data
A sample is a subset of the population; the population consists of all the elements of interest.
Working from a sample rather than the whole population leads to uncertainty.
Example: What is your sample? (816 voters)
Experimental study
A variable of interest is first identified.
Then, one or more other variables are identified and controlled or manipulated to obtain data about how they influence the variable of interest.
Statistical data
Data necessary to analyze a business problem can often be obtained with a statistical study.
Statistical studies can be classified as experimental or observational.
Observational or nonexperimental study
An observational study does not attempt to control the variables of interest.
A survey is perhaps the most common type of observational study.
Frequency distribution
A frequency distribution is a summary of data showing the number (frequency) of observations in each of several nonoverlapping classes, typically referred to as bins.
Example: 50 soft drinks are distributed over 5 types of soft drinks.
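A minimal Python sketch of building a frequency distribution by counting. The drink names and the short list of purchases are made up for illustration; they are not the 50-drink data set from the example.

```python
from collections import Counter

# Hypothetical observations (made-up data, one entry per soft drink).
purchases = [
    "cola", "diet cola", "cola", "lemon-lime", "root beer",
    "cola", "diet cola", "orange", "cola", "lemon-lime",
]

# Each distinct drink type is one bin; its count is the frequency.
frequency = Counter(purchases)

# Print the bins from most to least frequent.
for drink, count in frequency.most_common():
    print(f"{drink:<10} {count}")
```

The frequencies always add up to the number of observations, which is a quick check that no observation was dropped or counted twice.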
Relative frequency
State the equation
For a data set with n observations, the relative frequency of each bin can be determined as follows:
Relative frequency of a bin = frequency of the bin / n
Relative frequency distribution
A relative frequency distribution is a tabular summary (any summary that uses a table) of data showing the relative frequency for each bin.
Percent frequency distribution
A percent frequency distribution summarizes the percent frequency (relative frequency multiplied by 100) of the data for each bin.
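A short Python sketch of turning bin frequencies into relative and percent frequencies. The bin names and counts are made-up illustration data, and the percent frequency is taken to be the relative frequency times 100.

```python
# Hypothetical bin frequencies (made-up counts for illustration).
frequency = {"cola": 4, "diet cola": 2, "lemon-lime": 2, "root beer": 1, "orange": 1}

n = sum(frequency.values())  # number of observations

# Relative frequency of a bin = frequency of the bin / n
relative = {name: count / n for name, count in frequency.items()}

# Percent frequency = relative frequency * 100 (assumed standard definition)
percent = {name: rel * 100 for name, rel in relative.items()}

# Tabular summary: bin, frequency, relative frequency, percent frequency.
for name in frequency:
    print(f"{name:<10} {frequency[name]:>3} {relative[name]:>6.2f} {percent[name]:>6.1f}%")
```

The relative frequencies should sum to 1 and the percent frequencies to 100, apart from small floating-point rounding.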
Bin width formula
BW = (largest data value - smallest data value) / number of bins (always round up)
Bin limits calculation:
(MIN, MIN+BW), (UB1, UB1+BW), (UB2, UB2+BW), where UB1 and UB2 are the upper bounds of the first and second bins
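The bin width and bin limits can be sketched in Python as follows. The function name bin_limits and the data values are hypothetical, and "always round up" is assumed here to mean rounding up to the next whole number.

```python
import math

def bin_limits(data, num_bins):
    """Return the bin width and a list of (lower, upper) limits."""
    smallest, largest = min(data), max(data)
    # BW = (largest - smallest) / number of bins, rounded up
    # (assumption: round up to the next whole number).
    width = math.ceil((largest - smallest) / num_bins)
    limits = []
    lower = smallest
    for _ in range(num_bins):
        upper = lower + width          # UB = lower bound + BW
        limits.append((lower, upper))
        lower = upper                  # next bin starts at this upper bound
    return width, limits

# Hypothetical data set (made-up values for illustration).
data = [12, 15, 21, 22, 27, 33, 34, 38, 41, 49]
width, limits = bin_limits(data, 4)
print(width)
print(limits)
```

Here (49 - 12) / 4 = 9.25, which rounds up to a width of 10, so the last bin's upper limit (52) lies above the largest value and every observation falls inside some bin.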
What kind of data does a histogram use
A histogram is a common graphical presentation of quantitative data.
What is a histogram
A histogram is a column chart with no spaces between the columns, whose heights represent the frequencies of the corresponding bins.
Frequency polygon
A frequency polygon is useful for comparing quantitative distributions.
- A frequency polygon uses lines to connect the frequency counts of observations from different bins.
Cumulative frequency distributions
A cumulative frequency distribution is a variation of the frequency distribution that provides another tabular summary of quantitative data.
- It uses the number of classes, class widths, and class limits developed for the frequency distribution.
- It shows the number of data items with values less than or equal to the upper class limit of each class.
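A minimal Python sketch of a cumulative frequency distribution, built as a running total of the class frequencies. The upper class limits and frequencies below are made up for illustration.

```python
from itertools import accumulate

# Hypothetical classes (made-up values): the upper limit of each class
# and the number of observations that fall in that class.
upper_limits = [22, 32, 42, 52]
frequencies = [3, 2, 4, 1]

# Running total: number of data items <= the upper limit of each class.
cumulative = list(accumulate(frequencies))

for limit, total in zip(upper_limits, cumulative):
    print(f"<= {limit}: {total}")
```

The last cumulative frequency always equals the total number of observations.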
Skewness
Skewness, or lack of symmetry, is an important characteristic of the shape of a distribution.
Skewed left = most of the data sit on the right, with a longer tail toward the left
Skewed right = most of the data sit on the left, with a longer tail toward the right
Symmetrical = the left and right sides mirror each other (the data look like a pyramid)
A distribution can be highly or moderately skewed, depending on how far it departs from symmetry.
Mean
equation
The most common measure of central location is the mean, the average of all the data values.
The population mean is denoted by the Greek letter 𝜇.
Mean (average) = sum of all data values / number of data values
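The mean calculation as a short Python sketch. The data values are made up for illustration.

```python
# Hypothetical data values (made-up numbers for illustration).
values = [4, 8, 15, 16, 23, 42]

# Mean = sum of all data values / number of data values
mean = sum(values) / len(values)

print(mean)
```

For these six values the sum is 108, so the mean is 108 / 6 = 18.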
Mode
The mode of a data set is the value that occurs with the greatest frequency (the number that is seen the most).
The greatest frequency may occur at two or more different values. In these instances, more than one mode exists.
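A short Python sketch of finding the mode, including the case with more than one mode. It uses statistics.multimode from the standard library (Python 3.8 or later); the data values are made up for illustration.

```python
from statistics import multimode

# Hypothetical data sets (made-up numbers for illustration).
one_mode = [1, 2, 2, 3, 4]
two_modes = [1, 2, 2, 3, 3, 4]

# multimode returns every value tied for the greatest frequency,
# in the order the values first appear in the data.
print(multimode(one_mode))   # [2]
print(multimode(two_modes))  # [2, 3]
```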