MID TERM REVIEW Flashcards Preview

1P97 ITIS - Data Analysis & Modelling > MID TERM REVIEW > Flashcards

Flashcards in MID TERM REVIEW Deck (69):

What is quantitative analysis?

A scientific approach to managerial decision making in which raw data are processed and manipulated to produce meaningful information



What is a model? What are attributes of a model?

A simplified representation or abstraction of reality. Can be physical, analog, or mathematical


What are the components of a Mathematical Model?


1. The parameters (uncontrollable factors)

2. Independent or decision variables (controllable factors)

3. A mathematical relationship between variables

4. Dependent Variables


What is an independent or decision variable?

The variables that we are going to test to see if they have an effect on the dependent variable.


Example: When determining the effects of budget expenditures, the budget lines would be the independent variables


The x's


What is a dependent variable?


The variable we want to see is affected by the independent variables. It is what you are modelling to study or forecast.


For example: If determining budget effects, the dependent variable would be the specific aspect you want to check for an effect.


The y


What are 6 benefits of Mathematical Modelling?

Models let us: TARMST

- Compress Time

- Manipulate the model more easily than a real system and reduce the cost of experimentation and mistakes.

- Consider risk

- Help us learn about symptoms

- Encourage rigorous thinking

- Analyze a large number of solutions.


What are 6 drawbacks of Mathematical Modelling?


- Can be time consuming and expensive

- Managers must agree to accept results

- Data collection can be expensive or impossible

- It may be difficult to assess uncertainties

- Oversimplification may produce poor results

- Common perception that if it is done on a computer, it must be correct


What are two types of models? What do they do?

Prescriptive models: Find the best or optimal solution

- Enumeration, or algorithm


Descriptive models will: COPS

- Characterize things as they are

- Outcomes and consequences investigation

- Predict the behaviour of systems in certain situations

- Solutions are not necessarily optimal


What type of analytics/model involves the study and consolidation of historical data?

Descriptive analytics


What type of analytics/model involves the forecasting of future outcomes based on patterns in the past data?

Predictive analytics



What type of analytics/model involves the use of optimization methods?

Prescriptive analytics


What are the 7 steps of the quantitative analysis approach?


1. Defining the problem

2. Developing a model

3. Acquiring input data

4. Developing a solution

5. Testing the solution

6. Analyzing the results

7. Implementing the results


What is a deterministic model?

A mathematical model that does not involve risk or chance. All values used in the model are known with complete certainty


What is a probabilistic model?

A mathematical model that involves risk or chance. Values used are estimates based on probabilities.


What is a break even analysis? What is the equation for Break even analysis?

The quantity of sales that results in zero profit.


Break even point = fixed costs / (selling price per unit - variable cost per unit)


What is the profit equation?

Profit = (selling price per unit)(x) - fixed costs - (variable cost per unit)(x), where x is the number of units sold
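
The two formulas above can be checked with a small sketch. The function names and the dollar figures are assumptions made up for illustration, not part of the course material.

```python
def break_even_units(fixed_costs, price_per_unit, variable_cost_per_unit):
    """Units that must be sold for profit to equal zero."""
    margin = price_per_unit - variable_cost_per_unit
    if margin <= 0:
        raise ValueError("price per unit must exceed variable cost per unit")
    return fixed_costs / margin


def profit(units, fixed_costs, price_per_unit, variable_cost_per_unit):
    """Profit = revenue - fixed costs - total variable costs."""
    return price_per_unit * units - fixed_costs - variable_cost_per_unit * units


# Hypothetical figures: 1000 in fixed costs, price of 10, variable cost of 5 per unit.
bep = break_even_units(1000, 10, 5)
print(bep)                       # 200.0
print(profit(bep, 1000, 10, 5))  # 0.0
print(profit(300, 1000, 10, 5))  # 500
```

Selling more than the break-even quantity gives a positive profit; selling fewer gives a loss.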


What is time series data?

A series of data points indexed in time order. Usually collected at discrete and equally spaced intervals.


What is cross-sectional data?

Observations that come from a group or individual at a single point in time. Is discrete


What is Categorical Data? What are types of this data?

Data representing groups. This data can be nominal (names or titles; there is no intrinsic ordering) or ordinal (the categories have a natural order and can be ranked)


What is Numerical data? What are two types of this data?

Data expressed as numbers that can be measured. Can be discrete (takes only separate, countable values) or continuous (can take any value within a range)


Example. The number of people in a room is discrete numerical. The time they arrive in the room is continuous numerical because time can be measured to any number of decimal places.


What are the measures of central tendency? What do they each measure?

Mean - Average, add all numbers and divide by total numbers


Mode - the number that occurs the most often


Median - the middle value when the numbers are sorted in order


What are typical measures of variation?


Range - the difference between the max value and the min value of a number set

Variance - the average of the squared differences from the mean

Standard deviation - square root of variance - measures distance from center
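
As a rough illustration of the measures of central tendency and variation, here is a sketch using Python's standard `statistics` module. The data values are made up for the example.

```python
import statistics

data = [2, 4, 4, 5, 7, 8]  # hypothetical sample values

mean = statistics.mean(data)             # sum of the values divided by the count
median = statistics.median(data)         # middle of the sorted data (average of 4 and 5)
mode = statistics.mode(data)             # most frequent value
value_range = max(data) - min(data)      # largest value minus smallest value
variance = statistics.pvariance(data)    # average squared difference from the mean
std_dev = statistics.pstdev(data)        # square root of the variance above
sample_std_dev = statistics.stdev(data)  # divides by n - 1, like Excel's STDEV.S

print(mean, median, mode, value_range, variance, std_dev, sample_std_dev)
```

`pvariance` and `pstdev` treat the data as a whole population, which matches the "average squared difference" definition on the card; `stdev` is the sample version.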


What are three common statistical displays?

Bar charts - represent categorical data in bars. Used to compare variables

Histograms - represent quantitative variables in bins. Used to show distribution of variables

Pareto Charts - bar charts that sort categories from most to least frequent so the most important ones stand out.

Use of bin numbers - assists with organizing variables into bins.


What is a simple definition of an experiment? What are terms associated with experiments?

An experiment is a scientific procedure undertaken to make a discovery, test a hypothesis, or demonstrate a known fact.


An experiment will have an outcome (result)


The sample space is all possible outcomes.


Results or outcomes can be grouped into events (an outcome or collection of outcomes)


What are the 3 types of events of an experiment? Describe each

Mutually exclusive - A situation in which only one event can occur on any given trial or experiment


Collectively exhaustive - A collection of all possible outcomes of an experiment


Independent - A situation in which the occurrence of one event has no effect on the probability of occurrence of a second event.


What is the multiplication rule in Probability?

It involves determining the probability of event A and B occurring when one item happens after the other. Example: What is the probability of picking A and then picking B?


If they are dependent events (i.e. no replacement), then P(A and B) = P(A) * P(B given A has occurred), or equivalently P(B) * P(A given B has occurred)


If they are independent events (with replacement) then P(A and B) = P(A) * P(B)
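
A small sketch of both cases of the multiplication rule, assuming a standard 52-card deck as the example (the deck is not part of the original card). Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Hypothetical example: drawing two aces from a standard 52-card deck.
p_first_ace = Fraction(4, 52)

# Dependent events (no replacement): after one ace is drawn,
# 3 aces remain among 51 cards.
p_second_ace_given_first = Fraction(3, 51)
p_two_aces_no_replacement = p_first_ace * p_second_ace_given_first

# Independent events (with replacement): the deck is restored
# before the second draw, so the probability is unchanged.
p_two_aces_with_replacement = p_first_ace * p_first_ace

print(p_two_aces_no_replacement)    # 1/221
print(p_two_aces_with_replacement)  # 1/169
```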


What is the Addition Rule for probability?

For determining the probability that event A or event B occurs. Example: What is the probability of picking A or picking B?


If they are mutually exclusive (the two events cannot both occur on the same trial) then P(A or B) = P(A) + P(B)


If they are non-mutually exclusive events (both events can occur on the same trial) then P(A or B) = P(A) + P(B) - P(A and B)
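
The addition rule can be sketched the same way. The card-deck events below are assumptions chosen for illustration.

```python
from fractions import Fraction

deck = 52  # hypothetical example: one card drawn from a standard deck
p_king = Fraction(4, deck)
p_queen = Fraction(4, deck)
p_heart = Fraction(13, deck)
p_king_and_heart = Fraction(1, deck)  # only the king of hearts is both

# Mutually exclusive: a single card cannot be both a king and a queen.
p_king_or_queen = p_king + p_queen

# Not mutually exclusive: subtract the overlap so it is not counted twice.
p_king_or_heart = p_king + p_heart - p_king_and_heart

print(p_king_or_queen)  # 2/13
print(p_king_or_heart)  # 4/13
```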


If your Data is discrete in nature what probability distributions could be utilized?

Uniform -

Binomial -

Poisson -


If your data is continuous what probability distributions could be used?

Uniform -

Exponential -

Normal -


What is the Chi-square goodness of fit?

A chi-square goodness-of-fit test checks whether a set of observed values (frequency distribution) fits the expected frequencies


What are the variables of a chi-square test?

K = number of categories

K - 1 = degree of freedom

α (alpha) = the level of significance, e.g. .01 (99% confidence) or .05 (95% confidence)

X0^2 = critical value - this is looked up from df and alpha

The claim = H0, the null hypothesis (from an outside source or expected values)

The alternative = H1, the alternate hypothesis (from observed values, maybe from your own study)

The test stat = sum of [(observed frequency - expected frequency)^2 / expected frequency]

If the test stat is in the rejection region (greater than the critical value), then we reject H0
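
The decision step can be sketched as follows. The function name is hypothetical, and the critical values are the standard chi-square table entries for alpha = .05 only; in practice they would be read from a table or computed by software for the alpha in use.

```python
# Critical values for alpha = 0.05, copied from a standard chi-square table,
# keyed by degrees of freedom.
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def reject_h0(test_stat, k, critical_values=CRITICAL_VALUES_05):
    """Return True when the test statistic falls in the rejection region."""
    df = k - 1  # degrees of freedom = number of categories - 1
    return test_stat > critical_values[df]


# Hypothetical test statistics for a study with k = 4 categories (df = 3).
print(reject_h0(9.2, 4))  # True: 9.2 exceeds 7.815, so reject H0
print(reject_h0(2.0, 4))  # False: 2.0 is below 7.815, so do not reject H0
```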


What is the equation of a regression line?

Y = a + bx


What is slope of a regression line? How do you calculate it?



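b = (y2 - y1) / (x2 - x1), the change in Y for a one-unit change in x, using any two points (x1, y1) and (x2, y2) on the regression line. When fitting the line to data by least squares, b = sum of (x - mean of x)(y - mean of y) / sum of (x - mean of x)^2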
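
A minimal sketch of fitting Y = a + bx by least squares, using the standard textbook formulas. The function name and data points are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = y_mean - b * x_mean
    return a, b


# Hypothetical data set.
a, b = least_squares_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(a, 4), round(b, 4))  # 2.2 0.6
```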


What are the three common pitfalls associated with using regression analysis?




-Inferring Cause and effect


What are the four steps in the regression process?

regpro - ITSA

- Identify potential independent variables

- Transform if necessary

- Select the variables to be used (forward selection or backward elimination)

- Analyze the residuals


In the ANOVA table what is the meaning of the regression stats? Ie. R, R^2 etc...

C - Multiple R = coefficient of correlation - the measure of the strength of the relationship between variables. Equal to the square root of R^2


D - R^2 = Coefficient of determination - The percent of the variability in the dependent variable (Y) that is explained by the regression equation. Higher the better. Calculated by dividing SSR by SST


Adjusted R Square - A measure of the explanatory power of a regression model that takes into consideration the number of independent variables in the model.


In the ANOVA table. Explain the df column.

Df - degree of freedom

Regression DF = number of independent variables

Residual DF = total df - df regression

Total DF = total observations - 1 (n-1)


In the ANOVA table, explain the SS column.

SSR (on top), SSE(middle), and SSTotal (SSR and SSE sum) are results from the regression analysis, they inform other columns


SSR / SST = R^2


In the ANOVA table, Explain the MS column

MSR Regression = SSR / DF regression

MSE Residual = SSE / DF residual


In the ANOVA table, how is F calculated? What is it?



F = MSR / MSE. This is the test statistic. It informs Significance F
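
The df, SS, MS and F columns can be reproduced by hand for a one-variable regression. This is only a sketch with made-up data and a hypothetical function name; Excel's regression output also reports Significance F, which needs the F distribution and is not computed here.

```python
def anova_table(xs, ys):
    """Compute ANOVA quantities for a simple regression Y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    b = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
    a = y_mean - b * x_mean
    predicted = [a + b * x for x in xs]

    ssr = sum((p - y_mean) ** 2 for p in predicted)         # explained variation
    sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # unexplained variation
    sst = ssr + sse                                         # total variation

    df_regression = 1               # one independent variable
    df_total = n - 1
    df_residual = df_total - df_regression

    msr = ssr / df_regression
    mse = sse / df_residual
    return {"SSR": ssr, "SSE": sse, "SST": sst,
            "MSR": msr, "MSE": mse,
            "F": msr / mse, "R2": ssr / sst}


# Hypothetical data set.
table = anova_table([1, 2, 3, 4], [2, 3, 5, 6])
print({name: round(value, 4) for name, value in table.items()})
```

For this data the sketch gives SSR = 9.8, SSE = 0.2, F = 98 and R^2 = 0.98 (up to floating-point rounding).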


In the ANOVA table, what is Significance F? What does it mean?

It is the probability of getting a value of F larger than the one calculated. The lower the Sig F the better and more statistically significant the model is.


Ex. It should be less than your stated level of significance (.05 for 95% confidence, .01 for 99% confidence, etc.)


In the ANOVA table, explain the Coefficients column

Intercept is the a value of the y=a + bx


The lines that follow are the b variables values


What is the Standard Error column in the ANOVA table?

The standard deviation of the errors, sometimes called the standard deviation of the regression.


Error is the difference between the actual Y value and the predicted value. Residual is another term for error.


In the ANOVA table what is the P-value column?

This is, by variable, the probability of getting a t stat larger in magnitude than the one calculated. Lower is better here. A high value may lead to rejection of a variable (in concert with the variable's effect on the Adjusted R square value)


What is the purpose of the Excel function BINOM.DIST? What are the variables to enter?

It returns the binomial distribution probability: the probability of a specific number of successes occurring from a specific number of independent trials.


Number_s = the number of successes

Trials = the number of trials

Probability_s = the probability of success

Cumulative = true (use cumulative dist) or false (use probability)
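
To see what the four arguments do, here is a rough Python equivalent built from the binomial formula. The function `binom_dist` is a hypothetical stand-in written for this example, not Excel's implementation.

```python
from math import comb


def binom_dist(number_s, trials, probability_s, cumulative):
    """Binomial probability with the same argument order as BINOM.DIST."""
    def pmf(k):
        # Probability of exactly k successes in `trials` independent trials.
        return (comb(trials, k)
                * probability_s ** k
                * (1 - probability_s) ** (trials - k))

    if cumulative:
        # Probability of at most number_s successes.
        return sum(pmf(k) for k in range(number_s + 1))
    return pmf(number_s)


# Hypothetical example: 4 fair coin tosses, "success" = heads.
print(binom_dist(2, 4, 0.5, False))  # 0.375  (exactly 2 heads)
print(binom_dist(2, 4, 0.5, True))   # 0.6875 (at most 2 heads)
```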



What does the LINEST function calculate?

The LINEST function calculates the statistics for a line by using the "least squares" method to calculate a straight line that best fits your data, and then returns an array that describes the line.


What does NORM.INV calculate in Excel?

The Excel NORM.INV function (NORMINV in older versions) calculates the inverse of the cumulative normal distribution function for a supplied probability and a supplied distribution mean and standard deviation.


What is the syntax of the NORM.INV function?

NORM.INV( probability, mean, standard_dev )


Probability - the probability (between 0 and 1) at which you want to evaluate the inverse function

Mean - the arithmetic mean of the distribution

Standard_dev - the standard deviation of the distribution
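
For comparison outside Excel, Python's standard library offers the same calculation through `statistics.NormalDist`. The wrapper name `norm_inv` and the example figures are assumptions for illustration.

```python
from statistics import NormalDist


def norm_inv(probability, mean, standard_dev):
    """Value x such that P(X <= x) equals `probability` for a normal X."""
    return NormalDist(mu=mean, sigma=standard_dev).inv_cdf(probability)


# Hypothetical distribution with mean 100 and standard deviation 15.
print(norm_inv(0.5, 100, 15))           # 100.0, half the area lies below the mean
print(round(norm_inv(0.975, 0, 1), 2))  # 1.96, the familiar 95% z-value
```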


What does NORM.S.DIST function calculate?

Returns the standard normal distribution, i.e. the normal distribution with a mean of 0 and a standard deviation of 1.


What does STDEV.S calculate?

Estimates the standard deviation based on a sample. The standard deviation is a measure of how widely values are dispersed from the average value (the mean)


What does TREND calculate?

Returns values along a linear trend. Fits a straight line (using the method of least squares) to the arrays known_y's and known_x's. Returns the y-values along that line for the array of new_x's that you specify.


What does EXPON.DIST calculate?

Returns the exponential distribution. Use EXPON.DIST to model the time between events, such as how long an automated bank teller takes to deliver cash. For example, you can use EXPON.DIST to determine the probability that the process takes at most 1 minute.


What is the syntax for EXPON.DIST?



EXPON.DIST(x, lambda, cumulative)


X - the value of the function

Lambda - the parameter value

Cumulative - true or false
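
A short sketch of what the three arguments compute, using the standard exponential distribution formulas. `expon_dist` is a hypothetical helper named after the Excel function, and the rate of 2 per minute is made up.

```python
from math import exp


def expon_dist(x, lam, cumulative):
    """Exponential distribution with rate parameter lam (lambda)."""
    if cumulative:
        # Probability that the time between events is at most x.
        return 1 - exp(-lam * x)
    # Probability density at x.
    return lam * exp(-lam * x)


# Hypothetical example: a process averaging 2 completions per minute.
print(round(expon_dist(1, 2, True), 4))  # 0.8647, chance it takes at most 1 minute
print(expon_dist(0, 2, False))           # 2.0, the density at x = 0 equals lambda
```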


What is the Poisson dist?

A common application of the Poisson distribution is predicting the number of events over a specific time, such as the number of cars arriving at a toll plaza in 1 minute.
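
The toll plaza example can be sketched with the Poisson formula. `poisson_dist` is a hypothetical helper that mirrors the argument order of POISSON.DIST, and the mean of 3 cars per minute is an assumed figure.

```python
from math import exp, factorial


def poisson_dist(x, mean, cumulative):
    """Poisson probability for x events when `mean` events are expected."""
    def pmf(k):
        # Probability of exactly k events in the interval.
        return exp(-mean) * mean ** k / factorial(k)

    if cumulative:
        # Probability of at most x events.
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)


# Hypothetical example: an average of 3 cars arrive per minute.
print(round(poisson_dist(2, 3, False), 3))  # 0.224, exactly 2 cars in a minute
print(round(poisson_dist(2, 3, True), 3))   # 0.423, at most 2 cars in a minute
```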


What is the syntax of POISSON?


The POISSON.DIST function syntax has the following arguments:

X     Required. The number of events.

Mean     Required. The expected numeric value.

Cumulative     Required. A logical value that determines the form of the probability distribution returned. If cumulative is TRUE, 


What is the NORM.DIST function in excel?

It returns the normal distribution for the specified mean and standard deviation. This function has a very wide range of applications in statistics, including hypothesis testing.
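
As a rough Python counterpart, `statistics.NormalDist` provides both the cumulative probability and the density. The wrapper `norm_dist` and its argument names are assumptions for this example.

```python
from statistics import NormalDist


def norm_dist(x, mean, standard_dev, cumulative):
    """Normal distribution value at x for the given mean and standard deviation."""
    dist = NormalDist(mu=mean, sigma=standard_dev)
    # cumulative=True gives P(X <= x); cumulative=False gives the density at x.
    return dist.cdf(x) if cumulative else dist.pdf(x)


# Hypothetical distribution with mean 100 and standard deviation 15.
print(norm_dist(100, 100, 15, True))            # 0.5
print(round(norm_dist(115, 100, 15, True), 4))  # 0.8413, one standard deviation above the mean
```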


What is the syntax for the NORM.DIST function?



You are modelling the expected effect of an increase in the advertising budget on sales of new cars at your dealership. You are aware that your model will be affected by the local unemployment rate and a rumored move out of the area by a major employer. In this model the independent variable is? Why?

The advertising budget. Because you want to see the effects of the ad budget on sales (the dependent variable)


What is sensitivity analysis?

A process that involves determining how sensitive a solution is to changes in the formulation of a problem


The number of aces at a tennis match is what kind of data?



The probability of heads or tails on a coin toss is what kind of event?




Of two events, if the events are ______, then their joint occurrence cannot occur

Mutually exclusive


When events have only two possible outcomes, what distribution is useful?



What is an assumption about the residuals of regression analysis?

That the variance of the residuals is constant


What does R^2 tell us about linear relationship?

The proportion of the variance of the dependent variable that can be explained by the regression equation


If R^2 = 0, then there is no relationship between the dependent variable and any of the independent variables


The closer R^2 gets to 1, the stronger the relationship is


What does the Significance F value tell us?

Whether or not to reject our null hypothesis. If the Significance F is less than our desired level of significance, we can reject the null hypothesis (H0)


What does the P-value tell us?

If the p-value is less than our desired level of significance then we can reject the null hypothesis


How do you do a CHI square test?

Calculate the chi-square statistic X^2 by completing the following steps: SOE, SQUARE, DIVIDE, SUM - SSDS

1. For each observed number in the table subtract the corresponding expected number (O - E).

2. Square the difference [ (O - E)^2 ].

3. Divide the squares obtained for each cell in the table by the expected number for that cell [ (O - E)^2 / E ].

4. Sum all the values for (O - E)^2 / E. This is the chi-square statistic.
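
The four steps above can be sketched one line per step. The observed and expected counts are made up for illustration.

```python
# Hypothetical counts for four categories.
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]

# Step 1: subtract the expected count from each observed count (O - E).
differences = [o - e for o, e in zip(observed, expected)]

# Step 2: square each difference.
squares = [d ** 2 for d in differences]

# Step 3: divide each square by the expected count for that cell.
ratios = [s / e for s, e in zip(squares, expected)]

# Step 4: sum the ratios to get the chi-square statistic.
chi_square = sum(ratios)

print(differences)           # [-7, -3, -5, 15]
print(squares)               # [49, 9, 25, 225]
print(round(chi_square, 2))  # 12.32
```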


What are the rules of probability? 

The addition rule and the multiplication rule