QMB 3200 Flashcards

(110 cards)

1
Q

Data

A

the facts & figures collected, analyzed, and summarized for presentation and interpretation.

2
Q

Dataset

A

all the data collected for a particular analysis.

3
Q

Element

A

the entity on which data is collected.

4
Q

Variable

A

a characteristic of interest of an element.

5
Q

Observation

A

the variables associated with an individual element.

6
Q

Categorical

A

use labels or names to identify categories of like items; the scale of measurement is nominal or ordinal.

7
Q

Quantitative

A

use numeric values that indicate how much or how many.

8
Q

What does the type of statistical analysis depend on?

A

The type of statistical analysis depends on whether the variable is categorical or quantitative.

9
Q

Cross Sectional

A

data collected at the same (or approximately the same) point in time.

10
Q

Time Series

A

data collected over several time periods.

11
Q

Panel

A

combination of cross-sectional and time series data.

12
Q

Descriptive Statistics

A

tabular, graphical, and numerical summaries used to describe data or variables.

13
Q

Population

A

is the set of all elements (all the data) of interest in a statistical analysis.

14
Q

Sample

A

is a subset of the population.

15
Q

Statistical Analysis

A

uses data from a sample to make estimates and test hypotheses about the characteristics of a population.

16
Q

Row

A

the first row of a dataset contains the variable names.

17
Q

Column

A

the first column of a dataset identifies the elements.

18
Q

Average Formula

A

=average(A:A)

19
Q

Median Formula

A

=median(A:A)

20
Q

Analytics

A

is the scientific process of transforming data into insight for making better decisions.

21
Q

Descriptive Analytics

A

analytical techniques that describe what has happened in the past.

22
Q

Predictive Analytics

A

uses statistical models built from past data to predict the future [forecasting] or to assess the impact of one variable on another [inference].

23
Q

Prescriptive Analytics

A

uses models seeking to find a best (optimal) solution. Often these are some type of optimization model.

24
Q

Differences between Data and Big Data

A

Volume – the number of observations.
Velocity – the speed at which data is collected.
Variety – the different types and forms of data collected.
Veracity – the reliability of the data generated.
*The focus is on extracting predictive information from big data.

25
Frequency Distribution

a tabular summary of data showing the number (i.e., frequency) of observations in each of several non-overlapping categories.
26
Relative Frequency
= frequency of the class / n, where n is the total number of observations.
27
Percent Frequency
= relative frequency * 100
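A minimal sketch of how these three summaries relate, in Python with a small set of hypothetical categorical responses:

    from collections import Counter

    responses = ["Coke", "Pepsi", "Coke", "Sprite", "Coke", "Pepsi"]  # hypothetical data
    n = len(responses)
    for drink, freq in Counter(responses).items():   # frequency of each class
        rel = freq / n                                # relative frequency = frequency of the class / n
        pct = rel * 100                               # percent frequency = relative frequency * 100
        print(drink, freq, round(rel, 3), pct)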
28
Bar Chart
a visual display of frequency, relative frequency, and percent frequency distributions for categorical data (to compare two variables, see the side-by-side bar chart).
29
Pie Chart
a visual display of frequency, relative frequency, and percent frequency distributions, with each category shown as a slice of a circle proportional to its relative frequency.
30
With quantitative data, the classes of a frequency distribution are defined by:
a. determining the number of non-overlapping classes; b. determining the width of each class; c. determining the class limits.
31
Number of Classes
Typically between 5 and 20. Smaller datasets use fewer classes; larger datasets use more.
32
Width of the Class
Generally, it should be the same for each class. Approximate class width = (largest data value – smallest data value)/number of classes.
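A quick arithmetic check of the class-width rule, assuming a hypothetical dataset and a chosen class count of 5:

    data = [12, 15, 22, 27, 31, 34, 40, 44, 51, 58]   # hypothetical quantitative data
    num_classes = 5                                    # chosen from the 5-to-20 guideline
    width = (max(data) - min(data)) / num_classes      # (largest - smallest) / number of classes
    print(width)                                       # 9.2, which would usually be rounded up to 10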
33
Class Limits
each data observation must only belong to one class.
34
Relative Frequency Distributions
= frequency of the class/n.
35
Histogram
A visual display of a frequency, relative frequency or percent frequency distribution, where the variable of interest is on the horizontal axis and the frequency, relative frequency or percent frequency is on the vertical axis. * Shows the shape of the distribution of the variable of interest.
36
Cumulative Distribution
Presents the number of data items with values less than or equal to the upper class limit for each class.
37
Cumulative Relative Frequency Distribution
Shows the proportion of data items with values less than or equal to the upper limit of each class.
38
Cumulative Percent Frequency Distribution
Shows the percentage of data items with values less than or equal to the upper limit of each class.
39
Crosstabulation
a tabular summary of data for two variables (either categorical or quantitative)
40
Scatter Diagram & Trendline
A scatter diagram is a graphical display of the relationship between two quantitative variables and a trendline provides an approximation (i.e. an estimate) of the relationship; which can be positive, negative or none.
41
Side-by-Side Bar Chart

Depicts multiple bar charts on the same display.
42
Stacked Bar Chart 

Has one bar broken into segments of a different color showing the relative frequency of each class.
43
Mean(average)

is the average value of a variable; the sample mean is denoted x̄ and the population mean is denoted μ.
44
Sample Mean
x̄ = Σxᵢ / n, where Σ means to sum (add up) all the xᵢ's. For a variable, the first observation is x₁, the second is x₂, the i-th is xᵢ, and n is the number of observations.
45
Median
is the value in the middle when the data are arranged in ascending order. When the number of observations is odd, the median is the middle value; when it is even, the median is the average of the two middle values. The median avoids problems caused by extremely high or low values of x.
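A small Python illustration of why the median resists extreme values, using hypothetical salary figures (in thousands):

    from statistics import mean, median

    salaries = [48, 50, 52, 55, 250]   # hypothetical data with one extreme value
    print(mean(salaries))              # 91.0 -- the mean is pulled toward the extreme value
    print(median(salaries))            # 52   -- the middle value is unaffected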
46
Mode
is the value that occurs with the greatest frequency. If two values are most frequent, the variable is bimodal; if more than two, it is multimodal.
47
Mode Formula
=mode.sngl
48
Weighted Mean
used when observations have different weights (relative importance).
49
Percentile
provides information about how the data is spread over the interval from the smallest to the largest value. The pth percentile divides the data into two parts – approximately p% of the observations are less than the pth percentile and approx. (100-p)% are greater.
50
Location of the pth percentile

Lₚ = (p / 100)(n + 1)
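A sketch of this location formula, plus the interpolation step it implies (the same convention PERCENTILE.EXC follows), with hypothetical data:

    data = sorted([20, 25, 28, 34, 41, 45, 52, 60])   # hypothetical data, n = 8
    p = 75
    Lp = p / 100 * (len(data) + 1)                     # Lp = (p/100)(n + 1) = 6.75
    lower, upper = data[int(Lp) - 1], data[int(Lp)]    # 6th and 7th values: 45 and 52
    pth_percentile = lower + (Lp - int(Lp)) * (upper - lower)
    print(Lp, pth_percentile)                          # 6.75 and 50.25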
51
Quartiles
represent how the data are spread over four parts, each containing approximately 25% of the observations. Q1 is the first quartile (25th percentile); Q2 is the second quartile (50th percentile, the median); Q3 is the third quartile (75th percentile). The calculation is the same as for a percentile, using only p = 25, 50, and 75.
52
Percentile Formula
=percentile.exc( array, k)
53
Quartile Formula
=quartile.exc( array, quart )
54
Measures of variability or dispersion of the data
Range: largest value - smallest value. Interquartile Range: Q3 - Q1, the range of the middle 50% of the data. Variance: measures variability using all the data; it is based on the difference between each value xᵢ and the mean. This difference is called a deviation about the mean: for a sample it is xᵢ - x̄, and for a population it is xᵢ - μ. Each deviation is then squared.
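A numerical check of the deviation-about-the-mean idea, using Python's statistics module and hypothetical sample values:

    from statistics import mean, variance, stdev

    x = [4, 7, 9, 12, 18]                                  # hypothetical sample
    xbar = mean(x)                                         # 10
    deviations = [xi - xbar for xi in x]                   # deviations about the mean
    s2 = sum(d ** 2 for d in deviations) / (len(x) - 1)    # sample variance: squared deviations / (n - 1)
    print(s2, variance(x))                                 # both 28.5
    print(stdev(x))                                        # sample standard deviation = sqrt(28.5)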
55
Benefit of Variance
It is useful for comparing the variability of two or more variables; one difficulty is the units, since the variance is expressed in squared units.
56
Standard Deviation
The population standard deviation: σ = √σ². The sample standard deviation: s = √s². The advantage is that the units are no longer squared, making it easier to compare the result to the mean and to other statistics.
57
Variance Formula
=var.s (A:A)
58
Standard Deviation Formula
=stdev.s (A:A)
59
Coefficient of Variation

A measure of how large the standard deviation is relative to the mean. Coefficient of Variation = (s / x̄ * 100)%.
60
Distribution Shape

is measured by skewness. If the shape of the data is skewed to the left, the skewness is negative; if to the right then skewness is positive; and if the data is symmetric, then skewness is zero.
61
Symmetric Distribution
the mean and median are equal.
62
Positive Skew Distribution
the mean is usually greater than the median
63
Negative Skew Distribution
the mean is usually less than the median.
64
Skewness Formula

=SKEW(A:A)
65
Measure of Relative Location
Measures the relative location of values in the dataset. This helps determine how far a particular value is from the mean.
66
Z-Score
The z-Score yields a standardized value and is the number of standard deviations from the mean. The z-Score for any observation is a measure of the relative location of the observation in the dataset.
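A minimal z-score sketch in Python, with hypothetical exam scores:

    from statistics import mean, stdev

    scores = [62, 68, 71, 74, 85]                   # hypothetical sample
    xbar, s = mean(scores), stdev(scores)
    z = [(x - xbar) / s for x in scores]            # number of standard deviations from the mean
    print([round(zi, 2) for zi in z])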
67
Chebyshev's Theorem

Allows us to make statements about the proportion of data values that must be within a specified number of standard deviations of the mean: at least (1 - 1/z²) of the data values must be within z standard deviations of the mean (for z > 1).
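Plugging a few values of z into the bound shows the guaranteed proportions:

    for z in (2, 3, 4):
        print(z, 1 - 1 / z ** 2)   # at least 75%, ~88.9%, and 93.75% of values lie within z standard deviations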
68
Chebyshev's Theorem Advantage

It applies to any dataset, regardless of the shape of the distribution (the empirical rule, by contrast, requires data that are bell-shaped around the mean).
69
Detecting Outliers 

Outliers are extreme values relative to the rest of the data. The z-score can help identify outliers: typically, any observation with a z-score greater than +3 or less than -3 is treated as an outlier. Alternatively, we can use the interquartile range, with lower limit Q1 - 1.5(IQR) and upper limit Q3 + 1.5(IQR); values outside these limits are outliers.
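A rough sketch of the IQR rule in Python with hypothetical data; the quartiles here are taken without interpolation, which is a simplification of the percentile formula above:

    data = sorted([5, 7, 8, 9, 10, 11, 12, 13, 40])        # hypothetical data; 40 looks extreme
    n = len(data)
    q1 = data[int(25 / 100 * (n + 1)) - 1]                 # location 2.5, truncated to the 2nd value: 7
    q3 = data[int(75 / 100 * (n + 1)) - 1]                 # location 7.5, truncated to the 7th value: 12
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr          # limits: -0.5 and 19.5
    print([x for x in data if x < lower or x > upper])     # [40] is flagged as an outlier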
70
Covariance
is a descriptive measure of the linear association between two variables.
71
Covariance Formula
=COVARIANCE.S(A:A, B:B) or =COVARIANCE.S(array1, array2)
72
Descriptive Measures

Two descriptive measures of the relationship between two variables are covariance and correlation.
73
Interpreting Covariance

If sxy > 0, there is a positive linear association between x and y; if sxy < 0, there is a negative linear association. Note: the magnitude of the covariance depends on the units of measurement.
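A small check of both descriptive measures, assuming Python 3.10+ (which provides covariance and correlation in the statistics module) and hypothetical paired data:

    from statistics import covariance, correlation

    x = [2, 4, 6, 8, 10]                   # hypothetical paired data
    y = [3, 7, 5, 11, 14]
    print(covariance(x, y))                # sample covariance (positive here), like =COVARIANCE.S
    print(correlation(x, y))               # correlation coefficient, like =CORREL; unit-free, between -1 and +1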
74
Correlation Coefficient Formula
=correl
75
Probability
A numerical measure of the likelihood of an event occurring. A probability ranges from 0 to 1, such as the probability it will rain tomorrow.
76
Permutations
A counting rule for computing the number of experimental outcomes when n objects are selected from a set of N objects and the order of selection is important.
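A quick count using Python's math module, with hypothetical N and n:

    from math import comb, perm

    N, n = 5, 3                  # hypothetical: select 3 objects from a set of 5
    print(perm(N, n))            # 60 ordered selections: N! / (N - n)!
    print(comb(N, n))            # 10 unordered selections (combinations), shown for contrast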
77
Basic Requirements Assigning Probabilities
The probability assigned to each experimental outcome must be between 0 and 1, inclusively. The sum of the probabilities for all experimental outcomes must be equal to 1.
78
3 Methods Assigning Probabilities
Classical Method – used when outcomes are equally likely, such as a coin toss or the roll of a fair 6-sided die. Relative Frequency Method – used when data are available to estimate the proportion of the time the experimental outcome will occur if the experiment is repeated a large number of times. Subjective Method – used when outcomes are not equally likely and data are unavailable.
79
Events
a collection of sample points
80
Probability of an Event
is equal to the sum of the probabilities of the sample points in the event.
81
Complement of Event A 

(A^c) is the event consisting of all sample points that are not in Event A.
82
Union of 2 Events
is the event containing all sample points belonging to Event A, Event B or both. The union of Event A and Event B is denoted by: 𝐴 ∪ 𝐵.
83
Intersection of 2 Events
is the event containing the sample points belonging to both A and B. Intersection is denoted by: 𝐴 ∩ 𝐵
84
Addition Law
Useful when we want to know the probability that at least one of two events occurs. The addition law: P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
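A one-line check with hypothetical probabilities:

    P_A, P_B, P_A_and_B = 0.40, 0.35, 0.15     # hypothetical values
    print(P_A + P_B - P_A_and_B)               # P(A or B) = 0.60; the overlap is subtracted so it is counted once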
85
Mutually Exclusive Events
occur when two events have no sample points in common. Addition Law for mutually exclusive events: 𝑃 (𝐴 ∪ 𝐵) = P (A) + P(B)
86
Marginal Probabilities
the sums of the joint probabilities across each row and down each column of a joint probability table.
87
Conditional Probability
P(A|B) = P(A ∩ B) / P(B), or P(B|A) = P(A ∩ B) / P(A)
88
Independent Events
Event A and Event B are independent if P(A|B) = P(A) or P(B|A) = P(B).
89
Multiplication Law

Used to compute the probability of the intersection of two events: P(A ∩ B) = P(B) * P(A|B), or P(A ∩ B) = P(A) * P(B|A). For independent events: P(A ∩ B) = P(A) * P(B).
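A small sketch tying the multiplication law back to conditional probability, with hypothetical values:

    P_B = 0.50                                 # hypothetical values
    P_A_given_B = 0.30
    P_A_and_B = P_B * P_A_given_B              # multiplication law: P(A and B) = P(B) * P(A|B) = 0.15
    print(P_A_and_B / P_B)                     # recovers the conditional probability P(A|B) = 0.30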
90
Discrete Random Variables
may assume either a finite number of values or an infinite sequence of values such as 0, 1, 2, ...
91
Expected Value (or mean)

The expected value of a random variable is a measure of its central location.
92
Variance
measures the variability or dispersion of the random variable
93
Standard Deviation

is the positive square root of the variance.
94
Binomial Probability Distribution
1. The experiment consists of a sequence of n identical trials. 2. Two outcomes are possible on each trial: success or failure. 3. The probability of success (p) and the probability of failure (1 - p) do not change from trial to trial. 4. The trials are independent.
95
Binomial Distribution Formula 

=Binom.Dist
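A sketch of the binomial probability function that =BINOM.DIST evaluates, with hypothetical n, p, and x:

    from math import comb

    n, p, x = 10, 0.3, 4                                  # hypothetical: 10 trials, p = 0.3, exactly 4 successes
    prob = comb(n, x) * p ** x * (1 - p) ** (n - x)       # binomial probability f(x)
    print(round(prob, 4))                                 # ~0.2001, the non-cumulative (FALSE) case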
96
Poisson Probability Distribution

This distribution is used to estimate the number of occurrences over a specified interval of time or space.
97
Properties of a Poisson Experiment
1. The probability of an occurrence is the same for any two intervals of equal length. 2. The occurrence or non-occurrence in any interval is independent of the occurrence or non-occurrence in any other interval. Note: in the Poisson distribution, the mean and variance are equal.
98
Poisson Probability Formula
=POISSON.DIST(x, μ, TRUE/FALSE)
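A sketch of the Poisson probability function behind this formula, with a hypothetical mean of 3 occurrences per interval:

    from math import exp, factorial

    mu, x = 3.0, 5                                 # hypothetical: mean of 3 occurrences, P(exactly 5)
    prob = mu ** x * exp(-mu) / factorial(x)       # Poisson probability f(x)
    print(round(prob, 4))                          # ~0.1008, the non-cumulative (FALSE) case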
99
Hypergeometric Probability Distribution
is similar to the binomial distribution, except the trials are not independent and the probability of success changes from trial to trial. NOTE: r is the number of successes in population N, and N-r is the number of failures.
100
Hypergeometric Probability Distribution Formula
=HYPGEOM.DIST(x, n, r, N, TRUE/FALSE)
101
Difference Between Discrete and Continuous Random Variables

For discrete random variables, probabilities are computed for the random variable taking on a specific value; for continuous random variables, probabilities are computed for the random variable falling within an interval.
102
Exponential Probability Distribution

This distribution is useful for a random variable measuring the time between occurrences (e.g., the time between arrivals). The exponential probability density function is f(x) = (1/μ)e^(-x/μ) for x ≥ 0.
103
Exponential Probabilities Formula
=EXPON.DIST(x, λ, TRUE/FALSE), where λ = 1/μ
104
Characteristics of the Normal Distribution
1. Only two parameters: μ and σ. 2. The highest point is at the mean, which is also the median and the mode. 3. The mean can take on any numerical value. 4. The normal distribution is symmetric; skewness = 0. 5. The standard deviation determines how flat or wide the normal curve is; larger standard deviations result in wider, flatter curves. 6. Probabilities for a normal random variable are given by the area under the normal curve; the total area under the curve equals 1.
105
Normal Distribution Formula
=NORM.DIST(x, μ, σ, TRUE/FALSE)
106
Inverse Norm Formula

=Norm.Inv
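A minimal Python counterpart to these two formulas, using statistics.NormalDist with hypothetical μ and σ:

    from statistics import NormalDist

    dist = NormalDist(mu=100, sigma=15)            # hypothetical mean and standard deviation
    print(round(dist.cdf(120), 4))                 # P(X <= 120), the cumulative case like NORM.DIST(..., TRUE)
    print(round(dist.inv_cdf(0.95), 2))            # 95th percentile, like NORM.INV(0.95, 100, 15)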
107
Standard Normal Probability Distribution
The normal distribution with mean μ = 0 and standard deviation σ = 1.
108
Standard Normal Distribution Formula
=NORM.S.DIST(z, TRUE)
109
Outliers Formula
=QUARTILE.EXC(array, quart)
110
Poisson Probabilities Formula
=POISSON.DIST(x, μ, TRUE/FALSE)