Week 12 - Descriptive statistics Flashcards

Question

What is the equal-weighted return equation?

Answer 1

Equal-weighted return = (∑ X_i) / n, where X_i = individual returns and n = number of assets or components.

Answer 2

A portfolio return is the weighted average of the returns of the individual assets in the portfolio. It is usually equal to the value-weighted return.
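As a minimal sketch of the two return definitions above, the following Python compares an equal-weighted return with a weighted-average portfolio return. The function names, the returns and the weights are hypothetical and chosen only for illustration.

```python
def equal_weighted_return(returns):
    """Simple average of the individual returns: sum(X_i) / n."""
    return sum(returns) / len(returns)


def portfolio_return(returns, weights):
    """Weighted average of the individual returns (weights should sum to 1)."""
    return sum(w * r for w, r in zip(weights, returns))


# Hypothetical returns for three assets and hypothetical value weights.
returns = [0.10, 0.04, -0.02]
weights = [0.5, 0.3, 0.2]

ew = equal_weighted_return(returns)
pr = portfolio_return(returns, weights)
print(f"equal-weighted: {ew:.4f}, weighted: {pr:.4f}")
```

With equal weights of 1/n for every asset, the weighted average reduces to the equal-weighted return.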

Answer 3

most appropriate in situations where the data items to be summarised result from a ratio-type calculation, such as growth rates or index numbers. It is calculated by multiplying all the numbers together and then taking the nth root of the product, where n is the total number of values.
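The calculation described above (multiply the values, then take the nth root) is commonly called the geometric mean. A minimal Python sketch follows; the function name and the growth factors are hypothetical, and the check for positive values is an added assumption rather than something stated on the card.

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.prod(values) ** (1 / len(values))


# Hypothetical yearly growth factors: +10%, -5%, +20%.
growth_factors = [1.10, 0.95, 1.20]
gm = geometric_mean(growth_factors)
print(f"average growth factor per year: {gm:.4f}")
```

Growth rates are expressed here as factors (1 + rate) so that multiplying them gives the cumulative growth over the whole period.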

Answer 4

provides information about how the data are spread over the interval from the smallest value to the largest value. Admission test scores for colleges and universities are frequently reported in terms of percentiles.

Answer 5

a value such that at least p percent of the items take on this value or less and at least (100 - p) percent of the items take on this value or more. For example, the 10th percentile of a data set is a value such that at least 10% of the items are less than or equal to it and at least 90% of the items are greater than or equal to it.

Answer 6

1. Arrange the data: sort the data set in ascending order.
2. Determine the position (i): calculate i = (p/100) x n, where p is the desired percentile and n is the number of observations.
3. Locate the percentile: if i is an integer, the p-th percentile is the average of the values at positions i and i + 1; if i is not an integer, round up to the next whole number, and the p-th percentile is the value at this position.

Answer 7

Consider a data set: 7, 10, 15, 20, 25. To find the 40th percentile:
Arrange the data: the data is already in ascending order.
Determine the position (i): p = 40, n = 5, so i = (40/100) x 5 = 2.
Locate the percentile: since i = 2 is an integer, the 40th percentile is the average of the values at positions 2 and 3, which are 10 and 15 respectively. 40th percentile = (10 + 15)/2 = 12.5.
Therefore, the 40th percentile of this data set is 12.5.
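The position rule and the worked example above can be sketched in Python as follows. The function name `percentile` is hypothetical, and restricting p to values strictly between 0 and 100 is an added assumption so that positions i and i + 1 always exist. Note that statistical libraries often use other interpolation rules and may return slightly different values.

```python
import math


def percentile(data, p):
    """p-th percentile using the position rule i = (p/100) * n."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    values = sorted(data)          # step 1: ascending order
    i = p * len(values) / 100      # step 2: position
    if i == int(i):                # step 3a: integer position
        i = int(i)
        # Positions are 1-based on the card, list indices are 0-based.
        return (values[i - 1] + values[i]) / 2
    return values[math.ceil(i) - 1]  # step 3b: round up


data = [7, 10, 15, 20, 25]
print(percentile(data, 40))  # 12.5, matching the worked example
```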

Answer 8

specific percentiles: first quartile = 25th percentile; second quartile = 50th percentile = median; third quartile = 75th percentile

Answer 9

how data points spread out from the centre (mean or median). This is useful in decision-making, such as evaluating supplier delivery times, stock price volatility, or quality control in manufacturing.

Answer 10

1. Range 2. Interquartile Range (IQR) 3. Variance 4. Standard Deviation 5. Coefficient of Variation (CV%)

Answer 11

The range of a data set is the difference between the largest and smallest data values. It is the simplest measure of variability. It is very sensitive to the smallest and largest data values.

Answer 12

Range = largest value - smallest value

Answer 13

The interquartile range of a data set is the difference between the third quartile and the first quartile. It is the range for the middle 50% of the data. It overcomes the sensitivity to extreme data values.

Answer 14

IQR = 3rd quartile - 1st quartile

Answer 15

A box is drawn with its ends located at the 1st and 3rd quartiles. A vertical line is drawn in the box at the location of the median (second quartile). Dashed lines are drawn from the ends of the box to the smallest and largest data values inside the limits. Data outside these limits are considered outliers. The location of each outlier is shown with the symbol *.

Answer 16

The lower limit is located 1.5(IQR) below Q1. The upper limit is located 1.5(IQR) above Q3.
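As a rough sketch, the range, IQR and box plot limits described above can be computed as follows. The data set is hypothetical, and the quartiles are found with the position rule i = (p/100) x n; other quartile conventions exist and can give slightly different limits.

```python
import math


def percentile(data, p):
    """p-th percentile (0 < p < 100) using the position rule i = (p/100) * n."""
    values = sorted(data)
    i = p * len(values) / 100
    if i == int(i):
        i = int(i)
        return (values[i - 1] + values[i]) / 2
    return values[math.ceil(i) - 1]


# Hypothetical delivery times in days; 60 is a deliberately extreme value.
data = [12, 14, 15, 16, 18, 19, 21, 22, 24, 60]

data_range = max(data) - min(data)   # largest value - smallest value
q1 = percentile(data, 25)
q3 = percentile(data, 75)
iqr = q3 - q1                        # 3rd quartile - 1st quartile
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_limit or x > upper_limit]

print(f"range={data_range}, Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"limits=({lower_limit}, {upper_limit}), outliers={outliers}")
```

In this example the single extreme value inflates the range to 48, while the IQR stays at 7, which illustrates why the IQR is less sensitive to extreme values.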

Answer 17

The variance is the average of the squared differences between each data value and the mean. The variance is a measure of variability that utilises all the data. It is based on the difference between the value of each observation (x_i) and the mean (x̄ for a sample, µ for a population).

Answer 18

Sample: s^2 = [∑(x_i - x̄)^2] / (n - 1), where x_i = each individual data point, x̄ = sample mean, n = sample size.
Population: σ^2 = [∑(x_i - µ)^2] / N, where x_i = each individual data point, µ = population mean, N = total number of data points in the population.

Answer 19

The standard deviation of a data set is the positive square root of the variance. It is measured in the same units as the data, making it more easily interpreted than the variance.

Answer 20

Sample: s = √(s^2) = √( [∑(x_i - x̄)^2] / (n - 1) ), where x_i = each individual data point, x̄ = sample mean, n = sample size.
Population: σ = √(σ^2) = √( [∑(x_i - µ)^2] / N ), where x_i = each individual data point, µ = population mean, N = total number of data points in the population.
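The sample and population formulas above differ only in the divisor (n - 1 versus N). A minimal Python sketch, using a hypothetical data set and hypothetical function names, and cross-checked against the standard library `statistics` module:

```python
import math
import statistics


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data) / (len(data) - 1)


def population_variance(data):
    """Sum of squared deviations from the mean, divided by N."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data) / len(data)


# Hypothetical data, chosen so the arithmetic is easy to follow by hand.
data = [46, 54, 42, 46, 32]

s2 = sample_variance(data)          # 256 / 4 = 64
s = math.sqrt(s2)                   # 8
sigma2 = population_variance(data)  # 256 / 5 = 51.2
sigma = math.sqrt(sigma2)

print(f"sample: s^2={s2}, s={s}")
print(f"population: variance={sigma2}, std dev={sigma:.4f}")
```

The standard deviations are in the same units as the data, whereas the variances are in squared units.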

Answer 21

how large the standard deviation is in relation to the mean

Answer 22

Sample: CV = (s / x̄) x 100%, where s = sample standard deviation and x̄ = sample mean.
Population: CV = (σ / µ) x 100%, where σ = population standard deviation and µ = population mean.

Answer 23

Variance: s^2 = [∑(x_i - x̄)^2] / (n - 1) = 2,996.16
Standard deviation: s = √(s^2) = √2996.16 = 54.74
Coefficient of variation: (s / x̄) x 100% = (54.74 / 490.84) x 100% = 11.15%
The standard deviation is about 11% of the mean.
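The arithmetic in this worked example can be checked with a few lines of Python. Only the summary values quoted on the card (variance 2,996.16 and mean 490.84) are used, since the underlying data set is not given; the variable names are hypothetical.

```python
import math

variance = 2996.16   # sample variance quoted on the card
mean = 490.84        # sample mean quoted on the card

std_dev = math.sqrt(variance)
cv_percent = std_dev / mean * 100

print(f"s = {std_dev:.2f}, CV = {cv_percent:.2f}%")
```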

Answer 24

1. covariance 2. correlation coefficient

Answer 25

a measure of the linear association between two variables. Positive values indicate a positive relationship. Negative values indicate a negative relationship.

Answer 26

Sample: s_XY = [∑(x_i - x̄)(y_i - ȳ)] / (n - 1), where x_i, y_i = individual data points for variables X and Y, x̄, ȳ = means of variables X and Y, n = sample size.
Population: σ_XY = [∑(x_i - µ_X)(y_i - µ_Y)] / N, where µ_X, µ_Y = population means of X and Y, N = population size.

Answer 27

quantifies the strength and direction of the linear relationship between two variables. Correlation does not imply causation: just because two variables are highly correlated does not mean that one variable is the cause of the other. The coefficient can take on values between -1 and +1. Values near -1 indicate a strong negative linear relationship. Values near +1 indicate a strong positive linear relationship.

Answer 28

Sample: r_XY = s_XY / (s_X s_Y) = [∑(x_i - x̄)(y_i - ȳ)] / √[∑(x_i - x̄)^2 ∑(y_i - ȳ)^2], where x_i, y_i = individual data points for variables X and Y, x̄, ȳ = means of variables X and Y, n = number of data points.
Population: ρ_XY = σ_XY / (σ_X σ_Y) = [∑(x_i - µ_X)(y_i - µ_Y)] / √[∑(x_i - µ_X)^2 ∑(y_i - µ_Y)^2], where x_i, y_i = individual data points for variables X and Y, µ_X, µ_Y = population means of X and Y, N = population size (number of data points).

Answer 29

Positive Correlation: If r>0, as one variable increases, the other tends to increase. Negative Correlation: If r<0, as one variable increases, the other tends to decrease. No Correlation: If r=0, there is no linear relationship between the two variables. Strength: strong when r is near 1 or -1; weak when r is near 0.

Answer 30

Perfect Positive Correlation (r=1): A straight line with a positive slope (both variables increase together in perfect proportion). Perfect Negative Correlation (r=−1): A straight line with a negative slope (one variable increases as the other decreases in perfect proportion). No Correlation (r=0): No linear pattern in the data.

Answer 31

Sample covariance: s_XY = [∑(x_i - x̄)(y_i - ȳ)] / (n - 1) = -35.4 / (6 - 1) = -7.08
Sample correlation coefficient: r_XY = s_XY / (s_X s_Y) = -7.08 / [(8.2192)(0.8944)] = -0.9631
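A minimal Python sketch of the sample covariance and correlation calculations follows. The function names and the small x/y data set are hypothetical. The final lines only re-check the arithmetic of the worked example from its quoted summary values, because the underlying data points are not given on the card.

```python
import math


def sample_covariance(xs, ys):
    """Sum of cross-products of deviations from the means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Sample covariance divided by the product of the sample standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # covariance of X with itself is its variance
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


# Hypothetical data lying exactly on a downward-sloping line, so r is about -1.
xs = [1, 2, 3, 4]
ys = [8, 6, 4, 2]
r = sample_correlation(xs, ys)
print(f"hypothetical data: r = {r:.4f}")

# Re-check of the worked example using only its quoted summary values.
s_xy = -35.4 / (6 - 1)
r_xy = s_xy / (8.2192 * 0.8944)
print(f"s_XY = {s_xy:.2f}, r_XY = {r_xy:.4f}")
```

A value of r close to -1, as in the worked example, indicates a strong negative linear relationship.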

(55 cards)