What are the types of sources of data?
Primary - Collected by an investigator for a specific project
Secondary - Data collected by someone else for another purpose
What do you need to consider when using secondary data?
Conditions under which the data was collected - where? Why? Who was tested?
Limitations of the data - no control over quality, may be outdated, definitions and measures may differ from those you need
In terms of gathering data, what's a population and what's a sample?
Population - the whole group of interest, e.g., all UK males
Sample - subset of population - representative? Sufficiently large?
What types of sample are there?
Random sampling
Systematic (random) sampling - picking every nth person on a list
Stratified (random) sampling - dividing the population into groups, e.g., 50% m / 50% f, and sampling randomly within each group in those proportions to ensure the sample is representative
Convenience sampling - picking whoever is easiest to reach, as an initial exploration of an idea
Quota sampling - similar to stratified, but judgement rather than random selection is used to fill the appropriate proportions within each subset
Snowball sampling - existing participants recruit further participants from their own contacts; not random
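A minimal sketch of systematic and stratified sampling, assuming a made-up list of 100 people split 50/50 into two groups. The function names, the seed parameter and the data are illustrative only, not part of the notes above.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick every k-th item after a random start, where k = len(population) // n."""
    rng = random.Random(seed)
    k = len(population) // n            # sampling interval
    start = rng.randrange(k)            # random start within the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, proportions, n, seed=None):
    """Randomly sample from each stratum in the stated population proportions."""
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        size = round(n * proportions[name])     # this stratum's share of the sample
        sample.extend(rng.sample(members, size))
    return sample

people = list(range(100))                                   # hypothetical population
strata = {"m": list(range(0, 50)), "f": list(range(50, 100))}

sys_sample = systematic_sample(people, 10, seed=1)
strat_sample = stratified_sample(strata, {"m": 0.5, "f": 0.5}, 10, seed=1)
```

The stratified sample always contains 5 people from each group, whereas a simple random sample of 10 could come out unbalanced by chance.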
What different types of data are there?
Continuous - Can take any value within a range (in principle from minus infinity to plus infinity); it is measured, e.g., time taken in a race
Discrete - Can only take certain values; it is counted, e.g., the result of rolling a die
What are the different types of average?
Arithmetic mean - standard mean - distorted by outliers
Median - middle number - not distorted by outliers
Mode - most frequently occurring number - not distorted
Geometric mean - calculates average rate of change - nth root of the product of n numbers
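A quick illustration of how one extreme value moves the mean far more than the median or mode, using Python's standard statistics module. The salary figures are made up for the example.

```python
import statistics

salaries = [20, 22, 22, 25, 27]          # hypothetical data, in £000s
with_outlier = salaries + [500]          # the same data plus one extreme value

mean_before = statistics.mean(salaries)            # 23.2
mean_after = statistics.mean(with_outlier)         # pulled far above every typical value
median_before = statistics.median(salaries)        # 22
median_after = statistics.median(with_outlier)     # 23.5
mode_after = statistics.mode(with_outlier)         # 22
```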
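What is the geometric mean equation?
Geometric mean rate = ((1 + r1) x (1 + r2) x … x (1 + rn))^(1/n) - 1, where r1 … rn are the rates of change for each of the n periods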
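A small sketch of the geometric mean applied to rates of change, assuming rates are given as decimals (0.10 = 10%) and that 1 is subtracted at the end to turn the average growth factor back into a rate. The function name and the returns are hypothetical.

```python
import math

def geometric_mean_rate(rates):
    """Average rate of change per period from a list of periodic rates."""
    growth = math.prod(1 + r for r in rates)     # (1 + r1) x (1 + r2) x ... x (1 + rn)
    return growth ** (1 / len(rates)) - 1        # nth root, then back to a rate

returns = [0.10, -0.05, 0.20]                    # hypothetical yearly returns
avg = geometric_mean_rate(returns)
```

Compounding avg for three years gives the same end value as compounding the three actual returns, which is what makes it the right average for growth rates.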
What does Standard deviation measure?
Variability around the mean. SD = square root of the variance, so variance = SD^2
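A short check of the SD / variance relationship using the standard statistics module. This assumes the population versions (dividing by n); the notes do not say whether the population or sample formula is intended, and the data is made up.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical data set, mean = 5

variance = statistics.pvariance(data)    # population variance: 4
sd = statistics.pstdev(data)             # population standard deviation: 2.0
```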
What is the range (in data values)?
Difference between highest and lowest value
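Interquartile range - define it, and the first, second and third quartiles?
Interquartile range = spread of the middle 50% of the data set: the difference between the 1st and 3rd quartiles
First quartile = ‘25th percentile’ - the median of the lower half of the data (midway, by position, between the lowest value and the median)
Second quartile = median (‘50th percentile’)
Third quartile = ‘75th percentile’ - the median of the upper half of the data (midway, by position, between the median and the highest value)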
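A minimal sketch of the quartiles and interquartile range, assuming the "median of each half" convention (other conventions exist and can give slightly different quartiles). The function name and data are illustrative.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the 'median of each half' convention."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]                  # values below the median position
    upper = data[-half:]                 # values above the median position
    return statistics.median(lower), statistics.median(data), statistics.median(upper)

q1, q2, q3 = quartiles([1, 2, 3, 4, 5, 6, 7, 8])   # hypothetical data
iqr = q3 - q1                                      # 6.5 - 2.5 = 4.0
```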
Features of a normal distribution?
What does skew do to central tendency?
Separates the mean, mode and median. The mode sits at the peak of the distribution, the mean is pulled furthest towards the tail, and the median typically lies between them.
What is regression analysis?
Derivation of an equation in which one variable (the dependent variable, Y) can be estimated from the other variables (the independent variables, X). Y is therefore dependent on X.
What is the least squares method?
Calculation of the ‘simple regression line’ (linear bivariate regression line), creating a straight line summarising the relationship between X and Y. The line is derived by minimising the sum of squared errors.
Covariance / Pearson correlation equation?
r = Cov(x, y) / (σx x σy), where σ is the standard deviation sign. x and y are the two data sets: the numerator is their covariance and the denominator is the product of the standard deviations of both data sets.
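The correlation formula above can be sketched in a few lines. This assumes population covariance and standard deviations (dividing by n); the n cancels, so the sample versions give the same r. The function name and data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """r = Cov(x, y) / (SD of x * SD of y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])      # perfect positive: about +1
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])    # perfect negative: about -1
```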
What is the least squares method equation?
Y = a + bX, where a is the intercept (the value of Y when X = 0) and b is the slope of the line
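A sketch of how a and b can be found for the line Y = a + bX, using the standard least squares results b = Cov(x, y) / Var(x) and a = mean(y) - b x mean(x). The function name and the data points are made up for illustration.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line Y = a + bX that minimises the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx                 # slope
    a = mean_y - b * mean_x       # intercept: the line passes through the two means
    return a, b

a, b = least_squares([1, 2, 3, 4], [3, 5, 7, 9])   # points lie exactly on Y = 1 + 2X
```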
Explain a correlation coefficient
How well a linear bivariate regression line summarises the data, and a measure of how closely related the variables are. Perfect positive correlation +1, no correlation 0, perfect negative correlation -1.
What are the indexation equations? - Price weighted and market weighted
Price weighted - ((£A + £B) / start value) x base
Market weighted - ((£MKTCAPA + £MKTCAPB) / start value) x base
Start value for price weighted is just the share prices of A and B at the start date added together; for market weighted it is price x quantity (market cap) for A and B at the start date added together.
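A minimal sketch of the two index calculations for a two-share index, assuming a base of 100 and share counts that do not change between the start date and now. The companies, prices and share counts are hypothetical.

```python
def price_weighted_index(prices_now, prices_start, base=100):
    """(sum of current prices / sum of start prices) x base."""
    return sum(prices_now) / sum(prices_start) * base

def market_weighted_index(prices_now, prices_start, shares, base=100):
    """(current total market cap / start total market cap) x base."""
    cap_now = sum(p * q for p, q in zip(prices_now, shares))
    cap_start = sum(p * q for p, q in zip(prices_start, shares))
    return cap_now / cap_start * base

# Hypothetical: A has 1,000 shares and rises £10 -> £12; B has 1m shares and stays at £5
pw = price_weighted_index([12, 5], [10, 5])                          # about 113.3
mw = market_weighted_index([12, 5], [10, 5], [1_000, 1_000_000])     # about 100.04
```

The tiny company's price rise moves the price weighted index by over 13%, but barely moves the market weighted one.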
Explain positive/negative correlation
Values that increase and decrease together are positively correlated
Values that move in opposite directions (one rises as the other falls) are negatively correlated
How is the least squares method used?
Extrapolation - Forecast values outside the range of the sample
Interpolation - Filling in values missing within the range
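A small sketch of using a fitted line to estimate values, labelling each estimate by whether X falls inside or outside the sample range. The line Y = 1 + 2X and the sample range of 1 to 4 are assumed for illustration, as is the function name.

```python
def forecast(a, b, x, x_min, x_max):
    """Estimate Y = a + bX and say whether it is an interpolation or an extrapolation."""
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return a + b * x, kind

# Hypothetical fitted line Y = 1 + 2X from a sample with X between 1 and 4
inside = forecast(1, 2, 2.5, 1, 4)     # (6.0, "interpolation")
outside = forecast(1, 2, 10, 1, 4)     # (21, "extrapolation")
```

Extrapolated values rest on the assumption that the relationship continues to hold outside the sample range, so they deserve more caution.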
If you want to rebase your index what do you do?
Take a new start value
What is the disadvantage of a price weighted index?
Places too much or too little weight on the price of a share: a company with 1,000 shares is treated the same as one with 1m shares, whereas an index should reflect each company's actual size in the market.
Pros of geometric indices?
Less sensitive to large changes in the price of its constituents
Capital changes easy to accommodate (only small adjustments required)
Examples include the FT30 (Financial Times Ordinary Share Index)
Name the key market indices and a fact about them:
UK
UK (x2)
UK (x3)
UK (x4)
USA
USA (x2)
France
Germany
Japan
UK - FTSE 100 - 100 largest cos. - 70% of the value of listed shares
UK - FTSE 250 - the next 250 largest UK cos. after the FTSE 100 - mid-cap stocks
UK - FTSE All-Share - 98% of UK stock market capitalisation
UK - FT30 - Geometric mean - started in 1935
USA - Dow Jones Industrial Average - 30 large blue-chip stocks listed on NYSE and NASDAQ - price weighted
USA - S&P 500 - 500 largest US cos.
France - CAC 40
Germany - DAX
Japan - Nikkei 225 - price weighted (an unweighted arithmetic mean of share prices)
What is the effective annual rate equation and explanation?
Effective annual rate = ((1 + r)^n - 1) x 100. If interest is compounded several times a year, this annualises the rate. r = the rate per compounding period (the annual rate divided by n) and n = the number of times it is compounded in the year.
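A short worked example of the effective annual rate, assuming a nominal annual rate given as a decimal and split evenly across the compounding periods. The function name and the 12% figure are illustrative.

```python
def effective_annual_rate(nominal_annual_rate, periods_per_year):
    """((1 + r)^n - 1) x 100, where r is the rate per compounding period."""
    r = nominal_annual_rate / periods_per_year       # rate per period
    return ((1 + r) ** periods_per_year - 1) * 100   # as a percentage

ear_monthly = effective_annual_rate(0.12, 12)   # 12% nominal, compounded monthly: about 12.68
ear_yearly = effective_annual_rate(0.12, 1)     # compounded once a year: about 12.0
```

The more often the rate is compounded within the year, the further the effective rate rises above the nominal 12%.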