Quant Skills Flashcards

(167 cards)

1
Q

The part of maths dealing with the manipulation of numbers using mathematical operators (+, -, ×, /)

A

Arithmetic

2
Q

Numbers joined together with mathematical operators, e.g. 5 + 7

A

Arithmetic Expression

3
Q

Mathematical statement of equality where two arithmetic expressions are joined together with an = sign, e.g. 2 + 7 = 9

A

Arithmetic equation

4
Q

When using indices what rules apply?

A
5
Q

True or false?
Measurement error is a type of sampling bias.

A

False

  • Sampling bias is when your sampling method is systematically biased towards picking some members of the population more than others
  • Survivor bias is a type of sampling bias
6
Q

Why is stratified sampling better than simple random sampling?

A

You are likely to get a more representative sample

7
Q

How would you describe a representative sample?

A

A sample that accurately reflects the characteristics of the population

8
Q

What guarantee does taking a random sample give us?

A

If we were to repeat the study many times (which is impossible or impractical), then on average we would get a representative sample

9
Q

What are the features of an experiment that are different to an observational study?

A
  • An experiment allows us to infer causation
  • An experiment involves manipulating one or more variables
10
Q

What does the term reproducible mean when used to describe an experiment?

A

We could replicate the whole experiment and obtain similar results

11
Q

What is the difference between an experimental unit and a sampling unit?

A
  • An experimental unit is the smallest entity you can apply a treatment to
  • A sampling unit is the smallest entity you can measure
12
Q

What is a control group?

A

A group receiving a standard treatment or no treatment

13
Q

The Heart Foundation are interested in understanding the prevalence of heart disease in the greater Springfield area. The area is divided up into small regions and a random sample is chosen. Several interviewers are sent out to survey all individuals in these regions. What type of sampling is this?

A

One-stage cluster sampling

14
Q

A study aims to estimate what proportion of people in the greater Brisbane region have allergies to dust mites. The researchers are having trouble getting people to come in for testing, so they ask participants to ask their family to come in and get tested too. What type of sampling is this?

A

Snowball sampling

15
Q

A hospital wants to estimate average time spent in hospital, so they look at a random sample of patients from each ward. What type of sampling is this?

A

Stratified sampling

16
Q

If the hospital were to instead take a random sample of patients from a randomly sampled selection of wards, what type of sampling is this?

A

Two-stage cluster sampling

17
Q

Answer all questions.
The following questions are about using correct terminology to describe this study.
A) Is this an experiment?
B) Is this a clinical trial?
C) This is ………………?
D) What is the dependent variable?
E) What is/are the factor(s)?
F) What are the levels of vitamin type?
G) What are the possible treatments?
H) How many replicates are there in this study?
I) What could be a nuisance variable?
J) How could you reduce the effect of a nuisance variable?

A

A) yes
B) yes
C) a blind trial
D) baby weight
E) baby weight and vitamin type
F) multivitamin X & multivitamin Y
G) multivitamin X & multivitamin Y
H) 50
I) mother’s diet, mother’s age, father’s height, mother’s height
J) randomly allocating mothers to treatments

18
Q

Give an example of a completely randomised design

A
19
Q

Give an example of a matched-pairs design

A

Splitting the women up into pairs of women with similar characteristics (e.g. similar medical history and risk of birth complications), then randomly assigning one woman of each pair to multi X and the other to multi Y

20
Q

Give an example of a randomised block design

A

Split the women into 3 categories based on risk (low, medium and high), then randomly assign half of each category to multi X and the other half to multi Y
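
A minimal Python sketch of this kind of block allocation, assuming a made-up list of 12 women (4 per risk category). The names, the seed and the group sizes are illustrative assumptions, not part of the study.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical participants: 4 women in each risk category
risks = ["low", "medium", "high"]
women = [(f"woman_{i}", risks[i % 3]) for i in range(12)]

allocation = {}
for risk in risks:
    # A block is all the women sharing the same risk category
    block = [name for name, r in women if r == risk]
    rng.shuffle(block)  # randomise the order within the block
    half = len(block) // 2
    for name in block[:half]:
        allocation[name] = "multi X"
    for name in block[half:]:
        allocation[name] = "multi Y"
```

Because the randomisation happens inside each block, every risk category ends up with the same number of women on each multivitamin.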

21
Q

Suppose the sample of 100 pregnant women was taken by randomly sampling 100 mothers from the sampling frame. What type of sampling is this?

A

Simple random sampling

22
Q

If the 100 pregnant women were selected because they weighed the least at the time of their first checkup, what is the most descriptive way to describe the mother’s weight as a variable?

A

It’s a confounding variable

23
Q

What would be an appropriate placebo for the study?
A) multivitamin X
B) multivitamin Y
C) A non-active pill that looks like the other vitamins

A

C)

24
Q

(36.52 + 2) / (-2)

Is this an arithmetic expression or algebraic equation?

A

Arithmetic expression

25
Arithmetic expression or algebraic equation? ln(6a) + b = -3105.28
Algebraic equation
26
Exponent means?
Repeated multiplication, e.g. 2^3 = 2 × 2 × 2
27
When do additional zeros on the right hand side count as significant figures?
When they indicate additional precision. E.g. extra zeros at the end of a decimal indicate additional precision because they are not needed to write down the number properly. Zeros on the end of an integer often don’t count because they may just be there to tell us the scale of the number (e.g. 4900 rather than 49), rather than to indicate additional precision.
28
Why do zeros at the beginning of a number, e.g. 0.0045, not count as significant figures?
Because they don’t indicate additional precision
29
What is a random variable?
A numerical quantity whose possible values depend on a random phenomenon (anything you don’t know without collecting your sample). E.g.
  • The number you get when you roll a die
  • Pulse (beats per min)
  • Age (random sample of individuals)
30
Explain continuous random variables
A random variable that can take on any value in a range, e.g. time in seizures or urine output in mL
31
Explain discrete random variables
A random variable that can only take on certain values, e.g. number of bacterial cells
32
Height of individuals in a random sample. What type of variable is this?
Continuous random variable
33
Number of cm in a m. What type of variable is this?
Not a random variable
34
Your blood pressure at any given time. What type of variable is this?
Continuous random variable
35
Explain Probability distribution
A function or table that links each outcome of a random variable to the probability that the outcome will occur. If X has a probability distribution f then we write X ~ f(x).
36
Explain discrete uniform distribution (not assessable)
[Figure: bar chart of the probability of each outcome on a die roll.] This is the discrete uniform distribution. It describes the situation where every event in our full list of events is equally likely. We write X ∼ DiscreteUniform(1, 6).
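
A small Python sketch of the die-roll distribution as a table (a dictionary mapping each outcome to its probability). The variable names are illustrative assumptions.

```python
from fractions import Fraction

# Outcomes of a fair six-sided die
outcomes = range(1, 7)

# Discrete uniform: every outcome has the same probability, 1/6
pmf = {x: Fraction(1, len(outcomes)) for x in outcomes}

# The probabilities of all possible outcomes must add up to 1
total_probability = sum(pmf.values())

# Long-run average outcome: each outcome weighted by its probability
expected_value = sum(x * p for x, p in pmf.items())
```

Exact fractions are used instead of floats so that the total comes out as exactly 1.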
37
Why is data helpful?
Can help answer difficult questions. E.g. How does the type of cholesterol-lowering drug affect mean cholesterol levels in people who are over 50 and have high cholesterol?
38
Why is it rarely feasible to perform a census?
  • Expensive
  • Impractical to collect data from some parts of the population
  • Information will become stale
  • The question could be about the future or an untested treatment
39
What is sampling error and what causes it?
The estimate we get using a sample doesn’t match the true quantity in the population. It is caused by:
  • Random variation
  • Measurement error
  • Sampling bias
40
What can help make the sampling error small?
Having a representative sample
41
Explain Under-coverage bias
42
Explain self-selection bias
43
Independent variable
A variable that we think affects the dependent variable. Often denoted X
44
Dependent variable
A variable that we think might be affected by the independent variable. Often denoted y and sometimes referred to as the response.
45
Two main types of studies?
Observational (aka descriptive)
  • Measuring and recording variables only
  • Allows us to identify association or correlation between variables
  • Aims to identify patterns
Experiments (aka causal studies)
  • Deliberate manipulation of one or more variables, observing the effect or response
  • Allows us to infer causation
  • Aims to explain patterns
46
Nuisance variables
Factors other than the independent variable that influence the dependent variable
47
What are some quick ways to summarise data (aka descriptive statistics)?
  • Mean, median and mode
  • Range, interquartile range, variance, standard deviation
  • Correlation, covariance
  • Other quantities like the proportion satisfying some criterion, or quantiles
48
What are the two ways a summary statistic is referred to?
Population parameters: when they refer to the population
Sample statistics: when they’re estimated from a sample. Sample statistics are estimates of population parameters.
49
Examples of continuous random variables?
• Age • Height • Income • Weight • Heart rate • Temperature
50
Examples of discrete random variables?
• Shoe size • Outcome of a die roll
51
What are the two basic types of statistical inference?
• Confidence Intervals • Hypothesis testing
52
Confidence intervals
Used when you want to estimate a population parameter
53
Hypothesis testing
Used when someone has a supposed value for the population and you’re putting it to the test
54
Name the types of data in statistics and their subtypes?
Categorical
  • Nominal
  • Ordinal
Numerical
  • Discrete
  • Continuous
55
What type of data is this? “What is Steph’s 3-point percentage this season?” What are proportions?
Each three point attempt provides nominal data: {3 point made, 3 point missed} A proportion aggregates this information to provide a numerical summary figure. Steph Curry (0.4766, 128 shots)
56
What is the difference between Nominal and Ordinal?
The team Sam Thaiday plays for is considered Nominal, whereas the position he plays is Ordinal (size, strength, ability etc determine his position). (Dragons, Cowboys & Bulldogs possible teams he could play for is considered “Sample Space”)
57
Example of continuous numerical data ?
A type of numerical data that can take an unlimited number of possible values between any two realistic points, e.g. height, weight, temperature, length
58
95% confidence intervals are determined by?
Sample size, percentage and population size. The larger the sample, the more confident we can be that the answer truly reflects the population
59
Common parameter usage?
  • μ - Mean of a numerical variable
  • σ - Standard deviation of a numerical variable
  • π - Proportion of a categorical variable
  • ρ - Correlation between two variables
  • β - Gradient between two variables
  • θ - General use
60
How do you transform a normal distribution to the standard normal distribution?
Z = (X - μ) / σ
61
What does the Z value tell us?
How many standard deviations away from the mean we are
62
Is the random variable shown normally distributed?
No. It is not symmetrical. (In a histogram, a normal distribution is bell shaped.)
63
What is a census?
An official count or survey of a population
64
What is a population?
A statistical population is all the observations of data of an experiment
65
What is a sample?
A sample is a portion of the population but not all the observations
66
What are different variable types?
- Discrete - Continuous - Count - Nominal - Ordinal - Interval - Ratio
67
What are the different types of continuous variables?
- Interval - Ratio
68
What are the different types of discrete variables?
- Categorical - Count
69
What are the different types of categorical variables?
- Nominal - Ordinal
70
What is a nominal data variable?
- Is named data with no order - E.g. Gender, blood type
71
What is an ordinal data variable?
- Is data of ordered categories - E.g. Cancer stage, pain level
72
What is a count data variable?
- Is the count - E.g. number of visits
73
What is an interval data variable?
- Is data with meaningful intervals but no true zero - E.g. Temperature, IQ
74
What is a ratio data variable?
- Is data that has a true zero - E.g. Pulse, blood pressure
75
The independent / predictor / explanatory variables affects what?
- The dependent / Response / Outcome variables
76
What is a nuisance variable?
- A nuisance variable is a variable that is not of interest to us but can still influence the dependent variable
77
Sampling might not be representative due to what forms of sampling error?
- Random variation - Sampling bias - Measurement error
78
What are some different types of sampling bias?
- Under-coverage bias - Self selection bias (voluntary response / non response bias) - Attrition bias - Survivorship bias
79
What are types of non-random sampling methods?
- Convenience sampling - Snowball sampling - Purposive sampling
80
What is random sampling?
- In simple random sampling each member of the sampling frame has an equal probability of being selected
81
What is stratified random sampling?
- If sub-populations within the overall population vary, we can go one step further than a simple random sample. - Stratified random sampling divides members into homogeneous sub-groups before sampling.
82
What is cluster sampling?
- Cluster sampling allows us to randomly select some groups from the population and use all individuals within those groups (one-stage cluster) or randomly sample individuals from within those groups (two-stage cluster)
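
The random sampling methods on the last few cards can be compared side by side in a short Python sketch. It reuses the hospital-ward idea from the earlier cards. The ward names, patient labels, sample sizes and seed are all made-up assumptions.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 4 wards with 10 patients each
wards = {f"ward_{w}": [f"w{w}_p{i}" for i in range(10)] for w in "ABCD"}
all_patients = [p for patients in wards.values() for p in patients]

# Simple random sampling: every patient has an equal chance of selection
simple_random = rng.sample(all_patients, 8)

# Stratified sampling: a random sample of 2 patients from EVERY ward
stratified = [p for patients in wards.values() for p in rng.sample(patients, 2)]

# One-stage cluster sampling: randomly pick 2 wards, take everyone in them
chosen_wards = rng.sample(sorted(wards), 2)
one_stage = [p for w in chosen_wards for p in wards[w]]

# Two-stage cluster sampling: within the chosen wards, randomly sample 3 patients each
two_stage = [p for w in chosen_wards for p in rng.sample(wards[w], 3)]
```

Note the difference: stratified sampling always covers every ward, while cluster sampling only covers the wards that happened to be chosen.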
83
What are the two effects the presence of a nuisance variable can have?
- If they are systematically related to the independent variable, they are confounding variables, which alter the apparent effects of the independent variable - If they have no relationship to the independent variable, their effects will obscure the effects of the independent variable
84
How can we deal with nuisance variable?
- Hold the nuisance variable constant - Counterbalance the nuisance variable by including all of its values equally - Include the nuisance variable in the design as an explicit factor - Use randomisation to break any systematic relationship between the nuisance variable and the independent variable
85
What is a control group in an experiment?
- Is a group receiving a standard treatment or a placebo
86
What is a blind trial in an experiment?
- An experiment where the experimental units are not aware of which treatment they are receiving to avoid response bias
87
What is a double-blind trial in an experiment?
- The same as a blind trial, but the investigators are also kept unaware of which treatment the patients are receiving, to avoid observer bias
88
A histogram can be used to give a rough idea of what?
- Histograms are used to visualize a variety of data - Shows the most common values - See the spread of data - See if the data is skewed
89
What are the different types of skewed data and what does it mean?
- Right or positive skew means a long tail to the right - Left or negative skew means a long tail to the left
90
What is the mean and how do you find it?
- The mean is the average score of the data - The sum of all the observations divided by the number of observations
91
What is the median and how can you find it?
- The median is the middle value of the ordered data - With an even number of observations, it is the average of the two middle values
92
What is the mode and how can you find it?
The mode is the most commonly occurring value
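
The mean, median and mode can be checked with Python's standard `statistics` module. This is a quick sketch and the pulse readings are made up for illustration.

```python
import statistics

# Hypothetical pulse readings (beats per minute)
pulses = [62, 70, 70, 74, 81, 90]

mean = statistics.mean(pulses)      # sum of observations / number of observations
median = statistics.median(pulses)  # middle value; averages the two middle values when n is even
mode = statistics.mode(pulses)      # most commonly occurring value
```

Here the mean is 447 / 6 = 74.5, the median is (70 + 74) / 2 = 72, and the mode is 70.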
93
What is the 5 number summary?
The 5-number summary is based on splitting the ordered data into 4 roughly equal parts (quartiles). It consists of the minimum, Q1, the median, Q3 and the maximum
94
What is the sample variance?
A measure of how much the observations differ from their mean
95
How do we find the sample variance?
s² = Σ(xᵢ - x̄)² / (n - 1): sum the squared differences between each observation and the sample mean, then divide by n minus 1
96
What is standard deviation?
Is a value expressing by how much the observations of a group differ from their mean value for the group
97
How do we find standard deviation?
Standard deviation is the square root of the sample variance
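
A short worked example of the sample variance and standard deviation in Python, computed step by step. The five data values are made up for illustration.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 6.0, 9.0]  # hypothetical observations

n = len(data)
x_bar = sum(data) / n  # sample mean

# Sum of squared differences from the mean, divided by n - 1
sample_var = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Standard deviation is the square root of the variance
sample_sd = math.sqrt(sample_var)

# The standard library gives the same results (it also divides by n - 1)
library_var = statistics.variance(data)
library_sd = statistics.stdev(data)
```

With these values the mean is 5, the squared differences add up to 28, and the sample variance is 28 / 4 = 7.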
98
What is the IQR and how can we find it?
Is the range between the 1st and 3rd quartile (Q3-Q1)
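
A small Python sketch of the 5-number summary and the IQR. Quartile conventions differ between textbooks and software, so the `method="inclusive"` choice here is an assumption and may not match the course's hand method exactly. The data are made up.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical observations, already ordered

# Cut points that split the ordered data into 4 parts: Q1, median, Q3
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_number_summary = (min(data), q1, q2, q3, max(data))

# Interquartile range: spread of the middle half of the data
iqr = q3 - q1
```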
99
What are two important summary statistics used to describe the relationship between two variables?
- Covariance is a measure of how two variables vary together - Correlation (Pearson’s correlation coefficient) is a measure of how strong the linear association between two variables is
100
What is covariance and how can you describe it?
- Sample covariance measures the strength and direction of the relationship between the elements of two samples - It is calculated like the sample variance, but the squared difference from the mean is replaced by the product of the differences of x and y from their own means
101
What is correlation and how can you describe it?
Correlation is a value between -1 and 1 that indicates how strong the linear association between two variables is.
102
What are some properties of sample correlation and how can you find it?
- Is an estimator for the population correlation - Is the sample covariance between x and y divided by the product of the sample SD of x and the sample SD of y
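
The covariance and correlation calculations can be sketched in Python as follows. The paired x and y values are made up for illustration.

```python
import math

# Hypothetical paired observations
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance: products of paired differences from the means, over n - 1
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations of x and y
sd_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
sd_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

# Pearson correlation: covariance rescaled to lie between -1 and 1
r = cov_xy / (sd_x * sd_y)
```

Here the covariance is 1.5 and the correlation is about 0.77, a fairly strong positive linear association.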
103
What is a random variable?
A random variable is a numerical quantity whose possible value depends on a random phenomenon
104
What is the normal distribution and what is it used for?
- The normal distribution is used to describe continuous random variables - The centre of a normal distribution is the mean
105
How is the normal distribution denoted?
X ∼ N(μ, σ²), where μ is the mean and σ² is the variance. In a histogram it appears as a bell-shaped curve
106
What is the z-score and how can we find it?
- The z-score tells us how many standard deviations away from the mean we are - z = (x - mean) / SD
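
A tiny Python sketch of the z-score calculation. The function name `z_score` and the example numbers (a mean of 100 and an SD of 15) are illustrative assumptions.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies away from the mean."""
    return (x - mean) / sd

# Hypothetical example: an observation of 130 when the mean is 100 and the SD is 15
z = z_score(130, 100, 15)  # (130 - 100) / 15 = 2.0
```

A positive z-score means the observation is above the mean; a negative one means it is below.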
107
What is the standard error and how can we find it?
- The standard error of a statistic is the SD of its sampling distribution - For the sample mean, it is the sample SD divided by the square root of n
108
For a normal distribution, what percentage of results fall within 1, 2 and 3 SD of the mean?
- About 68% within 1 SD - About 95% within 2 SD - About 99.7% within 3 SD
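
These percentages can be checked with `statistics.NormalDist` from the Python standard library. This is a quick sketch using the standard normal distribution.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Probability of falling within k standard deviations of the mean
within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}
```

The values come out at roughly 0.683, 0.954 and 0.997, which the rule rounds to 68%, 95% and 99.7%.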
109
What is the central limit theorem?
If the sample size n is large enough, then the sample mean has an approximately normal distribution, even when the underlying data are not normally distributed
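
A simulation sketch of this idea in Python. It assumes a strongly right-skewed population (exponential with mean 1 and SD 1); the sample size, number of repeats and seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 50          # size of each sample
repeats = 2000  # number of samples drawn

# Draw many samples from a skewed population and record each sample mean
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(repeats)
]

mean_of_means = statistics.fmean(sample_means)  # close to the population mean (1)
sd_of_means = statistics.stdev(sample_means)    # close to the standard error
expected_se = 1.0 / n ** 0.5                    # population SD / sqrt(n)
```

A histogram of `sample_means` would look roughly bell shaped even though the population itself is skewed.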
110
What is a QQ plot?
Is a graphical method for comparing two probability distributions by plotting their quantiles against each other
111
What is the Z test and how can it be used?
- A Z test can be used to calculate the Z statistic - The Z statistic can then be converted to a p-value
112
How is a Z test calculated?
Z = (sample mean - reference mean) / standard error
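
A sketch of the Z statistic and a two-sided p-value in Python. It assumes the population SD is known, and all the numbers are made up for illustration.

```python
import math
from statistics import NormalDist

# Hypothetical inputs
sample_mean = 52.0
reference_mean = 50.0  # value stated by the null hypothesis
population_sd = 10.0   # assumed known
n = 100

standard_error = population_sd / math.sqrt(n)
z = (sample_mean - reference_mean) / standard_error

# Two-sided p-value: chance of a Z at least this far from 0 in either direction
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
```

Here the standard error is 1, so Z = 2 and the p-value is about 0.046.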
113
What is the confidence interval?
- A confidence interval gives us a plausible range of values for a parameter - For a 95% CI we are 95% confident that the CI we have calculated contains the true parameter
114
What is the definition of a 95% CI?
A 95% confidence interval is a range of values that you can be 95% certain contains the true mean of the population
115
What is degrees of freedom?
For a one-sample t-test, the degrees of freedom is n minus 1
116
T-test?
A statistical test used to determine if there is any significant difference between the means of two groups
117
How do we calculate a one sample T-test T score?
t = (sample mean - reference mean) / (SD / √n)
118
How is a 95% CI calculated?
Sample mean ± t* × SD / √n
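
A worked sketch of the one-sample t statistic and the 95% CI in Python. The data and reference mean are made up, and `t_star` is an assumed value read from a t table (3.182 for 95% confidence with 3 degrees of freedom).

```python
import math
import statistics

data = [10.0, 12.0, 14.0, 16.0]  # hypothetical sample
reference_mean = 10.0            # hypothetical value being tested
t_star = 3.182                   # t table value for 95% confidence, df = n - 1 = 3

n = len(data)
sample_mean = statistics.mean(data)
sample_sd = statistics.stdev(data)  # uses the n - 1 divisor
standard_error = sample_sd / math.sqrt(n)

# One-sample t statistic
t_stat = (sample_mean - reference_mean) / standard_error

# 95% confidence interval for the population mean
margin = t_star * standard_error
ci_95 = (sample_mean - margin, sample_mean + margin)
```

With these numbers t is about 2.32 and the interval runs from roughly 8.89 to 17.11.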
119
What is a paired t-test?
Used to determine whether the mean of a dependent variable is the same in two related groups
120
What is unpaired T-test?
Determines whether there is a significant difference in the means of two independent or unrelated groups
121
What are different types of statistical hypothesis errors?
Type 1 error Type 2 error
122
What is a type 1 error?
We detect an effect when there is none
123
What is a type 2 error?
We fail to detect a true effect
124
What is a binary variable?
A discrete random variable that can only take two possible values
125
What is sample proportion and how is it calculated?
The sample proportion is the proportion of individuals in a sample sharing a certain trait. It is the number of successes divided by the size of the sample
126
What are the odds and how are they calculated?
The odds compare the chance of an outcome occurring with the chance of it not occurring. They are the sample proportion divided by 1 minus the sample proportion
127
What is the odds ratio and how is it calculated?
The odds ratio measures the chance of an event occurring in one group compared to the chance of it occurring in another group. It is the odds of group 1 divided by the odds of group 2
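
The chain from sample proportion to odds to odds ratio can be sketched in Python. The helper name `odds` and the counts are made-up assumptions.

```python
def odds(successes, n):
    """Odds of an event: sample proportion over 1 minus the sample proportion."""
    p = successes / n  # sample proportion
    return p / (1 - p)

# Hypothetical counts: 30 of 100 people in group 1 and 10 of 100 in group 2 have the trait
odds_group1 = odds(30, 100)  # 0.3 / 0.7
odds_group2 = odds(10, 100)  # 0.1 / 0.9

odds_ratio = odds_group1 / odds_group2
```

Here the odds ratio is 27/7, or about 3.86, so the odds of the trait are nearly four times higher in group 1.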
128
How do we calculate range?
Max - min
129
How do we find interquartile range?
IQR = Q3 - Q1
130
How do you calculate variance (population variance)?
σ² = (1/N) Σ(xᵢ - μ)²
131
How do you calculate variance (sample variance)?
s² = (1/(n - 1)) Σ(xᵢ - x̄)²
132
Calculating SD
The square root of the variance
133
What is covariance
A measurement of how two variables vary together (two continuous)
134
What is correlation?
A value between -1 and 1 that indicates how strong the linear association between two variables is. (Two continuous)
135
What is the most common visualization of one categorical and one continuous?
Side-by-side box plot
136
What is a continuous random variable?
A random variable that can take on any value in a range
137
What is a discrete random variable?
A random variable that can only take on certain values
138
What is a probability distribution?
A function or table that helps link outcomes of a random variable to the probability that the outcome will occur
139
How is the normal distribution denoted?
X ∼ N(μ, σ²), where μ is the mean and σ² is the variance. In a histogram it appears as a bell-shaped curve
140
Representative sample
A sample that accurately reflects the characteristics of a population as a whole
141
Self-selection bias
Subjects seek out being involved in a study
142
Voluntary response bias
A sample which involves only those who want to participate in the sampling
143
Non-response bias
Bias introduced into survey results because individuals refuse to participate
144
Survivorship bias
Concentrating on the people or things that “survived” some process and inadvertently overlooking those that didn’t because of their lack of visibility
145
Attrition bias
Occurs when participants drop out of a long-term experiment or study
146
Convenience sampling
Choosing individuals who are easiest to reach
147
Snowball sampling
Recruitment of participants based on word of mouth or referrals from other participants
148
Purposive sampling
A biased sampling technique in which only certain kinds of people are included in a sample
149
Simple random sampling
Every member of the population has a known and equal chance of selection
150
Binary
The value of the data has only two options
151
Proportion
The fraction of the whole that has a given characteristic (the part divided by the total)
152
Standard deviation
A computed measure of how much scores vary around the mean score
153
Coefficient of variation
Standard deviation/mean
154
Central limit Theorem
The theory that, as sample size increases, the distribution of sample means of size n, randomly selected, approaches a normal distribution
155
Steps in hypothesis testing
Step 1: State your null and alternative hypotheses.
Step 2: Collect data.
Step 3: Perform a statistical test.
Step 4: Decide whether to reject or fail to reject your null hypothesis.
Step 5: Present your findings.
156
P value
The probability of observing a test statistic as extreme as, or more extreme than, the one actually observed, assuming the null hypothesis is true
157
One-way ANOVA
A statistical test used to analyze data from an experimental design with one independent variable that has three or more groups (levels).
158
Pearson’s correlation
Assumes that both variables are continuous
159
Simple linear regression
Regression analysis involving one independent variable and one dependent variable in which the relationship between the variables is approximated by a straight line
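
A minimal least-squares sketch of simple linear regression in Python, computing the gradient and intercept by hand. The x and y values are made up for illustration.

```python
# Hypothetical paired observations
x = [1.0, 2.0, 3.0, 4.0, 5.0]  # independent variable
y = [2.0, 4.0, 5.0, 4.0, 5.0]  # dependent variable

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Gradient: how much y changes, on average, for a one-unit increase in x
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
gradient = s_xy / s_xx

# Intercept: the fitted line passes through the point (x_bar, y_bar)
intercept = y_bar - gradient * x_bar

def predict(new_x):
    """Value of y on the fitted straight line at new_x."""
    return intercept + gradient * new_x
```

For these values the fitted line is y = 2.2 + 0.6x.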
160
One sample T-test
1. State the null and alternative hypotheses 2. Do the test, including the CI 3. Check assumptions: independence, normality 4. Reject or fail to reject the null hypothesis
161
Paired sample T-test
1. State the null and alternative hypotheses 2. Do the test, including the CI 3. Check assumptions: independence, normality 4. Interpret
162
2 sample T-test
1. State the null and alternative hypotheses 2. Do the test, including the CI 3. Check assumptions: independence, normality 4. Interpret, then reject or fail to reject the null hypothesis
163
Correlation
A measure of the extent to which two factors vary together, and thus of how well either factor predicts the other
164
Regression
Dependence of one variable on another
165
Spearman rank-order correlation
Used when variables are measured on an ordinal scale (the numbers reflect the rank ordering of participants on some attribute)
166
Pearson correlation coefficient
The most common statistical measure of the strength of linear relationships among variables
167
Unpaired T-test
Compares two means