7: Variables, parameters and data Flashcards
(10 cards)
What is inference?
Inference is the formal name given to learning from data using statistical tools
What is a parameter?
The numerical measure of the quantity of interest in the population
(The parameter is not usually known, so we estimate it from a sample. From this estimate, we can make an inference about the population)
e.g. the slope of a line describing the relationship between age and cholesterol level
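A minimal sketch of estimating a parameter from a sample, using the age and cholesterol example above. The numbers are made up purely for illustration (they are not real data), and `estimate_slope` is a hypothetical helper name; the true population slope stays unknown, and the code only produces an estimate of it.

```python
# Hypothetical sample (made-up values for illustration, not real data)
ages = [30, 40, 50, 60]               # age in years
cholesterol = [4.5, 5.0, 5.5, 6.0]    # cholesterol level in mmol/L


def estimate_slope(x, y):
    """Least-squares estimate of the slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sum of squares about the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# The estimate stands in for the unknown population parameter
slope_estimate = estimate_slope(ages, cholesterol)
print(f"Estimated slope: {slope_estimate:.3f} mmol/L per year of age")
```

A different sample would generally give a different estimate, which is why the inference about the population carries uncertainty.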
What are the three main types of variables?
- Continuous
- Discrete
- Categorical
What is a continuous variable?
Can be expressed on a continuous scale in which every value is possible
A continuous variable usually arises from some form of measurement
e.g. height, age, blood pressure, etc.
What is a discrete variable?
Can be put in one-to-one correspondence with the counting numbers
e.g. number of cases of cancer diagnosed during a day,
number of children in a family (0, 1, 2, 3, etc.)
What is a categorical variable?
Restricted to one of a set of categories. For example ‘heads’ or ‘tails’
Categorical variables can be divided into binary (two categories) and those with more than two categories
Binary:
- Māori/non-Māori
- smoker/non-smoker
- etc.
More than two:
- Blood group: A/B/AB/O
What is censored data?
With censored data, the underlying variable follows a continuous distribution, but some values are not known exactly
What does it mean for data to be right censored?
The true value is known to be larger than a recorded value
e.g. we know that someone lived until at least 31 December 2015
What does it mean for data to be left censored?
The true value is known to be smaller than a recorded value
e.g. we know that a measurement is less than a known limit of detection
What does it mean for data to be interval censored?
The true value is known to lie between two values
e.g. we know the date of infection with HPV is after a negative test and before a positive test 2 years later
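One way to picture the three kinds of censoring is to store each observation as a pair of bounds on the true value. The sketch below assumes that representation; the `Observation` class, the `censoring_type` function and the example numbers are all hypothetical and chosen only for illustration.

```python
import math
from typing import NamedTuple


class Observation(NamedTuple):
    lower: float  # the true value is known to be at least this
    upper: float  # the true value is known to be at most this


def censoring_type(obs: Observation) -> str:
    """Classify an observation by what is known about its true value."""
    if obs.lower > obs.upper:
        raise ValueError("lower bound cannot exceed upper bound")
    lower_known = not math.isinf(obs.lower)
    upper_known = not math.isinf(obs.upper)
    if not lower_known and not upper_known:
        raise ValueError("nothing is known about this value")
    if lower_known and upper_known:
        # Equal bounds mean the value was observed exactly
        return "exact" if obs.lower == obs.upper else "interval censored"
    # Only a lower bound is known: the true value is larger than it
    if lower_known:
        return "right censored"
    # Only an upper bound is known: the true value is smaller than it
    return "left censored"


# Hypothetical examples mirroring the cards above
still_alive = Observation(lower=5.0, upper=math.inf)      # survived at least 5 years
below_detection = Observation(lower=-math.inf, upper=0.1)  # below a detection limit of 0.1
hpv_infection = Observation(lower=0.0, upper=2.0)          # between tests taken 2 years apart

for obs in (still_alive, below_detection, hpv_infection):
    print(obs, "->", censoring_type(obs))
```

For a quantity that cannot be negative, such as a concentration, a left-censored value could arguably be stored with a lower bound of 0 instead; the unbounded form is used here only to keep the three cases distinct.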