Types of variables and presentation of data Flashcards

(35 cards)

1
Q

What are some routinely collected sources of data?

A
  • mortality and census data
  • hospital activity data
  • primary care data
  • infectious disease notifications
  • regular national surveys (e.g. Health Survey
    for England)
2
Q

What is a strength and a weakness of research study data?

A

+ Better quality
- More expensive and time consuming

3
Q

What are the 3 types of categorical variables?

A

Ordinal (ordered categorical)
Nominal (unordered categorical)
Binary / Dichotomous

4
Q

What is categorical data?

A

Data that fall into categories rather than being measured as numbers, e.g. hair colour

5
Q

What is ordinal data?

A

Has an underlying order

Categories can be ranked
e.g. highest level of education: GCSE, A level, degree

6
Q

What is nominal data?

A

No underlying order; categories cannot be ranked, e.g. blood group

7
Q

What is binary (dichotomous) data?

A

Has two categories

e.g. Male / Female
Presence of disease - Yes / No
1 / 0

8
Q

What are the 2 types of numerical variables?

A

Continuous
Discrete / count variable

9
Q

What is continuous data?

A

Can take any value within a range, including non-whole numbers

e.g. height
e.g. 5.5

10
Q

What is discrete (count) data?

A

Can only be whole numbers (integers)

11
Q

Can categorical variables be created from numerical variables?

A

Yes - categorical variables can be created from numerical variables
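
As a minimal sketch of this idea, a numerical variable such as BMI can be grouped into ordered categories. The cut-offs, the category labels and the function name `bmi_category` are illustrative assumptions, not part of the original cards.

```python
def bmi_category(bmi: float) -> str:
    """Map a continuous BMI value to an ordinal category.

    The cut-offs used here are assumed for illustration only.
    """
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "healthy weight"
    if bmi < 30:
        return "overweight"
    return "obese"


bmi_values = [17.9, 22.4, 27.1, 31.6]  # made-up numerical data
categories = [bmi_category(b) for b in bmi_values]
print(categories)
```

Grouping discards information: the exact BMI cannot be recovered from the category alone.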

12
Q

Can numerical variables be created from categorical variables?

A

NO - numerical variables CANNOT be created from categorical variables

13
Q

Why is the type of variable important?

A

Variable type determines appropriate way to:

  • display the data
  • summarise the data (central tendency /
    variation)
  • analyse the data using statistical testing
14
Q

How should single variable data with one categorical variable be presented?

A

= Bar chart, Pie Chart or Frequency table
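
A frequency table for one categorical variable might be built roughly as follows. This is only a sketch: the blood group values are made-up example data and the variable names are assumptions.

```python
from collections import Counter

blood_groups = ["O", "A", "O", "B", "A", "O", "AB", "O"]  # made-up example data

counts = Counter(blood_groups)
total = sum(counts.values())

# Frequency table: category -> (count, percentage of the whole group)
frequency_table = {
    group: (n, round(100 * n / total, 1)) for group, n in counts.most_common()
}

for group, (n, pct) in frequency_table.items():
    print(f"{group:<3}{n:>3}{pct:>7}%")
```

The same counts are what a bar chart (bar heights) or a pie chart (sector areas) would display.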

15
Q

How should single variable data with one continuous variable be presented?

A

= Histogram (a bar chart is appropriate only if the values are first grouped into categories)

16
Q

How should a pair of variables with categorical outcome and categorical exposure be presented?

A

= Contingency table
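
A contingency table cross-tabulates the exposure against the outcome. The sketch below counts made-up (exposure, outcome) pairs; the smoking and disease labels are assumptions chosen for illustration.

```python
from collections import Counter

# Made-up (exposure, outcome) pairs
records = [
    ("smoker", "disease"), ("smoker", "disease"), ("smoker", "no disease"),
    ("non-smoker", "disease"), ("non-smoker", "no disease"),
    ("non-smoker", "no disease"),
]

cell_counts = Counter(records)
exposures = ["smoker", "non-smoker"]
outcomes = ["disease", "no disease"]

# Rows are exposure groups, columns are outcome groups
table = [[cell_counts[(e, o)] for o in outcomes] for e in exposures]

print(f"{'':<12}" + "".join(f"{o:>12}" for o in outcomes))
for e, row in zip(exposures, table):
    print(f"{e:<12}" + "".join(f"{n:>12}" for n in row))
```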

17
Q

How should a pair of variables with numerical outcome and categorical exposure be presented?

A

= Box and whisker plot

18
Q

How should a pair of variables with numerical outcome and numerical exposure be presented?

A

= Scatter plot

19
Q

With exposure and outcome which is the X and which is the Y variable?

A

X variable = Exposure

Y variable = Outcome

20
Q

What factors relate to exposures?

A
  • Explanatory variable
  • Independent variable
  • Risk factor
  • Treatment group

X variable

21
Q

What factors relate to outcomes?

A
  • Response variable
  • Dependent variable
  • Case / control group
  • Disease group

Y variable

22
Q

What are the 3 main features of a bar chart?

A
  • Heights of the bars are proportional to the
    frequencies
  • Useful for comparing frequencies relative to
    others
  • Variables MUST be categorical
23
Q

What are the 2 main features of pie charts?

A
  • Areas of the sectors are proportional to the
    frequencies
  • Useful for comparing the frequencies in each
    category with the whole group
24
Q

What are the 2 main features of a histogram?

A
  • Variable must be CONTINUOUS
  • relative frequencies are represented by
    areas of the bars
25
Q

How does a box and whisker plot work?

A
  • Minimum / maximum indicated by whiskers
  • Middle 50% contained within box
  • Median indicated by horizontal line inside box
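
The five values a box and whisker plot draws can be computed as in this sketch. The data are made up, and the `"inclusive"` quartile method is an assumption: statistical packages use several slightly different quartile conventions.

```python
import statistics

values = [2, 4, 4, 5, 7, 8, 9, 11, 12]  # made-up data

# Cut points that split the ordered data into four equal-sized groups
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

summary = {
    "min": min(values),   # end of lower whisker
    "q1": q1,             # bottom of box
    "median": median,     # line inside box
    "q3": q3,             # top of box
    "max": max(values),   # end of upper whisker
}
print(summary)
```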
26
Q

What are the three types of distribution?

A
  • Normal distribution
  • Positively skewed (long tail to right)
  • Negatively skewed (long tail to left)
27
Q

What is the definition of the mean (average)?

A

= Sum of all values divided by number of observations

28
Q

What is the definition of standard deviation?

A

= Measure of the spread of observations around the mean

SD = √[ (sum of squared deviations) / (number of observations - 1) ]

[Variance = SD²]

29
Q

What is the formula for squared deviation?

A

= (Original value - Mean value)²

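
The definitions of the mean, squared deviation, variance and standard deviation can be followed step by step in a short sketch. The observations are made up for illustration.

```python
import math
import statistics

values = [4, 6, 8, 10, 12]  # made-up observations

n = len(values)
mean = sum(values) / n                                  # sum of values / number of observations
squared_deviations = [(x - mean) ** 2 for x in values]  # (original value - mean value) squared
variance = sum(squared_deviations) / (n - 1)            # sum of squared deviations / (n - 1)
sd = math.sqrt(variance)                                # SD is the square root of the variance

print(mean, variance, sd)

# Cross-check against the standard library, which also divides by n - 1
assert math.isclose(sd, statistics.stdev(values))
```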
30
Q

What is the definition of the median?

A

= The middle value when values are arranged in order

31
Q

What is the definition of the interquartile range?

A

= The range from the first (25%) to the third (75%) quartiles of a distribution

32
Q

What is the definition of the mode?

A

= The most frequently occurring value

Should not exist if the data are truly continuous

33
Q

What is the definition of the range?

A

= The difference between the largest and smallest values in a distribution

Depends upon the extreme values, which may give an unrepresentative view of the whole set of values

34
Q

What does a 95% reference range indicate?

A

= Mean ± 1.96 × SD

Can be interpreted as the range of likely values for an individual in the population (assuming the values are approximately Normally distributed)

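
A 95% reference range can be computed from a sample as sketched below. The heights are made-up data, and the calculation assumes the values are roughly Normally distributed.

```python
import statistics

heights_cm = [162, 168, 170, 171, 174, 175, 177, 180, 183, 190]  # made-up sample

mean = statistics.mean(heights_cm)
sd = statistics.stdev(heights_cm)  # sample SD (n - 1 denominator)

# 95% reference range: mean plus or minus 1.96 standard deviations
lower = mean - 1.96 * sd
upper = mean + 1.96 * sd
print(f"95% reference range: {lower:.1f} to {upper:.1f} cm")
```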
35
Q

What is variance?

A

= (Standard deviation)²