Lecture 3-Summarising data Flashcards

1
Q

What is a variable?

A

A characteristic of a subject that can take any one of a set of values

2
Q

What is a qualitative variable?

A

(categorical)
falls into a specific category
e.g. sex, hair colour, ethnic group

3
Q

What is a quantitative variable?

A

Continuous or discrete

  1. Continuous
    - variables that can take any value within a range -> height / weight
  2. Discrete
    - variables that can only take integer values (counts)
    - number of children / number of pre-existing diseases
4
Q

What are the three scales of measurement?

A
  1. Interval scale -> height / BMI / BP
  2. Nominal category -> sex / hair colour
  3. Ordinal category -> QOL / hospital performance ranking
5
Q

What does the frequency distribution show?

A

Shows how often each value of a variable occurs in the dataset, usually displayed as a graph or table
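
A frequency table can be sketched in a few lines. This is a minimal illustration only; the hair colour data are made up for the example.

```python
from collections import Counter

# Hypothetical sample: hair colour recorded for eight subjects.
hair_colour = ["brown", "black", "brown", "blond", "black", "brown", "red", "brown"]

# Count how often each value occurs in the dataset.
frequency = Counter(hair_colour)

# Print a simple frequency table, most common value first.
for value, count in frequency.most_common():
    print(f"{value:<6} {count}")
```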

6
Q

What is the frequency distribution graph for nominal / ordinal data?

A

Bar chart

- frequency on the y-axis and categories on the x-axis

7
Q

How can the frequency distribution of an interval scale variable be shown?

A
Histogram
- the class intervals are placed on one axis, and rectangles with heights or areas proportional to the frequencies are drawn on them
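
Grouping values into class intervals can be sketched as below. The ages, the interval width of 10 and the helper name `class_interval_counts` are assumptions for illustration; with equal-width intervals the bar height is proportional to the frequency.

```python
from collections import Counter

def class_interval_counts(values, width):
    """Count values per class interval, keyed by the interval's lower bound."""
    return Counter((v // width) * width for v in values)

# Hypothetical ages in whole years.
ages = [23, 27, 31, 35, 36, 42, 44, 45, 48, 59]
width = 10
counts = class_interval_counts(ages, width)

# Text histogram: one '#' per observation in each interval.
for lower in sorted(counts):
    print(f"{lower}-{lower + width - 1}: " + "#" * counts[lower])
```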
8
Q

What is relative frequency?

A

frequency expressed as percentage of the total frequency
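
As a rough sketch, relative frequency is each count divided by the total, times 100. The function name and the sample data are hypothetical.

```python
from collections import Counter

def relative_frequencies(values):
    """Return each value's frequency as a percentage of the total frequency."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: 100 * count / total for value, count in counts.items()}

# Hypothetical sample of five subjects.
sex = ["F", "M", "F", "F", "M"]
rel = relative_frequencies(sex)
print(rel)
```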

9
Q

What is the mean?

A

A measure of the centre of the sample

- sum of all the values divided by the number of values
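
The definition translates directly into code. A minimal sketch with made-up heights:

```python
def mean(values):
    """Sum of all the values divided by the number of values."""
    return sum(values) / len(values)

# Hypothetical heights in cm.
heights_cm = [160, 172, 168, 181, 169]
print(mean(heights_cm))
```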

10
Q

What is the median?

A

Middle value of the data when sorted in order

- resistant measure of the data's centre (little affected by outliers)
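
To see what "resistant" means, the sketch below replaces the largest value with an extreme one and compares the median with the mean. The lengths of stay are invented for the example.

```python
import statistics

# Hypothetical lengths of hospital stay in days.
stays = [2, 3, 3, 4, 5]
# Same data with the largest value replaced by an extreme one.
stays_with_outlier = [2, 3, 3, 4, 50]

for label, data in [("original", stays), ("with outlier", stays_with_outlier)]:
    print(label, "median:", statistics.median(data), "mean:", statistics.mean(data))
```

The median is the same for both lists, while the mean is pulled towards the extreme value.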

11
Q

What is a measure of dispersion?

A

A measure of how spread out the data are, e.g. how far the values lie from the mean

12
Q

What is variance?

A

The variance of a sample is the mean of the squared deviations of the values from their mean (for a sample, the sum of squared deviations is usually divided by n - 1 rather than n)
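
A small sketch of the calculation, assuming made-up data. The `sample` flag is a hypothetical parameter that switches between dividing by n - 1 (the usual choice for a sample) and dividing by n (the plain mean of the squared deviations).

```python
def variance(values, sample=True):
    """Variance from squared deviations about the mean."""
    n = len(values)
    m = sum(values) / n
    squared_deviations = [(x - m) ** 2 for x in values]
    # n - 1 for a sample estimate, n for the plain mean of squared deviations.
    divisor = n - 1 if sample else n
    return sum(squared_deviations) / divisor

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data, sample=False), variance(data, sample=True))
```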

13
Q

What is standard deviation?

A

Square root of the variance
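
As a quick check of the relationship, the sketch below takes the square root of the variance and compares it with the library's own standard deviation. It uses the divide-by-n versions (`pvariance`, `pstdev`) from Python's `statistics` module; the data are made up.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

# Standard deviation as the square root of the variance.
sd = math.sqrt(statistics.pvariance(data))
print(sd, statistics.pstdev(data))
```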

14
Q

What is the range?

A

Highest value - lowest value

sensitive to outliers

15
Q

What is the IQR?

A

Interquartile range: upper quartile (Q3) - lower quartile (Q1)
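
A minimal sketch comparing the IQR with the range on made-up data. Quartile conventions differ between textbooks and software; the `method="inclusive"` choice here is an assumption, and the helper names are hypothetical.

```python
import statistics

def iqr(values):
    """Upper quartile minus lower quartile (inclusive quartile method)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

def value_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
data_with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]

print("IQR:", iqr(data), iqr(data_with_outlier))
print("range:", value_range(data), value_range(data_with_outlier))
```

In this example the extreme value changes the range a great deal but leaves the IQR unchanged.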

16
Q

What is a positive skew?

A

Skew to the right (long tail on the right)

but most of the data, and the peak of the graph, lie towards the left

17
Q

What is a negative skew?

A

Skew to the left (long tail on the left)

most of the data, and the peak of the graph, lie towards the right
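
One informal way to spot the direction of skew is to compare the mean with the median: the long tail tends to pull the mean towards it. This is only a rule of thumb, not a formal skewness measure, and the data and function name below are made up.

```python
import statistics

def mean_minus_median(values):
    """Positive suggests a right tail, negative a left tail (rule of thumb only)."""
    return statistics.mean(values) - statistics.median(values)

right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]        # one large value: positive skew
left_tailed = [1, 17, 18, 18, 18, 19, 19, 20]   # one small value: negative skew

print(mean_minus_median(right_tailed), mean_minus_median(left_tailed))
```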

18
Q

When would you see a uniform symmetrical histogram?

A

Rolling a fair die many times (each face is about equally likely)

19
Q

What is included in the five number summary boxplot?

A
  1. Min
  2. Q1
  3. Median
  4. Q3
  5. Max
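
The five numbers can be computed with a short sketch like the one below. The function name is hypothetical, and the `method="inclusive"` quartile convention is an assumption; other conventions can give slightly different Q1 and Q3.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the inclusive quartile method."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

data = [7, 1, 9, 3, 5, 2, 8, 4, 6]
print(five_number_summary(data))
```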
20
Q

How to deal with outliers?

A
Check for obvious mistakes.
Check when and by whom the data were recorded.
How are the results affected?
Can the results be analysed a different way?
Do not simply discard!
21
Q

What do outliers do to the data?

A

affect mean and standard deviation

do 5 calculations with the outliers present
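
To see how much a suspected outlier affects the results, one option is to compute the mean and standard deviation with and without it and compare. This is a sketch of a sensitivity check on made-up readings, not a rule for discarding data; the helper name is hypothetical.

```python
import statistics

def summarise(values):
    """Return (mean, sample standard deviation)."""
    return statistics.mean(values), statistics.stdev(values)

# Hypothetical readings; 40 is the suspected outlier.
readings = [4, 5, 5, 6, 5, 40]
without_outlier = [x for x in readings if x != 40]

mean_all, sd_all = summarise(readings)
mean_trim, sd_trim = summarise(without_outlier)
print("with outlier:   ", mean_all, sd_all)
print("without outlier:", mean_trim, sd_trim)
```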

22
Q

what is preferred when data is heavily skewed?

A

IQR and median