Presentation of Data Flashcards

1
Q

what are the purposes of screening data

A

to detect blunders
locate outliers
determine distributional properties
determine number of missing values
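A minimal sketch of these screening checks, assuming a hypothetical list of blood pressure readings in which `None` marks a missing value. The plausibility limits `LOW` and `HIGH` are illustrative assumptions, not clinical thresholds.

```python
# Hypothetical systolic blood pressure readings (mmHg); None marks a missing value.
readings = [118, 122, None, 130, 1250, 115, 127, None, 121]

# Separate the observed values from the missing ones.
observed = [r for r in readings if r is not None]
n_missing = len(readings) - len(observed)

# Assumed plausibility limits, for this illustration only.
LOW, HIGH = 50, 300

# Values outside the limits are candidate blunders or outliers to check by hand.
suspect = [r for r in observed if not LOW <= r <= HIGH]

print("missing values:", n_missing)
print("values outside plausible range:", suspect)
```

A flagged value such as 1250 is likely a data-entry blunder (perhaps 125 typed with an extra digit), but the code only flags it; deciding what to do with it remains a judgment call.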

2
Q

how is a small data set screened?

A

by eye

3
Q

how is a large data set screened

A

frequency table or histogram
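As a rough illustration, a binned frequency table can be printed as a text histogram. The ages and the bin width below are made-up values chosen for the example.

```python
from collections import Counter

# Hypothetical ages (years); a genuinely large data set would have many more values.
ages = [23, 27, 31, 35, 36, 41, 44, 44, 47, 52, 58, 63, 67, 71, 199]
BIN_WIDTH = 10  # assumed bin width for this illustration

# Count how many values fall in each bin [start, start + BIN_WIDTH).
bins = Counter((age // BIN_WIDTH) * BIN_WIDTH for age in ages)

# Print one row of '#' marks per non-empty bin.
for start in sorted(bins):
    print(f"{start:>3}-{start + BIN_WIDTH - 1:<3} {'#' * bins[start]}")
```

The isolated bin starting at 190 stands out immediately, which is the kind of value that is hard to spot by eye in a long column of numbers.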

4
Q

what are the 2 main types of variables

A

categorical and quantitative

5
Q

what are categorical variables

A

occur when each individual falls into one of a set of categories
divided into nominal (no ordering, e.g. sex)
and ordinal (have an ordering, e.g. pain)

6
Q

what is the frequency distribution

A

frequency of the occurrence of different values of a variable
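A small sketch of a frequency distribution for a categorical variable, using made-up pain scores for ten patients.

```python
from collections import Counter

# Hypothetical pain scores recorded for 10 patients.
pain = ["mild", "none", "severe", "mild", "moderate",
        "mild", "none", "moderate", "mild", "severe"]

# Counter maps each distinct value to the number of times it occurs.
frequency = Counter(pain)

# Print in the natural order of the categories rather than by count.
for level in ["none", "mild", "moderate", "severe"]:
    print(level, frequency[level])
```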

7
Q

what is relative frequency

A

frequency expressed as a proportion of the total frequency
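For illustration, relative frequencies can be obtained by dividing each count by the total. The blood groups below are invented sample data.

```python
from collections import Counter

# Hypothetical blood groups for 10 donors.
blood_groups = ["O", "A", "A", "O", "B", "O", "AB", "A", "O", "O"]

counts = Counter(blood_groups)
total = sum(counts.values())

# Relative frequency = frequency / total frequency.
relative = {group: count / total for group, count in counts.items()}

for group, proportion in relative.items():
    print(f"{group}: {proportion:.0%}")
```

The relative frequencies always sum to 1 (or 100%), which makes distributions from samples of different sizes comparable.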

8
Q

how are interval scale variables graphically presented

A

histograms or box plots

9
Q

what is a box plot

A

5 point summary of the data consisting of the minimum, 1st quartile, median, 3rd quartile and maximum values
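A minimal sketch of the five values a box plot is drawn from, using invented data. Quartile conventions differ between textbooks and software; this example uses the default method of Python's `statistics.quantiles`, so other tools may give slightly different quartiles.

```python
import statistics

# Hypothetical sample of 11 observations.
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 18]

# quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
q1, median, q3 = statistics.quantiles(data, n=4)

summary = (min(data), q1, median, q3, max(data))
print(summary)  # (2, 4.0, 8.0, 12.0, 18)
```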

10
Q

what do summary statistics attempt to capture

A

a typical value (the location) and the spread (or dispersion) of the data

11
Q

what 2 measurements are used for location

A

mean and median

12
Q

what is mean

A

sum of all the observations divided by the total number of observations

13
Q

what is the median

A

middle value when the sample is arranged in increasing order. Approximately 50% of the sample is less than the median and 50% is greater than the median
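The two measures of location can be sketched on a small invented sample. With an even number of observations there is no single middle value, so the usual convention (followed by `statistics.median`) is to average the two middle values.

```python
import statistics

# Hypothetical sample with an even number of observations.
values = [3, 5, 7, 9, 12, 18]

# Mean: sum of the observations divided by the number of observations.
mean = sum(values) / len(values)

# Median: middle of the sorted values; here the average of 7 and 9.
median = statistics.median(values)

print(mean, median)  # 9.0 8.0
```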

14
Q

what summary statistics measure the spread

A

range, interquartile range, variance, standard deviation, coefficient of variation

15
Q

what is the range

A

difference between the largest and smallest observations in the sample. Not recommended, as it is severely affected by outlying observations
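To show how strongly a single outlier can change the range, here is a short sketch with invented data. `sample_range` is a helper name introduced for this example.

```python
def sample_range(values):
    """Return the largest observation minus the smallest."""
    return max(values) - min(values)

# Hypothetical sample, then the same sample with one outlying observation added.
clean = [12, 14, 15, 15, 17, 18]
with_outlier = clean + [95]

print(sample_range(clean))         # 6
print(sample_range(with_outlier))  # 83
```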

16
Q

what is the interquartile range

A

difference between the 3rd and 1st quartiles
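A minimal sketch of the calculation on invented data. The quartiles come from the default method of `statistics.quantiles`; other quartile conventions can give slightly different results.

```python
import statistics

# Hypothetical sample of 11 observations.
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 18]

# Q1 and Q3 are the first and third of the three quartile cut points.
q1, _, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1
print(iqr)  # 8.0
```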

17
Q

what is variance (s²)

A

sum of the squared distances of each value from the mean, divided by (the number of values - 1)
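The definition can be followed step by step on a small invented sample and checked against `statistics.variance`, which also divides by n - 1.

```python
import statistics

# Hypothetical sample.
values = [4, 8, 6, 5, 7]

n = len(values)
mean = sum(values) / n  # 6.0

# Squared distance of each value from the mean: 4, 4, 0, 1, 1.
squared_distances = [(x - mean) ** 2 for x in values]

# Divide by n - 1 (not n) to get the sample variance.
s2 = sum(squared_distances) / (n - 1)

print(s2)                           # 2.5
print(statistics.variance(values))  # 2.5
```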

18
Q

what is the standard deviation

A

the square root of the variance. Used in preference to the variance as it is on the original scale of measurement
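A short sketch on invented data: the standard deviation is computed as the square root of the sample variance and compared with `statistics.stdev`.

```python
import math
import statistics

# Hypothetical sample, e.g. lengths in cm.
values = [4, 8, 6, 5, 7]

variance = statistics.variance(values)  # in cm squared
sd = math.sqrt(variance)                # back in cm

print(round(sd, 3))
print(round(statistics.stdev(values), 3))
```

If the values are in cm, the variance is in cm squared while the standard deviation is in cm, which is why it is easier to interpret alongside the mean.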

19
Q

what is the coefficient of variation defined by

A

c = (s / x̄) × 100%, where s is the standard deviation and x̄ is the mean
provides a measure of variation which is independent of the unit of measurement and hence can be used to compare the variation of variables measured on different scales
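As an illustration of unit independence, the sketch below computes the coefficient of variation for invented heights in centimetres and again in metres. `coefficient_of_variation` is a helper name introduced for this example.

```python
import statistics

def coefficient_of_variation(values):
    """Return (standard deviation / mean) * 100, as a percentage."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Hypothetical heights in centimetres, and the same heights in metres.
heights_cm = [160, 165, 170, 175, 180]
heights_m = [h / 100 for h in heights_cm]

# Hypothetical weights in kilograms, a variable on a different scale.
weights_kg = [55, 62, 70, 78, 85]

print(round(coefficient_of_variation(heights_cm), 2))
print(round(coefficient_of_variation(heights_m), 2))
print(round(coefficient_of_variation(weights_kg), 2))
```

The first two printed values agree because changing the unit rescales s and the mean by the same factor, so the height and weight figures can be compared directly.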

20
Q

what summary statistics are used if the distribution is roughly symmetrical

A

mean and std deviation

21
Q

what summary stats are used if the distribution is skewed

A

median and interquartile range
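A rough sketch of why the median is preferred for skewed data, using invented incomes with one very large value.

```python
import statistics

# Hypothetical incomes (in thousands); the last value makes the sample right-skewed.
incomes = [21, 23, 24, 25, 26, 28, 30, 250]

mean = statistics.mean(incomes)
median = statistics.median(incomes)

print(mean)    # 53.375
print(median)  # 25.5
```

The mean is pulled above every value except the extreme one, while the median still sits in the middle of the bulk of the data.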