Data visualisation Flashcards

(39 cards)

1
Q

What is a stem-and-leaf diagram?

A

A display of numerical data in which each value is split into two parts:
the stem, which represents the leading digit(s), and
the leaf, which represents the last digit. It is a way to see the distribution, patterns and spread of the data.

2
Q

The list below shows the ages (in years) of 20 people who attended a movie night:

13, 15, 17, 18, 19, 20, 21, 21, 22, 22,
23, 24, 25, 25, 26, 28, 30, 31, 33, 35

a) Construct a stem-and-leaf diagram to represent the data.

A

1 | 3 5 7 8 9
2 | 0 1 1 2 2 3 4 5 5 6 8
3 | 0 1 3 5

Key: 1 | 3 = 13 years
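
The construction above can be sketched in Python. This is only an illustration, assuming whole-number values where the stem is the tens digit and the leaf is the units digit; the function name `stem_and_leaf` is made up for this card.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return stem-and-leaf rows, assuming stem = tens digit(s), leaf = units digit."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    # Walk every stem from lowest to highest so that empty stems still get a row.
    stems = range(min(rows), max(rows) + 1)
    return [f"{s} | {' '.join(str(leaf) for leaf in rows.get(s, []))}" for s in stems]

ages = [13, 15, 17, 18, 19, 20, 21, 21, 22, 22,
        23, 24, 25, 25, 26, 28, 30, 31, 33, 35]
lines = stem_and_leaf(ages)
print("\n".join(lines))
```

The lowest stem is printed first, matching the convention of putting the lowest value at the top of the diagram.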

3
Q

What else can a stem-and-leaf diagram be used for?

A

Comparing two data sets on a single (back-to-back) stem-and-leaf plot. (Remember: the lowest value goes at the top of the diagram.)

4
Q

What other method can be used to compare two sets of data, and what is its purpose?

A

Box plots: useful because they show the median, range, IQR and skewness.

5
Q

In order, what are the different points on a box plot?

A

lower adjacent value

lower hinge

median

upper hinge

upper adjacent value
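
A rough sketch of how these five points could be computed. It assumes Tukey-style hinges (the medians of the lower and upper halves of the sorted data) and the common 1.5 x IQR rule for adjacent values; the lecture slides or your statistics software may define them slightly differently. The function name `box_plot_points` is hypothetical.

```python
from statistics import median

def box_plot_points(values):
    """Return (lower adjacent, lower hinge, median, upper hinge, upper adjacent)."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2  # for odd n, the middle value is counted in both halves
    lower_hinge = median(data[:half])
    upper_hinge = median(data[n - half:])
    step = 1.5 * (upper_hinge - lower_hinge)  # assumed 1.5 x IQR rule
    # Adjacent values: the most extreme data points that are not outliers.
    lower_adjacent = min(x for x in data if x >= lower_hinge - step)
    upper_adjacent = max(x for x in data if x <= upper_hinge + step)
    return lower_adjacent, lower_hinge, median(data), upper_hinge, upper_adjacent

ages = [13, 15, 17, 18, 19, 20, 21, 21, 22, 22,
        23, 24, 25, 25, 26, 28, 30, 31, 33, 35]
points = box_plot_points(ages)
print(points)  # (13, 19.5, 22.5, 27.0, 35)
```

For the movie-night ages there are no outliers, so the adjacent values are simply the smallest and largest ages.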

6
Q

What is the middle 50 percent of the data known as in a box plot?

A

The interquartile range (IQR), shown by the box.

7
Q

Describe the features of a frequency distribution.

A
  • shown as a histogram (bar height = frequency)
    - has individual frequency bars (how many observations fall in each range)
    - each bar gives the frequency of a given value or range of values
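
As a small illustration, the frequencies behind a histogram can be counted like this. The sketch reuses the movie-night ages from card 2 and assumes a class width of 10 years, which is an arbitrary choice made for this example.

```python
from collections import Counter

ages = [13, 15, 17, 18, 19, 20, 21, 21, 22, 22,
        23, 24, 25, 25, 26, 28, 30, 31, 33, 35]
bin_width = 10  # assumed class width

# Map each age to the lower edge of its class, then count how many fall in each class.
counts = Counter((age // bin_width) * bin_width for age in ages)

for start in sorted(counts):
    bar = "#" * counts[start]
    print(f"{start}-{start + bin_width - 1}: {bar} ({counts[start]})")
```

Each printed bar is one frequency bar: its length is the number of observations in that range.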
8
Q

What are the features of a probability distribution?

A

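bell curve (for normally distributed data)

smooth, but can be divided into segments by standard deviations (SDs)

the area under the curve over a range is the probability that a value falls in that range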

9
Q

Why do we want normally distributed data that fits a bell-shaped curve?

A

Because it allows us to use more powerful and accurate statistical tests.

Power = how likely a test is to detect an effect that is really there.

10
Q

What are the two ways that a distribution can deviate from normality?

A

lack of symmetry (skewness)

pointiness (kurtosis)

11
Q

What is skewness and how does it occur?

A

Skewness is deviation from symmetry. It happens when extreme scores in one direction pull the mean towards them.

It shows up when a histogram has a big difference between the mean, median and mode.

12
Q

What is positively skewed data?

A

When the tail extends to the right.

mode < median < mean
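
A quick check of this ordering on a small data set. The scores below are invented purely for illustration; most of them are low, with a couple of high values forming the right-hand tail.

```python
from statistics import mean, median, mode

# Made-up positively skewed scores: a cluster of low values and a long right tail.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

score_mode = mode(scores)      # 2
score_median = median(scores)  # 3.0
score_mean = mean(scores)      # 4.5

print(score_mode, score_median, score_mean)
```

The two extreme scores (9 and 14) pull the mean upwards, while the median and mode stay near the bulk of the data.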

13
Q

What is negatively skewed data?

A

When the tail extends to the left of the graph and the bulk of the data sits on the right.

mean < median < mode, meaning the mean is the smallest.

14
Q

What is a good method for showing the distribution of data?

A

histograms

15
Q

Why do we not report the mean for skewed data sets?

A

Because the mean is more sensitive to skew (it is pulled towards the extreme scores), so we report the median instead.

16
Q

What is kurtosis?

A

A measure of the tailedness of a distribution.
Tailedness = how often outliers occur.

(In simple terms, it refers to how pointed the peak of the curve is.)

The higher the peak (the higher the kurtosis), the more outliers.

17
Q

What are the three types of kurtosis?

A

Mesokurtic (normal peak and tails)

Platykurtic (negative)

Leptokurtic (positive)

18
Q

What are the kurtosis values for each type of kurtosis?

A

Platykurtic: < 3
Mesokurtic: = 3
Leptokurtic: > 3
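
To see where these values come from, here is a minimal sketch of the moment-based (Pearson) kurtosis, for which a normal distribution scores 3. Statistics packages often report excess kurtosis instead (this value minus 3) and may apply a small-sample correction, so their output can differ. Both data sets are invented for illustration.

```python
def kurtosis(values):
    """Pearson kurtosis: fourth central moment divided by the squared variance."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n  # population variance
    m4 = sum((x - m) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2

flat = [1, 2, 3, 4, 5]                          # evenly spread, no outliers
heavy_tails = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]  # tall peak plus two extreme outliers

print(kurtosis(flat))         # below 3: platykurtic
print(kurtosis(heavy_tails))  # above 3: leptokurtic
```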

19
Q

What are the features of a leptokurtic distribution?

A
  • positive kurtosis
  • high peak and skinny in the middle
  • fat tails: signifies either lots of outliers or occasional outliers which are extreme

20
Q

What are the features of a platykurtic distribution?

A

negative kurtosis
flatter distribution
broad in middle
skinny tails (few outliers, or outliers that are not so extreme)

21
Q

What are the features of a mesokurtic distribution?

A

normal peak
medium breadth in the middle
normal amount of outliers in the tails
excess kurtosis of 0 (kurtosis of 3)

22
Q

Draw the three kurtosis distributions.

A

refer to lecture slides

23
Q

Why is the distribution so important?

(Covered later in the semester.)

A

It determines what we can do with the data.

Which descriptive statistics to report:
parametric/normal = mean and SD
non-parametric/skewed = median and range (or IQR)

Which inferential tests to use:
parametric = parametric tests
non-parametric = non-parametric tests

24
Q

How do we determine skewness?

A

skewness statistics
histogram with normality curve
probability density curves

25
Q

What is the skewness statistic?

A

A value showing how skewed the data is. A value of 0 means the data is symmetrical (normal); the further from zero, the more skewed.

26
Q

What value of the skewness statistic tells us that our data is skewed?

A

Less than -1 or greater than 1.

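
A minimal sketch of a skewness statistic, using the simple moment-based formula (third central moment divided by the variance raised to the power 1.5). Statistics packages usually apply a bias adjustment, so their values will differ slightly, especially for small samples. The data sets are invented for illustration.

```python
def skewness(values):
    """Moment-based skewness: 0 for symmetric data, positive for a right-hand tail."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n  # population variance
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
left_tail = [-x for x in right_tail]  # mirror image of right_tail

print(skewness(symmetric))   # 0.0
print(skewness(right_tail))  # greater than 1: positively skewed
print(skewness(left_tail))   # less than -1: negatively skewed
```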
27
Q

What are a histogram with normality curve and a density curve?

A

They basically show the same thing: the histogram with normality curve represents a smoothed distribution, while the density curve follows the individual data more closely.

28
Q

What is a way of presenting data to help others understand it?

A

Figures and tables (they illustrate data quickly and easily).

29
Q

When are tables used?

A

For descriptive statistics.

30
Q

What are the rules for tables?

A

- labelled and titled
- placed at the top of the most appropriate page
- font and size should be the same as the main text
- logical and easy to understand

31
Q

What are figures?

A

All visuals that are not tables. Some of these will help the reader understand what you are talking about.

32
Q

What are bar charts used for, and what are the two types of bar chart?

A

Comparing differences between means. Used for separate data.
Simple: one difference.
Clustered: two differences in the data sets.

33
Q

What are scatterplots?

A

They show the correlation between two co-variables. A line of best fit is drawn, from which we can judge how strong the relationship between the co-variables is.

34
Q

Briefly, what are error bars?

A

Bars presented on bar charts to show the variability in the data.

35
Q

For error bars, what percentage is used for confidence intervals?

A

95 percent: if the study were repeated many times, about 95 percent of the intervals produced would contain the population value.

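
A rough sketch of how a 95 percent confidence interval for a mean could be calculated. It assumes the normal approximation (mean plus or minus 1.96 standard errors); for small samples, software normally uses a t-distribution, which gives a slightly wider interval. The function name `ci95` is hypothetical, and the data are the movie-night ages from card 2.

```python
from math import sqrt
from statistics import mean, stdev

def ci95(values):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    m = mean(values)
    se = stdev(values) / sqrt(len(values))  # standard error of the mean
    margin = 1.96 * se                      # assumed z value for 95 percent
    return m - margin, m + margin

ages = [13, 15, 17, 18, 19, 20, 21, 21, 22, 22,
        23, 24, 25, 25, 26, 28, 30, 31, 33, 35]
low, high = ci95(ages)
print(f"mean = {mean(ages):.1f}, 95% CI = ({low:.1f}, {high:.1f})")
```

On a bar chart, the error bar for this group would run from `low` to `high`, centred on the mean.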
36
Q

What do confidence intervals tell us?

A

Something about where the population value lies. If the error bars overlap, there is likely no significant difference between the two conditions; if they do not overlap, there likely is.

37
Q

Where do figures used to assess the data go? Give examples of these figures.

A

In the appendix (stem-and-leaf diagrams, box plots, histograms for distribution), to support the choice of central tendency measures.

38
Q

Where do figures that present the data analysis go? Give examples of these figures.

A

In the results section (bar graphs, scatterplots).

39
Q

When there is a normal distribution, what are the most appropriate measurements to report?

A

Mean and standard deviation.
For skewed ordinal data: median and range.
For skewed ratio and interval data: trimmed mean and IQR.