methods: analysis of quantitative data Flashcards
(28 cards)
what does research that gathers quantitative data produce?
- raw scores
- data tables are used to present this data
what are the data tables called?
- raw score table
- or frequency table
why are raw scores summarised?
- difficult to understand
- summarised to make it easier to see if trends are being shown and to highlight differences between groups
what can the data be summarised using?
- descriptive statistics
- measures of central tendency and dispersion
define measures of central tendency
- descriptive statistic that calculates the average or most typical value in a dataset
how can the average score be calculated? (measures of central tendency)
- mean
- mode
- median
how is the mean calculated?
- adding up all values in the dataset and dividing by the number of scores collected
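The calculation on this card can be sketched in Python. The function name `mean_score` and the example values are illustrative assumptions, not part of the original cards.

```python
def mean_score(scores):
    """Add up all values and divide by the number of scores collected."""
    total = sum(scores)   # sum of every value in the dataset
    count = len(scores)   # number of scores collected
    return total / count


example_scores = [4, 7, 7, 9, 13]  # made-up scores for illustration
print(mean_score(example_scores))
```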
when is the mean often used?
- when interval/ratio level data is obtained
- it’s the most sensitive and powerful measure of central tendency because all scores in the dataset are used in the calculation
- however, it can be affected by extreme values or when there’s a skewed distribution
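A small sketch of how one extreme value pulls the mean while leaving the median unchanged. Both datasets are made up for illustration, and the comparison uses Python's standard `statistics` module.

```python
from statistics import mean, median

typical_scores = [4, 5, 5, 5, 6]       # made-up scores, no extreme value
with_extreme_score = [4, 5, 5, 5, 50]  # same scores, but one extreme value

mean_typical = mean(typical_scores)
mean_extreme = mean(with_extreme_score)

# The mean shifts noticeably, while the median stays at the middle score.
print("mean:", mean_typical, "->", mean_extreme)
print("median:", median(typical_scores), "->", median(with_extreme_score))
```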
how is the mode calculated?
- looking at the most frequent score in dataset
- if there are 2 most frequent scores (bi-modal), both scores should be reported
- if there are more than 2, the mode is a meaningless measure of central tendency
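The rules on this card could be sketched as follows, using `statistics.multimode` (available from Python 3.8). The helper name `report_mode` and the choice to return `None` when there are more than two modes are assumptions made for illustration.

```python
from statistics import multimode


def report_mode(scores):
    """Return the mode(s) to report, or None if the mode is not meaningful."""
    modes = multimode(scores)  # every value tied for most frequent
    if len(modes) <= 2:
        return modes           # one mode, or both modes if bi-modal
    return None                # more than 2 modes: not a useful measure


print(report_mode([1, 3, 3, 4]))        # a single most frequent score
print(report_mode([2, 3, 3, 5, 5, 6]))  # bi-modal: both scores reported
print(report_mode([1, 2, 3]))           # every score ties, so no useful mode
```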
when is the mode used?
- when nominal data is obtained
- it’s easy to calculate
- it’s not affected by extreme scores
- however, it’s not a useful measure of central tendency on small datasets where several values occur equally often
how is the median calculated?
- values in the dataset are placed in rank order (smallest to largest)
- when the dataset has an odd number of scores, it’s simply the middle score
- if there’s an even number of scores, the mean of the two middle scores needs to be calculated
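The steps on this card can be sketched in Python. The function name `median_score` and the example values are illustrative assumptions.

```python
def median_score(scores):
    """Rank the scores, then take the middle one (or the mean of the middle two)."""
    ordered = sorted(scores)  # rank order, smallest to largest
    count = len(ordered)
    middle = count // 2
    if count % 2 == 1:
        return ordered[middle]                         # odd: the middle score
    return (ordered[middle - 1] + ordered[middle]) / 2  # even: mean of middle two


print(median_score([9, 3, 5]))     # odd number of scores
print(median_score([8, 2, 4, 6]))  # even number of scores
```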
when is the median used?
- when ordinal level data is obtained
- simple calculation to make and not affected by skewed distribution
- however, it’s less sensitive than the mean and isn’t useful on datasets that have a small number of values, as it may not represent the typical score
define measures of dispersion
- descriptive statistic that calculates spread of scores in dataset
- measures of central tendency can be misleading without knowing the variation between the scores
how are measures of dispersion calculated?
- range
- standard deviation
how is the range calculated?
- difference between highest and lowest value
- high range value indicates that scores are spread out and low range value indicates that scores are closer together
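A minimal sketch of the range calculation, comparing a spread-out dataset with a tightly clustered one. The function name `score_range` and both datasets are made up for illustration.

```python
def score_range(scores):
    """Difference between the highest and lowest value."""
    return max(scores) - min(scores)


spread_out = [2, 10, 18, 25]  # made-up scores that are far apart
clustered = [12, 13, 14, 15]  # made-up scores that are close together

print(score_range(spread_out))  # larger range: scores are spread out
print(score_range(clustered))   # smaller range: scores are closer together
```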
what is the range affected by?
- extreme scores
- may not be useful if there are outliers in the dataset
- also doesn’t indicate if scores are bunched around mean score or more equally distributed around mean
how can the spread be calculated if the dataset has extreme scores? (range)
- calculate interquartile range
- involves cutting out lowest quarter and highest quarter of values (top and bottom 25%) and calculating range of remaining middle half of scores
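The trimming approach on this card could be sketched as below. It follows the card's description literally (drop the bottom and top quarter of ranked scores, then take the range of what is left). Statistical packages usually compute quartiles by interpolation, so their interquartile range may differ slightly from this sketch. The function name and example values are assumptions.

```python
def middle_half_range(scores):
    """Range of the scores left after removing the lowest and highest quarter."""
    ordered = sorted(scores)
    quarter = len(ordered) // 4  # number of scores to cut from each end
    middle_half = ordered[quarter:len(ordered) - quarter]
    return max(middle_half) - min(middle_half)


scores = [1, 3, 4, 5, 6, 7, 8, 40]  # made-up scores with one extreme value (40)

print(max(scores) - min(scores))  # full range, inflated by the extreme score
print(middle_half_range(scores))  # range of the middle half: 4, 5, 6, 7
```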
what is standard deviation?
- deviation: distance of each value from the mean
- each score in dataset would have deviation value, so to get single value that represents all deviation scores, the standard deviation needs to be calculated
- SD gives a single value that represents how scores are spread out around the mean
- the higher the SD, the greater the spread of scores around mean value
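The cards do not give a formula, so the sketch below uses the usual textbook one: square each deviation from the mean, average the squares, and take the square root. Whether to divide by n (whole population) or n - 1 (sample) is left as a parameter, because the cards do not say which is intended. The function name and example values are assumptions.

```python
from math import sqrt


def standard_deviation(scores, sample=True):
    """Single value summarising how far scores sit from the mean."""
    count = len(scores)
    mean_score = sum(scores) / count
    deviations = [score - mean_score for score in scores]  # distance from mean
    squared = [deviation ** 2 for deviation in deviations]  # removes minus signs
    divisor = count - 1 if sample else count  # assumption: n - 1 for a sample
    return sqrt(sum(squared) / divisor)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores with a mean of 5

print(standard_deviation(scores, sample=False))  # population version
print(standard_deviation(scores))                # sample version, slightly larger
```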
how can data be presented?
- summary tables
- graphical representation
what do summary tables represent?
- measures of central tendency and dispersion
how can graphs be used?
- to illustrate summary data or data frequencies
- NEVER illustrate raw scores in a graph, since the data shown should be meaningful and summative
what are bar charts used for?
- to present summary data (e.g. mean, mode, median) for a categorical variable
- categorical variable is placed on x-axis, and height of bars represents value of that variable
what are histograms used for?
- used to present distribution of scores by illustrating frequency of dataset
- unlike bar charts where bars are separated by space, bars on histogram are joined to represent continuous data rather than categorical (discrete) data
- possible values are presented on x-axis and height of each bar represents the frequency of that value
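A rough sketch of the frequency counts behind a histogram, printed as a text chart rotated sideways (values down the side, bar length showing frequency). It assumes whole-number scores so that every possible value between the lowest and highest can be listed. The function name and example scores are illustrative.

```python
from collections import Counter


def frequency_table(scores):
    """Frequency of every possible whole-number value from lowest to highest."""
    counts = Counter(scores)
    possible_values = range(min(scores), max(scores) + 1)
    # Values that never occur get a frequency of 0, so no gap is hidden.
    return {value: counts.get(value, 0) for value in possible_values}


scores = [3, 4, 4, 5, 5, 5, 7]  # made-up whole-number scores

for value, frequency in frequency_table(scores).items():
    print(f"{value:>2} | {'#' * frequency}")
```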
why is examining the distribution of data in larger samples important in research?
- shows the overall frequency of values in a dataset
- helps identify trends not visible in small samples
- allows estimation of the distribution of scores in the whole population