DATA DESCRIPTION Flashcards

(98 cards)

1
Q

2 types of statistics

A

DESCRIPTIVE: describe study population

INFERENTIAL: use what we know (the sample) to infer what we don't know (the population)

2
Q

3 key factors in designing research

A
  1. type of variables
  2. level of measurements
  3. extraneous + confounding variables
3
Q

research design model (6)

A
  1. current knowledge
  2. choose hypothesis to test
  3. design experiment
  4. do experiment
  5. statistical analysis
  6. interpret + report
4
Q

5 factors involved in good experimental research design

A

1) sample size and type of sample

2) accurate variables to reduce error

3) valid measuring instrument

4) practical experiment?

5) cost

5
Q

why is it important to use research design

A
  1. smooth operation
  2. efficiency
  3. blueprint for planning
  4. reduce errors
  5. reliability
6
Q

what makes good research design? (3)

A

1) reliability

2) replication

3) validity

7
Q

4 types of validity

A

measurement

internal

external

ecological

8
Q

/ Type of variable (3)

A

CONTINUOUS - temp (figure on a scale)

DISCRETE - no. of symptoms

CATEGORICAL - ethnicity, gender

9
Q

/ measurement variables (type of scale) (4)

A

INTERVAL

RATIO

NOMINAL

ORDINAL

10
Q

/ interval scale

A

order of magnitude
equal intervals on scale

11
Q

/ ratio scale

A

order of magnitude
equal intervals
absolute zero point

12
Q

/ nominal scale

A

attributes only named
e.g: gender - male female
ethnicity - white, black, asian

13
Q

/ ordinal scale

A

attributes only ordered
e.g: 1st, 2nd, 3rd

14
Q

difference between EXTRANEOUS variables

and

CONFOUNDING variables

A

EXTRANEOUS: may affect other variables, but is not accounted for in the study

CONFOUNDING: a type of extraneous variable that directly affects the variables we are studying

15
Q

/ calculate median formula

A

position of the median in the ordered data = (n+1) / 2
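A minimal sketch of the (n+1) / 2 rule, assuming the formula gives the 1-based position of the median in the sorted data (not the median value itself). The function name `median_by_position` is made up for illustration.

```python
def median_by_position(values):
    """Median found via the (n+1)/2 position rule (1-based position)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    position = (n + 1) / 2
    if position.is_integer():
        # Odd n: the position points at a single middle value.
        return ordered[int(position) - 1]
    # Even n: the position falls halfway between two values, so average them.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


odd_example = median_by_position([7, 3, 9, 1, 5])   # position 3 in 1, 3, 5, 7, 9
even_example = median_by_position([4, 8, 1, 6])     # position 2.5 in 1, 4, 6, 8
```

With 5 values the position is 3, so the median is the 3rd sorted value; with 4 values the position is 2.5, so the 2nd and 3rd sorted values are averaged.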

16
Q

/ what does data look like when it's:
1) + skewed
2) normally distributed
3) - skewed

A

1) bulk of data to the left, long tail to the right
2) equal on both sides
3) bulk of data to the right, long tail to the left

17
Q

what is a factor

A

a categorical variable that splits the data into groups
e.g: two categories: undergrad v post grad

to compare their median, mode etc

18
Q

/ MAKING DECISION

if both variables are categorical use…

A

a contingency table
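A small sketch of building a contingency table from two categorical variables. The survey rows and category names are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical rows: (level of study, answer to a yes/no question)
rows = [
    ("undergrad", "yes"), ("undergrad", "no"), ("undergrad", "no"),
    ("postgrad", "yes"), ("postgrad", "yes"), ("postgrad", "no"),
]

counts = Counter(rows)  # counts each (level, answer) combination
levels = ["undergrad", "postgrad"]
answers = ["yes", "no"]

# One row per level, one column per answer.
table = {
    level: {answer: counts[(level, answer)] for answer in answers}
    for level in levels
}

for level in levels:
    print(level, table[level])
```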

19
Q

/ MAKING DECISION

if you have one categorical variable and one continuous use…

A

compare means/medians

or

collapse and use contingency tables

20
Q

/ what type of data is
1) mean
2) Median
best with

A

1) normal
2) skewed

21
Q

/ how to calculate a percentile value

A

position of the percentile in the ordered data = (percentile / 100) X (n+1)

n = number of observations
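A rough sketch of the percentile rule, assuming the formula gives a 1-based position in the sorted data and that a fractional position is resolved by interpolating between the two neighbouring values. The interpolation and the clamping at the ends are assumptions, not part of the card.

```python
def percentile_value(values, percentile):
    """Value at the position (percentile / 100) * (n + 1) in the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    position = percentile / 100 * (n + 1)   # 1-based position
    position = min(max(position, 1), n)     # assumption: clamp to the data range
    whole = int(position)
    fraction = position - whole
    if whole >= n:
        return ordered[-1]
    # Interpolate between the value at `whole` and the next one.
    return ordered[whole - 1] + fraction * (ordered[whole] - ordered[whole - 1])


data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 13, 15]   # n = 11, so n + 1 = 12
lower_quartile = percentile_value(data, 25)     # position 3
median = percentile_value(data, 50)             # position 6
upper_quartile = percentile_value(data, 75)     # position 9
```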

22
Q

/ what is RANGE

A

difference between highest and lowest value

23
Q

/ what is INTERQUARTILE RANGE

A

difference between upper and lower quartile

24
Q

/ what is STANDARD DEVIATION

A

measures average deviation from mean

25
/ what is VARIANCE
standard deviation squared
26
/ what are the upper and lower fences
values outside these fences (below the lower or above the upper) are outliers
27
/ how to calculate upper and lower fence
Lower fence: LQ - (1.5 X IQR)
Upper fence: UQ + (1.5 X IQR)
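A short sketch of the fence calculation. It assumes the quartiles come from Python's `statistics.quantiles` with `method="exclusive"`, which is based on (n+1) positions; other software may give slightly different quartiles. The data set is made up.

```python
import statistics


def fences(values):
    """Return (lower fence, upper fence) using the 1.5 x IQR rule."""
    lq, _median, uq = statistics.quantiles(values, n=4, method="exclusive")
    iqr = uq - lq
    return lq - 1.5 * iqr, uq + 1.5 * iqr


data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 13, 40]   # hypothetical data, 40 looks suspicious
lower_fence, upper_fence = fences(data)

# Anything below the lower fence or above the upper fence is flagged.
outliers = [x for x in data if x < lower_fence or x > upper_fence]
```

Here LQ = 4 and UQ = 12, so IQR = 8 and the fences sit at -8 and 24; only 40 falls outside them.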
28
/ elements of a box plot: (top to bottom) 5
1) biggest observation below upper fence
2) UQ
3) Median
4) LQ
5) smallest observation above lower fence

UQ - LQ = IQR
29
limitations of pie charts
- not recommended for multiple categories
- avoid 3D
- small observations: confusing to use %
- confusing when comparing outcomes of 2 different surveys/experiments
30
what does SD show: large SD and small SD
spread of data
LARGE SD: data more spread out
SMALL SD: data closer to mean
31
/ equation for standard deviation
SD = square root of [ sum of (observation - mean)^2 / (no. of observations - 1) ]
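A minimal sketch of the SD equation worked step by step, cross-checked against `statistics.stdev`, which uses the same n - 1 denominator. The data values are made up.

```python
import math
import statistics


def sample_sd(values):
    """Sample standard deviation with the (n - 1) denominator."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    variance = squared_deviations / (n - 1)   # variance = SD squared
    return math.sqrt(variance)


data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean is 5, squared deviations sum to 32
sd = sample_sd(data)
variance = sd ** 2
```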
32
difference between categorical v continuous data
CATEGORICAL: data adds to a whole, e.g: BMI categories
CONTINUOUS: data on individuals over time, ratio/interval, e.g: height
33
what scale is categorical data measured on
nominal or ordinal
34
what scale is continuous data measured on
ratio or interval
35
/ graphs for categorical data (3)
1) bar chart
2) stacked bar chart
3) pie chart
36
/ graphs for continuous data (6)
1) stem and leaf plot
2) histogram
3) box plot
4) bar chart with error bars
5) scatterplot
6) line graph for time series data
37
/ when should a scatterplot be used
2 continuous variables
38
what is an adjusted v non adjusted axis
unadjusted = axis starts from zero
adjusted = axis starts from e.g: 40, as that's the lowest figure
39
/ calculate standard error
SE = standard deviation / square root of number of observations
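A brief sketch of the standard error formula, assuming the SD is the sample SD (n - 1 denominator). The data values are made up.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample SD divided by the square root of n."""
    return statistics.stdev(values) / math.sqrt(len(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error(data)
```

Because n sits under a square root, quadrupling the sample size only halves the SE.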
40
/ when should 1) SD 2) SE be used
1) to describe the data you have
2) to show how confident you are in the estimate of the mean
41
what does a histogram show
distribution of data
puts data into categories e.g: age 1-5, 5-10
x axis = categories
y axis = frequency in each category
bars touch if continuous
42
how does a stem and leaf diagram work
stem: all but last digit
leaf: last digit
e.g: 43, 46, 47, 53, 54, 62
4| 3 6 7
5| 3 4
6| 2
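A small sketch that rebuilds the stem and leaf example above for whole numbers, assuming non-negative integers so that `divmod(value, 10)` splits each one into "all but the last digit" and "last digit". The function name is made up.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return one text line per stem, e.g. '4| 3 6 7'."""
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)   # stem: all but last digit, leaf: last digit
        groups[stem].append(leaf)
    return [
        f"{stem}| {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(groups.items())
    ]


lines = stem_and_leaf([43, 46, 47, 53, 54, 62])
for line in lines:
    print(line)
```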
43
adding or subtracting by constant number to each value in data when scaling 1) __________ SD 2) __________ mean
1) doesn't change SD
2) changes mean by the amount added or subtracted
44
when multiplying or dividing by scale 1) SD __________ 2) mean _________
1 and 2) both increase/decrease in proportion to the number multiplied or divided by
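A quick numeric check of the two scaling rules, using made-up data: adding a constant shifts the mean and leaves the SD alone, while multiplying by a (positive) constant scales both.

```python
import statistics

data = [10, 12, 14, 16, 18]
mean = statistics.mean(data)
sd = statistics.stdev(data)

# Add 5 to every value: mean moves by 5, SD is unchanged.
shifted = [x + 5 for x in data]
shifted_mean = statistics.mean(shifted)
shifted_sd = statistics.stdev(shifted)

# Multiply every value by 2: mean and SD both double.
scaled = [x * 2 for x in data]
scaled_mean = statistics.mean(scaled)
scaled_sd = statistics.stdev(scaled)
```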
45
when should SCALING and STANDARDISATION be used
SCALE: one person's weight in lbs, one in kg
STANDARDISE: a boy and a girl at 26 months both weigh 10kg; standardise using gender
46
/ what does z score show
number of SDs an observation is from the mean
47
/ + Z score = - Z score =
+ = observation is above the mean
- = observation is below the mean
48
/ what does it mean if the Z score is zero?
the observation equals the mean
49
/ Z score equation
z = (observation - mean) / SD
50
/ 1) mean of Z score = 2) SD of z score = only when ....
1) 0
2) 1
only when working with the whole data set they were collected from
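A minimal sketch of z scores computed from a data set's own mean and SD, which is the situation where the z scores themselves have mean 0 and SD 1. The data values are made up.

```python
import statistics


def z_scores(values):
    """z = (observation - mean) / SD, using the data set's own mean and SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]


data = [10, 12, 14, 16, 18]
zs = z_scores(data)

mean_of_z = statistics.mean(zs)   # expected to be 0 (up to rounding)
sd_of_z = statistics.stdev(zs)    # expected to be 1 (up to rounding)
```

Values below the mean (10, 12) get negative z scores, 14 equals the mean so its z score is 0, and 16 and 18 get positive z scores.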
51
/ in normal distribution curve what is the % from -1SD to +1SD
68.2%
52
/ imagine a normal distribution curve split into 8 'columns' , name the % of each column going up then down
0.13% 2.15% 13.6% 34.1% 34.1% 13.6% 2.15% 0.13%
(columns: below -3SD, then each 1SD-wide band from -3SD to +3SD, then above +3SD)
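A short sketch that recomputes these column percentages from the standard normal curve with `statistics.NormalDist`. The computed values differ slightly from the rounded figures on the card (for example about 2.14% rather than 2.15%).

```python
from statistics import NormalDist

standard_normal = NormalDist()          # mean 0, SD 1
edges = [-3, -2, -1, 0, 1, 2, 3]        # band boundaries in SD units

lower_tail = standard_normal.cdf(edges[0])
middle_bands = [
    standard_normal.cdf(b) - standard_normal.cdf(a)
    for a, b in zip(edges, edges[1:])
]
upper_tail = 1 - standard_normal.cdf(edges[-1])

bands = [lower_tail, *middle_bands, upper_tail]   # 8 columns in total
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

for share in bands:
    print(f"{share:.2%}")
```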
53
/ on a 'NORMAL DISTRIBUTION TABLE' what does each column mean
along the left side: z score to the first decimal place
along the top: second decimal place
e.g: 0.66 = 0.6 along left side, 0.06 along top
54
when can you use a normal distribution table
to find the proportion of data between the mean and a z score
e.g Q: what proportion of data lies between the mean and z = 0.66
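A sketch of the same lookup done in code rather than from a printed table, assuming the table in question gives the area between the mean and z (some tables give the cumulative area from the far left instead).

```python
from statistics import NormalDist


def area_between_mean_and(z):
    """Proportion of a normal distribution lying between the mean and z."""
    return abs(NormalDist().cdf(z) - 0.5)


area = area_between_mean_and(0.66)
print(f"{area:.4f}")
```

For z = 0.66 this comes to roughly 0.245, i.e. about 24.5% of the data.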
55
what's the difference between a: SAMPLE and POPULATION
SAMPLE: selection from the population
POPULATION: the whole, large group; everyone who fits the criteria
56
theory of sampling (3)
1) STATISTICAL ESTIMATION: point/interval estimate
2) TESTING HYPOTHESIS: accept/reject null
3) STATISTICAL INFERENCES: general population statement
57
limitations of sampling (5)
- less accurate
- changing of units
- misleading conclusion
- need special knowledge
- is sampling possible?
58
probability sampling methods (4)
1) simple random sampling
2) stratified sampling
3) systematic sampling
4) multistage sampling
59
non probability sampling methods (4)
1) deliberate sampling
2) convenience sampling
3) snowball sampling
4) quota sampling
60
PROBABILITY sampling methods: + and -
+: detailed info of population, measured precisely, unbiased
-: requires skill + expertise, time to plan, cost
61
simple random sampling characteristic
everyone has an equal chance of being chosen
e.g: using a random number generator
62
stratified sampling what are strata and what should they have
population split into strata (similar groups)
strata need homogeneity
same ratio in each stratum
63
systematic sampling + and -
order the population, then pick e.g: every 5th person
+: simple, smaller variance v ordered population
-: estimate error
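A rough sketch of simple random, systematic and stratified sampling using Python's `random` module. The population of numbered IDs, the strata names and the 10% sampling ratio are all hypothetical.

```python
import random

rng = random.Random(42)                 # fixed seed so the sketch is repeatable
population = list(range(1, 101))        # hypothetical IDs 1..100

# Simple random sampling: every ID has an equal chance of being chosen.
simple_sample = rng.sample(population, 10)

# Systematic sampling: random start, then every 5th ID in the ordered list.
step = 5
start = rng.randrange(step)
systematic_sample = population[start::step]

# Stratified sampling: the same sampling ratio is applied inside each stratum.
strata = {
    "undergrad": list(range(1, 61)),
    "postgrad": list(range(61, 101)),
}
ratio = 0.1
stratified_sample = {
    name: rng.sample(members, round(len(members) * ratio))
    for name, members in strata.items()
}
```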
64
summarise multistage sampling
e.g:
1) randomly select region
2) randomly select school in region
3) randomly select children in school
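A small sketch of the three multistage steps. The regions, schools and children are placeholder names, and picking a single region and a single school is a simplification; a real design would usually select several at each stage.

```python
import random

rng = random.Random(0)   # fixed seed so the sketch is repeatable

# Hypothetical nested sampling frame: region -> school -> children.
regions = {
    "north": {
        "school_a": ["n_a1", "n_a2", "n_a3", "n_a4"],
        "school_b": ["n_b1", "n_b2", "n_b3"],
    },
    "south": {
        "school_c": ["s_c1", "s_c2", "s_c3"],
        "school_d": ["s_d1", "s_d2", "s_d3", "s_d4"],
    },
}

region = rng.choice(sorted(regions))                     # stage 1: region
school = rng.choice(sorted(regions[region]))             # stage 2: school in that region
children = rng.sample(regions[region][school], 2)        # stage 3: children in that school
```

Only the selected school needs a full list of children, which is why a complete population list is not required.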
65
multistage sampling + and -
+: complete population list not needed, only need info on selected sample, cheaper if geographically defined
-: larger errors
66
NON PROBABILITY SAMPLING + and -
+: include important units, practical, representative of importance
-: risk of bias, not reliable
67
convenience sampling when to use (3)
use when:
- no clear population
- sampling not clear
- complete list of source not available
68
snowball sampling
contact a few people in the target group
get more contacts from these people
69
quota sampling
non random
select categories, then a quota e.g: 40% men, 60% women
actively look for people to fit this
bias
cheaper
70
factors that affect reliability of sample (5)
- size of sample
- representativeness
- homogeneity
- unbiased
- parallel sampling: another sample for test
71
3 errors in samples
1) SAMPLING VARIABILITY - different samples from the same population have different SD + mean
2) SAMPLING ERROR - mean of sample different to mean of population
3) NON SAMPLING ERROR - error when asking / recording results
72
SE formula for MEAN
SE = SD / √(number in sample)
73
when to use SE instead of SD
when using sample means to determine precision
74
The Central Limit theorem (3)
1) sample means will have a 'normal distribution' (for large enough samples)
2) mean of sample means = mean of population
3) SD of sample means = SE = SD / √n
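A simulation sketch of the Central Limit Theorem, assuming a deliberately skewed (exponential) made-up population. It repeatedly draws samples with replacement and compares the spread of the sample means with population SD / √n; the sample size of 40 and the 2,000 repeats are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed so the sketch is repeatable

# Hypothetical skewed population (exponential, so clearly not normal).
population = [rng.expovariate(1.0) for _ in range(10_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

sample_size = 40
sample_means = [
    statistics.mean(rng.choices(population, k=sample_size))
    for _ in range(2_000)
]

mean_of_means = statistics.mean(sample_means)    # should sit close to pop_mean
sd_of_means = statistics.stdev(sample_means)     # should sit close to expected_se
expected_se = pop_sd / math.sqrt(sample_size)

print(round(pop_mean, 3), round(mean_of_means, 3))
print(round(expected_se, 3), round(sd_of_means, 3))
```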
75
what numbers on 'normal table' show 95% 99%
95% = 1.96
99% = 2.58
76
SE formula for proportion: what does sample size have to be above to work?
SE = square root of [ p (1-p) / n ]
p = proportion
n = no. in sample
sample size must be above 30
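A minimal sketch of the SE of a proportion. The check that n is above 30 mirrors the card's rule of thumb; raising an error when it fails is a design choice made here for illustration.

```python
import math


def se_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be a proportion between 0 and 1")
    if n <= 30:
        raise ValueError("rule of thumb: sample size should be above 30")
    return math.sqrt(p * (1 - p) / n)


se = se_proportion(0.4, 100)   # e.g. 40 out of 100 people said yes
```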
77
how to calculate a CONFIDENCE INTERVAL
for 95%: sample mean +/- (1.96 X SE of mean)
for 99%: sample mean +/- (2.58 X SE of mean)
78
how to use answer from CONFIDENCE INTERVAL formula
you will get a +/- number
add/subtract this to/from your mean = upper and lower limit
can conclude you are 95%/99% confident of the _______ mean being between the lower and upper limit
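A worked sketch of the confidence interval calculation, assuming a sample mean of 50, an SD of 10 and 100 observations (made-up numbers), with the 1.96 and 2.58 multipliers from the normal table.

```python
import math

Z_95 = 1.96
Z_99 = 2.58


def confidence_interval(sample_mean, sd, n, z=Z_95):
    """Return (lower limit, upper limit) for the population mean."""
    se = sd / math.sqrt(n)      # standard error of the mean
    margin = z * se             # the +/- number
    return sample_mean - margin, sample_mean + margin


low95, high95 = confidence_interval(50, 10, 100)
low99, high99 = confidence_interval(50, 10, 100, z=Z_99)
```

Here SE = 10 / √100 = 1, so the 95% interval is 50 +/- 1.96 and the 99% interval is the wider 50 +/- 2.58.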
79
what is a POINT ESTIMATE, how to calculate
estimate of a population mean
add all sample values up, divide by the number of values, easy :)
80
what is an INTERVAL ESTIMATE
aka confidence interval
use CI formula
81
describe a high-lo plot: what is on it, what does it mean
UL, MEAN, LL
if they DON'T overlap: significant difference in mean
if they DO overlap: no significant difference
82
what is: NULL hypothesis
H0
NOT different from e.g: mean
83
what is: ALTERNATIVE HYPOTHESIS
H1
IS different
84
what is the alpha?
5% or 1%.
the significance level you choose, which sets the z score cut-off
e.g: at 5%, you accept a 5% risk of rejecting the null when it is actually true
85
what is a TYPE 1 ERROR
reject null hypothesis and accept alternative, BUT null is true
86
what is a TYPE 2 error
accept null, BUT alternative is true
87
what is a 2 TAILED TEST
reject H0 if statistic reaches the cut-off on either side, e.g: below -1.96 or above +1.96
88
what is a 1 TAILED TEST
reject H0 only if the statistic passes the cut-off on one specified side
specify greater than or less than, so risk is only in 1 tail
(at the 5% level the one-tailed cut-off is 1.645 rather than 1.96)
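A sketch of how the cut-off depends on alpha and on the number of tails, using `statistics.NormalDist.inv_cdf`. The function names and the `tail` labels ("two", "greater", "less") are made up for illustration.

```python
from statistics import NormalDist


def critical_z(alpha, two_tailed=True):
    """z cut-off for a given alpha; alpha is split in half for two tails."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


def reject_null(z, alpha=0.05, tail="two"):
    """Decide whether a z statistic falls in the rejection region."""
    if tail == "two":
        return abs(z) >= critical_z(alpha, two_tailed=True)
    cutoff = critical_z(alpha, two_tailed=False)
    if tail == "greater":
        return z >= cutoff
    if tail == "less":
        return z <= -cutoff
    raise ValueError("tail must be 'two', 'greater' or 'less'")


two_tailed_5 = critical_z(0.05)                      # about 1.96
two_tailed_1 = critical_z(0.01)                      # about 2.58
one_tailed_5 = critical_z(0.05, two_tailed=False)    # about 1.645
```

A statistic of z = 1.8 would not be rejected in a two-tailed test at 5%, but would be rejected in a one-tailed "greater than" test at the same alpha.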
89
in hypothesis 1) If data is skewed we use the ____ 2) if data is normal we use _____
1) median
2) mean
90
flat distribution is called ___________ peaked distribution is called _____________
flat = platykurtic
peaked = leptokurtic
91
what test do we use if: we know population SD
Z test
92
what test do we use if: don't know SD, sample size bigger than 30
z test
93
what test do we use if: don't know SD, sample size less than 30, data normal
t test
94
what test do we use if: don't know SD, sample size less than 30, data skewed
sign test
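The four test-choice cards above can be sketched as one decision function. The function name and arguments are hypothetical, and the cards do not say what to do at exactly n = 30, so treating it as a small sample is an assumption.

```python
def choose_test(population_sd_known, sample_size, data_normal):
    """Pick a test following the flashcard rules."""
    if population_sd_known:
        return "z test"
    if sample_size > 30:
        return "z test"
    # Assumption: n of exactly 30 is handled like the small-sample cases.
    if data_normal:
        return "t test"
    return "sign test"


examples = [
    choose_test(True, 12, True),      # population SD known
    choose_test(False, 50, False),    # SD unknown, large sample
    choose_test(False, 15, True),     # SD unknown, small sample, normal data
    choose_test(False, 15, False),    # SD unknown, small sample, skewed data
]
```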
95
how to use SPSS to test if data is skewed
skewness / SE of skewness: shows + or - skewed
kurtosis / SE of kurtosis: shows if peak is normal
if answer is between e.g: (95%) -1.96 and 1.96, data not skewed
96
difference between 1) QUALITATIVE 2) QUANTITATIVE research
1) describe/understand quality of something
2) measuring quantity of something
97
example of qualitative v quantitative E.G: is plan A best?
QUALITATIVE: plan A is best approach
QUANTITATIVE: plan A will make participants embarrassed
98
4 elements of research process
ONTOLOGY: what do we want to know?
EPISTEMOLOGY: what can we know and how?
METHODOLOGY: how can we get knowledge?
METHODS: procedures we can use?