Statistics - Sampling & Central tendencies Flashcards

(43 cards)

1
Q

Standard deviation indication

A

shows how spread out the values are around the mean & how volatile they are

In the same units as the data set

2
Q

Variance indication

A

average of the squared deviations of the points from the mean (in squared units)
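
A minimal sketch of how variance and standard deviation relate, assuming the population versions (dividing by n rather than n - 1). The function names and the data are made up for illustration.

```python
import math

def population_variance(values):
    """Mean of the squared deviations from the mean (squared units)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def population_std_dev(values):
    """Square root of the variance (same units as the data)."""
    return math.sqrt(population_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data, mean is 5
variance = population_variance(data)
std_dev = population_std_dev(data)
print(variance, std_dev)          # prints: 4.0 2.0
```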

3
Q

Outlier

A

any data point that lies an abnormal distance from the rest of the data

4
Q

Extrapolating

A

estimating a value outside the given data range

5
Q

Interpolating

A

estimating a value inside the given data range

6
Q

line of best fit

A

a straight line drawn through a scatter plot that best represents the trend in the data; used to predict values by interpolating or extrapolating along the line
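
A rough sketch of a least-squares line of best fit, used once to interpolate and once to extrapolate. Least squares is one common way to choose the line and is an assumption here; `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]      # illustrative data lying exactly on y = 2x + 1
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)

inside = slope * 2.5 + intercept   # interpolation: x = 2.5 is inside the range 1..5
outside = slope * 8 + intercept    # extrapolation: x = 8 is outside the range 1..5
```

Extrapolated values are less trustworthy than interpolated ones, because nothing in the data confirms that the trend continues outside the observed range.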

7
Q

venn diagram

A

a geometric representation of sets & their relation

7
Q

quartiles

A

show where certain percentages of the data lie: 25% of the data lies below Q1, 50% below Q2 (the median) and 75% below Q3

7
Q

Outlier test

A

Lower bound = Q1 - 1.5 x IQR: any value below this is an outlier
Upper bound = Q3 + 1.5 x IQR: any value above this is an outlier
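
A small sketch of the 1.5 x IQR test (lower bound Q1 - 1.5 x IQR, upper bound Q3 + 1.5 x IQR). It uses Python's `statistics.quantiles` with its default method; quartile conventions differ between textbooks and calculators, so Q1 and Q3 may not match a GDC exactly. `iqr_outliers` and the data are made up for illustration.

```python
import statistics

def iqr_outliers(values):
    """Return the values lying outside the 1.5 x IQR bounds."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # three cut points: Q1, Q2, Q3
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr
    return [x for x in values if x < lower_bound or x > upper_bound]

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]   # illustrative data
outliers = iqr_outliers(data)
print(outliers)                           # prints: [50]
```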

7
Q

set

A

collection of well-defined, unique objects

7
Q

list

A

a collection of objects in which order matters and repeats are allowed
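
As a loose illustration of the set/list distinction, Python's built-in `set` and `list` types behave the same way: a set keeps only unique objects, while a list keeps every entry, repeats included. The values are made up.

```python
values = [3, 1, 3, 2, 1]   # illustrative data with repeats

as_list = list(values)     # keeps all five entries, in order
as_set = set(values)       # keeps only the unique objects

print(len(as_list))        # prints: 5
print(len(as_set))         # prints: 3
```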

7
Q

PMCC

A

Pearson product-moment correlation coefficient
* only measures linear relationships
* - = negative correlation
* + = positive correlation
* always between -1 and 1
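
A minimal sketch of the PMCC calculation, r = Sxy / sqrt(Sxx x Syy), which always gives a value from -1 to 1. The function name `pmcc` and the data are hypothetical.

```python
import math

def pmcc(xs, ys):
    """Pearson product-moment correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_positive = pmcc(xs, [2, 4, 6, 8, 10])   # perfect positive linear relationship
r_negative = pmcc(xs, [10, 8, 6, 4, 2])   # perfect negative linear relationship
```

Here `r_positive` comes out at 1 and `r_negative` at -1, the two extremes of the scale; values near 0 indicate little or no linear relationship.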

8
Q

Ogive

A

cumulative frequency curve

9
Q

central tendency

A

mean, mode, median = descriptive summary of data
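
A quick sketch of the three measures using Python's standard `statistics` module. The data is made up.

```python
import statistics

data = [1, 2, 2, 3, 7]   # illustrative data

mean_value = statistics.mean(data)       # sum of values / number of values
median_value = statistics.median(data)   # middle value once sorted
mode_value = statistics.mode(data)       # most frequent value

print(mean_value, median_value, mode_value)   # prints: 3 2 2
```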

10
Q

Skew

A

measures which side most of the data lies on
negative skew = longer tail on the left, most values lie at the higher end
positive skew = longer tail on the right, most values lie at the lower end
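
One way to check the direction of skew numerically is the moment coefficient of skewness (the mean cubed deviation divided by the cube of the standard deviation). This is a sketch under that assumption; the cards do not name a formula, and the data is invented.

```python
def skewness(values):
    """Moment coefficient of skewness: positive = right tail, negative = left tail."""
    n = len(values)
    mean = sum(values) / n
    std_dev = (sum((x - mean) ** 2 for x in values) / n) ** 0.5
    return sum((x - mean) ** 3 for x in values) / (n * std_dev ** 3)

right_tailed = [1, 2, 2, 3, 3, 3, 10]          # most values low, long tail to the right
left_tailed = [11 - x for x in right_tailed]   # mirror image: long tail to the left

positive_skew = skewness(right_tailed)   # greater than 0
negative_skew = skewness(left_tailed)    # less than 0
```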

11
Q

Analysing histograms

A

CSOS
* centre
* spread
* outlier
* shape

12
Q

Shape

A
  • number of peaks = unimodal, bimodal, multimodal
  • symmetry & skew
13
Q

unreliable data

A

data is unreliable if there is
* missing data
* errors in handling the data

14
Q

sufficient data

A

if there is enough data to support your conclusion

15
Q

How is standard dev. affected when a constant is added or subtracted

A

unaffected: all values shift by the same amount, so the distances between values stay the same

16
Q

how is standard dev. affected when a constant is multiplied or divided

A

standard deviation is also multiplied or divided by that constant (by its absolute value if the constant is negative), as the distances between the values are scaled by the same factor

17
Q

How is the mean affected when a constant is added to every value

A

the constant is also added to the mean, as every value shifts by the same amount
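
A short check of the three rules above (adding a constant, multiplying by a constant, and the effect on the mean), using the standard `statistics` module with the population standard deviation `pstdev`. The data and the constants 10 and 3 are arbitrary.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]     # illustrative data: mean 5, standard deviation 2
shifted = [x + 10 for x in data]    # add a constant to every value
scaled = [x * 3 for x in data]      # multiply every value by a constant

mean_original = statistics.mean(data)
sd_original = statistics.pstdev(data)

mean_shifted = statistics.mean(shifted)   # mean + 10
sd_shifted = statistics.pstdev(shifted)   # unchanged

mean_scaled = statistics.mean(scaled)     # mean x 3
sd_scaled = statistics.pstdev(scaled)     # standard deviation x 3
```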

18
Q

Target population

A

the population from which you take a sample

19
Q

Sampling Unit

A

single member that is chosen to be sampled

19
Sampling frame
list of all the items/people that could be sampled
20
Sampling values
possible values the sampling variable can take
20
Sampling Variable
variable under investigation
21
Bias in sampling
* non-response
* bad design
* bias in the respondent
* some members are excluded
22
Reliable data
data is reliable when you can collect it again and get similar results
23
Sufficient data
Data is sufficient if there is enough data available
24
Qualitative Data
* opinion based
* expressed in words
* can be described
* ONLY mode can be calculated
25
Quantitative data
* expressed in numbers
* can be discrete or continuous
* can be measured
* can be counted
26
Discrete data
* countable
* in distinct categories
* finite values
Types of graph
* dot plot
* bar chart
27
Continuous data
* measurable
* can always be measured more accurately and to a higher resolution
* infinite values
Graph
* histogram
* graph (for example, cumulative frequency)
28
Simple random sampling
Sampling units are assigned numbers and a random number generator is used to pick the sample
Pros
* everyone has an equal chance of being chosen = bias free
* simple & cheap
Cons
* not suitable for large populations
* needs a sampling frame
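
A minimal sketch of simple random sampling with Python's `random` module. The sampling frame of 50 numbered students and the sample size of 5 are invented; the seed is fixed only so the run is repeatable.

```python
import random

# Hypothetical sampling frame: every sampling unit gets a number.
frame = [f"student-{i}" for i in range(1, 51)]

rng = random.Random(42)        # seeded generator, for a repeatable example
sample = rng.sample(frame, 5)  # 5 units chosen without replacement, all equally likely

print(sample)
```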
28
Systematic Sampling
k = population size / sample size
Assign a number to everyone in the sampling frame, pick a random starting point between 1 and k, then take every kth member
Pros
* simple & quick to use
* suitable for large sample sizes
Cons
* might be biased when you choose who to start on
* sampling frame needed
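
A rough sketch of systematic sampling, assuming k is the population size divided by the sample size and the starting point is picked at random between 1 and k. `systematic_sample` and the frame of 20 numbered members are hypothetical.

```python
import random

def systematic_sample(frame, sample_size, seed=0):
    """Pick a random start between 1 and k, then take every kth member."""
    rng = random.Random(seed)          # seeded only to make the example repeatable
    k = len(frame) // sample_size      # sampling interval
    start = rng.randint(1, k)          # 1-based starting position
    return frame[start - 1::k][:sample_size]

frame = list(range(1, 21))             # members numbered 1..20
sample = systematic_sample(frame, sample_size=5)   # k = 4
print(sample)
```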
28
Quota Sampling
Split the population into groups based on chosen qualities, then handpick items from each group until the quota is satisfied
Pros
* ensures variety in the sample
* allows small groups to be represented
* no sampling frame required
Cons
* biased in choosing = not random
28
Stratified sampling
Put items into strata with common characteristics
Find each stratum's proportion within the population = stratum size / population size
Perform random sampling within each stratum
Pros
* random
* represents the different groups in proportion to their share of the population
Cons
* needs a sampling frame
* same cons as random sampling within each stratum
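
A sketch of stratified sampling: each stratum contributes a share of the sample equal to its share of the population, and units are picked at random within each stratum. The strata, their sizes and the function name are made up; rounding can leave the total slightly off the target when the proportions are not whole numbers.

```python
import random

def stratified_sample(strata, sample_size, seed=0):
    """Randomly sample from each stratum in proportion to its size."""
    rng = random.Random(seed)   # seeded only to make the example repeatable
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        share = round(sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, share)
    return sample

# Hypothetical population of 100 split into three strata.
strata = {
    "year 10": [f"Y10-{i}" for i in range(60)],
    "year 11": [f"Y11-{i}" for i in range(30)],
    "year 12": [f"Y12-{i}" for i in range(10)],
}
sample = stratified_sample(strata, sample_size=10)
sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)   # prints: {'year 10': 6, 'year 11': 3, 'year 12': 1}
```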
28
Convenience sampling
Pick whoever is most convenient / in closest proximity
Pros
* easy & inexpensive
Cons
* sample is unlikely to reflect the population
* highly biased
28
Clustered sampling
Split the population into groups (clusters) that each contain a mix of different kinds of people
Select one or more whole groups at random
Only sample from the selected groups
29
things to remember when making box plots
Check for outliers
30
Estimating the mean (or any measure of central tendency)
* in a grouped frequency table, take the midpoint of each class interval as its value
* use the same frequencies
* enter into a spreadsheet = calculate
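
A minimal sketch of the estimate: each class is replaced by its midpoint, weighted by the class frequency. The class boundaries and frequencies are invented, and `estimated_mean` is a hypothetical helper.

```python
def estimated_mean(classes):
    """Estimate the mean from (lower, upper, frequency) class intervals."""
    total_frequency = sum(freq for _, _, freq in classes)
    weighted_sum = sum((low + high) / 2 * freq for low, high, freq in classes)
    return weighted_sum / total_frequency

# Illustrative grouped data: 0-10 (3 items), 10-20 (5 items), 20-30 (2 items).
classes = [(0, 10, 3), (10, 20, 5), (20, 30, 2)]
mean_estimate = estimated_mean(classes)
print(mean_estimate)   # prints: 14.0
```

The result is only an estimate, because the exact values inside each class are unknown.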
31
Finding measures of central tendency on a GDC
* enter the lists (x / frequency)
* click menu
* click stats
* click stat calc
* click one variable calc
* done :D