Statistics Flashcards

(31 cards)

1
Q

discrete vs continuous data

A

discrete:
- can only take specific, separate values, eg shoe size

continuous:
- can take any value within a range, eg height

2
Q

definition:

population

A

the whole set of items (sampling units) that could be selected for the sample

3
Q

definition

sampling unit

A

a single member of the population

4
Q

definition

sample

A

a selection of sampling units observed in order to draw conclusions about the population as a whole

5
Q

definition

sampling frame

A

a list of all members of the population

6
Q

advantages and disadvantages:

sample

A

advantages:
* less time consuming / expensive than a census
* fewer people have to respond
* less data to process than a census

disadvantages:
* data may not be as accurate as a census
* may not be large enough to give info about small sub groups of the population

7
Q

dis/advantages

census

A

pros
* should give a completely accurate result

cons
* time consuming / expensive
* can’t be used when the testing process destroys the item
* hard to process a large quantity of data

8
Q

Systematic sampling definition

A

A sample is formed by choosing members of a population at regular intervals using a list
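
The idea of choosing at regular intervals from a list can be sketched in Python. This is only an illustration: the numbered list `frame`, the sample size and the random starting point are assumptions, not part of the card, and it assumes the sample size is no larger than the list.

```python
import random

def systematic_sample(frame, n):
    """Pick n members from the list `frame` at regular intervals."""
    k = len(frame) // n            # interval: take every k-th member
    start = random.randrange(k)    # assumed: random start within the first k members
    return frame[start::k][:n]

frame = list(range(1, 101))        # hypothetical list of 100 numbered members
sample = systematic_sample(frame, 10)
print(sample)
```

Each run prints 10 members spaced 10 apart; which ones depends on the random start.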

9
Q

stratified sampling

A
  • population divided into specific groups (strata) & a random sample taken from each group
  • the proportion sampled from each group equals the proportion the sample size n is of the total population N, ie number sampled from a group = (number in group / N) × n
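
The proportional allocation can be checked with a short Python sketch. The group names and sizes below are made up for illustration, and rounding to whole people is an assumption (rounded numbers may not always add up exactly to n).

```python
def stratum_sample_sizes(strata, n):
    """Number to sample from each group: (number in group / N) * n, rounded."""
    N = sum(strata.values())                      # total population
    return {name: round(size * n / N) for name, size in strata.items()}

strata = {"Year 12": 120, "Year 13": 80}          # hypothetical groups
print(stratum_sample_sizes(strata, 50))           # {'Year 12': 30, 'Year 13': 20}
```
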
10
Q

pros and cons of stratified sampling

A

PROS
* useful when there are very different groups in the population
* sample is representative of the population structure
* members are selected randomly

CONS
* can’t be used if it is not possible to split the population into specific groups
* same cons as simple random sampling (eg a sampling frame is needed)

11
Q

opportunity sampling

A

sample is formed using available members of the population who fit the criteria

12
Q

Pros and cons of opportunity sampling

A

PROS
* Quick and easy
* useful when a list of the population is not available

CONS
* unlikely to be representative of population structure
* likely to produce biased results

13
Q

pros and cons of quota sampling

A

PROS
* useful when a sampling frame is not available
* sample will be representative of the population structure

CONS
* may introduce bias as some members of the population may choose not to be sampled

14
Q

in a data set

outliers are

A

any data points more than 2 standard deviations above or below the mean
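
A rough Python sketch of this rule is below. The data set is invented, and using the population standard deviation (`statistics.pstdev`) rather than the sample version is an assumption.

```python
import statistics

def outliers_2sd(data):
    """Values more than 2 standard deviations away from the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)   # assumed: population standard deviation
    return [x for x in data if abs(x - mean) > 2 * sd]

data = [10, 12, 11, 13, 12, 11, 10, 12, 11, 40]   # hypothetical data set
print(outliers_2sd(data))                          # [40]
```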

15
Q

in a box plot

outliers are

A

any data point that is more than 1.5 × IQR above the upper quartile or below the lower quartile

16
Q

how to work out estimated mean in a frequency table

A
  • find the mid interval value (x) of each class
  • multiply it by the frequency (f) to get fx
  • estimated mean = Σfx / Σf
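
As a worked illustration, the steps above could be written in Python like this. The grouped frequency table is made up, with each class given as (lower bound, upper bound, frequency).

```python
def estimated_mean(classes):
    """Estimated mean = sum of f*x over sum of f, with x the mid interval value."""
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    total_f = sum(f for _, _, f in classes)
    return total_fx / total_f

# hypothetical table: (lower bound, upper bound, frequency)
classes = [(0, 10, 4), (10, 20, 6), (20, 40, 10)]
print(estimated_mean(classes))   # (4*5 + 6*15 + 10*30) / 20 = 20.5
```
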
17
Q

coding

measure of location is affected by:
measure of spread is affected by:

A

measure of location is affected by: all operations
measure of spread is affected by: only multiplication or division
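
A quick numerical check of this, as a sketch only: the data and the coding y = (x - a) / b with a = 100 and b = 10 are invented for illustration.

```python
import statistics

x = [110, 120, 130, 140, 150]      # hypothetical raw data
a, b = 100, 10                     # hypothetical coding: y = (x - a) / b
y = [(xi - a) / b for xi in x]

mean_x, sd_x = statistics.mean(x), statistics.pstdev(x)
mean_y, sd_y = statistics.mean(y), statistics.pstdev(y)

# location: both the subtraction and the division change the mean
print(mean_y, (mean_x - a) / b)
# spread: only the division changes the standard deviation
print(sd_y, sd_x / b)
```

Each `print` line shows the same value twice (up to floating point rounding), once from the coded data and once from the raw statistic.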

18
Q

linear interpolation

what do you do to the value when finding quartiles / percentiles for discrete data?

A
  • position (eg n/4) is a decimal number: round up and take that data value
  • position is a whole number: take the average of that data value and the next one
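
The rule can be sketched in Python as follows. The function name and data are hypothetical; the position is taken as (num/den) × n, so num=1, den=4 gives the lower quartile and num=3, den=4 the upper quartile.

```python
def quantile_discrete(sorted_data, num, den):
    """Quartile or percentile of sorted discrete data at position (num/den) * n."""
    n = len(sorted_data)
    k, remainder = divmod(num * n, den)
    if remainder == 0:
        # whole number position k: average the k-th and (k+1)-th values
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    # decimal position: round up to the (k+1)-th value, which is index k
    return sorted_data[k]

data8 = [2, 3, 5, 7, 8, 9, 11, 12]        # n = 8, so n/4 = 2 (whole number)
data7 = [2, 3, 5, 7, 8, 9, 11]            # n = 7, so n/4 = 1.75 (decimal)
print(quantile_discrete(data8, 1, 4))     # 4.0, average of 2nd and 3rd values
print(quantile_discrete(data7, 1, 4))     # 3, the 2nd value
```
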
19
Q

How to work out outliers?

A

a value is an outlier if it is not in the range:
Q1 - 1.5(IQR) to Q3 + 1.5(IQR), where IQR = Q3 - Q1
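
A minimal Python sketch of this check is below. The data and quartile values are invented, and the quartiles are passed in rather than calculated.

```python
def find_outliers(data, q1, q3):
    """Values outside Q1 - 1.5*IQR to Q3 + 1.5*IQR."""
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < lower or x > upper]

data = [1, 12, 14, 15, 17, 18, 20, 40]    # hypothetical data set
print(find_outliers(data, q1=13, q3=19))  # range is 4 to 28, so [1, 40]
```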

20
Q

2 events CANNOT be both:

A

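independent and mutually exclusive (when both events have non-zero probability)

because
- when mutually exclusive: P(A ∩ B) = 0
- when independent: P(A ∩ B) = P(A) × P(B), which is not 0 when P(A) and P(B) are both non-zero, so the two conditions cannot both hold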

21
Q

to work out P(A | B’):

A

P(A ∩ B’) / P(B’)
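
A small worked example in Python, using exact fractions. The three probabilities are made up, and the step P(A ∩ B’) = P(A) - P(A ∩ B) is the usual Venn diagram result rather than something stated on the card.

```python
from fractions import Fraction

p_a = Fraction(1, 2)          # hypothetical P(A)
p_b = Fraction(2, 5)          # hypothetical P(B)
p_a_and_b = Fraction(1, 5)    # hypothetical P(A ∩ B)

p_not_b = 1 - p_b                         # P(B')
p_a_and_not_b = p_a - p_a_and_b           # P(A ∩ B') = P(A) - P(A ∩ B)
p_a_given_not_b = p_a_and_not_b / p_not_b

print(p_a_given_not_b)                    # 1/2
print(p_a_and_b == p_a * p_b)             # True: A and B are independent here
```

In this example P(A | B’) comes out equal to P(A), which is what independence would lead you to expect.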

22
Q

probability

condition for independence:

A

P(A ∩ B) = P(A) × P(B)

23
Q

condition for mutually exclusive:

A

P(A ∩ B) = 0

24
Q

What is a histogram?

A
  • A histogram is used for grouped continuous data, whereas a bar chart is used for discrete or qualitative data
  • There are no gaps between the bars
  • In a bar chart the frequency is read from the height of the bar; in a histogram the height of the bar is the frequency density
  • On a histogram frequency density is plotted on the y-axis. This allows a histogram to be plotted for unequal class intervals
  • It is particularly useful if the data is spread out at either or both ends
  • The area of each bar on a histogram is proportional to the frequency in that class
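
To illustrate the link between bar height and area, here is a short Python sketch. It assumes the usual definition frequency density = frequency / class width (which makes the area equal to the frequency), and the table of classes is invented.

```python
def frequency_densities(classes):
    """Bar heights: frequency / class width for each (lower, upper, frequency)."""
    return [f / (hi - lo) for lo, hi, f in classes]

# hypothetical unequal classes: (lower bound, upper bound, frequency)
classes = [(0, 10, 5), (10, 15, 10), (15, 30, 15)]
densities = frequency_densities(classes)
print(densities)                              # [0.5, 2.0, 1.0]

# area of each bar = height * width, which gives back the frequency
areas = [d * (hi - lo) for d, (lo, hi, f) in zip(densities, classes)]
print(areas)                                  # [5.0, 10.0, 15.0]
```
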
25
Q

give a reason to justify the use of a histogram to represent these data

A

the data is continuous and given in a grouped frequency table (with unequal class widths)

26
Q

explain why the calculated mean and standard deviation are only estimates

A

because the data is grouped, so the exact values are not known

27
Q

definition

census

A

observation of every member of a population to make a conclusion

28
Q

Write down the underlying feature associated with each of the bars in a histogram.

A

the area of the bar is proportional to the frequency

29
Q

Explain why a linear model may be appropriate to describe the relationship between f and d (positive correlation) (1)

A

the points lie reasonably close to a straight line

30
Q

Reliability of extra data points using line of best fit (2)

A

* reliable: interpolation as .... within range of values collected
* unreliable: extrapolation as .... outside range of data collected

31
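Q

y = axⁿ and y = kbˣ as an equation of a straight line

A

* y = axⁿ: log y = log a + n log x, so Y = C + nX with Y = log y, X = log x, C = log a (gradient n)
* y = kbˣ: log y = log k + x log b, so Y = C + x log b with Y = log y, C = log k (log b is the constant gradient)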
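
Both log forms can be checked numerically with a short Python sketch. The constants a, n, k, b and the x values are invented, and base 10 logs are an assumption (any base works as long as it is used throughout).

```python
import math

# y = a * x**n: plotting log y against log x gives gradient n, intercept log a
a, n = 3.0, 2.0                               # hypothetical constants
xs = [1.0, 10.0, 100.0]
X = [math.log10(x) for x in xs]
Y = [math.log10(a * x**n) for x in xs]
gradient_1 = (Y[-1] - Y[0]) / (X[-1] - X[0])
intercept_1 = Y[0] - gradient_1 * X[0]
print(gradient_1, 10**intercept_1)            # close to n = 2 and a = 3

# y = k * b**x: plotting log y against x gives gradient log b, intercept log k
k, b = 2.0, 5.0                               # hypothetical constants
xs2 = [0.0, 1.0, 2.0]
Y2 = [math.log10(k * b**x) for x in xs2]
gradient_2 = (Y2[-1] - Y2[0]) / (xs2[-1] - xs2[0])
intercept_2 = Y2[0] - gradient_2 * xs2[0]
print(10**gradient_2, 10**intercept_2)        # close to b = 5 and k = 2
```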