Biostatistics Flashcards Preview


Flashcards in Biostatistics Deck (104):
1

 

 

 

What are the two types of statistics? 

 

 

 

Descriptive and Inferential 

2

 

 

 

 

Define population

 

 

 

 

An aggregate of subjects we want to study, for example:

  • Things
  • Cases
  • Bacteria
  • Animals
  • Humans

3

 

 

 

 

Define sample

 

 

 

A sample is a set of observations drawn from a population.

4

 

 

 

Define observation

 

 

 

 

Study unit / subject / individual

5

 

 

 

 

Define variable

 

 

 

Quality or quantity measured for each subject in the sample (age, sex, colour, weight)

6

 

 

 

 

Define dataset

 

 

 

A set of values on all variables of interest for all
observations in the study

7

 

 

 

 

Define parameters

 

 

 

 

Parameters are quantities used to describe characteristics of the population

8

 

 

 

 

Parameters are quantities such as:

 

 

 

Mean height of Swedish men

 

Prevalence of Hepatitis C in Swedish drug users

Proportion of breast cancer patients who develop another cancer

9

 

 

 

 

μ

 

 

 

 

Population mean

10

 

 

 

 

σ²

 

 

 

 

population variance

11

 

 

 

 

p

 

 

 

 

population proportion

12

 

 

 

 

Define target population

 

 

 

The population to whom we wish to
generalize our findings

13

 

 

 

 

Define study population

 

 

 

 

The population from which we sample


14

 

 

 

 

What are the measures of central tendency?

 



Median

Mean 

Mode 

15

 

 

 

What measure of central tendency is good to use when the data contain outliers?

 

 

 

 

Median
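
As a small illustration (the numbers are invented for this sketch), a single extreme value shifts the mean much more than the median:

```python
import statistics

# Hypothetical lengths of hospital stay in days; 60 is an outlier
stays = [2, 3, 3, 4, 5, 60]

mean_stay = statistics.mean(stays)      # pulled upward by the outlier
median_stay = statistics.median(stays)  # middle of the sorted values

print(mean_stay, median_stay)
```

Here the median stays among the typical values, while the mean lands far above most of the observations.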

16

 

 

 

 

Define mode

 

 

 

The mode is the most frequently occurring value in the data

 

16

 

 

 

 

S²

 

 

 

Sample variance

17

 

 

 

 

S

 

 

 

 

Standard deviation of a sample

18

 

 

 

How is the standard deviation calculated?

 

 

 

By taking the square root of the variance
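
A minimal sketch of this relationship, using Python's standard `statistics` module and an invented sample (the data values are assumptions for illustration only):

```python
import math
import statistics

# Hypothetical sample of systolic blood pressures (mmHg)
sample = [118, 122, 125, 130, 135]

s2 = statistics.variance(sample)  # sample variance (divides by n - 1)
s = math.sqrt(s2)                 # standard deviation = square root of the variance

# statistics.stdev computes the same quantity directly
print(s2, s, statistics.stdev(sample))
```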

19

 

 

 

What does a low standard deviation indicate?

 

 

A low standard deviation indicates that the data points tend to be very close to the mean

20

 

 

 

 

What does a high standard deviation indicate?

 

 

A high standard deviation indicates that the data points are spread out over a large range of values

21

 

 

 

What does the standard deviation tell us? 

 

 

 

It tells us how much variation or "dispersion" exists around the average (mean, or expected value)

22

 

 

 

What does the variance tell us? 

 

 

 

The variance describes how far the values lie from the mean (expected value); it is the average of the squared deviations from the mean

 

23

 

 

 

What is the constant for a 90% confidence interval?

 

 

 

C = 1.645 (often rounded to 1.64)

24

 

 

 

What is the constant for a 95% confidence interval?

 

 

 

 

C = 1.96 

25

 

 

 

What is the constant for a 99% confidence interval?

 

 

 

 

C = 2.58
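
The three constants above are the two-sided critical values of the standard Normal distribution. A short sketch of how they can be reproduced with Python's standard library (the helper name `z_constant` is made up for this example):

```python
from statistics import NormalDist

def z_constant(confidence):
    """Two-sided critical value of the standard Normal distribution."""
    alpha = 1 - confidence
    # Leave alpha / 2 in each tail of the distribution
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_constant(level), 2))
```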

26

 

 

 

What is a stochastic or random variable?

 

 

 

A variable whose value is subject to variation due to chance

27

 

 

 

[Image: symbol or formula not available]

 

 

 

 

Sample mean

28

 

 

[Image: symbol or formula not available]

 

 

 

 

Population mean 

28

 

 

[Image: symbol or formula not available]

 

 

 

Population variance 

(sigma squared)

29

 

 

[Image: symbol or formula not available]

 

 

 

 

Sample variance 

30

 

 

 

 

What is a nominal variable? 

 

 

 

A variable that assumes values that fall into unordered categories (e.g. marital status, place of birth)

31

 

 

 

What is a binary or dichotomous variable? 

 

 

 

A nominal variable with only two categories (e.g. gender, yes/no)

32

 

 

 

 

What is an ordinal variable?

 

 

A variable that assumes values that fall into ordered categories, for example:

Disease status: minor, moderate, and severe

Blood pressure: low, normal, and high

33

 

 

 

What is the interquartile range?

 

 

 

The interquartile range is equal to Q3 minus Q1
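
A brief sketch of this calculation with invented data. Note that several conventions exist for computing quartiles, so other tools may give slightly different values of Q1 and Q3; this example uses the default method of Python's `statistics.quantiles`:

```python
import statistics

# Hypothetical data set with 11 observations
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 splits the data into quartiles and returns the cut points Q1, Q2, Q3
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # interquartile range = Q3 minus Q1

print(q1, q3, iqr)
```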

34

 

 

 

Quantitative variables can either be: 

 

 

 

 

Discrete or continuous

35

 

 

 

 

Define discrete variable  

 

 

 

A variable that can take only a countable set of distinct values (typically whole numbers), for example the number of children in a family or the number of cigarettes smoked per day.

36

 

 

 

Define continuous variable

 

 

 

A variable with a potentially infinite number of possible values along a continuum, for example height or weight.

37

 

 

 

Explain the range of a distribution

 

 

 

The difference between the largest and smallest values in a distribution.

 

38

 

 

 

The number of successes that result from the binomial experiment is denoted by the symbol

 

 

 

 

 

X

39

 

 

 

The number of trials in the binomial experiment is denoted by the symbol

 

 

 

 

n

40

 

 

 

The probability of success on an individual trial in a binomial experiment is denoted by the symbol...

 

 

 

 

P

41

 

 

The probability of failure on an individual trial in a binomial experiment is denoted by...

 

 

 

 

1 - P
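
To show how X, n, P and 1 - P fit together, here is a minimal sketch of the binomial probability formula. The function name `binomial_pmf` and the example values are assumptions made for illustration:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x): probability of exactly x successes in n independent trials."""
    # comb(n, x) counts the ways to place x successes among n trials;
    # each such outcome has probability p**x * (1 - p)**(n - x)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical example: 10 trials, success probability 0.5 on each trial
n, p = 10, 0.5
print(binomial_pmf(5, n, p))

# The probabilities over all possible values of X add up to 1
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
print(total)
```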

 

42

 

 

 

The mean of any distribution is also called...

 

 

 

 

Expectation

43

 

 

 

Both standard deviation and standard error (SE) are calculated from the...

 

 

 

 

 

 

 

Variance 

44

 

 

 

When calculating variance why do we square the deviations?

 

 

 

To eliminate negative values, so that deviations above and below the mean do not cancel each other out

45

 

 

 

 

How is the standard error calculated? 

 

By dividing the standard deviation by the square root of the sample size n
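
A short sketch of this calculation (the measurements are invented for the example):

```python
import math
import statistics

# Hypothetical sample of five measurements
sample = [4.1, 4.8, 5.0, 5.3, 5.8]

n = len(sample)
s = statistics.stdev(sample)  # sample standard deviation
se = s / math.sqrt(n)         # standard error of the mean

print(s, se)
```

Because the divisor grows with n, the standard error shrinks as the sample size increases even if the standard deviation stays the same.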

 

 

 

46

 

 

 

 

What measure of dispersion is good to use with the median?

 

 

 

 

Percentiles or quartiles 

47

 

 

 

What is a type I error?

 

 

 

A Type I error occurs when the researcher rejects a null hypothesis that is true.

48

 

 

 

 

What is a type II error?

 

 

 

A Type II error occurs when the researcher fails to reject a null hypothesis that is false.

49

 

 

 

What is the confidence interval used for? 

 

 

 

The confidence interval is used to express the degree of uncertainty associated with a sample statistic.

50

 

 

 

What is a continuous variable?

 

 

A variable that can take on any value between its minimum value and its maximum value.

51

 

 

 

 

Z-score is also called...

 

 

 

 

Standard score

52

 

 

 

 

What does a Z-score indicate?

 

 

 

It indicates how many standard deviations an element is from the mean.

53

 

 

 

 

How is the Z-score calculated? 

 

 

 

       

 

z = (x − μ) / σ, where x is the observed value, μ is the mean and σ is the standard deviation
54

 

 

How is the variance of a population calculated? 


 

 

 

 

σ² = Σ (xᵢ − μ)² / N, where μ is the population mean and N is the number of values in the population
55

 

 

 

What does the horizontal line inside the box of a box plot represent?

 

 

 

 

It represents the median, i.e. the 50th percentile

 

56

 

 

 

 

What type of variables are histograms good for?

 

 

 

 

Continuous variables 

57

 

 

 

 

What does the lower limit of the box in a box plot represent? 

 

 

 

 

The 25th percentile

58

 

 

 

 

What does the upper limit of the box in a box plot represent? 

 

 

 

 

The 75th percentile 

59

 

 

 

 

What does the lower whisker of a box plot represent?

 

 

 

 

It is the smallest value that lies no more than 1.5 times the interquartile range below the lower limit of the box

60

 

 

 

 

What does the upper whisker of a box plot represent?

 

 

 

 

It is the largest value that lies no more than 1.5 times the interquartile range above the upper limit of the box

61

 

 

 


What do the outer dots in a box plot represent?

 

 

Outliers: values greater than the upper whisker or smaller than the lower whisker
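
The box plot cards above can be tied together in a small sketch. The data are invented, and the quartiles come from the default method of `statistics.quantiles`; plotting libraries may use a slightly different quartile convention:

```python
import statistics

# Hypothetical data with one extreme value
data = sorted([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30])

q1, median, q3 = statistics.quantiles(data, n=4)  # box limits and median line
iqr = q3 - q1

# Whiskers may reach at most 1.5 * IQR beyond the box
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

inside = [x for x in data if lower_fence <= x <= upper_fence]
lower_whisker = min(inside)  # smallest value within the lower fence
upper_whisker = max(inside)  # largest value within the upper fence

# Anything beyond the whiskers is drawn as a separate dot
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, median, q3, lower_whisker, upper_whisker, outliers)
```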

62

 

 

 

In a Normal distribution, what percentage of the observations lie within 1 standard deviation of the mean?

 

 

 

 

About 68%

63

 

 

 

 

In a Normal distribution, what percentage of the observations lie within 2 standard deviations of the mean?

 

 

 

 

About 95%
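
These two percentages hold for Normally distributed data and can be checked against the standard Normal distribution. A minimal sketch using Python's standard library:

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, standard deviation 1

# Probability mass between -k and +k standard deviations
within_1 = z.cdf(1) - z.cdf(-1)
within_2 = z.cdf(2) - z.cdf(-2)

print(round(within_1, 4), round(within_2, 4))
```

The exact 95% interval corresponds to 1.96 standard deviations, which is why "2 standard deviations" is only an approximation.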

64

 

 

 

 

The standard deviation has the same unit as the...? 

 

 

 

 

Mean

65

 

 

 

 

Name four characteristics of the Normal distribution

 

• meant for continuous variables


• defined from minus infinity to plus infinity

• symmetrical and bell-shaped

• centered about its mean

66

 

 

 

A Normal distribution with mean
zero and variance one is called 

 

 

 

 

standard Normal distribution.

67

 

 

 

 

Name five sampling schemes 

 

 

Simple random sampling 

Systematic sampling 

Stratified sampling 

Cluster sampling 

Non-probability sampling 

68

 

 

 

 

Simple random sample

 

 

 

Every sampling unit is equally likely to be included in the sample

69

 

 

 

 

Systematic sampling

 

 

A statistical method involving the selection of elements from an ordered sampling frame.

Example: one random starting point is generated, then every 5th unit is chosen.

70

 

 

 

 

Stratified sampling

 

Divide the population into strata; draw random samples within each stratum;


sampling fractions may vary across strata

 

It ensures that all the strata are represented

71

 

 

 

 

Cluster sampling

 

 

 

Identify clusters or groups of units in the population (e.g. families); draw a
random sample of clusters rather than of units (e.g. individuals)

72

 

 

 

Non-probability sampling

 

 

 

 

 

Convenience sampling schemes (e.g. volunteers)

 

Prone to bias
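
The four probability-based schemes above can be sketched in a few lines. Everything here is illustrative: the sampling frame of 100 unit IDs, the two strata, the cluster size and the sample sizes are all assumptions made for this example.

```python
import random

rng = random.Random(0)            # fixed seed so the sketch is reproducible
population = list(range(1, 101))  # hypothetical sampling frame of 100 unit IDs

# Simple random sampling: every unit is equally likely to be selected
simple = rng.sample(population, 10)

# Systematic sampling: random starting point, then every 10th unit
step = 10
start = rng.randrange(step)
systematic = population[start::step]

# Stratified sampling: draw a random sample within each stratum
strata = {"low": population[:50], "high": population[50:]}
stratified = {name: rng.sample(units, 5) for name, units in strata.items()}

# Cluster sampling: sample whole clusters rather than individual units
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
chosen_clusters = rng.sample(clusters, 2)

print(simple, systematic, stratified, chosen_clusters, sep="\n")
```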

73

 

 

 

Probability can also be said to be the....? 

 

 

 

 

Relative frequency in the long run

74

 

 

 

The probability is always a number between...?

 

 

 

 

0 and 1

75

 

 

 

In linear regressions the independent variable is denoted by what letter? 

 

 

 

 

 

 

 

X

76

 

 

 

In linear regressions the dependent variable is denoted by what letter?

 

 

 

 

Y

77

 

 

 

 

Positive linear association means

 

 

 

 

Positive covariance

78

 

 

 

 

Negative linear association means 

 

 

 

 

Negative covariance 

79

 

 

What is the association?

[Image]

 

 

 

 

Positive

80

     

 

What is the association?

[Image]

 

 

 

Negative

81

 

What is the association?

[Image]

 

 

 

None (the variables are independent)

82

 

 

 

The correlation coefficient can never be greater than...?

1

83

 

 

 

 

The correlation coefficient can never be smaller than? 

 

 

 

 

-1

84

 

 

 

What does it mean if the correlation coefficient is equal to 0?

 

 

 

There is no linear association (zero covariance) between the two variables
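
The correlation cards above can be illustrated with a hand-written Pearson correlation. This is a sketch with invented data; the function name `pearson_r` is hypothetical:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])  # perfect positive association
r_negative = pearson_r(x, [10, 8, 6, 4, 2])  # perfect negative association
r_zero = pearson_r(x, [1, 4, 9, 4, 1])       # zero covariance

print(r_positive, r_negative, r_zero)
```

In the last case y still clearly depends on x (it rises and then falls), so a correlation of 0 rules out a linear association but not every kind of relationship.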

85

 

 

 

 

Explain residuals 

 

 

 

A residual is the difference between the observed value of the dependent variable (y) and the predicted value (ŷ)

86

 

 

 

What is the coefficient of determination (r²) if x does not affect y at all?

 

 

 

The coefficient of determination (r²) is 0%

87

 

 

 

 

What does the intercept of an equation mean?

 

 

 

The intercept is the value of the dependent variable when the value of the independent variable is 0

88

 

 

 

 

What does β (slope) represent?

 

 

 

β is the number of units by which y changes when x increases by one unit.
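
To connect the intercept, the slope and the residuals, here is a minimal least-squares sketch. The data and the function name `fit_line` are invented for illustration:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x  # predicted y when x is 0
    return intercept, slope

# Hypothetical observations
xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]

intercept, slope = fit_line(xs, ys)

# Residual = observed y minus predicted y-hat
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(intercept, slope, residuals)
```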

89

 

 

 

In linear regressions the independent variable is denoted by what letter?

 

 

 

 

X

90

 

 

 

 

What types of variables are used in binomial distributions?

 

 

 

  

Categorical binary variables

91

 

 

 

The null hypothesis is denoted by...? 

 

 

 

 

H0

92

 

 

 

 

The alternative hypothesis is denoted by...? 

 

 

 

 

H1 or HA

93

 

 

 

 

What are the most common α-levels? 

 

 

 

0.01
0.05
0.10 

94

 

 

 

 

If the confidence level is 95%, then alpha would equal...?

 

 

 

 

0.05

95

 

 

What do we do if the p-value is less than the significance level?

 

 

 

 

We reject the null hypothesis (H0)

96

 

 

The criteria for rejecting the null hypothesis are:

p ≤ α

 

 

 

 

reject the null hypothesis

97

 

 

 

The criteria for rejecting the null hypothesis are:

p > α

 

 

 

do not reject the null hypothesis
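
The decision rule from the two cards above can be written as a tiny helper. This is only a sketch; the function name `decide` and the default α of 0.05 are assumptions for the example:

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 if p <= alpha, otherwise do not reject."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    return "reject H0" if p_value <= alpha else "do not reject H0"

print(decide(0.03))        # p is below alpha
print(decide(0.20))        # p is above alpha
print(decide(0.03, 0.01))  # same p-value, stricter alpha
```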

98

 

 

 

What values can a p-value take? 

 

 

 

Only values between 0 and 1

99

 

 

 

The 95% confidence interval for the mean represents

 

 

 

An interval computed by a procedure that, over repeated sampling, would contain the true population mean in 95% of the samples.
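
A minimal sketch of how such an interval is computed, combining the constant 1.96 and the standard error from earlier cards. The sample is invented; for small samples like this one, a t-distribution constant is normally used instead of 1.96, so treat the numbers as an illustration of the formula only:

```python
import math
import statistics

# Hypothetical sample of five measurements
sample = [4.1, 4.8, 5.0, 5.3, 5.8]

mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error

c = 1.96  # constant for a 95% confidence interval
lower = mean - c * se
upper = mean + c * se

print(lower, upper)
```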

100

 

 

 

A binomial distribution must meet these four requirements

 

1. A fixed number of trials


2. Each trial must be independent of the others


3. There can be only two results (Success or Failure)


4. The probability of success is the same for every trial.

101

 

 

 

 

Define Z-score

 

 

 

 

A z-score is defined as the number of standard deviations a specific point is away from the mean.
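
As a worked example of this definition (the height, mean and standard deviation are invented values, and `z_score` is a hypothetical helper name):

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical: a height of 190 cm in a population with mean 180 cm and SD 5 cm
z = z_score(190, 180, 5)
print(z)  # 2.0, i.e. two standard deviations above the mean
```

A negative z-score would indicate a value below the mean.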

102