test 3 Flashcards

1
Q

what are zener cards

A

cards used to test psychic powers; each card shows 1 of 5 symbols, so there are 5 choices total

success/correct = the symbol you choose matches the symbol on the card, meaning you predicted it right

2
Q

difference between N, n, and p

A

N: population size, total number of experiments/individuals (ex. number of students)

n: number of trials in each experiment (ex. number of cards in the deck)

p: probability of success

3
Q

what is a binomial distribution

A

used for determining the probability of getting a certain number of successes in a fixed number of independent trials, where each trial has only 2 possible outcomes: success or failure (used in zener decks)

4
Q

difference between normal and binomial distribution

A

binomial: used when you have a fixed number of trials (counting fails or successes)

normal: used for continuous data that can take on any value within an interval, shaped like a bell curve, no fixed number of trials

5
Q

explain how this sapply function works (which is used inside of data frames):

sapply(0:n, function(X) sum(x==X))

A
  • sapply applies the given function to each element of a vector and returns the results as a vector
  • 0:n generates the sequence of whole numbers from 0 to n
  • function(X) sum(x==X) is applied to each element of this sequence
  • inside function, X represents each element of sequence
  • for each X, it calculates how many times X occurs in the vector x (which was generated earlier)

The sum(x==X) part counts how many times the value X occurs in the vector x.

so in essence: sapply function is used to count how many times each number from 0 to n occurs in the generated x data
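
A tiny worked example of the counting step may help. The vector x below is made up for illustration (it is not the course data), and n is set to 3 just to keep the output short:

```r
# Hypothetical data: number correct for 6 students, values between 0 and 3
x <- c(0, 2, 2, 3, 1, 2)
n <- 3

# For each X in 0:n, x == X is a TRUE/FALSE vector; sum() counts the TRUEs
counts <- sapply(0:n, function(X) sum(x == X))
print(counts)  # 1 1 3 1  (0 occurs once, 1 once, 2 three times, 3 once)
```

The counts always add up to length(x), as long as every value in x lies between 0 and n.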

6
Q

what’s the code?

create a frequency distribution of the observed number correct in a zener deck with 25 cards and 100 students

A

first generate a set of data using the rbinom function

x <- rbinom(100, size=25, prob = 0.2)

prob of correct is 1/5 so prob is 0.2, 100 students is N (the first argument) and size is n = 25 cards

now make a data frame for it:

df.zener <- data.frame(Count=0:25, Frequency = sapply(0:25, function(X) sum(x==X)))

this gives you the frequency distribution of the number correct

7
Q

mean and standard deviation formulas in a binomial distribution + codes for mean and mean prob.

A

mean = np

n = number of trials
p = prob. of success

standard deviation = √(np(1-p))

  • these formulas always work for finding the mean and standard deviation in a binomial distribution

codes:
mean(x)
mean(x)/n for prob.
sd(x) for standard deviation
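
To see how the formulas and the codes line up, here is a minimal sketch that simulates zener results and compares them with np and √(np(1-p)). The seed and the large N are assumptions chosen only so the simulated values sit close to the theoretical ones:

```r
set.seed(1)   # arbitrary seed so the run is repeatable
N <- 10000    # number of simulated students (assumed; large for stability)
n <- 25       # cards per deck
p <- 0.2      # probability of a correct guess

x <- rbinom(N, size = n, prob = p)

theory_mean <- n * p                   # np
theory_sd   <- sqrt(n * p * (1 - p))   # sqrt(np(1-p))

c(theory = theory_mean, simulated = mean(x))  # both near 5
c(theory = theory_sd,   simulated = sd(x))    # both near 2
mean(x) / n                                   # estimated p, near 0.2
```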

8
Q

95% confidence interval - why it's used and how to code for it (using binomial distribution)

A

it's used because it gives the range where 95% of results are expected to fall, so anything outside it is unusual
- roughly the mean ± 2 standard deviations

code:

qbinom(0.025, size= n, prob =p) #lower limit

qbinom(0.975, size=n, prob =p) #upper limit

the qbinom function is used for quantiles (percentiles), and you are finding the 2.5th percentile and 97.5th percentile to get the 95% confidence interval
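
As a concrete sketch, the limits for the zener setup used in these cards (n = 25 cards, p = 0.2) can be computed like this. The coverage check at the end is an extra step added here for illustration:

```r
n <- 25    # cards per deck
p <- 0.2   # probability of a correct guess (1 of 5 symbols)

lower <- qbinom(0.025, size = n, prob = p)  # 2.5th percentile
upper <- qbinom(0.975, size = n, prob = p)  # 97.5th percentile
c(lower, upper)

# Counts are discrete, so the interval covers at least 95% of outcomes
coverage <- pbinom(upper, size = n, prob = p) -
  pbinom(lower - 1, size = n, prob = p)
coverage
```

With these values the limits come out to 1 and 9 correct cards.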

9
Q

what’s the code?

use the probability distribution function to get the probability of observing more than a certain value

A

pbinom(qbinom(0.975, size =n, prob=p), size =n , prob=p, lower.tail=FALSE)

pbinom gives the probability; lower.tail=FALSE returns the upper tail, P(X > value), instead of the lower tail, P(X ≤ value)
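
A short sketch of the same call with the zener values filled in (n = 25 and p = 0.2 are taken from the earlier cards). It also shows that lower.tail=FALSE is the same as subtracting the lower tail from 1:

```r
n <- 25
p <- 0.2

cutoff <- qbinom(0.975, size = n, prob = p)  # upper 95% limit

# lower.tail = FALSE returns P(X > cutoff), strictly greater than
p_above <- pbinom(cutoff, size = n, prob = p, lower.tail = FALSE)

# Same quantity written as a complement of the lower tail
p_above_check <- 1 - pbinom(cutoff, size = n, prob = p)

c(cutoff = cutoff, p_above = p_above, check = p_above_check)
```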

10
Q

what’s the code?

what’s the probability that at least one person in N gets more than 9 cards correct?

A

P = prob. that one person gets more than 9 cards correct, so prob. that one person does NOT → (1-P)

prob. of no one in N getting more than 9 cards correct → (1-P)^N

then find the complement of that by subtracting it from 1 → 1 - (1-P)^N

code:
1 - (1 - pbinom(9, size=n, prob=p, lower.tail=FALSE))^N
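
Broken into steps with the values from the earlier cards (N = 100 students, n = 25, p = 0.2), a sketch of the calculation looks like this. It assumes the students guess independently of each other:

```r
N <- 100   # students
n <- 25    # cards per deck
p <- 0.2   # probability of a correct guess

# P: one student gets more than 9 correct
P <- pbinom(9, size = n, prob = p, lower.tail = FALSE)

# All N students get 9 or fewer (assumes independent students)
p_none <- (1 - P)^N

# Complement: at least one student gets more than 9
p_at_least_one <- 1 - p_none

c(P = P, none = p_none, at_least_one = p_at_least_one)
```

Even though P is small for a single student, the chance that at least one of 100 students beats 9 is much larger.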

11
Q

what’s the code?

finding the 95% confidence interval using the normal distribution

A

qnorm(0.025, mean=n*p, sd=sqrt(n*p*(1-p))) #lower limit

qnorm(0.975, mean=n*p, sd=sqrt(n*p*(1-p))) #upper limit

12
Q

formula to find Z score

A

Z = (X - μ)/σ

X = vector of observations
μ = mean
σ = standard deviation
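
A minimal sketch of the formula in R, using a made-up vector of observations. Here the sample mean and sample standard deviation stand in for μ and σ, which is an assumption for illustration:

```r
x <- c(2, 4, 4, 5, 7, 8)   # hypothetical observations

mu    <- mean(x)   # stand-in for the mean
sigma <- sd(x)     # stand-in for the standard deviation

z <- (x - mu) / sigma   # one Z score per observation
z
```

After standardizing, the Z scores have mean 0 and standard deviation 1.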

13
Q

Z score (what it is, what values are common + Z score code)

A

Z score: tells u how far a particular data point is from the average of a group of data points, measured in terms of standard deviations

  • Z score of 1 means 1 standard deviation and so on

Z 0.025 = -1.96 & Z 0.975 = 1.96

Z- score code: qnorm(0.975) or qnorm(0.025)

14
Q

what’s the code?

expected lower and upper 95% confidence limit of X using Z score

when it says expected, it has to do with proportions

A

qnorm(0.025)*sqrt(n*p*(1-p)) + n*p #lower limit

qnorm(0.975)*sqrt(n*p*(1-p)) + n*p #upper limit

formula: X =σZ + μ
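
The sketch below applies X = σZ + μ with the zener values (n = 25, p = 0.2) and checks that it gives the same limits as passing mean and sd straight to qnorm:

```r
n <- 25
p <- 0.2
mu    <- n * p                    # mean of the binomial
sigma <- sqrt(n * p * (1 - p))    # standard deviation of the binomial

# X = sigma * Z + mu, with Z taken from the standard normal
lower_z <- qnorm(0.025) * sigma + mu
upper_z <- qnorm(0.975) * sigma + mu

# Equivalent: let qnorm do the rescaling
lower_q <- qnorm(0.025, mean = mu, sd = sigma)
upper_q <- qnorm(0.975, mean = mu, sd = sigma)

c(lower_z, upper_z)   # about 1.08 and 8.92
```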

15
Q

what’s the code?

one sided z-test for proportions using p and p0 & two-sided z-test

A

z <- (p - p0) / sqrt(p0 * (1-p0)/N)

one sided
pnorm(z, lower.tail=TRUE)

  • in this one sided test, we only care about the left hand side (bc the observed value is less than what we expected it to be); if the observed value were greater than expected, use lower.tail=FALSE

what you just calculated is the p value: if it's less than 0.05, you reject the null hypothesis; if it's greater, you fail to reject the null hypothesis

two sided
2*pnorm(abs(z), lower.tail=FALSE)

  • can do either FALSE or TRUE for tail but be consistent: with lower.tail=TRUE use -abs(z) instead, i.e. 2*pnorm(-abs(z), lower.tail=TRUE)
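
A worked sketch with made-up numbers (12 correct out of N = 100 against a null proportion of 0.2; these values are hypothetical and chosen so z comes out to a round number):

```r
N  <- 100      # sample size (hypothetical)
p  <- 12 / N   # observed proportion (hypothetical)
p0 <- 0.2      # proportion under the null hypothesis

z <- (p - p0) / sqrt(p0 * (1 - p0) / N)   # z is -2 here

p_one_sided <- pnorm(z, lower.tail = TRUE)             # left tail only
p_two_sided <- 2 * pnorm(abs(z), lower.tail = FALSE)   # both tails
p_two_alt   <- 2 * pnorm(-abs(z), lower.tail = TRUE)   # same value, other tail

c(z = z, one_sided = p_one_sided, two_sided = p_two_sided)
```

Both p values are below 0.05 in this example, so the null hypothesis would be rejected either way.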
16
Q

what are Q-Q plots used for?

A

to assess departures from normality

  • whether or not data follows a theoretical distribution
17
Q

what’s the code to find the empirical and expected (were the data normal) 99th percentile of weight

A

empirical means observed and you use the quantile function for that

for expected, you use the classic formula of X = σZp + μ

quantile(weight, probs=0.99) #empirical/observed 99th percentile

quantile uses type=7 by default; add type=1 at the end if that method is required

sd(weight)*qnorm(0.99)+mean(weight) #expected 99th percentile

what you’re calculating is basically the number that 99% of the data fall below
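
A runnable sketch of both calculations. The babies data is not included here, so weight is simulated as a stand-in (the mean of 128 and sd of 20 are invented values, not the real data):

```r
set.seed(42)   # arbitrary seed so the run is repeatable

# Stand-in for the weight column: simulated, NOT the real babies data
weight <- rnorm(1000, mean = 128, sd = 20)

# Observed 99th percentile, straight from the data
empirical <- quantile(weight, probs = 0.99)

# 99th percentile the data would have if it were exactly normal
expected <- sd(weight) * qnorm(0.99) + mean(weight)

c(empirical = unname(empirical), expected = expected)
```

Because the stand-in data really is normal, the two numbers land close together; a large gap on real data would hint at a departure from normality in the upper tail.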

18
Q

how do you plot a Q-Q plot with the normal line too? (what’s the code for it)

use the weight example after attaching babies

A

qqnorm function creates the graph

qqnorm(weight, xlab="theoretical quantiles", ylab="maternal weight (lbs)") #q-q plot

qqline(weight, col="red") #normal line
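
To see what qqnorm actually plots, the sketch below asks it for the coordinates without drawing anything. The weight vector is simulated as a stand-in for the babies data (invented mean and sd):

```r
set.seed(7)   # arbitrary seed

# Stand-in for the weight column: simulated, NOT the real babies data
weight <- rnorm(200, mean = 128, sd = 20)

# plot.it = FALSE returns the plotted points instead of drawing them
qq <- qqnorm(weight, plot.it = FALSE)

# x-axis: standard normal quantiles at evenly spread probabilities
theoretical <- qnorm(ppoints(length(weight)))

# y-axis: the observed values; sorted, they pair up with the sorted quantiles
head(cbind(theoretical = sort(qq$x), sample = sort(qq$y)))
```

If the data are close to normal, these pairs fall near the straight line that qqline draws.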

19
Q

what is a t test used for

A

used to compare the means of 2 different populations

  • more helpful when there are relatively small sample sizes and you are dealing with a normal distribution
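
A minimal sketch of comparing two group means with R's built-in t.test. Both groups are simulated with invented means and sizes, purely for illustration:

```r
set.seed(3)   # arbitrary seed

# Two small hypothetical samples drawn from normal distributions
group_a <- rnorm(12, mean = 50, sd = 5)
group_b <- rnorm(12, mean = 55, sd = 5)

# With two vectors, t.test runs a Welch two-sample t-test by default
res <- t.test(group_a, group_b)

res$statistic   # t value
res$p.value     # compare against 0.05
```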
20
Q

lower and upper limit values of the normal distribution

A

qnorm(0.025)
qnorm(0.975)

Z = -1.96 and 1.96

95% of the values on a standard normal distribution will be within this range