midterm Flashcards Preview

Flashcards in midterm Deck (30)
1
Q
  1. Calculate and interpret the column percentages for the crosstabulation of SEX (row) by OBESE (column)
A

MEN ARE 50% OF THE SAMPLE BUT ONLY 38% OF NORMAL WEIGHT PEOPLE; 50% OF SAMPLE BUT 62% OF OVERWEIGHT PEOPLE AND 54% OF OBESE PEOPLE. HENCE MEN ARE OVERREPRESENTED AMONG THE OVERWEIGHT AND UNDERREPRESENTED AMONG THE NORMAL WEIGHT. THE REVERSE FOR WOMEN.

2
Q
  1. What percent of respondents who were hospitalized or died from coronary heart disease (CHD) were overweight or obese at the start of the study?

What do you do, using these variables?

  • 1) SEX: Respondent’s gender
  • 2) OBESE: Respondent’s body weight category at start of the study
  • 3) SMOKER: Was the Respondent a smoker at the start of the study?
  • 4) AGE50: Age of Respondent at start of study
  • 5) CHD: Was Respondent hospitalized or died from cardiovascular heart disease (CHD) during the 10‐year study observation period?
  • 6) HEARTRATE: Respondent’s heart rate (beats/minute) at start of study
A

Run a crosstab of

5) CHD Was Respondent hospitalized or died from cardiovascular heart disease (CHD) during the 10‐year study observation period?

AND

2) OBESE Respondents’ body weight category at start of the study

ANSWER: 69% (15.8% + 53.5% = 69.3%)

3
Q
  1. Is there a difference in the percentage of men and women who were smokers at start of the study?

What do you do?

  • 1) SEX: Respondent’s gender
  • 2) OBESE: Respondent’s body weight category at start of the study
  • 3) SMOKER: Was the Respondent a smoker at the start of the study?
  • 4) AGE50: Age of Respondent at start of study
  • 5) CHD: Was Respondent hospitalized or died from cardiovascular heart disease (CHD) during the 10‐year study observation period?
  • 6) HEARTRATE: Respondent’s heart rate (beats/minute) at start of study
A

Run a crosstab of SMOKER by SEX

It is 50/50, so no

4
Q
  1. Are smokers more likely than nonsmokers to be older, or younger? By what percentage?
A

Run a crosstab of SMOKER by AGE50

YOUNGER. 60% OF SMOKERS ARE UNDER AGE 50 COMPARED WITH ONLY 47.5% OF NONSMOKERS, A DIFFERENCE OF 12.5 PERCENTAGE POINTS

5
Q
  1. Calculate and interpret a positive odds ratio to show the relationship between age and smoking
A

(120/80) / (95/105) = 1.66

SMOKERS HAVE 1.66 TIMES GREATER ODDS THAN NONSMOKERS OF BEING UNDER AGE 50
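
The same arithmetic can be checked outside Excel. The sketch below is only an illustration: the function name `odds_ratio` and the row layout (row 1 = smokers, row 2 = nonsmokers; first column = under 50, second column = 50 or older) are assumptions read off the formula on this card.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: (a/b) / (c/d).

    a, b = row 1 counts (success, failure); c, d = row 2 counts.
    """
    return (a / b) / (c / d)

# Counts taken from the card's formula (120/80) / (95/105):
# smokers: 120 under 50, 80 aged 50+; nonsmokers: 95 under 50, 105 aged 50+
ratio = odds_ratio(120, 80, 95, 105)
print(round(ratio, 2))
```

A result above 1 means the odds of "success" (here, being under 50) are higher in row 1 than in row 2.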

6
Q

Odds Ratio

A

Odds ratios > 1 are Positive Odds (odds of success higher in row 1 than row 2)
-positive odds range from 1+ to infinity
Odds ratios ≈ 1 = Independence (equal odds)
Odds ratios < 1 are Negative Odds (odds of success less likely in row 1 than row 2)
-negative odds range from 0 to 1

7
Q

How to interpret a positive odds ratio… say 6.67

A

Relative to A’s, B’s have 6.67 times greater odds of being X

8
Q

Convert a negative odds ratio into a positive odds ratio by?

A

Taking the reciprocal: put your decimal answer (for example, .15) under 1

1 / .15 = 6.67

Relative to A’s, B’s have 6.67 times greater odds of being X

9
Q

5a. Is the difference in the odds of smoking by age in Q. 5 statistically significant in the population

Explain.

(How do you show significance with Odds Ratios)

A

(120/80)/(95/105) = 1.66 (odds ratio): smokers have 1.66 times greater odds of being under 50

  • STEP 1 - Take the natural log of the positive odds ratio
    • ln(1.66) = .506 (.505549 with the unrounded ratio)
  • STEP 2 - Calculate SE = SQRT(1/a + 1/b + 1/c + 1/d), where a, b, c, d are the four cell counts
    • SQRT(1/120+1/80+1/95+1/105) = 0.202
  • STEP 3 - Divide ln(odds ratio) by SE to get t
    • .505549/.202197 = 2.50
  • STEP 4 - Analyze: if |t| exceeds the critical t (1.96 for p < .05, etc.), A has significantly greater odds of X than B
    • Yes, people under age 50 have statistically significantly greater odds of smoking than people 50 or older (t = 2.5 > critical t of 1.96, p < .05)
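
The four steps above can be sketched in Python as a cross-check on the Excel arithmetic. The names `odds_ratio_t_test` and `p_label` are hypothetical helpers, and the cutoffs 1.96, 2.58 and 3.30 are the ones this deck uses.

```python
import math

def odds_ratio_t_test(a, b, c, d):
    """Return (t, ln_or, se) for a 2x2 table with cell counts a, b, c, d."""
    ln_or = math.log((a / b) / (c / d))      # Step 1: natural log of the odds ratio
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)    # Step 2: standard error of ln(OR)
    return ln_or / se, ln_or, se             # Step 3: t = ln(OR) / SE

def p_label(t):
    """Step 4: compare |t| with the deck's critical values."""
    for cutoff, label in [(3.30, "p < .001"), (2.58, "p < .01"), (1.96, "p < .05")]:
        if abs(t) > cutoff:
            return label
    return "not significant"

t, ln_or, se = odds_ratio_t_test(120, 80, 95, 105)
print(round(ln_or, 3), round(se, 3), round(t, 2), p_label(t))
```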
10
Q

How to test significance with an odds ratio

A

Use a t-test:

t = ln(positive odds ratio) / SE, where SE = SQRT(1/a + 1/b + 1/c + 1/d) for the four cell counts

(since t = XXX > critical t = XXX, p < .XXX)

If |t| > 1.96, we say difference is stat significant at p < .05
If |t| > 2.58, we say difference is stat significant at p < .01
If |t| > 3.30, we say difference is stat significant at p < .001

Since 4.56 > critical t of 3.30 (p < .001), we conclude that Vietnamese young adults in LA have statistically significantly greater odds than Chinese young adults of being first generation immigrants

11
Q

How to calculate confidence intervals for the true difference in odds

A

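ln(positive odds ratio) +/- 1.96 * SE (use 2.58 for 99% or 3.30 for 99.9%)

Calculate the lower and upper values separately, then put each one into EXP(x) to convert it back to an odds ratio

EXP(lower), EXP(upper)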

With 95% confidence, Vietnamese young adults in LA have somewhere between XXX and YYY greater odds than Chinese young adults in LA to be first generation immigrants

12
Q
  1. Among respondents who are not obese, calculate and interpret a positive odds ratio to show the relationship between age and CHD.
A
13
Q

Interpret the odds ratio:

How much greater/lesser are suburban kids’ odds of skipping 3+ times?

              3+ times     <3 times
Suburban         35           180
Urban            24            99

Suburban odds: 35/180 = .194

Urban odds: 24/99 = .24

Odds ratio: .194 / .24 = .8

Negative: .8
Positive: 1 / .8 = 1.25

A

.8 - Relative to urban kids, suburban kids have .80 times as great odds of skipping 3+ times

1.25 - Relative to suburban kids, urban kids have 1.25 times greater odds of skipping 3+ times (inverting the ratio swaps the comparison group)

For suburban: 1 - .8 = .2

Suburban kids have 20% lower odds of skipping 3+ times than urban kids
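
A short sketch of the negative-to-positive conversion on this card, using its counts. The variable names are illustrative only. The point it shows is that taking the reciprocal also swaps which group is the reference.

```python
# Counts from the card: suburban 35 (3+ times) vs 180; urban 24 (3+ times) vs 99
suburban_odds = 35 / 180
urban_odds = 24 / 99

# Suburban relative to urban: below 1, so a "negative" odds ratio
negative_or = suburban_odds / urban_odds

# Reciprocal: urban relative to suburban, the "positive" odds ratio
positive_or = 1 / negative_or

# Percent lower odds for suburban kids
percent_lower = (1 - negative_or) * 100

print(round(negative_or, 2), round(positive_or, 2), round(percent_lower))
```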

14
Q

Critical values of |t| for significance tests and confidence intervals

A

If |t| > 1.96, we say difference is stat significant at p < .05

If |t| > 2.58, we say difference is stat significant at p < .01

If |t| > 3.30, we say difference is stat significant at p < .001

15
Q

WRAPPING A CONFIDENCE INTERVAL AROUND A PERCENTAGE TO ESTIMATE TRUE VALUE FOR POPULATION

A

p +/- 1.96* sqrt( (P*(100-P) ) / n )

Example, calculate a 95% confidence interval to estimate the true percentage of 12th graders in the U.S. who skip class 3 or more times

Take the percent from the frequency table for skips: 13.8

skips: Frequency skips classes

                Frequency    Percent    Valid Percent
1 never            320         64.0         64.0
2 1-2 times        111         22.2         22.2
3 3+                69         13.8         13.8
Total              500        100.0        100.0

For 95%:

P +/- 1.96*SQRT((P)*(100-P)/N)

MIN: 10.8 =(13.8)-1.96*SQRT((13.8)*(100-13.8)/500)

MAX: 16.8 =(13.8)+1.96*SQRT((13.8)*(100-13.8)/500)

WITH 95% CONFIDENCE, BETWEEN 11% AND 17% OF 12TH GRADERS IN THE UNITED STATES SKIP CLASSES 3 TIMES OR MORE
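
As a cross-check on the Excel formulas, here is a minimal Python sketch of the same interval. The function name `percent_ci` and its `z` parameter are assumptions for illustration; percentages are on the 0 to 100 scale, as on the card.

```python
import math

def percent_ci(p, n, z=1.96):
    """Confidence interval around a percentage p (0-100) from a sample of size n.

    z is the multiplier: 1.96 for 95%, 2.58 for 99%, 3.30 for 99.9%.
    """
    margin = z * math.sqrt(p * (100 - p) / n)
    return p - margin, p + margin

# 13.8% of 500 sampled 12th graders skip class 3 or more times
low, high = percent_ci(13.8, 500)
print(round(low, 1), round(high, 1))
```

Passing `z=2.58` or `z=3.30` widens the interval for 99% or 99.9% confidence.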

16
Q

Wrapping a 95, 99 and 99.9% confidence interval around a percent

A

get the percent from a single frequency calculation

P +/- 1.96*SQRT((P)*(100-P)/N)

Replace 1.96 with the multiplier for the confidence level:

  • 1.96 for 95%
  • 2.58 for 99%
  • 3.30 for 99.9%

With 95% confidence, between 11% and 17% of BLANKS do X (state the confidence level you actually used)

17
Q

Is there a statistically significant difference between the percent of rural and urban kids who never skip school in the US?

P 1 = 76.5 N1 = 162

P2 = 57.7, N2 = 123

A

To test statistical significance between two different percentages, find t:

t = (P1 - P2) / SQRT( P1(100-P1)/N1 + P2(100-P2)/N2 )

P1 = 76.5, 100-P1 = 23.5, N1 = 162

P2 = 57.7, 100-P2 = 42.3, N2 = 123

t numerator = 18.8 =76.5-57.7

t denominator = 5.56 =SQRT((76.5*23.5)/162+(57.7*42.3)/123)

t = 18.8 / 5.56 = 3.38

Since |t| = 3.38 > critical t of 3.30 (p < .001), we conclude that a statistically significantly greater percentage of rural kids than urban kids never skip school in the US
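
The calculation on this card can be sketched in Python as follows. The function name `two_percent_t` is hypothetical, and treating P1 as the rural group and P2 as the urban group is an assumption based on the order in the question.

```python
import math

def two_percent_t(p1, n1, p2, n2):
    """t statistic for the difference between two independent percentages (0-100 scale)."""
    se = math.sqrt(p1 * (100 - p1) / n1 + p2 * (100 - p2) / n2)
    return (p1 - p2) / se, se

# Percent who never skip: rural 76.5% of 162, urban 57.7% of 123
t, se = two_percent_t(76.5, 162, 57.7, 123)
print(round(se, 2), round(t, 2))
```

The sign of t only reflects which group is entered first, which is why the comparison with the critical value uses |t|.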

18
Q

How to analyze a chi-square of 130.6, DF = 4

DF 4: .05 (9.49), .01 (13.28)

A

Because 130.6 > critical chi-square 13.28 (p < .01), we can conclude that religion and the likelihood of seeing an X-rated film are significantly associated

19
Q

Chi-square value = 20.8, DF = 1

DF 1: .01 (6.63)

A

Because 20.8 > critical chi-square 6.63 (p < .01), we can conclude that being born in the US and the likelihood of being afraid to walk in one’s neighborhood at night are significantly associated

20
Q

DF = 20, chi-square = 18.4

DF 20: .01 (37.56), .05 (31.4)

A

Because 18.4 < critical chi-square 31.4 (p > .05), we cannot conclude that the variables are significantly associated

21
Q

What to do when asked to

interpret the main differences revealed by row percentages

A

State that there appears to be a relationship between the variables

Then discuss the specific differences as well

22
Q

Calculate a t-test to determine whether a significantly greater percentage of people with no religious preference (none) than Catholics have seen an X-rated movie in the past year

t = 7.58

A

Since 7.58 > critical |t| of 3.30 (p < .001), we conclude that a statistically significantly greater percentage of people with no religious preference (none) than Catholics have seen an X-rated film in the past year

23
Q
  1. Calculate a t‐test to determine whether a significantly greater percentage of Framingham smokers than nonsmokers were normal weight. Specify the p value for your test result and reach a conclusion in the form shown in LECTURE 11.

IN EXCEL, t =(53‐38)/SQRT((53*47)/200+(38*62)/200) = 3.0

A

SINCE |t| = 3.0 > 2.58 (p < .01), WE CONCLUDE THAT A STATISTICALLY SIGNIFICANTLY GREATER PERCENTAGE OF FRAMINGHAM SMOKERS THAN NONSMOKERS WERE NORMAL WEIGHT.

24
Q

Calculate a 95% confidence interval to estimate the true difference in the percentage of Framingham smokers and nonsmokers who were normal weight. Reach a conclusion.
MIN: 5% =(53-38)-1.96*SQRT((53*47)/200+(38*62)/200)
MAX: 25% =(53-38)+1.96*SQRT((53*47)/200+(38*62)/200)

A

WITH 95% CONFIDENCE, SOMEWHERE BETWEEN 5% AND 25% MORE SMOKERS THAN NONSMOKERS WERE NORMAL WEIGHT IN FRAMINGHAM
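
A minimal Python sketch of the MIN and MAX formulas for the difference between two percentages. The function name `percent_diff_ci` and its `z` parameter are illustrative assumptions, and the inputs are the Framingham figures from this card.

```python
import math

def percent_diff_ci(p1, n1, p2, n2, z=1.96):
    """Confidence interval for the difference p1 - p2 between two percentages (0-100 scale)."""
    diff = p1 - p2
    margin = z * math.sqrt(p1 * (100 - p1) / n1 + p2 * (100 - p2) / n2)
    return diff - margin, diff + margin

# Normal weight: 53% of 200 smokers vs 38% of 200 nonsmokers
low, high = percent_diff_ci(53, 200, 38, 200)
print(round(low), round(high))
```

Because the whole interval sits above zero, it agrees with the t-test on the previous card that the difference is significant at p < .05.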

25
Q

How to analyze when asked to calculate a 95% confidence interval for percentages

A

WITH 95% CONFIDENCE, SOMEWHERE BETWEEN 5% AND 25% MORE SMOKERS THAN NONSMOKERS WERE NORMAL WEIGHT IN FRAMINGHAM

26
Q

most important statement for row percentage interpretation

A

there is an apparent relationship between

27
Q
  1. Calculate an odds ratio to answer this question: How many times greater are 12th grade boys’ odds than girls’ odds of binging on alcohol?

              Never      Yes
Male           161        66
Female         231        42

A

              Never      Yes      Odds (Never/Yes)      Odds Ratio
Male           161        66           2.44                .44
Female         231        42           5.5

.44 - negative odds ratio; the positive odds ratio is 1 / .44 = 2.27

Relative to 12th grade girls, 12th grade boys have 2.27 times greater odds of having binged on alcohol

28
Q

Calculate an odds ratio to answer this question: How many times greater are 12th grade boys’ odds than girls’ odds of binging on alcohol?

              Never      Yes      Odds (Never/Yes)      Odds Ratio
Male           161        66           2.44                .44
Female         231        42           5.5

.44 - negative odds ratio; the positive odds ratio is 1 / .44 = 2.27

Relative to 12th grade girls, 12th grade boys have 2.27 times greater odds of having binged on alcohol

3. Conduct a t‐test to determine if the odds ratio is statistically significant in the population, and reach a conclusion.

A

1) Take the natural log of the positive odds ratio

In this case 1 / .44 = 2.27, and ln(2.27) = .819

2) Find SE = SQRT(1/a + 1/b + 1/c + 1/d) for the four cell counts

SE = SQRT(1/231+1/161+1/66+1/42) = .222

3) Divide: t = .819 / .222 = 3.69

If |t| > 3.30, we say difference is stat significant at p < .001

Since 3.69 > critical t of 3.30 (p < .001), we conclude that 12th grade boys have statistically significantly greater odds than 12th grade girls of having binged on alcohol

29
Q

              Never      Yes      Odds (Never/Yes)      Odds Ratio
Male           161        66           2.44                .44
Female         231        42           5.5

.44 - negative odds ratio; the positive odds ratio is 1 / .44 = 2.27

Relative to 12th grade girls, 12th grade boys have 2.27 times greater odds of having binged on alcohol

4. With 95% confidence, what is the minimum and maximum the odds ratio could be in the population?

A
SE = SQRT(1/231+1/161+1/66+1/42) = .222

ln(positive odds ratio): ln(2.27) = .819

ln(positive odds ratio) +/- 1.96 * SE = .819 +/- 1.96 * .222

MIN: .819 - 1.96 * .222 ≈ .38, and EXP(.38) = 1.46

MAX: .819 + 1.96 * .222 ≈ 1.25, and EXP(1.25) = 3.49

With 95% confidence, 12th grade boys have somewhere between 1.46 and 3.49 times greater odds than 12th grade girls of having binged on alcohol
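
The interval can be reproduced with a short Python sketch. The function name `odds_ratio_ci` is hypothetical. The sketch works from the raw counts with unrounded intermediate values, so the point estimate prints as 2.25 rather than the 2.27 obtained from the rounded .44, while the interval endpoints still round to the card's 1.46 and 3.49.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio (a/b)/(c/d) with a confidence interval built on the log scale."""
    ratio = (a / b) / (c / d)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)   # standard error of ln(OR)
    ln_or = math.log(ratio)
    low = math.exp(ln_or - z * se)          # convert each bound back with exp
    high = math.exp(ln_or + z * se)
    return ratio, low, high

# Boys: 66 yes, 161 never; girls: 42 yes, 231 never (odds of having binged)
ratio, low, high = odds_ratio_ci(66, 161, 42, 231)
print(round(ratio, 2), round(low, 2), round(high, 2))
```

Since the lower bound is above 1, the interval agrees with the t-test result that boys' odds are significantly greater.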

30
Q
A