Chemometrics Test 1 Flashcards

1
Q

What is a barplot

A

Displays the distribution (counts or frequencies) of a categorical variable as bars, drawn horizontally or vertically

2
Q

How to simply make a barplot

A

barplot(counts), where counts is a vector or table of bar heights
Can include main, xlab and ylab for the title and axis labels
The data should first be tabulated by the categorical variable (select that column with $, e.g. counts <- table(data$categorical))
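
A minimal sketch of a simple barplot in base R. The data frame and its column name are made up for illustration:

```r
# Hypothetical data frame with one categorical column (names are illustrative)
data <- data.frame(solvent = c("water", "methanol", "water",
                               "hexane", "water", "methanol"))

# Tabulate the categorical variable to get one count per category
counts <- table(data$solvent)

# Bar heights come from the counts; main, xlab and ylab add the titles
barplot(counts, main = "Samples per solvent", xlab = "Solvent", ylab = "Count")
```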

3
Q

Stacked vs grouped barplot and how to program

A

Both need the data as a matrix rather than a vector. barplot() stacks the bars by default (beside = FALSE); switch beside to TRUE for a grouped barplot
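
A short sketch of both forms with base R's barplot(); the count matrix is invented for the example:

```r
# Hypothetical counts: rows are sample types, columns are labs
counts <- matrix(c(4, 6, 3, 7), nrow = 2,
                 dimnames = list(c("blank", "spiked"), c("Lab A", "Lab B")))

# Stacked (the default): one bar per column, rows stacked on top of each other
stacked <- barplot(counts, beside = FALSE, legend.text = TRUE, main = "Stacked")

# Grouped: one bar per matrix cell, clustered by column
grouped <- barplot(counts, beside = TRUE, legend.text = TRUE, main = "Grouped")
```

barplot() returns the bar midpoints: a vector for the stacked plot and a matrix (one entry per bar) for the grouped one.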

4
Q

What are error bars on bar graphs

A

Typically the bar height (the mean) plus or minus the standard error; the standard deviation or a confidence interval is also used
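
One way to draw such bars in base R is with arrows(). This is only a sketch: the replicate values are hypothetical, and the standard error is used as the bar length:

```r
# Hypothetical replicate measurements for three groups
groups <- list(A = c(4.8, 5.1, 5.3), B = c(6.9, 7.4, 7.0), C = c(5.9, 6.1, 6.6))

means <- sapply(groups, mean)
# Standard error of the mean: sd / sqrt(n)
se <- sapply(groups, function(x) sd(x) / sqrt(length(x)))

# barplot() returns the x positions of the bar centres
mids <- barplot(means, ylim = c(0, max(means + se) * 1.1), ylab = "Mean response")

# Draw the error bars as flat-headed arrows from mean - se to mean + se
arrows(mids, means - se, mids, means + se, angle = 90, code = 3, length = 0.05)
```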

5
Q

What is ggplot and how do you use it?

A

An easier way to graph, from the ggplot2 package; plots are built up in layers

ggplot(data, aes(x = , y = )) + geom_point()
So typically you call ggplot() with your data and the x and y aesthetics, then add other functions after it with +, like geom_point(), geom_smooth() etc

6
Q

How to make error bars in ggplot

A

geom_errorbar(), with ymin and ymax set inside aes()
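
A sketch of how this might look, assuming the ggplot2 package is installed. The summary table and its column names are hypothetical:

```r
library(ggplot2)  # assumes the ggplot2 package is installed

# Hypothetical summary table: one row per group, with mean and standard error
summary_df <- data.frame(group = c("A", "B", "C"),
                         mean  = c(5.1, 7.1, 6.2),
                         se    = c(0.15, 0.15, 0.21))

p <- ggplot(summary_df, aes(x = group, y = mean)) +
  geom_col() +                                             # bars with height = mean
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),   # bar ends at mean -/+ se
                width = 0.2)
print(p)
```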

7
Q

What is jitter

A

Shows all points and adds a small random offset to each so that overlapping points are easier to see

8
Q

What is a spinogram

A

A stacked bar plot with every bar scaled to a height of 1 (displays everything as a percentage)

9
Q

What is a box plot

A

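Box and whisker plot: displays the median, the lower and upper quartiles (the hinges, at the edges of the box) and the whiskers, which in R by default reach to the most extreme points within 1.5 times the interquartile range of the box; points beyond the whiskers are drawn as outliers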

10
Q

What is a parallel boxplot

A

multiple boxplots displayed side by side - can use to see separation of groups

11
Q

What is a notched boxplot

A

Box plot with a notch: the notch is a narrowing of the box around the median. Its WIDTH is proportional to the interquartile range and inversely proportional to the square root of the sample size
The notch is an approximate confidence interval around the median; if the notches of two boxes DON'T overlap, that is strong evidence that their medians differ
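
A small base R sketch of a notched boxplot, with simulated (hypothetical) data. boxplot.stats() is used here only to read off the notch limits that boxplot() draws:

```r
set.seed(1)  # reproducible hypothetical data
a <- rnorm(40, mean = 10, sd = 1)
b <- rnorm(40, mean = 12, sd = 1)

# notch = TRUE draws the notch around each median
boxplot(list(A = a, B = b), notch = TRUE, ylab = "Response")

# boxplot.stats() reports the notch limits in $conf:
# median -/+ 1.58 * (hinge spread) / sqrt(n)
notch_a <- boxplot.stats(a)$conf
notch_b <- boxplot.stats(b)$conf

# TRUE when the two notches do not overlap
no_overlap <- notch_a[2] < notch_b[1] || notch_b[2] < notch_a[1]
```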

12
Q

What is a violin plot and how do you plot it?

A

library(vioplot), then vioplot(data)
KERNEL density plots superimposed in mirror image over the box plot (the box plot sits on the inside, drawn in black and white)

13
Q

What's a histogram and how do you plot it?

A

Displays the distribution of a continuous variable by dividing its range into bins and counting the observations in each: hist(data)

14
Q

What can you put in addition to histogram

A

You can overlay a probability density curve or fit a normal curve to it
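
A sketch of both overlays in base R, using simulated (hypothetical) data. The mean and standard deviation are stored first because curve() reuses the name x for its own plotting grid:

```r
set.seed(42)
x <- rnorm(200, mean = 50, sd = 5)   # hypothetical measurements

# freq = FALSE puts the y axis on the density scale so curves can be overlaid
hist(x, freq = FALSE, main = "Histogram with density and normal curves")

# Kernel density estimate of the data
lines(density(x), lwd = 2)

# Normal curve using the sample mean and standard deviation
m <- mean(x)
s <- sd(x)
curve(dnorm(x, mean = m, sd = s), add = TRUE, lty = 2)
```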

15
Q

What is a kernel density plot

A

A smoothed estimate of the probability density of a continuous variable

16
Q

Dot plot

A

Dots (categorical variable on one axis, continuous on the other)

17
Q

Scatter plots

A

Scatter of points (continuous variable on both axes)

18
Q

What are grouping and faceting

A

Faceting displays groups of observations in separate side-by-side plots; grouping displays two or more groups of observations in a single plot

19
Q

Descriptive vs inferential stats

A

Descriptive stats summarize the data (e.g. what is the mean, mode, stdev etc)
Inferential stats draw inferences from the data about the population (e.g. these two populations are significantly different with regard to this)

20
Q

5 types of descriptive stats

A

Frequency (how often), central tendency (mean), dispersion (stdev), position (relative position, e.g. quartiles), shape of the distribution (skewness and kurtosis)

21
Q

What are DOF and how determined

A

DOF = a measure of the number of independent pieces of data used in the evaluation: n - #, where # is the number of parameters estimated from the data

22
Q

Talk about skewness and kurtosis

A

Skewness is a measure of the degree of asymmetry; 0 if symmetrical

Kurtosis is a measure of peakedness: 3 is normal; above 3 the distribution is sharply peaked, below 3 (negative excess kurtosis) it is flat
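
As a sketch, both can be computed from the moments of the data. This uses the simple moment-based definitions on the scale where a normal distribution has kurtosis 3; packages may apply small-sample corrections or report excess kurtosis instead. The function names and data are made up:

```r
# Moment-based skewness: third central moment / (second central moment)^1.5
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Moment-based kurtosis: fourth central moment / (second central moment)^2
kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

symmetric    <- c(1, 2, 3, 4, 5)
right_tailed <- c(1, 1, 2, 2, 3, 10)

skewness(symmetric)      # zero for perfectly symmetric data
skewness(right_tailed)   # positive: the tail is on the right
kurtosis(symmetric)
```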

23
Q

TWO important notes about inferential stats

A

Each replication within a condition is assumed to be independent
Large sample size: the statistic is more likely to indicate that differences exist, even when they are small

24
Q

Steps for sig testing (7)

A

1 State the null hypothesis
2 State the alternative hypothesis
3 Check whether the distribution is normal
4 Select the appropriate test
5 Choose the level of significance and the number of tails
6 Calculate the test statistic
7 Obtain the critical value for the test and compare it with the test statistic
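
The seven steps can be sketched for a one-sample t test in base R. The measurements and the reference value of 10.0 are hypothetical:

```r
# Hypothetical data: five replicate measurements against a reference value
x  <- c(10.2, 10.4, 10.1, 10.5, 10.3)
mu <- 10.0                      # steps 1-2: H0: mean = mu, H1: mean != mu

shapiro.test(x)                 # step 3: check normality
alpha <- 0.05                   # step 5: significance level, two tails

# steps 4 and 6: one-sample t statistic
t_stat <- (mean(x) - mu) / (sd(x) / sqrt(length(x)))

# step 7: two-tailed critical value for n - 1 degrees of freedom
t_crit <- qt(1 - alpha / 2, df = length(x) - 1)

reject_null <- abs(t_stat) > t_crit
```

t.test(x, mu = mu) wraps steps 6 and 7 and reports a p value in place of the critical-value comparison.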

25
Q

null vs alternative hypo

A

Null means there is no difference; alternative means there is one

26
Q

What do tails mean ( 1 vs two)

A

Whether you are interested in a change in just one direction (one tail) or in either direction (two tails)

27
Q

What is level of significance

A

The probability (alpha) that what we are saying is not true, i.e. of rejecting the null hypothesis when it is actually true

28
Q

What is a crit value

A

The threshold the test statistic is compared against to determine if a result is significant or not

29
Q

What is Cohen's d

A

Measures the size of the difference between two means in units of standard deviation (trivial, more than a standard deviation etc)
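
A minimal sketch using the pooled standard deviation, which is one common definition of Cohen's d. The function name and the data are made up:

```r
# Cohen's d with a pooled standard deviation (one common definition)
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Hypothetical groups
a <- c(5, 6, 7, 8, 9)
b <- c(3, 4, 5, 6, 7)
d <- cohens_d(a, b)   # difference of 2 divided by a pooled sd of sqrt(2.5)
```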

30
Q

How does variance play into t tests of two independent data sets

A

Equal variance t test: assumes the stdev of each group arises from the same population
UNEQUAL variance t test (Welch): the stdevs of the groups are significantly different
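
Both forms are available through base R's t.test(); the two groups below are hypothetical:

```r
a <- c(5.1, 4.9, 5.3, 5.0, 5.2)
b <- c(5.6, 6.4, 5.1, 6.9, 5.8)

equal_var <- t.test(a, b, var.equal = TRUE)   # Student's t test, pooled variance
welch     <- t.test(a, b)                     # default: Welch, unequal variances

equal_var$parameter   # degrees of freedom: n1 + n2 - 2
welch$parameter       # Welch-Satterthwaite df, smaller here and not an integer
```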

31
Q

What is Student's paired t test

A

Comparing related samples (e.g. time points of the same test subject)

32
Q

Ways to test for normal dist

A

Shapiro-Wilk test: p > 0.05 means normality is not rejected (treat as a normal dist)
Anderson-Darling test: p > 0.05 means normality is not rejected
Test graphically: histogram, boxplot, QQ plot (quantile-quantile: theoretical vs actual)
Skew and kurtosis
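
A sketch of the Shapiro-Wilk test and a QQ plot in base R, on simulated (hypothetical) data. The Anderson-Darling test is not in base R; to my knowledge it is provided by add-on packages such as nortest, so it is left out here:

```r
set.seed(7)
x <- rnorm(50, mean = 100, sd = 3)   # hypothetical, drawn from a normal distribution

sw <- shapiro.test(x)   # H0: the data come from a normal distribution
sw$p.value              # p > 0.05: no evidence against normality

# Graphical check: the points should fall close to the reference line
qqnorm(x)
qqline(x)
```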

33
Q

What’s a bimodal distribution

A

when there are two modes

34
Q

stdev vs %RSD vs Variance

A

stdev is the standard deviation, %RSD is 100 x stdev/mean and variance is stdev squared
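
The three quantities side by side in base R, for a hypothetical set of replicate results:

```r
x <- c(9.8, 10.1, 10.0, 10.3, 9.8)   # hypothetical replicate results

s        <- sd(x)                # standard deviation
rsd_pct  <- 100 * s / mean(x)    # %RSD: stdev relative to the mean, in percent
variance <- var(x)               # variance = s^2
```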

35
Q

What's the range of skew and what do the values mean

A

0 is symmetrical; positive skew is tailing (long tail to the right), negative skew has the long tail to the left; unacceptable is greater than 1 or less than -1

36
Q

Range of kurtosis

A

3 is normal; values substantially less than or greater than 3 are unacceptable

37
Q

What does the describe() function in the psych package do

A

Gives the descriptive stats for each variable (n, mean, sd, median, min, max, range, skew, kurtosis, se, among others)

38
Q

How does test relate to crit value

A

test statistic > crit value means the null is rejected (the test statistic is calculated from the data; the crit value is found in a table)

39
Q

How can t tests be used (variations)

A

One sample: test a mean against a specific value
Two independent means: the key point is that there are 2 means and 2 stdevs, so the standard deviation of the difference (s diff) has to be worked out, and an equal variance test is needed first
One-tailed vs two-tailed, depending on which direction of variation you care about

40
Q

How to check if the variance between two groups is equal or not

A

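Levene's test (null hypo is that the group variances are equal, so p > 0.05 means equal variances can be assumed) or the F test in R (var.test()). If the variances are not equal, use Welch's t test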
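
A sketch of the F test with base R's var.test(), on two hypothetical groups. Levene's test is not in base R (to my knowledge the car package provides it), so only the F test is shown:

```r
a <- c(5.1, 4.9, 5.3, 5.0, 5.2)
b <- c(5.6, 6.4, 5.1, 6.9, 5.8)

f <- var.test(a, b)   # H0: the ratio of the two variances is 1
f$statistic           # F = var(a) / var(b)
f$p.value

# Choose the form of the t test from the result (0.05 is the conventional cut-off)
tt <- t.test(a, b, var.equal = f$p.value > 0.05)
```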

41
Q

What does ANOVA stand for

A

analysis of variance

42
Q

Why ANOVA vs T test

A

Basically ANOVA handles multiple sources of variation (and more than two groups), not just comparing one to one
E.g. variation between analysts and variation from the method/instrument (looking at data from 4 analysts making lead measurements in water)
So here we can look at the within-group variation (the spread of each analyst's replicates about that analyst's own mean) AND the between-group variation (how the means of the groups differ from each other), and more importantly we can isolate and estimate each. The between-group mean square also contains a contribution from the within-group variation, which is why the two are compared

43
Q

ANOVA hypotheses

A

H0: the population means of the groups are all equal
H1: the population means of the groups are not all equal (at least one differs)

44
Q

What are assumptions in ANOVA

A

Independence of observations
Normality of residuals (the difference between an observed value and its fitted value, i.e. the group mean)
Homoscedasticity (the variance of the data is the same in every group: homogeneous variance)

45
Q

What is variance in ANOVA

A

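Within-group variance: Sw^2 = Mw (the within-group mean square)
Between-group variance: Sb^2 = (Mb - Mw)/n, where Mb is the between-group mean square and n is the number of observations per group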
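
A sketch of how the two variances might be pulled out of a one-way aov() fit, taking the within-group variance as Mw and the between-group variance as (Mb - Mw)/n. The balanced data set (4 analysts, 3 replicates each) is invented for illustration:

```r
# Hypothetical balanced design: 4 analysts, 3 replicate lead results each
lead <- data.frame(
  analyst = factor(rep(c("A", "B", "C", "D"), each = 3)),
  conc    = c(10.1, 10.3, 10.2,  10.8, 10.9, 11.0,
              10.0,  9.9, 10.2,  10.5, 10.4, 10.6)
)
n <- 3  # replicates per group

fit <- aov(conc ~ analyst, data = lead)

# Mean squares from the ANOVA table: analyst row (between), then Residuals (within)
ms <- summary(fit)[[1]][["Mean Sq"]]
Mb <- ms[1]
Mw <- ms[2]

sw2 <- Mw              # within-group variance
sb2 <- (Mb - Mw) / n   # between-group variance
```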

46
Q

ONE WAY vs two way anova

A

One way: a single classification variable (factor) with multiple groups
Two way: two classification variables

47
Q

What is balanced design for ANOVA

A

Each group has an equal number of observations (the same n in each treatment condition)

48
Q

What are post hoc tests for?

A

ID which groups are different - Anova just tells you if there is a difference not which groups or how much

49
Q

What is TUKEYS HSD?

A

A post hoc test: Tukey's honestly significant difference, which compares every pair of group means
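
A sketch with base R's TukeyHSD(), which takes a fitted aov() model. The analyst data are hypothetical:

```r
# Hypothetical data: 4 analysts, 3 replicate results each
lead <- data.frame(
  analyst = factor(rep(c("A", "B", "C", "D"), each = 3)),
  conc    = c(10.1, 10.3, 10.2,  10.8, 10.9, 11.0,
              10.0,  9.9, 10.2,  10.5, 10.4, 10.6)
)

fit <- aov(conc ~ analyst, data = lead)
summary(fit)             # overall F test: is there any difference at all?

tukey <- TukeyHSD(fit)   # all pairwise comparisons, with adjusted p values
tukey$analyst            # columns: diff, lwr, upr, p adj
```

With 4 groups there are 6 pairwise comparisons, one row each.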

50
Q

Whats a confounding factor

A

A variable that can explain group differences on the dependent variable - NOT A VARIABLE WE ARE INTERESTED IN - A NUISANCE

51
Q

Multivariate anova?

A

Looks at the effect on multiple dependent variables at once (e.g. the effect of a treatment on the concentration of compound Y AND compound Z)

52
Q

How to do anova in r studio

A

aov(dependent_variable ~ independent_variable, data = data), lower case because R is case sensitive; then summary() on the result

53
Q

What are some post hoc tests

A

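Tukey's HSD. (Bartlett's and Levene's tests check homogeneity of variance, an ANOVA assumption, rather than comparing groups after the ANOVA)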

54
Q

Variations on ANOVA formula

A

y ~ A + B + C: a prediction of y from A, B and C
(typically the effect of each one controlling for the other 2)
y ~ A + B + A:B: the colon denotes the interaction between the variables
y ~ A*B*C: each variable individually plus all the interactions among the 3
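
To see what a formula expands to, terms() lists the terms R would fit. This sketch assumes the last formula on the card is meant to be y ~ A*B*C (main effects plus all interactions):

```r
# term.labels lists the terms R will fit for each formula
additive <- attr(terms(y ~ A + B + C), "term.labels")     # main effects only
one_int  <- attr(terms(y ~ A + B + A:B), "term.labels")   # two main effects + A:B
full     <- attr(terms(y ~ A * B * C), "term.labels")     # everything up to A:B:C

additive
one_int
full
```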

55
Q

What order do you program into ANOVA in R

A

Covariates, then main effects, then interactions (the order matters because aov() uses sequential, Type I, sums of squares)

56
Q

Example ANCOVA design (Anova with 1 covariate)

A

Evaluate whether the means of the dependent variable are equal across the levels of an independent variable while controlling for another, continuous variable (the covariate)

57
Q

WHAT ARE ASSUMPTIONS NEEDED FOR ANCOVA

A

1) Linearity between the covariate and the outcome variable at each level of the independent variable
2) Homogeneity of regression slopes: they are parallel (covariate vs outcome variable, so basically no interaction between the covariate and the independent variable; the covariate should affect the outcome the same way at every level)
3) Outcome variable approximately normal
4) Homoscedasticity: homogeneity of residual variances for all groups
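
A sketch of an ANCOVA fit and a check of the regression-slopes assumption using aov(). The data are simulated and all column names are hypothetical:

```r
set.seed(3)
# Hypothetical data: a 3-level factor, a continuous covariate and an outcome
d <- data.frame(
  group     = factor(rep(c("ctrl", "low", "high"), each = 10)),
  covariate = runif(30, 1, 5)
)
d$outcome <- 2 + 1.5 * d$covariate +
  c(0, 1, 2)[as.integer(d$group)] + rnorm(30, sd = 0.5)

# ANCOVA: covariate first, then the factor of interest
ancova <- aov(outcome ~ covariate + group, data = d)
summary(ancova)

# Homogeneity of regression slopes: fit the covariate:group interaction as well;
# a non-significant interaction row is consistent with parallel slopes
slopes <- aov(outcome ~ covariate * group, data = d)
summary(slopes)
```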

58
Q

What is adjusted p value and when do we use

A

An adjusted p value is a p value corrected for multiple comparisons. Use it whenever you make several comparisons, because the chance of a false positive (the overall error rate) grows with each additional comparison

59
Q

How to calc adjusted p value

A

Bonferroni correction: compare each p value with alpha / number of comparisons
(or multiply each p value by the number of comparisons, p * n, and compare with alpha)
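
Both routes give the same decisions. A sketch with base R's p.adjust(), using made-up p values:

```r
p_raw <- c(0.010, 0.020, 0.040)   # hypothetical p values from 3 comparisons
alpha <- 0.05

# Option 1: compare each raw p value with alpha / number of comparisons
sig_by_alpha <- p_raw < alpha / length(p_raw)

# Option 2: multiply each p value by the number of comparisons (capped at 1)
p_adj    <- p.adjust(p_raw, method = "bonferroni")
sig_by_p <- p_adj < alpha
```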

60
Q

ancova vs 2 way anova

A

In ANCOVA the covariate is a CONTINUOUS VARIABLE, like hours studied per day; in 2 way ANOVA the second factor is a whole other set of categories!!

61
Q

What is two way anova with replication vs without

A

Without replication means there is only one value in each group (e.g. you took the mean for each); with replication means each group has several observations