Chemometrics Test 1 Flashcards

1
Q

What is a barplot

A

Displays the distribution of a categorical variable, with bars drawn horizontally or vertically

2
Q

How to simply make a barplot

A

barplot(data)
Can include main, xlab and ylab for the title and axis labels
data must hold the bar heights, typically counts per category of a categorical variable (e.g. table(data$categorical))
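A minimal sketch of the call above, assuming a made-up data frame `samples` with a categorical column `solvent` (both names are hypothetical, chosen only for illustration):

```r
# Hypothetical data: one categorical column (names invented for this example)
samples <- data.frame(
  solvent = c("water", "methanol", "water", "hexane", "water", "methanol")
)

# barplot() needs bar heights, so count the observations per category first
counts <- table(samples$solvent)

barplot(counts,
        main = "Samples per solvent",  # plot title
        xlab = "Solvent",              # x-axis label
        ylab = "Count")                # y-axis label
```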

3
Q

Stacked vs grouped barplot and how to program

A

Applies when your data is a matrix rather than a vector: barplot() stacks the bars by default (beside = FALSE), and setting beside = TRUE draws them grouped side by side
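A small sketch of the two layouts, assuming a hypothetical 2 x 3 matrix of counts (the row and column labels are invented):

```r
# Hypothetical counts: rows are stacked/grouped within each bar position,
# columns define the bar positions
counts <- matrix(c(4, 6, 3, 7, 5, 5), nrow = 2,
                 dimnames = list(result  = c("pass", "fail"),
                                 analyst = c("A", "B", "C")))

# beside = FALSE (the default): one bar per column, rows stacked
stacked_mid <- barplot(counts, beside = FALSE, legend.text = TRUE,
                       main = "Stacked")

# beside = TRUE: one bar per cell, grouped by column
grouped_mid <- barplot(counts, beside = TRUE, legend.text = TRUE,
                       main = "Grouped")
```

barplot() returns the bar midpoints: a vector for the stacked plot and a matrix (one entry per bar) for the grouped plot.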

4
Q

What are error bars on bar graphs

A

Typically the bar height plus or minus the standard error
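One way to draw such bars in base R is with arrows(). This is only a sketch on invented replicate data, and it assumes the bars show the standard error of the mean (sd / sqrt(n)):

```r
# Hypothetical replicate measurements for three groups
values <- list(A = c(10.1, 9.8, 10.4, 10.0),
               B = c(12.2, 11.9, 12.5, 12.0),
               C = c(8.9, 9.3, 9.1, 8.7))

means <- sapply(values, mean)
# Standard error of the mean = sd / sqrt(n)
ses <- sapply(values, function(v) sd(v) / sqrt(length(v)))

# barplot() returns the x positions of the bar centres
mids <- barplot(means, ylim = c(0, max(means + ses) * 1.1),
                ylab = "Mean response")

# Vertical segments from mean - SE to mean + SE with flat caps
# (angle = 90 makes the arrow heads flat, code = 3 caps both ends)
arrows(x0 = mids, y0 = means - ses, x1 = mids, y1 = means + ses,
       angle = 90, code = 3, length = 0.05)
```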

5
Q

What is ggplot, and how to use it

A

An easier way to graph

ggplot(data, aes(x = , y = )) + geom_point()
So typically you call ggplot() with your data and the x and y aesthetics, then add other layers after it with +, like geom_point(), geom_smooth(), etc.

6
Q

How to make error bars in ggplot

A

geom_errorbar()
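A hedged sketch of geom_errorbar() on a made-up summary table. The data frame `summary_df` and its columns are hypothetical, and the plotting part only runs if the ggplot2 package is installed:

```r
# Hypothetical summary table: one row per group with its mean and standard error
summary_df <- data.frame(group = c("A", "B", "C"),
                         mean  = c(10.1, 12.2, 9.0),
                         se    = c(0.12, 0.13, 0.13))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(summary_df, aes(x = group, y = mean)) +
    geom_col() +                                  # bars at the group means
    geom_errorbar(aes(ymin = mean - se,           # lower end of each bar
                      ymax = mean + se),          # upper end of each bar
                  width = 0.2)                    # width of the end caps
  # print(p) would draw the plot
}
```

geom_errorbar() needs ymin and ymax aesthetics, so the limits are computed from the mean and the chosen error measure.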

7
Q

What is jitter

A

Shows all points and adds small random offsets so overlapping points are easier to see

8
Q

What is a spinogram

A

A stacked bar plot with each bar scaled to 1 (displays everything as a proportion or percentage)

9
Q

What is a box plot

A

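Box and whisker: displays the median, the upper and lower quartiles (the hinges, at the edges of the box) and whiskers that reach the most extreme points within 1.5 x IQR of the box; points beyond the whiskers are drawn as outliers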

10
Q

What is a parallel boxplot

A

Multiple boxplots displayed side by side; can be used to see the separation between groups

11
Q

What is a notched boxplot

A

Box plot with a notch: the notch is a narrowing of the box around the median. Its WIDTH is proportional to the interquartile range and inversely proportional to the square root of the sample size
The notch is an approximate confidence interval around the median; if the notches of two boxes DON'T overlap, there is strong evidence that their medians differ
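The notch limits can be checked by hand. The sketch below uses simulated data (an assumption, not course data) and compares a manual calculation against the `$conf` element that boxplot.stats() returns, which base R computes as median +/- 1.58 x IQR / sqrt(n):

```r
set.seed(1)  # reproducible hypothetical data
a <- rnorm(30, mean = 10, sd = 1)
b <- rnorm(30, mean = 12, sd = 1)

# Two boxes side by side, drawn with notches
boxplot(list(a = a, b = b), notch = TRUE)

# boxplot.stats() returns the five-number summary in $stats
# (lower whisker, lower hinge, median, upper hinge, upper whisker)
st <- boxplot.stats(a)
hinge_spread <- st$stats[4] - st$stats[2]   # IQR measured between the hinges

# Manual notch limits: median +/- 1.58 * IQR / sqrt(n)
manual <- st$stats[3] + c(-1.58, 1.58) * hinge_spread / sqrt(length(a))

# st$conf holds the limits R uses for the notch
st$conf
```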

12
Q

What is a VIOLIN plot and how to plot it

A

library(vioplot), then vioplot(data)
KERNEL density plots superimposed in mirror image over a box plot (the box plot is drawn on the inside in black and white)

13
Q

What's a histogram and how to plot it

A

Displays the distribution of a continuous variable by dividing its range into bins: hist(data)

14
Q

What can you put in addition to histogram

A

Can overlay a probability density curve or fit a normal curve to it
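A sketch of both overlays on simulated data (the data are an assumption). The histogram is drawn on the density scale so the curves share its y-axis:

```r
set.seed(42)  # reproducible hypothetical data
x <- rnorm(200, mean = 50, sd = 5)

# freq = FALSE puts density (not counts) on the y-axis
h <- hist(x, freq = FALSE, main = "Histogram with density curves")

# Kernel density estimate of the data
lines(density(x), lwd = 2)

# Normal curve using the sample mean and standard deviation.
# m and s are computed first because curve() reuses the name x
# for its own plotting grid.
m <- mean(x)
s <- sd(x)
curve(dnorm(x, mean = m, sd = s), add = TRUE, lty = 2)
```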

15
Q

What is a kernel density plot

A

An estimate of the probability density of a variable

16
Q

Dot plot

A

Dots (categorical on one axis, continuous on the other)

17
Q

Scatter plots

A

Scatter of points (continuous on both axes)

18
Q

What are grouping and faceting

A

Faceting displays groups of observations in separate side-by-side plots; grouping displays two or more groups of observations in a single plot

19
Q

Descriptive vs inferential stats

A

Descriptive describes the data (e.g. what's the mean, mode, stdev etc.)
Inferential says something beyond the data: it draws inferences from it (e.g. these two populations are significantly different in regard to this)

20
Q

5 types of descriptive stats

A

Frequency (how often), central tendency (mean), dispersion (stdev), position (relative position, e.g. quartiles), shape of the distribution (skewness and kurtosis)

21
Q

What are DOF and how determined

A

DOF = measure of the number of independent pieces of data used in an evaluation: n - #, where # is the number of parameters estimated from the data

22
Q

Talk about skewness and kurtosis

A

Skewness is a measure of the degree of asymmetry; 0 if symmetrical

Kurtosis is a measure of peakedness: 3 is normal; higher means very peaked; lower (negative when reported as excess kurtosis, i.e. kurtosis - 3) means flat
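A sketch of both measures using the plain moment definitions. The helper functions `skewness` and `kurtosis` are written here for illustration; packages may apply small-sample corrections or report excess kurtosis (kurtosis minus 3) instead:

```r
# Moment-based definitions (no small-sample correction)
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}
kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2   # equals 3 for a normal distribution
}

symmetric  <- c(1, 2, 3, 4, 5)
right_tail <- c(1, 1, 2, 2, 3, 10)

skewness(symmetric)    # 0: the values are symmetric about their mean
skewness(right_tail)   # positive: the long tail is on the right
kurtosis(symmetric)    # 1.7, below 3: flatter than a normal distribution
```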

23
Q

TWO important notes about inferential stats

A

Each replicate within a condition is assumed to be independent
Large sample size: the statistic is more likely to indicate that differences exist, even small ones

24
Q

Steps for sig testing (7)

A

1 State the null hypothesis
2 State the alternative hypothesis
3 Check if the distribution is normal
4 Select the appropriate test
5 Choose the level of significance and the number of tails
6 Calculate the test statistic
7 Obtain the critical value for the test and compare it with the test statistic

25
Null vs alternative hypothesis
Null means there is no difference; alternative means there is one
26
What do tails mean (1 vs 2)
Whether we are interested in a change in just one direction (one-tailed) or in either direction (two-tailed)
27
What is level of significance
The probability (alpha) that what we are saying is not true, i.e. of rejecting the null hypothesis when it is actually true
28
What is a crit value
The threshold the test statistic is compared against to determine if a result is significant or not
29
What is COHEN'S D
Measures the size of the difference in units of standard deviation (trivial, more than a standard deviation, etc.)
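A sketch of the calculation for two independent groups, assuming the common pooled-standard-deviation form of Cohen's d. The function `cohens_d` and the data are made up for illustration:

```r
# Cohen's d = difference in means / pooled standard deviation
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Hypothetical measurements
x <- c(5.1, 4.9, 5.3, 5.0, 5.2)
y <- c(4.6, 4.8, 4.5, 4.7, 4.9)

d <- cohens_d(x, y)
d   # about 2.5: the means differ by more than two pooled standard deviations
```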
30
How does variance play into t tests of two independent data sets
Equal variance t test assumes the stdev of each group arises from the same population. UNEQUAL variance t test (Welch): the stdevs of the groups are significantly different
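In R both versions go through t.test(), switched with the var.equal argument. A sketch on invented data where the second group is much more spread out:

```r
# Hypothetical groups with very different spread
g1 <- c(5.1, 4.9, 5.3, 5.0, 5.2)
g2 <- c(4.2, 5.6, 3.9, 5.8, 4.5)

equal_var <- t.test(g1, g2, var.equal = TRUE)   # Student's t test, pooled variance
welch     <- t.test(g1, g2, var.equal = FALSE)  # Welch's t test (R's default)

equal_var$parameter  # degrees of freedom: n1 + n2 - 2 = 8
welch$parameter      # Welch-Satterthwaite df, smaller when variances differ
```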
31
What is students paired t test
Comparing related samples (e.g. time points of the same test subject)
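A sketch with hypothetical before/after values for the same five subjects. The paired test is the same as a one-sample t test on the differences:

```r
# Hypothetical measurements on the same five subjects at two time points
before <- c(12.1, 11.8, 12.5, 12.0, 11.6)
after  <- c(12.6, 12.1, 12.9, 12.7, 12.0)

paired <- t.test(after, before, paired = TRUE)

# Equivalent formulation: test whether the mean difference is zero
one_sample <- t.test(after - before, mu = 0)

paired$parameter   # degrees of freedom: number of pairs - 1 = 4
```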
32
Ways to test for normal dist
Shapiro-Wilk test: p > 0.05 means no evidence against a normal dist
Anderson-Darling test: p > 0.05 means no evidence against a normal dist
Test graphically: histogram, boxplot, Q-Q plot (quantile-quantile: theoretical vs actual)
Skew and kurtosis
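A sketch of the Shapiro-Wilk test and a Q-Q plot in base R, on simulated data (an assumption). The Anderson-Darling test is not in base R and is left out here:

```r
set.seed(7)  # reproducible hypothetical data
x <- rnorm(50, mean = 100, sd = 2)

# Shapiro-Wilk: the null hypothesis is that the data come from a normal dist
sw <- shapiro.test(x)
sw$p.value   # if above 0.05, normality is not rejected

# Q-Q plot: sample quantiles against theoretical normal quantiles;
# points close to the reference line suggest normality
qqnorm(x)
qqline(x)
```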
33
What's a bimodal distribution
when there are two modes
34
stdev vs %RSD vs Variance
stdev is the standard deviation, %RSD is 100 x stdev/mean and variance is stdev squared
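The three quantities on a small made-up vector:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # hypothetical measurements, mean = 5

s   <- sd(x)               # sample standard deviation
rsd <- 100 * s / mean(x)   # percent relative standard deviation
v   <- var(x)              # sample variance = s^2 = 32/7
```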
35
What's the range of skew and what do the values mean
0 is symmetrical; positive skew is tailing to the right; unacceptable is greater than 1 or less than -1
36
Range of kurtosis
3 is normal; values clearly less than or greater than 3 are unacceptable
37
What does the describe() function in psych do
Gives all the descriptive stats in one table (n, mean, sd, median, min, max, range, skew, kurtosis, se, among others)
38
How does the test statistic relate to the crit value
Test statistic > crit value means the null is rejected (the test statistic is calculated from the data; the crit value is found in a table)
39
How can t tests be used (variations)
One sample: test a mean against a specific value
2 independent means: key because there are 2 means and 2 stdevs, so a combined s diff is needed; do an equal variance test first
One-sided vs two-tailed depends on which direction of variation you care about
40
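How to check if the variance between two groups is equal or not
Levene's test (null hypothesis is that the group variances are equal, so p > 0.05 means equal variance) or an F test in R; if the variances are unequal, use Welch's t test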
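A sketch of the F test with var.test() from base R, then using its result to pick the t test. The data are invented, and Levene's test is not shown because it is not in base R:

```r
# Hypothetical groups with very different spread
g1 <- c(5.1, 4.9, 5.3, 5.0, 5.2)
g2 <- c(4.2, 5.6, 3.9, 5.8, 4.5)

# F test: the null hypothesis is that the two variances are equal
f <- var.test(g1, g2)
f$statistic   # the ratio var(g1) / var(g2)

# p > 0.05 means equal variances are not rejected
use_equal_var <- f$p.value > 0.05

# Pooled t test if variances look equal, Welch's t test otherwise
result <- t.test(g1, g2, var.equal = use_equal_var)
```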
41
What does ANOVA stand for
analysis of variance
42
Why ANOVA vs T test
ANOVA handles multiple sources of variation, not just a one-to-one comparison, e.g. variation with analyst and variation with method/instrument (looking at data of 4 analysts making lead measurements in water). Here we can look at within-group variation (the scatter of each analyst's results around that analyst's mean) OR the between-group variation (how the means of the groups differ from each other), and more importantly we can isolate and estimate each of these (each has its own main calculation, separated from the variation of the other)
43
ANOVA hypotheses
H0: the population means of the groups are all equal. H1: the population means of the groups are not all equal (at least one differs)
44
What are assumptions in ANOVA
Independence of observations
Normality of residuals (the difference between the observed value and the estimated true value)
Homoscedasticity (the variance of the data in each group is the same: homogeneous variance)
45
What is variance in ANOVA
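Within-group variance is the within-group mean square: Sw^2 = Mw. Between-group variance is Sb^2 = (Mb - Mw)/n, where Mb is the between-group mean square and n is the number of observations per group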
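A sketch of pulling the two mean squares out of a one-way ANOVA in R and forming the variance estimates. The data frame `lead` is invented (4 analysts, 3 replicates each) and the symbols Mb, Mw, Sw2 and Sb2 are just variable names for this example:

```r
# Hypothetical balanced data: 4 analysts, n = 3 replicate measurements each
lead <- data.frame(
  analyst = factor(rep(c("A", "B", "C", "D"), each = 3)),
  conc    = c(10.1, 10.3, 10.2,
              10.6, 10.8, 10.7,
               9.9, 10.0, 10.1,
              10.4, 10.5, 10.3)
)
n <- 3   # observations per group

fit <- aov(conc ~ analyst, data = lead)
ms  <- summary(fit)[[1]][["Mean Sq"]]   # mean squares: analyst, then residuals

Mb <- ms[1]   # between-group mean square
Mw <- ms[2]   # within-group (residual) mean square

Sw2 <- Mw              # within-group variance
Sb2 <- (Mb - Mw) / n   # between-group variance
```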
46
ONE WAY vs two way anova
One way: a single classification variable with multiple groups. Two way: two classification variables
47
What is balanced design for ANOVA
Each group has an equal number of observations in each treatment condition
48
What are post hoc tests for?
ID which groups are different - Anova just tells you if there is a difference not which groups or how much
49
What is TUKEYS HSD?
A post hoc test: honestly significant difference
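A sketch of running Tukey's HSD on a fitted one-way ANOVA with TukeyHSD() from base R. The data frame `lead` is made up for illustration:

```r
# Hypothetical data: 4 analysts, 3 replicate measurements each
lead <- data.frame(
  analyst = factor(rep(c("A", "B", "C", "D"), each = 3)),
  conc    = c(10.1, 10.3, 10.2,
              10.6, 10.8, 10.7,
               9.9, 10.0, 10.1,
              10.4, 10.5, 10.3)
)

fit <- aov(conc ~ analyst, data = lead)

# One row per pair of groups: difference in means, confidence limits
# and a p value adjusted for the multiple comparisons
tk <- TukeyHSD(fit)
tk$analyst
```

With 4 groups there are 6 pairwise comparisons, so the table has 6 rows.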
50
Whats a confounding factor
A variable that can explain group differences in the dependent variable - NOT A VARIABLE WE ARE INTERESTED IN - A NUISANCE
51
Multivariate anova?
Looks at the effect on multiple dependent variables at once (e.g. the effect of treatment on the concentrations of compounds Y AND Z)
52
How to do anova in r studio
aov(dependent_variable ~ independent_variable, data = ...) (lowercase: R is case-sensitive)
53
What are some post hoc tests
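Tukey's HSD is a post hoc test; Bartlett's and Levene's tests check homogeneity of variance (an ANOVA assumption) rather than which groups differ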
54
Variations on ANOVA formula
y ~ A + B + C: a prediction of y from A, B and C (each typically controlling for the other 2)
y ~ A + B + A:B: the A:B term denotes the interaction between the two variables
y ~ A*B*C: each variable individually plus the interactions between all 3
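R can expand a formula into its terms without any data, which makes the shorthand easy to check. The helper `term_labels` is a small wrapper written for this example:

```r
# List the terms a model formula expands to
term_labels <- function(f) attr(terms(f), "term.labels")

additive    <- term_labels(y ~ A + B + C)     # "A" "B" "C"
explicit    <- term_labels(y ~ A + B + A:B)   # "A" "B" "A:B"
two_factor  <- term_labels(y ~ A * B)         # same terms as the line above
full_three  <- term_labels(y ~ A * B * C)     # 3 main effects, 3 two-way
                                              # interactions, 1 three-way
```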
55
What order do you program into ANOVA in R
covariates, then main effects then interactions
56
Example ANCOVA design (Anova with 1 covariate)
Evaluate whether the means of the dependent variable are equal across the levels of an independent variable while controlling for another variable (the covariate)
57
WHAT ARE ASSUMPTIONS NEEDED FOR ANCOVA
1) Linearity between the covariate and the outcome variable at each level of the independent variable (basically your covariate should affect each level of the independent variable the same way)
2) Homogeneity of regression slopes: they are parallel (covariate vs outcome variable, so basically no interaction between the covariate and the independent variable)
3) Outcome variable approximately normal
4) Homoscedasticity: homogeneity of residual variances for all groups
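A sketch of checking the parallel-slopes assumption and then fitting the ANCOVA with aov(). The data frame `study`, its columns and the simulated effect sizes are all hypothetical:

```r
set.seed(3)  # reproducible hypothetical data
study <- data.frame(
  method = factor(rep(c("M1", "M2", "M3"), each = 10)),  # independent variable
  hours  = runif(30, min = 1, max = 5)                   # continuous covariate
)
# Simulated outcome: common slope for hours, different offset per method
offsets <- c(0, 3, 6)[as.integer(study$method)]
study$score <- 50 + 4 * study$hours + offsets + rnorm(30, sd = 1)

# Homogeneity of slopes: fit with the covariate:factor interaction and
# inspect the hours:method row; a non-significant row supports parallel slopes
slopes <- aov(score ~ hours * method, data = study)
summary(slopes)

# ANCOVA: covariate entered first, then the factor of interest
ancova <- aov(score ~ hours + method, data = study)
summary(ancova)
```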
58
What is adjusted p value and when do we use
Adjusted p value is the p value corrected for multiple comparisons, because the error rate grows with each additional comparison you do
59
How to calc adjusted p value
Bonferroni correction: compare each p value to alpha / number of comparisons (or multiply p by n and compare to alpha)
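Both forms of the correction on three made-up p values, using p.adjust() from base R for the second form:

```r
p_raw <- c(0.010, 0.020, 0.040)   # hypothetical raw p values
alpha <- 0.05
m     <- length(p_raw)            # number of comparisons

# Form 1: compare each raw p value to alpha / m
sig_by_threshold <- p_raw < alpha / m

# Form 2: multiply each p value by m (capped at 1) and compare to alpha
p_adj <- p.adjust(p_raw, method = "bonferroni")   # 0.03 0.06 0.12
sig_by_adjusted <- p_adj < alpha

# Both forms flag the same comparisons: only the first one here
```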
60
ancova vs 2 way anova
ANCOVA: the covariate is a CONTINUOUS VARIABLE, like hours studied per day; in 2 way ANOVA the second factor is a whole other set of categories!!
61
What is two way anova with replication vs without
Without replication means there is only one value in each group (e.g. you took the mean for each); with replication means each group has a population of data