Lecture 9 Flashcards

1
Q

Control flow: if, else if, else syntax?

A

question: is cond TRUE or FALSE?

if (cond1) {
# if cond1 is TRUE
# do something …
} else if (cond2) {
# if cond2 is TRUE
# do something …
} else {
# neither cond1 nor cond2 is TRUE
# do something else …
}

example:
> binf=readRDS('pbinf-2022-08.RDS')
> survey=binf$data$survey
> if (nrow(survey)>320) {
+ print('new data of 2017 added already')
+ } else {
+ print('new data of 2017 not added yet')
+ }

2
Q

Programming Loops: for(!), while, (repeat)

A

for (i in vector) {
# do something for every element in vector
}
while (cond) {
# do something while cond is TRUE
}
repeat {
# do something at least once
if (cond) { break }
}

example:
> for (i in 1:nrow(survey)) {
+ if (is.na(survey[i,'cm'])) {
+ next
+ }
+ if (survey[i,'cm']>197) {
+ print(survey[i,1:6])
+ }
+ }

In about 95% of cases you use for!
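
The card's example only covers for. A minimal sketch of while and repeat, with made-up variable names (n, steps, i, total) that are not from the lecture:

```r
# while: the condition is checked BEFORE each pass
n <- 20
steps <- 0
while (n >= 1) {
  n <- n / 2          # halve n on every pass
  steps <- steps + 1  # count how many passes were needed
}

# repeat: the body runs at least once because the check comes last
i <- 0
total <- 0
repeat {
  i <- i + 1
  total <- total + i  # running sum 1 + 2 + 3 + ...
  if (total > 10) { break }
}
print(c(steps = steps, i = i, total = total))
```

Both loops could also be written as a for loop with a break, which is one reason for covers most cases.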

3
Q

Useful Operators in R

A
  • Mathematical and comparison: *, /, +, -, <, >, ==, …
  • Logical: & (and), | (or), %in% (in), ! (not), …
  • Own: '%ni%' <- Negate('%in%')
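
A short sketch of these operators on a made-up vector x (the values are only for illustration). The %ni% definition uses backticks, which is equivalent to the quoted form on the card:

```r
x <- c(3, 8, 12)

x * 2                  # arithmetic works element by element
x > 5 & x < 10         # logical AND of two comparisons
x < 5 | x > 10         # logical OR
!(x == 3)              # negation
x %in% c(8, 12)        # is each element contained in the set?

# own operator: "not in"
`%ni%` <- Negate(`%in%`)
not_in <- x %ni% c(8, 12)
print(not_in)
```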
4
Q

Structure of a function in R.
(Write your own function)

A

myCV = function (x) {}

myCV -> Name of the function (whatever you like)
= -> Assignment operator
function -> function keyword
(x) -> Parameter (argument)
{} -> the implementation (function body)

example: CV function
myCV = function (x) {
cv=100*sd(x,na.rm=TRUE)/mean(x,na.rm=TRUE)
return(cv)
}

Always add return() to a function, just in case
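
A usage sketch for the CV function. The definition is repeated so the block runs on its own, and the heights vector is made-up data:

```r
# coefficient of variation in percent, ignoring missing values
myCV <- function(x) {
  cv <- 100 * sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)
  return(cv)
}

heights <- c(170, 180, NA, 190)  # hypothetical body heights in cm
cv_heights <- myCV(heights)      # the NA is dropped by na.rm = TRUE
print(cv_heights)
```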

5
Q

The ... argument?

A

Takes any additional arguments and delegates them to another function.
my.barplot: a light blue barplot, always with a box around it

CHEAT SHEET!
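
The lecture's my.barplot is not shown on the card, so the following is only a guess at what such a wrapper might look like. It fixes the colour, forwards everything else through ... to barplot(), and then draws a box. The counts vector is made-up data:

```r
# assumption: a sketch of my.barplot, not the lecture's actual code
my.barplot <- function(x, ...) {
  # ... collects any extra arguments (main, ylab, ...) and hands them on
  mids <- barplot(x, col = "lightblue", ...)
  box()            # always draw a box around the plot
  invisible(mids)  # bar midpoints, as returned by barplot()
}

counts <- c(a = 3, b = 5, c = 2)
my.barplot(counts, main = "Counts", ylab = "n")
```

Because col is already set inside the wrapper, passing col again through ... would raise an error; only arguments that the wrapper does not set itself can be delegated this way.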

6
Q

Correlation

A
  • observe the association between two numerical variables
  • if two numerical variables are associated we say they are correlated
  • the correlation coefficient is a quantity that describes the strength of the association
7
Q

Observation

A
  • individuals with high amounts of C20-22 fatty acids also have higher insulin sensitivity
  • two variables vary together in the same direction
  • there is a lot of covariation or correlation
  • direction and magnitude of a correlation can be
    quantified with the correlation coefficient r
  • value range [-1,+1]
  • value 0: the variables do not vary together
  • negative value: as one variable increases, the other decreases
  • positive value: both change in the same direction
  • values of 1 or -1: all points lie exactly on a straight line
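
The extreme values of r can be checked with cor() on made-up data in which y is an exact linear function of x:

```r
x <- 1:10

r_up   <- cor(x, 2 * x + 1)  # points on a rising straight line
r_down <- cor(x, -3 * x)     # points on a falling straight line

print(c(r_up = r_up, r_down = r_down))
```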
8
Q

Interpretation of r?

A

The Pearson correlation coefficient, denoted as “r,” measures the strength and direction of a linear relationship between two continuous variables. It ranges from -1 to +1, where positive values indicate a positive correlation, negative values indicate a negative correlation, and values close to 0 indicate a weak or no correlation. It is commonly used to assess relationships between variables in various fields of study.

-> Don't pool two different populations in one correlation!
-> Pearson correlation is sensitive to outliers
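
Why pooling two populations is risky can be sketched with made-up data: within each group the trend is perfectly negative, yet the pooled correlation comes out strongly positive because the groups differ in their means.

```r
# hypothetical groups A and B, each with a falling trend
xa <- 1:5;   ya <- 5:1
xb <- 11:15; yb <- 15:11

r_a      <- cor(xa, ya)                 # within group A
r_b      <- cor(xb, yb)                 # within group B
r_pooled <- cor(c(xa, xb), c(ya, yb))   # both groups mixed together

print(c(A = r_a, B = r_b, pooled = r_pooled))
```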

9
Q

What does the squared correlation r^2 mean?

A
  • r^2 is often also called the coefficient of determination
  • r^2 is between 0 and 1 and never larger than |r|
  • r^2 is interpreted as the fraction of variance that is shared
    between the variables

R-squared (coefficient of determination) measures how well a regression model fits the data. It ranges from 0 to 1, where 1 means a perfect fit, and 0 means no fit. It shows the proportion of the dependent variable variance explained by the independent variable(s) in the model.
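
For a simple linear regression with one predictor, the squared Pearson correlation and the model's R-squared are the same number. A small check on made-up data:

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)  # hypothetical measurements

r    <- cor(x, y)
r2   <- r^2                              # squared correlation
fit  <- lm(y ~ x)
r2lm <- summary(fit)$r.squared           # R-squared reported by lm

print(c(r = r, r2 = r2, r2lm = r2lm))
```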

10
Q

What’s the Spearman Rank Correlation and when use it?

A
  • Spearman correlation is more robust against outliers!
  • A correlation that rests on a single outlier should not come out as significant!!
  • Spearman correlation is calculated on ranks of values, not on
    the values directly.
  • It’s a non-parametric test.
  • It does not assume a normal distribution of the data.
  • It is more conservative.
  • If in doubt use Spearman correlation

The Spearman rank correlation (ρ) measures the strength and direction of the monotonic relationship between two variables. It is used when the relationship is non-linear, ordinal, or when data contains outliers. It is a non-parametric alternative to Pearson correlation.
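
The robustness against outliers can be sketched on made-up data: nine loosely related points plus one extreme point. Pearson is pulled up strongly by the outlier, while Spearman only sees its rank.

```r
# hypothetical data: the last point (30, 40) is the outlier
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)
y <- c(5, 3, 6, 2, 7, 1, 8, 4, 9, 40)

r_pearson  <- cor(x, y, method = "pearson")
r_spearman <- cor(x, y, method = "spearman")

# same comparison without the outlier
r_pearson_clean <- cor(x[-10], y[-10], method = "pearson")

print(c(pearson = r_pearson, spearman = r_spearman,
        pearson_without_outlier = r_pearson_clean))
```

cor.test(x, y, method = "spearman") would additionally give a p-value for the rank correlation.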

11
Q

When to use Spearman and when Pearson?

A
  • Normal distribution and no outliers -> Pearson
  • Non-normal
    1. try to normalise your data; if that is possible -> Pearson
    2. if you can't normalise the data -> Spearman or Kendall's tau (even more robust to outliers)
12
Q

Effect size: r and rs

A
  • Pearson's r and Spearman's rs are quite similar in their values
  • but rs^2 is the proportion of variance in the ranks that is accounted for
  • Kendall's τ is numerically different:
    about 66-75% of r or rs; don't square it
  • r of 0.1 small effect, 1% of variance
  • r of 0.3 medium effect, 9% of variance
  • r of 0.5 large effect, 25% of variance
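
A sketch comparing the three coefficients on made-up data, followed by the conversion from r to percent of variance used in the thresholds above:

```r
# hypothetical data without ties
x <- 1:9
y <- c(2.1, 1.5, 3.8, 1.2, 4.4, 0.9, 6.0, 2.6, 7.5)

r   <- cor(x, y, method = "pearson")
rs  <- cor(x, y, method = "spearman")
tau <- cor(x, y, method = "kendall")   # typically the smallest of the three
print(c(r = r, rs = rs, tau = tau, ratio = tau / rs))

# effect size thresholds: r squared, expressed in percent of variance
pct <- c(small = 0.1, medium = 0.3, large = 0.5)^2 * 100
print(pct)
```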
13
Q

What is partial correlation?

A

Partial correlation is a statistical method that measures the relationship between two variables while controlling for the influence of other variables. It allows assessing the direct association between the two variables of interest, removing the effects of confounding factors.

Remember: Male and female mixture ..
e.g: partial correlation of body height and weight after removing the effect of sex

When we control for the control variable(s) in the relationship between variable 1 and variable 2, we find the following (non-)significant partial correlation:
r(df) = …, 95%CI = […,….], p < ….
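
One way to sketch the height/weight example in base R is to regress both variables on sex and correlate the residuals, which gives the partial correlation. The data below are made up, and this is only an illustration, not necessarily the method used in the lecture:

```r
# hypothetical data: 5 women and 5 men
sex    <- factor(c("f", "f", "f", "f", "f", "m", "m", "m", "m", "m"))
height <- c(160, 165, 170, 162, 168, 178, 183, 188, 180, 186)
weight <- c(62, 58, 60, 55, 65, 82, 78, 80, 75, 85)

r_raw <- cor(height, weight)   # mixes the sex difference into r

# remove the effect of sex from both variables, then correlate what is left
res_h <- resid(lm(height ~ sex))
res_w <- resid(lm(weight ~ sex))
r_partial <- cor(res_h, res_w)

print(c(raw = r_raw, partial = r_partial))
```

In this made-up sample most of the raw correlation comes from men being both taller and heavier, so the partial correlation is much smaller.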

14
Q

What is Mutual Information?

A

Mutual information measures the degree of dependence or shared information between two random variables. It quantifies how much knowing one variable reduces uncertainty about the other. High mutual information indicates strong dependence, while low or zero mutual information suggests independence. It is used in various fields, including machine learning and feature selection.

  • Pearson correlation only works for linear relationships between two variables
  • mutual information works for any kind of relationship between two variables
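
A minimal sketch of mutual information for discrete variables, computed from the joint frequency table. The helper name mutual.info is made up for this example and is not a base R function:

```r
# mutual information in bits for two discrete vectors of equal length
mutual.info <- function(x, y) {
  pxy <- table(x, y) / length(x)             # joint probabilities
  pind <- outer(rowSums(pxy), colSums(pxy))  # product of the marginals
  nz <- pxy > 0                              # skip empty cells (0 * log 0 = 0)
  sum(pxy[nz] * log2(pxy[nz] / pind[nz]))
}

# y = x^2 is a perfect but non-linear relationship
x <- -3:3
y <- x^2
r  <- cor(x, y)          # Pearson sees no linear association
mi <- mutual.info(x, y)  # mutual information is clearly above zero
print(c(r = r, mi = mi))
```

For continuous measurements the values would first have to be binned, and the result depends on the binning.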