Behavioural Analytics Flashcards

(202 cards)

1
Q

What does GAM stand for?

A

Generalised Additive Model

2
Q

Why are GAMs useful in looking at emotion?

A

We have to understand how emotions change over time - looking at people interacting we often have time series data.

3
Q

What are often used to look at emotional states?

A

Valence and arousal are often used to describe the emotional state.

4
Q

What is trace annotation?

A

People watch videos and continuously annotate what their valence and arousal is at any given time.

This typically goes up and down. Our classical statistical techniques we use (eg linear models) aren’t great for this. Therefore we need GAMs.

5
Q

What do GAMs allow?

A

Allows you to analyse trace data in a way that is similar to the ideas within regression.

There are advanced modelling abilities.

6
Q

What command in R is used for linear models?

A

lm()

7
Q

What is the equation for a straight line?

A

y = mx + c

y = α + βx + ε

where ε (epsilon) represents the errors around the line

8
Q

In regression, what should we always be checking?

A

Our assumptions of linearity

Check the partial plots (in a multiple regression) or regression plots and investigate to see if the data has some curved nature.

Eg curvature in the residuals vs the fitted values suggests a straight line is not a good way to capture the relationship.

9
Q

What is Anscombe's Quartet?

A

A visual warning that you should always visualise the data and do some EDA.

Four datasets share nearly identical summary statistics yet look very different when plotted.

10
Q

What is a more modern version of Anscombe's Quartet?

A

The Datasaurus dozen

11
Q

What is Simpson’s Paradox?

A

A statistical phenomenon where an association between two variables in a population emerges, disappears or reverses when the population is divided into subpopulations.
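The reversal can be demonstrated with a few lines of R - a sketch on simulated data (the group structure and numbers are invented for illustration):

```r
# Simpson's paradox demo: both groups have a negative within-group slope,
# but the group means are arranged so the pooled slope is positive.
set.seed(42)
g1 <- data.frame(group = "A", x = rnorm(50, mean = 2))
g1$y <- 4 - g1$x + rnorm(50, sd = 0.3)
g2 <- data.frame(group = "B", x = rnorm(50, mean = 6))
g2$y <- 12 - g2$x + rnorm(50, sd = 0.3)
both <- rbind(g1, g2)

coef(lm(y ~ x, data = g1))["x"]    # negative (about -1)
coef(lm(y ~ x, data = g2))["x"]    # negative (about -1)
coef(lm(y ~ x, data = both))["x"]  # positive - the pooled association reverses
```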

12
Q

What are other non-linear models like GAMs?

A

LOESS - locally estimated scatterplot smoothing

ARIMA - auto regressive integrated moving average

13
Q

What is the function for a GAM?

A

y = α + f(x) + ε

It is the regression equation as before, but we swap out the single beta coefficient for a function.

This function allows us to come up with a way of capturing the data - splines that combine to make a non-linear smooth representation of the function.

Coefficients tell you about the nature of the basis functions; you add them together to give you an overall “wiggly” line.

14
Q

What do we call the combination of basis functions?

A

A smooth

15
Q

What is the line of code to produce a GAM model?

A

model <- gam(dependent ~ s(predictor), data = data)

16
Q

What is different about gam() compared to lm()?

A

The predictor variable is wrapped in s() which instructs R to come up with a function which best fits this data.

17
Q

Once we have created a GAM model, what functions do we call on the model?

A
  • summary(model)
  • coef(model)
  • plot(model)
  • gam.check(model, pages = 1)
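A minimal end-to-end sketch of that workflow using the mgcv package (the data here is simulated and the variable names are illustrative):

```r
library(mgcv)

set.seed(1)
df <- data.frame(time = seq(0, 10, length.out = 200))
df$valence <- sin(df$time) + rnorm(200, sd = 0.3)  # a wiggly signal plus noise

model <- gam(valence ~ s(time), data = df, method = "REML")

summary(model)               # smooth term significance, EDF, deviance explained
coef(model)                  # coefficients of the basis functions
plot(model, shade = TRUE)    # the fitted smooth with a confidence band
gam.check(model, pages = 1)  # residual diagnostics and basis-dimension check
```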
18
Q

In the output, what is the EDF?

A

Effective degrees of freedom

19
Q

What does a very small p value represent?

A

That the smooth term is statistically significant - the fitted curve explains the data significantly better than no effect.

20
Q

What does gam.check() do?

A

Checks whether the basis dimension (k) is large enough to capture the curvature, alongside residual diagnostic plots.

We don't want a very small p-value in the gam.check output.

Higher p-values are preferred because they suggest the chosen k is sufficient and the model residuals are well-behaved.

21
Q

What should we change about the GAM model?

A

Change the basis functions or knots by adjusting the k argument within the smooth function. k controls the wiggliness of the lines.

model <- gam(response ~ s(predictor, k = 15), data = data)

22
Q

What is concurvity?

A

An issue we need to deal with, the smooth equivalent of collinearity.

23
Q

What determines the wiggliness of a smooth?

A
  • The number of knots / basis functions
  • The smoothing parameter - lambda
24
Q

How do we change the lambda smoothing parameter?

A

Use the term sp = within the GAM specification

model <- gam(response ~ s(predictor, k = 15), sp = 0.1, data = data)

25
What other way can we change the smoothing?
The REstricted Maximum Likelihood (REML) method

model <- gam(response ~ s(predictor, k = 15), method = "REML", data = data)
26
How can we improve GAM plots?
  • Adding confidence intervals
  • Adding residuals
27
How can we improve GAM plots by adding confidence intervals?
Adding confidence intervals (variability bands) and shading them.

plot(model, se = TRUE, shade = TRUE, shade.col = "rosybrown2")
28
How can we improve GAM plots by adding residuals?
plot(model, se = TRUE, shade = TRUE, shade.col = "rosybrown2", residuals = TRUE, pch = 1, cex = 1)
29
How can you add in covariates to the GAM?
As it is an "additive" model, the separate components simply add to create the overall model.

Looking at more than one function in the same model:
model <- gam(response ~ s(pred1) + s(pred2), data = data)

Or looking at them together as an interaction:
model <- gam(response ~ s(pred1, pred2), data = data)
30
In addition to covariates in GAM, how else can you have a multivariate GAM?
Include linear variables:
model <- gam(response ~ s(pred1) + pred2, data = data)

Include factor/categorical variables:
model <- gam(response ~ s(pred1, by = sex), data = data)
31
Tensor product smooths can be made for 2D and spatial data. What do tensors allow for?
Two differing scales to interact
32
What function is used to visualise GAMs?
vis.gam()
33
What is a GAMM?
Generalised Additive Mixed Model

They are the multi-level mixed model form. They are more sophisticated.
34
How do you initialise a new plot?
ggplot()
35
How do you change the "dot" of plotted points?
geom_point(shape = 1) creates a hollow circle
36
How do you plot lines on a plot?
geom_abline() - diagonal line
geom_hline() - horizontal line
geom_vline() - vertical line
37
How do you expand the axis in view of ggplot?
coord_cartesian(xlim = c(-1, 3), ylim = c(-1, 3))
38
Why may linear regression not be ideal for a real-world scenario?
In a real world data collection situation we never get data that falls along a straight line. We always have some aspects of the data that are not explained by the model.
39
What is a residual in linear regression?
The difference between the actual value and the value predicted by the model (y-ŷ) for any given point
40
How do you determine the predictions for data based on a model?
predict(model)
41
How do you determine the residuals of the data based on model predictions?
residuals(model)
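Putting predict() and residuals() together - a small sketch (the data frame and column names are invented for illustration):

```r
df <- data.frame(Time = 1:10)
df$Rating <- 2 * df$Time + rnorm(10)

model <- lm(Rating ~ Time, data = df)

df$predicted <- predict(model)    # the fitted value for each observation
df$residuals <- residuals(model)  # actual minus predicted (y - ŷ)

# the residual is exactly the gap between actual and predicted
all.equal(df$Rating - df$predicted, df$residuals, check.attributes = FALSE)
```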
42
How do you add the predicted values onto a plot?
+ geom_point(aes(y = predicted), shape = 1)

predicted must first be added to the dataframe.
43
How do you add on vertical lines to show the difference between a predicted value and the actual value (residuals)?
+ geom_segment(aes(xend = Time, yend = predicted), alpha = 0.5)
44
What does geom_segment() do?
Draws a straight line between two points - we use it to show the residual
45
How do you add labels to show the residuals in a plot?
geom_text(aes(y = predicted + (residuals / 2), label = paste0(round(residuals, 1))), nudge_x = 0.5, size = 2)

nudge - so the labels don't overlap with the points
46
What does adding fill = NA to geom_smooth() do?
Removes the default shaded confidence interval
47
What is Anscombe's quartet?
A statistical warning - it comprises four datasets that have nearly identical simple descriptive statistics, yet have very different distributions and appear very different when visualised.
48
What kind of non-linear relationship is better captured by Generalised Additive Models than linear models?
Curvilinear
49
Describe the data in the built-in anscombe data.
Columns 1-4 (x1, x2, x3, x4) contain x-values. Columns 5-8 (y1, y2, y3, y4) contain corresponding y-values.
50
How do you get a fitted line to extend beyond the range of the data?
fullrange = TRUE
51
How do you plot the anscombe data?
anscombeData <- data.frame()
for (i in 1:4) anscombeData <- rbind(anscombeData, data.frame(set = i, x = anscombe[, i], y = anscombe[, i + 4]))

ggplot(anscombeData, aes(x, y)) +
  theme_bw() +
  geom_point(size = 3, color = "red", fill = "orange", shape = 21) +
  geom_smooth(method = "lm", fill = NA, fullrange = TRUE) +
  facet_wrap(~ set, ncol = 2)
52
How do you account for randomness?
set.seed(123)

Setting the seed makes the random number generation reproducible.
53
How do you add randomness to a curve?
randomError <- rnorm(mean = 0, sd = 0.5 * sd(z), n = length(x))

Where z is the function outcome and x is the vector of data points
54
What is the difference between se = FALSE and fill = NA in geom_smooth()?
se = FALSE - removes the confidence interval
fill = NA - removes the fill colour of the confidence interval
55
What does geom_path() do?
Adds a line which connects the points in the order they are given in the data frame.
56
What should we always do when fitting a linear model?
Check our assumptions
57
What is the quick "cheat" way to get a smooth curve?
Using the geom_smooth() function.
58
What are the automatic methods of geom_smooth()
LOESS if the number of observations < 1000
GAM if the number of observations ≥ 1000
59
What are curvilinear lines also known as?
Smooths
60
What does LOESS stand for?
Locally estimated scatterplot smoothing
61
What is the disadvantage of using geom_smooth() over a GAM?
It is not as sophisticated. It draws the line but does not give us an output to interpret and understand (we need the mgcv package for this).
62
What package do we use for gams?
mgcv
63
What is nlme?
A mixed modelling regression package.
64
How is the wiggliness of a GAM curve determined?
The number of joining knots that are allowed in a spline
65
What is a spline?
A piecewise combination of lots of little cubic sections
66
How much data is there usually in emotion data?
Emotion examples use intensive longitudinal data so we typically have lots of data.
67
How do we check the number of basis functions or knots?
Using the p-value of the reported statistic as we can see it tells us "Low p-value (k-index<1) may indicate that k is too low"
68
When using the REML method, what happens to the P value when you re-run?
When using REML we have a p value that actually jumps around a bit. If we run this more than once we can see that the p value changes. This is telling us that something is wrong, as this is not reliable.
69
Which method should we use when running a GAM?
REML is probably best as a default.

If you don't specify this, it defaults to using Generalised Cross-Validation (GCV).
70
What does it mean if the GAM is not stable? How can you tell if the model is unstable?
It typically means that the model is overfitting, underfitting, or failing to converge properly. If the model is unstable, the p-value from the k-check (which tests whether the basis dimension k is sufficient) changes each time you run the model. This variability is a sign that the model is not robust.
71
What is one way to check stability?
run gam.check(model)
72
What does gamSim() do?
The function gamSim() generates example datasets that are commonly used for demonstrating GAMs.
73
What does a lower GCV score suggest?
Better model generalisability
74
What is concurvity?
Concurvity is a term used in generalized additive models (GAMs) and similar non-linear models to describe a situation where two or more smooth terms (or predictor variables) are highly correlated in terms of their functional forms. In simpler terms, concurvity occurs when the smooth terms or predictors in the model are strongly related or move in similar ways, leading to potential issues with model estimation and interpretation.
75
What is physio dash?
An R Shiny App
76
What are the three parts of a Shiny app?
  • Initialisation code
  • A UI part
  • A server part
77
What is the popular software design pattern, MVC?
Model–View–Controller (MVC)

The model is where the data is stored and manipulated - this is the server part in Shiny. The view part is the code for the user interface. The controller code glues these aspects together and deals with user inputs.
78
Why is simulated data beneficial to use?
It gives you some fast and straightforward data in a shape you define, this allows you to quickly show how data will work throughout the model and UI parts of your app.
79
In the physio_dash application, what do the sliders do?
Collect information concerning the nature of the simulated data - this information will be sent to the server part of the Shiny app to feed into the simulation functions.
80
Describe how time is created for the physiological features when the "simulate_data" button is pressed.
It is created as a vector with the seq() function. Time can be difficult to work with - one of the easiest ways is to treat it as an integer, using UNIX time/POSIX time.
81
How does UNIX deal with time?
By taking the number of seconds that have elapsed since Jan 01 1970 (UTC)
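In R this convention can be seen directly (a small illustrative sketch):

```r
# UNIX time 0 is the epoch: 1970-01-01 00:00:00 UTC
as.POSIXct(0, origin = "1970-01-01", tz = "UTC")

# one day later is 86400 seconds (24 * 60 * 60)
as.POSIXct(86400, origin = "1970-01-01", tz = "UTC")

# converting the other way: the current time as seconds since the epoch
as.numeric(Sys.time())
```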
82
What is the arima.sim() function?
ARIMA stands for AutoRegressive Integrated Moving Average.

arima.sim() simulates data for these kinds of models.
83
In arima.sim(), what does the model argument take?
model = list(ar = input$autocorrelation)

It takes the input from the autocorrelation slider in our UI and uses it as a simple autoregressive (AR) model.
84
How do you add an icon to the shiny app?
shiny::icon("heart-o")
85
How do you output a graph in the Shiny app?
UI - dygraphOutput()
Server - renderDygraph()
86
What is a dygraph, what does this package do?
Dynamic graph.

The dygraphs package produces nice dynamic, interactive graphs for time series style data.
87
What are some ways that dplyr is used for tidy data manipulation in physio dash?
(Exercise 4 in summary)

data <- data %>%
  dplyr::select(time, Heartrate) %>%  # keep only the time and Heartrate columns
  dplyr::mutate(biometric = "HR") %>%  # add a biometric column with the constant value "HR", identifying the data type
  dplyr::rename(value = Heartrate) %>%  # rename "Heartrate" to "value"
  dplyr::mutate(time_date = as.POSIXct(as.numeric(as.character(time)) / 1000, origin = "1970-01-01", tz = "Europe/London"))  # convert the UNIX timestamp (milliseconds) to human-readable time
return(data)
88
How is HTML code placed into the shiny app?
Using the tags command
89
What is a requirement for using the dygraphs package for time series data?
Time series data must be presented in xts format (type ?xts).

We opt for a POSIX form of data representation, using the POSIXct class with the as.POSIXct() function.
90
What does the combination of dygraph and Shiny allow?
Interactivity
91
In physio dash, what is a difference in dygraphs for heart rate and ECG?
  • ECG processing detects peaks, extracts RR intervals, and computes NIHR, while HR processing simply plots BPM values.
  • ECG processing extracts HRV metrics from the raw ECG signal, while HR processing only displays BPM trends.
  • ECG processing allows additional, advanced HRV visualisations, while HR processing is limited to basic HR plotting.
92
What is the features tab of physio dash for?
Taking a deeper dive into the data on an individual level
93
What is the biometrics tab of physio dash for?
Seeks to try and combine the data for a view of the different data streams all at the same time.
94
In a dygraph, how do you add a zoomable range selector?
dyRangeSelector()
95
In a dygraph, how do you change the colours?
eg within renderDygraph():

custom_palette <- c("red", "blue", "orange", "green", "darkgreen", "purple")  # defined outside

Then within dygraph:
dyOptions(colors = custom_palette)
96
How do you adjust the legend size within dygraph?
dyLegend(width = 500)
97
How do you input a conditional message for loading, if the data is taking time?
conditionalPanel(condition = "$('html').hasClass('shiny-busy')", tags$div("Loading...", id = "loadmessage"))
98
How do you add in a piece of text (not a heading/paragraph) to inform the user what a certain aspect is for?
helpText("Click and drag on the plot to zoom and select date ranges"),
99
What does the descriptive tab of physio dash show?
The plot presents the time series together in a series of faceted plots
100
How do you extract and transform the galvanic skin response data for plotting in the descriptive tab?
The descriptive tab plots six plots (facet wrapped), so the GSR must be split into its two components:

data_SCL <- data_GSR() %>%
  select(time, SCL, time_date) %>%
  mutate(biometric = "SCL") %>%
  dplyr::rename(value = SCL)

data_SCR <- data_GSR() %>%
  select(time, SCR, time_date) %>%
  mutate(biometric = "SCR") %>%
  dplyr::rename(value = SCR)
101
What is the difference between merge.zoo() and rbind()
merge.zoo() - biometric plot
rbind() - descriptive plot

Use rbind() when you want a tidy, long-format dataset for faceted plotting with ggplot2 (vertical stacking).
Use merge.zoo() when you need to align time series and create a wide-format dataset for dygraph().
102
How are GAMs utilised in physio dash?
Generalised Additive Mixed Models (GAMMs) are used to analyse the biometric data and display key statistics in value boxes.
103
What does renderValueBox() allow you?
It creates an infographic-style display; in particular we extract summary statistics and present them in an accessible manner.
103
How do you fit a GAMM model for the physio data?
eg within renderValueBox():

gamm.HR <- gamm(value ~ s(time), data = data_HR(), method = "REML", correlation = corAR1())
summary_gamm.HR <- summary(gamm.HR$gam)  # gamm() returns a list; the smooth summary lives in the $gam component
summary_gamm.HR <- round(summary_gamm.HR$edf, 2)

valueBox(
  summary_gamm.HR,
  "Heart Rate",
  icon = icon("heartbeat"),
  color = "red"
)
104
What does GAMM allow compared to GAM?
This is a more complicated style of regression model that allows us to incorporate correlations in the data into the model. This is useful where the independence assumptions of a regression are violated, we can bring them into the model as an approach to dealing with that violation.
105
What do we need to be aware of with time series data?
As these are time series data we have to be aware of autocorrelation and these gamm models allow us to deal with these issues at least to some extent.
106
What does gather() do?
eg

tidyr::gather(key = emotion, value = emotion_score, c("joy", "fear", "disgust", "sadness", "anger", "surprise"))

Creates a longer format, where each row represents a time point and the corresponding score for one emotion.

The wide format data (where each emotion has its own column) isn't suitable for plotting or time series analysis because it complicates visualising how emotion scores change over time. gather() converts the data into a long format, where you can easily plot emotion scores (emotion_score) against time (time), making it simpler to work with for visualisation.
107
What is the difference when using readr::read_csv()?
The readr function creates a tibble rather than a dataframe. This is the Tidyverse version of a data.frame/data.table that works in a way that is compatible with Tidyverse code
108
What is and how do you create a gauge chart?
A gauge chart is a type of chart that uses a radial scale to display data in the form of a dial.

renderGauge()
109
What kind of machine learning is inter-rater reliability relevant to?
Supervised machine learning - we get an algorithm to learn the relationship that exists between some set of inputs and known outcome. This is carried out on a training set of data, then we provide a novel set of data as an input and the algorithm makes a prediction or classification based on the learnings of the training set.
110
What does "ground truth" refer to?
In the context of machine learning, "ground truth" refers to the actual, true values or correct labels used for training, validation, or testing a machine learning model. Ground truth is the benchmark or standard that a model's predictions are compared against. It is the "truth" that we rely on when training models or evaluating their accuracy. It is an assumption or "operationalisation" of the truth - it typically depends on many theoretical assumptions that have been made when collecting material for the training set.
111
What do poorly defined assumptions result in?
A poorly defined and collected training set of data means the algorithms developed will not function well when given new data from the real world, as it will not perform according to the poorly-defined "ground truth".
112
What should you do with a training set?
Reserve some of it as a test set. We want to be able to test the extent to which the algorithm can do its job. We want to test it on previously unseen data. This also helps us to investigate if the algorithm is overfit to the training data.
113
What does it mean if the model is overfit?
If the algorithm is overfit to a particular set of data, it will perform very well on this data (eg even to go as far as accounting for the unique statistical anomalies) and will not generalise well to unseen data.
114
What is the disadvantage of labelled data?
Labelled data is usually expensive, labour intensive and takes a lot of time to create. Humans are often the people who create labelled data sets, and in effect the machine learning algorithms are trying to copy the human behaviour and set of decision processes that went into labelling data.
115
Why is there a temptation to use as much of the training set as possible, thus minimising the amount of data for testing? Why is this a problem?
It is so expensive to create training sets and machine learning performs better with large quantities of data. It will almost certainly lead to creating an algorithm that is overfit. You need to choose the proportions of test and training data wisely.
116
What does testing on a test set allow?
Allows you to improve the accuracy of the algorithm. How well the algorithm performs on the test set provides an error metric that you can use to improve your algorithm by changing the parameters, or adding components to more complex models.
117
What are synonyms for labelling?
Labelling, annotating, rating and coding All mean someone observing data and making a judgement about it.
118
What is the gold standard for labelling?
Using well trained experts with a well defined coding scheme. eg PhD students at universities have the time to create the labels or annotations. Due to the intensity of the work, large numbers of raters are not normally possible.
119
What is one way to get more raters?
Use naive raters who do not need to be trained in as much depth. This typically requires a simpler coding scheme and payment for the recruits. This is useful as control can be retained over the performance of the raters, but it takes a lot of organisation and teaching of the raters.
120
What is another alternative to expert or naive raters?
Crowdsourcing - using the web and internet to get access to a large number of raters.
  • eg set up a website and use gamification to get people to do labelling
  • eg use a paid crowdsourcing site, eg Amazon Mechanical Turk; typically these tasks need to be very simple and there is very limited control over the people who participate
121
What is a coding scheme?
A coding scheme is a structured system used to classify, categorise, and interpret data. It involves assigning labels, numbers, or categories to different elements of a dataset based on predefined rules.

Eg transcription - there could be variations in punctuation choices.
122
What can be a problem with coding schemes?
There can be a lot of room for subjective judgement
123
What do we need to do in response to the fact that there can be a lot of room for subjective judgement?
We need to be able to check how well the subjective decision makers agree, to give some idea of the consistency of the coding scheme and a measure of how objective its use is.
124
What are the two main ways coding can be done?
  • Discrete (categorical or nominal) coding
  • Continuous (ordinal, interval or ratio variables) coding
125
What is inter rater reliability (IRR)?
Inter-rater reliability (IRR) is a measure of how much agreement there is between multiple raters or observers. It's used to ensure that data is consistent and reliable, regardless of who collects or analyses it.
126
What is Hallgren's account of measurement error?
Observed score = True score + Measurement Error

Var(X) = Var(T) + Var(E)

ie the variance (the variability in the scores we observe) can usefully be thought of as the true score variance that we want to observe plus error variance.
127
How are IRR scores typically set up?
To give an estimate of how much of the true scores we are getting. eg an IRR estimate of 0.8 indicates that 80% of the observed variance is due to the true score variance, or similarity in ratings between coders and 20% is due to error variance or differences in ratings between coders.
128
What is the most simple type of inter rater agreement?
Percentage rater agreement - this captures the number of times two raters agree, in a very simple sense.

% Agreement = (no. of observations agreed on by the raters) / (total no. of observations)
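A quick sketch of the calculation in base R (the ratings are invented for illustration):

```r
rater1 <- c("happy", "sad", "happy", "neutral", "sad", "happy")
rater2 <- c("happy", "sad", "neutral", "neutral", "sad", "sad")

# proportion of observations on which the two raters agree
agreement <- mean(rater1 == rater2)
agreement * 100  # 4 agreements out of 6 observations, so about 66.7%
```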
129
What does percentage agreement work easily for?
Categorical data. It does not work so well for continuous data, where some sort of agreement interval is needed (ie turning the continuous data into a form of categorical data).
130
What is the downside of percentage agreement?
It does not take into account chance agreement
131
When can chance agreement be a really big problem?
In a simple classification problem where there are only two categories - it is likely that a lot of agreement occurs by chance. eg rating cells as cancerous or not - about 1 in 20 cells is expected to be cancerous (p = 0.05)
132
What function is used to tell us how similar ratings are?
The agree() function from the IRR package.
133
How do we account for chance agreement between raters?
Cohen's Kappa
134
What does Cohen's Kappa range from?
-1 to +1

0 - no agreement
1 - perfect agreement
Negative - systematic disagreement
135
How do we combine the ratings of different raters?
cbind() - each column is a different rater and rows are subjects
136
What are the limitations of Cohen's Kappa?
It is limited to cases where there is categorical data and two raters (nominal - Hallgren). If we want to have faith in our ground truth, we would rather have it coming from more than the opinion of just two raters.
137
For continuous data with more than two raters, what is the appropriate statistic?
Intraclass correlation coefficient (ICC)
138
What is the code for Cohen's Kappa?
kappa2()
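A hedged sketch of both steps with the irr package (the ratings are invented; columns are raters, rows are subjects):

```r
library(irr)

rater1 <- c("A", "B", "A", "C", "B", "A", "C", "B", "A", "B")
rater2 <- c("A", "B", "A", "C", "A", "A", "C", "B", "B", "B")
ratings <- cbind(rater1, rater2)

agree(ratings)   # simple percentage agreement (no chance correction)
kappa2(ratings)  # Cohen's kappa, which corrects for chance agreement
```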
139
Who proposed different ICCs?
Shrout and Fleiss
140
How many ways of calculating ICCs did Shrout and Fleiss suggest?
6 different ways Appropriate depending on the characteristics of the data and the goals of the researchers
141
What is something to consider in ICC?
The spread of ratings across the whole data set. Due to the expense and time associated with rating, often sets of ratings are only partially coded by different people.

Eg full coding: every rater codes every subject.
Eg one rater has much more time available: a subset of the material is coded by other raters to ensure that the main rater is coding according to the coding scheme (often 10% may be coded by others).
Eg (often in online ratings with many naive raters): different subsets of the material are rated by different people, so that no single rater rates all the data.

ICC can handle all of these situations but you need to be aware of which style of rating you are using.
142
What is the ideal scenario for ICC?
Fully crossed - two way model
143
Describe the two way model.
We have information about all of the raters rating all of the subjects. This allows you to see how the two things interact - the raters and the subjects that they rate.
144
What are the four things to consider for ICC?
  • Whether it is fully crossed or not
  • How the ratings should be interpreted (absolute values or consistency of ratings)
  • The way the coding is set up (average or single measures)
  • Whether coders selected for the study are considered to be random or fixed effects
145
What is the difference between IRR and ICC?
Both measure agreement between raters. Inter-Rater Reliability (IRR) is the general concept; the Intraclass Correlation Coefficient (ICC) is a specific family of statistics used to quantify IRR, and the two differ in exactly what they measure and how they are calculated.
146
What does a fully crossed design take into account that a non-fully crossed model does not?
In a fully crossed design, the ICC can take into account systematic deviations between the coders because it has that information
147
Why are there different equations for fully crossed and not fully crossed models?
Models which are not fully crossed do not have enough information so the systematic deviation must be left out
148
When do we use a two way model and when do we choose a one way model?
  • Two way model: when it is fully crossed
  • One way model: when it is not fully crossed
149
What is the difference in equation for the fully crossed and not fully crossed?
  • In the not fully crossed model, only information about the ratings (r) can be used
  • In the fully crossed model, information about both the ratings (r) and the raters/coders (c) can be used, and there is an interaction between them, making this a two-way design (rc)
150
Discuss why you would want to know how the ratings should be interpreted.
  • Sometimes we are interested in absolute values, ie that the raters get the exact value right - eg the intensity of smiles
  • Sometimes we are interested in the consistency of ratings, ie how the ratings change - we want to see if values go up and down in the same way, but it does not matter if the exact numbers are different
151
Discuss why you want to consider the way people set up the coding.
  • We can use the average of all the raters to calculate the ICC in a fully crossed design - we have enough information to use this, and it allows more confidence and a higher ICC as we have more of the relevant data
  • When we use a subset of ratings to justify the ratings of a single coder, we have to say it is single measures, which is a more conservative calculation
152
Discuss why you should consider if coders selected for the study are considered to be random or fixed effects.
  • If coders are selected from a larger population and the ratings are meant to generalise to that population, we can use a random effects model (Random Model)
  • If you do not wish to generalise the results to a larger population of coders, or if the coders in the sample are not randomly sampled, use a fixed effects model (subjects considered random but coders considered fixed)
153
What are the different types of ICC we considered?
Different types of ICC exist depending on the study design and whether raters are considered random or fixed effects (Shrout and Fleiss).
154
What do A and C refer to in ICC notation?
Uses the nomenclature from McGraw and Wong:

C - consistency
A - absolute agreement
155
What is the code for computing the ICC?
eg for ICC(1,1):

icc(dataicc1, model = "oneway", type = "agreement", unit = "single")
156
What is one issue that can be difficult and common in ICC?
Missing data
157
How do we deal with missing data for ICC?
The irrNA package, available on CRAN, helps deal with missing data - it copes with randomly missing data.
158
What format does irrNA expect the data to be in?
Columns - raters
Rows - subjects
May need to transpose the data
159
How do we transpose the data?
t(data)
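The same transposition can be illustrated in plain Python (a hypothetical ratings matrix, not course data) to show what `t(data)` does to a raters-by-subjects table:

```python
# Hypothetical ratings matrix: rows = raters, columns = subjects
ratings = [
    [3, 4, 2],   # rater 1
    [3, 5, 2],   # rater 2
]

# Transpose so that rows = subjects, columns = raters
# (the layout irrNA expects, analogous to t(data) in R)
transposed = [list(row) for row in zip(*ratings)]

print(transposed)  # [[3, 3], [4, 5], [2, 2]]
```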
160
What is Krippendorff's alpha?
A
A modern approach to IRR which can be used for all kinds of data (eg nominal, ordinal, interval, ratio). It is newer and therefore less familiar than other methods (it has not been adopted as widely). It is also robust to missing values.
161
Where is Krippendorff's alpha less flexible than the various types of intraclass correlation?
Interval data
162
How do we conduct a Krippendorff's alpha?
A
kripp.alpha()
Pass in the data and the "type" of data if applicable
May need to transpose the data
163
What is one of the advantages of Krippendorff's alpha?
The ability to apply the same measurement across different forms of data.
164
What is the drawback of Krippendorff's alpha?
A
It does not have the flexibility offered by the varieties of ICC that we can engage in.
165
When applied to interval data, what is Krippendorff's alpha equivalent to?
A
ICC(1) - this is the special case of Krippendorff's alpha.
166
How does the data expected by kripp.alpha() differ from that expected by icc()?
A
The data format expected by kripp.alpha() is transposed from that expected by icc().
167
What are regular expressions?
They are a way to describe a set of strings. They allow us to create patterns that can then be used to search and replace very efficiently.
168
In regular expressions, what is the difference between [0-9] and [0-9]+?
[0-9] matches a single digit; to match numbers longer than one digit you add a plus at the end: [0-9]+ matches one or more digits in a row.
[A-Za-z0-9]+ matches runs of uppercase letters, lowercase letters, and digits.
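The difference can be illustrated with Python's re module (a hypothetical example string, the patterns work the same way in R):

```python
import re

text = "Room 101 opened in 1984"

# [0-9] matches one digit at a time
digits = re.findall(r"[0-9]", text)         # ['1', '0', '1', '1', '9', '8', '4']

# [0-9]+ matches runs of one or more digits, ie whole numbers
numbers = re.findall(r"[0-9]+", text)       # ['101', '1984']

# [A-Za-z0-9]+ matches runs of letters and digits
tokens = re.findall(r"[A-Za-z0-9]+", text)  # ['Room', '101', 'opened', 'in', '1984']
```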
169
What is * in regular expressions?
A quantifier - it matches the preceding element zero or more times.
170
What are the quantifiers in regular expressions?
* + ?
171
What is + in regular expressions?
Matches something one or more times
Regex: go+d
Matches: god, good, goood (but NOT gd)
172
What is ? in regular expressions?
Matches something zero or one time
a? → matches "", "a" (but NOT "aa")
colou?r → matches "color" and "colour"
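All three quantifiers can be checked with Python's re module (the patterns are the ones from the cards above):

```python
import re

# ? - the preceding element appears zero or one time
assert re.fullmatch(r"colou?r", "color")            # matches
assert re.fullmatch(r"colou?r", "colour")           # matches
assert re.fullmatch(r"colou?r", "colouur") is None  # does not match

# + - the preceding element appears one or more times
assert re.fullmatch(r"go+d", "goood")
assert re.fullmatch(r"go+d", "gd") is None

# * - the preceding element appears zero or more times
assert re.fullmatch(r"go*d", "gd")
assert re.fullmatch(r"go*d", "good")
```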
173
What is "\d" in regex?
Matches a single digit, in the same way as [0-9].
174
What is "\w" in regex?
Matches any word character, like [A-Za-z0-9_]; add a + to match full words.
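A quick illustration with Python's re module (a hypothetical example string; \w works the same way in R regex):

```python
import re

text = "GAMs handle time-series data"

# \w matches a single word character at a time
chars = re.findall(r"\w", "a-b")   # ['a', 'b']

# \w+ matches runs of word characters, ie whole words
words = re.findall(r"\w+", text)   # ['GAMs', 'handle', 'time', 'series', 'data']
```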
175
How do we find words within a string?
str_detect()
In our example, we specified the column the text was in:
str_detect(headlines$title, "word1|word2|word3")
str_detect(headlines$title, "[0-9]+") - for headlines with numbers
str_detect(headlines$title, "\"[a-zA-Z\"]") - for headlines with quotes - this requires the escape character \
176
How do we find the position of matched words?
str_match() to return matched patterns
eg str_match(headlines$title, "word")
(str_locate() returns the positions of matches)
177
How do we find matched lines?
str_subset() to return matched lines
eg str_subset(headlines$title, "word")
178
How do you replace matches with new text?
str_replace() or str_replace_all()
eg str_replace_all(headlines$title, "Cameron", "Pancake")
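The same idea in Python's re module, for comparison (a hypothetical headline string):

```python
import re

headline = "Cameron announces new policy"

# re.sub() replaces every match of the pattern,
# like str_replace_all() in R's stringr
replaced = re.sub(r"Cameron", "Pancake", headline)
print(replaced)  # Pancake announces new policy
```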
179
How do we import the IMDB dataset?
from datasets import load_dataset
imdb_dataset = load_dataset("imdb")
180
How should we investigate a dataset?
dataset.shape
dataset.num_columns
dataset.num_rows
dataset.column_names
type(dataset)
181
What is one of the biggest barriers of natural language processing?
We have to get our text into a shape that can be used by pre-trained models. Each model expects the text in a very precise format, with special tokens added that inform the model of, for example, the start and end of sentences.
eg for BERT - words get turned into tokens that have integer labels, and there are a few special tokens that need to be used to delimit the boundaries of the sentences.
182
What are the tokens for BERT?
[CLS] - all sentences start with this special token
[SEP] - all sentences end with this token
[UNK] - used when a word is unknown
[PAD] - fills out the empty space at the end of sentences; all sentences need to be the same length to keep the matrix square
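The role of these tokens can be sketched in plain Python (a hypothetical illustration only - real BERT tokenisers map subword tokens to integer IDs rather than keeping strings):

```python
# Sketch of how BERT-style special tokens are added and sentences
# padded to a common length (hypothetical tokens, not real BERT IDs)
def add_special_tokens(sentences, max_len):
    batch = []
    for tokens in sentences:
        tokens = ["[CLS]"] + tokens + ["[SEP]"]        # mark sentence boundaries
        tokens += ["[PAD]"] * (max_len - len(tokens))  # pad to a fixed length
        batch.append(tokens)
    return batch

batch = add_special_tokens([["good", "movie"], ["bad"]], max_len=5)
print(batch[1])  # ['[CLS]', 'bad', '[SEP]', '[PAD]', '[PAD]']
```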
183
How do we create the autotokeniser to prepare the data?
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
This now contains the tokeniser associated with the "bert-base-cased" model
184
What kind of design can you use Cohen's kappa on?
Fully-crossed designs with exactly two coders
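As an illustration of the statistic itself, Cohen's kappa for two coders compares observed agreement with chance agreement, kappa = (p_o - p_e) / (1 - p_e). A minimal pure-Python sketch on hypothetical nominal ratings (not course data):

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two coders rating the same subjects (nominal data)."""
    n = len(rater1)
    # Observed agreement: proportion of subjects both coders label the same
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: sum over categories of the product of marginal proportions
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (p_o - p_e) / (1 - p_e)

r1 = ["yes", "yes", "no", "yes", "no", "no"]
r2 = ["yes", "no", "no", "yes", "no", "yes"]
print(round(cohens_kappa(r1, r2), 3))  # 0.333
```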
185
What is a difference between Cohen's Kappa and ICC?
ICCs incorporate the magnitude of disagreement when computing IRR estimates: larger-magnitude disagreements result in lower ICCs than smaller-magnitude disagreements (unweighted Cohen's kappa treats all disagreements equally).
186
Which ICCs tend to have higher values?
Average-measure ICCs are higher than single-measure ICCs.
187
What kind of deletion for missing data does ICC use?
List-wise deletion
Therefore it cannot accommodate datasets in fully-crossed designs with large amounts of missing data - Krippendorff's alpha may be more suitable when problems are posed by missing data in fully-crossed designs.
188
When would you use average-measures?
When you have all subjects rated by all coders. The researcher is likely interested in the reliability of the mean ratings provided by all coders.
189
What is the null hypothesis for IRR?
That ICC = 0
190
What is the code you need if a package is not installed?
install.packages("readr")
191
What outputs can you run after a model?
The sum of squares of the residuals:
print(sum(curveData$residuals^2))
or
anova(model)
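The residual sum of squares is just the sum of squared differences between observed and fitted values; a minimal Python sketch with hypothetical numbers shows what `sum(curveData$residuals^2)` computes:

```python
# Hypothetical observed values and model predictions
observed  = [2.0, 3.1, 4.9, 6.2]
predicted = [2.2, 3.0, 5.0, 6.0]

# Residuals are observed minus fitted values
residuals = [o - p for o, p in zip(observed, predicted)]

# Residual sum of squares, as in sum(curveData$residuals^2)
rss = sum(r ** 2 for r in residuals)
print(round(rss, 2))  # 0.1
```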
192
How do you check the assumption of linearity?
plot(model, 1) - we want the first plot: residuals against the fitted (predicted) values
193
What function generates diagnostic plots to check if the model assumptions hold?
appraise(model)
194
How can you plot the model using gratia?
draw(curveGAMModel, rug = FALSE)
195
How do you extract the basis dimensions from the model using gratia?
model$smooth[[1]]$bs.dim
196
How should you investigate imported data?
data <- read.csv("file.csv")
- str(data)
- glimpse(data)
- head(data)
197
What does REML help with?
Model stability (REML = restricted maximum likelihood)
198
How do you rearrange the data to select only columns y and x2, removing all others, sorting based on x2?
stableData1 <- dplyr::select(stableData1, y, x2) %>% arrange(x2)
199
How do you check for concurvity?
concurvity(model, full = TRUE)
full = FALSE for pairwise comparisons
Want values to be < 0.8
200
For adding a different smooth type, what should we look at?
?smooth.terms
201
What help commands can you use for the plots in physio dash?
?dygraphs