Stats 6 - Non-Linear Models Flashcards by David Stocker

What is characteristic of Linear models?

The model is linear in its co-efficients/parameters (β₀, β₁, β₂) –> simple to fit
The data can be fitted with the Ordinary Least Squares (OLS) solution –> minimizing the sum of the squared residuals

Example shown below

Note that even the last example with e^β₀ is linear, as e^β₀ is a constant term
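
As a minimal sketch of what the OLS solution means in practice, the R code below fits a straight line with lm() and reproduces the same co-efficients from the closed-form normal equations. The data are simulated and the names x, y, fit_lm and beta_hat are assumptions made for illustration only.

```r
# Simulated data (an assumption for illustration, not course data)
set.seed(42)
x <- 1:20
y <- 2 + 0.5 * x + rnorm(20, sd = 0.3)

# lm() returns the OLS solution directly
fit_lm <- lm(y ~ x)

# The same solution from the closed-form normal equations: (X'X) beta = X'y
X <- cbind(1, x)
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

coef(fit_lm)     # intercept and slope from lm()
drop(beta_hat)   # intercept and slope from the normal equations
```

Because the model is linear in its parameters, no starting values or iterations are needed.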

How can we characterise Non-Linear Models?

A non-Linear model is not linear in the parameters

Examples of Non-Linear models

In all of these examples, at least one parameter enters the model non-linearly (e.g. as an exponent, x_i^β₂)

When trying to fit a Linear model to our data, how do we decide what’s best?

Least Squares Solution!

Can we apply the Least Squares solution to a Non-Linear Model?

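No! –> The exact (closed-form) OLS solution does not work when the model is non-linear in its parameters; the least squares solution has to be approximated numerically instead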

Why do we care about Non-Linear models in the first place?

Many observations in biology are not well-fitted by linear models –> the underlying biological phenomenon is not well described by a linear equation

Examples:

Michaelis-Menten Biochemical Kinetics
Allometric growth (growth of two body parts in proportion to each other)
Response of metabolic rates to changing temperature
Predator-prey functional response
Population growth
Time-series data (sinusoidal patterns)

Non-Linear model Example – Temperature and Metabolism

Enzyme responsible for Bioluminescence is very temperature dependent –> captured by modified Arrhenius equation

So how do we fit data with a Non-Linear Model?

We can use a computer to find the approximate but close-to-optimal least squares solution!

Choose starting values –> guess some initial values for the parameters
Then adjust the parameters iteratively using an algorithm –> searching for decreases in RSS
Eventually, end up with a combination of β where the RSS is approximately minimized.

Note –> Better if your initial guess of the parameters is close to the global minimum

Outline the general procedure of fitting Non-Linear Models to data

General Procedure

1. Start with an initial value for each parameter
2. Generate a curve defined by the initial values
3. Calculate the RSS
4. Adjust the parameters to make the curve fit closer to the data (minimize the residual sum of squares) - Tricky part
5. Adjust the parameters again…
6. Iterative process –> repeat steps 4+5
7. Stop iterating when the adjustments make virtually no difference to RSS
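
The procedure above can be sketched in R as follows. This is a minimal sketch, assuming simulated Michaelis-Menten data (true V_max = 12 and K_M = 7, chosen purely for illustration) and using base R's general-purpose optimizer optim() as a stand-in for a dedicated NLLS routine; the names rss and start are hypothetical.

```r
# Simulated data (an assumption for illustration)
set.seed(1)
S_data <- c(1, 2, 5, 10, 20, 50, 100)
V_data <- 12 * S_data / (7 + S_data) + rnorm(length(S_data), sd = 0.2)

# Steps 2-3: curve for a parameter vector p = c(V_max, K_M) and its RSS
rss <- function(p) sum((V_data - p[1] * S_data / (p[2] + S_data))^2)

# Step 1: initial value for each parameter
start <- c(V_max = 1, K_M = 1)
rss(start)

# Steps 4-7: optim() adjusts the parameters repeatedly and stops
# when further adjustments barely change the RSS
opt <- optim(start, rss)
opt$par     # parameter values at the end of the search
opt$value   # RSS at the end of the search
```

In practice nls() or nlsLM() would be used instead, since they also report standard errors and convergence information.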

What are the two main types of Optimizing Algorithms used when adjusting parameters to minimize RSS?

Gauss-Newton algorithm is often used but doesn’t work well if the model to be fitted is mathematically complicated (parameter search landscape is difficult); it also struggles if the starting values you have supplied for the parameters are far from optimal
Levenberg-Marquardt –> algorithm that switches between Gauss-Newton and “gradient descent” (Helps decide which direction to take in a complicated landscape) –> more robust against starting values that are far from optimal and is more reliable in most scenarios.
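
To make the Gauss-Newton idea concrete, here is a bare-bones sketch of the update loop for a Michaelis-Menten curve. It is a simplified illustration on simulated data, not the implementation used inside nls() (which adds safeguards such as step halving); the variable names are assumptions. Levenberg-Marquardt differs mainly by adding a damping term to t(J) %*% J, which shifts the step towards gradient descent when the landscape is difficult.

```r
# Simulated data (an assumption for illustration)
set.seed(1)
S <- c(1, 2, 5, 10, 20, 50, 100)
V <- 12 * S / (7 + S) + rnorm(length(S), sd = 0.2)

p <- c(V_max = 10, K_M = 5)   # starting values reasonably close to optimal
for (i in 1:50) {
  V_max <- p[["V_max"]]
  K_M   <- p[["K_M"]]
  r <- V - V_max * S / (K_M + S)            # current residuals
  # Jacobian: derivative of the prediction w.r.t. each parameter
  J <- cbind(S / (K_M + S), -V_max * S / (K_M + S)^2)
  step <- solve(t(J) %*% J, t(J) %*% r)     # Gauss-Newton step
  p <- p + drop(step)
  if (sum(abs(step)) < 1e-8) break          # adjustments no longer matter
}
p

# For comparison: nls() from the same starting values
fit_nls <- nls(V ~ V_max * S / (K_M + S), start = list(V_max = 10, K_M = 5))
coef(fit_nls)
```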

What should you do when your Non-linear model has been fitted?

Once NLLS fitting is done, you need to get the goodness of fit measures –> Is the model representative?

First, we assess the fit visually
Report the goodness of fit results:
a) Residual Sum of Squares (RSS)
b) Estimated co-efficients
c) For each co-efficient, we can present the confidence interval (the range we are confident the co-efficient lies within), the t-statistic and the corresponding (two-tailed) p-value
You may also want to compare and select between multiple competing models

Note –> Unlike Linear models, R² should NOT be used to interpret the quality of an NLLS fit.

What are the NLLS assumptions?

NLLS has all the same assumptions as Ordinary Least Squares regression.

No/minimal measurement error in the explanatory variable
Data have constant variance –> errors in the y-axis are homogeneously distributed over the x-axis range
The measurement/observation errors are normally distributed (Gaussian)
Observations are independent of each other

What happens if the errors in our Non-Linear model are not normally distributed?

We have to interpret the results cautiously and consider using maximum likelihood or Bayesian fitting methods instead

What algorithm is normally used in R?

When using the nls() function –> Gauss-Newton algorithm is used

But for the Levenberg-Marquardt (LM) algorithm –> nlsLM() –> we require the installation of a package - minpack.lm

It offers additional features like the ability to “bound” parameters to realistic values

Outline the Coefficients in the Michaelis Menten equation.

Co-efficients

Vmax –> Maximum rate of reaction –> occurs at saturating substrate concentration
Km –> Substrate concentration at which the rate is Vmax/2 –> indication of affinity –> High Km = Low affinity / Low Km = High affinity

Km will dictate the overall shape of the curve –> does it approach Vmax quickly or slowly?

We have to remember that Vmax and Km have to be greater than zero –> important when picking starting values

How do you set up a Michaelis Menten model in R?

MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data))

V_data –> Rate of reaction

S_data –> Substrate concentration
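
For a version that can be run end to end, the sketch below first simulates some data and then fits the model. The data values, the true parameters (V_max = 12, K_M = 7) and the starting values are assumptions made for illustration; real V_data and S_data would come from your experiment.

```r
# Simulated data (an assumption for illustration)
set.seed(1)
S_data <- c(1, 2, 5, 10, 20, 50, 100)                     # substrate concentration
V_data <- 12 * S_data / (7 + S_data) +
  rnorm(length(S_data), sd = 0.2)                          # rate of reaction

# Fit the Michaelis-Menten model with explicit starting values
MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data),
                start = list(V_max = 10, K_M = 5))
MM_model
```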

When trying to fit a Non-Linear model in R, what will R do if you don’t input starting parameters/co-efficients?

For nls models you need to provide starting values for the parameters

If none are given then it will set all parameters to ‘1’ and work from there –> For simple models, despite the warning, this works well enough.

After fitting your Michaelis Menten model, what should you do?

The first step is to visualize how well the model fits the data

Create Plot

plot(S_data, V_data, xlab = "Substrate Concentration", ylab = "Reaction Rate")

Input Trendline from Model

lines(S_data, predict(MM_model), lty = 1, col = "blue", lwd = 2)

After plotting, gather some information using summary()

Estimates –> Estimated values for the Co-efficients (Vmax and Km)

Estimate/Std. error = t-value which has a given Pr(>|t|) –> T-Test to test for the statistical significance of the obtained estimate value

Number of iterations –> Number of times the NLS algorithm had to adjust the parameter values to find the minimal RSS solution.

Achieved Convergence tolerance –> tells you on what basis the algorithm decided it was close enough to the solution –> basically if the RSS does not improve past a certain point despite adjusting parameters the algorithm stops searching.
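
The pieces of the summary() output described above can also be pulled out individually. The following is a minimal sketch on simulated data (the data and starting values are assumptions for illustration); it uses the convInfo component stored in the fitted nls object.

```r
# Simulated data and fit (assumptions for illustration)
set.seed(1)
S_data <- c(1, 2, 5, 10, 20, 50, 100)
V_data <- 12 * S_data / (7 + S_data) + rnorm(length(S_data), sd = 0.2)
MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data),
                start = list(V_max = 10, K_M = 5))

s <- summary(MM_model)
s$coefficients              # Estimate, Std. Error, t value, Pr(>|t|)

MM_model$convInfo$finIter   # number of iterations
MM_model$convInfo$finTol    # achieved convergence tolerance
deviance(MM_model)          # residual sum of squares (RSS)
```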

What are the main differences between lm and nls summary() output?

Generally, the same format except for…

The last two rows are specific to an NLS output 

Number of Iterations
Achieved convergence tolerance

Why they are included?

NLLS is not an exact process –> it relies on an iterative numerical search rather than an exact formula.

Normally, the last two rows are not reported BUT they can be useful in solving problems if the fitting does not work

What is a quick way to obtain the co-efficient values from an nls model?

You can quickly obtain the co-efficient values from your nls model using the following code…

coef(MM_model)

Can an ANOVA be performed on a Non-Linear model?

NO! –> ANOVA cannot be performed on a non-linear model

How can you obtain confidence intervals for the co-efficient estimates of an nls model? What can they be used for?

One very useful thing you can do after NLLS fitting is to calculate/construct confidence intervals (CIs) around the estimated parameters/co-efficients

Use the following Code - confint(MM_model)

It can be used for…

The CI’s can be used to test whether the coefficient estimate is significantly different from a reference value
They are also a quick way to test whether the same model fitted to another population sample gives statistically different co-efficient estimates

In either case…

If the ranges overlap -> They are not statistically different

If the ranges do NOT overlap -> They are statistically different

Image –> Shows us that we are 95% certain that our co-efficient is located between these numbers
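
As a minimal sketch of the CI check described above, the code below fits the model to simulated data and tests whether a reference value falls inside the interval for V_max. The data, the starting values and the reference value of 10 are all assumptions chosen for illustration.

```r
# Simulated data and fit (assumptions for illustration)
set.seed(1)
S_data <- c(1, 2, 5, 10, 20, 50, 100)
V_data <- 12 * S_data / (7 + S_data) + rnorm(length(S_data), sd = 0.2)
MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data),
                start = list(V_max = 10, K_M = 5))

# 95% confidence intervals: one row per co-efficient, lower and upper limit
ci <- confint(MM_model)
ci

# Is a (hypothetical) reference value for V_max inside the interval?
reference <- 10
inside <- reference >= ci["V_max", 1] && reference <= ci["V_max", 2]
inside   # FALSE would suggest V_max differs from the reference value
```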

Are R² values obtained from a Non-Linear model reliable?

R² values obtained from a Non-Linear model ARE NOT reliable, and thus should not be used

They don’t always accurately reflect the quality of the fit and can definitely not be used to select between competing models

How can we tell R to start with specific coefficients for a non-linear model?

MM_model2 <- nls(V_data ~ V_max * S_data / (K_M + S_data), start = list(V_max = 12, K_M = 7))

Example –> Include start = list (… , …)

Note –> When selecting starting values, make sure they are sensible and make biological sense

Does using different starting values impact the final co-efficient?

YES!

Example below for Michaelis Menten Non-Linear model

Co-efficients both set to one
Co-efficients - V_max = 12 and K_M = 7
Co-efficients - Vmax=0.01 and Km=10

A look at the different outputs!

What happens when you use starting values that are too far from their actual values?

If you provide values that are VERY far from the optimal you will receive an error message –> e.g. Singular gradient matrix at initial parameter estimates error

Takeaway message –> NLLS model fitting is NOT an exact procedure.

But given that you provide starting values that are reasonable, NLLS is exact enough
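
One way to explore this safely is to wrap nls() in tryCatch(), so that a failed fit returns its error message instead of stopping the script. This is a sketch on simulated data; the helper try_fit and the extreme starting values are hypothetical, and whether the second call fails (and with which message) depends on the data.

```r
# Simulated data (an assumption for illustration)
set.seed(1)
S_data <- c(1, 2, 5, 10, 20, 50, 100)
V_data <- 12 * S_data / (7 + S_data) + rnorm(length(S_data), sd = 0.2)

# Hypothetical helper: return the co-efficients, or the error message on failure
try_fit <- function(start) {
  tryCatch(
    coef(nls(V_data ~ V_max * S_data / (K_M + S_data), start = start)),
    error = function(e) paste("nls failed:", conditionMessage(e))
  )
}

good <- try_fit(list(V_max = 12, K_M = 7))      # sensible starting values
far  <- try_fit(list(V_max = 1e6, K_M = 1e-6))  # deliberately extreme values
good
far
```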

What is a more robust algorithm that can be used if the standard Gauss-Newton doesn't work?

Levenberg-Marquardt algorithm –> uses a function called nlsLM()

Note - install and load the package using the following code:

install.packages("minpack.lm")
require("minpack.lm")

If you were to rerun any nls models that initially produced an error message, nlsLM() is more likely to produce an actual output

Can you bound the co-efficient/parameter values in nlsLM?

Yes! You can bound the parameter values –> preventing them from exceeding a Max or falling below a Min.

Result of this? The computer is more likely to produce the output in fewer iterations

Quick aside –> The nls() function also has an option to provide lower and upper parameter bounds, but these only take effect when using algorithm = "port".
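
A minimal sketch of bounded fitting is shown below, first with base R's nls() and algorithm = "port", then with nlsLM() if the minpack.lm package happens to be installed. The simulated data and the particular bounds are assumptions for illustration; bounds are given in the same order as the starting values.

```r
# Simulated data (an assumption for illustration)
set.seed(1)
S_data <- c(1, 2, 5, 10, 20, 50, 100)
V_data <- 12 * S_data / (7 + S_data) + rnorm(length(S_data), sd = 0.2)

# Bounded fit with base R: bounds are ordered as (V_max, K_M)
fit_port <- nls(V_data ~ V_max * S_data / (K_M + S_data),
                start = list(V_max = 10, K_M = 5),
                algorithm = "port",
                lower = c(0, 0), upper = c(50, 100))
coef(fit_port)

# The same bounds with Levenberg-Marquardt, only if minpack.lm is available
if (requireNamespace("minpack.lm", quietly = TRUE)) {
  fit_LM <- minpack.lm::nlsLM(V_data ~ V_max * S_data / (K_M + S_data),
                              start = list(V_max = 10, K_M = 5),
                              lower = c(0, 0), upper = c(50, 100))
  print(coef(fit_LM))
}
```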

What happens if you set up the bounds of the co-efficients/parameters too tightly?

If you bound the parameters too much –> the algorithm will have insufficient parameter space –> solution won’t be as reliable.

What is the main diagnostic plot used to test the appropriateness of an NLLS fit?

Plotting the residuals of a fitted NLLS model –> to check for a Normal distribution

At the very least you should plot the residuals of the NLLS model in a histogram

Example: hist(residuals(MM_model6))

You can run further diagnostics with the nlstools package

1. install.packages("nlstools")
2. require("nlstools")
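
The residual checks can be sketched with base R alone. The example below uses simulated data with 30 observations (an assumption, so that the histogram has something to show) and adds a normal Q-Q plot and a Shapiro-Wilk test as optional extras that are not part of the original card.

```r
# Simulated data and fit (assumptions for illustration)
set.seed(1)
S_data <- seq(1, 100, length.out = 30)
V_data <- 12 * S_data / (7 + S_data) + rnorm(length(S_data), sd = 0.2)
MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data),
                start = list(V_max = 10, K_M = 5))

res <- as.numeric(residuals(MM_model))

hist(res, main = "Residuals of NLLS fit", xlab = "Residual")  # shape check
qqnorm(res)                                                   # normal Q-Q plot
qqline(res)

# Optional formal check (an addition, not from the card)
sw <- shapiro.test(res)
sw$p.value
```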

What does Allometric Scaling of traits refer to?

Allometric relationships take the form of…

y = ax^b

- Where ‘x’ and ‘y’ are morphological measures
- The constant ‘a’ is the value of y when x=1
- ‘b’ is the scaling exponent

Note that this is not a linear model –> hence, it would be a good candidate for nls.

Example of an allometric relationship: Body length vs. body weight –> the body weight does not increase proportionally for a given amount of body length
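
A minimal sketch of fitting y = ax^b with nls() is given below. The data are simulated (a = 0.01 and b = 2.5 are assumptions for illustration), and the starting values are taken from a log-log linear regression, which is one common way to get sensible starting values for a power law.

```r
# Simulated allometric data (an assumption for illustration)
set.seed(3)
x <- seq(20, 100, by = 4)                         # e.g. body length
y <- 0.01 * x^2.5 + rnorm(length(x), sd = 2)      # e.g. body weight

# Starting values from the log-log line: log(y) = log(a) + b * log(x)
loglin <- lm(log(y) ~ log(x))
a_start <- exp(coef(loglin)[[1]])
b_start <- coef(loglin)[[2]]

# Non-linear least squares fit of the power law
PowFit <- nls(y ~ a * x^b, start = list(a = a_start, b = b_start))
coef(PowFit)
```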

How can we compare NLLS models?

It is important to compare an NLLS model with one or more alternatives for a more extensive and reliable investigation.

Remember R² cannot be used for Non-Linear models

So how do we decide which model is better?

Akaike Information Criterion (AIC) using the AIC() function –> estimates the information lost as a result of fitting the model, so the model with the lower AIC is preferred

Example comparing a Non-Linear model to a Linear model (Quadratic)

AIC(PowFit) - AIC(QuaFit) = -2.1474260812509

How can you tell which one is better?

Rule of thumb –> if the AIC values differ by more than 2 (|ΔAIC| > 2), we can decide a winner in terms of the better model. Here the difference is about -2.15, so PowFit (the lower AIC) is the better model
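
The comparison can be sketched as follows. The data are simulated from a power law (an assumption for illustration), so the resulting AIC difference will not match the value quoted on the card; the object names PowFit and QuaFit follow the card.

```r
# Simulated allometric data (an assumption for illustration)
set.seed(3)
x <- seq(20, 100, by = 4)
y <- 0.01 * x^2.5 + rnorm(length(x), sd = 2)

# Non-linear power-law model, with starting values from a log-log regression
loglin <- lm(log(y) ~ log(x))
PowFit <- nls(y ~ a * x^b,
              start = list(a = exp(coef(loglin)[[1]]), b = coef(loglin)[[2]]))

# Competing linear model: a quadratic in x
QuaFit <- lm(y ~ poly(x, 2))

# Negative difference: PowFit has the lower AIC; positive: QuaFit does
delta_AIC <- AIC(PowFit) - AIC(QuaFit)
delta_AIC
abs(delta_AIC) > 2   # rule of thumb for declaring a winner
```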

How can we gauge the goodness of fit of an NLLS model?

You can NOT use ANOVA or R-squared values

The best way to assess the quality of an NLLS model fit is to compare it to another, alternative model’s fit.

Other than that, another way to assess the quality of fit is to examine whether the fitted co-efficients are reliable

For example:

1. Low standard errors
2. High t-values
3. Low p-values
