Statistics & Research Design Flashcards

(66 cards)

1
Q

Univariate Analysis

A

Analysis of a single response variable.

(Ieno & Zuur, 2015)

2
Q

Multivariate Analysis

A

Analysis of multiple response variables.

(Ieno & Zuur, 2015)

3
Q

Outline the scientific method.

A
  1. Make an observation.
  2. Ask a question.
  3. Form a hypothesis, or testable explanation.
  4. Make a prediction based on the hypothesis.
  5. Test the prediction.
  6. Iterate: use the results to make new hypotheses or predictions.

https://www.khanacademy.org/science/biology/intro-to-biology/science-of-biology/a/the-science-of-biology

4
Q

Hypothesis

A

An explanation of something that was observed.

A clear statement that articulates a plausible explanation and that can be tested with evidence that would either refute or support it.

Needs to be testable.

(BIOL 1105 Notes)

5
Q

Prediction

A

More specific than a hypothesis - it is the outcome that you expect to observe if your hypothesis is true.

(BIOL 1105 Notes)

6
Q

Why is it common practice to come up with multiple alternative hypotheses?

A

It reduces the chance that the researcher becomes attached to a single hypothesis, and it reduces confirmation bias.

It forces researchers to think of possible causes for patterns in nature beforehand rather than after the fact, which makes findings more reliable.

(Betts et al, 2021)

7
Q

Why is a hypothesis important?

A
  1. It reduces bias.
  2. It makes findings more reliable.
  3. It increases reproducibility.

(Betts et al, 2021)

8
Q

Why do we do statistics?

A

Statistics allows us to
- make educated decisions,
- infer information about a population from a sample rather than having to study the whole population, and
- make predictions.

9
Q

When are hypotheses not useful?

A
  1. When the goal is prediction rather than understanding.
  2. When the goal is description rather than understanding.
  3. When the objective is a practical planning outcome such as reserve design.

(Betts et al, 2021)

10
Q

What is inductive research?

A

Observing first, then coming up with explanations later.

(Betts et al, 2021)

11
Q

What are the limitations of the Shannon-Wiener Biodiversity Index?

A

It won’t show ecological (compositional) differences between habitats.

i.e., my two wetlands may have the same biodiversity value even though they are made up of different species (see the sketch below).
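
A minimal sketch of this limitation, assuming the natural-log form of the index, H' = -sum(p_i * ln(p_i)); the wetlands and species counts are hypothetical:

```python
import math

def shannon_wiener(counts):
    # H' = -sum(p_i * ln(p_i)), where p_i is each species' proportion of the total.
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)

# Two hypothetical wetlands with completely different species but the same
# abundance structure: the index cannot tell them apart.
wetland_a = {"mallard": 10, "wood duck": 10, "green frog": 5}
wetland_b = {"red-winged blackbird": 10, "muskrat": 10, "painted turtle": 5}

print(shannon_wiener(wetland_a))  # ~1.05
print(shannon_wiener(wetland_b))  # identical value, different community
```
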

12
Q

Pseudoreplication

A

Occurs when subjects are not independent of each other but you treat them as if they were (e.g., sampling the same individual more than once).

(BIOL 1105 Notes)

13
Q

How does pseudoreplication apply to telemetry?

A

The lack of independence between successive observations in telemetry data or in the derived behaviour or fates of tagged fish can give rise to pseudoreplication if treated as independent observations in analyses. Failing to account for pseudoreplication can lead to incorrect conclusions in hypothesis testing frameworks as well as misinformed interpretations of the data.

(Brownscombe et al, 2019)

14
Q

Population (in statistics)

A

The group of ALL things we are interested in (e.g., all house cats).

(BIOL 1105 Notes)

15
Q

Sample

A

Subset of the population that we measure.

(BIOL 1105 Notes)

16
Q

What are the properties of a good random sample?

A

Every unit in the population has to have an equal chance of being included in the sample.

Units in the sample should also be independent of one another - an observation of one individual should not provide any useful information about another individual in the sample.

(BIOL 1105 Notes)

17
Q

What are the two major types of data?

A

Quantitative
and
Qualitative

(BIOL 1105 Notes)

18
Q

What is quantitative data? How is it broken down?

A

Numerical data.

It can be discrete or continuous.

(BIOL 1105 Notes)

19
Q

Discrete Data

A

Numerical data that includes integer values only (e.g., # of matings, # of species).

(BIOL 1105 Notes)

20
Q

Continuous Data

A

Numerical data that can take any real value, including decimals (e.g., length, mass).

(BIOL 1105 Notes)

21
Q

What is qualitative data? How is it broken down?

A

Categorical data, i.e., data that is subdivided into categories.

It can be nominal or ordinal.

(BIOL 1105 Notes)

22
Q

Nominal Data

A

Categorical data that has no inherent order (e.g., sex, hair color).

(BIOL 1105 Notes)

23
Q

Ordinal Data

A

Categorical data that has a natural order (e.g., rank, life history stage).

(BIOL 1105 Notes)

24
Q

Replication

A

Repeating a measurement.

The number of “subjects”, “objects”, or “individuals” sampled; how many times the procedure was repeated.

Each of the repetitions is called a replicate.

(BIOL 1105 Notes)

25
Biological Definition of Replicates
An exact copy of a sample that is being analyzed, such as a cell, organism or molecule, on which exactly the same procedure is done. This is often done in order to check for experimental or procedural error. In the absence of error, replicates should yield the same result. However, replicates are not independent tests of the hypothesis because they are still the same sample, and so do not test for variation between samples. (Wikipedia)
26
Statistics
A method for describing and measuring aspects of nature from samples. We need it whenever the features we are trying to study are noisy/variable/unpredictable. Statistical methods allow us to quantify uncertainty (error) in our estimates. (BIOL 1105 Notes)
27
Descriptive Statistics
Numbers that capture important features of a sample (prior to testing); they summarize details of the sample (e.g., sample size, average tail length). (BIOL 1105 Notes)
28
Inferential Statistics
Numbers that capture important features of the population after conducting hypothesis testing. Used to determine how well our observed data fits with a particular hypothesis/null hypothesis. (BIOL 1105 Notes)
29
Response Variable
Aka dependent or outcome variable. The outcome we are interested in - effect. (BIOL 1105 Notes)
30
Predictor Variable
Aka independent or explanatory variable. The thing(s) that we hypothesize is affecting the outcome - cause. (BIOL 1105 Notes)
31
Confounding Variable
An 'extra' variable that you did not account for and that influences the variable you are investigating. (BIOL 1105 Notes)
32
Hypothesis Testing
Compares a dataset to the expectation derived from a specific null hypothesis. If the data are too unusual under the assumption that the null hypothesis is true, then we reject the null hypothesis. (BIOL 1105 Notes)
33
Null Hypothesis
A statement about a population parameter that negates our research hypothesis, i.e., that there is no effect or relationship. (BIOL 1105 Notes)
34
P-Value
The probability of getting a result at least as extreme as the result we actually got, assuming the null hypothesis is true. If p < 0.05, there is less than a 5% chance that we would have obtained a result at least this extreme if the null hypothesis were true, so we reject the null hypothesis. (BIOL 1105 Notes)
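
An illustrative sketch (not from the course notes): computing a p-value for a two-sample t-test on simulated data, assuming numpy and scipy are available; the groups and values are hypothetical:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical tail lengths (cm) for two groups drawn from the SAME distribution,
# so the null hypothesis of "no difference in means" is actually true here.
group_a = rng.normal(loc=10.0, scale=2.0, size=30)
group_b = rng.normal(loc=10.0, scale=2.0, size=30)

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# If p < 0.05 we would reject the null at the 5% level; a larger p-value just
# means the observed difference is not unusual under the null hypothesis.
```
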
35
Confidence Intervals
Provide a measure of uncertainty on an estimate by indicating the plausible range in which we can expect the true value of the parameter to lie. e.g., if we repeated the sampling procedure many times, 95% of the resulting 95% CIs would capture the true value. (BIOL 1105 Notes)
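
An illustrative sketch (not from the course notes) of the repeated-sampling interpretation, assuming numpy is available; the population values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean, sigma, n, n_repeats = 50.0, 8.0, 40, 1000
captured = 0

for _ in range(n_repeats):
    sample = rng.normal(loc=true_mean, scale=sigma, size=n)
    se = sample.std(ddof=1) / np.sqrt(n)
    lo, hi = sample.mean() - 1.96 * se, sample.mean() + 1.96 * se
    captured += (lo <= true_mean <= hi)

print(f"{captured / n_repeats:.1%} of the 95% CIs captured the true mean")
# Expect a value close to 95%.
```
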
36
Type I Errors
Aka false positives. Occur when we reject the null when it's actually true. For example, in telemetry this would be detecting the presence of an animal when it was actually absent. (Adams et al, 2012; BIOL 1105 Notes)
37
Type II Errors
Aka false negatives. Occur when we do not reject the null when it's actually false. For example, in telemetry this would be not detecting the presence of an animal when it was actually there. (Adams et al, 2012; BIOL 1105 Notes)
38
Power
The probability that a study will correctly reject a false null hypothesis. (BIOL 1105 Notes)
39
Are Type I or Type II errors worse for conservation? What about for my research?
Statistically we normally want fewer Type II errors, because a study with a low Type II error rate has high power; if power is too low, there is little chance of finding a significant difference even when a real difference exists. But the question really needs to be looked at in a practical sense. In conservation, a Type I error (false positive) might mean concluding that an action is needed to protect a species when it would not actually help, so money is spent for nothing; a Type II error (false negative) might mean failing to take an action that would have helped the species. Spending money for nothing is the lesser harm, so for conservation - and for my research - I still want to limit Type II errors. (BIOL 1105 Notes; Brown et al, 2012)
40
What does statistical power depend on?
Alpha level. Sample size. The magnitude of the effect/difference we are studying. The variability (spread) in the data. The test we are using. (BIOL 1105 Notes)
41
What's the best way to increase statistical power?
Use a larger sample size. In an observational study, where we can't control the other factors, this is also the only way - which is what applies to my research. (BIOL 1105 Notes)
42
What does a high statistical power indicate?
A really high power means that we'd virtually always correctly reject a false null hypothesis, i.e., it means we can more easily detect what we're looking for. (BIOL 1105 Notes)
43
Power Analysis
In a power analysis, the objective is to estimate the sample size needed to detect an effect (i.e. departure from the null hypothesis) with a reasonable level of power while allowing for a margin of error. (Brown et al, 2012)
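
An illustrative sketch (not from Brown et al, 2012) of such a calculation, assuming statsmodels is installed; the effect size and targets below are hypothetical:

```python
from statsmodels.stats.power import TTestIndPower  # assumes statsmodels is installed

analysis = TTestIndPower()

# Sample size per group needed to detect a medium effect (Cohen's d = 0.5)
# in a two-sample t-test with alpha = 0.05 and 80% power.
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"n per group needed: {n_per_group:.0f}")

# Power actually achieved if only 30 individuals per group are sampled
# (noticeably below the conventional 0.8 target for this effect size).
print(f"power at n = 30: {analysis.power(effect_size=0.5, nobs1=30, alpha=0.05):.2f}")
```
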
44
Accuracy
How close a measurement is to the true value. (Zar, 2010)
45
Precision
How close repeated measurements are to each other. (Zar, 2010)
46
Effect Size
A value measuring the strength of the relationship between two variables in a population, or a sample-based estimate of that quantity. Examples of effect sizes include the correlation between two variables, the regression coefficient in a regression, the mean difference, or the risk of a particular event (such as a heart attack) happening. (Wikipedia)
47
Margin of Error
Expressed as +/- percentage points, the margin of error tells you to what degree your research results may differ from the real-world value, i.e., how much higher or lower than the stated percentage the true value may plausibly be. A smaller margin of error is better because it suggests the results are more precise. https://www.qualtrics.com/experience-management/research/margin-of-error/
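
A small arithmetic sketch using the usual normal-approximation formula for a proportion; the survey numbers are hypothetical:

```python
import math

# Margin of error for a sample proportion at 95% confidence (z ~ 1.96).
p_hat = 0.55   # observed proportion, e.g. 55% of respondents agreed
n = 400        # number of respondents

margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"+/- {margin * 100:.1f} percentage points")  # about +/- 4.9 points
```
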
48
Collinearity and Multicollinearity - How do you address them?
Collinearity is when two predictor variables are correlated; when this happens, these variables cannot independently predict the response variable. Multicollinearity is when more than two predictors are correlated. To address it, check for correlated predictors during analysis (e.g., with pairwise correlations or variance inflation factors; see the sketch below) and, if necessary, keep only one of the correlated variables when performing hypothesis tests. https://www.britannica.com/topic/collinearity-statistics
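
A minimal sketch of checking for collinearity, assuming pandas and statsmodels are available; the predictor names and data are hypothetical:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)

# Hypothetical predictors: water and air temperature are strongly correlated,
# so they carry largely the same information about the response.
water_temp = rng.normal(15, 3, 100)
air_temp = water_temp + rng.normal(0, 1, 100)
depth = rng.normal(5, 2, 100)

X = pd.DataFrame({"water_temp": water_temp, "air_temp": air_temp, "depth": depth})
print(X.corr().round(2))  # pairwise correlations; |r| close to 1 flags collinearity

# Variance inflation factors (computed with an intercept column included);
# values well above ~5-10 suggest keeping only one of the correlated predictors.
X_const = X.assign(const=1.0)
vif = {col: variance_inflation_factor(X_const.values, i)
       for i, col in enumerate(X_const.columns) if col != "const"}
print(vif)
```
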
49
Interactions (in Statistics)
The effect of one causal variable on an outcome depends on the state of a second causal variable (that is, when effects of the two causes are not additive). (Wikipedia)
50
Random Effects
Factors that vary randomly across individuals or groups and affect the response variable (e.g., receiver location, individual differences, sampling year). The most familiar types of random effect are the blocks in experiments or observational studies that are replicated across sites or times. Random effects also encompass variation among individuals (when multiple responses are measured per individual, such as survival of multiple offspring or sex ratios of multiple broods), genotypes, species and regions or time periods. (Bolker et al, 2009; Whoriskey et al, 2019)
51
Why did I decide to use generalized linear mixed models (GLMMs)?
They take random effects into account, which helps prevent pseudoreplication. For example, telemetry data are usually collected on a random subset of individuals from a population; to conduct population‐level inference, individual ID, space, time, and receiver location can be included in the model. (Whoriskey et al, 2019)
52
Generalized Linear Mixed Models (GLMMs)
Models that combine the properties of two statistical frameworks that are widely used in EE, linear mixed models (which incorporate random effects) and generalized linear models (which handle nonnormal data by using link functions and exponential family [e.g. normal, Poisson or binomial] distributions). GLMMs are the best tool for analyzing nonnormal data that involve random effects. (Bolker et al, 2009)
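
GLMMs themselves are most often fitted in R (e.g., lme4 or glmmTMB). As a rough Python illustration of the random-effect idea only, the sketch below fits a linear mixed model (the Gaussian special case) with statsmodels, using hypothetical telemetry-style data:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf  # assumes statsmodels is installed

rng = np.random.default_rng(7)

# Hypothetical data: repeated depth observations per tagged fish, so
# observations within a fish are not independent.
n_fish, n_obs = 20, 15
fish_id = np.repeat(np.arange(n_fish), n_obs)
fish_effect = rng.normal(0, 1.0, n_fish)[fish_id]          # random intercept per fish
temp = rng.normal(15, 3, n_fish * n_obs)
depth_use = 2.0 + 0.3 * temp + fish_effect + rng.normal(0, 1.0, n_fish * n_obs)

df = pd.DataFrame({"fish_id": fish_id, "temp": temp, "depth_use": depth_use})

# A random intercept for each fish accounts for the repeated measures and
# guards against pseudoreplication.
result = smf.mixedlm("depth_use ~ temp", df, groups=df["fish_id"]).fit()
print(result.summary())
```
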
53
Cronbach’s alpha
Cronbach’s alpha is a number between 0 and 1 that measures the internal consistency reliability of Likert scales: 0 indicates low internal consistency reliability and 1 indicates high internal consistency reliability. Internal consistency reliability is how well a group of questions measures the same construct. In general, a good Cronbach’s alpha is between 0.75 and 0.90. Cronbach’s alpha is affected by the number of questions, with more questions producing a higher value; therefore, if the alpha is low, it may just be a matter of needing to add more questions rather than poor reliability. If Cronbach’s alpha is above 0.90, the survey likely has redundant questions that can be removed. (Tavakol, 2011)
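
A minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the response matrix is hypothetical:

```python
import numpy as np

def cronbach_alpha(items):
    # items: rows = respondents, columns = Likert items coded as numbers.
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses to four 5-point Likert items intended to measure
# the same construct.
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(responses), 2))
```
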
54
Ordinal Logistic Regression
Ordinal logistic regression is a statistical analysis method that can be used to model the relationship between an ordinal response variable and one or more explanatory variables (which can be discrete, continuous, or ordinal). This will be used for the social study because Likert responses are considered ordinal (they are categorical and have a natural order) and because I'm looking at the impacts of various responses on pro-environmental behaviour. https://cscu.cornell.edu/wp-content/uploads/91_ordlogistic.pdf
55
What are the assumptions of ordinal logistic regression?
  1. The dependent variable is measured on an ordinal level.
  2. One or more of the independent variables are continuous, categorical or ordinal.
  3. No multicollinearity - i.e., no two or more independent variables are highly correlated with each other.
  4. Proportional odds - i.e., each independent variable has an identical effect at each cumulative split of the ordinal dependent variable.
https://www.st-andrews.ac.uk/media/ceed/students/mathssupport/ordinal%20logistic%20regression.pdf
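
A minimal sketch of fitting such a model, assuming statsmodels (>= 0.13) provides OrderedModel; the survey variables and simulated data are hypothetical:

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel  # statsmodels >= 0.13

rng = np.random.default_rng(3)
n = 200

# Hypothetical survey data: an environmental-concern score predicting an
# ordered 1-5 Likert response about pro-environmental behaviour.
concern = rng.normal(0, 1, n)
latent = 1.2 * concern + rng.logistic(0, 1, n)
behaviour = pd.cut(latent, bins=[-np.inf, -2, -0.5, 0.5, 2, np.inf], labels=False) + 1

df = pd.DataFrame({"behaviour": behaviour, "concern": concern})

model = OrderedModel(df["behaviour"], df[["concern"]], distr="logit")
result = model.fit(method="bfgs")
print(result.summary())  # coefficient for 'concern' plus the threshold cut points
```
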
56
Thematic Analysis
One of the most common forms of analysis within qualitative research. It emphasizes identifying, analysing and interpreting patterns of meaning (or "themes") within qualitative data. (Wikipedia)
57
Home Range Analysis and Kernel Density
Home range analysis looks at the area an animal uses for the majority of its activities. Kernel density estimation is one method to evaluate home range: it estimates the probability of finding the animal at any given location. (Calenge, 2023)
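
This is not the workflow from the cited source; as a generic sketch of the kernel-density idea, assuming scipy is available and using simulated relocations:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(5)

# Hypothetical relocation points (x, y in metres) for one tagged animal.
x = rng.normal(0, 50, 300)
y = rng.normal(0, 30, 300)

kde = gaussian_kde(np.vstack([x, y]))   # 2D kernel density over the relocations

# Relative probability density of finding the animal at two candidate locations.
print(kde([[0], [0]]))      # near the centre of activity -> higher density
print(kde([[150], [100]]))  # far from most relocations -> much lower density
```
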
58
What are the benefits and disadvantages of using Likert scales?
Benefits:
- Easy to implement
- More standardized
- Easier to quantify
- Makes questions easier to answer for the respondent

Disadvantages:
- If their real choice isn't listed, respondents are forced to choose another
- Subject to bias
59
How do you code survey responses?
Each response on a Likert scale is assigned a particular number in a defined way (e.g., 5 = strongly agree = more positive, 0 = strongly disagree = more negative), and these numbers are then used to calculate overall scores. Open-ended questions are assigned theme codes and then summarized.
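
A minimal sketch of this coding step with pandas; the question names and the particular 1-5 coding are hypothetical choices:

```python
import pandas as pd

# One possible coding scheme; the exact numbers are arbitrary as long as they
# are defined consistently and higher values mean a more positive response.
likert_codes = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

responses = pd.DataFrame({
    "q1": ["Agree", "Strongly agree", "Neutral"],
    "q2": ["Disagree", "Agree", "Strongly agree"],
})

coded = responses.replace(likert_codes)   # text -> numeric codes
coded["total_score"] = coded.sum(axis=1)  # overall score per respondent
print(coded)
```
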
60
Why should the first step of data analysis be to identify outliers and decide what to do with them?
Because outliers can heavily influence the results. (Ieno & Zuur, 2015)
61
What is the difference between absolute and relative abundance?
"Absolute abundance refers to the total number of organisms in a system (i.e., density or population estimates)." "Relative abundance provides and index (e.g., CPUE) of absolute abundance." (Quist et al, 2009 NA Sampling methods book)
62
Relative Species Composition
"The proportional (percentage) numerical or gravimetric abundance of a species within a collection of species." (Quist et al, 2009 NA Sampling methods book)
63
CPUE - How does it relate to density?
Catch per unit effort: "the number of fish sampled per unit of effort". CPUE is "assumed to be directly proportional to density". (Quist et al, 2009 NA Sampling methods book)
64
What are the assumptions with CPUE?
"assumes that changes in CPUE reflect a proportional change in abundance, which is often not the case" Different sampling gears cannot estimate the same CPUE because the catchability of fish wth each is different. (Quist et al, 2009 NA Sampling methods book)
65
How is CPUE calculated? Which should biologists use? Why?
Either:
  1. Divide the total number of fish by the total amount of effort, or
  2. "Calculate as outlined in 1 for each sampling unit (e.g., net set, electrofishing transect), then average."
Biologists should use the second method, especially if effort wasn't the same across samples, because it provides a more accurate mean with variances, which are needed for analysis (see the sketch below). (Quist et al, 2009 NA Sampling methods book)
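
A small arithmetic sketch of the two methods with hypothetical catches and effort, showing that they differ when effort varies across sampling units:

```python
# Hypothetical catches and effort (e.g., net-nights) for three sampling units.
catches = [10, 60, 3]
effort = [1.0, 10.0, 1.0]

# Method 1: pooled ratio - total fish divided by total effort.
cpue_pooled = sum(catches) / sum(effort)

# Method 2: CPUE for each sampling unit, then averaged (recommended, because
# it yields a mean with an associated variance for later analyses).
per_unit = [c / e for c, e in zip(catches, effort)]
cpue_mean = sum(per_unit) / len(per_unit)

print(f"pooled CPUE: {cpue_pooled:.2f} fish per unit effort")          # ~6.08
print(f"mean of per-unit CPUE: {cpue_mean:.2f} fish per unit effort")  # ~6.33
```
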
66