Module 1 Flashcards

(60 cards)

1
Q

A psychological theory says that individual differences in one variable (BLANK) can be predicted from or causally explained by another variable (BLANK)

A
  • Dependent variable
  • Independent variable
2
Q

Other names for independent variable

A

Predictor, covariate, explanatory variable, exogenous variable, X

(these terms are not perfectly synonymous, depending on context, but they are essentially
interchangeable with respect to how they are included in statistical models)

Can be experimentally manipulated or naturally occurring (e.g., the country people come from)

3
Q

Other names for dependent variable:

A

Outcome, criterion, response variable, endogenous variable, Y

4
Q

All statistical models are fundamentally ___________

A

descriptive,

in that they describe the nature of a
dependent variable as a function of one or more independent variables or covariates.

5
Q

Three things models are commonly used for

A

Description

Causal Explanation

Prediction

6
Q

models are also commonly used for

A

causal explanation:
The model represents the process(es) by which differences in independent variables influence differences in a dependent variable.

7
Q

prediction

A

meaning that observed data is used to develop
a model for how independent variables are related to dependent variables, and then that model is used to predict dependent variable scores in future data.
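
A minimal sketch of this fit-then-predict workflow, assuming a single independent variable and a straight-line model fitted by ordinary least squares. The data values and the helper names `fit_line` and `predict` are hypothetical and chosen only for illustration.

```python
from statistics import mean

def fit_line(x, y):
    """Estimate intercept and slope of y = b0 + b1 * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def predict(model, new_x):
    """Apply a fitted (intercept, slope) model to new independent-variable scores."""
    intercept, slope = model
    return [intercept + slope * xi for xi in new_x]

# Step 1: develop the model from observed data (made-up numbers).
observed_x = [1, 2, 3, 4, 5]
observed_y = [3, 5, 7, 9, 11]

model = fit_line(observed_x, observed_y)

# Step 2: use the model to predict dependent-variable scores in future data.
future_x = [6, 7]
predicted_y = predict(model, future_x)
print(predicted_y)
```

Note that nothing in this sketch addresses why x and y are related; it only carries the observed relationship forward to new cases.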

8
Q

the primary purpose of machine learning is _____

A

prediction

For example, a social media company
might use data about a person to predict whether that person is likely to click on an ad for a
product.
This prediction is based on a statistical model developed using data from people who have
already clicked on the ad.

9
Q

How psychologists misapply the word predict

A

Yet, psychologists often use language about “prediction” when presenting statistical models that are mainly meant to describe or explain the association between an independent and a dependent
variable.

For example, a researcher might report that a personality trait “predicts” whether adults suffer
from sleep disturbances.
But this “prediction” is likely meant to explain why certain people are pre-disposed to experience sleep disturbances,
and the statistical model is not necessarily going to be applied to future data to determine the
chance that a given person has a sleep disturbance.

But true statistical prediction is not concerned with “why”.

10
Q

The population of interest

A

population:
Definition 1: The set of all entities (e.g., people, animals, cities) to which a theory is
intended to apply.

Definition 2: The set of all entities to which a research study generalizes.

Definition 3: The natural (psychological) process that created the observed data.

11
Q
The sampling scheme: definition of a sample

A

finite subset of entities (or observations) drawn from a particular population.

12
Q

GRE predictor confusion

A

The GRE is supposed to predict whether a student will successfully complete grad school: the GRE score is the predictor and success is the dependent variable.
This is prediction, not a claim about a causal mechanism.
A good GRE score isn’t going to cause you to have a good PhD.

13
Q
Define operational variables: How are we actually going to observe or measure the conceptual variables?
A

Operational variable = conceptual variable + measurement error

Often, independent variables are assumed to be measured without error.
This assumption holds in experimental studies, where participants are assigned to a particular
treatment or control group. Group membership, the independent variable, is known for all
participants (regardless of whether random assignment was used).
But in a lot of psychological research, both independent and dependent variables are
characterized by measurement error. If ignored, measurement error introduces statistical bias in
model estimates.
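
A small simulation sketch of the last point, assuming a simple linear model and random measurement error added to the independent variable only. All numbers (sample size, true slope of 2.0, error variance) and the helper `ols_slope` are hypothetical choices for illustration, not values from the course.

```python
import random
from statistics import mean

def ols_slope(x, y):
    """Ordinary least squares slope for a single independent variable."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

random.seed(1)          # fixed seed so the sketch is reproducible
n = 5000
true_slope = 2.0        # assumed effect of the conceptual variable

# Conceptual variable (normally unobservable) and the dependent variable.
conceptual_x = [random.gauss(0, 1) for _ in range(n)]
y = [true_slope * xi + random.gauss(0, 1) for xi in conceptual_x]

# Operational variable = conceptual variable + measurement error.
operational_x = [xi + random.gauss(0, 1) for xi in conceptual_x]

slope_without_error = ols_slope(conceptual_x, y)
slope_with_error = ols_slope(operational_x, y)

print("slope using the conceptual variable: ", round(slope_without_error, 2))
print("slope using the operational variable:", round(slope_with_error, 2))
```

In this particular setup the slope estimated from the error-laden variable is pulled toward zero, which is one form the bias can take.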

14
Q

What are the 3 major features of study design:

A

population

sample

define operational variables

15
Q

Continuous variables

A

have a scale with an infinite number of possible values.

16
Q

Discrete variables

A

are categorical; they have a scale with a finite number of possible values.

In psychology, many continuous variables are measured on a Likert scale, which is categorical.

17
Q

Nominal variables

A

have a scale whose values have arbitrary numerical meaning.

It only makes sense to say whether two observations are equal, but we cannot say that one nominal value is “greater than” or “less than” another.

For example, membership in a treatment or control group might be numerically coded so that 0 = control and 1 = treated, but the specific numerical values chosen are arbitrary.

18
Q

Ordinal variables

A

have a scale such that lower values are meaningfully defined to be less than
higher values, but we don’t necessarily know by how much a lower value is less than a higher
value.

Example: a Likert-type item response

19
Q

frequency distribution

A

is a representation (either tabular or graphic) of the observed values of a
variable along with the frequency, or number of observations, occurring with each value.

20
Q

Relative frequency

A

is the proportion (or percent = proportion × 100) of observations at a given value of a variable.
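
As a rough illustration of frequency and relative frequency, the sketch below tabulates a small set of made-up scores. The variable names and data are hypothetical.

```python
from collections import Counter

# Hypothetical responses on a 1-5 scale.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

# Frequency: number of observations occurring at each value.
frequency = Counter(scores)

# Relative frequency: proportion of observations at each value.
n = len(scores)
relative_frequency = {value: frequency[value] / n for value in sorted(frequency)}

# Tabular frequency distribution: value, frequency, proportion, percent.
for value in sorted(frequency):
    proportion = relative_frequency[value]
    print(value, frequency[value], proportion, f"{proportion * 100:.0f}%")
```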

21
Q

histogram

A

is a graph of the frequencies observed at each of several intervals (or bins) along the continuous scale of the variable

Histogram provides frequency within each bin
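
A minimal sketch of the binning behind a histogram, assuming equal-width bins. The function `histogram_counts`, the bin settings, and the scores are hypothetical; plotting libraries apply their own binning rules.

```python
def histogram_counts(values, lower, bin_width, n_bins):
    """Count observations in n_bins equal-width bins starting at `lower`.

    Each bin covers [left, right), except the last bin, which also
    includes its right edge. Values outside the overall range are skipped.
    """
    counts = [0] * n_bins
    upper = lower + bin_width * n_bins
    for v in values:
        if v < lower or v > upper:
            continue
        index = min(int((v - lower) // bin_width), n_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical test scores, binned as 50-59, 60-69, 70-79, 80-90.
test_scores = [52, 55, 61, 64, 67, 68, 70, 73, 75, 88]
counts = histogram_counts(test_scores, lower=50, bin_width=10, n_bins=4)

# Crude text histogram: one '#' per observation in each bin.
for i, count in enumerate(counts):
    left = 50 + 10 * i
    print(f"{left}-{left + 10}: {'#' * count}")
```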

22
Q

Distributions of continuous variables are characterized by their

A

centre, spread, and shape

23
Q

outlier

A

is an unusual observation that falls well outside of the range of most of the other observations in the distribution

24
Q

Outliers can occur because of…

A

sampling error (the outlying observation comes from a different
population than the other observations),

researcher error (e.g., a data entry mistake was made),

participant error (e.g., the participant did not follow the researcher’s instructions),

or just random chance.

25
Exclude outlier with which types of error
researcher or participant error
26
what is the spread
The extent of variability, or individual differences, in the variable. E.g., scores are clustered from BLANK to BLANK, but a notable number of people have lower scores.
26
Unimodal
one general peak
27
sensitivity analysis
Run the analysis with and without the outlier and report both sets of results
28
descriptive statistics
describe the centre, shape, and spread of a distribution using numerical information
29
parameter
numerical characteristic of a population
30
statistic
value calculated from the sample data that estimates a parameter
31
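Which measure of central tendency is higher than the other when the distribution is asymmetric
The mean gets pulled in the direction of the skewness, so it is typically higher than the median under positive skew and lower than the median under negative skew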
32
3 measures of spread or variability
1. variance 2. standard deviation 3. interquartile range
33
the mean is more affected by BLANK than BLANK
the mean is more affected by outliers than the median
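A quick illustration of this card, which also has the shape of a small sensitivity analysis (the same statistics computed with and without the outlier). The scores and the outlying value of 40 are made up.

```python
from statistics import mean, median

# Hypothetical scores, then the same scores plus one outlying observation.
scores = [4, 5, 5, 6, 7]
with_outlier = scores + [40]

# How far does each measure of centre move when the outlier is added?
mean_shift = mean(with_outlier) - mean(scores)
median_shift = median(with_outlier) - median(scores)

print("mean without / with outlier:  ", mean(scores), mean(with_outlier))
print("median without / with outlier:", median(scores), median(with_outlier))
print("shift in mean:", mean_shift, "shift in median:", median_shift)
```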
34
standard deviation
represents the average amount that a score differs from the mean of a distribution
35
Calculate sample SD
1. Deviations from the mean: subtract the mean from each observed score
2. Square the deviations
3. Take the mean of the squared deviations, dividing by N-1 (this is the sample variance)
4. Take the SQUARE ROOT OF THE VARIANCE
The sample SD is an estimate of the population SD
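The steps above can be sketched in Python as follows. The scores are made up, and the sum of squared deviations is divided by N-1, as discussed on card 37. The result is compared with `statistics.stdev`, which uses the same N-1 denominator.

```python
from math import isclose, sqrt
from statistics import mean, stdev

scores = [2, 4, 4, 5, 7, 8]  # hypothetical observed scores

# Step 1: deviations from the mean.
m = mean(scores)
deviations = [x - m for x in scores]

# Step 2: square the deviations.
squared_deviations = [d ** 2 for d in deviations]

# Step 3: sample variance (sum of squared deviations divided by N - 1).
sample_variance = sum(squared_deviations) / (len(scores) - 1)

# Step 4: the sample SD is the square root of the sample variance.
sample_sd = sqrt(sample_variance)

print(sample_variance, sample_sd)
print(isclose(sample_sd, stdev(scores)))  # agrees with the standard library
```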
36
Sample variance
Mean of the squared deviations from the mean (using N-1 in the denominator); an estimate of the population variance
37
Why do we divide by N-1 and not N when calculating the sample SD
Dividing by N leads to a biased estimate of the population variance (and so of the standard deviation); dividing by N-1 corrects this bias. When we calculate the sample mean we 'use up' one piece of information. N-1 is the degrees of freedom associated with a univariate standard deviation.
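One way to see the bias without simulation is to enumerate every possible sample from a tiny population. The sketch below assumes a hypothetical population {1, 2, 3} and all ordered samples of size 2 drawn with replacement, and uses exact fractions to average the variance estimates under each denominator.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]  # hypothetical, deliberately tiny population
N = len(population)
mu = Fraction(sum(population), N)
population_variance = sum((x - mu) ** 2 for x in population) / N

def variance_estimate(sample, denominator):
    """Sum of squared deviations from the sample mean, over a chosen denominator."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample) / denominator

n = 2
samples = list(product(population, repeat=n))  # all 9 ordered samples

# Average each estimator over every possible sample.
avg_dividing_by_n = sum(variance_estimate(s, n) for s in samples) / len(samples)
avg_dividing_by_n_minus_1 = sum(variance_estimate(s, n - 1) for s in samples) / len(samples)

print("population variance:     ", population_variance)
print("average when dividing by n:  ", avg_dividing_by_n)
print("average when dividing by n-1:", avg_dividing_by_n_minus_1)
```

Averaged over all samples, the N-1 version matches the population variance, while the N version is too small.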
38
Interquartile range
IQR is defined as Q3-Q1
39
Range of a distribution
The difference between the maximum and minimum values
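A short sketch of the IQR and the range for made-up scores. It assumes the quartile rule used by Python's `statistics.quantiles` (its default "exclusive" method); other software may use slightly different quartile conventions and give slightly different values for Q1 and Q3.

```python
from statistics import quantiles

scores = [2, 4, 4, 5, 7, 8, 9, 12, 15]  # hypothetical observed scores

# n=4 splits the data into quarters and returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = quantiles(scores, n=4)

iqr = q3 - q1                            # interquartile range
value_range = max(scores) - min(scores)  # range of the distribution

print("Q1, median, Q3:", q1, q2, q3)
print("IQR:", iqr)
print("range:", value_range)
```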
40
Boxplot: top of box, bottom of box, hard line, whiskers
Top of box = Q3; bottom of box = Q1; hard line = median (Q2); whiskers = max and min (of the non-outlying observations); outliers show up as dots
41
Boxplot is negatively skewed if
the distance from the median to Q1 is greater than the distance from the median to Q3
42
probability density functions
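Describe how probability is distributed over the values of a continuous variable; the probability of observing a value in a given interval is the area under the curve over that interval. Used to get the hypothetical probability distribution.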
43
Normal distribution
The normal distribution is a population distribution. Do NOT describe a sample as normal; it would make sense to say that the sample was DRAWN from a normal population distribution. The normal distribution is a function of the population mean and SD.
44
the normal distribution is a BLANK population distribution
HYPOTHETICAL population distribution. It doesn't make sense to refer to a sample as normal; describe it as consistent with a normal distribution.
45
Mean is known as the blank blank of a population distribution
first moment
46
what is the first moment of a population distribution
mean
47
the variance is known as the blank blank blank of a population distribution
second central moment
48
what is the second central moment of a population distribution
variance
49
The mean and variance are both a type of BLANK
average: the variance is the average of the squared deviations from the mean
50
why is the variance called a central moment
Because it is based on deviations from the mean (the centre of the distribution)
51
what is the third central moment of a population distribution
skewness
52
third central moment
skewness
53
skewness
extent to which the distribution is asymmetric
54
skewness formula
the numerator is the sum of cubed deviations from the mean
54
What is the fourth central moment
kurtosis
55
kurtosis
extent to which the distribution shape is flat (negative kurtosis) or has a steep peak with thick tails (positive kurtosis)
55
kurtosis
fourth central moment of the population distribution
56
kurtosis formula
the deviations from the mean are raised to the 4th power
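A minimal sketch tying the skewness and kurtosis cards together. It assumes the common standardized definitions: the third and fourth central moments divided by powers of the second, with 3 subtracted from kurtosis so that a normal distribution scores 0. The course's exact formulas may apply a different sample correction. The function names and data are hypothetical.

```python
from statistics import mean

def central_moment(values, k):
    """Average of the deviations from the mean, each raised to the k-th power."""
    m = mean(values)
    return sum((x - m) ** k for x in values) / len(values)

def skewness(values):
    """Cubed deviations in the numerator, scaled by the variance to the power 3/2."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """4th-power deviations in the numerator, scaled by the squared variance, minus 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]       # evenly spread and symmetric
right_skewed = [1, 1, 2, 2, 10]   # one large value in the upper tail

print("skewness:", skewness(symmetric), skewness(right_skewed))
print("excess kurtosis:", excess_kurtosis(symmetric), excess_kurtosis(right_skewed))
```

The symmetric data give a skewness of zero and a negative kurtosis (a flat shape), while the data with a long upper tail give a positive skewness.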
57
is it worse to have non-zero kurtosis or skewness
Having non-zero kurtosis is more problematic than having non-zero skewness (i.e., kurtosis is worse). Distributions with strong skewness also have non-zero kurtosis.