M-side Flashcards

(38 cards)

1
Q

Colquitt et al. (2019)

content validation

A

Content validation guidelines: Evaluation criteria for definitional correspondence and definitional distinctiveness

Definitional CORRESPONDENCE: Degree that scale items align with the target construct’s definition

Definitional DISTINCTIVENESS: Degree that scale items better reflect the focal construct than closely related (“orbiting”) constructs

Two main approaches for content validation

  1. Anderson & Gerbing (1991)
    - Sorting-Based (ChatGPT-like logic)
    - SMEs sort items into the construct they believe best represents the item’s meaning.
    - Goal: Maximize agreement on correct classification.

Metrics:
- Proportion of Substantive Agreement (PSA) = CORRESPONDENCE = % of judges correctly sorting items
- Substantive Validity Coefficient (CSV) = DISTINCTIVENESS = difference between correct vs. incorrect classifications

  2. Hinkin & Tracey (1999)
    - Rating-Based (Human-like logic)
    - JUDGES rate how well each item reflects each construct definition using Likert scales (1–5).

Metrics:
- Hinkin-Tracey Correspondence (HTC) = CORRESPONDENCE = average rating for focal construct ÷ number of anchors
- Hinkin-Tracey Distinctiveness (HTD) = DISTINCTIVENESS = average difference between focal and orbiting construct ratings ÷ (anchors – 1)
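
As a quick reference, the four indices can be written out. This is a sketch based on the descriptions above, using notation I am introducing here: n_c = number of judges assigning an item to the focal construct, n_o = the highest number assigning it to any other construct, N = total number of judges, k = number of response-scale anchors, and r-bar = the average definitional-correspondence rating.

```latex
\mathrm{PSA} = \frac{n_c}{N}
\qquad
c_{sv} = \frac{n_c - n_o}{N}
\qquad
\mathrm{HTC} = \frac{\bar{r}_{\mathrm{focal}}}{k}
\qquad
\mathrm{HTD} = \frac{\bar{r}_{\mathrm{focal}} - \bar{r}_{\mathrm{orbiting}}}{k - 1}
```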

When modifying scales, the most common modification was to drop items from the scale (36.94% of modified scales); however, modified scales should also be tested for convergent validity, discriminant validity, and comparative CFAs (Cortina et al., 2020)

2
Q

Podsakoff et al. (2016)

Concept definitions

A

Recommendations for creating better concept definitions in social sciences

Lack of good conceptual definitions has been a longstanding problem in social sciences.

Clear conceptual definitions are essential for scientific progress

Main point: ensure the final version of conceptual definition is clear, concise, understandable to broad audiences, and not subject to multiple interpretations.

CONCEPTS
- Cognitive symbols (or abstract terms) that specify the features, attributes, or characteristics of the phenomenon in the real or phenomenological world that they are meant to represent, and that distinguish it from other related phenomena
- Serve as the building blocks of theory

Problems with Lack of Conceptual CLARITY:
- Difficult to distinguish focal concept from other similar concepts, undermining discriminant validity
- Leads to proliferation of different terms for the same concept – “Old wine new bottle” problem [personal note: especially in leadership research]
- Difficult to specify and test the nomological network of the concept
- Difficult to operationally measure the concept – i.e., mismatch between concept and measures of it, undermining construct validity.
- Also increases the likelihood of contamination or deficiency of the conceptual measurement.

Recommendations: stages may overlap

(1) Identify potential attributes by collecting representative set of definitions
- Activities to aid in collecting attributes: searching dictionary for synonyms/antonyms, surveying literature, interviewing subject matter experts, focus groups, case studies, comparing the concept with its opposite, thinking how to operationalize the concept

(2) Organize potential attributes by theme and identify necessary/sufficient and shared ones
- Identifying underlying themes can help clarify the concept and distinguish from other related constructs.
- Need to identify attributes of the concept that are necessary and jointly sufficient.
- Necessary (essential) = essential properties that all exemplars of the concept must possess.
- Sufficient (unique) = properties that only exemplars of the concept possess.

(3) Develop preliminary definition of concept
- describe the general nature of the conceptual domain by specifying the property the construct represents and the entity that it applies to
- When the concept is multidimensional, each dimension or facet should be explicitly and clearly defined
- The conceptual definition should also specify whether the concept is stable over time and generalizable across situations
- The concept should be distinguished from other concepts (e.g., attributes unique to focal concept) to reduce construct proliferation.
- Identifying antecedents and consequences can help clarify the definition, but they should not constitute the definition itself
- Avoid tautological statements - definition that simply restates in different words the thing that is being defined.

(4) Refine the definition of the concept
- Revise the definition as needed (e.g., SME reviews, ask “what do you mean by that?”)

3
Q

Zickar (2020)

Measurement annual review

A

Measurement development and evaluation

Before item writing, decide on the most appropriate format for the items and review item-writing best practices

Write significantly more items than needed

For negatively-worded items, there is much debate about whether they should be included or not. If you want to maximize unidimensionality, items should be all scored in the same direction. Reverse-coded items may introduce unintended method factors and tend to have smaller discrimination

Ambiguous items tend to perform worse psychometrically
- Double-barreled items confuse respondents overall and should be avoided
- For example: Employers should be allowed to use urine drug tests but not hair follicle tests (Disagreement = either both are bad or both are okay).

Measurement evaluation frameworks
- CTT: simplest, works well with smaller samples, item-total correlation is useful in EARLY scale development for weeding out bad items; but no way to determine model fit and assumes linear relationship
- EFA: good for inductive, and early stages; but often not replicable (generally assumes linear relationship)
- CFA: specific fit evaluations, useful once basic structure is clarified; but results often hypersensitive to wording of items (generally assumes linear relationship)
- IRT: detailed stats on how items individually function, understanding of response process; but requires large N and complex statistical programs

4
Q

Cho (2016)

reliability

A

Making reliability reliable: A systematic approach to reliability coefficients

All four test models assume that true scores from one test are linearly related to the true scores from another test. They also assume the unidimensionality of items. Each model has different basic assumptions on measurement.

Parallel model
Assumes indicators of a given factor have equal loadings and equal error variances.
Assumes that true scores from different tests are equal when the same individual is involved.
The tests have intercepts of 0 and slope (coefficient) of 1 when linking the two true scores
→ test-retest reliability; alternate form

Tau-equivalent model
Assumes indicators of a given factor have equal loadings but differing error variances.
Assumes that true scores from different items are equal when the same person is involved.
Equality of error variances is not assumed.
→ coefficient alpha

An essentially tau-equivalent model
- assumes indicators of a given factor have equal loadings but differing error variances
- Similar to tau-equivalent model, but this model frees the assumption of 0 intercepts in the regression describing two true scores. This model allows different true score means.
- The model is even less restrictive than the tau-equivalent test model.
- T1 = a + T2

A congeneric model
- Assumes indicators of a given factor have differing loadings and differing error variances.
→ most lenient → coefficient omega
- the congeneric test model has the least restrictive assumptions.
- Neither the true scores nor the error variances from two tests are assumed to be equal.
- The intercept and slope of the regression equation linking the two test scores are not constrained to be 0 and 1, respectively.
- These assumptions allow this model to be most realistic.
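
As a sketch, all four models can be written as constraints on the same one-factor measurement equation, where x_j is the observed score on test/item j, T is the true score (common factor), and e_j is the error term:

```latex
x_j = \mu_j + \lambda_j T + e_j
```

- Parallel: intercepts, loadings, and error variances all equal across j
- Tau-equivalent: intercepts and loadings equal across j; error variances free
- Essentially tau-equivalent: loadings equal across j; intercepts and error variances free
- Congeneric: intercepts, loadings, and error variances all free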

STEP 1. Identify the dimensionality of the data.
Decision: If unidimensional, go to step 2; otherwise, go to step 3.

STEP 2. Identify the statistical similarity of the unidimensional data
Dependencies: Chi-square difference test

STEP 3. Determine the measurement model of the multidimensional data
Dependencies: Chi-square difference test and theoretical considerations

5
Q

McNeish (2018)

alpha/omega/H

A

Thanks coefficient alpha, we’ll take it from here

Empirical studies in psychology commonly report Cronbach's alpha (CA) as a measure of internal consistency reliability, despite the fact that many methodological studies have shown that Cronbach's alpha is riddled with problems stemming from unrealistic assumptions

Many times, violating these assumptions yields estimates of reliability that are too small -> making measures look less reliable than they actually are

However, the published literature still relies heavily on CA
One interpretation of CA from Kline (1986): correlation between a scale and another hypothetical, same-length scale measuring the same construct

Assumptions of CA
(1) Tau-equivalence: Items must have equal factor loadings.
–> But most psychological scales are congeneric (unequal loadings), which leads to underestimates of reliability.
(2) Continuous, normally distributed items:
–> In reality, most items are discrete (e.g., Likert), violating this assumption.
–> Fix: use a polychoric covariance matrix instead of Pearson correlations.
(3) Uncorrelated error terms:
–> Often violated due to item wording/order, speeded tests, or changes in respondent mood.
–> Correlated errors often lead to overestimates of CA.
(4) Unidimensionality:
–> CA does not guarantee unidimensionality.
–> Must check with factor analysis before interpreting CA.

Alternatives to Cronbach’s alpha

(1) Omega = A more accurate estimate of composite reliability, designed for congeneric scales; Omega Total includes both general and specific factors; subsumes CA as a special case; Assumes uncorrelated errors, but can be generalized to handle correlated errors.
–> will be roughly equal to alpha if all of the correct assumptions for alpha are met (above), which is rare in psych research

(2) Coefficient H and Maximal Reliability = Measures how well a scale performs when items are optimally weighted (vs. equal weighting in CA). Best when using factor loadings to weight items differently, improving reliability estimation
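
A hedged sketch of the formulas for the unidimensional case, with k items, item variances sigma_i^2, total-score variance sigma_X^2, standardized loadings lambda_i, and error variances theta_i:

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i} \sigma_i^2}{\sigma_X^2}\right)
\qquad
\omega = \frac{\left(\sum_i \lambda_i\right)^2}{\left(\sum_i \lambda_i\right)^2 + \sum_i \theta_i}
\qquad
H = \left(1 + \left(\sum_i \frac{\lambda_i^2}{1 - \lambda_i^2}\right)^{-1}\right)^{-1}
```

Omega reduces to alpha when loadings are equal (tau-equivalence) and errors are uncorrelated; H weights items by their loadings (optimal weighting) rather than weighting them equally.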

6
Q

Campbell & Fiske (1959)

MTMM matrix

A

convergent/discriminant validity

matrix showing correlations among two or more measurement techniques used to assess two or more constructs or traits, as obtained from a multitrait–multimethod model.

It includes correlations among the same traits with different methods (i.e., monotrait–heteromethod) and among different traits with the same method (i.e., heterotrait–monomethod).

The former are expected to be the largest, thus demonstrating convergent validity, whereas the latter are expected to be smaller, demonstrating discriminant validity

7
Q

Hunsley & Meyer (2003)

incremental validity

A

The incremental validity of psychological testing and assessment: conceptual, methodological, and statistical issues

in selection: determine whether incorporating a new measure enhances decision-making accuracy

As with all validity evidence, a measure’s incremental validity is context-dependent; a test may add value in one setting but not in another.

Interpretation Challenges: The size of the incremental validity effect should be interpreted in light of practical significance, not just statistical significance.

Cost-Benefit Analysis: The added predictive value of a new measure should be weighed against the costs (financial, time, resources) associated with its implementation.

Requires careful consideration of methodological and statistical factors to ensure that new measures truly enhance predictive accuracy and decision-making in applied psychological settings.

8
Q

Test bias

A

Can use Berry (2015) for applying this to adverse impact and cognitive ability tests

Internal bias = measurement bias; the test functions differently across groups at the same level of the latent trait (test with measurement invariance / DIF analyses)

External bias = relational bias; the predictor-outcome relationship differs across groups: differential validity (compare validity coefficients across groups) and differential prediction/predictive bias (test with moderated hierarchical regression; see the sketch below)
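
A minimal sketch of the differential-prediction test via moderated hierarchical regression. The data and variable names (score, group, perf) are hypothetical, not from Berry (2015):

```python
# Minimal sketch of a differential-prediction (predictive bias) test.
# Step 1 adds the group main effect (intercept differences);
# Step 2 adds the predictor-by-group interaction (slope differences).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
group = rng.integers(0, 2, size=n)            # 0/1 protected-group indicator
score = rng.normal(50, 10, size=n)            # predictor (e.g., cognitive ability test)
perf = 0.05 * score + rng.normal(size=n)      # criterion; toy data with no bias built in

df = pd.DataFrame({"perf": perf, "score": score, "group": group})

step1 = smf.ols("perf ~ score + group", data=df).fit()   # intercept bias
step2 = smf.ols("perf ~ score * group", data=df).fit()   # adds score:group for slope bias
print(step1.pvalues["group"])                             # test of intercept differences
print(step2.pvalues["score:group"])                       # test of slope differences
```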

9
Q

PCA versus EFA

(Use textbook: Furr, 2018)

A

Exploratory factor analysis is used when it is not known how many factors underlie the items or which items are determined by which factors

EFA and PCA are two entirely different things! EFA = reflective; PCA = formative

EFA

Goal: reduce the redundancy among the variables by using a smaller number of factors; evaluate the internal factor structure
–> factor extraction –> enumeration –> rotation –> interpretation

It is used when little is known about the underlying structure
Factor is referred to as a dimension underlying latent traits

EFA involves factoring a correlation (or covariance) matrix representing the shared or common variance

Factors are evidenced by patterns of correlation among indicators –> need to use correct correlation based on data (continuous, binary, categorical) –> If the indicators are not correlated in the first place, game over!

Variance of an observed variable
We partition the variance of yi into a component due to the common factors (communality) and a unique component (unique variance)
communality = shared variance among a set of indicators
if h2 = .70, then 70% of indicator (i.e., item) variance is explained by underlying factors. The rest, or 30% of the total variance, is unique variance
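
In symbols (standardized solution with orthogonal factors), the partition for indicator y_i is:

```latex
\mathrm{Var}(y_i) = h_i^2 + u_i^2, \qquad h_i^2 = \sum_{j} \lambda_{ij}^2
```

so h_i^2 = .70 implies a unique variance of u_i^2 = .30.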

Limitations
- naming the factors can be problematic. Factor names may not accurately reflect the variables within the factor.
- some variables are difficult to interpret because they may load onto more than one factor, which is known as split loadings. These variables may correlate with one another to produce a factor despite having little underlying meaning for the factor

Principal Components Analysis

PCA is NOT a “true” factor model

However, beginning with PCA allows us to establish a number of core principles that apply to all factor analytic approaches

PCA is also widely used in practice, is often the default in major software packages, and is easily confused with other types of EFA

PCA involves a mathematical procedure that transforms a set of correlated variables into a smaller set of uncorrelated principal components (PCs)

These PCs are linear combinations of the original variables and can be thought of as “new” variables (Johnson, 1998)
–> How can I abbreviate this set of variables?
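
A minimal sketch of what PCA actually does: an eigendecomposition of the correlation matrix, with components formed as linear combinations of the standardized variables. The data are toy values, not from any cited study:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))              # 200 respondents, 5 items (toy data)
X[:, 1] += 0.8 * X[:, 0]                   # induce some redundancy among items

Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize items
R = np.corrcoef(Z, rowvar=False)           # item correlation matrix

eigvals, eigvecs = np.linalg.eigh(R)       # eigendecomposition (ascending order)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Z @ eigvecs                                    # principal component scores
print(eigvals / eigvals.sum())                          # proportion of variance per component
print(np.round(np.corrcoef(scores, rowvar=False), 2))   # PCs are uncorrelated
```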

10
Q

EFA vs CFA

Can cite Furr (2018)

A

EFA vs. CFA: What gets analyzed

Structure
EFA
- All items load on all factors
- Goal is to pick a rotation that gives closest approximation to “simple structure” (clearly defined factors, fewest cross-loadings)
CFA
- CFA must be theory-driven: any structure is a testable hypothesis
- You specify number of latent variables and their structure
- You specify which items load on which latent variables
- You specify any additional relationships for method/other covariance

Matrix
EFA: Correlation matrix (of items = indicators)
- only correlations among observed item responses are used
- Only a standardized solution is provided
CFA: Covariance matrix (of items = indicators)
- Variances and covariances of observed item responses are analyzed
- Output includes unstandardized (covariance) AND standardized (correlation) solutions

Factor scores
EFA: Don’t use factor scores from an EFA
- Factor scores are indeterminate (especially due to rotation)
CFA: Factor scores can be used
- Factors can either be predictors (“exogenous” variables) or outcomes (“endogenous” variables) or both at once as needed (e.g., as mediators)

CFA identification
- under-identified: unknown parameters > known parameters
- just-identified: unknown parameters = known parameters –> this is not really a testable model, more so a description of the data
- over-identified: unknown parameters < known parameters –> the only time we can test model fit
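
A quick worked count, assuming a one-factor CFA with 4 indicators and the factor variance fixed to 1:

```latex
\text{knowns} = \frac{p(p+1)}{2} = \frac{4 \cdot 5}{2} = 10,
\qquad
\text{unknowns} = 4\ \text{loadings} + 4\ \text{error variances} = 8,
\qquad
df = 10 - 8 = 2
```

So the model is over-identified and its fit can be tested.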

11
Q

Measurement invariance
CFA vs IRT

(Tay et al., 2015)

A

Use CFA for:
- relationship among latent FACTORS [across groups]
- General equivalence of SCALE scores across groups (continuous indicators)

Use IRT for:
- general equivalence of TEST scores (total) across groups
- equivalence of TEST items across groups
- General equivalence of SCALE scores across groups (categorical indicators)

Use either for:
- Equivalence of SCALE items (if individual items themselves are more of interest, use IRT)

12
Q

Flora et al. (2012)

CFA with ordinal data

A

MAIN POINT: when using a CFA with ordinal data (e.g., likert scale) use polychoric correlation and robust weighted least squares (WLS) estimator

13
Q

Credé & Harms (2015)

A

25 years of higher‐order confirmatory factor analysis in the organizational sciences: a critical review and development of reporting recommendations.

Higher order CFA = used when constructs are hierarchically structured—that is, when several related first-order factors are believed to be explained by a broader, overarching second-order (or higher-order) factor (e.g. sub-dimensions of a trait like conscientiousness)

Second order factor = conscientiousness
First order factor = achievement striving, orderliness
Manifest/observed variables = individual items “I tend to keep things in order” (orderliness–and ONLY orderliness)
Error terms = measurement error for items

researchers should present 5 types of evidence to support a higher order model using CFA

1 - Higher order model (HOM) can accurately reproduce covariation among manifest variables in an ABSOLUTE sense
–> Evidence: Global fit indices (e.g., RMSEA, CFI, SRMR)

2 - HOM can reproduce the covariation among manifest variables as accurately as the bifactor model and the oblique lower order model
–> Evidence: Model comparison using fit indices and likelihood ratio tests. Incremental fit indices (e.g., difference in CFI)

3 - HOM is characterized by a higher-order factor that reproduces the covariation among FIRST-ORDER factors
–> Evidence: The correlation matrix of the first-order factors should closely match what the higher-order factor predicts. The higher-order factor loadings (from the first-order factors) should be strong and statistically significant.

4 - HOM explains substantial VARIATION in first order factors
–> Evidence: R² values (explained variance) for the first-order factors should be substantial. Higher-order factor loadings should indicate a strong relationship with each first-order factor.

5 - HOM explains substantial variation in manifest variables
–> Evidence: R² values for the manifest variables should indicate that the model explains a significant portion of their variance. Indirect effects can be calculated to show how the higher-order factor influences the manifest variables via the first-order factors.

found that organizational researchers typically do NOT bolster their claims of HOM with these 5 types of evidence (e.g., core self-evaluation construct; Erez & Judge, 2001)

14
Q

Multidimensional forced choice format

(Use Lee et al., 2018)

A

MFC measures are designed to assess several traits simultaneously using statements (or, more generally, trait descriptors) that are intended to be factorially pure; that is, each statement is intended to measure just one trait. MFC items are groups of statements that are presented together for examinee consideration.

recent research shows that MFC have similar or higher construct and criterion-related validity than Likert-type personality inventories

MFC measures commonly use a two-alternative (pair), three-alternative (triplet), or four-alternative (tetrad) format.

MFC response formats can be classified into three general types:
–> PICK - pick the statement most like you
–> MOLE - choose the most like and least like statements
–> RANK - rank statements from most like you to least like you

Recent research suggests that the RANK response format yields better latent trait (person parameter) recovery than the MOLE and PICK (Joo et al., 2018) among tests of equal length

Recommended approach is MFC with rank response with triplets (best balance of info provided and cognitive load required to answer)

15
Q

IRT basics

(use Furr, 2018)

A

IRT is a mathematical model that relates a test-taker’s latent trait or ability level with the probability of responding in a specific response category of an item.

Item characteristic curve (ICC) → function relating probability of a correct answer on an item to the ability (true score) measured by the test containing that item

Parameters of item characteristic curve:
Ability (θ) — the latent trait or ability measured by the test –> plotted along the horizontal axis
Difficulty (b) — represents the difficulty of the item –> Shifts along the ability axis to show level of difficulty
Discrimination (a) — represents the discrimination of an item –> steeper curve = more discrimination
Guessing (c) — represents the possibility of being correct by guess. –> Lower asymptote for the probability of a correct response
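
These parameters combine in the item characteristic curve; for the three-parameter logistic (3PL) model, the probability of a correct response is:

```latex
P(X = 1 \mid \theta) = c + (1 - c)\,\frac{1}{1 + e^{-a(\theta - b)}}
```

Setting c = 0 gives the 2PL; additionally constraining a to be equal across items gives the 1PL/Rasch model.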

Assumes UNIDIMENSIONALITY (single trait) and LOCAL INDEPENDENCE (latent variable in a model fully explains why the observed items are related to one another)

Evaluate IRT model fit
- Item level: Item-level fit statistics (e.g., various Chi-square tests and fit indices) and plots
- Test level: Overall chi-square fit statistics
- The null hypothesis is that the model fits the data
- A non-significant chi-square therefore indicates that the model fits the data

16
Q

Newman (2014)

Missing data

A

Missing data: five practical guidelines

Types of Missing data
Missing completely at random (MCAR) = missingness is not related to any of the other variables or itself
–> MCAR does not bias the results. Any standard missing-data technique can be used, and the results will be unbiased.
–> However, if MCAR makes up a large portion of the data, listwise deletion would be inefficient because it reduces the sample size and standard errors get larger.

Missing at random (MAR) = Missingness is related to other variables, but not to the value of Y itself
–> Handled well by ML/MI, especially when auxiliary variables related to missingness are included

Missing not at random (MNAR): Missingness is related to the value of Y itself

Best practice is to use maximum likelihood (uses summary estimates) or multiple imputation (20-40 imputed datasets, then combine)

MNAR WILL ALWAYS BE BIASED NO MATTER WHAT TECHNIQUE YOU USE. But - ML and MI allow for accurate standard errors under MNAR
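
A minimal sketch of multiple imputation in Python, using scikit-learn's IterativeImputer as a stand-in for model-based imputation. The dataset, variable names, and 20-imputation loop are illustrative; in practice FIML or MI in a dedicated package would be used, and standard errors would be pooled with Rubin's rules:

```python
# Hedged sketch: generate several imputed datasets and combine point estimates.
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["x1", "x2", "y"])
df.loc[rng.random(100) < 0.2, "y"] = np.nan        # ~20% missing on y (toy MCAR pattern)

estimates = []
for m in range(20):                                 # 20-40 imputations recommended
    imp = IterativeImputer(sample_posterior=True, random_state=m)
    completed = pd.DataFrame(imp.fit_transform(df), columns=df.columns)
    estimates.append(completed["y"].corr(completed["x1"]))

print(np.mean(estimates))                           # pooled point estimate across imputations
```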

17
Q

Aguinis et al. (2013)

Outliers

A

Best-practice recommendations for defining, identifying, and handling outliers

There are
(a) 14 unique and mutually exclusive outlier definitions, 39 outlier identification techniques, and 20 different ways of handling outliers;
(b) inconsistencies in how outliers are defined, identified, and handled in various methodological sources; and
(c) confusion and lack of transparency in how outliers are addressed by substantive researchers.

Three types of outliers
- ERROR outliers: data points that lie at a distance from other data points because they are results of inaccuracies (errors)
- INTERESTING outliers: accurate (non-error) data points that lie at a distance from other data points and may contain potentially valuable or unexpected knowledge worth further examination.
- INFLUENTIAL outliers: model fit outliers and prediction outliers; data points whose presence alter the fit of a model or parameter estimates.
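
For influential outliers, one common identification technique is Cook's distance from a fitted regression. A minimal sketch with toy data (not from the article); the 4/n cutoff is a common rule of thumb, not a prescription from Aguinis et al.:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 0.5 * x + rng.normal(size=50)
y[0] = 8.0                                    # plant one influential point

model = sm.OLS(y, sm.add_constant(x)).fit()
cooks_d = model.get_influence().cooks_distance[0]   # one distance per observation
print(np.where(cooks_d > 4 / len(y))[0])            # indices flagged by the 4/n rule of thumb
```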

18
Q

DeSimone et al. (2015)

data screening

A

Best practice recommendations for data screening

During study design
- Determine the forms of insufficient-effort responding most likely to be exhibited by respondents and how best to detect them
- Insert screening items at various points throughout the survey to detect localized lapses in effort, ensuring that these items use identical response options to surrounding items.
- Understand the range of possible values for each variable as well as distributional characteristics for variables that have been studied previously.

During study administration
- Time respondents (if possible)
- Observe respondents to determine whether or not they are attending to study-related tasks (has issues with demand effects)

After data has been collected
- Visually inspect the data to identify data-entry errors or implausible values for each variable.
- Calculate the distributional characteristics of each item to assist in identifying outliers
- PRIOR to examining individual respondents, carefully determine what should be considered insufficient-effort responding, noting that this will vary by research design
- Using a combination of different screening techniques, eliminate participants who are likely to have exhibited insufficient effort responding.

Report the results of a study both BEFORE and AFTER employing data screening techniques, noting any differences in results that arise as the result of eliminating insufficient-effort respondents.

19
Q

Zickar et al. (2023)

sampling quality annual review

A

Innovations in sampling: improving the appropriateness and quality of samples in organizational research.

Considerations when CHOOSING samples
- cost
- potential degree of accuracy of population you’ll get with the sample
- time to collect
- n of subgroup members
- generalizability considerations

Looking for careless / insufficient-effort responding (see the sketch after this list)
- attention checks [manipulation checks; bogus items]
- mahalanobis D
- long string data
- response time
- consistency indices [even-odd, response consistency]
- response coherence [open-ended items]
- self-reported effort/attention
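
A minimal sketch of two of these indices, Mahalanobis distance and long-string, computed by hand on a toy response matrix (dedicated packages exist for this; the data here are simulated):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.integers(1, 6, size=(100, 20)).astype(float)   # 100 respondents x 20 Likert items
X[0, :] = 3                                             # one straight-liner for illustration

# Mahalanobis distance of each respondent from the sample centroid
diff = X - X.mean(axis=0)
cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)      # squared distance per respondent

# Long-string index: longest run of identical consecutive responses
def longstring(row):
    best = run = 1
    for a, b in zip(row[:-1], row[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best

ls = np.array([longstring(r) for r in X])
print(d2[:3], ls[:3])    # flag respondents with extreme distances or very long strings
```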

20
Q

Podsakoff et al. (2024)

annual review CMB

A

Common method bias: It’s bad, it’s complex, it’s widespread, and it’s not easy to fix

2 harmful effects of common method bias/method variance

1) method variance can bias estimates of reliability and validity of latent variables
2) can bias parameter estimates of the relationship between measures of different constructs

common rater effects = leniency, social desirability, affect

item characteristic effects = item wording, ambiguity, item-level demand effects, common scale anchors

item context effects = item priming effects, item proximity, scale length

measurement context effects = same time, same location, same medium

4 basic procedural remedies
1 - obtain measures of predictor and criterion from different sources
2 - separate the measurement of predictor and criterion variables temporally, proximally, psychologically
3 - protect respondent anonymity and reduce evaluation apprehension
4 - minimize common scale properties

best to match procedural remedy to source of CMB

21
Q

Antonakis et al. (2010)

causality

A

On making causal claims: A review and recommendations

Major problem in non-experimental causal models:

Endogeneity bias leads to biased and inconsistent estimates of the model parameters (unreliable results for inference or prediction): the effect of x on y (e.g., income and consumer preference) cannot be interpreted because the error term is correlated with the predictor. Sources include:
- omitted causes: omitted variables end up in the error term and correlate with the included IVs, violating the exogeneity assumption
- simultaneity: bidirectional causality between IV and DV (e.g., assertiveness drives engagement and engagement drives assertiveness)
- measurement error: becomes part of the error term, which then correlates with the IV
- self-selection: participants are not randomly selected when participation is voluntary, so the sample may not be representative of the population

Three conditions for inferring causality:
  1. X must precede Y temporally (necessary but not sufficient condition)
  2. X must be reliably correlated with Y (beyond chance)
  3. The relationship between X and Y must not be explained by other causes

Failsafe way to establish causality = randomized experiments

22
Q

Funder & Ozer (2019)

effect sizes

A

Evaluating effect size in psychological research: Sense and nonsense.

the nonsensical but widely used interpretation of effect size is Cohen’s (1988) suggestions in the context of power analysis: r = .10 is small, .30 is medium, and .50 is large

Cohen later regretted these suggestions
small, medium, and large are meaningless in the absence of a frame of reference

Suggest two easy ways to interpret effect sizes
–> use a benchmark: compare with classic studies, with other well-established psychological findings, comparisons with “all” studies, comparisons with intuitively understood non-psychological relations
–> estimate consequences: binomial effect size display (BESD) - the BESD illustrates the size of an effect, reported in terms of r, using a 2 x 2 table of outcomes; consequences in the long run - could this tiny effect size be truly important for a population (e.g., smoking and cancer)?
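
Worked example of the BESD: a correlation r is displayed as the difference between "success" rates of 50 + 100r/2 percent and 50 - 100r/2 percent in a 2 x 2 table, so:

```latex
r = .30 \;\Rightarrow\; 65\% \text{ vs. } 35\%
```

i.e., a .30 correlation corresponds to a 65% versus 35% split in outcomes between the two groups.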

Implications for Interpreting Research Findings
- Researchers should not automatically dismiss “small effects”
- Researchers should be more skeptical about “large” effects
- Researchers should be more realistic about the aim of their programs of psychological research

Recommendations for Research Practice
- Report effect sizes, always and prominently, with confidence intervals
- Conduct studies with large samples (when possible)
- Report effect sizes in terms that are meaningful in context

revise the cohen guidelines
r = .10 - small at level of single events but potentially more ultimately consequential
r = .20 - medium size that is somewhat explanatory and practical in short run
r = .30 - large, potentially very powerful in both short and long run
r > .40 - very large in psychological research and likely to be a gross overestimate that will rarely be found in large sample or in a replication

23
Q

Bosco et al. (2015)

effect size

A

Correlational effect size benchmarks

use meta-analysis to create correlation effect size benchmarks for applied psychology

In sum, results indicated that commonly used, existing ES benchmarks are not appropriately tailored to the applied psychology research context. Results also indicate that empirical benchmarks for effect size magnitude may vary as a function of bivariate relation type

effect sizes are larger when relations don’t include behaviors
–> medium effect sizes (r)
involving behaviors = .10 - .25
not involving behaviors = .20 - .40

found that relations with movement (e.g., turnover) are smaller than relations with performance

also calculated the sample sizes needed to achieve .80 power with the effect sizes they found

using the broad benchmarks, sample sizes required to achieve .80 power for a 50th percentile effect size vary between 97 and 150 (nonbehavioral relations) and between 215 and 304 (behavioral relations)

24
Q

Nosek et al. (2022)

replication crisis annual review

A

Replicability, robustness, and reproducibility in psychological science

A study is a REPLICATION when the innumerable differences from the original study are believed to be irrelevant for obtaining the evidence about the same finding.

Replication - testing the reliability of a prior finding with different data
Robustness - testing the reliability of a prior finding using the same data and a different analysis strategy
Reproducibility - testing the reliability of a prior finding using the same data and the same analysis strategy

Replication seems straightforward - but it is not. There is no such thing as an exact replication. This fact creates a tension

Can resolve this tension by:
- accepting that every study is unique and the evidence it produces applies only to a context that will never occur again
- understanding replication as a theoretical commitment.

SYSTEMATIC replication - replication efforts that define a sampling frame and conduct replications of as many studies in the sampling frame as possible to minimize selection biases
MULTI-SITE replications - studies that conduct the same replication protocol in a variety of samples and settings to obtain highly precise estimates of effect size and heterogeneity

Bakker et al. (2012) found that if the goal is to create as many positive results as possible, it is in the researchers’ self interest to run MANY UNDERPOWERED studies rather than fewer well powered ones

Research has found that researchers do disagree with the cultural devaluation of replication (hints that problem is with system and rewards not the minds of scientists)

25
Q

Silberzahn et al. (2018)

soccer/red card dataset

A

Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results

Demonstrated the influence that data-analytic choices can have on results.

Used 29 teams (61 analysts total) to analyze the same data set to determine whether soccer referees are more likely to give red cards to dark-skin-toned versus light-skin-toned players.

Teams used a variety of analytic approaches; effect sizes ranged from 0.89 to 2.93 in odds-ratio units.

69% of teams found a statistically significant positive effect; 31% of teams did not.

Neither analysts' prior beliefs about the effect of interest, nor their level of expertise, nor peer ratings of the quality of their analyses explained the variation in outcomes.

Findings suggest that significant variation in the results of analyses of complex data may be difficult to avoid, even by experts with honest intentions.

Crowdsourcing data analysis - a strategy in which numerous research teams are recruited to simultaneously investigate the same research question - makes transparent how defensible, yet subjective, analytic choices influence research results.

26
Q

Bryan et al. (2021)

heterogeneity revolution

A

Behavioral science is unlikely to change the world without a heterogeneity revolution

The field's response to concerns about replicability has concentrated almost exclusively on efforts to control Type I error, but the single-minded focus on this issue is distracting from, and possibly aggravating, more fundamental problems standing in the way of behavioral science's potential to change the world:
--> the narrow emphasis on discovering main effects, and the common practice of drawing inferences about an intervention's likely effect at a population scale based on findings in haphazard convenience samples that cannot support such generalizations

A narrow focus on main effects in the population as a whole almost necessarily means a focus on effects in the group with the greatest numerical representation.

Need a heterogeneity revolution with a new paradigm defined by:
1 - a presumption that intervention effects are context dependent
2 - skepticism of insufficiently qualified claims about an intervention's 'true effect' that ignore or downplay heterogeneity
3 - understanding that variation in effect estimates across replications is to be expected even in the absence of Type I error

This paradigm shift will change current research practice in the following ways:
1 - increased attentiveness, in the hypothesis-generation phase, to likely sources of heterogeneity in treatment effects
2 - efforts to measure characteristics of samples and research contexts that might contribute to such heterogeneity
3 - use of new, conservative statistical techniques to identify sources of heterogeneity that might not have been predicted in advance
4 - large-scale investment in shared infrastructure to reduce the currently prohibitive cost to individual researchers of collecting data - especially field data - in high-quality, generalizable samples

2 key characteristics of the emerging paradigm that distinguish it from the current one:
1 - intervention effects are expected to be context and population dependent
2 - decline effects in later replications are not automatically attributed to questionable research practices in the original research

27
Q

Lewis-Beck & Lewis-Beck (2016)

Regression

A

Regression is the prediction of one variable's value based on another variable's value (Lewis-Beck & Lewis-Beck, 2016).

The fundamental goal of fitting a regression line to a dataset is to calculate regression weights that minimize the sum of the squared prediction errors, generally within a framework that assumes a linear relationship between X and Y.

Explanatory power is understood via the coefficient of determination (R squared).

Parameter estimates may not be significant due to (1) inadequate sample size, (2) Type II error, (3) specification error, or (4) restricted variance in X.

Core assumptions
--> no specification error (the relationship is linear and the relevant variables are included)
--> no measurement error (variables are accurately measured)
--> error terms are homoscedastic, uncorrelated, and normally distributed
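
In symbols, the bivariate case chooses the weights that minimize the sum of squared prediction errors, and R squared summarizes explanatory power:

```latex
\min_{b_0, b_1} \sum_i \left(y_i - b_0 - b_1 x_i\right)^2,
\qquad
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
```
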
28
Q

Huffcutt (2004)

research perspectives on meta-analysis

A

As with traditional significance testing, the goal of meta-analysis is to make inferences about population characteristics and relationships using sample data. Thus, meta-analysis and significance testing are tied together by their common purpose. The main difference between them is that one focuses on analysis of a single study, while the other focuses on analysis of a collection of related studies.

General meta-analytic process
Step 1 - Clearly specify the characteristic being studied
Step 2 - Search for research studies which have analyzed that characteristic
Step 3 - Establish a list of criteria (i.e., standards) that located studies have to meet before they are actually included in the meta-analysis
Step 4 - Collect and record info from each study which meets the criteria established in the previous step
Step 5 - Summarize the findings of the studies mathematically

Conceptual premise of meta-analysis: founded upon the concept of sampling error
- Sampling error = difference between the characteristics of a sample and those of the population from which it was drawn; caused by chance and the direct result of dealing with a sample that typically represents only a small fraction of the population
- Because sampling errors are random, they have a tendency to average out when combined across studies
- Sampling errors tend to form a normal distribution with a mean of zero, so the mean test statistic becomes an approximate estimate of the population test statistic

29
Q

Meta-analytic approaches

(cite Huffcutt, 2004, for overall info; Hunter & Schmidt, 1990; Hedges & Olkin, 1985)

A

These five steps are generic and shared by both approaches (H&S and H&O):
1) Define the variable/construct.
2) Gather relevant studies.
3) Set inclusion criteria.
4) Extract data.
5) Summarize findings mathematically.

Central foundation of meta-analysis: sampling error
--> Sampling error = random difference between a sample and the true population value.
--> Key idea: by averaging across many studies, random errors tend to cancel out (central limit theorem).
--> Both methods account for sampling error, but Hunter & Schmidt treat it as more central to their model.

Hunter & Schmidt (1990):
- Seeks to estimate true effect sizes by correcting for known artifacts like measurement error and range restriction.
- Assumes much of the observed variation is due to statistical artifacts, and focuses on understanding construct-level relationships.
- Typically used more in I-O.

Hedges & Olkin (1985):
- Takes a conservative, statistical approach, modeling the observed effect sizes without making psychometric corrections, and places more emphasis on inference precision and heterogeneity modeling.

Testing for moderators (see the sketch below)
H&S = (a) if 75 percent or more of the observed variance is attributable to sampling error, assume no moderation; (b) run separate meta-analyses for different levels of the suspected moderator
H&O = Q statistic (test of homogeneity; a significant Q suggests moderators are present)

Focus of the effect
H&S = TRUE effect (psychometric)
H&O = OBSERVED effect (statistical)

Other notes:
--> Study compatibility: must ensure studies assess the same construct (otherwise results are meaningless).
--> Uneven sample sizes: very large studies can dominate results; some methods adjust for this.

Confidence vs. credibility intervals:
- Confidence interval (CI) = how precisely you estimate the mean effect size.
--> "If I repeated this meta-analysis with different samples, the average effect size would fall in this range most of the time."
- Credibility interval (CrI) = range of effect sizes that exist in the population (based on residual variance).
--> "In the real world, the strength of this effect could range from this low to this high, depending on the context."
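
Sketch of the key quantities behind the moderator tests (r-bar = sample-size-weighted mean correlation, N = study sample size, k = number of studies). The exact estimators vary by treatment, so treat these as illustrative:

```latex
\text{H\&S: } \hat{\sigma}_{e}^{2} = \frac{(1 - \bar{r}^2)^2}{\bar{N} - 1}
\ \ \text{(compared with the observed variance of } r \text{; 75\% rule)}
\qquad
\text{H\&O: } Q = \sum_{i=1}^{k} w_i\,(r_i - \bar{r})^2 \sim \chi^2_{k-1} \text{ under homogeneity}
```

where the w_i are inverse-variance weights (Q is often computed on Fisher-z transformed correlations).
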
30
Q

Page et al. (2021)

PRISMA

A

Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA): created to help increase standardization and transparency in systematic reviews.

PRISMA 2020 contains a 27-item checklist, an expanded checklist that details reporting recommendations for each item, the PRISMA 2020 abstract checklist, and revised flow diagrams for original and updated reviews.

31
Q

Using AI to help streamline meta-analysis

Tools:
- Elicit
- ASReview

A

1. Literature search and screening (van de Schoot et al., 2021)
- AI can automate or semi-automate identifying relevant studies.
- Semantic search tools (e.g., Elicit, Research Rabbit) use NLP to find papers that match the meta-analysis topic, beyond just keyword matches.
- AI-assisted abstract screening: tools like ASReview use machine learning to learn your inclusion/exclusion decisions and prioritize the most relevant papers.
- De-duplication and sorting: AI can cluster studies, remove duplicates, and tag preprints, grey literature, etc.

2. Data extraction
Text mining and NLP tools can identify and extract:
- Effect sizes (e.g., d, r, OR)
- Sample sizes
- Moderator variables
- Study characteristics

3. Effect size computation and conversion
AI can help standardize effect sizes, especially when original studies report statistics in inconsistent formats (e.g., F-values, t-tests, odds ratios).
- Automated effect size calculators can be paired with NLP to recognize stats and convert them into a common metric like Cohen's d or r.

4. Artifact identification and correction
- AI can flag or estimate measurement reliability, range restriction, or missing artifact data by cross-referencing external databases or using imputation models.
- For psychometric meta-analyses (Hunter & Schmidt), this can help correct for artifacts more efficiently.

5. Moderator analysis and pattern detection
AI (especially machine learning models) can:
- Detect complex moderator patterns (e.g., interactions, nonlinear effects).
- Help prioritize which moderators to test based on exploratory analysis.
- Use clustering or decision trees to identify subgroups of studies with different effects.

6. Reproducibility and workflow automation
AI tools can be integrated into reproducible workflows (e.g., using R, Python, or PRISMA-compliant pipelines) that document decisions transparently.
- AI could even monitor whether your meta-analysis is following PRISMA, MARS, or APA guidelines in real time.

Future potential: generative AI + meta-analysis
Imagine uploading 100 PDFs and asking an AI to:
- Extract data
- Compute corrected effect sizes
- Run a meta-analysis
- Generate a forest plot
- Summarize moderators
- Write the results section - all in one go.
This isn't far off. Researchers are already combining LLMs with statistical packages (like meta or metafor in R) to automate parts of this pipeline.

32
Q

Oh (2020)

Secondary uses of meta-analytic data (SUMAD) annual review

A

SUMAD enables researchers to:
- Develop or refine theories by examining patterns across various meta-analyses.
- Inform evidence-based practices by identifying consistent findings across multiple studies.
- Detect moderators or mediators that influence relationships in different contexts.
- Assess the generalizability of findings across diverse populations or settings.

Issues:
1. Avoid using meta-analytic results based on very small k (number of studies).
2. Don't remove outlier effect sizes - they are often due to real differences in contexts, measures, or samples, not errors.
3. Use multiple publication bias tests.

- SUMAD treats meta-analytic estimates as inputs for theory testing, like a correlation matrix for meta-analytic SEM (MASEM).
- If your inputs are biased, imprecise, or over-sanitized, the theoretical inferences you draw (e.g., mediation, moderation, causal assumptions) will be flawed.
- It's a "garbage in, garbage out" situation - but with even greater consequences, because you're typically not revisiting the original studies.

33
Q

Ployhart et al. (2025)

Intensive longitudinal models annual review

A

Ployhart et al. (2025) give a definition of ILM, examples, and different types of recommendations: theoretical, design/timing, analytical/modeling, and reporting.

Intensive longitudinal models: require multilevel data (Level 1, Level 2) - nested data collected through frequent measurements (typically 20 or more) over densely spaced durations. The desire is to understand temporal dynamics. NOT a specific type of model.

ESM (Experience Sampling Methodology) = most active area where versions of ILM are found.

Measurement occasions must be adjacent and sequential.

Classification typology of ILM:
- Time as substantive variable: linear or quadratic trend? Or ignore/model as control?
- Modeling time: lagged relationships, or examine equal intervals or event duration?
- Event/context as substantive variable: continuous model (event impacts trend), or event duration impacts?
- Linear vs. nonlinear trends: is it a linear trend, or is it truly nonlinear?
- Presence of multiple levels: two levels (within and between) or more than two levels?
- Model residual covariance structure: yes, or assume independence of residuals?

Theoretical recommendations:
- treat time as substantive
- incorporate time/duration in hypotheses
- contrast within- and between-person effects
- justify timing of measurement

Design recommendations:
- ensure measurement is aligned with the cadence of the construct/process
- determine whether to collapse scores across measurement occasions
- estimate reliability between- and within-person
- evaluate missing data

Analytical recommendations (see the sketch below):
- model time using a growth model
- model intercept and slope variance (random effects)
- model nonindependence of residuals
- test differences for within- and between-subject variance

Reporting recommendations:
- sample size, number of repeated measurement occasions, total observations
- report missing data, reliability estimates, and model estimation methods
- report both within- and between-person estimates, even if not relevant to hypotheses
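
A minimal sketch of the baseline analytical recommendation - a two-level linear growth model with random intercepts and slopes (t = measurement occasion, i = person):

```latex
\text{Level 1: } y_{ti} = \pi_{0i} + \pi_{1i}\,\mathrm{Time}_{ti} + e_{ti}
\qquad
\text{Level 2: } \pi_{0i} = \gamma_{00} + u_{0i}, \quad \pi_{1i} = \gamma_{10} + u_{1i}
```

The random effects u_{0i} and u_{1i} capture between-person variance in intercepts and slopes, and an autoregressive structure can optionally be placed on the e_{ti} to model nonindependent residuals.
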
34
Q

Woo et al. (2024)

person-centered modeling

A

Person-Centered Modeling: Techniques for Studying Associations Between People Rather Than Variables

Many commonly used variable-centered models (e.g., linear regression, ANOVA, factor analysis, item response theory, multilevel regression, latent growth models) assume that all individuals come from the same population and differ only by degrees.

Person-centered models assume that populations are heterogeneous: the population is composed of individuals from groups that differ from one another.

Traditional clustering has hard boundaries (e.g., k-means), whereas mixture models allow for fuzzy boundaries (e.g., LPA).

Examples (see the sketch below):
- K-means: used to group employees by personality profile based on trait scores, but treats profiles as fixed and doesn't account for measurement error or overlapping profiles.
- LPA (a mixture model): estimates the probability that each person belongs to each profile, accounts for error, and allows for statistical testing of profile differences on outcomes.

Newer methods such as multilevel mixture models can reveal variability in employee profiles across teams, organizations, or countries.
--> Techniques from machine learning, such as unsupervised learning algorithms, and cluster algorithms developed in the context of network models can also be used for person-centered modeling.
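
A minimal sketch of the hard vs. fuzzy distinction using scikit-learn, with KMeans as the hard-boundary clustering and GaussianMixture as a stand-in for LPA-style mixture modeling. The "profiles" and trait data are toy values, not from the article:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
# Toy trait scores: two latent "profiles" of 100 people each on 3 traits
X = np.vstack([rng.normal(0, 1, size=(100, 3)),
               rng.normal(2, 1, size=(100, 3))])

hard = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
fuzzy = gmm.predict_proba(X)

print(hard[:5])             # hard assignments: exactly one profile label per person
print(fuzzy[:5].round(2))   # fuzzy assignments: membership probability for each profile
```
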
35
Q

Issues in person-centered research

(Ployhart et al., 2025, annual review)

A

When conducting a person-centered study, researchers encounter many methodological decision points and challenges that can affect the substantive quality and/or informational value of their results. These issues include:
(a) determining an appropriate sample size for the method of choice,
(b) deciding on model constraints,
(c) selecting an optimal number of classes,
(d) deciding whether and how to include covariates in the analysis, and
(e) testing (versus assuming) invariance of latent classes across samples or time points.

36
Q

Beal (2015)

ESM annual review

A

Conceptual elements common to various forms of ESM:
- Natural environment - capturing experiences as closely as possible to how they would naturally occur
- Immediacy of experience - prioritizing concrete and immediate experiences over abstract or recalled experiences
- Representative sampling - assessing a range of experiences that accurately reflect an individual's daily life

Also discusses advantages and challenges of ESM.

37
Q

Gabriel et al. (2019)

ESM

A

Experience sampling methods: A discussion of critical trends and considerations for scholarly advancement

Q1: Building within-person theory with ESM
- Most org theories are implicitly within-person, but past research often relies on between-person designs.
- ESM can help test and refine within-person dynamics.
- Decide if you're studying experiences (momentary) or abstractions (aggregated patterns), as this influences how you build theory and apply ESM.

Q2: Isomorphism and homology
- Use multilevel construct validation to test whether your items work similarly at both within- and between-person levels.
- Homologous relationships = similar patterns at both levels; explain why theory/processes differ if they do not hold.
- Use strategies like: aggregating Level 1 data, measuring stable traits during ESM, modeling both levels simultaneously.

Q3: Sample size & power
- Level 1 (within-person) power is usually high - overpowering may be a concern.
- Report effect sizes and justify their importance.
- Benchmark averages: ~835 observations (L1), ~83 participants (L2).
- Base sample size needs on phenomenon duration (e.g., how long does it take for change to occur?).
- Always report actual sample sizes and missingness per variable.

Q4: Motivating ESM participation
- Financial incentives work well - clearly explain the payment plan (per survey, per day, etc.).
- Future research should test influence tactics (e.g., social proof, implementation intentions) to improve engagement and data quality.

Q5: Psychometrics of within-person measures
- Don't use test-retest reliability at Level 1 - use variance decomposition and multilevel reliability estimates.
- Use multilevel CFA (MCFA) to assess factor structure; report fit stats, loadings, and alternate models.

Q6: Adapting scales for ESM use
- Report why and how items were trimmed or modified for within-person use.
- Conduct content analysis on shortened scales, especially for formative constructs.
- Always report reliabilities and MCFA results for adapted measures.

Q7: Modeling trends and cycles
- Consider social context (e.g., 9-5 jobs) and individual patterns (e.g., shift work).
- Test for fixed/random trends or cycles (e.g., Beal & Weiss, 2003) - include them only if significant.

Q8: Common method bias (CMB) in ESM
- Person-mean centering removes the need for Level 2 controls in Level 1 analyses (see the sketch below).
- Control for mood effects using tools like PANAS-X or POMS, especially in morning surveys.
- If long surveys are a concern, use shortened/single-item mood scales.
- Use lagged variables and t-1 controls to reduce bias and support causal inference.
- Report analyses both with and without CMB controls.

Q9: Varying start times & work schedules
- Complex work schedules (e.g., shift work) are not a barrier - design individualized survey schedules.
- Track reminder and response times to capture temporal patterns in the data.

Q10: Using secondary data in ESM
- Use secondary sources only if they help address the research question and are feasible (e.g., IRB, funding).
- Consider daily or cross-level data from others (coworkers, spouses, sensors).
- Even if self-reports and secondary sources align, it can still add value.
- Consider ESM-based experiments with best practices (e.g., randomization, control groups, multilevel manipulations).
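
A minimal sketch of person-mean centering for a Level-1 ESM variable in pandas; the column names are hypothetical:

```python
import pandas as pd

esm = pd.DataFrame({
    "person": [1, 1, 1, 2, 2, 2],
    "stress": [3, 4, 5, 2, 2, 4],   # momentary (Level-1) report
})

# Within-person component (person-mean-centered) for Level-1 analyses
esm["stress_pmc"] = esm["stress"] - esm.groupby("person")["stress"].transform("mean")
# Between-person component (the person mean itself) for Level-2 analyses
esm["stress_pm"] = esm.groupby("person")["stress"].transform("mean")
print(esm)
```
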
38
Q

Wilhelmy & Kohler (2022)

Qualitative research

A

Qualitative research in work and organizational psychology journals: Practices and future opportunities

The paper focuses on inductive qualitative research, which develops insights from empirically collected data (i.e., moving from empirical data to interpretations and abstractions); the process is highly iterative.

Importance of qualitative research for I-O/OB:
1 - useful for studying research topics that explore individuals' experiences, sensemaking, or meaning-making phenomena
2 - qual methods attempt to explore or uncover mechanisms, whereas quantitative methods rely on the predetermination (i.e., hypothesizing) of such mechanisms (e.g., indirect effects) to be tested
3 - ideal for studying current organizational changes and newly emerging topics such as new ways of working
4 - allows the study of events/behaviors that occur with a lower base rate or are otherwise hard to capture (e.g., workplace bullying and physical violence)
5 - when studying sensitive topics or phenomena in vulnerable populations, qual research approaches make it easier to establish rapport with informants, which helps them open up and communicate their worldviews
6 - can help study topics with practical relevance and the possibility of making a real difference; rather than trying to ignore contextual factors by generalizing across them, qual research can bring to the fore contextual boundaries, limitations, and influencing factors, and explore how they shape how people feel, think, and behave

Types of qual approaches most commonly used: thematic analysis, case study, grounded theory

Types of qual data collection methods: semi-structured interviews, critical incidents, open-ended questions on surveys, focus groups, etc.