Exam Flashcards

(37 cards)

1
Q

A type of research design involving one-time collection of information from any given sample of population elements is called…

A

Cross-sectional design

2
Q

All research designs

A

Exploratory, Descriptive, Experimental

Descriptive designs come in two variants: longitudinal and cross-sectional

3
Q

Cross-sectional design
(Descriptive)

A

Single Cross-Sectional Designs
* Sample: One distinct group of respondents
* Data Collection: Occurs once from this group
* Purpose: Offers a snapshot of a specific group at a particular point in time
* Example: Surveying employees’ job satisfaction in a company in 2023

Multiple Cross-Sectional Designs
* Sample: Several distinct groups of respondents
* Data Collection: Occurs once from each group, often at different times
* Purpose: Compare and contrast different groups at different times without repeated measures on the same group. It’s like taking multiple snapshots
* Example: Surveying employees’ job satisfaction in the same company in 2023, 2025, and 2027 using different employee samples each time

Commonality: Both designs give insights into a specific time point, without tracking changes in specific individuals over time

4
Q

Longitudinal Designs
(Descriptive)

A

Definitions
* Cohort: A group experiencing a shared event or characteristic in a specific timeframe
* Example: Individuals who entered the workforce in 2020
* Longitudinal Design: Research methodology collecting data on the same subjects repeatedly over time
* Example: Surveying a group of 100 people in 2020 about their job satisfaction, then re-surveying the same group in 2022, 2024, and 2026 to track changes

Relationship
* Cohort analyses are a type of longitudinal study
* While “cohort” specifies whom you’re studying, “longitudinal” describes how you’re studying them

A panel is…
* … a survey of individuals, households, companies, etc. designed to obtain data on a single subject at regular intervals over a longer period, using the same sample and the same methods each time.

5
Q

The key differences between longitudinal and cross-sectional designs are

A
  1. Timeframe
    • Longitudinal: Studies the same subjects over a period of time.
    • Cross-sectional: Examines different groups at a single point in time.
  2. Purpose
    • Longitudinal: Tracks changes and development over time.
    • Cross-sectional: Compares differences between groups at one moment.
  3. Data Collection
    • Longitudinal: Multiple observations over time.
    • Cross-sectional: One-time data collection.
  4. Strengths
    • Longitudinal: Captures cause-and-effect and developmental trends.
    • Cross-sectional: Quick, cost-effective, and easy to conduct.
  5. Weaknesses
    • Longitudinal: Time-consuming, expensive, risk of participant dropout.
    • Cross-sectional: Can’t track changes over time or establish causation.
6
Q

3 research designs with their goal

A
  • Exploratory: Typically small sample, no statistical tests.
    Goal: generate first insights
  • Descriptive: Representative sampling; infer from the sample to the population.
    Goal: describe the population as it is
  • Experimental: Manipulation of an independent variable in a controlled setting.
    Goal: measure the causal effect of the independent variable on the dependent variable
    Example: independent variable: advertising spot A (vs. spot B); dependent variable: sales
7
Q

Explain the difference between correlation and causality

A

Correlation does not imply causation, and many misleading statistical conclusions arise when people assume a causal relationship from a mere correlation. Empirical research aims to establish causality through careful study design, such as experiments where variables are controlled and manipulated.

Correlation indicates an association between variables, while causality proves that one variable directly influences another (cause and effect relationship).
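Correlation without causation is easy to produce in simulation: when a hidden confounder drives two variables, they correlate even though neither affects the other. A minimal pure-Python sketch — the confounder story and all numbers are invented for illustration:

```python
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)

random.seed(42)
n = 2000
# Hypothetical confounder Z (e.g., brand popularity) drives both
# ad spending X and sales Y; X has no direct effect on Y here.
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [zi + random.gauss(0, 1) for zi in z]

r = pearson(x, y)
print(f"correlation(X, Y) = {r:.2f}")  # clearly positive despite no causal link
```

Only a design that manipulates X while holding Z constant could reveal that X does not cause Y.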

8
Q

Describe the difference between difference and coherence hypotheses

A

A difference hypothesis compares two or more groups to determine whether there is a significant difference between them (e.g., “Ad campaign 1 results in lower sales than ad campaign 2”), while a coherence hypothesis examines the relationship between two variables, suggesting that changes in one correspond to changes in the other (e.g., “The higher the ad spending, the higher the sales”).

In short: a difference hypothesis checks whether two things differ from each other, while a coherence hypothesis looks at how two things are connected.

9
Q

Name all types of hypotheses

A

difference hypotheses
coherence hypotheses
one-tailed (directional)
two-tailed (non-directional)

10
Q

Define deductive/inductive approaches

A

Deductive approach: classical approach, start with problem discovery, confirmatory.
Starts with a general theory or idea, then tests it with data (top-down reasoning). Example: “If customer satisfaction leads to loyalty, then happy customers should return more often.”

Inductive approach: data approach, start with data processing (observations), exploratory.
Starts with observations or data, then develops a theory based on patterns found (bottom-up reasoning). Example: “Many loyal customers seem happy; maybe satisfaction leads to loyalty.”

11
Q

Describe the concept of panel design

A

A panel design is a type of longitudinal study where the same individuals, households, or companies are surveyed repeatedly over a longer period using consistent methods.

While panel designs provide valuable insights into how variables evolve, they face challenges such as panel mortality (dropout of participants), selection effects (non-representative initial samples), and panel participation effects (respondents altering answers due to survey experience).

Panel designs are a concept of descriptive research.

12
Q

Problems in panel design

A

panel mortality (dropout of participants)

selection effects (non-representative initial samples)

panel participation effects (respondents altering answers due to survey experience)

The information collected is predetermined, so it is hard to make changes later!

13
Q

Difference in primary and secondary data

A

Primary data is collected firsthand by researchers for a specific study or purpose (e.g., surveys, experiments, interviews), while secondary data is pre-existing data collected by someone else for a different purpose (e.g., census reports, company records, published studies).

14
Q

Explain the Bradley effect + give possible explanation

A

The Bradley Effect refers to the observed discrepancy between voter opinion polls and actual election outcomes for African-American candidates, named after Tom Bradley, who lost the 1982 California governor’s race despite leading in the polls.

Possible explanation: Social desirability bias – some voters may have told pollsters they would vote for the Black candidate to avoid appearing racist, but voted differently in private.

15
Q

What research design is used when a problem exists but you don’t know why?

A

Exploratory research design is used when a problem exists but the cause is unknown.

Goal: Discovery of ideas and insights; gain an initial understanding of the problem.
Methods: Often qualitative, such as interviews, focus groups, or observations.
Characteristics: Flexible, versatile, small sample size, no statistical tests; often the front end of the total research design.

Example:

A company notices a drop in customer satisfaction but doesn’t know why. It conducts focus groups and in-depth interviews to explore potential reasons, such as poor customer service or product quality issues.

16
Q

Discuss the pros/cons (benefits/risks) of LLMs in qualitative research

A

Pros (Benefits):
1. Automated Text Analysis – LLMs can process large volumes of qualitative data (e.g., interviews, open-ended survey responses) quickly and efficiently.
2. Sentiment Analysis – They help identify consumer emotions and attitudes in social media, reviews, and survey responses.
3. Thematic Coding Assistance – LLMs can assist in qualitative coding by identifying common patterns and suggesting themes, speeding up analysis.
4. Scalability – They enable large-scale qualitative studies that would otherwise be too resource-intensive for manual analysis.
5. Reducing Human Bias – LLMs provide a neutral perspective, minimizing personal biases in qualitative research interpretation.

Cons (Risks):
1. Context Misinterpretation – LLMs may misinterpret nuances, sarcasm, or cultural-specific meanings in text.
2. Bias in Training Data – If biased data is used for training, the LLM may reinforce and replicate those biases in its analysis.
3. Loss of Depth and Nuance – AI-generated summaries might overlook unique, outlier insights that a human researcher would find valuable.
4. Over-Reliance on Automation – Researchers might depend too much on LLMs, neglecting the importance of human interpretation and validation.
5. Ethical and Privacy Concerns – Using AI to analyze sensitive qualitative data raises privacy issues, especially with proprietary or personal data.

Example:

A company analyzing customer complaints using LLMs can quickly categorize concerns (e.g., product defects, delivery issues) but might misinterpret sarcasm in reviews (e.g., “Great service—if you enjoy waiting three weeks for delivery!”).

17
Q

What is typical for focus groups

A

Typical Characteristics of Focus Groups
1. Group Size – Usually consists of 6-12 participants to ensure diverse opinions while allowing for discussion.
2. Homogeneous Composition – Participants are often pre-screened to share common characteristics relevant to the study (e.g., similar demographics or consumer behavior).
3. Moderated Discussion – A skilled moderator guides the conversation, ensuring all voices are heard while keeping the discussion focused.
4. Relaxed and Informal Setting – Conducted in a comfortable environment to encourage open and honest discussions.
5. Interaction-Driven – Participants engage with each other, reacting to others’ viewpoints, which adds richness to the data.
6. Recorded for Analysis – Typically audio or video recorded to capture verbal and non-verbal cues for later review.
7. Time Duration – Lasts between 1-3 hours, depending on the depth of discussion needed.
8. Used for Exploratory Research – Helps uncover opinions, perceptions, and motivations rather than statistical validation.
9. Application in Marketing and Social Research – Commonly used for product testing, brand perception studies, and policy discussions.

Example:

A company launching a new soft drink might conduct a focus group with young consumers to discuss branding, taste preferences, and packaging appeal.

18
Q

What is the purpose of qualitative research

A

Purpose of Qualitative Research
1. Gain Deep Understanding – Explores experiences, behaviors, and social phenomena in their natural context.
2. Generate New Insights – Identifies emerging trends, attitudes, and motivations that may not be captured through quantitative methods.
3. Explore Complex or Unstructured Topics – Investigates topics that are difficult to measure numerically, such as emotions, beliefs, or social interactions.
4. Develop Theories and Concepts – Helps build or refine theories based on real-world observations and patterns.
5. Provide Context to Quantitative Data – Explains the “why” behind numerical trends and unexpected findings in statistical research.
6. Understand Social and Cultural Meanings – Examines how people’s backgrounds, environments, and interactions shape their views and decisions.

Example:

A company using in-depth interviews to explore why customers feel emotionally connected to their brand, rather than just measuring customer satisfaction scores.

19
Q

Differences in sampling in qualitative vs. quantitative research

A

Qualitative Research (Exploratory & Interpretative)
* Purpose: Understand meanings, experiences, and social phenomena.
* Sampling Method: Non-random (purposive, theoretical, or snowball sampling).
* Sample Size: Small (focuses on depth rather than breadth).
* Selection Criteria: Participants chosen for their rich insights and relevance.
* Flexibility: Sample can evolve as themes emerge.
* Data Collected: Text-based (interviews, focus groups, observations).

Quantitative Research (Statistical & Generalizable)
* Purpose: Measure variables, test hypotheses, and generalize findings.
* Sampling Method: Random or probability-based (random, stratified, systematic).
* Sample Size: Large (ensures statistical significance).
* Selection Criteria: Participants represent a larger population.
* Flexibility: Fixed sample, determined before data collection.
* Data Collected: Numerical data (surveys, experiments, structured observations).

20
Q

What is a characteristic of qualitative research

A

Characteristic of Qualitative Research
* Exploratory & Interpretative – Aims to understand meanings, experiences, and social phenomena.
* Non-Numerical Data – Focuses on text, images, videos, and observations rather than numbers.
* Small, Purposive Samples – Participants are selected for relevance, not for statistical representativeness.
* Flexible & Adaptive – Research design may evolve based on emerging insights.
* Context-Dependent – Considers social, cultural, and environmental influences on behavior.
* Rich & Deep Data – Provides detailed descriptions rather than broad generalizations.
* Subjective Interpretation – Findings depend on researcher analysis and contextual understanding.

21
Q

Match the statements with internal/external validity:
* The extent to which the results of an experiment can be generalized – from sample to population.
* The degree to which findings are representative.
* The degree to which a causal conclusion can be drawn.
* The extent to which changes in dependent variables can be explained by experimental manipulation and not by external factors.

A

External Validity:
* The extent to which the results of an experiment can be generalized – from sample to population.
* The degree to which findings are representative.

Internal Validity:
* The degree to which a causal conclusion can be drawn.
* The extent to which changes in dependent variables can be explained by experimental manipulation and not by external factors.

22
Q

Internal Validity

A

Definition: The extent to which a study accurately establishes a cause-and-effect relationship between variables.

Focus: Ensures that changes in the dependent variable are due to the independent variable and not external factors (confounders).

Key Concern: Controlling for extraneous variables to rule out alternative explanations.

Example: A lab experiment testing the effect of a new medication controls for diet, lifestyle, and other health factors to ensure the drug is the only influencing factor.

23
Q

External Validity

A

Definition: The extent to which a study’s findings can be generalized beyond the specific sample, setting, or time of the research.

Focus: Ensures results apply to different populations, locations, and real-world settings.

Key Concern: Representativeness of the sample and realism of the study conditions.

Example: If a study on consumer behavior is conducted only with university students, its external validity is low because results may not apply to older consumers.

24
Q

Name the two means to increase internal validity and explain their purposes

A
  1. Controlling Extraneous Variables
    • Purpose: Ensures that only the independent variable (IV) influences the dependent variable (DV) by eliminating or reducing alternative explanations.
    • Methods:
    • Randomization: Randomly assigning participants to groups to evenly distribute confounding factors.
    • Matching: Ensuring participants in different conditions have similar characteristics (e.g., age, gender).
    • Statistical Control: Measuring extraneous variables and adjusting for their effects using statistical techniques.
    • Design Control: Using experimental designs that account for confounders by treating them as additional variables.
    • Example: In a drug trial, using random assignment ensures that pre-existing health conditions do not bias results
  2. Conducting Manipulation Checks
    • Purpose: Verifies whether the independent variable was perceived as intended and that it affects the dependent variable as expected.
    • How?
    • Pre-tests and pilot studies to confirm participants understand the treatment.
    • Asking questions that assess whether participants noticed or responded to the manipulated condition.
    • Ensuring the IV has a measurable impact before drawing causal conclusions.
    • Example: In an experiment testing the effect of brand trust on purchase decisions, a manipulation check could ask participants to rate how trustworthy they perceived the brand after exposure to different advertisements.

By applying these strategies, researchers reduce bias, eliminate confounding factors, and improve causal conclusions, strengthening internal validity.

25
Q

Explain the difference between true and quasi-experiments

A

True Experiments
* Definition: A research design where participants are randomly assigned to experimental and control groups.
* Key Feature: Uses randomization to control for extraneous variables, ensuring groups are comparable.
* Purpose: Establishes strong internal validity by isolating the effect of the independent variable (IV) on the dependent variable (DV).
* Example: A medical trial where patients are randomly assigned to receive either a new drug or a placebo.

Quasi-Experiments
* Definition: A research design without random assignment; participants are placed in groups based on pre-existing conditions.
* Key Feature: Lacks randomization, making it harder to rule out confounding variables.
* Purpose: Useful when random assignment is impractical or unethical, but offers lower internal validity.
* Example: Studying the effect of online vs. in-person education by comparing students who self-selected into either format.
26
Q

Match the definitions with the terms:
* Test group
* Extraneous variable
* Test units / units of analysis
* Dependent variable
* Control group
* Independent variable

A

* Test group → The group in an experiment that receives the treatment or manipulation of the independent variable.
* Extraneous variable → Any variable other than the independent variable that may influence the dependent variable, potentially confounding results.
* Test units / units of analysis → The entities being studied or measured in an experiment, such as individuals, organizations, or products.
* Dependent variable → The outcome or effect that is measured in an experiment to assess the impact of the independent variable.
* Control group → The group in an experiment that does not receive the treatment or manipulation, serving as a baseline for comparison.
* Independent variable → The factor that is manipulated in an experiment to examine its effect on the dependent variable.
27
Q

Discuss the importance of reliability and validity in measurement

A

* Reliability ensures that a measurement produces consistent and stable results over time. Without reliability, findings may be random or inconsistent, reducing confidence in research conclusions.
* Validity determines whether a measurement accurately captures what it intends to measure. Even a reliable tool is useless if it does not measure the correct concept.
* Both are crucial for scientific accuracy, ensuring that research findings are credible, replicable, and applicable to real-world scenarios.
* Example: A bathroom scale that gives the same weight every time (reliable) but is consistently 5 kg off (not valid) highlights why both factors matter.
28
Q

Explain the differences between mean, median, mode

A

Mean (Arithmetic Average)
* Definition: The sum of all values divided by the number of values.
* Use: Best for symmetrical distributions where data is evenly spread.
* Sensitive to outliers.
* Example: The average income in a country.

Median (Middle Value)
* Definition: The middle value when data is sorted in ascending order.
* Use: Best for skewed distributions or when there are outliers.
* Less affected by extreme values than the mean.
* Example: The median house price, which better represents typical home values when a few extremely expensive houses exist.

Mode (Most Frequent Value)
* Definition: The most common value in a dataset.
* Use: Best for categorical data or finding the most common occurrence.
* Can be combined with the mean and median for additional insight.
* Example: The most frequently purchased shoe size in a store.
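The outlier sensitivity of the mean (and the robustness of the median) can be seen with Python's `statistics` module; the income figures below are invented for illustration:

```python
import statistics

incomes = [30_000, 32_000, 35_000, 35_000, 38_000, 40_000, 250_000]  # one extreme outlier

mean = statistics.mean(incomes)      # pulled upward by the outlier
median = statistics.median(incomes)  # robust middle value
mode = statistics.mode(incomes)      # most frequent value

print(mean, median, mode)
```

Here the single high earner drags the mean far above the median, which is why the median is preferred for skewed data such as incomes or house prices.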
29
Q

Which is a measure of central tendency?

A

A measure of central tendency is a statistical value that represents the center or typical value of a dataset. It summarizes a large set of data with a single number.

Three Main Measures:
1. Mean (Average): Sum of all values divided by the number of values.
  * Best for: Symmetrical distributions without extreme outliers.
  * Example: Average income of a group.
2. Median (Middle Value): The middle value when data is sorted in order.
  * Best for: Skewed distributions or datasets with outliers.
  * Example: Median home price in a city.
3. Mode (Most Frequent Value): The most commonly occurring value in a dataset.
  * Best for: Categorical data or distributions with repeated values.
  * Example: Most common shoe size sold in a store.
30
Q

How do you interpret p-values in hypothesis testing?

A

* Definition: A p-value represents the probability of obtaining an effect at least as extreme as the one observed in the sample, assuming the null hypothesis (H₀) is true.
* High p-values (> 0.05): Suggest that the sample results are consistent with the null hypothesis, meaning there is not enough evidence to reject H₀.
* Low p-values (≤ 0.05): Indicate that the sample results are unlikely under H₀, leading to rejection of the null hypothesis in favor of the alternative hypothesis (H₁).
* Misconception: A p-value does not measure the probability that H₀ is true, nor does it indicate the size or importance of an effect.
* Example: A medical study testing a new drug finds p = 0.03, meaning that if the drug had no real effect, there would be only a 3% chance of observing results at least as extreme as these due to random variation.
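A p-value calculation can be sketched in pure Python for the simple case of a two-tailed z-test with known population standard deviation; the drug-trial numbers below are invented:

```python
import math

def z_test_p_value(sample_mean, mu0, sigma, n):
    """Two-tailed p-value for a z-test with known population sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Standard normal CDF expressed via the error function
    phi = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))
    return 2 * (1 - phi(abs(z)))

# Hypothetical trial: H0 says mean recovery time is 10 days (sigma = 2),
# but our sample of 40 patients averaged 9.3 days.
p = z_test_p_value(sample_mean=9.3, mu0=10, sigma=2, n=40)
print(f"p = {p:.4f}")
```

Since p falls below 0.05 here, the sample is unlikely under H₀ and the null hypothesis would be rejected at the 5% level.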
31
Q

Match each research challenge with its appropriate solution:
* Measurement error
* Sample size issues
* Social desirability bias

A

* Measurement error → Use of multiple-item scales (ensures reliability and reduces random errors in measurement). A multiple-item scale is a measurement tool that uses several questions (items) to assess a single construct or concept. Instead of relying on just one question, multiple items capture different aspects of the construct, increasing reliability and reducing error.
* Sample size issues → Power analysis (helps determine the optimal sample size needed for statistical validity). Power analysis is a statistical technique used to determine the sample size required for a study.
* Social desirability bias → Anonymity (encourages honest responses by reducing fear of judgment).
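The reliability of a multiple-item scale is often summarized with Cronbach's alpha (not named in the card above, but a standard choice): alpha = k/(k−1) · (1 − Σvar(item)/var(total)). A minimal sketch on invented 5-point ratings:

```python
import statistics

# Hypothetical responses of 6 participants on a 3-item satisfaction scale (1-5)
items = [
    [4, 5, 3, 4, 2, 5],  # item 1
    [4, 4, 3, 5, 2, 4],  # item 2
    [5, 5, 2, 4, 3, 5],  # item 3
]

k = len(items)
totals = [sum(col) for col in zip(*items)]  # each participant's total score
item_vars = sum(statistics.variance(item) for item in items)
alpha = k / (k - 1) * (1 - item_vars / statistics.variance(totals))
print(f"Cronbach's alpha = {alpha:.2f}")
```

Values around 0.8 or higher are conventionally read as good internal consistency, suggesting the three items measure the same underlying construct.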
33
Q

Why is multicollinearity a concern in multiple regression analysis?

A

* Inflated standard errors & unstable coefficients → Hard to determine the true effect of each predictor.
* Reduced statistical significance → Important variables may appear insignificant due to inflated p-values.
* Interpretation issues → Overlapping predictor effects make it difficult to isolate individual impacts.

Detection & Solutions:
* Check the variance inflation factor (VIF) → If VIF > 10, multicollinearity is a problem.
* Examine the correlation matrix → Identify highly correlated predictors.
* Remove or combine variables → Use PCA or drop redundant variables.
* Standardize variables → Helps when using interaction terms.

Example: Predicting house prices using square footage and number of rooms, which are strongly correlated, may cause multicollinearity.
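In the special case of exactly two predictors, each predictor's VIF reduces to 1/(1 − r²), where r is their pairwise correlation. A quick pure-Python check using invented house data:

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in xs) *
                           sum((b - my) ** 2 for b in ys))

# Hypothetical houses: square footage and number of rooms move together
sqft  = [50, 65, 80, 95, 110, 140, 160, 200]
rooms = [2,  3,  3,  4,  4,   5,   6,   7]

r = pearson(sqft, rooms)
# With exactly two predictors, each VIF equals 1 / (1 - r^2)
vif = 1 / (1 - r ** 2)
print(f"r = {r:.3f}, VIF = {vif:.1f}")
```

The VIF lands far above the conventional cutoff of 10, signaling that one of the two predictors should be dropped or the pair combined.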
34
Q

Measures of variation

A

Range
Interquartile range
Variance
Standard deviation
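All four measures can be computed with Python's `statistics` module; the sample data is invented:

```python
import statistics

data = [4, 7, 7, 8, 10, 11, 12, 15, 18, 21]

value_range = max(data) - min(data)           # max minus min
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartiles (default 'exclusive' method)
iqr = q3 - q1                                 # interquartile range
var = statistics.variance(data)               # sample variance (n - 1 denominator)
sd = statistics.stdev(data)                   # sample standard deviation

print(value_range, iqr, round(var, 2), round(sd, 2))
```

Note that `variance`/`stdev` are the sample versions; `pvariance`/`pstdev` would give the population versions with an n denominator.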
35
Q

Distribution shape

A

Symmetrical
Right-skewed
Left-skewed
36
Q

Empirical rule

A

The Empirical Rule, also known as the 68-95-99.7 Rule, describes how data is distributed in a normal distribution (bell curve). It states that:
1. 68% of the data falls within one standard deviation (σ) of the mean (µ), i.e., between (µ - σ) and (µ + σ).
2. 95% of the data falls within two standard deviations of the mean, i.e., between (µ - 2σ) and (µ + 2σ).
3. 99.7% of the data falls within three standard deviations of the mean, i.e., between (µ - 3σ) and (µ + 3σ).

This rule helps estimate probabilities and understand how spread out data is in a normal distribution. The “68/95” shorthand refers to the percentages within 1 and 2 standard deviations, respectively.
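The three percentages follow directly from the normal CDF: the fraction of a normal distribution within k standard deviations of the mean is erf(k/√2), which can be checked in a couple of lines:

```python
import math

def within(k):
    """Exact fraction of a normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {within(k):.4%}")  # ~68%, ~95%, ~99.7%
```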
37
Q

Basic Assumptions of Multiple Regression

A

* No (perfect) multicollinearity: There should be no perfect linear relationship between two or more of the predictors.
➢ Variance inflation factor (VIF): A tool to identify multicollinearity. If a predictor’s VIF is over 10, it suggests strong multicollinearity, and the predictor may need removal.
* Normally distributed errors: The model’s residuals should follow a normal distribution centered around 0, meaning most errors are small and close to zero.
* Homoscedasticity: At each level of the predictor variable(s), the variance of the residual terms should be constant.
* Linearity: The parameters (coefficients) must enter the model linearly; the predictors themselves may be transformed in nonlinear ways without violating this assumption.
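The "residuals centered around 0" property can be verified on a toy example: with an intercept included, ordinary least squares residuals always average to zero by construction. A minimal single-predictor sketch on invented ad-spending data:

```python
import statistics

# Hypothetical data: ad spending (x) vs. sales (y)
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

# Closed-form OLS for a single predictor with intercept
mx, my = statistics.mean(x), statistics.mean(y)
b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
     / sum((xi - mx) ** 2 for xi in x))
a = my - b * mx

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
print(f"slope = {b:.3f}, mean residual = {statistics.mean(residuals):.2e}")
```

Checking normality and homoscedasticity, by contrast, requires inspecting the residuals themselves (e.g., residual plots), since they hold only approximately in real data.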