Exam Flashcards
(93 cards)
Components of the model and what do they do?
The prediction model combines two parts:
1. structural fundamental forecast of national popular vote (NPV) (prior)
2. state-level polling forecast (a model describing the distribution of the data: mathematical formulas that characterize the trends and spreads in the data, with parameter values acting as control knobs)
› model of differences between states (NPV –> SPV for each state) –> prior differences
› When we go from the prior (NPV) to the states' vote shares, we use state relative positions (based on previous elections) to capture the difference between states, so that not all states have the TFC forecast as their baseline
› national and state-level polls
› models for sampling and non-sampling error in polls
› model for state and national opinion changes during campaign
–> This is our correlation matrix. Here we use sociodemographics (how similar the states are) to share information from state-level polls
Three parameters that the model explores (‘samples’) using MCMC method
› fundamentals (its standard deviation – to reflect uncertainty in the national popular vote (NPV) forecast)
› potential temporal drift of the polls (different values for the rate and direction of temporal changes in public opinion during the campaign period)
› different types of polling bias (polling errors, such as: sampling error – variability due to the limited sample size of polls – and non-sampling error – systematic biases introduced by polling methodologies, e.g. house effects)
› outcome -> exploration of the posterior distribution
› After many iterations, the model converges on a posterior distribution that reflects the most likely parameter values (even very unlikely ones are included, just less often than likely scenarios) and provides probabilities for outcomes like state vote shares and Electoral College results.
› After many iterations => converging towards the ‘actual’ distribution
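The MCMC exploration described above can be sketched as a toy Metropolis sampler for a single parameter. The fundamentals prior, the "polls", and the jump size below are invented numbers for illustration, not the model's real inputs:

```python
import math
import random

random.seed(0)

# Toy Metropolis sampler for one parameter: the national vote share mu.
prior_mean, prior_sd = 0.50, 0.03      # fundamentals forecast acts as prior
polls = [0.52, 0.51, 0.53, 0.50]       # hypothetical poll results
poll_sd = 0.02                         # assumed sampling error per poll

def log_post(mu):
    lp = -0.5 * ((mu - prior_mean) / prior_sd) ** 2             # prior
    lp += sum(-0.5 * ((y - mu) / poll_sd) ** 2 for y in polls)  # likelihood
    return lp

draws, mu = [], prior_mean
for _ in range(20000):
    prop = mu + random.gauss(0, 0.01)  # random jump to a nearby value
    # accept in proportion to how probable the proposal is (prior + data)
    if random.random() < math.exp(min(0.0, log_post(prop) - log_post(mu))):
        mu = prop
    draws.append(mu)

# discard burn-in; what remains approximates the posterior distribution
post_mean = sum(draws[5000:]) / len(draws[5000:])
```

Unlikely values of mu are still visited, just less often than likely ones, which is exactly the "converging towards the actual distribution" idea on the card.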
The theoretical assumptions behind our Bayesian model (structural + polls)
The need for structural fundamentals: Economic voting
- polls can be problematic many months in advance - they only capture the moment
- Swing voters converge towards 'true' opinions closer to the election (economic voting of swing voters) based upon the state of the country. Structural fundamentals use the state of the country as a predictor
- They deliver information that isn't in the polls (Graefe 2018)
To include polling data and make it dynamic: The need for the Michigan Model
- It would be wasteful not to use the available information we have
- the fundamentals forecast is national-level only, and the scarcity of data makes it unreliable - Lauderdale & Linzer 2015. Graefe 2018 shows that it was the least reliable component from 2004 to 2016.
- We build a state covariance so we can share information from polls with similar states. Here we utilize the insights from the Michigan model and assume that similar states (based on similar sociodemographics) vote similarly and move in similar ways
How does the Bayesian approach handle uncertainty regarding polling?
We include models (that describe the data) for sampling and non-sampling error in polls. During the simulation,
the sampler randomly jumps to different parameter values, which are accepted according to how probable they are given the prior and the data –> i.e., how well they fit the data with the prior in mind.
By running thousands of simulations, the model generates a posterior distribution that accounts for the influence of polling errors on the forecast.
The probability of a candidate’s victory is calculated as the fraction of simulations they win, reflecting both likely and improbable scenarios (e.g., winning the Electoral College but losing the popular vote). This approach ensures comprehensive modeling of election uncertainties.
We also give the prior a high weight in the beginning, when the polls are noisier; voters' intentions become clearer towards the end.
Newer polls are weighted more than early polls
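The "fraction of simulations they win" calculation can be sketched in a few lines; the posterior is summarized here as a normal distribution with invented mean and spread:

```python
import random

random.seed(1)

# Win probability as the fraction of simulations won: draw 10,000
# hypothetical two-party vote shares from a posterior summarized as a
# normal distribution (mean and sd are made up for illustration).
sims = [random.gauss(0.52, 0.03) for _ in range(10000)]
p_win = sum(s > 0.5 for s in sims) / len(sims)
# improbable scenarios are counted too, just in fewer simulations
```

A 52% expected vote share thus maps to a win probability well above 52%, which is why forecasts report the two numbers separately.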
What kind of election? (AP votecast and Exit polls)
We saw an almost nationwide shift towards Trump –> could potentially be a realignment (though not uniform) - otherwise a balancing election
- especially looking at education
He improved in nearly all groupings - for example, he lost by less among urban citizens than in 2020
Significant inroads among Latino voters - improved among younger voters too
However, the education shift may already have happened in 2016 - Trump as an outlier?
What is wrong with the fundamentals model? Lauderdale & Linzer 2015
Three sources of uncertainty
1) Fundamentals models often fail to account for the full uncertainty in their predictions, such as the variability in coefficient estimates (the uncertainty of the point prediction). This leads to overly narrow confidence intervals - they should actually be around 7-10 points –> not very much information in close elections
2) There is no consensus on which variables (e.g., GDP growth, unemployment) are most predictive –> Different specifications can yield significantly different predictions, making results less reliable
- This is because of the limited evidence from past elections
3) Electoral College Dynamics: Fundamentals models typically focus on the national popular vote, overlooking the complexities of the Electoral College system, which aggregates state-level outcomes. This can result in incorrect predictions, particularly in close elections. –> for example you can win the popular vote but lose the electoral college
Data limitations: The models rely on a limited number of past elections (often fewer than 20), making it difficult to draw strong conclusions about patterns and relationships. –> limited strength of the evidence.
There is a significant change in political context from 1948 to 2024.
How much can data and trends going back to 1948 be used now, given the different context? This is an argument for, e.g., taking the term variable out.
Why is simulation used in Bayesian analysis?
To handle cases where analytical solutions are not possible, like when multiple parameters are involved.
What are the main points from The Bitter End?
Calcification –> is due to polarization
o Calcification means less willingness to defect from their party, such as by breaking with their party’s president or even voting for the opposite party.
o There is thus less chance for new and even dramatic events to change people’s choices at the ballot box.
o –> This means smaller fluctuations from year to year in election outcomes.
o However, the winner is not always the same party, because the two sides are so close in sheer numbers
o 2016-2020 calcified very small shifts
Not polarization as such –> polarization is a larger gap between the two parties in different areas. Affective polarization.
Three elements of the Trump presidency, the 2020 election and its aftermath whose upshot was more calcified politics
o (1) Long-term tectonic shifts have pushed the parties apart while making the views within each party more uniform –> gradually increasing partisan polarization.
o (2) Shorter-term shocks, catalyzed especially by Trump, have sped up polarization on identity issues
o Race, gender, ethnicity etc.
o (3) It is precisely these identity issues that voters in both parties care more about— exacerbating divisions even further and giving politicians every incentive to continue to play to them.
o They voted based on these divided issues
Party polarization explains changes in the candidates' coalitions. One consequence should be a stronger association between people's own ideological predispositions and their voting behavior - less so their sociodemographics?
o This increased polarization, as Trump’s performance among conservatives and Biden’s performance among liberals made Americans’ ideological identification a stronger predictor of how they voted in 2020
o The ideological polarization in voting behavior was more likely to come from a third source: voters changed their issue positions in ways that aligned with their partisanship.
Partisan perceptions of the economy and its impact on structural fundamentals? (Brady et al. 2022)
Partisans may exhibit motivated reasoning, attributing a good economy to their preferred party and a bad economy to the opposition.
Consensus: Incumbent Partisans are comparatively more optimistic about the economy when their party controls the executive branch than they are otherwise
The gap in economic perceptions approximately doubled between 1999 and 2020, and partisan economic perceptions no longer seem to converge during economic crises.
Attribution for good/bad economy influenced by partisanship –> Perceptual bias
- also who they give the credit to
An argument for the TFC model, in which the objective variables play less of a role
Problems with survey-based opinion (Linzer 2013)
Not every state is polled on every day, leading to gaps in the time series. Data are especially sparse in less-competitive states and early in the campaign.
Second, measured preferences fluctuate greatly from poll to poll, due to sampling variability and other sources of error.
We can mitigate these problems by pooling the polls (Jackman 2005)
Maybe Erickson & Wlezien 2008/2014
Synopsis questions: Why don’t we just use polls or the TFC model?
Not the TFC only –> it assumes states move in the same way from election to election… –> what about sociodemographic changes in a state and state-specific issues?
Polls –> Erickson & Wlezien
TFC –> uncertain data; it is static on its own and becomes more accurate with polls (Linzer 2013)
What are the five types of elections? (The american voter revisited)
- Maintaining: 1964 (LBJ)
- A maintaining election is one in which stable partisan attachments continue to be a major determinant of election results.
- Deviating: 1952 (Eisenhower)
- A deviating election would then be one in which short-term partisan attitudes lead to the election of the presidential nominee of the minority party, without a fundamental shift in the party identification balance in the nation.
- Re-instating: 1960 (JFK), 1976 (Carter)
- Re-alignment: 1930s (FDR), 1980s (Reagan - wwc, gender, white evangelicals)
- The realigning election is one in which the partisanship of people changes. Systematic change occurs when issues motivate social groups to move to one party and when change is reinforced through one's social groups.
- Definition: a significant shift in sociodemographic groups –> can lead to balancing or dominance
- Balancing: 2016, 2020
- One in which neither party has a majority in party identification.
- The partisan balance is so close that either party could win.
- This is not an election in which short-term forces work against the majority party, but one in which it is not possible to speak of a majority party.
The real story of the last 50-plus years of American presidential elections is a weakening of the Democratic lead in party identification to the extent that elections are very close and can be swung by any number of short-term matters.
The Linzer State-level model? (Linzer 2013)
Combines fundamentals and state-level polls.
› Uses a sequence of state-level preelection polls to estimate both current voter preferences and forecasts of the election outcome, for every state on every day of the campaign, regardless of whether a survey was conducted on that day –> shares information between states (simple model)
› Forecasts from the model gradually transition from being based upon historical factors early in the campaign to survey data closer to the election. It gives more weight to the data at the end, as it contains more information and seems more credible.
› In states where polling is infrequent, the model borrows strength hierarchically across both states and time, to estimate smoothed within-state trends in opinion between consecutive surveys.
› This is possible because the temporal patterns in state-level opinion are often similar across states, so you can look at the trend in a correlated state –> similar states vote similarly –> the sociodemographics argument
› Very simple –> we use a correlation matrix
› The model also filters away day-to-day variation in the polls due to sampling error and national campaign effects, which enables daily tracking of voter preferences toward the presidential candidates at the state and national levels.
SO: we get a proportion of voters in state i on day j who say they will vote D, based upon the fundamentals model and the polls. The polls give the proportion of voters in state i on day j who say they will vote D. When there are no polls, we use a state-level effect – the long-run dynamics of voter preferences in state i (here the prior is included) – and national effects – trends borrowed from other states (campaign effects). To anchor the vote share on election day we incorporate the fundamentals model (a normal prior distribution) and set a weight for the prior that determines how sensitive the forecast is to polling data. We use a Bayesian reverse random walk from election day back to the last day with polls, which becomes the prior distribution for that day.
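The reverse-random-walk idea can be sketched as follows: anchor at an election-day fundamentals prior, then step backwards one day at a time adding drift noise, so the implied prior for an earlier day is more uncertain. All numbers here are illustrative, not Linzer's actual values:

```python
import random

random.seed(2)

# Reverse random walk from the election-day prior back to the last polled day.
h = 0.50            # fundamentals forecast of vote share on election day
anchor_sd = 0.02    # uncertainty of the election-day prior
daily_sd = 0.003    # assumed day-to-day opinion drift
days_out = 60       # last polled day, counted back from election day

paths = []
for _ in range(5000):
    mu = random.gauss(h, anchor_sd)      # draw from the election-day prior
    for _ in range(days_out):
        mu += random.gauss(0, daily_sd)  # one reverse step per day
    paths.append(mu)

mean_60 = sum(paths) / len(paths)
var_60 = sum((p - mean_60) ** 2 for p in paths) / len(paths)
# variance grows from anchor_sd**2 to roughly anchor_sd**2 + days_out * daily_sd**2
```

The further back from election day, the wider the implied prior, which is why polls dominate late in the campaign while the fundamentals dominate early.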
What factors influence the Perry-Gallup likely voter index?
Seven questions covering the voter's past voting behavior, level of interest in the election, and self-reported likelihood of voting - points are then assigned based on the answers
Deterministic likely voter model
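A deterministic screen in the Perry-Gallup spirit can be sketched like this; the questions, point values, and cutoff below are invented for illustration, not Gallup's actual items:

```python
# Toy deterministic likely-voter screen: score the screening answers,
# then keep only respondents at or above a cutoff.
def is_likely_voter(answers, cutoff=5):
    score = 0
    score += 2 if answers.get("voted_last_election") else 0
    score += 2 if answers.get("very_interested") else 0
    score += 2 if answers.get("certain_to_vote") else 0
    score += 1 if answers.get("knows_polling_place") else 0
    return score >= cutoff   # hard cutoff: in or out, no weighting

habitual = {"voted_last_election": True, "very_interested": True,
            "certain_to_vote": True, "knows_polling_place": False}
doubtful = {"voted_last_election": False, "very_interested": True,
            "certain_to_vote": False, "knows_polling_place": True}
```

The hard cutoff is what makes the model deterministic: respondents below it are discarded entirely, unlike in the probabilistic models discussed later.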
Why do we pool the Polls? (Jackman 2005)
To create a more precise estimate of vote intentions. Individual polls are subject to a lot of noise, especially from house effects and bias. The chief benefit of pooling poll results (after correcting for house effects) is that we are much better positioned to ascertain movements in levels of voter support in response to campaign events.
Pooling also lowers the sampling error, and therefore the uncertainty, around the point estimate.
–> individual sample sizes are simply too small to reliably detect small fluctuations in support for the parties over the course of an election campaign.
Combine the information in the published polls, leveraging them against one another, so as to obtain a clearer picture of what might be going on in the electorate over the campaign
Individual polls are snapshots in time (rolling average – each poll not independent)
Dependent on the poll from the day before
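The precision gain from pooling can be shown with a precision-weighted average: each poll is weighted by the inverse of its sampling variance p(1-p)/n, so larger polls count more and the pooled estimate has a smaller standard error than any single poll. The poll numbers are hypothetical:

```python
import math

# Precision-weighted pooling of three hypothetical polls.
polls = [(0.52, 800), (0.50, 1200), (0.53, 600)]   # (share, sample size)

weights = [n / (p * (1 - p)) for p, n in polls]    # 1 / sampling variance
pooled = sum(w * p for w, (p, _) in zip(weights, polls)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

single_se = math.sqrt(0.52 * 0.48 / 800)           # SE of one poll alone
```

This is the static "simple averaging" benefit; Bayesian pooling adds the time dimension on top of it.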
Different ways of pooling:
* simple averages
§ Benefit of sample size
§ The challenge: the average will be very static –> one new poll doesn't change it much
§ Because every poll carries the same weight
* Bayesian pooling -> prior + new information (new polls)
§ Much more fast-moving
§ Gives weight to the polls to capture a better estimate
§ Time means a lot in the Bayesian model
§ Detects movement in voter support due to campaign events
Jackman USES Bayesian pooling -> prior + new information (new polls)
- Gives weight to the polls to capture a better estimate
- Detects movement in voter support due to campaign events
o Prediction:
- A recursive process
- Only the estimated state from the previous time step and the current measurement are needed to compute the estimate for the current state
- Continuously tries to predict the next point in the time series:
o A prediction is made
o Once the result is measured (i.e., new polling data comes in, or a new missile location is reported), we calculate how far off the prediction was
o Update the estimate of the new latent state, make a new prediction for the next observation
- Updates itself based on the data
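The predict-measure-update recursion above can be sketched as a minimal scalar Kalman-style filter; the variances and polls below are invented numbers:

```python
# Minimal scalar Kalman-style recursion for a daily polling average.
# Only the previous state and the newest poll are needed at each step.
def kalman_step(state, state_var, poll, poll_var, drift_var):
    # predict: opinion may have drifted since the last time step
    pred, pred_var = state, state_var + drift_var
    # update: blend prediction and the new poll by their precisions
    gain = pred_var / (pred_var + poll_var)
    new_state = pred + gain * (poll - pred)   # correction for the miss
    new_var = (1 - gain) * pred_var
    return new_state, new_var

state, var = 0.50, 0.02 ** 2                  # initial belief about support
for poll in [0.52, 0.51, 0.53]:               # hypothetical daily polls
    state, var = kalman_step(state, var, poll, 0.02 ** 2, 0.003 ** 2)
```

Each step needs only yesterday's estimate and today's poll, which is exactly the recursive property the card describes.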
We can't be sure of the house effects - we can account for them based on last time, but the firms work to correct them themselves –> Linzer does this. Otherwise we can include them in our Bayesian model with a prior.
However lack polling data in some states
Bad things with only using Fundamentals? (Linzer 2013)
They are subject to a large amount of uncertainty, as most historical forecasts are based on data from just 10 to 15 past elections, and many only generate national-level estimates of candidates' vote shares.
Moreover, in the event that an early forecast is in error, structural models contain no mechanism for updating predictions once new information becomes available closer to Election Day.
Preelection polls provide contextual information that can be used to correct potential errors in historical forecasts, increasing both their accuracy and their precision.
Polls conducted just before an election generate estimates that are very close to the eventual result, on average
What is the Time for Change Dynamic?
It is based on the hypothesis that voters attach a positive value to periodic alternation in power by the two major parties, and that regardless of the state of the economy and the popularity of the current president, when a party has held the White House for two or more terms, voters will be more likely to feel that it is time to give the opposing party an opportunity to govern than when a party has held the White House for only one term.
What are the two main types of polling, and how do they differ in methodology? (Bailey 2024)
- Probability-based Polling (Random Sampling):
Uses a sampling frame to identify a pool of respondents representative of the population –> for example landlines or address-based sampling.
Relies on interview modes like live calling or interactive voice response (IVR) to collect data –> can still be biased due to non-response bias.
- Non-probability-based Polling:
Does not rely on random sampling. Instead, respondents are recruited through convenience sampling, pre-existing panels, or quota (matching) sampling methods.
Often used in internet polling, which may introduce bias due to the lack of randomness in the sample selection.
Nonprobability samples face "one fundamental problem: There is no comprehensive sampling frame for the internet, no way to draw a national sample for which virtually everyone has a chance of being selected". In the random-sampling paradigm, the contact list is a random sample which will include people willing and unwilling to respond. For internet panels, the people contacted have already said they will respond to polls.
PROBABILITY IS THE BEST
But internet polling isn't doing that badly right now (Silver 2021)
Why Forecast? - benefits of forecasting (Gelman et al. 2020)
- First, the popularity of forecasts reflects revealed demand for such information.
- Second, by collecting and organizing relevant information, a forecast can help people and organizations make better decisions about their political and economic resources.
- If people are going to follow election night results online — and they do, by the millions — they ought to have the context to understand them (Cohn & Katz 2018)
- Third, the process of building — and evaluating — forecasts can allow scholars and political observers to better understand voters and their electoral preferences, which can help us understand and interpret the results of elections.
Derek slides:
1. There is demand
2. Can help to organize political and economic resources
3. Can help advance understanding of voters and their electoral preferences
What do Erickson & Wlezien (2008 and 2014) show about economic performance as a predictor?
Why do we use structural fundamentals?
They show the economic voting of swing-voters –> The economic perceptions are important for the vote
Voters converge towards ‘true’ opinions closer to election (economic voting of swing voters)
- here they use economic indicators to guide their vote, when they couldn't have cared less in April. This is because the campaign makes the economy salient.
In April (200 days before the election), perceived business conditions only moderately predict polling results.
By November (Election Day), there is a much stronger correlation between perceived economic conditions and the vote share for the incumbent party, showing how economic perceptions shape the final vote.
What is a probabilistic likely voter model vs. a deterministic one? And what is Rentch et al. 2019's insight?
A probabilistic model predicts each respondent’s probability of voting rather than assigning a strict cutoff. This approach is more flexible and accounts for uncertainty, often using logistic regression.
Deterministic has a cutoff
Probabilistic models offer clear benefits: each respondent is assigned an estimated probability that they will vote. This probability is then used as a weight - responses from those who are more likely to vote are weighted more heavily than responses from those who are unlikely to vote, but all are included in the election prediction.
Probabilistic approaches serve as a compromise between registered-voter and likely-voter methods; the preferences of all respondents are utilized only to an extent proportional to their assessed probability of actually voting.
Rentch et al. 2019 –> Propose a probabilistic likely-voter model that not only uses the typical items (such as those from the Perry-Gallup index), but also adds demographic information about respondents such as age, education, race, income, gender, and strength of partisanship. These demographics are correlated with turnout
Why use a sociodemographic probabilistic voter model?
- a likely-voter model that is probabilistic uses information from all respondents in the sample rather than discarding those that fail to meet a particular threshold.
- And a likely-voter model that makes use of demographic information for its predictions takes advantage of data that most pollsters collect anyway and which happen to be good predictors of turnout and overreporting.
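The probabilistic approach can be sketched as a logistic-regression-style score that turns screening items plus demographics into a turnout probability, which is then used as a weight instead of a cutoff. The coefficients and respondents below are invented:

```python
import math

# Sketch of a probabilistic likely-voter model with made-up coefficients.
def turnout_prob(voted_before, high_interest, age):
    z = -2.0 + 1.5 * voted_before + 1.0 * high_interest + 0.03 * age
    return 1 / (1 + math.exp(-z))   # logistic transform to a probability

respondents = [
    # (voted_before, high_interest, age, prefers_candidate_A)
    (1, 1, 65, 1),
    (0, 0, 22, 0),
    (1, 0, 40, 1),
]

num = sum(turnout_prob(v, i, a) * pref for v, i, a, pref in respondents)
den = sum(turnout_prob(v, i, a) for v, i, a, _ in respondents)
weighted_share = num / den   # everyone counts, in proportion to p(vote)
unweighted_share = sum(pref for *_, pref in respondents) / len(respondents)
```

No respondent is discarded; the low-propensity respondent simply counts for less, which is the compromise between registered-voter and likely-voter methods described above.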
Key steps/core idea of Bayesian analysis
Key steps:
Identify data – For us this is polling
Define a descriptive model for the data - e.g., models for polling bias at the state and national level.
- This includes choosing the appropriate distribution (e.g., normal distribution) and defining its parameters (e.g., mean and standard deviation).
Assign a prior probability distribution to the parameters of your model. This represents your initial beliefs or expectations before seeing the data.
–> for example Our Fundamentals model
Use Bayesian inference to re-allocate credibility across parameter values.
o Combine your prior beliefs with the observed data to calculate the posterior distribution (updated belief)
o It is essentially a compromise
When new data arrives (like a poll), you evaluate how consistent the data is with your prior belief
Depending on how strong your prior is, the posterior will be a compromise between the prior and the data
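The prior-data compromise can be made concrete with a conjugate normal-normal update, where the posterior mean is a precision-weighted average of the prior (e.g. fundamentals) and the data (e.g. a poll). All inputs here are hypothetical:

```python
# Conjugate normal-normal update: posterior mean as a precision-weighted
# compromise between prior belief and new data.
def update(prior_mean, prior_sd, data_mean, data_sd):
    w_prior = 1 / prior_sd ** 2
    w_data = 1 / data_sd ** 2
    post_mean = (w_prior * prior_mean + w_data * data_mean) / (w_prior + w_data)
    post_sd = (w_prior + w_data) ** -0.5
    return post_mean, post_sd

# strong prior: the posterior stays close to the prior
strong, _ = update(0.50, 0.01, 0.54, 0.03)
# weak prior: the posterior moves towards the poll
weak, _ = update(0.50, 0.05, 0.54, 0.03)
```

The tighter the prior, the less a single poll moves the posterior, which is the "compromise" the card describes.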
Downside to forecasting? (Victor 2021)
- Partisan Polarization Perverts the Fundamentals –> Partisan polarization introduces systematic bias into models since partisans view fundamentals (like economic status or presidential approval) differently depending on their political affiliation. This can reduce the reliability of forecasts.
- Forecast may affect turnout - greater emphasis on forecasts in 2016 cycle increased voters’ certainty about the election outcome and therefore depressed turnout
- false sense of certainty –> forecasts overstate certainty or fail to report it
- they misguide people
Make better visualisations –> for example the needle (the NY Times article)
What is probability-based polling vs. Non-probability based polling? (Bailey 2024)
Probability-based sampling – previously the gold standard –> randomly drawn
- Sampling frame –> how to identify the pool of respondents that is representative. Before: random digit dialing. Some probability-based pollsters now use address-based sampling (ABS) or registration-based sampling (RBS). Can still be biased if, e.g., only old people have landlines.
- Interview mode: live-calling respondents, interactive voice response, web-based address sampling and face-to-face (e.g., the ANES)
Both non contact and nonresponse
Non-probability-based sampling – opt-in –> unrepresentative, NON-random
- Most non-probabilistic polls are conducted via the internet or text messages –> based on opt-in behaviour
- Convenience sampling, internet panels and quota sampling (matching) –> risk of professional respondents
- No response rate - unrepresentative
Regarding representation and nonresponse –> they use weights. Both have problems with nonignorable nonresponse.
- weighting does not solve – and could potentially exacerbate – nonignorable nonresponse bias.
What is important about the 538 model?
Correlation matrix, fundamentals and what do they sample using the MCMC
Polling data:
- Allows movement to be correlated between states in addition to between a state and the nation as a whole. Lets polls in one state influence similar states – the trend and movement in one state's polls is shared with other states, where the influence is based on similarity.
Three factors of similarity: the first is how similarly they have voted in presidential elections since 1948 (we only use 1998-2020), then geographic similarity (10 political regions) and demographic similarity between states
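A state-similarity matrix of this kind can be sketched from standardized features (stand-ins for past vote, region, demographics). The feature values below are made up, and similarity is taken here as 1 / (1 + euclidean distance), which is only one of many possible choices:

```python
# Toy state-similarity matrix from invented feature vectors.
states = {
    # name: (past_dem_share, pct_college, pct_urban) -- made-up numbers
    "WI": (0.50, 0.30, 0.67),
    "MI": (0.51, 0.31, 0.74),
    "AL": (0.36, 0.26, 0.59),
}

def similarity(a, b):
    d = sum((x - y) ** 2 for x, y in zip(states[a], states[b])) ** 0.5
    return 1 / (1 + d)   # in (0, 1]; higher means more similar

sim = {(a, b): similarity(a, b) for a in states for b in states}
# information from a WI poll would flow more strongly to MI than to AL
```

In the full model the resulting matrix is what lets a poll movement in one state shift the estimates for its most similar neighbours.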
The fundamentals model:
- Economic and political fundamentals
11 economic variables and primarily the TFC model, where the model decreases the effect of presidential approval and economic growth in more polarized elections. State-level factors are also included, e.g., previous state vote and the candidates' home states.
Three parameters that the model explores (‘samples’) using MCMC method
› fundamentals (its standard deviation – to reflect uncertainty in the national popular vote (NPV) forecast)
› potential temporal drift of the polls (different values for the rate and direction of temporal changes in public opinion during the campaign period, i.e. how much the polling averages change)
› different types of polling bias (polling errors, such as: sampling error – variability due to the limited sample size of polls – and non-sampling error – systematic biases introduced by polling methodologies, e.g. house effects)
› outcome -> exploration of the posterior distribution
› After many iterations, the model converges on a posterior distribution that reflects the most likely parameter values (even very unlikely ones are included, just less often than likely scenarios) and provides probabilities for outcomes like state vote shares and Electoral College results.