Statistics and Census Flashcards

(39 cards)

1
Q

Research Assessment Steps

A
  1. Define problem
  2. Specify the boundaries to the problem
  3. Develop a fact base
  4. List goals and objectives
  5. Identify the range of solutions
  6. Define potential costs and benefits
  7. Review the problem statement
2
Q

“The process of studying a procedure or business to identify its goals and
purposes and create procedures that will efficiently achieve them.”

A problem-solving technique that breaks a system down into its component
pieces to study how well those parts work and interact to accomplish their purpose.

A

Systems Analysis

3
Q

Used to compare the means of two sets of observed data and to judge to what extent the difference between them could have arisen by chance.

Only two sets of data can be compared.

A

T-test
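
As a rough sketch of what the test computes, the snippet below calculates a t statistic for two samples using only the standard library. The function name `welch_t` and the sample values are illustrative assumptions, and Welch's form (which does not assume equal variances) is only one of several t-test variants. Turning the statistic into a probability would additionally require the t-distribution.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic: difference in means divided by its standard error."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

# Hypothetical observations for two groups
group_a = [2, 4, 6]
group_b = [1, 2, 3]
t_stat = welch_t(group_a, group_b)
```

A larger absolute value of `t_stat` means the gap between the two means is large relative to the spread within the samples.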

4
Q

The main difference is that a t-test is used for small sample sizes (n < 30) or when the population variance is unknown, and it uses the t-distribution. A Z-test is used for large sample sizes (n > 30) with known population variance and relies on the normal distribution.

A

Z-test

5
Q

Sampling error vs. sampling bias?

A
  • Sampling bias: the sample is collected in such a way that some members of the intended population have a lower or higher probability of being sampled than others, which results in a biased sample.
  • Sampling error: the difference between the sample statistic and the population parameter.
6
Q

Each individual has an equal chance of being
selected for the sample.

A

Simple random sampling

7
Q

Every Xth individual is selected from the list, starting at a randomly chosen point

A

Systematic sampling
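
A minimal sketch of the "every Xth individual" idea, assuming the sampling frame is an ordered Python list. The function name `systematic_sample`, the `step` parameter, and the household list are hypothetical.

```python
import random

def systematic_sample(items, step, rng=random):
    """Take every `step`-th item, starting at a random offset in [0, step)."""
    start = rng.randrange(step)
    return items[start::step]

# Hypothetical sampling frame of 100 household IDs
households = list(range(100))
sample = systematic_sample(households, step=10)
```

Only the starting point is random; every later pick is fixed by the step, which is what distinguishes this from simple random sampling.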

8
Q

The population is divided into 2 or more groups, and a random sample is drawn from each; provides the best results because it ensures even coverage of the population while maintaining random selection probabilities
* can be disproportional, when sampling is not proportional to each group's percentage of the population

A

Stratified sampling
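
The proportional case can be sketched as below, assuming each group is held as a list of member IDs. The names `stratified_sample`, `strata`, and `fraction`, and the renter/owner split, are hypothetical.

```python
import random

def stratified_sample(strata, fraction, rng=random):
    """Draw the same fraction at random from every group (proportional sampling)."""
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 80 renters and 20 owners
strata = {"renters": list(range(80)), "owners": list(range(80, 100))}
picked = stratified_sample(strata, fraction=0.1)
```

Using a different fraction for each group would give the disproportional variant.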

9
Q

Population is divided into smaller geographic units, such as neighborhoods within a city or blocks within a district; the sample consists of a random selection of those units, and all individuals within the selected units are sampled

A

Cluster sampling

10
Q

assignment of numbers or symbols for the purpose of designating subclasses that represent unique characteristics:
1. renaming (social security/uniform numbers)
2. categorical (male, female)

Weakest level of measurement

A

Nominal scale

11
Q

type of statistical distribution where the data points are clustered more toward the lower side of the scale, and there are very few higher scores, resulting in a longer tail extending towards the right side of the distribution

A

Skewed right

12
Q

negatively skewed, shows a distribution where the majority of data points are clustered on the right side, and the tail of the distribution extends towards the left. This means that the smaller values (the “left tail”) are less frequent than the larger values.

A

Skewed left

13
Q

a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value.

(in squared units)

A

Variance

Squaring emphasizes deviations from the mean, making the measure more sensitive to large deviations. This can be helpful for certain kinds of statistical inference.

14
Q

The square root of the variance; roughly the typical distance of values from the mean, expressed in the original units

A

Standard deviation
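
To make the units point concrete, here is a small sketch using Python's standard `statistics` module. The observations are made up, and the population forms (`pvariance`, `pstdev`) are used on the assumption that the list is the whole population rather than a sample.

```python
from statistics import pvariance, pstdev

# Hypothetical observations (for example, household sizes)
values = [2, 4, 4, 4, 5, 5, 7, 9]

var = pvariance(values)  # mean of squared deviations, in squared units
sd = pstdev(values)      # square root of the variance, in original units
```

For sample data, `statistics.variance` and `statistics.stdev` divide by n - 1 instead of n.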

15
Q

Linear regression vs. t test?

A

t-test: significance of the difference between 2 datasets
linear regression: extent to which an independent variable influences a dependent variable
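
A minimal sketch of the regression side, fitting a straight line by ordinary least squares with one independent variable. The function name `fit_line` and the data points are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread = sum((x - mx) ** 2 for x in xs)
    slope = covariance / spread
    return slope, my - slope * mx

# Hypothetical data: x is the independent variable, y the dependent one
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
```

The slope is the estimated change in the dependent variable for a one-unit change in the independent variable.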

16
Q

Hierarchy of Census Data

17
Q

method of estimating future population size by analyzing data points that indirectly reflect population changes, such as school enrollments, voter registrations, utility connections, or housing permits, rather than relying solely on direct population counts; this data is then used to project future population trends and inform planning decisions based on these indicators.

A

Symptomatic projections

18
Q

You have population projections for the county/region but you want projections for your local community; you know that historically, your municipality's population has been 10% of the county population, so you “step down” those numbers

A

Stepdown/Ratio method:
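
The arithmetic can be sketched as follows, assuming the county projections are already available. The county figures and the name `stepdown` are hypothetical; the 10% share comes from the card's example.

```python
def stepdown(county_projections, local_share):
    """Apply a fixed historical share to each projected county population."""
    return {year: pop * local_share for year, pop in county_projections.items()}

# Hypothetical county projections by year
county = {2030: 200_000, 2040: 220_000}
local = stepdown(county, local_share=0.10)
```

The method is only as good as the assumption that the local share stays constant.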

19
Q

Uses births, deaths and net migration to estimate population projections

Most complex and used for census and pyramids
Net migration is the most difficult to predict

A

Cohort-Component Method:

20
Q

Ratio that measures the number of people who are not in the labor force compared to the number of people who are. It’s used to measure the strain on the working population.
A good ratio is low, meaning there are enough working-age people to support the dependent population

A

Dependency rate:
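
A short sketch of the calculation as the card defines it. The counts are hypothetical, and expressing the result per 100 working people is a common convention rather than something the card specifies.

```python
def dependency_ratio(dependents, workers):
    """Number of dependents per 100 people in the labor force."""
    return 100 * dependents / workers

# Hypothetical community: 40,000 dependents and 80,000 workers
ratio = dependency_ratio(dependents=40_000, workers=80_000)
```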

21
Q

2010 Census: Fastest growing state, state with largest overall population increase, fastest growing muni

A

Palm Coast, FL was the fastest growing between 2000 and 2010; it grew by 92%

Nevada was the fastest growing state, and Texas had the largest overall population increase

22
Q

2020 Census: Fastest growing state, state with largest overall population increase, fastest growing muni

A

The Villages in FL grew the fastest

Utah is fastest growing state, TX has largest numeric increase

23
Q

Sizes of census geographies (tracts, block groups, blocks and public use microdata area)

A

Metropolitan Statistical Area: one city with 50,000 or more
Consolidated MSA (CMSA): metro statistical area with 1M or more (18 CMSAs)
Urbanized area/urban cluster: densely settled areas with a population of 50,000; core block groups/blocks with at least 1,000 people per square mile delineate the urban core
Census tracts: generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people
Block group (smallest for Summary File 3 and 4): 300 to 3,000
Census blocks: the smallest unit of 100% tabulation data; average size is 100 people

24
Q
  • Assumes growth will occur at a constant rate
  • Normally accurate for short projection periods; often overestimates population in the long term

graphed as concave ascending curve

A

Exponential Curve
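
As an illustration of the constant-rate assumption, the sketch below compounds a fixed growth rate. The base population, the 2% rate, and the name `exponential_projection` are hypothetical.

```python
def exponential_projection(base_pop, rate, periods):
    """Population after `periods` steps of growth at a constant rate."""
    return base_pop * (1 + rate) ** periods

# Hypothetical town of 10,000 growing 2% per period
pops = [exponential_projection(10_000, 0.02, t) for t in range(4)]
```

Because each period's increase is larger than the last, long projections can run well above what an area can actually hold.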

25
Q
  • Upper limit of population is defined
  • Assumes a declining rate or percentage of growth as the upper capacity is approached
  • The further out the projection, the closer to buildout

graphed as convex, ascending curve

A

Modified exponential curve

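
One common way to write this curve is P(t) = K - (K - P0) * b^t with 0 < b < 1, where K is the upper limit. The sketch below uses that form; the parameter names and values are hypothetical.

```python
def modified_exponential(ceiling, base_pop, retention, periods):
    """The gap to the ceiling shrinks by the factor `retention` each period."""
    return ceiling - (ceiling - base_pop) * retention ** periods

# Hypothetical buildout of 50,000, starting from 20,000, gap halving each period
pops = [modified_exponential(50_000, 20_000, 0.5, t) for t in range(3)]
```

Each period adds less than the one before, so the projection levels off toward the ceiling instead of overshooting it.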
26
Q

Assumes growth begins slowly and gains momentum until it reaches the inflection point, then slows, with increments at continuously decreasing rates

A

Gompertz Growth Curve

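
The curve is often written as P(t) = K * a^(b^t) with 0 < a < 1 and 0 < b < 1, where K is the upper limit. The sketch below assumes that form; the parameter values are hypothetical.

```python
def gompertz(ceiling, a, b, t):
    """Gompertz curve: starts at ceiling * a and approaches the ceiling."""
    return ceiling * a ** (b ** t)

# Hypothetical ceiling of 100,000 with a = 0.1 and b = 0.5
pops = [gompertz(100_000, 0.1, 0.5, t) for t in range(6)]
increments = [later - earlier for earlier, later in zip(pops, pops[1:])]
```

With these values the increments rise at first and then shrink, which is the S-shaped pattern the card describes.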
27
Q

Assumes growth is predictable based upon past trends in a different area

A

Comparative Method

28
Q

Assumes the relationship between the local and larger area will remain constant

A

Ratio Method (Shift Share)

29
Q

Population change is a function of natural increase (births minus deaths) and net migration
Population cohorts are typically 5-year ranges
Visualized in a population pyramid model

A

Cohort-Component Method

Cohort survival is a specific part of the cohort-component method, focusing on how a cohort (a group of people born in the same year) survives over time due to mortality.

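
A heavily simplified, single-step sketch of the idea: each cohort ages into the next bracket after applying a survival rate and net migration, and births fill the youngest bracket. All names and numbers are hypothetical, and dropping the oldest cohort is a shortcut made here to keep the example short.

```python
def project_one_period(cohorts, survival, net_migration, births):
    """Age every cohort forward one bracket; births become the youngest cohort.

    `survival` and `net_migration` have one entry per cohort that ages forward.
    Simplification: the oldest cohort is dropped rather than kept open-ended.
    """
    aged = [births]
    for i in range(len(cohorts) - 1):
        aged.append(cohorts[i] * survival[i] + net_migration[i])
    return aged

# Hypothetical three-cohort population
current = [1000, 900, 800]
future = project_one_period(current, survival=[0.99, 0.98],
                            net_migration=[10, -5], births=950)
```

A full model would repeat this step for each projection period and usually track males and females separately.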
30
Q

Constant Share vs. Shift-Share

A

Constant-Share: assumes that the local economy will grow at the same rate as the larger (reference) economy
Shift-Share: recognizes that the local economy rarely maintains a constant share of the reference economy
  • Modifies the "constant-share" method by adding a "shift" term to the formula that accounts for the difference between the local and reference areas' growth rates
  • The "shift" term is equal to the difference between an industry's local growth rate and that of the reference economy for a specified period
  • Assumes that the difference between the local and reference economy growth rates will remain constant over time

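
The two approaches can be compared with a simplified sketch. The function names, the employment figure, and the growth rates are hypothetical, and the formula is a reduced form of the idea described in the card rather than a complete shift-share model.

```python
def constant_share(local_base, ref_growth):
    """Local economy grows at the reference economy's projected rate."""
    return local_base * (1 + ref_growth)

def shift_share(local_base, ref_growth, past_local_growth, past_ref_growth):
    """Add a shift term: the past gap between local and reference growth rates."""
    shift = past_local_growth - past_ref_growth
    return local_base * (1 + ref_growth + shift)

# Hypothetical industry with 1,000 local jobs
jobs_constant = constant_share(1_000, ref_growth=0.10)
jobs_shift = shift_share(1_000, ref_growth=0.10,
                         past_local_growth=0.08, past_ref_growth=0.05)
```

Here the industry grew 3 percentage points faster locally in the past period, so the shift-share projection is higher than the constant-share one.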
31
Q

Selected from a population that is divided into multiple groups or classes; provides the best results because it ensures even coverage of the population but maintains the random selection probability

A

Stratified Sample
Disproportional: when sampling is stratified but not proportional to the

32
Q

Cohort-Component Method

A
  • Estimates are calculated for current population levels
  • Projections are calculated for future population levels
  • Forecasts are subjective and apply only to selected projections
  • Migration is the movement of people into and out of a given study area
  • Birth Rate is the total number of babies born per 1000 females in their childbearing years (typically 15-40)
  • Death Rate is the total number of deaths per 1000 people in the total population

33
Q

Has a population of at least 100,000 people and includes at least one city that has a population of at least 50,000 people or an urbanized area

A

Metropolitan Statistical Area

34
Q

Assumes that the population grows by equal absolute increments per year, decade, or other unit of time. It also assumes that growth will follow a similar pattern in future years.
When best to use: when the pattern of growth is similar to a straight line; it is especially useful when projecting areas experiencing slow growth (like rural areas) or decline

A

Linear projection

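
A minimal sketch of equal absolute increments, using a declining area as the example. The name `linear_projection` and the figures are hypothetical.

```python
def linear_projection(base_pop, change_per_year, years):
    """Add (or subtract) the same absolute amount every year."""
    return base_pop + change_per_year * years

# Hypothetical rural area of 5,000 losing 50 residents a year
pop_in_10_years = linear_projection(5_000, change_per_year=-50, years=10)
```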
35
Q

Used to collect more detailed information from approximately one in six households. In addition to all of the 100-percent data, the ----- questionnaire for Census 2000 collected sample data on the social and economic characteristics of the population and the physical and financial characteristics of housing.

A

Long form census

36
Q

Dissimilarity index

A

Demographic measure of the evenness with which two groups are distributed across the component geographic areas that make up a larger area. A group is evenly distributed when each geographic unit has the same percentage of group members as the total population.

The index score can also be interpreted as the percentage of one of the two groups included in the calculation that would have to move to different geographic areas in order to produce a distribution that matches that of the larger area.

The index of dissimilarity can be used as a measure of segregation. A score of zero (0%) reflects a fully integrated environment; a score of 1 (100%) reflects full segregation.

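
The index is commonly computed as half the sum, over all geographic units, of the absolute difference between each unit's share of group A and its share of group B. The sketch below assumes that formula; the tract counts and the name `dissimilarity_index` are hypothetical.

```python
def dissimilarity_index(group_a, group_b):
    """Half the summed absolute difference in each area's share of the two groups."""
    total_a, total_b = sum(group_a), sum(group_b)
    return 0.5 * sum(abs(a / total_a - b / total_b)
                     for a, b in zip(group_a, group_b))

# Hypothetical counts of two groups in two census tracts
index = dissimilarity_index(group_a=[80, 20], group_b=[20, 80])
```

Identical shares in every area give 0, and groups that never share an area give 1.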
37
Q

A properly selected statistical sample of a large population will always

A

Provide a mathematical estimate of the accuracy of the calculated population characteristics.

38
Q

Descriptive vs. inferential

A

Descriptive statistics state facts and proven outcomes from a population, whereas inferential statistics analyze samples to make predictions about larger populations

39
Q

Computer-accessible files containing records for a sample of housing units, with information on the characteristics of each housing unit and the people in it. The geographic areas used to report them are based primarily upon counties, and may be whole counties, groups of counties, or places (at least 100,000 people)

A

Public Use Microdata Samples (PUMS), reported for Public Use Microdata Areas (PUMAs)