Empirical 09.09.24 Flashcards
Lenze/Varsamou (39 cards)
Kinds of Sampling and reasons to do sampling
- Probability sampling
- Simple random sampling
- Stratified sampling
- Cluster sampling
- (Multistage sampling)
- Nonprobability sampling
Especially appropriate in qualitative research methods, where only a few individual cases are analysed.
- Typical sampling
- Extreme case sampling
- Concentration sampling
- Quota sampling
Reasons to do sampling:
- Cost saving
▪ Collecting a smaller number of elements is cheaper than conducting a census (complete enumeration).
- Time saving
▪ Data collection in the case of sampling takes less time than a census.
- A census or complete enumeration is practically impossible
▪ It is theoretically imaginable, but practically not reasonable or possible.
What is sample mean?
The sample mean is the arithmetic average of the values in a sample. When people use the word ‘average’ in everyday conversation, they are usually referring to the mean. Our sample mean should be close to the population mean.
Factors to calculate sample size
Population size, parameter variation, degree of accuracy, and degree of confidence
- Take into account any practical constraints such as budget, time, or feasibility of collecting data. Sometimes you may need to balance statistical considerations with these constraints.
- If necessary, adjust the sample size based on factors like the complexity of the analysis, potential non-response rates, or the need for subgroup analysis.
- Validate Sample Size: After collecting data, validate whether your sample size was sufficient to provide reliable estimates.
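One way to see how the four factors interact is a small calculation. The sketch below assumes Cochran's formula for a proportion with a finite population correction; that formula is not named on the card and is only one common choice. The function name `sample_size` and its default values are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(population, margin_of_error, confidence=0.95, proportion=0.5):
    """Estimate a sample size (assumed method: Cochran's formula with
    finite population correction).

    population      -> population size
    proportion      -> parameter variation (0.5 is the most cautious guess)
    margin_of_error -> degree of accuracy
    confidence      -> degree of confidence
    """
    # z-score that belongs to the chosen two-sided confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # sample size for a very large population
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # shrink it for a finite population
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


print(sample_size(10_000, 0.05))  # population 10,000, +/-5 points, 95% confidence
```

Under these assumptions a population of 10,000 needs about 370 cases; a smaller margin of error or a higher confidence level raises that number.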
Degree of Accuracy
How accurately our sample represents the population. We typically use the terms biased and unbiased to describe the accuracy of sample statistics.
Arbitrary Sampling
The researcher selects individuals or units of study without following any specific system, because they are available, convenient, and represent some characteristic the researcher seeks to study, e.g. street surveys.
- Representativity is problematic.
Simple Random Sampling
Most basic kind of probability sampling design. Each case in the population has a known and equal probability of being selected for the sample. Goal: construct a sample that is like the population so that we can use what we learn about the sample to generalize to the population.
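A minimal sketch of the idea, assuming the whole population is available as a list (a sampling frame). The function name `simple_random_sample` and the numbered population are hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct cases; every case has the same chance of selection."""
    rng = random.Random(seed)  # seed only makes the draw repeatable
    return rng.sample(population, n)


population = list(range(1, 101))  # hypothetical population of 100 numbered cases
print(simple_random_sample(population, 10, seed=1))
```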
Cluster Sampling
A cluster is a spatiotemporally defined conglomeration of elements of the population that forms a structurally reduced representation of that population.
Example: GMF experience – time and space – Bonn, June 2024. Population – media people from all over the world, from different regions, genders, ages, etc.
Typical clusters: households, school classes, apartment buildings. Elements included: persons, pupils, households.
Procedure: you choose a number of clusters by random sampling and analyse all units of study that occur in these clusters.
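The procedure can be sketched as follows, assuming the clusters are stored as a dictionary from cluster name to its elements. The function name `cluster_sample` and the school-class data are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # random choice of clusters
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical school classes (clusters) with their pupils (elements)
classes = {
    "class_a": ["Anna", "Ben", "Cem"],
    "class_b": ["Dana", "Emil"],
    "class_c": ["Finn", "Gul", "Hana", "Ivo"],
    "class_d": ["Jana", "Kai"],
}
print(cluster_sample(classes, 2, seed=3))
```

Randomness enters only when the clusters are chosen; inside a chosen cluster nothing is sampled.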
Stratified Sampling
Divides the population into smaller groups, or strata, based on shared characteristics. The division should be made in such a way that the sub-samples are still representative.
Each stratum should be representative of a certain group of the population.
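A small sketch of the idea, assuming proportional allocation (the same sampling fraction in every stratum). The card does not say how many cases to draw per stratum, so that choice is an assumption. The names `stratified_sample` and `stratum_of` and the urban/rural data are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Group units into strata, then draw a random sample inside each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # assumed rule: same fraction per stratum, at least one case
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 80 urban and 20 rural respondents
units = [("urban", i) for i in range(80)] + [("rural", i) for i in range(20)]
sample = stratified_sample(units, lambda u: u[0], 0.1, seed=0)
print(len(sample))
```

With a fraction of 0.1 the sample keeps the 80:20 split of the population (8 urban and 2 rural cases).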
Nonprobability Sampling
(Concentration Principle)
The researcher focuses on the part of the population where they SUSPECT the predominant part of these elements to be.
− Example: an investigation of German skiers.
− If 95% of all German skiers live in Bavaria, the researcher draws a random sample only from the Bavarian population.
Cut-off procedure = the less productive or less rich part of the study population is cut off.
Nonprobability (Quota principle)
You select individuals or units of study according to some fixed quota (e.g. male, above 50).
▪ Units of study are selected based on pre-specified characteristics so that the total sample has the same distribution of characteristics assumed to exist in the population that is studied.
▪ Quotas generally rely on demographic characteristics.
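The quota principle can be sketched as filling fixed slots in order of arrival. The function name `quota_sample`, the categories, and the passers-by below are hypothetical.

```python
def quota_sample(candidates, category_of, quotas):
    """Accept candidates as they come until each category's quota is full.

    No random selection takes place, which is why this is a
    nonprobability design.
    """
    remaining = dict(quotas)
    sample = []
    for person in candidates:
        category = category_of(person)
        if remaining.get(category, 0) > 0:
            sample.append(person)
            remaining[category] -= 1
    return sample


# Hypothetical passers-by as (name, gender, age)
people = [
    ("P1", "male", 62), ("P2", "female", 34), ("P3", "male", 55),
    ("P4", "male", 71), ("P5", "female", 58), ("P6", "male", 29),
]


def category(person):
    _, gender, age = person
    return (gender, "above 50" if age > 50 else "50 or below")


quotas = {("male", "above 50"): 2, ("female", "above 50"): 1}
print(quota_sample(people, category, quotas))
```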
Sample Drop-outs
Sample drop-out refers to all cases where an element of a sample could not be analysed. We distinguish between random and systematic drop-outs.
Random drop-out:
Relocation, illness…
Systematic drop-out:
Persons who deliberately refuse to take part in the survey, e.g. highly educated people, people with limited language skills, etc.
Sampling Variation: Problems with Sampling
Sampling variation is how much the results change when you take different samples from a group. Because of this, there is always some uncertainty when you try to make conclusions about the whole group based on those samples.
Sampling variation is about how much a number you find might change when you look at different samples of things.
If you are measuring something in different samples, you might notice that the numbers you get can be quite different from one group to another.
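A short simulation can make this visible. It assumes a made-up population of the numbers 1 to 1000 and draws many samples of two different sizes; the function name `sample_means` is hypothetical.

```python
import random
from statistics import mean, stdev


def sample_means(population, n, repeats, seed=0):
    """Draw `repeats` random samples of size n and return each sample's mean."""
    rng = random.Random(seed)
    return [mean(rng.sample(population, n)) for _ in range(repeats)]


population = list(range(1, 1001))  # hypothetical population, true mean 500.5
small = sample_means(population, 10, 200)
large = sample_means(population, 200, 200)

# The spread of the sample means is the sampling variation
print("spread with n=10: ", round(stdev(small), 1))
print("spread with n=200:", round(stdev(large), 1))
```

The means of the small samples scatter much more widely around the population mean than the means of the large samples.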
Variability and Sampling Error
A closely related term (almost a synonym) is sampling error. An error in sampling isn’t a mistake; it’s a measure of how much a sample value differs from the “true” population value.
Hypothesis testing
A hypothesis is a proposed explanation for a phenomenon: a statement about something that is supposed to be true. The logic of a hypothesis test is to compare two statistical data sets.
A hypothesis test involves two hypotheses:
* the null hypothesis
* and the alternative hypothesis
The null hypothesis assumes there is no difference between two groups (e.g. “Light color has no effect on plant growth”). The researcher tries to disprove or nullify it.
The alternative hypothesis assumes there is a difference (e.g. “Light color affects plant growth”). The researcher tries to find evidence for this type of hypothesis.
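The comparison of two data sets can be sketched with an exact permutation test. This is one possible test, chosen here because it needs no extra libraries; the card does not name a specific test. The plant heights are invented and the function name `permutation_test` is hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_test(group_a, group_b):
    """Exact two-sided permutation test on the difference in means.

    Returns the p-value: the share of all possible regroupings whose
    mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        a_sum = sum(pooled[i] for i in idx)
        diff = abs(a_sum / n_a - (total - a_sum) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point rounding
            extreme += 1
        count += 1
    return extreme / count


# Invented plant heights in cm under two light colors
red_light = [10, 11, 12, 13, 14]
blue_light = [20, 21, 22, 23, 24]
p_value = permutation_test(red_light, blue_light)
print(p_value < 0.05)  # True here, so the null hypothesis would be rejected at the 5% level
```

A small p-value means the observed difference would be rare if light color had no effect, which counts as evidence against the null hypothesis.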
Data Analysis (in pre-testing of survey)
This involves looking at patterns in responses to see where confusion, hesitation, disengagement, or drop-out has occurred.
* Problems can often be discovered by identifying straight-lining (the same answer is always checked regardless of the question), unanswered questions, and inconsistent or unrealistic responses.
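Two of these checks are easy to automate. The sketch assumes each respondent's answers are stored as a list in which `None` marks an unanswered question; the function names and the sample answers are hypothetical.

```python
def is_straight_lining(answers):
    """True if every answered question received the same answer."""
    given = [a for a in answers if a is not None]
    return len(given) > 1 and len(set(given)) == 1


def count_unanswered(answers):
    """Number of questions the respondent skipped."""
    return sum(a is None for a in answers)


# Hypothetical answers on a 1-5 scale from three pre-test respondents
respondents = {
    "r1": [3, 3, 3, 3, 3],
    "r2": [1, 4, None, 2, 5],
    "r3": [2, None, None, None, 2],
}
for name, answers in respondents.items():
    print(name, is_straight_lining(answers), count_unanswered(answers))
```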
Data visualisation
Descriptive statistics means quantitatively describing data, which you can do through various means. For example, through techniques like:
* graphical representation
* tabular representation
* summary statistics
It presents the data in a more meaningful way, which allows for simpler interpretation through graphs or through numbers.
* Descriptive statistics is about variables
Descriptive Statistics: Charts, Graphs and Plots
Which one you choose depends on what kind of data you have and what you want to display.
- If you want to display relationships between data in categories, you could make a bar graph.
- A pie chart shows how categories in your data relate to the whole set.
- Scatter plots are a good way to display data points. They show the relationship between two variables.
Function Questions in a questionnaire
Function questions control the course of the questionnaire without contributing to the actual research interest.
These questions guarantee that the survey questions are applied correctly.
- Ice-breaker questions
- Transfer and Resting questions
- Filter and Funnel questions
- Verification questions
Survey Mode
All surveys are conducted in one of the three survey modes:
* Face-to-Face Interview
* Written interview
* Telephone Interview / Survey.
Additionally, there is a version of written interviews that has established itself in recent years: the online survey.
Advantages and disadvantages of each mode concern drop-out rates, the chance to see respondents' reactions, social desirability, time, and costs. In telephone surveys, more respondents drop out.
Experimental research Design
Experimental research design involves comparing two groups on one outcome measure to test some hypothesis regarding causation.
Example:
* If a researcher is interested in the effects of a new medication on headaches, they would randomly divide a group of people with headaches into two groups.
* One of the groups, the experimental group, would receive the new medication being tested.
* The other group (control group) would receive a placebo medication.
* The groups would receive different medications but would otherwise be treated exactly the same, so that the researcher could isolate the effects of the medications.
* Both groups would then be compared on the outcome measure.
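The random division into two groups can be sketched as a shuffle followed by a split. The function name `assign_groups` and the participant labels are hypothetical.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split participants into an experimental and a control group."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy, so the original order is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


participants = [f"person_{i}" for i in range(1, 21)]  # 20 hypothetical headache patients
experimental, control = assign_groups(participants, seed=42)
print(len(experimental), len(control))  # 10 10
```

Because chance alone decides who lands in which group, differences between people should even out across the two groups.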
What is Research Design?
A design or strategy justifies the logic, structure and principles of the research methodology and methods, and how these relate to the research questions and hypotheses.
* Provides a framework for the collection and analysis of data
* Expresses causal connections between variables
* Provides a temporal appreciation of social phenomena and their interconnections
Mean
Mathematical average of all terms
The mean is the same as the average value of a data set and is found using a calculation.
Add up all of the numbers and divide by the number of numbers in the data set.
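The calculation can be written out in a few lines. The function name `arithmetic_mean` is hypothetical; the data set is the one used on the Median and Mode cards.

```python
def arithmetic_mean(values):
    """Add up all of the numbers and divide by how many there are."""
    values = list(values)
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


data = [1, 1, 2, 5, 6, 6, 9]
print(arithmetic_mean(data))  # 30 / 7, about 4.29
```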
Median
The median x̃ is the data value separating the upper half of a data set from the lower half.
* Arrange data values from lowest to highest value
* The median is the data value in the middle of the set
* If there are 2 data values in the middle the median is the mean of those 2 values.
* For the data set 1, 1, 2, 5, 6, 6, 9 the median is 5.
Mode
Mode is the value or values in the data set that occur most frequently.
For the data set 1, 1, 2, 5, 6, 6, 9 the modes are 1 and 6.
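Both values can be checked with Python's standard `statistics` module (`multimode` requires Python 3.8 or later). The second data set is an added illustration of the case with two middle values.

```python
from statistics import median, multimode

data = [1, 1, 2, 5, 6, 6, 9]

print(median(data))     # 5, the middle value of the sorted data
print(multimode(data))  # [1, 6], both values occur twice

# With an even number of values the median is the mean of the two middle ones
print(median([1, 2, 5, 6]))  # 3.5
```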