Statistics Basics Flashcards
(200 cards)
What does “doing science” mean?
Collecting Data so that sample information is a useful representation of the world.
Summarizing Data to make it easier to understand and use for describing the real world.
Using data to critically evaluate evidence for or against a specific hypothesis.
Population of Interest (Main Problem)
It is usually too large to study in its entirety.
Sample
A subset of the population of interest. From the knowledge gained by measuring a sample, scientists can make estimates about the larger population.
What determines whether or not the data collected for a study are representative of the real world?
The methods used to obtain the data. These methods must include unbiased, random procedures.
What is a Variable?
A characteristic of an object or group of objects that can be represented with a number and that has more than one possible value.
Columns represent…
Variables
Rows represent…
Observations
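A minimal sketch of the rows-and-columns layout, using a made-up data set (the names `rows`, `height_cm`, and `siblings` are illustrative assumptions, not from the cards):

```python
# Hypothetical data set: each dict is one row, i.e. one observation.
rows = [
    {"id": 1, "height_cm": 172.5, "siblings": 2},
    {"id": 2, "height_cm": 160.0, "siblings": 0},
    {"id": 3, "height_cm": 181.2, "siblings": 1},
]

# The column names are the variables measured on every observation.
variables = list(rows[0].keys())

# Reading down one column gives every recorded value of one variable.
heights = [row["height_cm"] for row in rows]

print("variables:", variables)
print("number of observations:", len(rows))
print("height_cm column:", heights)
```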
Ratio-Scale Variables
Have a true absolute zero value.
Quantitative data measured on a scale that has a constant increment between successive values.
Ordinal or Rank Scale Variables
Have values that represent the ranked order of the objects or individuals with regard to a variable.
However, the actual differences between ranks can differ. Ex. Top 3 GPAs: 4.0, 3.96, 3.7
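The GPA example can be sketched in a few lines. This is only an illustration: the ranks are evenly spaced (1, 2, 3) while the gaps between the underlying values are not. The variable names are assumptions made for the sketch.

```python
# The three GPAs from the example above, in arbitrary order.
gpas = [3.7, 4.0, 3.96]

# Rank 1 goes to the highest GPA.
ordered = sorted(gpas, reverse=True)
ranks = {gpa: position for position, gpa in enumerate(ordered, start=1)}

# Differences between neighbouring ranks, rounded to hide float noise.
gaps = [round(ordered[i] - ordered[i + 1], 2) for i in range(len(ordered) - 1)]

print("ranks:", ranks)
print("gaps between successive ranks:", gaps)
```

The gap between ranks 1 and 2 is much smaller than the gap between ranks 2 and 3, which is why ranks alone do not tell you how far apart the individuals are.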
Discrete variables
Can take only specific values and are often based on counting.
Continuous variables
Can take an infinite number of possible values, limited only by the number of decimal places to which the value can be precisely measured.
Categorical variables
Have values that indicate the class or category to which an individual belongs. Although these values are not inherently numeric, they are often analyzed in terms of the count or proportion of individuals that fall within each class or category.
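A small sketch of analyzing a categorical variable by count and proportion. The eye-color values are invented for illustration only.

```python
from collections import Counter

# Hypothetical categorical observations, one value per individual.
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]

# Count how many individuals fall in each category.
counts = Counter(eye_colors)

# Convert counts to proportions of the whole sample.
proportions = {category: n / len(eye_colors) for category, n in counts.items()}

print("counts:", dict(counts))
print("proportions:", proportions)
```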
Population of Interest
Entire group of objects or individuals about which information is desired.
Data
A collection of observations and/or measurements for one or more variables, made on one or more individuals from the population of interest for the purpose of addressing a specific question.
Statistics
Numbers that describe characteristics of a sample. They are calculated from the data obtained from individuals in the sample.
Sample Statistics and their relation to Population Parameters
Sample statistics are used to estimate or infer something about the values of population parameters.
Sample unit
An individual unit that makes up the sample or pop. of interest. For example, sample = people; sample unit = person.
Are sample statistics considered accurate when the study design is valid?
Even with a valid study design, a sample statistic is only more or less accurate. It is an estimate and should not be expected to equal the true value of the pop. parameter exactly.
Population Parameters
Numbers that describe the characteristics of the entire population of interest.
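To make the parameter/statistic distinction concrete, here is a hedged sketch using a simulated population. In real studies the whole population is usually not available, so the population mean below is computable only because the data are made up. The population size, sample size, and distribution are all assumptions.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated heights in cm.
population = [random.gauss(170, 10) for _ in range(10_000)]

# Population parameter: computed from every individual in the population.
population_mean = statistics.mean(population)

# Sample statistic: computed from a random sample of 50 individuals.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print("population mean (parameter):", round(population_mean, 2))
print("sample mean (statistic):    ", round(sample_mean, 2))
```

The sample mean serves as an estimate of the population mean; the two values are typically close but not identical.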
Random Sample Variation (RSV)
Is the variation in the values of a sample statistic computed from different, independent samples taken from the same population.
It will always occur as long as scientists use samples to estimate population parameters.
Why does RSV happen?
It is a consequence of the randomness of the process by which individuals are selected from a population to create a sample.
In other words, it occurs because repeated samples include different subsets of individuals, who vary with regard to the variable of interest (VoI).
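Random sample variation can be illustrated with a short simulation: draw several independent random samples from the same population and compare their means. This is a sketch under assumed values (a simulated population of 5,000, five samples of 30), not a prescribed procedure.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 5,000 values of the variable of interest.
population = [random.uniform(0, 100) for _ in range(5_000)]

# Draw five independent random samples and record each sample mean.
sample_means = []
for _ in range(5):
    sample = random.sample(population, 30)
    sample_means.append(statistics.mean(sample))

for i, m in enumerate(sample_means, start=1):
    print(f"sample {i}: mean = {m:.2f}")
```

Each sample contains a different subset of individuals, so the sample means differ from one another even though the population never changes.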
One example of how sample variation, and the consequent uncertainty associated with estimates of the pop. parameters, can be minimized.
By obtaining data using appropriate methods and unbiased procedures.
Bias
Is any systematic deviation of sample statistics away from the true value of population parameters.
Here, “systematic” means consistently wrong.
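The difference between random variation and bias can be sketched with a simulation. The biased procedure below (only individuals with values above 45 can be selected) is a hypothetical example invented for illustration, as are the helper name `mean_of_sample_means` and all numeric settings.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical population and its true mean (the population parameter).
population = [random.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

# Hypothetical biased procedure: only individuals above 45 can be reached.
reachable = [x for x in population if x > 45]

def mean_of_sample_means(pool, n=40, repeats=200):
    """Average the sample mean over many independent random samples."""
    means = [statistics.mean(random.sample(pool, n)) for _ in range(repeats)]
    return statistics.mean(means)

unbiased_avg = mean_of_sample_means(population)
biased_avg = mean_of_sample_means(reachable)

print("true mean:               ", round(true_mean, 2))
print("unbiased sampling average:", round(unbiased_avg, 2))
print("biased sampling average:  ", round(biased_avg, 2))
```

With unbiased sampling, individual sample means scatter around the true value and their average lands close to it. With the biased procedure, the sample means are shifted in the same direction every time, so averaging more samples does not remove the error.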
Three most common reasons for bias
Confounding, Selection Bias, Information Bias