Defining the Data Flashcards

(50 cards)

1
Q

What is a population?

A

The collection of all the individuals of interest

2
Q

What is a sample?

A

The subset of the population that is selected as the result of sampling

3
Q

What is a biased sample?

A

Study participants are not representative of the target population

4
Q

What is an unbiased sample?

A

Study participants are representative of the target population

5
Q

What is validity?

A

the extent to which the instruments that are used in the study measure exactly what they should be measuring

6
Q

What is reliability?

A

the extent to which the results of the study are consistent when the study is repeated under the same conditions

7
Q

What is a variable?

A

something whose value can change or vary

8
Q

What is data?

A

the values we obtain when we measure a variable

9
Q

What are the two types of variables?

A

1. Categorical “attributes”

2. Quantitative “numbers”

10
Q

What are the two types of categorical variables? And their meanings?

A

Nominal: Values are “names” that are unordered categories
Ordinal: Values are “names” that are ordered categories

11
Q

What are the two types of quantitative variables? And their meanings?

A

Discrete: Values are integer values 0, 1, 2 … on a proper numeric scale
Continuous: Values are a measured number of units, including possible decimal values

12
Q

What are the two types of “continuous” quantitative variables? And their meanings?

A

Interval: The scale has no true zero (e.g. temperature in °C, where 0 does not mean “no temperature”)
Ratio: The scale has a true zero (0 means the absence of the quantity being measured)

13
Q

What are derived variables?

A

variables that you create by calculating or categorising variables that already exist in your data set

14
Q

What are the two different types of derived variables?

A

Calculated

Categorized

15
Q

What are threshold variables?

A

variables obtained by splitting the values of another variable into categories based on the values of well-known thresholds
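
As an illustration of a threshold variable, the sketch below splits BMI values into categories. The cut-offs (18.5, 25, 30) are the commonly cited WHO adult BMI bands and are an assumption of this example; the card itself does not name a specific threshold, and `bmi_category` and `bmi_values` are hypothetical names.

```python
# Hypothetical example: derive a categorical (threshold) variable from BMI.
# The cut-offs below are the commonly cited WHO adult BMI bands, used here
# only as an illustration of "well-known thresholds".
def bmi_category(bmi: float) -> str:
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"


bmi_values = [17.9, 22.4, 27.1, 31.6]          # made-up quantitative values
categories = [bmi_category(b) for b in bmi_values]
print(categories)
```

The original BMI values are quantitative (ratio scale), while the derived categories are ordinal.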

16
Q

What is a transformed variable?

A

a variable that has been transformed from another variable onto a different measurement scale (e.g. taking the square root, squaring…)

17
Q

What is an exposure variable?

A

a variable thought to predict an outcome variable

18
Q

What is an outcome variable?

A

a variable thought to change as a function of changes in an exposure variable

19
Q

What is the Center?

A

A representative or average value that indicates where the middle of the data set is located

20
Q

What is variation in data?

A

A measure of how much the values differ from one another and from the average value

21
Q

What is distribution in data?

A

The nature or shape of the distribution of data (such as bell-shaped, uniform, or skewed)

22
Q

What are outliers in data?

A

Sample values that lie very far away from the vast majority of other sample values

23
Q

What is time in data?

A

Changing characteristics of the data over time

24
Q

What are the measures of central tendency?

A

Mean, median & mode

25
What is the central tendency?
the tendency for values in a group to cluster around a central or 'average' value which is typical of the group
26
Do extreme values affect the median?
Nope
27
Do extreme values affect the mean?
Yep
28
Do extreme values affect the mode?
Nope
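
The three cards above can be checked with a small sketch. The data are made up for illustration: the largest value of a small data set is replaced by an extreme one, and the three measures are recomputed with Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Made-up data: same values, except the largest one is replaced by an extreme value.
values = [2, 3, 3, 4, 5]
extreme = [2, 3, 3, 4, 500]

for label, data in (("original", values), ("with extreme value", extreme)):
    print(label, "mean:", mean(data), "median:", median(data), "mode:", mode(data))
```

In this example the mean jumps from 3.4 to 102.4, while the median and mode both stay at 3. Note that adding (rather than replacing) values can still shift the median a little; it is resistant to extreme values, not completely immune to changes in the data.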
29
What is dispersion? (variability, scatter, spread)
how stretched or squeezed a distribution of values within a sample or a dataset is
30
Which percentiles give a good summary of a sample?
the “Five Number Summary” (P0, P25, P50, P75, P100)
31
What are the measures of dispersion?
Range, interquartile range, and standard deviation
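
A minimal sketch of the three measures on made-up data, using Python's standard `statistics` module. The quartiles here come from `statistics.quantiles` with `method="inclusive"`, which is one of several quartile conventions and an assumption of this example; other conventions can give slightly different IQR values.

```python
from statistics import quantiles, stdev

data = [2, 4, 4, 5, 7, 9, 11]   # made-up sample

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Interquartile range: Q3 - Q1 (quartile convention is an assumption, see above).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Sample standard deviation (divides by n - 1).
sd = stdev(data)

print("range:", data_range, "IQR:", iqr, "SD:", round(sd, 2))
```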
32
What does a small standard deviation mean?
most data points are close to the mean
33
What does a large standard deviation mean?
data points are widely spread from the mean
34
What is a percentile?
A measure that indicates the value below which a given percentage of observations in a group of observations fall
35
How do you calculate the IQR (interquartile range)?
Q3 - Q1
36
What is Q1?
The 25th percentile (25% of the values lie below it)
37
What is Q3?
The 75th percentile (75% of the values lie below it)
38
What is the formula for the median when the sample size is odd?
The value at position (n + 1)/2 in the ordered data
39
What is the formula for the median when the sample size is even?
The average of the values at positions n/2 and n/2 + 1 in the ordered data
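
The two position rules for the median can be sketched as a small function. `median_by_position` is a hypothetical helper written for this illustration; positions in the cards are 1-based, so the code subtracts 1 when indexing.

```python
def median_by_position(data):
    """Median via the position rules: (n + 1)/2 for odd n, n/2 and n/2 + 1 for even n."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # Odd n: the single middle value at 1-based position (n + 1) / 2.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the values at 1-based positions n/2 and n/2 + 1.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print(median_by_position([38, 51, 46, 79, 57]))  # odd n = 5, position 3
print(median_by_position([1, 2, 3, 4]))          # even n = 4, positions 2 and 3
```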
40
A garden contains 39 plants. The following plants were chosen at random, and their heights were recorded in cm: 38, 51, 46, 79, and 57. Calculate their heights’ standard deviation.
https://byjus.com/maths/standard-deviation-questions/
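
A sketch of the calculation for this card. It assumes the five heights are a sample from the 39 plants, so the sample standard deviation (dividing by n - 1) is used; the population version (dividing by n) is shown only for comparison.

```python
from statistics import mean, pstdev, stdev

heights_cm = [38, 51, 46, 79, 57]

# Step 1: the mean of the sample.
avg = mean(heights_cm)                      # 271 / 5 = 54.2

# Step 2: squared deviations from the mean.
squared_devs = [(h - avg) ** 2 for h in heights_cm]

# Step 3: divide by n - 1 (sample variance), then take the square root.
sample_sd = (sum(squared_devs) / (len(heights_cm) - 1)) ** 0.5

print(round(sample_sd, 2))                  # 15.51
print(round(stdev(heights_cm), 2))          # same result from the standard library
print(round(pstdev(heights_cm), 2))         # 13.88, population formula for comparison
```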
41
SD indicates the variation when which measure of central tendency is used?
Mean
42
IQR indicates the variation when which measure of central tendency is used?
median
43
What is inferential statistics?
statistics used to infer, from relationships found in the sample, which relationships truly exist in the population
44
What is Descriptive statistics?
statistics used to describe, show or summarize data in a meaningful way (take pictures of data)
45
What are the two types of statistics?
descriptive statistics and inferential statistics
46
What is a theory?
a generalization about a phenomenon (explanation of how or why something occurs)
47
What is a hypothesis?
a proposed explanation made on the basis of limited evidence as a starting point for further investigation (without any assumption of its truth)
48
What are the steps of the research process?
1. Initial observation (Research question)
2. Generate theory
3. Generate hypothesis
4. Collect data to test hypothesis
5. Analyse data
49
Why is data important?
• Identifying problems
• Planning & making informed decisions
• Monitoring/evaluating progress
• Testing hypotheses & making inferences about populations of interest
50
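What is the formula to calculate percentiles?
L = n × (desired percentile / 100), where n is the sample size and the data are sorted in ascending order. If L is a whole number, use the average of the Lth and (L+1)th values. If L is not a whole number, round it up to the next whole number and use the value at that position.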
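
The locator rule on this card can be sketched as follows. `percentile_by_locator` is a hypothetical helper written for this illustration; it only handles percentiles strictly between 0 and 100, because the rule as stated would point outside the data at the extremes.

```python
import math


def percentile_by_locator(data, p):
    """Percentile via the locator rule L = n * (p / 100) on sorted data."""
    if not 0 < p < 100:
        raise ValueError("this sketch only handles 0 < p < 100")
    ordered = sorted(data)
    n = len(ordered)
    locator = n * p / 100
    if locator.is_integer():
        # Whole number: average the values at 1-based positions L and L + 1.
        pos = int(locator)
        return (ordered[pos - 1] + ordered[pos]) / 2
    # Not whole: round up and take the value at that 1-based position.
    return ordered[math.ceil(locator) - 1]


heights_cm = [38, 51, 46, 79, 57]                 # n = 5
print(percentile_by_locator(heights_cm, 50))      # L = 2.5, so the 3rd value

scores = [1, 2, 3, 4, 5, 6, 7, 8]                 # made-up data, n = 8
print(percentile_by_locator(scores, 25))          # L = 2, average of 2nd and 3rd values
```

Other textbooks and software use different percentile conventions, so results may differ slightly from library functions.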