Chapter 1 - Introduction Flashcards

1
Q

Statistics ___________ numerical or non-numerical data

A

Collect

2
Q

Statistics __________ data for the purpose of making generalizations and decisions

A

Analyze

3
Q

What is the core of statistics?

A

Data

4
Q

What is data?

A

Any information that has been collected

5
Q

Is data always numerical?

A

No

For example, which political party someone supports is not numerical

Another example is a yes/no answer to a question

6
Q

What is statistics?

A

The science of organizing and summarizing data, either numerical or non-numerical, to provide useful and accessible information about a particular subject

7
Q

What are the 4 steps to statistics?

A
  1. collect data
  2. summarize data
  3. analyze and interpret data
  4. draw conclusions from data
8
Q

What are the 2 different ways to classify statistical studies?

A

Method 1 - descriptive statistics or inferential statistics

Method 2 - observational studies or designed experiments

9
Q

What is the purpose of the descriptive statistics study?

A

Summarize data

Examine and explore information for INTRINSIC interest only

For example, suppose we wanted to know about 100 students but only had enough time to ask 5 of them. If we only talk about the 5 students we asked, that’s descriptive statistics

Basically, we are only describing the data we already have

10
Q

What is the purpose of the inferential statistics study?

A

To use information from a sample to draw a conclusion about the population

For example, suppose we wanted to know about 100 students but only had enough time to ask 5 of them. If we use this sample information to draw a conclusion about all 100 students, that’s inferential statistics

Basically, we are making inferences about a population based on the sample information we have

11
Q

What does inferential statistics consist of?

A

Methods of drawing and measuring the reliability of conclusions about a population based on information obtained from a sample of the population

12
Q

What does descriptive statistics consist of?

A

Methods for organizing and summarizing information and may include:

  • Constructing graphs and tables
  • Calculating various numerical measures such as averages, variations, and percentiles
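
The numerical measures listed above can be sketched with Python's standard `statistics` module. The scores below are made-up illustration data, not values from the course, and the variable names are only for this example.

```python
import statistics

# Hypothetical exam scores for a small sample (illustration only).
scores = [55, 62, 68, 71, 71, 74, 80, 85, 90, 94]

mean_score = statistics.mean(scores)            # average
median_score = statistics.median(scores)        # middle value
stdev_score = statistics.stdev(scores)          # sample standard deviation (variation)
quartiles = statistics.quantiles(scores, n=4)   # 25th, 50th, 75th percentiles

print("mean:", mean_score)
print("median:", median_score)
print("standard deviation:", round(stdev_score, 2))
print("quartiles:", quartiles)
```

Everything printed here only describes the ten scores in hand, which is what makes it descriptive rather than inferential.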
13
Q

What is a population?

A

A collection of ALL individuals or items under consideration in a statistical study

14
Q

What is a sample?

A

Part of a population from which information is obtained

15
Q

What is the notation for the population size?

A

N

(Make sure it’s a capital letter)

16
Q

What is the notation for the sample size?

A

n

(make sure it’s lower case)

17
Q

What is the number of individuals in a population called?

A

Population size

18
Q

What is the number of individuals in a sample called?

A

Sample size

19
Q

How are descriptive statistics and inferential statistics interrelated?

A

Before carrying out an inferential analysis, descriptive statistics should be applied to organize and summarize information from a sample. This step helps us to choose appropriate inferential methods.

20
Q

In a STAT 151 class of 60 students, the average score of 20 randomly selected students is 71/100. Among these 20 students, 5 scored between 80 and 100, 12 scored between 50 and 79, and 3 scored below 50. Is this study descriptive or inferential?

A

Descriptive (the results only describe the 20 students who were sampled; no conclusion is drawn about all 60)

21
Q

In a national poll, 1000 adults were asked the following question: “If you won 10 million dollars in a lottery, would you continue to work, or would you stop working?” Based on the results of this poll, researchers concluded that “At least 60% of Canadians would still work even if they won millions.” Is this study descriptive or inferential?

A

Inferential

22
Q

What is an observational study?

A

Researchers simply observe characteristics and take measurements. No treatments or controls are imposed, so the data are collected without a “plan”

23
Q

What is a designed experiment?

A

Researchers first impose treatments and controls, and then observe characteristics and take measurements. The data are collected with a “plan”

24
Q

When the instructor asks 5 students whether they know what statistics is before the class even starts, what type of study is this? What type of study would it be if this question was asked at the end of the stats class?

A

Before the class starts: observational

At the end of the class: designed experiment (a treatment, the stats class, has been imposed before measuring the response)

25
Q

One hundred 30-year-old people participated in a project studying the relationship between exercise and a person’s fitness. These participants were randomly assigned to two groups.

In Group 1, 50 participants were asked to do exercise more than 5 hours per week.

The 50 participants in Group 2 were asked to do exercise less than 2 hours per week.

Their body mass index (BMI) was measured after 6 months, analyzed, and interpreted.

Is this an observational study or a designed experiment?

Suppose that participants in Group 1 had a significantly larger decrease in BMI than those in Group 2. Can we conclude that a person will become fit from doing a lot of exercise?

A

Designed experiment

Yes, because it’s a designed experiment that can help us determine causation
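
The random assignment step in this experiment could be sketched as follows. This is only an illustration: the participant IDs 1 to 100, the function name `randomly_assign`, and the seed are assumptions, not details from the study.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)      # seeded generator so the split is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)          # every ordering is equally likely
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical IDs for the 100 participants.
participant_ids = range(1, 101)
group_1, group_2 = randomly_assign(participant_ids, seed=151)

print(len(group_1), "participants exercise more than 5 hours per week")
print(len(group_2), "participants exercise less than 2 hours per week")
```

Because chance alone decides who lands in each group, a difference in BMI between the groups can reasonably be attributed to the treatment.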

26
Q

A scientist was interested to know if smoking is a risk factor for lung cancer. She randomly selected 500 people and summarized the data obtained:

                Lung CA    No lung CA    Total
Smokers           200          60         260
Nonsmokers        100         140         240
Total             300         200         500

Is this an observational study or a designed experiment?

Can we conclude that smoking is a risk factor for lung CA?

A

Observational study

While we know as common knowledge that smoking can cause cancer, this is not a designed experiment, just an observational study, so causation cannot be determined
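
As a rough sketch, the counts in this card can be turned into lung cancer rates for each group. The dictionary layout and the function name `cancer_rate` are illustrative choices. The rates describe an association in this sample only; they do not establish causation.

```python
# Counts taken from the card above.
counts = {
    ("smoker", "lung cancer"): 200,
    ("smoker", "no lung cancer"): 60,
    ("nonsmoker", "lung cancer"): 100,
    ("nonsmoker", "no lung cancer"): 140,
}

def cancer_rate(group):
    """Proportion of people in the given group who have lung cancer."""
    with_cancer = counts[(group, "lung cancer")]
    total = with_cancer + counts[(group, "no lung cancer")]
    return with_cancer / total

smoker_rate = cancer_rate("smoker")         # 200 / 260
nonsmoker_rate = cancer_rate("nonsmoker")   # 100 / 240

print(f"smokers: {smoker_rate:.1%}, nonsmokers: {nonsmoker_rate:.1%}")
```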

27
Q

What are the two main methods for collecting data used in STAT 151?

A

Census

Sample

28
Q

What is census data collection? Why would this be a less attractive option to use?

A

Collect data from all individuals in the population

It is time consuming and expensive

29
Q

What is sampling data collection?

A

Collecting data from a sample of the population

We need to make sure the sample is representative of the population

30
Q

How often is the census collected in Canada?

A

Every 5 years

31
Q

What is the main type of sampling done in STAT 151?

A

Simple random sampling

32
Q

What is simple random sampling?

A

A sample taken in such a way that every possible sample of the same size n has an equal chance of being selected

Whether a sample is a simple random sample depends on how it was selected, not on which individuals ended up in it

33
Q

If a population has 5 letters, A, B, C, D, and E, what are the possible samples with the sample size n=2?

A

A, B
A, C
A, D
A, E
B, C
B, D
B, E
C, D
C, E
D, E
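
One way to check this list is to enumerate the samples with `itertools.combinations`, which treats order as irrelevant and never repeats an item. The variable names are just for this sketch.

```python
from itertools import combinations

population = ["A", "B", "C", "D", "E"]

# All samples of size n=2, without replacement, ignoring order.
samples = list(combinations(population, 2))

for sample in samples:
    print(", ".join(sample))
print("number of possible samples:", len(samples))
```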

34
Q

Consider a population containing 3 students, Anna, Bruce, and Cindy (ABC lol).

What are all the possible samples of size n=2?

If all of these samples have an equal chance of being selected, what is the chance of selecting any one of them?

If the chances are equal, what type of sampling is this?

A

Possible samples:
A, B
A, C
B, C

There are three samples above, so equal chance for each would be 1/3

Simple random sample
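
The 1/3 above generalizes: with N individuals and samples of size n drawn without replacement, there are "N choose n" possible samples, so each has chance 1 over that count under simple random sampling. A small sketch, where the helper name `chance_per_sample` is an assumption for illustration:

```python
from fractions import Fraction
from math import comb

def chance_per_sample(population_size, sample_size):
    """Chance of any one particular sample under simple random sampling."""
    number_of_samples = comb(population_size, sample_size)
    return Fraction(1, number_of_samples)

print(chance_per_sample(3, 2))   # Anna, Bruce, Cindy with n=2
print(chance_per_sample(5, 2))   # letters A to E with n=2
```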

35
Q

What is a simple random sample with replacement?

A

An individual can appear in the sample more than once

36
Q

What is a simple random sample without replacement?

A

An individual can appear in the sample AT MOST once

37
Q

If not particularly indicated in this course, does a simple random sample (SRS) occur with replacement or without?

A

Without replacement

38
Q

Consider a population containing 3 students, Anna, Bruce, and Cindy (ABC lol).

If we use simple random sampling with replacement and a sample size of n=2, what are all the possible samples?

A

A, A
A, B
A, C
B, B
B, C
C, C
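
The same list can be produced with `itertools.combinations_with_replacement`. This sketch assumes, as the card's answer does, that the order of the two draws does not matter; counting ordered draws instead would give 9 outcomes.

```python
from itertools import combinations_with_replacement

students = ["A", "B", "C"]   # Anna, Bruce, Cindy

# Samples of size n=2 where the same student may be picked twice.
samples_with_replacement = list(combinations_with_replacement(students, 2))

for sample in samples_with_replacement:
    print(", ".join(sample))
```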

39
Q

What are some ways to obtain a simple random sample (SRS)?

A

By computer

Random-number tables

40
Q

Assume that there are 500 residents in a community and we need to select a SRS with sample size 10.

How do you use a random number table to randomly select the sample?

A
  1. Number all residents from 001 to 500
  2. Randomly pick a starting point in the random-number table. Read 3 digits at a time, because our biggest number (500) has 3 digits.
  3. Going from the starting point, continue down the table and select the first 10 distinct numbers between 001 and 500, skipping any number outside that range and any repeats (the sample is taken without replacement)
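
The table procedure above can be imitated in code. This is a sketch under assumptions: the "table" is simulated by generating random digits, and the function name `srs_from_digit_stream` and the seed are made up for illustration.

```python
import random

def srs_from_digit_stream(population_size, sample_size, seed=None):
    """Pick a simple random sample by reading fixed-width random numbers."""
    rng = random.Random(seed)
    width = len(str(population_size))   # 500 has 3 digits, so read 3 at a time
    chosen = []
    while len(chosen) < sample_size:
        # Simulate reading the next `width` digits from a random-number table.
        digits = "".join(str(rng.randrange(10)) for _ in range(width))
        number = int(digits)
        # Skip numbers outside 1..population_size and any repeats.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
    return chosen

sample = srs_from_digit_stream(500, 10, seed=1)
print(sample)
```

When a computer is available, `random.sample(range(1, 501), 10)` draws 10 distinct resident numbers directly.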
41
Q

Students were asked how much they liked their English class and options were:
- Dislike very much
- Dislike
- Neutral
- Like
- Like very much
What type of data is this?

A

Categorical ordinal

42
Q

Which is better for categorical ordinal data, a bar chart or a pie chart?

A

The bar chart works nicely for ordinal data, as most people readily think from least to most (or lowest to highest) (or 1_dislike_very_much to 5_like_very_much). It requires a bit more cognitive work to move your mind among the outcomes (choices) of the categorical variable when we look at the pie chart.

43
Q

When looking at intervals for a histogram of continuous data, how do we denote cutpoints of the intervals and what does this mean?

For example:
Interval 1: 20 to under 30
Interval 2: 30 to under 40

A

[ means the left end of the interval is closed (the cutpoint value itself is included)

) means the right end of the interval is open (values up to, but not including, the cutpoint)

44
Q

You are building a histogram with the following intervals:
[10,20), [20, 30), [30, 40), [40, 50), [50, 60)

Which interval would 30 go into?

A

[30, 40)

[20, 30) means from 20 up to, but not including, 30, so 29.99999 would go in that interval, but 30 would not
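
The left-closed, right-open rule can be sketched in code with `bisect_right` from the standard library. The function name `interval_for` is illustrative, and returning `None` for values outside all intervals is an assumption of this sketch.

```python
from bisect import bisect_right

# Cutpoints for [10,20), [20,30), [30,40), [40,50), [50,60)
cutpoints = [10, 20, 30, 40, 50, 60]

def interval_for(value):
    """Return the (left, right) cutpoints of the interval containing value."""
    if not (cutpoints[0] <= value < cutpoints[-1]):
        return None   # value falls outside every interval
    # bisect_right counts the cutpoints <= value, so a value equal to a
    # cutpoint lands in the interval that starts at that cutpoint.
    index = bisect_right(cutpoints, value) - 1
    return (cutpoints[index], cutpoints[index + 1])

print(interval_for(30))
print(interval_for(29.99999))
```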

45
Q

What are the cutpoints for the interval [15, 20)?

A

15 and 20

Cutpoints are the values at the left (closed) and right (open) edges of each interval

46
Q

What is the midpoint for the interval [15, 20)?

A

17.5

The midpoint is the average of the two cutpoints: (15 + 20) / 2 = 17.5
