Exam 1 Flashcards

Question

Histograms

Answer 1

* analogous to bar charts * horizontal axis has classes of quantitative data * frequency, relative frequency or percent * bars touch * good for larger data sets * good if you need more flexibility

Answer 2

* show changes over time * vertical axis shows each observation * horizontal axis shows the time when the observation was measured * trends can be seen by connecting points

Answer 3

sample size

Answer 4

* Resistant: Median | * Not resistant: Mean

Answer 5

Mean and median can only be used with quantitative data. Mode can be used with both quantitative and categorical data

Answer 6

Mean is greater than the median: right skewed | Mean is less than the median: left skewed
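
A minimal sketch of this comparison, using only Python's standard library. The function name `skew_direction` and the data values are made up for illustration, and comparing mean with median is only a rule of thumb, not a formal test of skewness.

```python
from statistics import mean, median

def skew_direction(data):
    """Guess the skew direction by comparing the mean with the median."""
    m, med = mean(data), median(data)
    if m > med:
        return "right skewed"
    if m < med:
        return "left skewed"
    return "roughly symmetric"

# Made-up values: one very large observation pulls the mean above the median.
incomes = [20, 22, 25, 27, 30, 200]
print(skew_direction(incomes))
```

The median barely moves when the largest value changes, while the mean follows it, which is why the gap between them hints at the direction of the tail.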

Answer 7

Indicate the amount of spread in a distribution. Types: 1. Range: maximum minus minimum (if you don't know this you're screwed) 2. Standard deviation: accounts for all observations, indicates how far on average observations lie from the mean, not resistant to outliers 3. Interquartile range (IQR): based on the quartiles of the data, used with boxplots
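
The three measures can be sketched with Python's `statistics` module. The data values are made up, and the quartiles come from `statistics.quantiles` with its default method, which is one of several quartile conventions and may differ slightly from the one used in class.

```python
from statistics import stdev, quantiles

data = [2, 4, 4, 5, 7, 9, 11]  # made-up observations

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Sample standard deviation (divides by n - 1).
s = stdev(data)

# Quartiles, then IQR = Q3 - Q1.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

print(data_range)   # 9
print(round(s, 3))  # 3.162
print(iqr)          # 5.0
```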

Answer 8

1. dot plots 2. stem and leaf plots 3. histograms 4. time plots

Answer 9

1. Frequency distribution 2. Relative frequency distributions 3. Pie charts: use relative frequencies, aka circle graph, difficult to construct by hand, best for data sets with few categories 4. Bar charts: easiest way to graph, horizontal axis is distinct values of categorical data, vertical axis is frequencies or relative frequencies 5. Pareto charts: bar graph with bars from tallest to shortest
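
A small sketch of the first two items and the Pareto ordering, using `collections.Counter`. The category values are made up for illustration.

```python
from collections import Counter

colors = ["red", "blue", "red", "green", "red", "blue"]  # made-up categorical data

# Frequency distribution: count of each category.
freq = Counter(colors)

# Relative frequency distribution: count divided by the total.
n = len(colors)
rel_freq = {category: count / n for category, count in freq.items()}

# Pareto ordering: categories sorted from most to least frequent.
pareto_order = [category for category, _ in freq.most_common()]

print(freq["red"], rel_freq["red"])  # 3 0.5
print(pareto_order)                  # ['red', 'blue', 'green']
```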

Answer 10

measured to make comparisons between groups

Answer 11

(predictor) explains the values of the response variable

Answer 12

relationship between 2 variables

Answer 13

Frequency distribution for bivariate data, also called a two-way table or cross-tabulation

Answer 14

Proportions based on the explanatory variable for categories of the response variables

Answer 15

Applies to bell shaped distributions: about 68% of data falls within 1 standard deviation of the mean, 95% falls within 2 standard deviations, 99.7% falls within 3
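
These percentages can be checked against an exact normal curve. The sketch below uses `statistics.NormalDist` (Python 3.8+); the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

Real data that is only roughly bell shaped will match these numbers only approximately.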

Answer 16

* measure of relative standing * indicate the value below which a certain percentage of observations fall * resistant to outliers * often preferred over mean and STD * Divides data into 100 equal parts, there are 99 percentiles

Answer 17

1. Deciles: divide data into tenths 2. Quartiles: divide data into fourths •First quartile: aka lower quartile, median of lower half of data, divides the lower 25% from the upper 75% •Second quartile: median •Third quartile: divides the bottom 75% from the top 25%

Answer 18

1. Minimum 2. Q1 3. Median 4. Q3 5. Maximum represented by a boxplot

Answer 19

* Preferred measure of variation when median is used * IQR=Q3-Q1 * more resistant to outliers

Answer 20

1. less than Q1-1.5•IQR | 2. greater than Q3+1.5•IQR
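
A sketch of the 1.5•IQR rule together with the five-number summary, on made-up data. Quartiles are taken from `statistics.quantiles` with its default method, which is an assumption; other quartile conventions can shift the fences slightly.

```python
from statistics import quantiles

data = sorted([1, 3, 4, 5, 6, 7, 30])  # made-up observations

q1, med, q3 = quantiles(data, n=4)
five_number = (min(data), q1, med, q3, max(data))

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Potential outliers lie below the lower fence or above the upper fence.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(five_number)               # (1, 3.0, 5.0, 7.0, 30)
print(lower_fence, upper_fence)  # -3.0 13.0
print(outliers)                  # [30]
```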

Answer 21

An outlier is an observation far removed from the rest of the data

Answer 22

* Acronym for Shape, Outliers, Center, Spread | * Use to describe distributions of quantitative data

Answer 23

Modality: # of peaks, can be unimodal, bimodal or multimodal. Skewness and symmetry

Answer 24

* Use the mean if possible because it takes all of the actual observations into account * mean is good for symmetric distributions with a small number of discrete values * median is good for skewed distributions when potential outliers are present

Answer 25

Mean and standard deviation are reported together while IQR and range are reported with median

Answer 26

The science of uncertainty, used to evaluate and control the likelihood that a statistical inference is correct. It quantifies uncertainty

Answer 27

1. Subjective: guessing a probability based on personal judgement 2. Theoretical: based on formulas 3. Experimental/empirical: based on results of a random experiment

Answer 28

1%, 5%, 10%(mainly 5%)

Answer 29

The probability of an event is the proportion of times it occurs in a large number of repetitions of an experiment. Aka frequentist interpretation. Ignores black swan events. Helps understand and visualize meaning of probability
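
A quick simulation can make the long-run idea concrete. This is only a sketch: the function name `heads_proportion`, the fair-coin setup, and the seed are all choices made for illustration.

```python
import random

def heads_proportion(flips, seed=0):
    """Simulate fair coin flips and return the proportion that land heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips

# The proportion tends to settle near 0.5 as the number of flips grows.
for flips in (10, 1_000, 100_000):
    print(flips, heads_proportion(flips))
```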

Answer 30

all possible outcomes for an experiment

Answer 31

Tree diagram or venn diagram

Answer 32

A subset of the sample space. A collection of 1 or more outcomes

Answer 33

* The event that A does not occur * denoted as A^c * P(A^c)=1-P(A)

Answer 34

* aka mutually exclusive events * events that do not have any outcomes in common * events that can't happen at the same time * complementary events are disjoint

Answer 35

* consists of outcomes that are in both events, the overlap | * disjoint events: P(A and B)=0

Answer 36

* A or B | * Outcomes that are in one event or the other, or both

Answer 37

Disjoint: P(A or B) = P(A)+P(B) | Not disjoint: P(A or B) = P(A)+P(B)-P(A and B)
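
The addition rule can be checked by counting outcomes directly. The sketch below assumes one roll of a fair six-sided die; the events and the helper `prob` are made up for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed example)

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # roll is even
B = {5, 6}     # roll is greater than 4
C = {1, 3}     # shares no outcomes with A, so A and C are disjoint

# Not disjoint: subtract the overlap so it is not counted twice.
p_union = prob(A | B)
by_rule = prob(A) + prob(B) - prob(A & B)
print(p_union, by_rule)  # 2/3 2/3

# Disjoint: the overlap is empty, so the probabilities simply add.
print(prob(A | C), prob(A) + prob(C))  # 5/6 5/6
```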

Answer 38

The probability of an event occurring when you know that another event has occurred. P(A|B)=P(A and B)/P(B): the probability that event A will occur given that B has occurred. We are conditioning on event B, meaning we know it has occurred
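
A sketch of the formula applied to a two-way table. The counts and category names are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction

# Hypothetical two-way table of counts: (exercise habit, sleep quality).
counts = {
    ("exercises", "sleeps well"): 30,
    ("exercises", "sleeps poorly"): 10,
    ("does not exercise", "sleeps well"): 20,
    ("does not exercise", "sleeps poorly"): 40,
}
total = sum(counts.values())

# A = sleeps well, B = exercises.
p_a_and_b = Fraction(counts[("exercises", "sleeps well")], total)
p_b = Fraction(
    sum(n for (habit, _), n in counts.items() if habit == "exercises"), total
)

# P(A|B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 3/4
```

Conditioning on B amounts to restricting attention to the "exercises" row: 30 of its 40 people sleep well.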

Answer 39

P(A and B)=P(A)•P(B|A) | P(A and B)=P(B)•P(A|B)

Answer 40

1. P(A|B)=P(A) 2. P(B|A)=P(B) 3. P(A and B)=P(A)•P(B)
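
The third condition is the easiest to test by counting. The sketch below assumes one roll of a fair die; the events and the helper names `prob` and `independent` are made up for illustration.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # one roll of a fair die (assumed example)

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def independent(a, b):
    """Check whether P(A and B) equals P(A) times P(B)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
exactly_two = {2}

print(independent(even, at_most_four))  # True:  1/3 == 1/2 * 2/3
print(independent(even, exactly_two))   # False: 1/6 != 1/2 * 1/6
```

Using `Fraction` keeps the comparison exact; with floats the equality check could fail because of rounding.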

Answer 41

The probability that the test will give a positive result, given that the condition tested for is present. P(Positive result|Condition present)

Answer 42

The probability that the test will give a negative result, given that the condition tested for is not present. P(Negative result|Condition isn't present)
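
Both of these conditional probabilities for a diagnostic test can be computed from a table of test results. The counts below are hypothetical and the variable names are chosen for illustration.

```python
from fractions import Fraction

# Hypothetical test results for 1,100 people.
true_positives = 90    # condition present, test positive
false_negatives = 10   # condition present, test negative
true_negatives = 950   # condition absent, test negative
false_positives = 50   # condition absent, test positive

# P(positive result | condition present)
p_pos_given_present = Fraction(true_positives, true_positives + false_negatives)

# P(negative result | condition not present)
p_neg_given_absent = Fraction(true_negatives, true_negatives + false_positives)

print(float(p_pos_given_present))  # 0.9
print(float(p_neg_given_absent))   # 0.95
```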

Answer 43

* Numerical summary of a population * Numerical summary of a probability distribution * Denoted by greek letters

Answer 44

A numerical measurement of the outcome of a random event

Answer 45

mean = Σ x•p(x) | compute “x•p(x)” for each possible value x and add the results
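
A sketch of the calculation for a small discrete distribution. The example (number of heads in two fair coin flips) is an assumption made for illustration.

```python
from fractions import Fraction

# Number of heads in two fair coin flips: value x -> probability p(x).
distribution = {
    0: Fraction(1, 4),
    1: Fraction(1, 2),
    2: Fraction(1, 4),
}

# A valid probability distribution has probabilities that sum to 1.
assert sum(distribution.values()) == 1

# Mean: multiply each value by its probability, then add the products.
expected_value = sum(x * p for x, p in distribution.items())
print(expected_value)  # 1
```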

Answer 46

A curved graph

Answer 47

* used for continuous random variables | * symmetric and bell shaped

Answer 48

1. Data must be unimodal and approximately bell-shaped | 2. Probabilities are approximate

Answer 49

Round to 4 decimal places

Answer 50

1. Fixed number of trials (n) 2. Each trial has 2 possible outcomes 3. The probability of success (p) is the same for each trial 4. Trials are independent

Answer 51

p<0.5: right skewed | p>0.5: left skewed

Answer 52

np ≥ 15 and n(1-p) ≥ 15

Answer 53

Mean=np | Std=√(np(1-p))
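
A sketch of the binomial mean and standard deviation, plus the rule of thumb from these cards that n•p and n(1-p) should both be at least 15 for a roughly bell-shaped distribution. The function names and the values of n and p are made up for illustration.

```python
import math

def binomial_summary(n, p):
    """Return (mean, standard deviation) of a binomial distribution."""
    mean = n * p
    std = math.sqrt(n * p * (1 - p))
    return mean, std

def roughly_bell_shaped(n, p):
    """Rule of thumb: expect at least 15 successes and at least 15 failures."""
    return n * p >= 15 and n * (1 - p) >= 15

mean, std = binomial_summary(100, 0.3)
print(round(mean, 2), round(std, 2))  # 30.0 4.58

print(roughly_bell_shaped(100, 0.3))  # True:  30 successes and 70 failures expected
print(roughly_bell_shaped(20, 0.5))   # False: only 10 of each expected
```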

Answer 54

census, sampling, experimentation

Answer 55

Mean and median can be used, they should be close in value

Answer 56

Standard deviation and IQR

Answer 57

Look at how different the mean and median are

Answer 58

Inferential

Answer 59

Reduce the data to simple summaries without distorting too much information

Answer 60

1. Population distribution: almost never observed, we learn about it from sample distributions 2. Sample distribution: aka data distribution, consists of sample data you observe and analyze, should resemble population distribution if good sampling techniques were used 3. Sampling distribution: describes long run behavior of the statistic, specifies probabilities for all possible values of the statistic for samples of a given size

Answer 61

n•p and n(1-p) are at least 15

Answer 62

1. Randomization condition: values are randomly obtained 2. Independence assumption: Sampled values are independent 3. 10% condition: n is no more than 10% of the population 4. Sample size assumption: n has to be large enough to expect at least 15 successes and failures

Answer 63

1. Randomization condition: values are sampled randomly 2. Independence assumption: sampled values are independent 3. 10% condition: n is no more than 10% of the population 4. Sample size assumption: There is no one size fits all rule, small samples work if population is unimodal and symmetric, a large sample is needed if skewed

(91 cards)