Re-Study Flashcards

(456 cards)

1
Q

In the long run, probability can be viewed as what?

A

The proportion of times an event happens, or its relative frequency.

2
Q

What is a sample space?

A

A collection of all elementary results, or outcomes of an experiment.

3
Q

What is an event?

A

Any set of outcomes, that is, a subset of the sample space.

4
Q

A sample space of N possible outcomes yields how many possible events?

A

2^N possible events.
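The count can be checked by enumerating every subset of a small sample space. This is only a sketch: the three-outcome sample space and the helper name `all_events` are made up for illustration.

```python
from itertools import combinations

def all_events(sample_space):
    """Return every subset (event) of a finite sample space as a frozenset."""
    outcomes = list(sample_space)
    return [frozenset(c)
            for size in range(len(outcomes) + 1)
            for c in combinations(outcomes, size)]

# Hypothetical sample space with N = 3 outcomes.
events = all_events({"a", "b", "c"})
print(len(events))  # 8, which is 2**3
```

The list includes the empty event and the whole sample space, which is why the count is 2^N rather than 2^N - 2.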

5
Q

What is the notation for the sample space?

A

The capital Greek letter omega, Ω.

6
Q

What is the notation for the empty event?

A

Ø

7
Q

What is the notation for the probability of the event E?

A

P{E}

8
Q

A union of events A, B, C, is an event consisting of what? What word does this correspond to?

A

All the outcomes that belong to at least one of these events. It corresponds to the word "or."

9
Q

A complement of an event A is what? What word does it correspond to?

A

An event that occurs whenever A does not occur. It corresponds to the word "not."

10
Q

An intersection of events A, B, C… is what and corresponds to what word?

A

An event consisting of the outcomes that are common to all these events. It occurs if each of A, B, C, ... occurs, and therefore corresponds to the word "and."

11
Q

A difference of events A and B consists of what, and corresponds to what phrase?

A

all outcomes included in A but excluded from B, and corresponds to the words “but not.” A but not B.

12
Q

Events A and B are disjoint if

A

Their intersection is empty.

13
Q

If any two events are disjoint in a set of events, they are?

A

Mutually exclusive

14
Q

Another term for mutually exclusive

A

Pairwise disjoint

15
Q

Events A, B, and C are exhaustive if

A

their union equals the whole sample space.

16
Q

Occurrence of a mutually exclusive event does what?

A

Eliminates the chance of any other mutually exclusive event occurring.

17
Q

A single event A and its complement is a classical example of what?

A

A collection of disjoint and exhaustive events.

18
Q

If a collection of events is exhaustive then

A

At least one of the events must occur.

19
Q

The complement of a union of two events is

A

the intersection of the complements of both events.
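This rule (one of De Morgan's laws) can be checked on a concrete example with Python sets. The die sample space and the events A and B below are hypothetical, chosen only for illustration.

```python
# Hypothetical sample space: the faces of a six-sided die.
omega = set(range(1, 7))
A = {1, 2, 3}
B = {2, 4, 6}

def complement(event):
    """Complement of an event relative to the sample space omega."""
    return omega - event

lhs = complement(A | B)               # not (A or B)
rhs = complement(A) & complement(B)   # (not A) and (not B)
print(lhs, rhs)  # both are {5}
```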

20
Q

Notation for the difference of A and B

A

A \ B

21
Q

What is the sigma-algebra?

A

a collection of events whose probabilities we can consider in our problem.

22
Q

What makes a collection of events a sigma-algebra on a sample space?

A

It includes the sample space.

For every event it includes, it also includes that event's complement.

For every finite or countable collection of events it includes, it also includes their union.

23
Q

What is the minimal collection of a sigma-algebra?

A

The sample space, and the empty event.

24
Q

What is the minimal collection of events for a sigma-algebra known as?

A

The degenerate sigma-algebra.

25
What is the power set as a sigma-algebra, and what is its size?
The collection of all subsets of the sample space, that is, all possible events. For a finite sample space Ω its size is 2^|Ω|.
26
What is the sigma-additivity property?
For any finite or countable collection of mutually exclusive events E1, E2, ..., P{E1 ∪ E2 ∪ ...} = P{E1} + P{E2} + ...
27
What is the formal definition of probability?
Probability is a function of events whose domain is the sigma-algebra and whose range is [0, 1]. It satisfies the sigma-additivity property and assigns unit probability to the sample space: P{Ω} = 1.
28
What is the probability of an empty event?
0
29
The probability of an event is equal to what?
The sum of the probabilities of all the mutually exclusive outcomes contained in that event.
30
Only what kind of events satisfy the Sigma-additivity property?
Mutually exclusive events.
31
How do you calculate the probability of events that are not mutually exclusive?
With the general addition rule: P{A ∪ B} = P{A} + P{B} - P{A ∩ B}.
32
What is the complement rule?
P{¬A} = 1 - P{A}.
33
How do you calculate the probability of independent events?
By multiplying their probabilities: P{A ∩ B} = P{A} P{B}.
34
When are events independent?
When the occurrence of one event does not affect the probabilities of the other events occurring.
35
What is the notation for the sigma algebra?
36
What is a random variable?
A variable that depends on chance.
37
What is a stochastic process?
An experimental model in which the random variables depend on time.
38
What is fundamental to correctly determining the likelihood of an experiment's outcomes?
Precisely defining the experiment is fundamental to determining the likelihood of its outcomes.
39
When can we say for certain the value of a random variable?
We can't. We can only talk about its distribution: all possible values of the random variable together with their likelihoods of occurrence.
40
What are the three interpretations of probability?
Classical, subjective (or Bayesian), and frequentist.
41
What is the classical interpretation of probability?
We have an intuitive idea of probability and in some situations already know how to compute it, such as when rolling a six-sided die with equally likely outcomes.
42
What is the frequentist interpretation of probability?
In some situations our idea of probability is not something we compute ourselves but is based on past observations, that is, on how often the event has occurred in the long run.
43
What is the subjective or Bayesian interpretation of probability?
Probability is a degree of belief. We have an intuitive idea of probability that may not fit the classical or frequentist interpretations.
44
What is an example of Bayesian interpretation of probability?
One in which the experiment cannot be carried out, for example because it is destructive: what is the probability of that bridge collapsing?
45
When do we say an event occurred?
When the outcome of an experiment is a member of that event.
46
An experiment will have how many outcomes?
Exactly one.
47
What does it mean that ¬A is relative to the sample space?
It means that ¬A includes everything in the sample space not in A.
48
What is the probability of each outcome when the sample space consists of n equally likely outcomes?
1/n
49
How is the probability of an event calculated?
(number of outcomes in the event) / (number of outcomes in the sample space), for equally likely outcomes.
50
In reality most situations do not have what?
Equally likely outcomes.
51
Equally likely outcomes are usually associated with the phrases
"fair game" or randomly selected.
52
Outcomes forming an event are often called what?
Favorable outcomes.
53
What does sampling with replacement mean?
Every sampled item is returned to the initial set, so that any of the n objects can be selected with probability 1/n at any time.
54
What provides special techniques for the computation of favorable outcomes and total outcomes?
Combinatorics.
55
When sampling with replacement, the same object may what?
Be sampled more than once.
56
What does sampling without replacement mean?
every sampled item is removed from further sampling, so the set of possibilities reduces by 1 after each selection.
57
When are objects distinguishable?
If sampling exactly the same objects in a different order yields a different outcome, that is, a different element of the sample space.
58
When are objects indistinguishable?
if the order is not important, it only matters which objects are sampled and which ones are not. Indistinguishable objects arranged in a different order do not generate a new outcome.
59
What is an example of distinguishable objects without replacement?
A password.
60
How are permutations with replacement calculated?
P_r(n, k) = n^k, where n is the number of possible selections and k is how many selections are made.
61
How are permutations calculated without replacement?
P(n, k) = n! / (n - k)! = n(n - 1)...(n - k + 1).
62
What are permutations?
Possible selections of k distinguishable objects from a set of n.
63
How do you calculate combinations without replacement?
C(n, k) = n! / (k! (n - k)!).
64
The number of permutations of k objects from n is equal to what?
the number of possible allocations of k distinguishable objects among n available slots.
65
What are combinations?
Possible selections of k indistinguishable objects from a set of n
66
What is an example of a combination?
Antivirus software reports that 3 folders out of 10 are infected. How many possibilities are there? Order does not matter in this case: A, B, C is the same outcome as B, A, C.
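The counting rules from the last few cards can be evaluated with Python's standard library (`math.perm` and `math.comb`, available from Python 3.8). The values n = 10 and k = 3 come from the antivirus example; the variable names are illustrative.

```python
import math

n, k = 10, 3  # 10 folders, 3 of them infected

perm_with_repl = n ** k          # distinguishable, with replacement: n^k
perm_no_repl = math.perm(n, k)   # distinguishable, without replacement: n!/(n-k)!
comb_no_repl = math.comb(n, k)   # indistinguishable, without replacement: n!/(k!(n-k)!)

print(perm_with_repl, perm_no_repl, comb_no_repl)  # 1000 720 120
```

For the antivirus question the relevant count is the last one: 120 possible sets of infected folders.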
67
What is conditional probability?
The conditional probability of event A given event B is the probability that A occurs when B is known to occur.
68
How is conditional probability denoted?
P{A | B}
69
How is the conditional probability of A given B calculated?
P{A | B} = P{A ∩ B} / P{B}, provided P{B} > 0.
70
How can the conditional probability of A given B be rearranged to give us the probability of the general intersection?
Multiply both sides by P{B}: P{A ∩ B} = P{B} P{A | B}.
71
How can independence be mathematically defined?
A and B are independent if P{A | B} = P{A}, or equivalently P{A ∩ B} = P{A} P{B}.
72
How do we know if events are independent?
If
73
Is the probability of A given B equal to the probability of B given A?
Not in general; the two are related by Bayes' rule.
74
What can be used to find
Bayes Rule
75
What is Bayes Rule?
P{B | A} = P{A | B} P{B} / P{A}.
76
What is independence?
Events A and B are independent if occurrence of B does not affect the probability of A
77
In the case of two events A and B, how is the probability of A calculated using the law of total probability?
P{A} = P{A | B} P{B} + P{A | ¬B} P{¬B}.
78
How is Bayes rule for two events calculated using the law of total probability for A?
P{B | A} = P{A | B} P{B} / (P{A | B} P{B} + P{A | ¬B} P{¬B}).
79
What is often used to calculate the denominator in Bayes Rule?
The law of total probability.
80
What does the law of total probability do?
It relates the unconditional probability of an event A with its conditional probabilities
81
When is law of total probability used?
when it is easier to compute conditional probabilities of A given additional information.
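As a minimal sketch of how the law of total probability feeds the denominator of Bayes rule, consider two events A and B. The function names and the numbers (P{B} = 0.01, P{A | B} = 0.95, P{A | ¬B} = 0.05) are hypothetical and not taken from the cards.

```python
def total_probability(p_a_given_b, p_a_given_not_b, p_b):
    """P{A} = P{A|B} P{B} + P{A|not B} P{not B}."""
    return p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

def bayes(p_a_given_b, p_a_given_not_b, p_b):
    """P{B|A}, with the denominator P{A} from the law of total probability."""
    p_a = total_probability(p_a_given_b, p_a_given_not_b, p_b)
    return p_a_given_b * p_b / p_a

# Hypothetical inputs for illustration only.
p_a = total_probability(0.95, 0.05, 0.01)
p_b_given_a = bayes(0.95, 0.05, 0.01)
print(p_a, p_b_given_a)  # roughly 0.059 and 0.161
```

Even with P{A | B} = 0.95, the posterior P{B | A} stays small because B itself is rare, which is why P{A | B} and P{B | A} should not be confused.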
82
83
A random variable X is a function of what?
It is a function of an outcome ω of an experiment, X = f(ω). In other words, it is a variable that depends on chance. We cannot know what X is until the experiment has an outcome.
84
What is the domain of a random variable?
The sample space is its domain.
85
What is the range of a random variable?
It can be the set of all real numbers or any subset of the real numbers, depending only on what values the random variable can take.
86
When working with a random variable X, what do we chart?
We chart all of the possible values x, and their corresponding probabilities.
87
What is known as the distribution of X?
The collection of all probabilities related to X.
88
What is the set of all possible values of X called?
The support of the distribution.
89
What is the cumulative distribution function?
F(x) = P{X <= x}.
90
What is the probability mass function of a value x?
P(x) = P{X = x}.
91
What are discrete random variables?
variables whose range is finite or countable.
92
When A is an interval from a to b, how can its probability be computed directly from the cumulative distribution function?
P{a < X <= b} = F(b) - F(a).
93
94
What is the set
exhaustive and mutually exclusive events for different pairs (x, y).
95
What is the addition rule for two random variables?
P_X(x) = sum over y of P(x, y), and P_Y(y) = sum over x of P(x, y), where P(x, y) is the joint pmf.
96
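When are two random variables independent?
When their joint pmf factors into the marginals: P(x, y) = P_X(x) P_Y(y) for all values of x and y.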
97
What are continuous random variables?
Variables whose range is a whole interval of values. This could be a bounded interval (a, b), or an unbounded interval such as (a, +∞), (-∞, b), or (-∞, +∞).
98
Expected value is denoted with what?
99
What is the general formula for the expectation?
For a discrete variable, E(X) = sum over x of x P(x).
100
101
102
103
What is an example of a continuous random variable?
A long jump is formally a continuous random variable because an athlete can jump any distance within some range.
104
How is the variance of a random variable calculated?
105
When does the variance equal zero?
106
What is the expectation of a random variable?
its mean, the average value
107
108
How is the correlation coefficient calculated?
109
110
111
What is Chebyshev's inequality?
112
Suppose the number of errors in new software has E(X) = 20 and a standard deviation of 2. What can be said about the probability of the software having more than 30 errors?
By Chebyshev's inequality, P{X > 30} <= P{|X - 20| >= 10} <= (2/10)^2 = 0.04.
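Chebyshev's inequality bounds the probability of a large deviation from the mean using only the standard deviation: P{|X - mu| > eps} <= (sigma / eps)^2. The sketch below applies it to the software-errors card; the function name `chebyshev_bound` is an assumption made for illustration.

```python
def chebyshev_bound(sigma, eps):
    """Upper bound on P{|X - mu| > eps} from Chebyshev's inequality."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    # A probability can never exceed 1, so cap the bound there.
    return min(1.0, (sigma / eps) ** 2)

# E(X) = 20, standard deviation 2, threshold of 30 errors.
bound = chebyshev_bound(sigma=2, eps=30 - 20)
print(bound)  # about 0.04
```

Since {X > 30} is contained in {|X - 20| > 10}, the probability of more than 30 errors is at most about 0.04, whatever the distribution of X.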
113
If X and Y are integers what can their expectations be?
Any real number.
114
What does expectation show?
where the average value of a random variable is located, or where the variable is expected to be, plus or minus some error.
115
How is the variability of a random variable's value measured?
By its distance from the expectation.
116
How is standard deviation denoted?
σ
117
How is standard deviation calculated?
It is the nonnegative square root of the variance: σ = sqrt(Var(X)).
118
What is covariance?
summarizes interrelation of two random variables.
119
What does it mean if Cov(X, Y) = 0?
The two variables are uncorrelated. This does not by itself imply that they are independent.
120
What does the correlation coefficient do?
tells how strongly two variables are correlated, values near 1 indicate strong positive correlation, values near -1 show strong negative correlation, and values near 0 show weak correlation or no correlation.
121
For independent X and Y, Cov(X, Y) equals what?
It equals zero.
122
123
The probability of at least 2 is the complement of what?
1 or less, at most 1
124
What is the probability of Event 1 and Event 2 when Event 2 is a subset of Event 1?
The probability is the probability of Event 2
125
What is a Bernoulli variable?
A random variable that can only take on two possible values, 0 and 1.
126
What is a Bernoulli trial?
An experiment with a binary outcome.
127
What are some examples of Bernoulli trials?
* Pass or fail tests * Heads, or tails * Boys or girls
128
What are the two generic names used for the outcomes of Bernoulli trials?
Successes and Failures; however successes do not have to be good and failures do not have to be bad.
129
In a Bernoulli distribution, if P(1) = p, what is P(0)?
1-p
130
The expectation of a Bernoulli trial is always what?
The probability of a success.
131
The variance of a Bernoulli variable is always what?
The product of the probabilities of success and failure, p(1 - p).
132
The number of Bernoulli trials needed to get the first success has what kind of distribution?
It has a Geometric distribution.
133
What is an example of an experiment with geometric distribution?
A search engine goes through a list of sites looking for a given key phrase, and terminates as soon as the key phrase is found. The number of sites visited is geometric.
134
Geometric Random Variables can take what?
Any integer value from one to infinity
135
What is the probability mass function for a geometric distribution?
P(x) = (1 - p)^(x - 1) p, for x = 1, 2, ...
136
How is a variable with a Binomial distribution described?
As the number of successes in n trials.
137
What kind of Variable has Binomial distribution?
A variable described as the number of successes in a sequence of independent Bernoulli trials.
138
What is the PMF of a Binomial distribution?
P(x) = C(n, x) p^x q^(n - x), for x = 0, 1, ..., n, where q = 1 - p.
139
What do words like "at least" and "at most" usually mean?
They usually mean that the CDF should be used.
140
What table has the CDF of Binomial distributions?
Table A2
141
What is the expectation of a binomial distribution?
np, where n is the number of trials.
142
What is the variance of a Binomial distribution?
npq, where q is 1-p
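The formulas np and npq can be checked numerically against the Binomial PMF, C(n, x) p^x q^(n - x). This is a sketch with hypothetical values n = 10 and p = 0.3; `binomial_pmf` is an illustrative helper, not a library function.

```python
import math

def binomial_pmf(x, n, p):
    """P{X = x} for the number of successes in n independent Bernoulli(p) trials."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3  # hypothetical parameters
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Expectation and variance computed directly from the distribution.
mean = sum(x * px for x, px in enumerate(pmf))
var = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(mean, n * p)            # both close to 3.0
print(var, n * p * (1 - p))   # both close to 2.1
```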
143
What is the geometric distribution?
144
What has a Negative Binomial distribution?
the number of trials needed to obtain k successes
145
What are Poissonian events?
events that are extremely unlikely to occur simultaneously or within a very short period of time.
146
Binomial variables count what?
the number of successes in a fixed number of trials
147
Negative Binomial variables count what?
the number of trials needed to see a fixed number of successes.
148
What is the Poisson distribution?
149
What has a Poisson distribution?
The number of rare events occurring within a fixed period of time.
150
What are examples of Poissonian events?
traffic accidents, arrivals of jobs, telephone calls, virus attacks, floods, and earthquakes.
151
What table has the values of the CDFs of Poisson distributions?
Table A3
152
If the period of time changes in a problem using Poisson distribution what needs to be adjusted?
Only the frequency parameter, which must be rescaled to the average number of events over the new time period.
153
What is the expectation of a negative binomial distribution?
k/p where k is how many successes
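The expectation k/p can be checked against the standard Negative Binomial PMF, P(x) = C(x - 1, k - 1) p^k (1 - p)^(x - k) for x = k, k + 1, ... The sketch below assumes that PMF and uses hypothetical values k = 3 and p = 0.5.

```python
import math

def neg_binomial_pmf(x, k, p):
    """P{the k-th success occurs on trial x}, for x = k, k+1, ..."""
    if x < k:
        return 0.0
    return math.comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

k, p = 3, 0.5  # hypothetical values for illustration
# Truncate the infinite sum; the tail beyond 200 trials is negligible here.
mean = sum(x * neg_binomial_pmf(x, k, p) for x in range(k, 201))
print(mean, k / p)  # both are close to 6
```

With k = 1 the same PMF reduces to the Geometric distribution, the number of trials needed to get the first success.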
154
What is the variance of a negative binomial distribution?
155
How do you calculate the PMF of a negative binomial distribution? part 1
156
How do you calculate the PMF of a negative binomial distribution? part 2
157
How do you calculate the PMF of a negative binomial distribution? part 3
158
What is Poisson Approximation of a Binomial distribution?
159
For all continuous variables, P(x) = ?
zero.
160
In both continuous and discrete cases, the CDF is what?
a non-decreasing function that ranges from 0 to 1.
161
What is different about the CDF with continuous variables from discrete variables?
The CDF is a continuous function, and there are no jumps in the CDF.
162
With continuous variables, probabilities are what?
Areas under a density curve.
163
What is the probability density function?
The derivative of the CDF, f(x) = F'(x).
164
What is the total area under a pdf equal to?
The total area under a pdf is equal to 1.
165
What are the four families of continuous distributions discussed in this chapter?
Uniform, Exponential, Gamma, and Normal.
166
When is Uniform distribution used?
In any situation when a value is picked at random from a given interval.
167
Uniform distribution has constant what?
Density.
168
What is the density function for uniform distribution?
f(x) = 1/(b-a) for a < x < b, and 0 elsewhere.
169
What must be true for use of a Uniform distribution?
|b-a| must be a finite number; there is no Uniform distribution on an unbounded interval.
170
What does [a, b] represent in uniform distribution?
the domain of the uniform density function.
171
What is an example of a situation with uniform density?
If a flight scheduled to arrive at 5 pm actually arrives at a Uniformly distributed time between 4:50 and 5:10, then it is equally likely to arrive before 5 pm and after 5 pm.
172
What is the uniform property?
the probability is only determined by the length of the interval, not by its location.
173
How is the variance of a continuous random variable calculated?
\int x^2 f(x) dx - [E(X)]^2
174
How is the expectation of a continuous variable calculated?
\int xf(x)dx
175
What is the Uniform Distribution?
176
What is exponential distribution often used for?
To model time.
177
What are some examples of exponential distributions?
waiting time, interarrival time, hardware lifetime
178
When is the time between events exponential?
when the number of events is Poisson
179
if X is time, measured in minutes, what is lambda?
The frequency: the average number of events per minute.
180
If arrival occurs every half minute what is the expectation?
E(X) = 0.5: we expect one arrival every 0.5 minutes.
181
If arrivals occur every half a minute what is lambda?
lambda = 1/0.5 = 2 arrivals per minute.
182
What does it mean that Exponential variables are memoryless?
It means that having waited for t minutes gets "forgotten," and does not affect the future waiting time.
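The memoryless property can be checked numerically from the Exponential survival function P{T > t} = e^(-lambda t): the conditional probability of waiting a further x minutes, given t minutes already waited, equals the unconditional probability of waiting x minutes. The values of lambda, t, and x below are hypothetical.

```python
import math

def exp_survival(t, lam):
    """P{T > t} for an Exponential(lam) waiting time."""
    return math.exp(-lam * t)

lam, t, x = 2.0, 1.5, 0.5  # hypothetical rate and times, in minutes

# P{T > t + x | T > t} = P{T > t + x} / P{T > t}
conditional = exp_survival(t + x, lam) / exp_survival(t, lam)
unconditional = exp_survival(x, lam)

print(conditional, unconditional)  # the two values agree
```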
183
The time T until the next event is greater than t can be rephrased as what?
Zero events occur by the time t.
184
What is the exponential distribution?
185
When does total time have gamma distribution?
When a certain procedure consists of alpha independent steps that each takes exponential amounts of time.
186
In a process of rare events, what has gamma distribution?
In a process of rare events, with Exponential times between any two consecutive events, the time of the alpha-th event has a Gamma distribution.
187
When does a gamma distribution become exponential?
When alpha = 1.
188
What is the Gamma Distribution?
189
How can we significantly simplify gamma probabilities?
By thinking of a Gamma variable as the time between some rare events.
190
What is the gamma poisson formula?
For a Gamma(alpha, lambda) variable T (with integer alpha) and a Poisson(lambda t) variable X, P{T > t} = P{X < alpha} and P{T <= t} = P{X >= alpha}.
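A minimal sketch of the Gamma-Poisson formula, assuming integer alpha: the Gamma survival probability P{T > t} is computed as the Poisson probability of fewer than alpha events in time t, where the Poisson parameter is lambda times t. The function name and the test values are illustrative.

```python
import math

def gamma_survival(t, alpha, lam):
    """P{T > t} for T ~ Gamma(alpha, lam), integer alpha.

    Uses P{T > t} = P{X < alpha}, where X ~ Poisson(lam * t) counts
    the events that occur by time t.
    """
    mu = lam * t
    return sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(alpha))

# With alpha = 1 the Gamma variable is Exponential, so P{T > t} = exp(-lam * t).
print(gamma_survival(1.5, 1, 2.0), math.exp(-3.0))

# Hypothetical case: time of the 3rd event at rate 2 per minute exceeds 1 minute.
p = gamma_survival(1.0, 3, 2.0)
print(p)  # exp(-2) * (1 + 2 + 2), about 0.677
```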
191
What is normal distribution often found to be good for modeling?
sums, averages, errors, and physical variables like weight, height, and temperature.
192
What is the CDF of the normal distribution?
193
How do you normalize a random variable X to Z?
Z = (X - mu) / sigma, where mu = E(X) and sigma = Std(X).
194
What is the full formula for Variance of X?
Var(X) = E[(X - E(X))^2] = E(X^2) - [E(X)]^2
195
Var(kX + C) = what?
k^2 Var(X)
196
What is the formula for covariance?
Cov(X,Y) = E[(X-E(X))(Y-E(Y))] = E(XY) - E(X)E(Y)
197
What is the Variance of the sum of two variables?
Var(X+Y) = Var(X) + Var(Y) + 2Cov(X,Y)
198
Variance is always expressed in what?
Squared units.
199
What is the correlation between two variables?
Cov(X,Y) / (Std(X) Std(Y))
200
The correlation between two variables will always lie between what?
-1 and 1
201
Let X be the percentage change in value of investment A in the course of one year, and let Y be the percentage change in value of investment B. You have 1 dollar. You invest a in A and 1-a in B. What is the return on investment?
Return: aX+(1-a)Y
202
Let X be the percentage change in value of investment A in the course of one year, and let Y be the percentage change in value of investment B. You have 1 dollar. You invest a in A and 1-a in B. What is the expected return?
Expected Return: aE(X) + (1-a)E(Y)
203
Let X be the percentage change in value of investment A in the course of one year, and let Y be the percentage change in value of investment B. You have 1 dollar. You invest a in A and 1-a in B. What is the variance in your return on investment?
a^2Var(X) + (1-a)^2Var(Y) + 2a(1-a)Cov(X,Y)
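The expected return and variance formulas from the last two cards can be wrapped in one small function. The function name and all numeric inputs below are hypothetical, chosen only to show how a negative covariance reduces the variance of the mix.

```python
def portfolio(a, mean_x, mean_y, var_x, var_y, cov_xy):
    """Expected value and variance of the return aX + (1 - a)Y."""
    expected = a * mean_x + (1 - a) * mean_y
    variance = a**2 * var_x + (1 - a)**2 * var_y + 2 * a * (1 - a) * cov_xy
    return expected, variance

# Hypothetical inputs: half the dollar in each investment.
expected, variance = portfolio(a=0.5, mean_x=0.08, mean_y=0.04,
                               var_x=0.04, var_y=0.01, cov_xy=-0.01)
print(expected, variance)  # about 0.06 and 0.0075
```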
204
What can a continuous random variable model?
Any analog measurement can be modeled by a continuous random variable.
205
Why can't we sum all of the probability measures of a continuous random variable?
We cannot enumerate all of the possible values a continuous random variable can take, so we have to integrate.
206
What breaks down for continuous random variables, and what does not?
The PMF breaks down, but the CDF does not.
207
Let X be a continuous random variable. What is P{X = x}?
0
208
For continuous random variables, P{X <= x} is the same as what?
It is the same as the strict inequality, P{X < x}. The same holds for P{X >= a} and P{X > a}, and for P{a <= X <= b} and P{a < X < b}.
209
What is the powder description of PDF?
A PDF is not a PMF; it describes only density. If we were to grind chalk into a powder and spread it along the number line, with more powder where the continuous random variable is more likely to take values, the denser areas would be where the likeliest values are located. A PDF curve represents this graphically. The probability of a region is the weight of the powder in that region, and the CDF F(x) is the weight of all the powder to the left of x. If we were to weigh the powder at a single point, the weight would be zero.
210
For a continuous random variable, P{a < X < b} is equal to what?
\int_a^b f(x) dx, where f(x) is the pdf.
211
P{X < b} = ?
F(b) = \int_{-\infty}^{b} f(x) dx
212
What are the two conditions for a function to be a legitimate PDF?
f(x) >= 0 for all x, and the total area under it is 1: \int_{-\infty}^{\infty} f(x) dx = 1.
213
Given f_X(x) = { cx^2 for 0 < x < 1, and zero otherwise }, find c.
c = 3, because \int_0^1 cx^2 dx = c/3 must equal 1.
214
Given f_X(x) = { 3x^2 for 0 < x < 1, and zero otherwise }, find the cdf.
F(x) = x^3 for 0 < x < 1, with F(x) = 0 for x <= 0 and F(x) = 1 for x >= 1.
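The two cards on the density cx^2 can be worked out in a few lines. The steps below use only the conditions that a pdf integrates to 1 and that the cdf is the integral of the pdf.

```latex
\int_{0}^{1} c\,x^{2}\,dx = \frac{c}{3} = 1
\quad\Longrightarrow\quad c = 3,
\qquad
F(x) = \int_{0}^{x} 3u^{2}\,du = x^{3}, \quad 0 < x < 1 .
```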
215
What is the cdf of f(x)?
216
When we are modeling continuous random variables, how is the probability density function denoted?
217
What is the probability outside of the region enclosed by the pdf?
0
218
P(X > x) = ?
P(X > x) = 1 - P(X <= x) = 1 - F(x)
219
Without knowing a random variable's distribution, what can we do if we know its parameters?
We can use Chebyshev's inequality.
220
What parameters must we know to use Chebyshev's inequality?
The variance (or standard deviation) and the expectation.
221
How do we define the joint cumulative distribution function of two continuous random variables?
222
How is the joint probability density function of a continuous random variable defined?
223
How do we obtain the marginal pdf of a continuous random variable from a joint pdf?
We integrate the joint pdf over the random variables we want to eliminate.
224
225
Given a joint pdf, how can we find P{{a < X < b} and {a < Y < b}}?
226
227
228
When we are speaking of uniform distribution, what does a larger value of b-a mean?
Larger variance
229
Derive the Expectation and variance of the uniform distribution.
Do it
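One way to carry out the derivation, assuming the Uniform(a, b) density f(x) = 1/(b - a) on (a, b):

```latex
E(X) = \int_{a}^{b} \frac{x}{b-a}\,dx = \frac{b^{2}-a^{2}}{2(b-a)} = \frac{a+b}{2},
\qquad
E(X^{2}) = \int_{a}^{b} \frac{x^{2}}{b-a}\,dx = \frac{b^{3}-a^{3}}{3(b-a)} = \frac{a^{2}+ab+b^{2}}{3},

\operatorname{Var}(X) = E(X^{2}) - [E(X)]^{2}
= \frac{a^{2}+ab+b^{2}}{3} - \frac{(a+b)^{2}}{4}
= \frac{(b-a)^{2}}{12}.
```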
230
What is the pdf of the exponential distribution?
231
Derive the CDF, Expectation, and Variance of the exponential distribution.
do it
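A sketch of the derivation, assuming the Exponential density f(x) = \lambda e^{-\lambda x} for x > 0 and using integration by parts for the two moments:

```latex
F(x) = \int_{0}^{x} \lambda e^{-\lambda u}\,du = 1 - e^{-\lambda x}, \quad x > 0,

E(X) = \int_{0}^{\infty} x\,\lambda e^{-\lambda x}\,dx = \frac{1}{\lambda},
\qquad
E(X^{2}) = \int_{0}^{\infty} x^{2}\,\lambda e^{-\lambda x}\,dx = \frac{2}{\lambda^{2}},

\operatorname{Var}(X) = E(X^{2}) - [E(X)]^{2} = \frac{2}{\lambda^{2}} - \frac{1}{\lambda^{2}} = \frac{1}{\lambda^{2}}.
```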
232
How is the exponential distribution related to the Poisson Process?
The expectation of the Exponential distribution, 1/lambda, is the reciprocal of the Poisson rate lambda (the expected number of events per unit time). The time between consecutive events of a Poisson process is Exponentially distributed.
233
What does lambda represent?
lambda is the average number of events in some time interval.
234
What does the memoryless property mean?
The past doesn't matter, only the present. More formally, P{T > t + x | T > t} = P{T > x}, where t is the time already waited and x is the additional waiting time.
235
Derive P{T > t}
P{T > t} = 1 - F(t) = 1 - (1 - e^(-lambda t)) = e^(-lambda t), for t > 0.
236
What is the only continuous distribution that has the memoryless property?
The exponential distribution
237
If we redefine lambda as the average number of events per Unit time and assume a period of interest as [0, t], what is the average number of events in that period?
lambda * t
238
Gamma distribution is used to describe what?
Waiting times in a Poisson process (a process in which events occur continuously and independently at a constant rate), such as the time until the alpha-th event.
239
What is the gamma pdf?
240
What is the gamma function?
241
What is the expectation and variance of the Gamma distribution?
242
What can we learn from the graph of a PDF?
The variance translates to the width of the PDF. The more spread out the PDF, the wider the range of values the random variable is likely to take.
243
What else is the Normal distribution known as?
the Gaussian distribution
244
What is the mathematical representation of the probability that a random variable is within the neighborhood of 2 standard deviations?
245
Besides 1-F(t), how can P{T > t} be calculated for Exponential modeling?
246
What does the Gamma-Poisson formula really mean at its heart?
It means that if fewer than alpha events happen in an interval of length t, then the alpha-th event has not yet occurred, so T > t for the Gamma random variable.
247
When are two continuous random variables independent?
If and only if their joint pdf (equivalently, their joint cdf) factors into the product of two functions, one of x alone and one of y alone, and the ranges of x and y do not depend on each other.
248
What does it mean mathematically if X and Y are independent continuous random variables?
Their covariance is zero: E(XY) - E(X)E(Y) = 0, or E(XY) = E(X)E(Y). This follows from independence but does not by itself imply it.
249
How do we find the expectation of a continuous random variable?
250
How do we find the variance of a continuous random variable?
251
How do we find the covariance of a continuous random variable?
252
253
254
255
What is a stochastic process?
A random variable that also depends on time.
256
A stochastic process is a function of what?
Two arguments, X(t, w): the time t and the outcome w, where w is an element of the sample space, that is, an outcome of an experiment.
257
What are values of X(t, w) called?
These are states
258
What do we get if we fix a time of a Stochastic Process?
We get the function of an outcome: Xt(w)
259
What is a realization or trajectory of a stochastic process?
It is the function of time Xw(t) that we obtain by fixing the outcome w.
260
When is a stochastic process discrete-state?
When Xt(w) is discrete for each time t
261
When is stochastic process continuous state?
When Xt(w) is continuous for each time t.
262
When is a stochastic process a discrete-time process?
When the set of times T is discrete, that is, consists of separate, isolated points.
263
The CPU usage process in percent is what kind of stochastic process?
A continuous time, continuous state process.
264
The temperature reported every hour and rounded to the nearest integer is what kind of process?
A discrete time, discrete state process.
265
When is a stochastic process a continuous time process?
When T is a connected and possibly unbounded interval.
266
If a stochastic process is Markov, the conditional distribution of X(t) is the same under what two conditions?
1. given observations of the process X at several moments in the past. 2. given only the latest observation of X.
267
If a process is Markov, P{future | past, present} = ?
P{future|present}
268
What is a Markov chain?
A discrete-time, discrete state Markov stochastic process.
269
When is a Markov chain homogeneous?
When all its transition probabilities are independent of t. Being homogeneous means that transition from i to j has the same probability at any time.
270
What does the Markov property mean?
It means that only the value of X(t) matters for predicting X(t+1).
271
It is the probability of moving from state i to state j by means of h transitions.
272
What is an h-step transition probability?
It is the probability of moving from state i to state j by means of h transitions.
273
The distribution of a Markov chain is completely determined by what?
The initial distribution (the distribution of the initial state X(0)) and the one-step transition probabilities.
274
What is our long term forecast?
The limit of our h-step transition probabilities as h goes to infinity.
275
In some town, each day is either sunny or rainy. A sunny day is followed by another sunny day with probability 0.7, whereas a rainy day is followed by a sunny day with probability 0.4. What are the transition probabilities?
With state 1 = sunny and state 2 = rainy: p11 = 0.7, p12 = 0.3, p21 = 0.4, p22 = 0.6.
276
All one-step transition probabilities can be conveniently written as what?
An n × n matrix.
277
What do the rows represent in a transition probability matrix?
The from state
278
What do the columns represent in a transition probability matrix?
The to state.
279
What in a one-step transition probability matrix sums to 1?
Each row, but this is generally not true for the column totals.
280
How do we find the h-step transition probability matrix?
We raise the matrix P to the power h.
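As a sketch, the matrix power can be computed with plain Python lists, using the sunny/rainy chain from the earlier card (state 0 = sunny, state 1 = rainy in the code). The helper names `mat_mul` and `mat_pow` are illustrative; a numerical library would normally be used instead.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def mat_pow(p, h):
    """Raise a square matrix p to the integer power h >= 0."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(h):
        result = mat_mul(result, p)
    return result

# One-step transition matrix: rows are the "from" state, columns the "to" state.
P = [[0.7, 0.3],
     [0.4, 0.6]]

P2 = mat_pow(P, 2)
print(P2)  # approximately [[0.61, 0.39], [0.52, 0.48]]
```

Each row of the 2-step matrix still sums to 1, as it must for any transition probability matrix.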
281
A computer is shared by 2 users who send tasks to a computer remotely and work independently. At any minute, any connected user may disconnect with probability 0.5, and any disconnected user may connect with a new task with probability 0.2. Let X(t) be the number of concurrent users at a time t. What are the states of the Markov chain?
0, 1, 2
282
A computer is shared by 2 users who send tasks to a computer remotely and work independently. At any minute, any connected user may disconnect with probability 0.5, and any disconnected user may connect with a new task with probability 0.2. Let X(t) be the number of concurrent users at a time t. What is the transition probability matrix?
With states 0, 1, 2 as both rows and columns, the rows of P are (0.64, 0.32, 0.04), (0.40, 0.50, 0.10), and (0.25, 0.50, 0.25).
283
If the following is the 2-step transition probability matrix, what is the probability the system will go from 2 users to 0 users after 2 units of time?
0.4225
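The value 0.4225 can be reproduced from the description of the shared computer. The sketch assumes that each of the i connected users independently stays connected with probability 0.5 and each of the 2 - i disconnected users independently connects with probability 0.2; the function names are illustrative.

```python
import math

def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def transition_matrix(users=2, p_disconnect=0.5, p_connect=0.2):
    """One-step matrix for X(t) = number of concurrent users."""
    matrix = []
    for i in range(users + 1):                 # i users currently connected
        row = [0.0] * (users + 1)
        for stay in range(i + 1):              # connected users who stay
            for join in range(users - i + 1):  # disconnected users who connect
                row[stay + join] += (binom_pmf(stay, i, 1 - p_disconnect)
                                     * binom_pmf(join, users - i, p_connect))
        matrix.append(row)
    return matrix

def two_step(p):
    """Square the one-step matrix to get 2-step transition probabilities."""
    n = len(p)
    return [[sum(p[i][k] * p[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = transition_matrix()
P2 = two_step(P)
print(round(P2[2][0], 4))  # 0.4225
```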
284
If we have n independent variables X with the same expectation and standard deviation, what can we use to predict their sum?
The Central Limit Theorem.
285
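If we have n independent variables X1, ..., Xn with the same expectation and standard deviation, what is the standardized sum of X1 + X2 + ... + Xn?
Z_n = (S_n - n mu) / (sigma sqrt(n)), where S_n = X1 + ... + Xn, mu is the common expectation, and sigma is the common standard deviation.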
286
287
What are the conditions for using the Central Limit Theorem?
It applies to independent random variables of virtually any thinkable distribution, as long as they have the same finite expectation and variance and n is large (as a rule of thumb, n > 30).
288
When a large number of random variables are independent and have the same expectation and standard deviation, the sum of those random variables creates what?
A new random variable with an approximately Normal distribution.
289
What kind of curve is the normal density curve?
A bell shaped curve, symmetric, and centered around the expectation.
290
What does normalizing a random variable have the effect of?
Taking the bell curve and placing its expectation around zero and giving it a standard deviation of 1.
291
The spread of the normal density curve is controlled by what?
The standard deviation.
292
What does Z represent?
A standard normal random variable.
293
What is the standard normal distribution?
It is a normal distribution with the expectation = 0, and std = 1.
294
A disk has free space of 330 megabytes. Is it likely to be sufficient for 300 independent images, if each image has expected size of 1 megabyte with a standard deviation of .5 megabytes?
Each image is a random variable with its own expectation and standard deviation, so we use the central limit theorem.
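The calculation behind this card can be sketched as follows, assuming the Normal approximation from the Central Limit Theorem is adequate for n = 300 images. The variable names are illustrative.

```python
from statistics import NormalDist

n, mu, sigma = 300, 1.0, 0.5   # number of images, mean and std of one image (MB)
free_space = 330.0             # MB available on the disk

# Standardize the total size S_n: Z = (S_n - n*mu) / (sigma * sqrt(n))
z = (free_space - n * mu) / (sigma * n ** 0.5)

# P{S_n <= 330} is approximately Phi(z)
prob_fits = NormalDist().cdf(z)
print(round(z, 2), round(prob_fits, 4))
```

The probability is very close to 1, so the disk space is very likely to be sufficient.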
295
What is the formula for Normal approximation to Binomial Distribution?
296
What is the continuity correction, when is it used?
It is used when we approximate a discrete distribution with a continuous distribution. P(x) = P{X = x} = P{x-0.5 \< X \< x+0.5}
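As a hedged illustration, the sketch below approximates a Binomial(n, p) probability with a Normal distribution that has mu = np and sigma = sqrt(np(1-p)), applying the continuity correction. The values n = 100, p = 0.5, and x = 50 are made up for the example.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(x, n, p):
    """Exact Binomial probability P{X = x}."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def normal_approx_pmf(x, n, p):
    """Normal approximation of P{X = x} using P{x - 0.5 < X < x + 0.5}."""
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return approx.cdf(x + 0.5) - approx.cdf(x - 0.5)

exact = binom_pmf(50, 100, 0.5)
approx = normal_approx_pmf(50, 100, 0.5)
print(exact, approx)  # the two values should be close
```

Without the correction, a continuous distribution would assign probability 0 to the single point X = x.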
297
How do we obtain a standard normal variable from a nonstandard normal variable?
298
How does changing the expectation to a normal distribution affect its density curve?
Changing the expectation shifts the curve to the right or to the left.
299
How can we unstandardize Z?
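A small sketch of the usual pair of transformations, Z = (X - mu)/sigma and X = mu + sigma\*Z, assuming X is Normal with expectation mu and standard deviation sigma. The function names are illustrative.

```python
def standardize(x, mu, sigma):
    """Map a Normal(mu, sigma) value to the standard normal scale."""
    return (x - mu) / sigma

def unstandardize(z, mu, sigma):
    """Map a standard normal value back to the Normal(mu, sigma) scale."""
    return mu + sigma * z

z = standardize(130.0, mu=100.0, sigma=15.0)   # two standard deviations above the mean
x = unstandardize(z, mu=100.0, sigma=15.0)     # back to the original value
print(z, x)
```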
300
How is table A4 read?
The rows represent the first two digits of Z and the column represents the third digit of Z
301
What is the probability density function of a Cauchy random variable?
302
A Cauchy random variable does not have what?
An expectation.
303
What is the Cauchy distribution CDF?
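For reference, a sketch of the standard Cauchy density f(x) = 1 / (pi\*(1 + x^2)) and CDF F(x) = 1/2 + arctan(x)/pi. This assumes the standard form with location 0 and scale 1; the deck may use a parameterized version.

```python
from math import atan, pi

def cauchy_pdf(x):
    """Density of the standard Cauchy distribution."""
    return 1.0 / (pi * (1.0 + x * x))

def cauchy_cdf(x):
    """CDF of the standard Cauchy distribution."""
    return 0.5 + atan(x) / pi

print(cauchy_pdf(0.0), cauchy_cdf(0.0), cauchy_cdf(1.0))
```

The density is symmetric around 0, yet its tails are too heavy for the expectation to exist.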
304
When a distribution is not known, what can Chebyshev's inequality give us?
A bound on the probability that the variable deviates from its expectation by more than a given amount.
305
Modelling experiments as random variables describes what?
A single outcome of the experiment.
306
A random process is a function that assigns what to what?
A time function to every outcome of a random experiment.
307
We have a stochastic process if the outcome of an experiment results in what?
A function of time.
308
What is Chebyshev's inequality?
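In its usual form, Chebyshev's inequality states that P{|X - mu| \> eps} is at most (sigma/eps)^2 for any distribution with expectation mu and variance sigma^2. The sketch below checks the bound on a fair six-sided die, which is an example chosen here and not taken from the deck.

```python
from fractions import Fraction

outcomes = range(1, 7)
prob = Fraction(1, 6)                                   # fair die, each face equally likely

mu = sum(x * prob for x in outcomes)                    # expectation
var = sum((x - mu) ** 2 * prob for x in outcomes)       # variance sigma^2

def chebyshev_bound(variance, eps):
    """Upper bound on P{|X - mu| > eps}."""
    return variance / eps ** 2

eps = 2
exact = sum(prob for x in outcomes if abs(x - mu) > eps)  # only faces 1 and 6 qualify
bound = chebyshev_bound(var, eps)
print(exact, bound)
```

The bound is loose here, which is typical: it holds for every distribution, so it cannot be tight for most of them.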
309
If the outcomes of a stochastic process are continuous, but the times are discrete, we have what?
A continuous random sequence.
310
If the outcomes of a stochastic process are discrete and the time is continuous, what kind of process do we have?
A discrete random process.
311
When do we have a continuous random process?
When we have a stochastic process in which the outcomes are continuous and so is the time.
312
What does the autocorrelation of a stochastic process tell us?
It tells us how strongly the outcome at time t2 is correlated with the outcome at time t1.
313
How can a distribution of states be represented?
As a 1x n matrix or a row vector.
314
How do we find the distribution of states after h transitions using matrix algebra?
Ph = P0 P^h, where P0 is the distribution of states at time t = 0. P0 can be a known starting state such as (0, 0, 1) or a distribution of probabilities such as (1/3, 1/3, 1/3). The matrix multiplication yields a row vector Ph, the distribution of states after h transitions.
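A minimal sketch of this computation. The matrix values are the ones implied by the shared-computer card (disconnect with probability 0.5, connect with probability 0.2) and are an assumption of this example; the function names are illustrative.

```python
def step(dist, P):
    """Multiply a row vector of state probabilities by the transition matrix P."""
    return [sum(dist[i] * P[i][j] for i in range(len(P))) for j in range(len(P[0]))]

def distribution_after(dist0, P, h):
    """Distribution of states after h transitions: P0 * P^h."""
    dist = list(dist0)
    for _ in range(h):
        dist = step(dist, P)
    return dist

P = [[0.64, 0.32, 0.04],
     [0.40, 0.50, 0.10],
     [0.25, 0.50, 0.25]]

start = [0.0, 0.0, 1.0]                # the chain starts with 2 users connected
after_two = distribution_after(start, P, 2)
print(after_two)
```

Multiplying the vector by P step by step avoids forming P^h explicitly and gives the same result.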
316
What is a steady-state distribution?
A collection of limiting probabilities.
317
A fast system goes through a very large number of transitions very quickly. Its distribution of states is then approximately what?
A steady state distribution.
318
What is the steady state formula?
piP = pi
319
All pi\_i in the steady-state distribution must add to what?
They must add to 1.
320
The steady state distribution is the solution to what?
The system piP = pi, and the sum of all pi = 1
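One hedged way to approximate the solution of piP = pi with the probabilities summing to 1 is to apply P repeatedly until the distribution stops changing, which works for regular chains. The matrix below is the one implied by the shared-computer card and is an assumption of this sketch.

```python
def steady_state(P, iterations=200):
    """Approximate the steady-state distribution by repeated multiplication by P."""
    n = len(P)
    pi = [1.0 / n] * n                     # any starting distribution works for a regular chain
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    total = sum(pi)                        # renormalize to guard against rounding drift
    return [x / total for x in pi]

P = [[0.64, 0.32, 0.04],
     [0.40, 0.50, 0.10],
     [0.25, 0.50, 0.25]]

pi = steady_state(P)
print([round(x, 4) for x in pi])
```

For this chain the result agrees with the exact solution (25/49, 20/49, 4/49), which can be checked by substituting it into piP = pi.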
321
What kind of Markov chains have a steady-state distribution?
Regular Markov chains.
322
What are regular Markov chains?
Chains for which, for some h, every entry of the h-step transition matrix P^h is strictly positive. That is, after enough transitions every state can be reached from every state.
323
When there is a state i with pii = 1, the Markov chain cannot be what?
It cannot be a regular Markov chain. Such a state i is called an absorbing state.
324
What is a limiting matrix?
A limiting matrix is the transition matrix that is created by P^h as h goes to infinity.
325
What does the limiting matrix look like?
It is a matrix where all rows are identical.
326
What is the notation for a population parameter and its estimator?
327
What do we need to know to solve uncertainties?
We need to know the problem's distribution and its parameters.
328
What must we do to gain sufficient information about the parameters of an observed system?
collect data
329
What do we use to make statements about a very large set?
We use collected and observed samples
330
what is the population?
the set that consists of all units of interest.
331
What is a sample?
a set of observed units from the population.
332
What is a statistic?
Any function of a sample, e.g., the arithmetic mean.
333
Although it happens with a low probability, a sample may sometimes give misleading information, a probability of which is?
Binomial
334
Sampling and non-sampling errors refer to what?
any discrepancy between a collected sample and a whole population.
335
The cause of sampling errors is?
the mere fact that only a portion of the population is observed.
336
What causes non-sampling errors?
inappropriate sampling schemes or wrong statistical techniques.
337
Three examples of wrong sampling techniques?
○ Sampling from a wrong population.
○ Dependent observations: people surveyed together may have opinions dependent on each other.
○ Sampled specimens not being equally likely to be selected.
338
What is simple random sampling?
a sampling design where units are collected from the entire population independently of each other, all being equally likely to be sampled.
339
What is an i.i.d?
independent identically distributed random variables
340
We consider a sample to be what?
a set of random variables obtained by observation.
341
Observations collected by means of simple random sampling design are what?
Independent identically distributed random variables.
342
What are simple descriptive statistics?
measuring the location, spread, variability and other characteristics that can be computed immediately from a collected sample.
343
What is the mean?
measuring the average value of a sample.
344
What is the median?
Measuring the central value of a sample.
345
What do quantiles and quartiles show?
Where certain portions of a sample are located.
346
Each statistic of a sample estimates what?
The corresponding population parameter.
347
What does the sample mean estimate? How is the sample mean denoted? What is the definition of sample mean?
348
What is expected to converge to E(X) as a sample approaches a larger and larger size?
The sample mean.
349
What are the three properties of the sample mean?
1. unbiasedness 2. consistency 3. asymptotic normality
350
What is a disadvantage to sample mean?
its sensitivity to extreme observations.
351
What is asymptotic normality?
the distribution of the normalized estimator converges to standard normal distribution as n approaches infinity.
352
What is consistency, rigorously?
353
What does unbiasedness mean?
Unbiasedness means that in the long run, collecting a large number of samples and computing the estimator from each, on average we hit the population parameter exactly.
354
What is the rigorous definition of unbiasedness?
355
This is much less sensitive to extreme observations than sample mean.
the median
356
Three Definitions of median, sample median, and population median
357
What are the three types of skewness of a distribution?
Symmetric: median = arithmetic mean. Right-skewed: median \< mean. Left-skewed: median \> mean.
358
How do we find the median of a continuous distribution?
We solve F(x) = .5
359
How do we find the median of a discrete distribution?
* For a discrete distribution, F(x) = .5 has either a whole interval of roots, in which case any number in this interval excluding the ends is a median, or no roots at all.
* If there are no roots at all, the smallest x with F(x) \>= .5 is the median.
* It is the value of x where the CDF jumps over .5.
360
How do you find the median of a sample?
361
How can you measure the median speed of cars?
Drive so that half of cars overtake you, and half are overtaken
362
the construction of these tell us how well we can expect our sample parameter to match the population parameter
Construction of confidence intervals.
363
What is a p-quantile?
A number x such that P{X \< x} \<= p and P{X \> x} \<= 1 - p, where p is a proportion between 0 and 1.
364
Notation for population p-quantile, gamma quantile, quartiles, and medians, and their estimators
365
The first, second, and third quartiles are the
25th, 50th, and 75th percentiles. They split a population into four equal parts.
366
What is a gamma-percentile?
A gamma-percentile is a (0.01\*gamma)-quantile.
367
A median is at the same time what?
The 0.5-quantile, the 50th percentile, and the 2nd quartile.
368
Notation of how gamma percentiles, quantiles, quartiles and the median relate.
369
With a sample of 30, how do we find the .25 quantile?
For p = 0.25 and n = 30: np = 0.25\*30 = 7.5 and n(1-p) = 0.75\*30 = 22.5. In the ordered sample, the 8th smallest observation has no more than 7.5 observations to its left and no more than 22.5 observations to its right. Therefore the 8th element of the ordered sample is the 0.25-quantile, that is, the estimator of the first quartile Q1.
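The counting rule in this card can be sketched as a small function. It assumes the convention that when np is a whole number, two neighboring order statistics both qualify and their midpoint is reported; other textbooks and libraries use different interpolation rules. The name `sample_quantile` is illustrative.

```python
import math

def sample_quantile(data, p):
    """Sample p-quantile: at most n*p observations to the left, at most n*(1-p) to the right."""
    xs = sorted(data)
    n = len(xs)
    k = n * p
    if abs(k - round(k)) < 1e-9:           # n*p is a whole number
        k = int(round(k))
        if k == 0:
            return xs[0]
        if k == n:
            return xs[-1]
        return (xs[k - 1] + xs[k]) / 2     # both neighbors qualify, take the midpoint
    return xs[math.ceil(k) - 1]            # otherwise exactly one order statistic qualifies

sample = list(range(1, 31))                # a sample of 30 values: 1, 2, ..., 30
print(sample_quantile(sample, 0.25))       # the 8th smallest value
print(sample_quantile(sample, 0.5))        # the sample median
```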
370
What is sample variance?
371
Sample standard deviation measures variability in
The same units as x, Variance is in units squared
372
What is standard error?
The standard deviation of an estimator
373
Standard errors show
Precision and reliability of estimators. They show how much estimators of the same parameter can vary if they are computed from different samples.
374
These three estimators are sensitive to outliers, or extreme observations.
Sample mean, variance, and standard deviation
375
This is a measure of variability that is not very sensitive to outliers
The interquartile range
376
What is the rule for identifying outliers?
377
What should we do before we do anything with data?
We must look at it
378
IQR = ?
IQR = Q3-Q1
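A hedged sketch of the commonly used 1.5\*IQR rule for flagging outliers, which is assumed here to be the rule the deck has in mind: values below Q1 - 1.5\*IQR or above Q3 + 1.5\*IQR are suspected outliers. It uses `statistics.quantiles`, whose interpolation method may differ slightly from the quantile definition in the cards.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Return the observations lying more than k*IQR outside the quartiles."""
    q1, _, q3 = quantiles(data, n=4)       # first, second, and third quartiles
    iqr = q3 - q1                          # interquartile range
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(iqr_outliers(data))                  # only the extreme value is flagged
```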
379
Six things a quick look at a data graph can suggest
* A probability model, i.e., a family of distributions to be used.
* Suitable statistical methods for the given data.
* Presence or absence of outliers.
* Presence or absence of heterogeneity.
* Existence of time trends and other patterns.
* Relation between two or several variables.
380
5 ways to visualize data
* Histograms
* Stem-and-leaf plots
* Boxplots
* Time plots
* Scatter plots
381
We use hypothesis testing to
Decide, based on a sample, whether the data provide sufficient evidence to reject a statement about a population.
382
What is the kth population moment?
E(X^k)
383
What is the formula for the sample (X1, X2…Xn) kth moment?
384
What is the first sample moment?
The arithmetic mean xbar
385
What are central moments?
Central moments are moments which are computed after the data is centralized by subtracting the mean.
386
For k ≥ 2, what is the kth population central moment?
387
the expected value is equal to what?
For a nonnegative random variable, the area above the CDF curve and below the horizontal line at height 1.
388
How do we calculate the expected value using only a CDF?
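For a nonnegative random variable, the area above the CDF corresponds to E(X) = integral from 0 to infinity of (1 - F(x)) dx. The sketch below checks this numerically for an Exponential distribution with rate 2, an example chosen here, using a simple midpoint rule and a finite upper limit as assumptions.

```python
from math import exp

def expectation_from_cdf(cdf, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of 1 - F(x) over [0, upper]."""
    width = upper / steps
    return sum((1.0 - cdf((k + 0.5) * width)) * width for k in range(steps))

rate = 2.0

def exp_cdf(x):
    """CDF of the Exponential distribution with the rate above."""
    return 1.0 - exp(-rate * x)

mean = expectation_from_cdf(exp_cdf, upper=20.0)
print(mean)                                # close to 1/rate
```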
389
CDF can find the first moment, but the second moment is found with what?
The pmf
390
When are methods of moments used?
Under the strong assumption that we know the distribution of our population.
391
How can we find n parameters if we know the distribution of a sample?
We take n moments of our sample, and n moments of our distribution and form a system of n equations.
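As a hedged illustration of the method of moments with two parameters, assume the sample comes from a Gamma(alpha, lambda) distribution, for which E(X) = alpha/lambda and Var(X) = alpha/lambda^2. Equating these to the first sample moment and the second sample central moment gives two equations in two unknowns. The Gamma choice and the function name are assumptions of this example.

```python
def gamma_method_of_moments(data):
    """Solve alpha/lambda = m1 and alpha/lambda^2 = m2 - m1^2 for (alpha, lambda)."""
    n = len(data)
    m1 = sum(data) / n                     # first sample moment (sample mean)
    m2 = sum(x * x for x in data) / n      # second sample moment
    central2 = m2 - m1 ** 2                # second sample central moment (divisor n)
    lam_hat = m1 / central2
    alpha_hat = m1 ** 2 / central2
    return alpha_hat, lam_hat

alpha_hat, lam_hat = gamma_method_of_moments([1, 2, 3, 4, 5])
print(alpha_hat, lam_hat)
```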
392
When using the method of maximum likelihood, what are we looking for?
Assuming we know the distribution, we are trying to find the parameters that would make that sample most likely to be chosen.
393
Method of Maximum likelihood is formally what?
maximizing the joint probability of each variable in a given sample according to the distribution. Because they are independent, in the discrete case this is P(x1)\*P(x2)…\*P(xn)
394
To find the parameters that make the sample the most likely we find where
the derivative of the likelihood with respect to the parameter equals 0 or does not exist, or at the boundary of the parameter's range.
395
How do we simplify maximizing the joint probability of a sample?
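By taking the logarithm, which turns the product into a sum: ln(P(x1)\*P(x2)…\*P(xn)) = ln P(x1) + ln P(x2) + … + ln P(xn).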
396
Simplify and use maximum likelihood to find the lambda of a sample that is of poisson distribution
397
What are the two methods for estimating parameters of a population?
Method of Moments Method of Maximum Likelihood
398
How is the kth population moment defined?
399
What is the kth sample moment defined as?
400
What is the first sample moment?
The sample mean.
401
What is centralizing the data?
Subtracting the mean from each element.
402
What is the kth population central moment defined as?
403
What is the second population central moment?
Var(X)
404
What is the second sample central moment?
The sample variance, but with the divisor n-1 replaced by n.
405
How is the k-th sample central moment defined?
406
What is the method of maximum likelihood?
○ We find such parameters that maximize the probability of getting our data sample.
407
For a discrete distribution, which formula do we maximize?
The joint PMF of iid
408
For a continuous distribution, what formula do we maximize to find the data's parameters?
For a continuous distribution, we maximize the joint density.
409
What is a computational shortcut for using the method of maximum likelihood?
A nice computational shortcut is to take logarithms first
410
Take the logarithm of the Poisson distribution joint pmf, and maximize it to find the parameter lambda.
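A sketch of the result, assuming an i.i.d. Poisson sample. The log of the joint pmf is ln L(lambda) = -n\*lambda + (sum of x\_i)\*ln(lambda) - sum of ln(x\_i!). Setting its derivative -n + (sum of x\_i)/lambda to zero gives lambda hat equal to the sample mean. The data values are made up for the example.

```python
from math import lgamma, log

def poisson_log_likelihood(lam, data):
    """Log of the joint Poisson pmf; lgamma(x + 1) equals ln(x!)."""
    return sum(-lam + x * log(lam) - lgamma(x + 1) for x in data)

def poisson_mle(data):
    """Maximum likelihood estimator of lambda: the sample mean."""
    return sum(data) / len(data)

data = [2, 0, 3, 1, 4]
lam_hat = poisson_mle(data)
print(lam_hat)
# The log-likelihood is lower on either side of the estimate.
print(poisson_log_likelihood(lam_hat, data) > poisson_log_likelihood(lam_hat - 0.1, data))
print(poisson_log_likelihood(lam_hat, data) > poisson_log_likelihood(lam_hat + 0.1, data))
```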
411
What do standard errors measure?
Standard errors serve as the measures of our estimators' accuracy.
412
How is the standard error of our estimated expectation calculated?
413
Use the method of maximum likelihood on the exponential density to find Lambda, and use the logarithm shortcut.
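A sketch of the result, assuming an i.i.d. sample with Exponential density f(x) = lambda\*exp(-lambda\*x). The log of the joint density is ln L(lambda) = n\*ln(lambda) - lambda\*(sum of x\_i). Setting the derivative n/lambda - (sum of x\_i) to zero gives lambda hat = n / (sum of x\_i), the reciprocal of the sample mean. The data values are made up.

```python
from math import log

def exponential_log_likelihood(lam, data):
    """Log of the joint Exponential density of an i.i.d. sample."""
    return len(data) * log(lam) - lam * sum(data)

def exponential_mle(data):
    """Maximum likelihood estimator of lambda: 1 / sample mean."""
    return len(data) / sum(data)

data = [0.5, 1.5, 1.0, 2.0]
lam_hat = exponential_mle(data)
print(lam_hat)
```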
414
Sometimes the likelihood has no critical points inside its domain, then
it is maximized at its boundaries.
415
What is a parameter?
A numerical fact about a population.
416
What are all numerical facts about a sample called?
Statistics
417
What does lambda hat represent?
The rate in a sample.
418
Describe a simple random sample?
The sample is drawn from a population randomly, like tickets from a box. Each is drawn without replacement.
419
The size of our sample must not affect what?
The distribution of the population, in other words, it must not be so large the distribution of the box no longer holds.
420
What is the expectation and variance of p hat, and the standard error of phat in a binomially distributed sample?
E(phat) = p, Var(phat) = p(1-p)/n, SE(phat) = sqrt(p\*(1-p)/n)
421
What is the standard Error of an estimator?
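The standard deviation of the estimator, Std(estimator). For the sample mean this is sigma/sqrt(n), where sigma is the population standard deviation.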
422
SUM Var(Xi) = ?
Var Estimator/n
423
What do we need to know to calculate the standard error of a sample?
The standard deviation of the population as a whole.
424
What is simple bootstrapping?
We decide the sample standard deviation can be used as the population standard deviation.
425
What is the standard error?
The standard error is the standard deviation of the estimator, for example of the sample mean used to estimate the expectation.
426
What does unbiased mean?
It means that the estimator does not systematically overestimate or underestimate: its expectation equals the parameter it estimates.
427
What is a biased estimator of the Standard deviation of the population, and how do we fix this?
The standard deviation of a sample. We divide the Variance of the sample by n-1 instead of n.
428
We are using the standard deviation of a sample as a standard deviation of the population, how do we calculate this?
s = sqrt( SUM(Xi-X\bar)^2 / (n-1) )
429
What is a confidence interval?
The range within which the true value resides with some confidence level.
430
When is the confidence interval the narrowest?
As n goes to infinity
431
What does a 90% confidence interval mean?
It means that if a sampling process was done many times, 90% of the intervals produced would capture the true parameter.
432
Confidence level is not what?
The probability that the actual value of the parameter is within the confidence interval.
433
This is often referred to as the confidence level.
The coverage probability (1-\alpha).
434
How do we calculate the confidence interval using the normal distribution?
The interval is [a, b], where a = Estimator - Z\_alpha/2\*Sigma(Estimator) and b = Estimator + Z\_alpha/2\*Sigma(Estimator).
435
What is the margin of error = ?
sigma\*Z\_alpha/2
436
If we want a confidence interval of 90%, what is alpha and the critical z score?
alpha = .1 critical Z score = +/- 1.645
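A minimal sketch of a Normal-based confidence interval for a mean, Estimator +/- Z\_alpha/2 \* standard error. It assumes the sample is large enough for the sample standard deviation to stand in for the population value; the function names and data are illustrative.

```python
from statistics import NormalDist, mean, stdev

def critical_z(confidence):
    """Z_alpha/2 for the given confidence level 1 - alpha."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_confidence_interval(data, confidence=0.90):
    """Interval: sample mean +/- Z_alpha/2 * s / sqrt(n)."""
    center = mean(data)
    margin = critical_z(confidence) * stdev(data) / len(data) ** 0.5  # margin of error
    return center - margin, center + margin

data = list(range(1, 41))                  # 40 made-up observations
low, high = z_confidence_interval(data, 0.90)
print(round(critical_z(0.90), 3))          # about 1.645 for a 90% interval
print(low, high)
```

Raising the confidence level increases the critical value and therefore widens the interval.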
437
Derive the Var(X\bar), and SE or SD(X\bar)
438
This is sometimes called the critical probability:
1-(alpha/2)
439
When can we use the standard deviation of a sample as the standard deviation of a population?
When the sample size is sufficiently large.
440
When must we use the t distribution?
When the sample comes from a normal population but is small, and the population standard deviation is unknown.
441
We have a small sample from a normally distributed population, C.I. for the mean = ?
X\bar +/- (t\_alpha/2)\*s/sqrt(n) where s is the standard deviation of the sample.
442
What is the degree of freedom?
d.f. = v = n-1
443
How do we find t\_alpha?
t\_alpha is the analog of Z\_alpha for the t distribution. It is read from a table of the t distribution with n-1 degrees of freedom.
444
What happens when the degree of freedom goes to infinity?
We get a normal distribution.
445
What is it that we can and cannot tell about a hypothesis?
We cannot tell if a hypothesis is true or not; all we can do is determine whether the data provides sufficient evidence against the null hypothesis in favor of the alternative hypothesis.
446
How is the null hypothesis denoted?
H\_0
447
How is the alternative hypothesis denoted?
H\_a
448
What is a null hypothesis involving the mean of a population?
H\_0: mu = mu\_0, that is, the population mean equals a specified value mu\_0.
449
What are the three types of alternatives?
1. Two-sided alternative, H\_a: mu does not equal mu\_0. 2. One-sided, left-tail alternative, H\_a: mu is less than mu\_0. 3. One-sided, right-tail alternative, H\_a: mu is greater than mu\_0. Here mu\_0 is the value stated in the null hypothesis.
450
Sampling error occurs when we do this:
wrongly accept or reject our null hypothesis
451
What does low level of significance mean?
it means that only a large amount of evidence can result in rejection of our null hypothesis.
452
What is a type one error?
When the result of our test tells us to reject our null hypothesis, although it is true.
453
What is a type II error?
When our test tells us to accept our null hypothesis, though it is false.
454
Which type error in hypothesis testing is considered worse?
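A type I error is usually considered worse, which is why its probability (the significance level) is fixed in advance.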
455
What makes a good test result in erroneous decision?
Extreme observed data
456
What is the significance level of a test?
The probability that the test tells us to reject our null hypothesis wrongly.