Specialization IB - question 4 (Probability) Flashcards

(57 cards)

1
Q

Building blocks of probability

A
  • Random experiment - a process with an uncertain outcome
  • Sample space - the set of all possible outcomes of the experiment
  • Sample point - an element of the sample space
  • Event - a subset of the sample space; one or more outcomes of the random experiment
  • Events can be manipulated as sets (A ∪ B, Co-A)

2
Q

Building blocks by example

A

Random experiment - a roll of a (6-sided) die
The sample space is {1, 2, 3, 4, 5, 6}
An elementary event (sample point), e.g., 1
An event of the random experiment, e.g., getting an even number on the die (the subset {2, 4, 6})

3
Q

Properties of set operations

A

Commutativity - A ∪ B = B ∪ A
Associativity - A ∪ (B ∪ C) = (A ∪ B) ∪ C
Distributivity - A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
Identity - A ∪ ∅ = A, A ∩ S = A
Complement - A ∪ Co-A = S, A ∩ Co-A = ∅
De Morgan's laws - Co-(A ∪ B) = Co-A ∩ Co-B

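These identities can be spot-checked directly with Python's built-in sets; a minimal sketch (the example sets S, A, B, C are arbitrary choices):

# Spot-check of the set identities above on small example sets.
S = {1, 2, 3, 4, 5, 6}                         # universe (sample space)
A, B, C = {1, 2}, {2, 3}, {3, 4}
co = lambda X: S - X                           # complement within S

assert A | B == B | A                          # commutativity
assert A | (B | C) == (A | B) | C              # associativity
assert A | (B & C) == (A | B) & (A | C)        # distributivity
assert A | set() == A and A & S == A           # identity
assert A | co(A) == S and A & co(A) == set()   # complement
assert co(A | B) == co(A) & co(B)              # De Morgan
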
4
Q

Discrete probability space

A

a triple (S, F, P)
* S = sample space
* F = the power set of S (2^S)
* P = a probability function F -> [0, 1] with the condition that P(S) = 1

Finite/countable space (probabilities can be summed term by term)

5
Q

Continuous probability space

A

a triple (S, F, P)
* S = sample space
* F = a sigma-algebra on S, i.e., F is a subset of 2^S
* P = a probability function F -> [0, 1] with the condition that P(S) = 1

6
Q

Properties of a sigma-algebra

A

F is a subset of 2^S

  • It contains the empty set (∅ ∈ F)
  • It is closed under complement - if A ∈ F, then Co-A ∈ F as well
  • It is closed under countable unions - if A_1, A_2, ... ∈ F, then ⋃ A_i ∈ F (i from 1 to n, n = ∞ allowed)

7
Q

Conditional probability

A

The probability of event A given that event B has occurred

P(A|B) = P(A ∩ B) / P(B)
<=>
* P(A ∩ B) = P(B) * P(A|B) ; if P(B) != 0
* P(A ∩ B) = P(A) * P(B|A) ; if P(A) != 0
* P(A ∩ B) = 0 ; if P(A) = 0 or P(B) = 0

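A minimal Monte Carlo sketch of the definition on the die example (the events A = "even" and B = "greater than 3" are arbitrary choices):

import random

# Estimate P(A|B) for a fair die: A = "even", B = "greater than 3".
trials = 100_000
count_B = count_A_and_B = 0
for _ in range(trials):
    roll = random.randint(1, 6)
    if roll > 3:                     # event B occurred
        count_B += 1
        if roll % 2 == 0:            # event A occurred as well
            count_A_and_B += 1

# P(A|B) = P(A ∩ B) / P(B); the exact value is (2/6) / (3/6) = 2/3.
print(count_A_and_B / count_B)
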
8
Q

Independent events?

A

P(A|B) = P(A)
P(B|A) = P(B)

P(A ∩ B) = P(A) * P(B)

(Venn diagram)

9
Q

Law of total probability

A
  • B_1, ..., B_n form a partition of S
  • A is an event

P(A) = ∑ P(A|B_i) * P(B_i)

Use: when we have disjoint cases (B_i) and want to know some overall property across all of them (A); see the sketch below.

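A minimal numeric sketch (the factories and their defect rates are invented for illustration):

# Law of total probability: P(defect) across disjoint cases (factories).
cases = {"factory1": 0.5, "factory2": 0.3, "factory3": 0.2}              # P(B_i)
p_defect_given = {"factory1": 0.01, "factory2": 0.02, "factory3": 0.05}  # P(A|B_i)

p_defect = sum(p_defect_given[b] * cases[b] for b in cases)
print(p_defect)    # 0.5*0.01 + 0.3*0.02 + 0.2*0.05 = 0.021
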
10
Q

Bayes’ rule

A

Use: we know a general property and want to infer it for a specific case (subcase)

P(B_j | A) = P(A|B_j) * P(B_j) / P(A)
where P(A) can be expanded by the law of total probability.

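Continuing the invented factory example from the previous card: given that a part is defective, Bayes' rule recovers the probability that it came from each factory:

# Bayes' rule: P(B_j | A) = P(A|B_j) * P(B_j) / P(A),
# with P(A) expanded via the law of total probability.
cases = {"factory1": 0.5, "factory2": 0.3, "factory3": 0.2}
p_defect_given = {"factory1": 0.01, "factory2": 0.02, "factory3": 0.05}

p_defect = sum(p_defect_given[b] * cases[b] for b in cases)
posterior = {b: p_defect_given[b] * cases[b] / p_defect for b in cases}
print(posterior)   # e.g. P(factory3 | defect) = 0.01 / 0.021 ≈ 0.476
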
11
Q

Random variables

A

A function X: S -> R that assigns a real number X(s) to each s ∈ S

12
Q

Uses of random variables

A

in the probability mass function, the cumulative distribution function, the expected value, the variance (etc.)

13
Q

Image - discrete random variable

A

The set Im(X) = { X(s) | s ∈ S }

14
Q

Inverse image of a random variable - discrete random variable

A

the event [X = x] = { s ∈ S | X(s) = x }

15
Q

Probability mass function - discrete random variable

A

The probability mass function is a function p_X : R -> [0, 1]

p_X(x) = P(X = x) = ∑ P(s) (over all s with X(s) = x)

16
Q

Cumulative distribution function - discrete random variable

A

a function F_X: R -> [0, 1]

F_X (t) = P(X ≤ t) = ∑ p_X(x) (over x ≤ t)

17
Q

Examples of discrete distributions

A
  • Uniform - every outcome has the same chance (e.g., a die roll)
  • Bernoulli - the outcome is 0 or 1 - e.g., a coin flip
  • Geometric - the number of Bernoulli trials until the first success (e.g., flipping a coin until heads comes up)
  • Binomial - the number of k successes in n consecutive trials (e.g., the number of heads in 10 coin flips)
  • Poisson - the number of occurrences of an event per unit of time

18
Q

Probability mass function - Uniform

A

p_X (x_i) = 1/n ; if x_i ∈ Im(X)
p_X (x_i) = 0 ; otherwise

19
Q

Probability mass function - Bernoulli

A

p_X (x) = p ; if x = 1
p_X (x) = 1-p ; if x = 0
p_X (x) = 0 ; otherwise

20
Q

Probability mass function - Geometric

A

p_X (a) = p * (1-p) ^ (a-1) ; if a ∈ N
p_X (a) = 0 ; otherwise

21
Q

Probability mass function - Binomial

A

p_X (k) = (n choose k) * p^k * (1-p)^(n-k) ; if 0 ≤ k ≤ n
p_X (k) = 0 ; otherwise

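The four PMFs above translate directly into Python; a minimal sketch (the parameter values in the example calls are arbitrary):

from math import comb

def pmf_uniform(x, values):              # each of n values equally likely
    return 1 / len(values) if x in values else 0.0

def pmf_bernoulli(x, p):                 # outcome 0 or 1
    return p if x == 1 else (1 - p if x == 0 else 0.0)

def pmf_geometric(a, p):                 # trials until the first success
    return p * (1 - p) ** (a - 1) if a >= 1 else 0.0

def pmf_binomial(k, n, p):               # k successes in n trials
    return comb(n, k) * p**k * (1 - p) ** (n - k) if 0 <= k <= n else 0.0

print(pmf_uniform(3, {1, 2, 3, 4, 5, 6}))   # die: 1/6
print(pmf_binomial(5, 10, 0.5))             # 5 heads in 10 flips ≈ 0.246
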
22
Q

Cumulative distribution - Uniform

A

F_X (t) = 0 ; t < x_1
F_X (t) = i/n ; x_i ≤ t < x_(i+1)
F_X (t) = 1 ; t ≥ x_n

23
Q

Cumulative distribution - Bernoulli

A

F_X (t) = 0 ; t < 0
F_X (t) = 1-p ; 0 ≤ t < 1
F_X (t) = 1 ; t ≥ 1

24
Q

Cumulative distribution - Geometric

A

F_X (t) = ∑ p * (1-p)^(a-1) (a from 1 to ⌊t⌋) = 1 - (1-p)^⌊t⌋ ; for t ≥ 1 (0 for t < 1)

25
Q

Cumulative distribution - Binomial

A

F_X (t) = ∑ (n choose k) * p^k * (1-p)^(n-k) (k from 0 to ⌊t⌋)

26
Q

Continuous random variable

A

Has no PMF - instead a density function f_X (probability density); F_X (x) = ∫ f_X (t) dt (from -∞ to x)

27
Q

Joint discrete probability mass function, cumulative mass function

A

p_(X,Y)(x, y) = P(X=x and Y=y)
another formula: p_(X,Y)(x, y) = P(Y=y | X=x) * P(X=x) = P(X=x | Y=y) * P(Y=y)
values lie in the range [0, 1]

28
Q

Examples of continuous distributions

A

* Uniform
* Normal - e.g., the distribution of IQ
* Exponential

29
Q

Discrete random vector

A

a vector of random variables (X_1, ..., X_r); most often two variables, used to compare them

30
Q

Expected value

A

The value expected on average. E(X) = ∑ x * p_X(x) ; x ∈ Im(X) (for a uniform distribution this is just the plain average)

31
Q

Expected value of the uniform distribution

A

E(X) = ∑ x * (1/n) = (∑ x) / n

32
Q

Expected value of the Bernoulli distribution

A

E(X) = 0 * (1-p) + 1 * p = p

33
Q

Expected value of the binomial distribution

A

E(X) = np

34
Q

Expected value of the geometric distribution

A

E(X) = 1/p

35
Q

Properties of expected values

A

* Linearity:
  1. E(c * X) = c * E(X)
  2. E(X + Y) = E(X) + E(Y)
* If X, Y are independent -> E(X * Y) = E(X) * E(Y)
* Conditional expected value: for discrete random variables X, Y:
  E(X | Y=y) = ∑ x * P(X=x | Y=y)
* Markov inequality - X is a non-negative random variable with a finite expected value. Then for every t > 0:
  P(X ≥ t) ≤ E(X) / t
  The Markov inequality gives an upper bound on a probability based on the expected value (checked numerically in the sketch below).

36
Q

Variance

A

the second central moment
VAR(X) = E([X - E(X)]^2) = ∑ (x - E(X))^2 * p_X(x)

37
Q

Stochastic processes

A

a collection of random variables { X_t | t ∈ T }
* T = time
* X_t = the state at time t
a sequence of experiments that tracks evolution over time (e.g., CPU temperature, population growth) - we take values only at discrete times (measurements at time points - say once per hour)

38
Q

Markov chains

A

Stochastic processes such that P(X_(t+1) = a | X_t = b), i.e., the future value depends only on the current value.
One can say they express the probability of a transition to the next state - so they can be represented by an automaton or a (transition) matrix.
E.g., from state 'A' we may move to states 'B', 'C', 'D', but the probabilities of all transitions out of 'A' must sum to 1 (100%). That is,
P('A', 'B') + P('A', 'C') + P('A', 'D') + P('A', 'A') = 1
Used in NLP, e.g.: given some sentences, we analyze the word sequences and build an automaton: a sentence starts with the word I, then perhaps 30% don't, 70% like, and so on (see the sketch below).

39
Q

Information theory - entropy

A

H(X) = - ∑ p_X(x) * log p_X(x) ; where p_X(x) is the probability mass function
Entropy 0 = we know what will happen.
The uniform distribution has the highest entropy.
It is expressed in bits (of information). It gives the smallest expected number of bits needed to transmit the given information (expected - essentially the mean).
log_b: b is the base, chosen according to the unit ... log_2 for bits, log_256 for bytes.
H(1/(mn), ...) = H(1/m, ...) + H(1/n, ...) when m and n are independent

40
Q

Coding theory - Huffman codes

A

Often used for lossless data compression.
It works by assigning the most frequently occurring symbol the shortest codeword (0) and conversely - the symbol x in X with the smallest probability gets the longest codeword.
a_1 = 1/2 ... 0
a_2 = 1/4 ... 10
a_3 = 1/8 ... 110
a_4 = 1/8 ... 111

41
Q

Coding theory - capacity of a noisy channel

A

R = log_2(M) / n # (bits per transmission - throughput)
Channel capacity: C = max I(X; Y) = max [H(Y) − H(Y | X)]
where
* the random variable X is the input
* the random variable Y is the output

42
Q

Standard Deviation

A

SQRT( VAR(X) ) // the variance comes out in squared units ... e.g., m^2

43
Q

Properties of variance

A

* VAR(aX + b) = a^2 * VAR(X)
* VAR(X) = E(X^2) - E(X)^2

44
Q

Covariance

A

COV(X, Y) = E[ (X - E(X)) * (Y - E(Y)) ]
measures linear dependence
* Cov < 0 = anticorrelated
* Cov = 0 = no linear dependence
* Cov > 0 = positively linearly dependent

45
Q

Law of large numbers

A

The longer the sequence of independent random variables (experiments), the closer the sample average gets to the expected value (strong law of large numbers)

46
Q

Chebyshev inequality

A

P(|X - E(X)| ≥ t) ≤ VAR(X) / t^2
gives an upper bound based on the variance

47
Q

Continuous-time Markov chain (CTMC)

A

While a discrete Markov chain steps at periodic times, the continuous-time version allows an arbitrary amount of time to pass before a transition occurs

48
Q

Joint Entropy

A

the entropy of a pair (n-tuple) X, Y - the variables may be correlated
H(X, Y) = - ∑_x ∑_y p(x, y) * log p(x, y)

49
Q

Conditional Entropy

A

The measure of uncertainty H(X|Y) about the value of X given the value of Y
H(X|Y) = 0 if X is completely determined by Y
H(X|Y) = H(X) if X and Y are independent
H(X|Y) = - ∑ P(X=x, Y=y) * log P(X=x | Y=y)

50
Q

General Chain Rule

A

H(X_1, X_2, ..., X_n) = H(X_1) + H(X_2 | X_1) + ... + H(X_n | X_1, ..., X_(n-1))
Similarly:
D(p(x, y) || q(x, y)) = D(p(x) || q(x)) + D(p(y|x) || q(y|x))
H(X, Y) = H(Y) + H(X|Y)

51
Q

Venn diagram (entropy)

A

Circle A = H(X)
Circle B = H(Y)
the intersection of A and B = I(X; Y)
A and B together (union) = H(X, Y)
H(X|Y) = H(X) - I(X;Y)
H(Y|X) = H(Y) - I(X;Y)

52
Q

Relative entropy

A

Expresses how a probability distribution p differs from a probability distribution q
D(p || q) = ∑ p(x) * log ( p(x) / q(x) )

53
Q

Mutual Information

A

Measures the amount of information that the variables X and Y share
I(X; Y) = 0 - independent
I(X; Y) = ∑ p(x, y) * log { p(x, y) / ( p(x) * p(y) ) }
I(X;Y) = H(X) + H(Y) - H(X,Y) (computed in the sketch below)

54
Q

Conditional Mutual Information

A

I(X;Y|Z) = H(X|Z) - H(X|Y,Z)
Venn diagram: the region shared only by X and Y (outside Z)

55
Q

Probability of an error for the code

A

We have an (M, n) code with:
* M = the number of codewords
* f = the encoding function f: {1, 2, 3, ..., M} -> X^n
* g = the decoding function g: Y^n -> {1, 2, 3, ..., M}
The channel is then defined as (X, p(y|x), Y), where
* X is the input alphabet
* p(y|x) is the transition probability function
* Y is the output alphabet
λ_i = ∑ p(y^n | x^n(i)) over all y^n such that g(y^n) != i
maximum probability of error: λ_max = max_i λ_i

56
Q

Symmetric channel

A

The channel is defined as (X, p(y|x), Y), where
* X is the input alphabet
* p(y|x) is the transition probability function
* Y is the output alphabet
mapping (binary symmetric channel):
0 -> 0: 1-p
0 -> 1: p
1 -> 0: p
1 -> 1: 1-p
i.e., the probability that a bit is transmitted correctly is the same for both inputs.

57
Q

Memoryless channel

A

Memoryless: the output depends purely on the current input: p(y_k | x^k) = p(y_k | x_k)