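인과추론의 데이터과학. (2021, Oct 1). [Session 7/8 - 보충 1] 베이지안 네트워크 (Bayesian Network) [Session 7/8 - Supplement 1: Bayesian Network]
[Video]. YouTube.
인과추론의 데이터과학. (2021, Oct 1). [Session 7/8 - 보충 2] 베이지안 네트워크에서의 상관관계 증명 [Session 7/8 - Supplement 2: Proving correlation in a Bayesian network]
[Video]. YouTube.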
Session 7/8 Supplement 1
Probability
- \(P(A)\) : unconditional / marginal Pr
- marginal probability: the probability of a single event considered on its own
- \(P(A|B)\) : conditional Pr
- joint Pr
$$ P(B|A) = \displaystyle{{P(B)P(A|B)}\over P(A)} $$
$$ \begin{align*} P(A\cap B) &= P(A,B) \\ &= P(A)P(B|A)\\ &= P(B)P(A|B)\end{align*} $$
- \(P(A, B, C)\)
$$ \begin{align*} P(A,B,C) &= P(A)P(B,C|A) \\ &= P(A)P(B|A)P(C|A,B)\end{align*} $$
Marginalize
- Conditional Pr / Joint Pr ⇒ Marginal Pr
- \(P(A=0) = P(B=0)P(A=0|B=0) + P(B=1)P(A=0|B=1)\)
⇒ \(P(A) = \displaystyle\sum_B{P(B)P(A|B)} = \sum_B{P(A,B)}\)
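As a quick numerical check of the marginalization rule above, the sketch below uses a made-up joint table over two binary variables. The numbers and the helper names `marginal_a`, `marginal_b`, and `cond_a_given_b` are illustrative assumptions, not from the lecture. It recovers \(P(A=0)\) both directly from the joint and via \(\sum_B P(B)P(A|B)\), and also checks Bayes' rule.

```python
# Hypothetical joint distribution P(A, B) over two binary variables; keys are (a, b).
joint = {(0, 0): 0.30, (0, 1): 0.10, (1, 0): 0.20, (1, 1): 0.40}


def marginal_a(a):
    """P(A=a) = sum_b P(A=a, B=b)."""
    return sum(p for (ai, _), p in joint.items() if ai == a)


def marginal_b(b):
    """P(B=b) = sum_a P(A=a, B=b)."""
    return sum(p for (_, bi), p in joint.items() if bi == b)


def cond_a_given_b(a, b):
    """P(A=a | B=b) = P(A=a, B=b) / P(B=b)."""
    return joint[(a, b)] / marginal_b(b)


# Marginalizing the joint directly.
p_a0_direct = marginal_a(0)

# Marginalizing via conditionals: P(A=0) = sum_b P(B=b) P(A=0 | B=b).
p_a0_total = sum(marginal_b(b) * cond_a_given_b(0, b) for b in (0, 1))

# Bayes' rule: P(B=1 | A=0) = P(B=1) P(A=0 | B=1) / P(A=0).
p_b1_given_a0 = marginal_b(1) * cond_a_given_b(0, 1) / marginal_a(0)

print(p_a0_direct, p_a0_total, p_b1_given_a0)
```

Both routes to \(P(A=0)\) agree up to floating-point rounding, which is all the marginalization identity claims.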
Independent
- dependent: a correlation (association) exists between \(A\) and \(B\)
- \(P(A, B) \ne P(A)P(B)\)
- independent: \(A\perp B\)
- \(P(A|B) = P(A)\)
- \(P(A,B) = P(A)P(B|A) = P(A)P(B)\)
- conditional independent
- \(P(A,B|C) = P(A|C)P(B|C)\) ⇒ \(A\) and \(B\) are independent given \(C\)
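The independence definition can be tested cell by cell on a joint table. The following is a minimal sketch, assuming a joint over two discrete variables stored as a dict keyed by `(a, b)`. The function name `is_independent` and both example tables are hypothetical.

```python
from itertools import product


def is_independent(joint, tol=1e-12):
    """Return True if P(A,B) == P(A)P(B) holds in every cell of a two-variable joint."""
    a_vals = sorted({a for a, _ in joint})
    b_vals = sorted({b for _, b in joint})
    p_a = {a: sum(joint[(a, b)] for b in b_vals) for a in a_vals}
    p_b = {b: sum(joint[(a, b)] for a in a_vals) for b in b_vals}
    return all(
        abs(joint[(a, b)] - p_a[a] * p_b[b]) <= tol
        for a, b in product(a_vals, b_vals)
    )


# Built as an outer product of P(A) = (0.4, 0.6) and P(B) = (0.3, 0.7), so A ⊥ B.
independent_table = {(0, 0): 0.12, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.42}

# Here P(A=0, B=0) = 0.30 but P(A=0)P(B=0) = 0.4 * 0.5 = 0.20, so A and B are dependent.
dependent_table = {(0, 0): 0.30, (0, 1): 0.10, (1, 0): 0.20, (1, 1): 0.40}

print(is_independent(independent_table), is_independent(dependent_table))
```

The tolerance `tol` is only there to absorb floating-point rounding. Conditional independence can be checked the same way by applying the function to each slice \(P(A,B|C=c)\).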
Causal Markov Assumption
- \(P(X,Y,Z) = P(X)P(Y,Z|X) = P(X)P(Y|X)P(Z|X,Y)\)
- Under DAG(Causal Graph) + Markov Assumption ⇒ Bayesian Network Factorization
- \(X → Y → Z\)
- \(P(X,Y,Z) = P(X)P(Y|X)P(Z|Y)\)
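To make the factorization for \(X → Y → Z\) concrete, the sketch below builds the joint from three made-up conditional probability tables and confirms that \(P(Z|X,Y)\) computed from that joint no longer depends on \(X\). All numbers and names (`p_x`, `p_y_given_x`, `p_z_given_y`, `p_z_given_xy`) are assumptions for illustration.

```python
from itertools import product

VALS = (0, 1)

# Hypothetical CPTs for the chain X -> Y -> Z.
p_x = {0: 0.6, 1: 0.4}
p_y_given_x = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}   # p_y_given_x[x][y]
p_z_given_y = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.25, 1: 0.75}}  # p_z_given_y[y][z]

# Bayesian network factorization: P(X,Y,Z) = P(X) P(Y|X) P(Z|Y).
joint = {
    (x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
    for x, y, z in product(VALS, repeat=3)
}


def p_z_given_xy(z, x, y):
    """P(Z=z | X=x, Y=y) recovered from the joint via the chain rule."""
    return joint[(x, y, z)] / sum(joint[(x, y, zz)] for zz in VALS)


total = sum(joint.values())

# Markov property: once Y is known, X adds nothing, so P(Z|X,Y) = P(Z|Y).
markov_holds = all(
    abs(p_z_given_xy(z, x, y) - p_z_given_y[y][z]) < 1e-12
    for x, y, z in product(VALS, repeat=3)
)

print(total, markov_holds)
```

The general chain rule needs a table for \(P(Z|X,Y)\) with one row per \((x, y)\) pair. The DAG plus the Markov assumption shrinks that to one row per \(y\).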
Session 7/8 Supplement 2
Mediator
- \(X → M → Y\)
- Bayesian Network Factorization: \(P(X, Y, M) = P(X)P(M|X)P(Y|M)\)
- Objective: \(P(X, Y) = P(X)P(Y)\) ??
- if \(P(X, Y) = P(X)P(Y)\) ⇒ independent
- if \(P(X, Y) \ne P(X)P(Y)\) ⇒ dependent
- Proof
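- \(P(M|X) = P(M)\) ?? If it held, then \(M\perp X\), which contradicts the edge \(X → M\) in the DAG ⇒ \(P(M|X) \ne P(M)\)
- \(P(Y) = \displaystyle\sum_M{P(M)P(Y|M)}\), so the sum below would equal \(P(Y)\) only if \(P(M|X)\) could be replaced by \(P(M)\)
$$ \begin{align*}P(X,Y) = \displaystyle\sum_M{P(X,Y,M)} &= \sum_M{P(X)P(M|X)P(Y|M)}\\&=P(X)\sum_M{P(M|X)P(Y|M)}\\&\ne P(X)P(Y)\end{align*} $$
⇒ \(X\) and \(Y\) are dependent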
- Conditioning: \(X → M(Condition) → Y\)
- Objective: \(P(X,Y|M) = P(X|M)P(Y|M)\) ??
- Bayesian Network Factorization: \(P(X,Y,M) = P(X)P(M|X)P(Y|M)\)
- Proof
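- \(P(X,Y,M) = P(M)P(X,Y|M)\)
- \(P(X,Y|M) = \displaystyle{P(X)P(M|X)P(Y|M)\over P(M)} = P(X|M)P(Y|M)\), since \(P(X)P(M|X) = P(M)P(X|M)\) ⇒ conditional independent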
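Both mediator claims (dependent marginally, independent once \(M\) is conditioned on) can be checked by brute-force enumeration. This is a sketch under assumed numbers. The tables and the helper `prob` are hypothetical, not from the lecture.

```python
from itertools import product

VALS = (0, 1)

# Hypothetical CPTs for the mediator structure X -> M -> Y.
p_x = {0: 0.5, 1: 0.5}
p_m_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_m_given_x[x][m]
p_y_given_m = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.1, 1: 0.9}}  # p_y_given_m[m][y]

# P(X, M, Y) = P(X) P(M|X) P(Y|M); keys are (x, m, y).
joint = {
    (x, m, y): p_x[x] * p_m_given_x[x][m] * p_y_given_m[m][y]
    for x, m, y in product(VALS, repeat=3)
}


def prob(x=None, m=None, y=None):
    """Sum the joint over every variable left as None (i.e. marginalize it out)."""
    return sum(
        p
        for (xi, mi, yi), p in joint.items()
        if (x is None or xi == x) and (m is None or mi == m) and (y is None or yi == y)
    )


# Largest violation of P(X,Y) = P(X)P(Y): clearly non-zero, so X and Y are dependent.
marginal_gap = max(
    abs(prob(x=x, y=y) - prob(x=x) * prob(y=y)) for x, y in product(VALS, repeat=2)
)

# Largest violation of P(X,Y|M) = P(X|M)P(Y|M): zero up to rounding.
conditional_gap = max(
    abs(
        prob(x=x, m=m, y=y) / prob(m=m)
        - (prob(x=x, m=m) / prob(m=m)) * (prob(m=m, y=y) / prob(m=m))
    )
    for x, m, y in product(VALS, repeat=3)
)

print(marginal_gap, conditional_gap)
```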
Confounder
- \(X ← C → Y\)
- Objective: \(P(X,Y) = P(X)P(Y)\) ??
- Bayesian Network Factorization: \(P(X,Y,C) = P(X|C)P(Y|C)P(C)\)
- Proof
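- \(P(X)P(Y) = P(Y)\displaystyle\sum_C{P(X,C)} = P(Y)\sum_C{P(C)P(X|C)}\)
- \(P(X,Y) = \displaystyle\sum_C{P(X|C)P(Y|C)P(C)} = P(Y)\sum_C{P(X|C)P(C|Y)}\), using \(P(Y|C)P(C) = P(Y)P(C|Y)\)
- \(\displaystyle P(Y)\sum_C{P(C)P(X|C)} = P(Y)\sum_C{P(X|C)P(C|Y)}\) ??
⇒ \(P(C) = P(C|Y)\) ?? ⇒ \(C → Y\), so \(C\) and \(Y\) are dependent ⇒ \(P(C) \ne P(C|Y)\)
⇒ \(P(X,Y) \ne P(X)P(Y)\)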
- Conditioning: \(X ← C(Condition) → Y\)
- Objective: \(P(X,Y|C) = P(X|C)P(Y|C)\) ??
- Bayesian Network Factorization: \(P(X,Y,C) = P(X|C)P(Y|C)P(C)\)
- Proof
- \(P(X,Y,C) = P(C)P(X,Y|C)\)
- \(P(X,Y|C) = P(X|C)P(Y|C)\) ⇒ conditional independent
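The confounder argument above hinges on \(P(C) \ne P(C|Y)\). The sketch below enumerates a made-up \(X ← C → Y\) model to show that step numerically, along with the resulting marginal dependence and the conditional independence given \(C\). All tables and the helper `prob` are illustrative assumptions.

```python
from itertools import product

VALS = (0, 1)

# Hypothetical CPTs for the confounder structure X <- C -> Y.
p_c = {0: 0.5, 1: 0.5}
p_x_given_c = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}  # p_x_given_c[c][x]
p_y_given_c = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.1, 1: 0.9}}  # p_y_given_c[c][y]

# P(X, Y, C) = P(X|C) P(Y|C) P(C); keys are (x, y, c).
joint = {
    (x, y, c): p_x_given_c[c][x] * p_y_given_c[c][y] * p_c[c]
    for x, y, c in product(VALS, repeat=3)
}


def prob(x=None, y=None, c=None):
    """Sum the joint over every variable left as None."""
    return sum(
        p
        for (xi, yi, ci), p in joint.items()
        if (x is None or xi == x) and (y is None or yi == y) and (c is None or ci == c)
    )


# The key step of the proof: observing Y changes the distribution of C.
p_c1 = prob(c=1)
p_c1_given_y1 = prob(y=1, c=1) / prob(y=1)

# Without conditioning, X and Y are dependent (the gap is clearly non-zero).
marginal_gap = max(
    abs(prob(x=x, y=y) - prob(x=x) * prob(y=y)) for x, y in product(VALS, repeat=2)
)

# Given C, X and Y are independent (the gap is zero up to rounding).
conditional_gap = max(
    abs(
        prob(x=x, y=y, c=c) / prob(c=c)
        - (prob(x=x, c=c) / prob(c=c)) * (prob(y=y, c=c) / prob(c=c))
    )
    for x, y, c in product(VALS, repeat=3)
)

print(p_c1, p_c1_given_y1, marginal_gap, conditional_gap)
```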
Collider
- \(X → C ← Y\)
- Objective: \(P(X,Y) = P(X)P(Y)\) ??
- Bayesian Network Factorization: \(P(X,Y,C) = P(X)P(Y)P(C|X,Y)\)
- Proof
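- \(P(X,Y) = \displaystyle\sum_C{P(X)P(Y)P(C|X,Y)} = P(X)P(Y)\sum_C{P(C|X,Y)}\)
- \(\displaystyle\sum_C{P(C|X,Y)} = 1\) ⇒ \(P(X,Y) = P(X)P(Y)\) ⇒ independent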
- Conditioning: \(X → C(Condition) ← Y\)
- Objective:
- \(P(X,Y|C) = P(X|C)P(Y|C)\) ??
- ✅ \(P(X|C) = P(X|C,Y)\) ??
- Bayesian Network Factorization: \(P(X,Y,C) = P(X)P(Y)P(C|X,Y)\)
- Proof
- \(P(X|C) = \displaystyle {P(X)P(C|X)\over P(C)}\)
- \(P(X,C|Y) = P(X|Y)P(C|Y, X) = P(C|Y)P(X|Y,C)\)
- \(P(X|Y,C) = \displaystyle {P(X|Y)P(C|Y,X)\over P(C|Y)}\)
- \(X\) and \(Y\) are marginally independent (shown above, without conditioning on \(C\)) ⇒ \(P(X) = P(X|Y)\)
- \(\displaystyle{P(C|X)\over P(C)} = {P(C|Y,X)\over P(C|Y)}\) ??
- Left-hand side: effect of \(X\) on \(C\)
- Right-hand side: effect of \(X\) on \(C\) after controlling for \(Y\)
⇒ \(\displaystyle{P(C|X)\over P(C)} \ne {P(C|Y,X)\over P(C|Y)}\)
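The two ratios compared in the collider proof can be evaluated on a concrete model. The sketch below assumes a noisy-OR style collider with made-up probabilities. The names `p_c1_given_xy`, `prob`, `lhs`, and `rhs` are hypothetical. It evaluates both sides at \(X=1, Y=1, C=1\).

```python
from itertools import product

VALS = (0, 1)

# Hypothetical model for the collider structure X -> C <- Y.
p_x = {0: 0.5, 1: 0.5}
p_y = {0: 0.5, 1: 0.5}
p_c1_given_xy = {(0, 0): 0.05, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 0.95}  # P(C=1 | x, y)


def p_c_given_xy(c, x, y):
    """P(C=c | X=x, Y=y) from the table of P(C=1 | x, y)."""
    p1 = p_c1_given_xy[(x, y)]
    return p1 if c == 1 else 1.0 - p1


# P(X, Y, C) = P(X) P(Y) P(C|X,Y); keys are (x, y, c).
joint = {
    (x, y, c): p_x[x] * p_y[y] * p_c_given_xy(c, x, y)
    for x, y, c in product(VALS, repeat=3)
}


def prob(x=None, y=None, c=None):
    """Sum the joint over every variable left as None."""
    return sum(
        p
        for (xi, yi, ci), p in joint.items()
        if (x is None or xi == x) and (y is None or yi == y) and (c is None or ci == c)
    )


# Without conditioning on C, X and Y are independent.
marginal_gap = max(
    abs(prob(x=x, y=y) - prob(x=x) * prob(y=y)) for x, y in product(VALS, repeat=2)
)

# Left-hand side: P(C=1 | X=1) / P(C=1).
lhs = (prob(x=1, c=1) / prob(x=1)) / prob(c=1)
# Right-hand side: P(C=1 | Y=1, X=1) / P(C=1 | Y=1).
rhs = (prob(x=1, y=1, c=1) / prob(x=1, y=1)) / (prob(y=1, c=1) / prob(y=1))

# Equivalent statement: P(X=1 | C=1) differs from P(X=1 | C=1, Y=1).
p_x1_given_c1 = prob(x=1, c=1) / prob(c=1)
p_x1_given_c1_y1 = prob(x=1, y=1, c=1) / prob(y=1, c=1)

print(marginal_gap, lhs, rhs, p_x1_given_c1, p_x1_given_c1_y1)
```

With these assumed numbers, learning \(Y=1\) lowers the probability of \(X=1\) among cases with \(C=1\), because \(Y\) already accounts for \(C\) having occurred.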
Correlation ≠ Causation
- Confounder : \(X ← C → Y ← X\)
- Bayesian Network Factorization: \(P(X,Y,C) = P(X|C)P(Y|C,X)P(C)\)
- Proof
- \(P(X)P(Y,C|X) = P(X|C)P(Y|X,C)P(C)\)
- \(P(Y,C|X) = \displaystyle{P(X|C)P(C)P(Y|X,C)\over P(X)} = P(C|X)P(Y|X,C)\)
- \(\displaystyle{P(X|C)P(C)\over P(X)} = {P(X,C)\over P(X)} = {P(X)P(C|X)\over P(X)} = P(C|X)\)
- \(P(Y|X) = \displaystyle\sum_C{P(Y,C|X)} = \sum_C{P(C|X)P(Y|X,C)}\)
- do operator : \(C → Y ← do(X)\)
- causal effect: \(P(Y|do(X))\)
- Bayesian Network Factorization: \(P(do(X), Y, C) = P(do(X))P(Y|do(X), C)P(C)\)
- Proof
- \(P(do(X), Y, C) = P(do(X))P(Y,C|do(X))\)
- \(P(Y|do(X)) = \displaystyle\sum_C{P(Y,C|do(X))} = \sum_C{P(Y|do(X),C)P(C)}\)
- \(\displaystyle\sum_C{P(C|X)P(Y|X,C)} = \sum_C{P(Y|do(X),C)P(C)}\) ??
- \(do(X) → X\): given \(C\), \(P(Y|do(X),C) = P(Y|X,C)\), so the two sides can differ only through the weights placed on \(C\)
- \(P(C|X) = P(C)\) ??
- \(C → X\) ⇒ dependent ⇒ \(P(C|X) \ne P(C)\) ⇒ \(P(Y|X) \ne P(Y|do(X))\) ⇒ correlation ≠ causation
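The gap between \(P(Y|X)\) and \(P(Y|do(X))\) can be seen with a small worked example. This is a sketch with made-up numbers for the graph \(X ← C → Y ← X\). The function names `observational` and `interventional` are hypothetical labels for the two sums derived above.

```python
# Hypothetical CPTs for the graph C -> X, C -> Y, X -> Y (all variables binary).
p_c = {0: 0.5, 1: 0.5}
p_x1_given_c = {0: 0.2, 1: 0.8}                                          # P(X=1 | c)
p_y1_given_xc = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.3, (1, 1): 0.7}   # P(Y=1 | x, c)


def p_x_given_c(x, c):
    """P(X=x | C=c) from the table of P(X=1 | c)."""
    return p_x1_given_c[c] if x == 1 else 1.0 - p_x1_given_c[c]


def p_c_given_x(c, x):
    """Bayes' rule: P(C=c | X=x) = P(X=x | c) P(c) / sum_c' P(X=x | c') P(c')."""
    p_x = sum(p_x_given_c(x, cc) * p_c[cc] for cc in p_c)
    return p_x_given_c(x, c) * p_c[c] / p_x


def observational(x):
    """P(Y=1 | X=x) = sum_C P(C | X=x) P(Y=1 | X=x, C)."""
    return sum(p_c_given_x(c, x) * p_y1_given_xc[(x, c)] for c in p_c)


def interventional(x):
    """P(Y=1 | do(X=x)) = sum_C P(C) P(Y=1 | X=x, C)."""
    return sum(p_c[c] * p_y1_given_xc[(x, c)] for c in p_c)


# Contrast between X=1 and X=0 under each reading.
assoc_diff = observational(1) - observational(0)
causal_diff = interventional(1) - interventional(0)

print(assoc_diff, causal_diff)
```

With these assumed numbers the observational contrast is about 0.44 while the interventional contrast is about 0.20. The only difference between the two functions is the weight on \(C\): \(P(C|X)\) versus \(P(C)\).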