Assignment 3 Solution

$30.00

Category: Tag:

Description

3.1 Inference in a chain

X1 X2 X3 • • • XT-2 XT-1 XT

Consider the belief network shown above with random variables Xt 2f1; 2; : : : ; mg. Suppose that the CPT at each non-root node is given by the same m m matrix; that is, for all t 1, we have:

Aij = P (Xt+1 =jjXt = i):

  1. Prove that P (Xt+1 = jjX1 = i) = [At]ij, where At is the tth power of the matrix A. Hint: use induction.

  1. Consider the computational complexity of this inference. Devise a simple algorithm, based on matrix-vector multiplication, that scales as O(m2t).

  1. Show alternatively that the inference can also be done in O(m3 log2 t).

  1. Suppose that the transition matrix Aij is sparse, with at most s m non-zero elements per row. Show that in this case the inference can be done in O(smt).

  1. Show how to compute the posterior probability P (X1 = ijXT +1 = j) in terms of the matrix A and the prior probability P (X1 = i). Hint: use Bayes rule and your answer from part (a).

3.2 More inference in a chain

X0 X1

Consider the simple belief network shown to the right, with nodes X0, X1, and Y1.

To compute the posterior probability P (X1jY1), we can use Bayes rule:

P(X Y

) =

P (Y1jX1) P (X1)

Y1

1j 1

P(Y1)

(a) Show how to compute the conditional probability P (Y1jX1) that appears in the numerator of Bayes rule from the CPTs of the belief network.

(b) Show how to compute the marginal probability P (Y1) that appears in the denominator of Bayes rule from the CPTs of the belief network.

Next you will show how to generalize these simple computations when the basic structure of this DAG is repeated to form an extended chain. Like the previous problem, this is another instance of efficient inference in polytrees.

1

X0

X1

X2

. . .

Xn-1

Xn

Y1

Y2

. . .

Yn-1

Yn

Consider how to efficiently compute the posterior probability P (XnjY1; Y2; : : : ; Yn) in the above belief network. One approach is to derive a recursion from the conditionalized form of Bayes rule

P (YnjXn; Y1; Y2; : : : ; Yn 1) P (XnjY1; Y2; : : : ; Yn 1)

P (XnjY1; Y2; : : : ; Yn) =

where the nodes Y1; Y2; : : : ; Yn 1 are treated as background evidence. In this problem you will express the conditional probabilities on the right hand side of this equation in terms of the CPTs of the network and the probabilities P (Xn 1 = xjY1; Y2; : : : ; Yn 1), which you may assume have been computed at a previous step of the recursion. Your answers to (a) and (b) should be helpful here.

  1. Simplify the term P (XnjY1; Y2; : : : ; Yn 1) that appears in the numerator of Bayes rule.

  1. Show how to compute the conditional probability P (YnjXn; Y1; Y2; : : : ; Yn 1) that appears in the

numerator of Bayes rule. Express your answer in terms of the CPTs of the belief network and the probabilities P (Xn 1 = xjY1; Y2; : : : ; Yn 1), which you may assume have already been computed.

  1. Show how to compute the conditional probability P (YnjY1; Y2; : : : ; Yn 1) that appears in the de-nominator of Bayes rule. Express your answer in terms of the CPTs of the belief network and the

probabilities P (Xn 1 = xjY1; Y2; : : : ; Yn 1), which you may assume have already been computed.

3.3 Node clustering and polytrees

In the figure below, circle the DAGs that are polytrees. In the other DAGs, shade two nodes that could be clustered so that the resulting DAG is a polytree.

2

3.4 Cutsets and polytrees

Clearly not all problems of inference are intractable in loopy belief networks. As a trivial example, consider the query P (QjE1; E2) in the network shown below:

E1

Q

E2

In this case, because E1 and E2 are the parents of Q, the query P (QjE1; E2) can be answered directly from the conditional probability table at node Q.

As a less trivial example, consider how to compute the posterior probability P (QjE1; E2) in the belief net-work shown below:

E1

Q

E2

In this belief network, the posterior probability P (QjE1; E2) can be correctly computed by running the polytree algorithm on the subgraph of nodes that are enclosed by the dotted rectangle:

E1

Q

E2

In this example, the evidence nodes d-separate the query node from the loopy parts of the network. Thus for this inference the polytree algorithm would terminate before encountering any loops.

(continued)

3

For each of the five loopy belief networks shown below, consider how to compute the posterior probability P (QjE1; E2).

If the inference can be performed by running the polytree algorithm on a subgraph, enclose this subgraph by a dotted line as shown on the previous page. (The subgraph should be a polytree.)

On the other hand, if the inference cannot be performed in this way, shade one node in the belief network that can be instantiated to induce a polytree by the method of cutset conditioning.

E1

E1

Q Q

E2

E2

E1 Q E2

E1 Q E2

E1 Q E2

4

3.5 Node clustering

Consider the belief network shown below over binary variables X, Y1, Y2, Y3, Z1, and Z2. The network can be transformed into a polytree by clustering the nodes Y1, Y2, and Y3 into a single node Y . From the CPTs in the original belief network, fill in the missing elements of the CPTs for the polytree.

X

X

P (Y1 = 1jX)

P (Y2 = 1jX)

P (Y3 = 1jX)

0

0.75

0.50

0.25

1

0.50

0.25

0.75

Y1

Y2

Y3

Y1

Y2

Y3

P (Z1 = 1jY1; Y2; Y3)

P (Z2 = 1jY1; Y2; Y3)

0

0

0

0.9

0.1

1

0

0

0.8

0.2

0

1

0

0.7

0.3

Z1

Z2

0

0

1

0.6

0.4

1

1

0

0.5

0.5

1

0

1

0.4

0.6

0

1

1

0.3

0.7

1

1

1

0.2

0.8

X

Y1

Y2

Y3

Y

P (Y jX = 0)

P (Y jX = 1)

P (Z1 = 1jY )

P (Z2 = 1jY )

0

0

0

1

1

0

0

2

0

1

0

3

Y

0

0

1

4

1

1

0

5

1

0

1

6

0

1

1

7

Z1

Z2

1

1

1

8

5

3.6 Likelihood weighting

Consider the belief network shown below, with n binary random variables Bi 2f0; 1g and an integer random

variable Z. Let f(B) =

n

2

i

1

Bi

denote the nonnegative integer whose binary representation is given

i=1

by

B B

1

:::B B

Suppose that each bit has prior probability P (B = 1) =

1

, and that

2

n n

2 1.

P

i

1

P (ZjB1; B2; : : : ; Bn) =

jZ f(B)j

1 +

where 0 < < 1 is a parameter measuring the amount of noise in the conversion from binary to decimal. (Larger values of indicate greater levels of noise.)

bits Bi

integer Z

(a) Show that the conditional distribution for binary to decimal conversion is normalized; namely, that

Pz P (Z = zjB1; B2; : : : ; Bn) = 1, where the sum is over all integers z 2 [ ; +1].

  1. Consider a network with n = 10 bits and noise level = 0:1. Use the method of likelihood weighting to estimate the probability P (Bi = 1jZ = 128) for i 2 f2; 5; 8; 10g.

  1. Plot your estimates in part (b) as a function of the number of samples. You should be confident from the plots that your estimates have converged to a good degree of precision (say, at least two significant digits).

  1. Submit a hard-copy printout of your source code. You may program in the language of your choice, and you may use any program at your disposal to plot the results.

6

3.7 Even more inference

Show how to perform the desired inference in each of the belief networks shown below. Justify briefly each step in your calculations.

  1. Markov blanket

Show how to compute the posterior probability P (BjA; C; D) in terms of the CPTs of this belief network—namely, P (A), P (BjA), P (C), and P (DjB; C).

  1. Conditional independence

This belief network has conditional probability tables for P (F jA) and P (EjC) in addition to those of the previous problem. Assuming that all these tables are given, show how to compute the posterior probability

P (BjA; C; D; E; F ) in terms of these additional CPTs and your answer to part (a).

  1. More conditional independence

Assuming that all the conditional probability tables in this belief network are given, show how to compute the posterior probability P (B; E; F jA; C; D). Express your answer in terms of the CPTs of the network, as well as your earlier answers for parts (a) and (b).

A B D

C

F

A B D

C E

F

A B D

C E

7

3.8 More likelihood weighting

  1. Single node of evidence

X Q

Y Z

E

Suppose that T samples fqt; xt; yt; ztgTt=1 are drawn from the CPTs of the belief network shown above (with fixed evidence E = e). Show how to estimate P (Q = qjE = e) from these samples using the method of likelihood weighting. Express your answer in terms of sums over indicator functions, such as:

I(q; q0) = (

1

if q = q

0

0

otherwise

In addition, all probabilities in your answer should be expressed in terms of CPTs of the belief net-work (i.e., probabilities that do not require any additional computation).

  1. Multiple nodes of evidence

Suppose that T samples fq1t; q2t; xt; yt; ztgTt=1 are drawn from the CPTs of the network shown below (with fixed evidence E1 = e1 and E2 = e2). Show how to estimate P (Q1 = q1; Q2 = q2jE1 = e1; E2 = e2) from these samples using the method of likelihood weighting. Express your answer in terms of indicator functions and CPTs of the belief network.

Q1 X Y

E1 Z

E2 Q2

8


error: Content is protected !!