Homework #4 Multivariate Gaussians, Kalman Filtering, Maxi-mum Likelihood, EM Solution



For theory part, please make sure you have detailed steps. Final grades are given based on steps not the nal answer. Various starter les are provided in the ipython-notebook. Your report should be a PDF le containing:

Page 1: Problem 1

Page 2: Problem 2

Page 3: Problem 3

Pages 4 & 5: Problem 4 (c)

Page 6 and onwards: PDF rendering of completed ipython notebook, which will contain your answers to Problems 4(a), 4(b), 5, 6.

1. Maximum Likelihood [5 pts]

The Poisson distribution is a discrete probability distribution that expresses the probability of a number of events occurring in a xed period of time if these events occur with a known average rate and independently of the time since the last event. Assume we obtain m i.i.d.

samples x(1); : : : ; x(m) distributed according to the Poisson distribution P (x = k) = ke for


  • = 0; 1; 2; 3; : : :. What is the maximum likelihood estimate of as a function of x(1); : : : ; x(m)?

2. Linearity of Expectation, Positive Semi-de niteness [5 pts]

A matrix A 2 Rnxn is positive semi-de nite (often denoted by A 0) if and only if:

Aij = Aji

8z 2 Rn : z>Az 0

If A is also symmetric, prove that covariance matrices, i.e., matrices of the form = E[(X EX)(X EX)>] are guaranteed to be positive semi-de nite.

  1. The gaussian integral and completion of squares trick [10 pts] Let A 2 Rn n be a positive de nite matrix, b 2 Rn, and c 2 R. Prove that,








21 bT A 1b)


(2 )2




x b c) dx =


Hint: You can directly use the fact that the integral of a multivariate Gaussian equals 1.

CS287 Homework #4


4. Kalman Filtering, Smoothing, EM

  1. Implementation of Kalman Filtering [10 pts]

In this question you will implement a Kalman Filter. Look at Homework4 Q.ipynb for more detailed instructions. Results should be included in the completed ipython notebook.

  1. Implementation of Smoothing [10 pts]

In this question you will implement a Kalman Smoother. Look at Homework4 Q.ipynb for more detailed instructions. Results should be included in the completed ipython notebook.

  1. Implementation EM. [10 pts]

In this question you will implement the EM algorithm to estimate the covariance matrices. Look at Homework4 Q.ipynb for more detailed instructions. Results should be included in the completed ipython notebook.

  1. Application to Species Population Size Estimation from Observations of Total Population Size.[10 pts]

Consider three species U; V; W that grow independently of each other, exponentially with growth rates: U grows 2% per hour, V grows 6% per hour, and C grows 11% per hour. The goal is to estimate the initial size of each population based on the measurements of total population.

Let xU (t) denote the population size of species U after t hours, for t = 0; 1; : : :, and similarly for xV (t) and xW (t), so that

xU (t + 1) = 1:02xU (t); xV (t + 1) = 1:06xV (t); xW (t + 1) = 1:11xW (t):

The total population measurements are y(t) = xU (t) + xV (t) + xW (t) + v(t), where v(t) are IID, N (0; 0:36). (Thus the total population is measured with a standard deviation of 0.6).

The prior information is that xU (0); xV (0); xW (0) (which are what we want to estimate) are IID N (6; 2). (Obviously the Gaussian model is not completely accurate since it allows the initial populations to be negative with some small probability, but we’ll ignore that.)

How do you formulate this problem as a Kalman ltering problem by providing your A; B; C; d; Q; R. How long will it be (in hours) before we can estimate xU (0) with a variance less than 0.01?

How long for xV (0)? How long for xW (0)? Starter code can be found in Homework4 Q.ipynb.

Results should be included in the completed ipython notebook.

  1. Correlated Noise. [10 pts]

In many practical situations the noise is not independent. Consider the following stochastic system, for which the noise is not independent:

CS287 Homework #4

x0 N( 0; 0)

xt+1 = Axt + wt

wt = 0:3wt 1 + 0:2wt 2 + pt

pt N (0; pp)

yt = Cxt + vt

vt = 0:8vt 1 + qt 1

qt N (0; qq)

p 1 = q 1 = v 1 = w 1 = w



2 = 0

Describe how, by choosing the appropriate state representation, the above setup can be molded into a standard Kalman ltering setup. In particular, describe the state, the dy-namics model, and the measurement model such that the problem is transformed into the standard Kalman ltering setup with uncorrelated noise.

5. Sensor Selection [10 pts]

We consider the following linear system:



Axt + wt



Ctxt + vt

where A 2 Rn n is constant, but Ct can vary with time. The noise contributions are independent, and

x0 N (0; 0); wt N (0; w) vt N (0; v):

Here is the twist: the measurement matrix Ct at each time comes from the set S = fS1; : : : ; SK g.

In other words, at each time t, we have Ct = Sit . The sequence i0; i1; i2; : : : speci es which of

the K possible measurements is taken at time t. For example the sequence 2; 2; 2; : : : means

that Ct = S2 for all t. The sequence 1; 2; : : : ; K; 1; 2; : : : ; K; : : : is called round-robin: we cycle through the possible measurements, in order, over and over again.

Here is the interesting part: you get to choose the measurement sequence i0; i1; i2; : : :.

You will work with the following speci c system:






A = 4 10:1:1





; w = I; v = 0:1 ; 0

= I



and K = 3 with

0:64 ; S2 = 0:37 0:86 0:37 ; S3 = 0 0 1 :

S1 = 0:74 0:21

  1. Using One Sensor. Plot trace( tj0:t) versus t for the three special cases when Ct = S1 for all t, Ct = S2 for all t, and Ct = S3 for all t.

  1. Round-robin. Plot trace( tj0:t) versus t using the round-robin sensor sequence 1; 2; 3; 1; 2; 3; : : :.

  1. Greedy Sensor Selection. Plot trace( tj0:t) versus t using greedy sensor selection. In

greedy sensor selection at time t the choice of i0; i1; : : : ; it 1 has already been made and it

has determined tj0:t 1. Then tj0:t depends on it only, i.e., which of S1; : : : ; SK is chosen

as Ct. Among these K choices you pick the one that minimizes trace( tj0:t).

CS287 Homework #4


See Homework4 Q.ipynb part 3 and look for Your code here for the parts that you need to ll in. Results should be included in the completed ipython notebook.

Note none of these require knowledge of the measurements z0; z1; : : :.

6. Extended Kalman Filter (EKF)[20 pts]

In this question you get to implement an extended Kalman Filter (EKF) for state estimation for nonlinear dynamics and observation models.

Notes: Let x 2 RxDim be the system state, u 2 RuDim denote the control input applied to the system, and z 2 RzDim be the vector of observations obtained about the system state using sensors. We are given a discrete-time stochastic dynamics model that describes how the system state evolves and an observation model that relates the obtained observations to the state, given here in state-transition notation:


= f(xt; ut; qt);

qt N (0; Q);



= h(xt; rt);

rt N (0; R):


Given the current state estimate x^t, control input ut, covariance t, and observation zt+1, the

EKF update equations are given by:


= f(x^t; ut; 0) + Kt(zt+1

h(f(x^t; ut; 0); 0));



= (I

KtHt) t+1


where At



(x^t; ut; 0);

Mt =


(x^t; ut; 0);







(f(x^t; ut; 0); 0);

Nt =


(f(x^t; ut; 0); 0);





= At tAT + MtQMT ;

Kt =

HT (Ht


HT + NtRNT ) 1









Q. See Homework4 Q.ipynb part 4 and look for Your code here for the parts that you need to ll in. Use the routine provided in numerical jac function to compute the required Jacobians (Eqs. (3c), (3d)). The script considers a two-dimensional nonlinear system de ned by:



= 0:1 (xt[1])2 2 xt[1] + 20 + qt[1];


= xt[1] + 0:3 xt[2] 3 + 3 qt[2]

and zt[1]

= xtT xt + sin(5 rt[1]);



= 3 (xt[2])2=xt[1] + rt[2];

where dynamics noise qt and the measurement noise rt are assumed to be drawn from a Gaussian distribution with zero mean and speci ed variances Q = [ 20 02 ] and R = [ 10 100 ], respectively.

For verifying correctness, starting from an initial state x0 N ([ 1010 ] ; [ 10 01 ]), a random trajectory X1:50 was generated and corresponding observations Z1:50 were collected.

The provided code plots the ground truth value of the state X1:50 along each dimension and also plots the state estimate given by the EKF (blue dotted line) and plots 3-standard deviations of the variance corresponding to each dimension. It also prints out the state estimate x^50 and covariance 50 at the nal time step to help you verify correctness of your implementation. Results should be included in the completed ipython notebook.

error: Content is protected !!