Problem Set IV Solution



  1. Autoencoder (30%). Train an autoencoder (AE) network (provided in Matlab) on the aligned faces obtained in PS3, question 1. Reconstruct the training data with different sizes of the latent (hidden) layer to answer the following questions.

No regularization in the AE:

    1) Can you recover the original aligned faces when the hidden layer has full size? Plot the results.

    2) Set the hidden layer size to 1%, 3%, and 10% of the original data dimension, plot the reconstructed faces, and report the reconstruction error.

    3) Increase the weight $w$ of the L2 regularization term in the AE, and plot the reconstruction errors for the different weights $w \in \{0.1, 0.2, 0.3, 0.4, 0.5\}$.
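The hidden-layer sizes and the reported error in the questions above can be sketched as follows. This is only an illustration: the 64x64 face size is a hypothetical placeholder, the helper names are invented here, and mean squared error is an assumed definition of "reconstruction error" (the provided Matlab network may report a different quantity).

```python
def hidden_sizes(input_dim, fractions=(0.01, 0.03, 0.10)):
    """Hidden-layer widths as fractions of the input dimension (at least 1 unit)."""
    return [max(1, round(f * input_dim)) for f in fractions]


def reconstruction_error(originals, reconstructions):
    """Mean squared error over all pixels of all images (assumed error measure)."""
    total, count = 0.0, 0
    for x, x_hat in zip(originals, reconstructions):
        for a, b in zip(x, x_hat):
            total += (a - b) ** 2
            count += 1
    return total / count


# Hypothetical example: 64x64 aligned faces flattened to 4096 values.
sizes = hidden_sizes(64 * 64)
print(sizes)

# Two tiny "images" with one pixel off by 1 out of four pixels in total.
print(reconstruction_error([[0, 0], [1, 1]], [[0, 1], [1, 1]]))
```

Reporting the same error measure for every hidden size and every weight $w$ keeps the resulting plots comparable.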

  2. Regression (70%). Given data $(X, Y)$ with $X \in \mathbb{R}^d$ and $Y \in \{0, 1\}$, our goal is to train a classifier that will predict an unknown class label $\tilde{y}$ from a new data point $\tilde{x}$. Consider the following model:

$$Y \sim \mathrm{Ber}\left(\frac{1}{1 + e^{-X^T \beta}}\right),$$

$$\beta \sim N(0, \sigma^2 I).$$

This is a Bayesian logistic regression model. Your goal is to derive and implement MAP (maximum a posteriori) Bayesian inference on $\beta$.

(a) Write down the formula for the unnormalized posterior of $\beta \mid Y$, i.e.,

$$p(\beta \mid y, x, \sigma) \propto \left[\prod_i p(y_i \mid \beta, x_i)\right] p(\beta; \sigma)$$


(b) Show that this posterior is proportional to $\exp(-U(\beta))$, where

$$U(\beta) = \sum_i \left[(1 - y_i)\, x_i^T \beta + \log\left(1 + e^{-x_i^T \beta}\right)\right] + \frac{1}{2\sigma^2} \|\beta\|^2.$$
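One way to see where each term of $U$ comes from is to take the negative log of the likelihood and of the prior separately. The shorthand $s_i = 1 / (1 + e^{-x_i^T \beta})$ is introduced here only for this sketch:

```latex
\begin{align*}
-\log p(y_i \mid \beta, x_i)
  &= -y_i \log s_i - (1 - y_i) \log(1 - s_i) \\
  &= y_i \log\left(1 + e^{-x_i^T \beta}\right)
     + (1 - y_i)\left[x_i^T \beta + \log\left(1 + e^{-x_i^T \beta}\right)\right] \\
  &= (1 - y_i)\, x_i^T \beta + \log\left(1 + e^{-x_i^T \beta}\right), \\
-\log p(\beta; \sigma)
  &= \frac{1}{2\sigma^2} \|\beta\|^2 + \text{const}.
\end{align*}
```

Summing the first result over $i$ and adding the second gives $U(\beta)$ up to an additive constant, which disappears into the proportionality.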


(c) Implement MAP to infer $\beta$.
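The problem does not prescribe an optimizer, so the following is only a minimal sketch that assumes plain fixed-step gradient descent on $U(\beta)$. The gradient used is $\sum_i (s_i - y_i)\, x_i + \beta / \sigma^2$ with $s_i$ the logistic function of $x_i^T \beta$; the names `step` and `n_steps`, their default values, and the toy data are illustrative assumptions, not part of the assignment.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def energy(beta, X, y, sigma):
    """U(beta): negative log posterior up to an additive constant."""
    u = sum(b * b for b in beta) / (2.0 * sigma ** 2)
    for xi, yi in zip(X, y):
        z = sum(a * b for a, b in zip(xi, beta))
        u += (1 - yi) * z + softplus(-z)
    return u


def grad_energy(beta, X, y, sigma):
    """Gradient of U: sum_i (sigmoid(x_i^T beta) - y_i) x_i + beta / sigma^2."""
    g = [b / sigma ** 2 for b in beta]
    for xi, yi in zip(X, y):
        z = sum(a * b for a, b in zip(xi, beta))
        r = sigmoid(z) - yi
        for j, a in enumerate(xi):
            g[j] += r * a
    return g


def map_estimate(X, y, sigma=1.0, step=0.01, n_steps=5000):
    """Minimise U by fixed-step gradient descent, starting from beta = 0."""
    beta = [0.0] * len(X[0])
    for _ in range(n_steps):
        g = grad_energy(beta, X, y, sigma)
        beta = [b - step * gj for b, gj in zip(beta, g)]
    return beta


# Toy data (illustrative only): a constant column plus one feature.
X_toy = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y_toy = [0, 0, 1, 1]
beta_hat = map_estimate(X_toy, y_toy, sigma=1.0)
print(beta_hat, energy(beta_hat, X_toy, y_toy, 1.0))
```

Because the Gaussian prior makes $U$ strictly convex, any reasonable descent method should reach the same minimizer; a fixed step is simply the easiest to read, and it may need to be reduced for data with larger feature values.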

(d) Use your code to analyze the iris data (provided in a txt file), looking only at two species, versicolor and virginica. The species labels are your $Y$ data, and the four features, petal length and width, sepal length and width, are your $X$ data. Also, add a constant term, i.e., a column of 1's to your $X$ matrix. Use the first 30 rows for each species as training data and leave out the last 20 rows for each species as test data (for a total of 60 training and 40 testing). Use the estimated $\beta$ to get a prediction, $\tilde{y}$, of the class labels for the test data.
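The train/test split and the constant column described above might look like the sketch below. The layout of the provided txt file is not given here, so the example works on placeholder rows rather than parsing the file; the helper name and the 0/1 coding of the two species are assumptions.

```python
def split_by_species(rows, labels, n_train=30):
    """Split each class: first n_train rows for training, the rest for testing."""
    train_X, train_y, test_X, test_y = [], [], [], []
    seen = {}  # number of rows encountered so far for each label
    for row, label in zip(rows, labels):
        seen[label] = seen.get(label, 0) + 1
        x = [1.0] + list(row)  # prepend the constant (intercept) term
        if seen[label] <= n_train:
            train_X.append(x)
            train_y.append(label)
        else:
            test_X.append(x)
            test_y.append(label)
    return train_X, train_y, test_X, test_y


# Placeholder data standing in for the 50 + 50 versicolor/virginica rows.
rows = [[float(i), 0.0, 0.0, 0.0] for i in range(100)]
labels = [0] * 50 + [1] * 50  # assumed coding: 0 = versicolor, 1 = virginica
train_X, train_y, test_X, test_y = split_by_species(rows, labels)
print(len(train_X), len(test_X))
```

Counting rows per label, rather than slicing by position, keeps the split correct even if the two species are not stored in contiguous blocks.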

(e) Compare this to the true class labels, $y$, and see how well you did by estimating the average error rate, $E[|y - \tilde{y}|]$ (a.k.a. the zero-one loss). What values of , , and L did you use?
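For 0/1 labels, the average of $|y - \tilde{y}|$ is just the fraction of misclassified test points. A small sketch, assuming the prediction rule "label 1 when the modelled probability exceeds 0.5" (the threshold is an assumption, and the coefficients and data below are made up for illustration):

```python
def predict(beta, X):
    """Label 1 when x^T beta > 0, i.e. when the modelled P(Y = 1 | x) exceeds 0.5."""
    return [1 if sum(a * b for a, b in zip(x, beta)) > 0 else 0 for x in X]


def error_rate(y_true, y_pred):
    """Average zero-one loss: the mean of |y - y_tilde| for 0/1 labels."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


# Made-up coefficients and test points (constant column first).
beta_demo = [0.0, 1.0]
X_demo = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y_demo = [0, 1, 1, 1]

y_pred = predict(beta_demo, X_demo)
print(y_pred, error_rate(y_demo, y_pred))
```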