# Homework 3 Solution


Task: Work through each set of exercises. If you get stuck on any of the exercises, you can ask Yi or me for help by email or during office hours.

What to submit: Submit your answers for all of the exercises in this document to the appropriate dropbox on the Carmen site. Answers for the concept check and proof sections can be hand-written (e.g., submitted as a scanned image), but please make sure that your writing is readable. Answers to the coding section must be written in Python and must be runnable by the grader.

Due date: Submit your answers to the Carmen dropbox by 11:59pm, Jun. 27th.

## Concept check

1. (2pt) Using the alarm network on slide 3 of the Bayesian Inference slides, compute P(B | +j, +m).

2. (3pt) Refer to the Naive Bayes Classifier shown below. Suppose C has domain {c1, c2, c3} and each Xi is a Boolean variable with values true and false. Using the Bayesian net G, compute the following distribution, showing the manner in which you derived your answer.

P(C | X1 = false, X2 = true, X3 = false).
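As a sketch of how such a posterior can be computed, the snippet below multiplies the class prior by each feature likelihood and normalizes. The CPT values are read from Figure 1 below; P(c3) = 0.2 follows because the prior must sum to 1.

```python
# Posterior P(C | X1=false, X2=true, X3=false) for the naive Bayes net in Figure 1.
# CPT values are read from Figure 1; P(c3) = 1 - 0.3 - 0.5 = 0.2.
prior = {"c1": 0.3, "c2": 0.5, "c3": 0.2}
p_true = {  # P(Xi = true | c) for each feature
    "X1": {"c1": 0.7, "c2": 0.4, "c3": 0.2},
    "X2": {"c1": 0.9, "c2": 0.5, "c3": 0.7},
    "X3": {"c1": 0.6, "c2": 0.4, "c3": 0.2},
}
evidence = {"X1": False, "X2": True, "X3": False}

# Unnormalized score for each class: P(c) * prod_i P(Xi = xi | c).
scores = {}
for c in prior:
    score = prior[c]
    for feat, value in evidence.items():
        p = p_true[feat][c]
        score *= p if value else (1 - p)
    scores[c] = score

# Normalize the scores to obtain the posterior distribution over C.
total = sum(scores.values())
posterior = {c: s / total for c, s in scores.items()}
print(posterior)
```

The same structure works for any naive Bayes query: one prior factor, one likelihood factor per observed feature, then a single normalization at the end.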

3. (3pt) The sigmoid function

   s(z) = 1 / (1 + e^(-z))

The CPTs from Figure 1 (class node C with children X1, X2, X3 in the net G):

C: P(c1) = 0.3, P(c2) = 0.5 (so P(c3) = 0.2)

|    | P(true \| c1) | P(true \| c2) | P(true \| c3) |
|----|---------------|---------------|---------------|
| X1 | 0.7           | 0.4           | 0.2           |
| X2 | 0.9           | 0.5           | 0.7           |
| X3 | 0.6           | 0.4           | 0.2           |

Figure 1: A Naive Bayes Classifier.

has derivative s'(z) = s(z)(1 - s(z)). Moreover, recall that during backpropagation the derivative s'(z) is a factor in the gradient computation used to update the weights of a multilayer perceptron (see slides 28-30 in the neural-nets.pdf slide set). Activation functions like sigmoid have a "saturation" problem: when z is very large or very small, s(z) is close to 1 or 0, respectively, and so s'(z) is close to 0. As a result, the corresponding gradients will be nearly 0, which slows down training. Affine activation functions with positive slope always have a positive derivative and thus will (more or less) not exhibit saturation, but they have other drawbacks (think back to lab 6). Do a little research and find a non-affine activation function that avoids the saturation problem (hint: ReLU). In your own words, describe how this activation is non-affine and also avoids the saturation problem. Briefly discuss any drawbacks your chosen activation function may have, as well as similar alternatives that avoid these drawbacks.
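The saturation contrast can be seen numerically. The sketch below (function names are my own) evaluates the sigmoid derivative and the ReLU derivative at a few points: the former collapses toward 0 for large |z|, while the latter stays at 1 for every positive z.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_deriv(z):
    # s'(z) = s(z)(1 - s(z)): peaks at 0.25 when z = 0, vanishes for large |z|.
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_deriv(z):
    # ReLU(z) = max(0, z), so its derivative is 1 for z > 0 and 0 for z < 0.
    return 1.0 if z > 0 else 0.0

for z in [0.0, 2.0, 10.0]:
    print(f"z = {z:5.1f}   sigmoid' = {sigmoid_deriv(z):.6f}   relu' = {relu_deriv(z):.1f}")
```

Note that ReLU has its own failure mode (a unit stuck on the z < 0 side gets zero gradient forever, the "dying ReLU" problem), which is the kind of drawback the question asks you to discuss.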

## Coding

1. (8pt) Implement in Python a convolutional layer (without identity activation) that computes the application of the 3×3 vertical and horizontal Sobel masks below to an input image of size 5×5×3 with zero-padding of size 1. That is, the weights of your convolutional layer will not be learned, but rather hard-coded to match the values of the filters. To make things concrete, use the input volume and masks below:

input volume (three 5×5 channels, shown side by side):

```
channel 1:     channel 2:     channel 3:
3 1 3 8 2      5 4 1 3 8      2 2 3 7 3
4 1 5 7 9      4 9 1 4 7      6 9 4 4 5
2 1 4 5 0      7 3 1 4 6      1 1 1 1 1
4 1 5 8 3      8 4 1 5 2      8 3 4 5 5
3 1 4 7 2      2 3 1 8 2      7 2 3 1 4
```


```
vertical mask:    horizontal mask:
-1  0  1           1  2  1
-2  0  2           0  0  0
-1  0  1          -1 -2 -1
```
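One possible sketch of such a layer, assuming (as is standard for a convolutional layer) that each 3×3 mask is replicated across the three input channels and the per-channel responses are summed, and that the masks are the standard Sobel kernels:

```python
import numpy as np

# Input volume: three 5x5 channels from the exercise, shape (channels, height, width).
x = np.array([
    [[3, 1, 3, 8, 2], [4, 1, 5, 7, 9], [2, 1, 4, 5, 0], [4, 1, 5, 8, 3], [3, 1, 4, 7, 2]],
    [[5, 4, 1, 3, 8], [4, 9, 1, 4, 7], [7, 3, 1, 4, 6], [8, 4, 1, 5, 2], [2, 3, 1, 8, 2]],
    [[2, 2, 3, 7, 3], [6, 9, 4, 4, 5], [1, 1, 1, 1, 1], [8, 3, 4, 5, 5], [7, 2, 3, 1, 4]],
], dtype=float)

# Standard 3x3 Sobel masks.
vertical = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
horizontal = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)

def conv_layer(volume, mask, pad=1):
    """Apply one 3x3 mask to every channel and sum across channels.

    Zero-padding of size `pad` keeps the 5x5 spatial size; no nonlinearity
    is applied afterwards.
    """
    c, h, w = volume.shape
    padded = np.pad(volume, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            # Element-wise product of the 3x3x3 patch with the (broadcast) mask.
            out[i, j] = np.sum(padded[:, i:i + 3, j:j + 3] * mask)
    return out

print(conv_layer(x, vertical))
print(conv_layer(x, horizontal))
```

The hard-coded weights take the place of learned filter parameters, which is exactly what the exercise asks for; swapping in learned weights would only change the two mask arrays.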
2. (9pt) Implement a perceptron that can learn the Boolean function AND using the threshold activation function.
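One possible sketch (the learning rate and variable names are my own choices): the classic perceptron update rule with a threshold (step) activation converges on AND because the data are linearly separable.

```python
# Perceptron learning the Boolean AND function with a threshold activation.
# Each example is ((x1, x2), target); a bias is learned alongside the weights.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def predict(weights, bias, x):
    # Threshold (step) activation: output 1 iff the weighted sum is positive.
    s = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1 if s > 0 else 0

weights, bias, lr = [0.0, 0.0], 0.0, 0.1
for epoch in range(20):  # AND is linearly separable, so this converges quickly
    errors = 0
    for x, target in data:
        err = target - predict(weights, bias, x)
        if err:
            # Perceptron rule: nudge weights toward (away from) misclassified input.
            errors += 1
            weights[0] += lr * err * x[0]
            weights[1] += lr * err * x[1]
            bias += lr * err
    if errors == 0:  # a full pass with no mistakes means we are done
        break

print([predict(weights, bias, x) for x, _ in data])
```

The same loop fails to terminate on XOR no matter how many epochs you allow, which is worth keeping in mind for the proof section below.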

## Fun with proofs

1. (5pt) Prove that a multilayer perceptron with one hidden layer with two neurons and an output layer with one neuron is an affine function of the input if the activation function for each neuron is an affine function. To make things simple and concrete, you need only demonstrate the result for the MLP shown below. Briefly explain the implications of this result for using multilayer perceptrons with affine activation functions to learn the XOR data.

Figure 2: Multilayer Perceptron
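The central algebraic step of such a proof can be sketched as follows (the names v_i, c_i, w_i, b are my own notation for the network's weights; a(z) = αz + β is a generic affine activation):

```latex
% Hidden layer (two neurons, affine activation a(z) = \alpha z + \beta):
h_i = a(v_i^{\top} x + c_i) = \alpha\, v_i^{\top} x + (\alpha c_i + \beta),
      \qquad i = 1, 2.
% Output neuron (same activation), after substituting h_1 and h_2:
f(x) = a(w_1 h_1 + w_2 h_2 + b)
     = \alpha^{2} (w_1 v_1 + w_2 v_2)^{\top} x
       + \alpha\bigl(\alpha(w_1 c_1 + w_2 c_2) + (w_1 + w_2)\beta + b\bigr) + \beta.
```

Since f(x) is a fixed vector dotted with x plus a constant, the whole network is affine in its input, so its decision boundary is a single hyperplane; because the XOR data are not linearly separable, no choice of weights can fit them.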

