Problem Set 1 SOlution

$30.00 $24.90



Please read the late submission and collaboration policy on the course website:

Install Anaconda for Python 3.6 from We will test all code on this distri-bution. You can install it locally in a separate directory without interfering with your system Python.

This problem set is provided as a ZIP file. You are reading the main problem set PDF in the root pset1/ directory. Look in the pset1/code sub-directory for code for different problems, with data in pset1/code/inputs. You will be required to modify and fill-in various functions in the .py files in the code directory. Running this code (e.g., with python ./ and so on) will create output images in the pset1/code/outputs directory.

  1. Complete the code files in pset1/code/ by filling out the required functions.

  1. Run each program to generated output images in pset1/code/outputs directory.

  1. Create a PDF report file created with LATEX. Use the template provided on the course website (scroll down to the resources section at the bottom). In particular, make sure you fill out your name/wustl key (top of the .tex file) to populate the page headers. Also fill out the final information section describing how long the problem set took you, who you collaborated / discussed the problem set with, as well as any online resources you used.

The main body of the report should contain responses to the math questions, and also include results and figures for the programming questions as asked for. These figures will often correspond to images generated by your python code in the pset/code/outputs/ directory.

Place this report file as solution.pdf in the root pset1/ directory.

Then, zip all the contents and sub-directories of pset1/ into a single ZIP file, and upload it to blackboard.

Write efficient code. While most of the points are for writing code that is correct, some points are allocated to efficiency. Above all, try to minimize the total number of multiplies / adds. For the same number of underlying operations, try to keep the use of for loops to a minimum (i.e., over a minimum number of indices). Instead, use convolution, element-wise operations over large arrays, calls to matrix multiply, etc.

PROBLEM 1 (Total: 10 points)

  1. Recall from Lecture 2 that for the observed image—ignoring clipping and rounding—is given by: I = gI0 + ;

where I0 the ideal noise-free image, g is the amplification factor, and is zero-mean Gaussian noise that captures contribu-tions from shot noise and (pre- and post-amplification) additive noise. What is the variance of , in terms of the amplification factor g, ideal intensity I0, and noise parameters 22a and 22b. (You can refer to the slides). (3 points)

  1. Let’s say we already have g as the optimal value of amplification (for clipping/rounding) for ideal intensity I0 which corresponds to an exposure time of T . Assuming the scene is static, I0 scales linearly with T . So if we chose to have a shorter exposure time T =k, the corresponding ideal intensity would also be I0=k. And let us say in that case, we would scale up the amplification factor to g k to compensate so that the noise-free version of I would be gI0. But in this case, what is the expected variance of noise now ? (3 points)

  1. Now, let’s say we took k such images I1; I2; : : : Ik (assuming k is an integer) with the shorter exposure times T =k. And

then, simply set I to be the average of these. Notice that the idea value for this average is still gI0 (since the ideal value for each Ii is again gI0). What is the noise variance in this case ? (3 points)

  1. If we had a time budget of T , what does this say about the noise trade-off between taking a single shot with exposure time T , and k shots with exposure time T =k ? Which is preferable ? (1 point)

PROBLEM 2 (Total: 15 points)

Implement histogram equalization. Fill out the histeq function in that takes as input 8-bit grayscale im-ages (i.e., single channel images where intensities are integers between 0 and 255), and outputs an equalized image of the same size (with integer intensities between 0 and 255). Run the program (from the code directory) as python This will load the image code/inputs/p2_inp.png, run your function on it, and store the result as code/outputs/prob2.png. Include this output in the report.

Please do not use any in-built or third-party code for histogram equalization. Implement it yourself. Hint: Look at (and use) the np.unique function in numpy.

PROBLEM 3 (Total: 15 points)

(a) Fill out the grads function in to return a map of gradient strength / magnitude H[n] and orientation [n] for an input grayscale image X[n]. Use the Sobel x- and y-derivative kernels:

Dx =




; Dy =

















The support code already imports the 2D convolution function scipy.signal.convolve2d as the conv2 function. Use this for carrying out same convolution: the size of your output image should be the same as that of your input im-age. We recommend using “symmetric” padding. Once you fill out the function and run, it should generate

code/outputs/prob3_a.png showing gradient magnitudes H for the image in code/inputs/p3_inp.png. In-clude that output image in your report. (5 points)

  1. In addition to the gradient magnitude, produces three edge images code/outputs/prob3_b_X.png, for X = 0, 1, and 2. These correspond to three threshold values in the code (variables T0,T1,T2). Thresholding is done simply by comparing the H image to these thresholds, which returns a boolean image, and casting it to float32, which maps False to 0 and True to 1. You will notice that while a high threshold misses a lot of edges, a low threshold leads to “thick” edge lines.

To mitigate the latter, implement non-maxima suppression (NMS) by filling out the function nms, which takes as input a binarized edge image E[n], and the original magnitude and orientation images H[n] and [n]. Your code should return a new edge image E+[n], where an edge pixel n in E[n] (i.e., with value 1) is set to 0 if corresponding magnitude value in H[n] is not higher than that of its two neighbors in the direction of the maximum gradient (from [n]), i.e., orthogonal to the edge direction. To keep things simple, only consider the horizontal, vertical, and two diagonal directions by rounding off [n] appropriately. Hint: Use the numpy function np.where and np.logical_and to get a list of oriented edge pixels and do boolean operations on arrays.

Running will also call nms and produce the thinned edge maps as code/outputs/prob3_b_nmsX.png. Try different values of the three thresholds that you think cover the range of acceptable edge images (after non-maxima suppression), and include the produced edge images (before and after NMS) in the report. (10 points)

PROBLEM 4 (Total: 15 points)

Implement Bilateral filtering by filling out the bfilt function in The function takes in a color image X and parameters S and I (spatial and intensity standard deviation), and neighborhood size K. It should return an output image Y [n] of the same size as X[n] where:

Y [n2] =

B[n1; n2]X[n1];

B[n1; n2]



jn1 n2j2

jX[n1] X[n2]j2


2 s2

2 I2



where B[n1; n2] is normalized such that


B[n1; n2] = 1. The support of the filter should be (2K + 1) (2K + 1),

i.e., the summation over


1 should be

from n


[K; K] to n

+ [K; K]. For locations outside the image boundary, you



should set the contribution weights B to 0 (and normalize accordingly). Also note that the weights B are scalar—both kn1 n2k2 and kX[n1] X[n2]k2 imply summing the squared differences across (x; y) co-ordinates and the three color channels respectively. However, the same weight is applied while averaging for all three color channels. calls the bfilt function with different parameters on two noisy images code/inputs/p4_nz*.png to create a number of output files code/outputs/prob4_*.png (most of them with the first image), with code/outputs/prob4_*rep.png corresponding to repeated bilateral filtering. Include these generated images in your report. (You may optionally also want to try different parameters to see if you can get better results.)

PROBLEM 5 (Total: 15 points)

  1. Let F [u; v] be the Fourier transform of a real-valued image X[nx; ny] of width WX , height HX . Treating the imaginary and real parts of a complex number as two separate scalars, show you only need to store WX HX distinct scalars from

F [u; v]. Explain which scalars you will be storing, and how you can recover the full F [u; v] from them. You may need to

do this separately for when both WX ; HX are even, when both are odd, and when one is odd and one even. (10 points)

(b) tries to implement convolution (same convolution with circular padding) in the Fourier domain. Fill out the kernpad function that takes a smaller kernel k and the size of the image X[n], and returns a zero-padded and circularly-rotated kernel of the same size as the image, so that the center of the original kernel is at the top-left corner. You can assume that both the height and width of k are odd. Running the script will create a file code/outputs/prob5.png that contains three columns: original image, output from calling regular convolution, and output from doing it in the Fourier domain with kernpad. The second and third column should look identical. Include this image in your report. (5 points)

PROBLEM 6 (Total: 30 points)

  1. Implement Harr wavelet decomposition by completing the im2wv function in The function takes a grayscale image and number of levels N as input. The output is a python list with N + 1 elements: [V1,V2,…,VN, X(N+1)]. All elements of this list, except for the last one, are lists themselves—each containing the three detail images

[H1; H2; H3]—for the corresponding level, with the first element V1 being the finest level (for which the three images have width and height half of that of the input image). The last element of the list returned by the function will be an image, corresponding to the coarse “scaling” coefficients at the top of the wavelet pyramid. contains code to visualize the pyramid and store the result as code/outputs/prob6a_*.png. Include these in the report. (15 points)

  1. Complete the wv2im function in This should take the pyramid list produced by wv2im and reconstruct the original image. You should infer the number of levels from the length of the list. Running also calls this reconstruction function, on the original pyramid and on versions with the finest levels zeroed out. These are all saved as

code/outputs/prob6b_*.png. Include these outputs in your report. (15 points)

Do not use any wavelet toolbox / libraries for this problem.