Homework #11 Solution




  1. [8 points] Generative Adversarial Network (GAN)


(a)  What is the cost function for classical GANs? Use $D_w(x)$ as the discriminator and $G_\theta(z)$ as the generator, where the generator transforms $z \sim Z$ to $x \in X$.
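
For intuition, the sketch below estimates the standard minimax value function of Goodfellow et al., E_x[log D_w(x)] + E_z[log(1 - D_w(G_theta(z)))], from samples. The logistic discriminator, the affine generator, the Gaussian stand-in for the data, and all parameter values are hypothetical choices made only for illustration; they are not part of the problem statement.

```python
import math
import random

def discriminator(x, w):
    """Hypothetical logistic discriminator D_w(x) = sigmoid(w0 * x + w1)."""
    return 1.0 / (1.0 + math.exp(-(w[0] * x + w[1])))

def generator(z, theta):
    """Hypothetical affine generator G_theta(z) = theta0 * z + theta1."""
    return theta[0] * z + theta[1]

def gan_value(w, theta, data, noise):
    """Monte Carlo estimate of E_x[log D_w(x)] + E_z[log(1 - D_w(G_theta(z)))]."""
    real_term = sum(math.log(discriminator(x, w)) for x in data) / len(data)
    fake_term = sum(
        math.log(1.0 - discriminator(generator(z, theta), w)) for z in noise
    ) / len(noise)
    return real_term + fake_term

rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(5000)]   # assumed stand-in for p_data
noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # assumed prior for z

# A discriminator that outputs 1/2 everywhere (w = 0) scores -log 4.
v_blind = gan_value((0.0, 0.0), (1.0, 0.0), data, noise)
# A discriminator with a slope can partly separate the two samples.
v_better = gan_value((1.0, -1.0), (1.0, 0.0), data, noise)
```

The discriminator tries to push this value up, while the generator tries to push it down; comparing `v_blind` with `v_better` shows the first half of that game on toy data.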


Your answer:







(b)  Assume arbitrary capacity for both discriminator and generator. In this case we refer to the discriminator using $D(x)$, and denote the distribution on the data domain induced by the generator via $p_G(x)$. State an equivalent problem to the one asked for in part (a), by using $p_G(x)$ and the ground truth data distribution $p_{\text{data}}(x)$.


Your answer:








(c) Assuming arbitrary capacity, derive the optimal discriminator $D^*(x)$ in terms of $p_{\text{data}}(x)$ and $p_G(x)$.

You may need the Euler-Lagrange equation:

$$\frac{\partial L(x, D, \dot{D})}{\partial D} - \frac{d}{dx} \frac{\partial L(x, D, \dot{D})}{\partial \dot{D}} = 0,$$

where $\dot{D} = \partial D / \partial x$.
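
As a numerical sanity check, the sketch below assumes the integrand has the standard form a log D + b log(1 - D), with a = p_data(x) and b = p_G(x) at a fixed x. This integrand does not depend on the derivative of D, so the Euler-Lagrange equation reduces to setting the partial derivative with respect to D to zero. The values of `a` and `b` are hypothetical, and the grid search is only a brute-force comparison, not a derivation.

```python
import math

def integrand(d, a, b):
    """Pointwise integrand a*log(d) + b*log(1-d), with a = p_data(x), b = p_G(x)."""
    return a * math.log(d) + b * math.log(1.0 - d)

def argmax_on_grid(a, b, steps=100000):
    """Brute-force maximizer of the integrand over d in (0, 1)."""
    grid = (i / steps for i in range(1, steps))
    return max(grid, key=lambda d: integrand(d, a, b))

a, b = 0.3, 0.9            # hypothetical density values at some fixed x
d_grid = argmax_on_grid(a, b)
d_closed = a / (a + b)     # root of a/d - b/(1-d) = 0
```

Under these assumptions the grid maximizer should agree with the stationary point a / (a + b) up to the grid resolution.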


Your answer:










(d)  Assume arbitrary capacity and an optimal discriminator $D^*(x)$. Show that the optimal generator, $G^*(x)$, generates the distribution $p_G^* = p_{\text{data}}$, where $p_{\text{data}}(x)$ is the data distribution.



You may need the Jensen-Shannon divergence:

$$\mathrm{JSD}(p_{\text{data}}, p_G) = \frac{1}{2} D_{KL}(p_{\text{data}}, M) + \frac{1}{2} D_{KL}(p_G, M) \quad \text{with} \quad M = \frac{1}{2}(p_{\text{data}} + p_G)$$
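
To make the definition concrete, here is a minimal sketch of the Jensen-Shannon divergence for discrete distributions, using natural logarithms. The three-outcome distributions are hypothetical examples chosen to show the two extremes: identical distributions and distributions with disjoint supports.

```python
import math

def kl(p, q):
    """KL divergence D_KL(p, q) in nats for discrete distributions (lists)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence using the mixture M = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p_data = [0.5, 0.5, 0.0]
p_g_same = [0.5, 0.5, 0.0]
p_g_disjoint = [0.0, 0.0, 1.0]

jsd_same = jsd(p_data, p_g_same)          # identical distributions
jsd_disjoint = jsd(p_data, p_g_disjoint)  # non-overlapping supports
```

The divergence is zero for identical distributions and reaches log 2 when the supports do not overlap, so it stays finite even where the KL divergence between the two distributions would not.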



Your answer:


(e)  More recently, researchers have proposed to use the Wasserstein distance instead of divergences to train the models, since the KL divergence often fails to give meaningful information for training. Consider three distributions, $P_1 \sim U[0, 1]$, $P_2 \sim U[0.5, 1.5]$, and $P_3 \sim U[1, 2]$. Calculate $D_{KL}(P_1, P_2)$, $D_{KL}(P_1, P_3)$, $W_1(P_1, P_2)$, and $W_1(P_1, P_3)$, where $W_1$ is the Wasserstein-1 distance between distributions.
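
The four quantities can be checked numerically with the sketch below. It assumes the one-dimensional identity that W1 equals the integral over u in (0, 1) of the absolute difference of the two inverse CDFs, and it treats the KL divergence as infinite whenever the first density is positive where the second is zero. The helper names, the midpoint rule, and the step count are illustrative choices.

```python
import math

def uniform_pdf(a, b):
    """Density of U[a, b]."""
    return lambda x: 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_quantile(a, b):
    """Inverse CDF of U[a, b]."""
    return lambda u: a + (b - a) * u

def kl_midpoint(p, q, lo, hi, steps=4000):
    """Midpoint-rule estimate of D_KL(p, q); infinite if p > 0 where q = 0."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        px, qx = p(x), q(x)
        if px > 0 and qx == 0:
            return math.inf
        if px > 0:
            total += px * math.log(px / qx) * h
    return total

def w1_quantile(qp, qq, steps=4000):
    """W1 in 1-D as the integral over u in (0, 1) of |F_p^{-1}(u) - F_q^{-1}(u)|."""
    h = 1.0 / steps
    return sum(abs(qp((i + 0.5) * h) - qq((i + 0.5) * h)) * h for i in range(steps))

bounds = {"P1": (0.0, 1.0), "P2": (0.5, 1.5), "P3": (1.0, 2.0)}
pdf = {k: uniform_pdf(*v) for k, v in bounds.items()}
quant = {k: uniform_quantile(*v) for k, v in bounds.items()}

kl_12 = kl_midpoint(pdf["P1"], pdf["P2"], 0.0, 2.0)
kl_13 = kl_midpoint(pdf["P1"], pdf["P3"], 0.0, 2.0)
w1_12 = w1_quantile(quant["P1"], quant["P2"])
w1_13 = w1_quantile(quant["P1"], quant["P3"])
```

Under these assumptions both KL values come out infinite, because P1 puts mass where P2 and P3 have none, while W1 evaluates to about 0.5 and 1 and so still reflects how far apart the distributions are.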
















Your answer:

































































