Applications

  • Generating realistic data
  • So far, mostly used for images
    • Generate images from text or rough sketches
  • Can also be used to imitate actions
  • Predict outcome of high energy particle physics experiments

How do they work?

  • They’re generative models, similar to how RNNs can generate text
    • An RNN generates an image one pixel at a time
    • A GAN generates the whole image in parallel
      • Produces a more realistic image
  • Generating
    • Take in a random noise vector
    • Pass it through a differentiable function (the generator network)
    • The output is a generated image
  • Training (see the sketch after this list)
    • Using real data as the label and having the generator match it directly is too computationally demanding
    • Use a discriminator network to approximate this instead. The discriminator:
      • Sees real images half the time
      • Sees fake (generated) images the other half
    • To reduce its loss, the generator must “fool” the discriminator with more realistic images
    • The generator focuses on producing images the discriminator scores as real, while the discriminator gets better at telling fake from real
    • At the “perfect” equilibrium the discriminator outputs 0.5 for every image (it can’t tell real from fake)
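
A minimal PyTorch sketch of this training loop. The architectures, layer sizes, and hyperparameters are illustrative assumptions, not from the notes; real images are assumed to arrive flattened to vectors.

    import torch
    import torch.nn as nn

    latent_dim, img_dim = 100, 28 * 28

    # Generator: noise vector -> image (a differentiable function)
    G = nn.Sequential(
        nn.Linear(latent_dim, 256), nn.ReLU(),
        nn.Linear(256, img_dim), nn.Tanh(),
    )

    # Discriminator: image -> single real/fake logit
    D = nn.Sequential(
        nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
        nn.Linear(256, 1),  # raw logit; the sigmoid lives inside the loss
    )

    criterion = nn.BCEWithLogitsLoss()
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)

    def train_step(real_imgs):
        batch = real_imgs.size(0)
        ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

        # Discriminator sees real images (label 1) and fakes (label 0)
        fake_imgs = G(torch.randn(batch, latent_dim))
        d_loss = criterion(D(real_imgs), ones) + criterion(D(fake_imgs.detach()), zeros)
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # Generator step: labels are flipped, so the generator is
        # rewarded when the discriminator calls its fakes "real"
        g_loss = criterion(D(fake_imgs), ones)
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()
        return d_loss.item(), g_loss.item()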

Adversarial

  • “Adversarial” because discriminator and generator are in competition
  • Game theory:
    • Consider rock paper scissors:
      • Rational agents will decide the best strategy is playing randomly
    • GANs
      • The generator wants to minimize the value function (shown below)
      • The discriminator wants to maximize it
      • Equilibrium is reached when neither network can improve by changing its strategy (a Nash equilibrium)
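
For reference, the minimax value function from the original GAN paper (Goodfellow et al., 2014):

    \min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]

The discriminator pushes D(x) toward 1 on real images and D(G(z)) toward 0 on fakes; the generator pushes D(G(z)) toward 1.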

GAN Specific Concerns

  • We’re building two networks simultaneously
  • Requires building/training both:
    • Discriminator
    • Generator (trained with flipped labels, so its fakes are labeled “real”)
  • Typically want the output to be a probability (real vs. fake)
    • Conceptually, that is a sigmoid activation on the final output
    • In practice, use the BCEWithLogitsLoss loss function (not an optimizer); it applies the sigmoid internally, so the network outputs raw logits
  • Convolutional layers are used for more complex images (see the sketch after this list)
    • For classifying (the discriminator), go from tall and wide to short, narrow, and deep
    • For generating, do the opposite (e.g., with transposed convolutions)
  • Use batch normalization
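
A rough DCGAN-style sketch of the two architectures. Layer sizes are illustrative assumptions (1x28x28 inputs, e.g. MNIST), not from the notes:

    import torch.nn as nn

    # Discriminator: tall/wide -> short, narrow, and deep, ending in one logit
    discriminator = nn.Sequential(
        nn.Conv2d(1, 32, 4, stride=2, padding=1),   # 28x28 -> 14x14
        nn.LeakyReLU(0.2),
        nn.Conv2d(32, 64, 4, stride=2, padding=1),  # 14x14 -> 7x7
        nn.BatchNorm2d(64),
        nn.LeakyReLU(0.2),
        nn.Flatten(),
        nn.Linear(64 * 7 * 7, 1),                   # raw logit for BCEWithLogitsLoss
    )

    # Generator: the opposite direction, noise -> deep/narrow -> tall and wide
    generator = nn.Sequential(
        nn.Linear(100, 64 * 7 * 7),
        nn.Unflatten(1, (64, 7, 7)),
        nn.BatchNorm2d(64),
        nn.ReLU(),
        nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),  # 7x7 -> 14x14
        nn.BatchNorm2d(32),
        nn.ReLU(),
        nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),   # 14x14 -> 28x28
        nn.Tanh(),
    )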

Cycle GANs

  • Adversarial networks that transform one image into another
    • Horse -> Zebra
    • Doodle -> Realistic photo
  • AKA image to image transformations
  • Applications
    • Labeling objects in a photo
    • Edge detection
    • Colorization
    • Sharpening

Loss Functions

  • Image transformation, naive approach
    • Use the ground truth (the desired outcome) as the target
      • A color photo of a bird, for example
    • Use the b & w image as input
    • The network attempts to colorize, then measures success by comparing its output to the ground truth pixel by pixel
  • GAN approach (see the sketch after this list)
    • Use a discriminator to compare real images vs. fake (generated) images
    • Now the network is trying to fool the discriminator instead of matching existing images exactly
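
A minimal sketch contrasting the two loss styles for colorization. The names `colorizer` (b & w image -> color image network) and `discriminator` (image -> single logit network) are assumptions for illustration:

    import torch
    import torch.nn as nn

    l1 = nn.L1Loss()              # naive: pixel-wise comparison to the ground truth
    bce = nn.BCEWithLogitsLoss()  # adversarial: score from the discriminator

    def naive_loss(gray_batch, color_batch):
        # Penalize pixel-by-pixel differences from the real color photo
        return l1(colorizer(gray_batch), color_batch)

    def adversarial_loss(gray_batch):
        fake_color = colorizer(gray_batch)
        real_labels = torch.ones(gray_batch.size(0), 1)
        # Reward the colorizer when the discriminator scores its output as real
        return bce(discriminator(fake_color), real_labels)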

Pix2Pix

  • Instead of generating images from noise, generate from an input image
  • Modify the discriminator to assess pairs
    • The input doodle paired with either the real image or the generated one
  • Is it necessary to manually produce doodle/real pairs?
    • No.
    • Take a set of doodles (or any set of input data), X, and a set of real, corresponding images, Y
      • So a bunch of doodles of horses and a bunch of photos of horses
    • Learn a mapping from X to Y, but many X may map to a single Y
    • Also learn the reverse mapping, Y to X, and keep translations that map back to each other (cycle consistency); see the sketch below
    • Requires 2 discriminators, one per domain
    • This is a CycleGAN
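
A sketch of the CycleGAN generator loss under these assumptions: G_XY and G_YX are the two mapping networks (X -> Y and Y -> X), D_X and D_Y are the two discriminators, and the cycle weight lam uses the common choice of 10. All of these names are hypothetical:

    import torch
    import torch.nn as nn

    bce = nn.BCEWithLogitsLoss()
    l1 = nn.L1Loss()

    def cycle_gan_generator_loss(x_batch, y_batch, lam=10.0):
        fake_y = G_XY(x_batch)   # doodle -> photo
        fake_x = G_YX(y_batch)   # photo -> doodle

        # Adversarial terms: each mapping tries to fool its domain's discriminator
        logits_y = D_Y(fake_y)
        logits_x = D_X(fake_x)
        adv = bce(logits_y, torch.ones_like(logits_y)) + bce(logits_x, torch.ones_like(logits_x))

        # Cycle-consistency terms: translating there and back should
        # recover the original image
        cycle = l1(G_YX(fake_y), x_batch) + l1(G_XY(fake_x), y_batch)

        return adv + lam * cycle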