Separate style and content. Merge style from one source onto content from another. Based on a paper that accomplished this using VGG19 network.

Isolate a style

  • The filters in a convolutional layer highlight certain features.
  • Style can be approximated by looking for patterns in the filters.
    • For example, a filter might respond strongly to lots of vertical pink lines.

Transfer the Style

  • Style and content images are passed through the conv network.
  • Target image is taken from the output of a single layer late in the network.
  • The styles to be imposed on the target image are extracted from multiple layers.
  • While layering styles onto the target, a ‘content loss’ is calculated.
  • A ‘style loss’ is calculated as well.
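A minimal sketch of the content loss (the layer shape is illustrative, matching conv4_2 of VGG19 on a 224 x 224 input; the style loss is built the same way but on Gram matrices, covered below):

```python
import torch

# Content loss: mean squared difference between the target image's
# features and the content image's features at the same late conv layer.
def content_loss(target_features, content_features):
    return torch.mean((target_features - content_features) ** 2)

# Stand-in feature maps; in the exercise these come from the network.
target_features = torch.rand(1, 512, 28, 28)
content_features = torch.rand(1, 512, 28, 28)
loss = content_loss(target_features, content_features)
```

When the target's features match the content's features exactly, this loss is zero.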

Unconventional Use of CNN

  • We’re not training the network.
  • We are using a loss function to perform a task, though.
    • Modifying target image to minimize loss.
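This inverted setup can be sketched with a minimal, hypothetical loop; a plain MSE toward a reference image stands in for the real weighted content + style loss:

```python
import torch

# The network's weights never change; only the target image's pixels
# receive gradients and get updated by the optimizer.
target = torch.rand(1, 3, 64, 64, requires_grad=True)
reference = torch.rand(1, 3, 64, 64)
optimizer = torch.optim.Adam([target], lr=0.05)

for step in range(100):
    optimizer.zero_grad()
    loss = torch.mean((target - reference) ** 2)  # stand-in for total loss
    loss.backward()   # gradients flow into the image, not the network
    optimizer.step()  # nudge the image's pixels to reduce the loss
```

Note that the optimizer is handed the image itself, not model parameters.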

Gram matrix

  • Technique for determining similarities between filters in a layer
    1. Convert the layer’s activations to a 2d tensor (depth x flattened spatial dims)
      • So a 4 x 4 x 8 conv layer would be 8 x 16
    2. Multiply that matrix by its transpose
    3. Result is nonlocalized info about image
  • Highlights nonlocalized features of an image; it doesn’t reveal anything about the image’s actual dimensions.
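The steps above can be sketched in PyTorch, with dimensions chosen to match the 4 x 4 x 8 example (batch size of 1 assumed):

```python
import torch

def gram_matrix(tensor):
    # tensor: (batch, depth, height, width) activations from one conv layer
    _, d, h, w = tensor.size()
    flat = tensor.view(d, h * w)  # step 1: e.g. 4 x 4 x 8 layer -> 8 x 16
    return flat @ flat.t()        # step 2: depth x depth filter correlations

activations = torch.randn(1, 8, 4, 4)
g = gram_matrix(activations)  # 8 x 8: spatial positions are summed away,
                              # so the result is nonlocalized
```

Because the spatial dimensions are collapsed by the multiplication, the Gram matrix only records how strongly filters co-activate, not where.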

Style / Content Constants

  • The style weight is modified by a constant $\beta$
  • The content weight is modified by a constant $\alpha$
  • Adjusting the $\alpha / \beta$ ratio will affect how much style or content comes through. The smaller the ratio, the more style will be present in the output image and vice versa.
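A sketch of how the two constants combine the losses; the weight values and placeholder loss values here are purely illustrative:

```python
import torch

alpha = 1    # content weight
beta = 1e6   # style weight; a small alpha/beta ratio lets style dominate

content_loss = torch.tensor(0.5)   # placeholder values for one iteration
style_loss = torch.tensor(2e-6)    # style loss is typically tiny, hence
                                   # the large beta

total_loss = alpha * content_loss + beta * style_loss
```

This `total_loss` is what the optimization loop minimizes by modifying the target image.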

Process for Exercise

  • We’ll take a pretrained VGG19 network
    • Freeze weights to turn it into a fixed feature extractor.
  • We’ll push style and content images through it
  • We’ll save the style image’s features at various layers to get target style features
  • We’ll save the content image’s features at a specific layer to get the target content