Abstract

We introduce a two-stream model for dynamic texture synthesis. Our model is based on pre-trained convolutional networks (ConvNets) that target two independent tasks: (i) object recognition, and (ii) optical flow prediction. Given an input dynamic texture, statistics of filter responses from the object recognition ConvNet encapsulate the per-frame appearance of the input texture, while statistics of filter responses from the optical flow ConvNet model its dynamics. To generate a novel texture, a randomly initialized input sequence is optimized to match the feature statistics from each stream of an example texture. Inspired by recent work on image style transfer and enabled by the two-stream model, we also apply the synthesis approach to combine the texture appearance from one texture with the dynamics of another to generate entirely novel dynamic textures. We show that our approach generates novel, high-quality samples that match both the frame-wise appearance and temporal evolution of the input texture. Finally, we quantitatively evaluate our texture synthesis approach with a thorough user study.
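
For concreteness, the synthesis-by-optimization procedure can be sketched in a few lines of PyTorch. This is an illustrative sketch, not our implementation: TinyFeatures is a random-weight stand-in for the two pre-trained ConvNets, and the time-averaged Gram targets are a simplification.

import torch

def gram(feats):
    # (B, C, H, W) features -> (B, C, C) Gram matrices,
    # normalized by the number of spatial locations.
    b, c, h, w = feats.shape
    f = feats.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (h * w)

class TinyFeatures(torch.nn.Module):
    # Stand-in multi-layer feature extractor (random weights). The actual
    # model uses a pre-trained object recognition ConvNet (appearance) and
    # a pre-trained optical flow ConvNet (dynamics).
    def __init__(self, in_ch):
        super().__init__()
        self.c1 = torch.nn.Conv2d(in_ch, 16, 3, padding=1)
        self.c2 = torch.nn.Conv2d(16, 32, 3, stride=2, padding=1)

    def forward(self, x):
        h1 = torch.relu(self.c1(x))
        return [h1, torch.relu(self.c2(h1))]

appearance_net = TinyFeatures(3)  # operates on single frames
flow_net = TinyFeatures(6)        # operates on concatenated frame pairs

exemplar = torch.rand(12, 3, 128, 128)  # target dynamic texture (T, 3, H, W)

# Target statistics: Gram matrices of the exemplar, averaged over time.
with torch.no_grad():
    app_targets = [gram(f).mean(0, keepdim=True) for f in appearance_net(exemplar)]
    pairs = torch.cat([exemplar[:-1], exemplar[1:]], dim=1)
    dyn_targets = [gram(f).mean(0, keepdim=True) for f in flow_net(pairs)]

def two_stream_loss(frames):
    loss = frames.new_zeros(())
    # Appearance stream: match per-frame Gram statistics.
    for f, g in zip(appearance_net(frames), app_targets):
        loss = loss + ((gram(f) - g) ** 2).mean()
    # Dynamics stream: match Gram statistics over consecutive frame pairs.
    p = torch.cat([frames[:-1], frames[1:]], dim=1)
    for f, g in zip(flow_net(p), dyn_targets):
        loss = loss + ((gram(f) - g) ** 2).mean()
    return loss

# Synthesis: optimize a randomly initialized sequence to match both streams.
frames = torch.randn_like(exemplar).requires_grad_(True)
opt = torch.optim.LBFGS([frames], max_iter=200)

def closure():
    opt.zero_grad()
    loss = two_stream_loss(frames)
    loss.backward()
    return loss

opt.step(closure)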

Dynamic Texture Synthesis

We applied our dynamic texture synthesis process to a wide range of textures selected from the DynTex database, as well as others collected in the wild. Here we provide synthesized results for nearly 60 different textures that encapsulate a range of phenomena, such as flowing water, waves, clouds, fire, rippling flags, waving plants, and schools of fish. Unless otherwise stated, both target and synthesized dynamic textures consist of 12 frames.

The top row consists of the target dynamic textures while the bottom row consists of the synthesized dynamic textures.

Inputs that satisfy the underlying assumption of a dynamic texture, i.e., whose appearance and/or dynamics are more-or-less spatially and temporally homogeneous, allow for better synthesized results. Inputs that violate this assumption may yield perceptually implausible synthesized dynamic textures.


ants, bamboo, birds, boiling_water_1, boiling_water_2, calm_water, calm_water_2, calm_water_3, calm_water_4, calm_water_5, calm_water_6, candle_flame, candy_1, candy_2, coral, cranberries, escalator, fireplace_1, fireplace_2, fish, flag, flag_2, flames, flushing_water, fountain_1, fountain_2, fur, grass_1, grass_2, grass_3, ink, lava, plants, sea_1, sea_2, sea_3, sea_4, shiny_circles, shower_water_1, sky_clouds_1, sky_clouds_2, smoke_1, smoke_2, smoke_3, smoke_plume_1, snake_1, snake_2, snake_3, snake_4, snake_5, tv_static, underwater_vegetation_1, water_1, water_2, water_3, water_4, water_5, waterfall, waterfall_2

Synthesis without the Dynamics Stream

To verify that generating multiple frames does not by itself induce dynamics consistent with the input, we synthesized frames starting from random noise using only the appearance statistics and the corresponding loss. As expected, this produced frames that were each valid textures, but with no coherent dynamics between them.
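
In terms of the sketch above, this ablation amounts to dropping the dynamics term (again illustrative, reusing gram, appearance_net, and app_targets from that sketch):

def appearance_only_loss(frames):
    # Per-frame appearance matching only. Nothing relates frames in time,
    # so each frame converges to an independent static texture sample.
    loss = frames.new_zeros(())
    for f, g in zip(appearance_net(frames), app_targets):
        loss = loss + ((gram(f) - g) ** 2).mean()
    return loss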

The top row consists of the target dynamic textures while the bottom row consists of the synthesized textures.


fish

Incremental Synthesis

Dynamic textures generated by incremental synthesis. Long sequences can be generated incrementally by splitting the sequence into subsequences and optimizing them sequentially. This is realized by initializing the first frame of a subsequence with the last frame of the previous subsequence and keeping it fixed throughout the optimization. The remaining frames of the subsequence are initialized randomly and optimized as usual.
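
A minimal sketch of this loop, where optimize(free, fixed_first) is a hypothetical helper that runs the usual two-stream optimization over the concatenation of the fixed seam frame and the free frames (only the free frames receive gradients):

import torch

def incremental_synthesis(optimize, n_subseq, sub_len, shape=(3, 128, 128)):
    # First subsequence: all frames are free.
    video = torch.randn(sub_len, *shape, requires_grad=True)
    optimize(video, None)
    video = video.detach()
    for _ in range(n_subseq - 1):
        seam = video[-1:]  # last frame of the previous subsequence, kept fixed
        free = torch.randn(sub_len - 1, *shape, requires_grad=True)
        optimize(free, seam)
        video = torch.cat([video, free.detach()])
    # Total length: sub_len + (n_subseq - 1) * (sub_len - 1) frames.
    return video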

The top row consists of the target dynamic textures while the bottom row consists of the synthesized dynamic textures.


calm_water_4, calm_water_5, fish, flag, sky_clouds_1, smoke_1, water_5

Temporally-endless Textures

An interesting extension that we briefly explored is textures with no discernible temporal seam between the last and first frames. Played in a loop, these textures appear temporally endless. This is trivially achieved by adding an additional loss to the dynamics stream that ties the last frame to the first.
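
Concretely, the extra term treats the (last, first) pair as one more consecutive frame pair in the dynamics stream (a sketch reusing gram, flow_net, and dyn_targets from the first code block):

import torch

def loop_closure_loss(frames):
    # Dynamics term on the wrap-around (last, first) pair so that the flow
    # statistics also match across the loop point.
    wrap = torch.cat([frames[-1:], frames[:1]], dim=1)
    loss = frames.new_zeros(())
    for f, g in zip(flow_net(wrap), dyn_targets):
        loss = loss + ((gram(f) - g) ** 2).mean()
    return loss

# Total objective: two_stream_loss(frames) + loop_closure_loss(frames)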

The top row consists of the target dynamic texture while the bottom row consists of the synthesized "infinite" dynamic texture.


smoke_plume_1

Dynamics Style Transfer

The underlying assumption of our model is that the appearance and dynamics of a dynamic texture can be factorized. As such, it should allow for the transfer of the dynamics of one texture onto the appearance of another. This has been explored previously for artistic style transfer with static imagery. We accomplish this with our model by performing the same optimization as usual, but with the target Gram matrices for appearance and dynamics computed from different textures. In effect, the static texture is animated and hence brought to life. To the best of our knowledge, we are the first to demonstrate this form of style transfer.
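
In code, the only change from the first sketch is where the target statistics come from (illustrative; appearance_tex and dynamics_tex stand in for two different inputs):

import torch

appearance_tex = torch.rand(1, 3, 128, 128)  # e.g., a static painting or photo
dynamics_tex = torch.rand(12, 3, 128, 128)   # video supplying the motion

with torch.no_grad():
    # Appearance targets from one input, dynamics targets from the other.
    app_targets = [gram(f).mean(0, keepdim=True)
                   for f in appearance_net(appearance_tex)]
    pairs = torch.cat([dynamics_tex[:-1], dynamics_tex[1:]], dim=1)
    dyn_targets = [gram(f).mean(0, keepdim=True) for f in flow_net(pairs)]

# two_stream_loss and the optimization loop are unchanged.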

The top row consists of the appearance target, the second row the dynamics target, and the third row the synthesized result.


fireplace_1_to_fire_paint, flag_2_to_flag_cropped_1, flag_2_to_flag_cropped_2, water_4_to_water_img

Portions of images can be animated.


waterfall_to_waterfall_paint, water_4_to_water_paint

Citation

Mattie Tesfaldet, Marcus A. Brubaker, and Konstantinos G. Derpanis. Two-stream convolutional networks for dynamic texture synthesis. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

BibTeX format:

@inproceedings{tesfaldet2018,
  author = {Mattie Tesfaldet and Marcus A. Brubaker and Konstantinos G. Derpanis},
  title = {Two-Stream Convolutional Networks for Dynamic Texture Synthesis},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2018}
}