Experiment Results (Ctrl-Adapter)

 

Denoise (DDIM step)

  • scheduler.config.prediction_type == "v_prediction" (bake_texture, modified pipeline)
    • [Consistency] left: modified, right: ctrl-adapter
 
[Failed] scheduler.config.prediction_type == "epsilon"
 
Denoise Code
Denoise step (rewrite)
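The rewritten step itself is only recorded as screenshots here. As a rough sketch of what a deterministic DDIM step computes under each prediction_type, assuming the standard formulation with eta = 0 and using NumPy arrays in place of the pipeline's tensors (ddim_step and its argument names are illustrative, not the code from the modified pipeline):

```python
import numpy as np


def ddim_step(model_output, sample, alpha_prod_t, alpha_prod_t_prev, prediction_type):
    """One deterministic DDIM update (eta = 0) from timestep t to the previous one.

    alpha_prod_t / alpha_prod_t_prev are the cumulative alpha products at the
    current and previous timesteps. Returns (prev_sample, pred_x0).
    """
    beta_prod_t = 1.0 - alpha_prod_t

    if prediction_type == "epsilon":
        # The model output is the noise; recover x0 from x_t and the noise.
        pred_eps = model_output
        pred_x0 = (sample - np.sqrt(beta_prod_t) * pred_eps) / np.sqrt(alpha_prod_t)
    elif prediction_type == "v_prediction":
        # The model output is v = sqrt(alpha) * eps - sqrt(1 - alpha) * x0.
        pred_x0 = np.sqrt(alpha_prod_t) * sample - np.sqrt(beta_prod_t) * model_output
        pred_eps = np.sqrt(alpha_prod_t) * model_output + np.sqrt(beta_prod_t) * sample
    else:
        raise ValueError(f"unsupported prediction_type: {prediction_type}")

    # Re-noise the x0 estimate to the previous timestep along the predicted direction.
    prev_sample = (
        np.sqrt(alpha_prod_t_prev) * pred_x0
        + np.sqrt(1.0 - alpha_prod_t_prev) * pred_eps
    )
    return prev_sample, pred_x0
```

Both branches arrive at the same previous sample when the model output really is of the declared type. Passing a v-prediction output through the "epsilon" branch does not, so the scheduler setting has to match how the model was trained.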
 
 
 

First frame guidance

White background
(SyncMVD uses a random background to shuffle pixels in the latent input.)
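One possible reading of the random-background idea, as a minimal sketch: composite the rendered foreground over a randomly drawn solid color instead of white. This is an assumption about the intent, not SyncMVD's actual code; composite_random_background and its arguments are hypothetical names.

```python
import numpy as np


def composite_random_background(foreground, mask, rng):
    """Blend a foreground image over a random solid-color background.

    foreground: (H, W, C) float array with values in [0, 1]
    mask:       (H, W) array, 1 where the object is visible, 0 elsewhere
    rng:        numpy.random.Generator used to draw the background color
    """
    # Assumption: one random color per call, broadcast over every background pixel.
    color = rng.uniform(0.0, 1.0, size=(1, 1, foreground.shape[-1]))
    m = mask[..., None].astype(foreground.dtype)
    return m * foreground + (1.0 - m) * color
```

Drawing a fresh color for each call keeps the background from being a constant the model can lock onto, which is the contrast with the fixed white background used above.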
[Debug] Video results w/o bake_texture (single view, 32 frames)
(a) modified pipeline, (b) original pipeline, (c) generated background, (d) composited first frame
[Debug] Video results w/o bake_texture (8 views, 8 frames, modified pipeline, target_fps=4, guidance_scale=9.0)
 
 
 

More Results

  1. First line: Ctrl-Adapter.
  2. Second line: Ctrl-Adapter w/ noise initialized by UV projection.
Code
Path: ./VideoMVD/i2vgen_xl/pipelines/i2vgen_xl_controlnet_adapter_pipeline_latent.py
Debug: inference_latent.py
Content:
noise init w/ UV projection (self.uvp.prepare_latents): outputs/2024-07-16_05-57-40→2024-07-16_23-35-18
MeshRasterizer.raster_settings.RasterizationSettings.faces_per_pixel: outputs/2024-07-17_09-07-33
self.uvp.load_anim(): DELETED
success_test_case: outputs/2024-07-17_09-25-33→2024-07-17_10-50-18: monster*2, mech_man*2
Integration into our pipeline (multiview video): data/monster/MVD_15Jul2024-175650→
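The internals of self.uvp.prepare_latents are not shown in these notes. As a hedged sketch of the general idea behind initializing noise by UV projection: sample one Gaussian noise texture in UV space, then look it up per view through that view's UV map, so pixels that see the same surface point start from the same noise. The function name, array layouts, and the nearest-texel lookup are all assumptions made for illustration.

```python
import numpy as np


def noise_from_uv(noise_tex, uv, mask, rng):
    """Project a shared UV-space noise texture into one view.

    noise_tex: (C, T, T) Gaussian noise defined on the texture
    uv:        (H, W, 2) per-pixel UV coordinates in [0, 1]
    mask:      (H, W) bool, True where the mesh is visible
    rng:       numpy.random.Generator for the background noise
    """
    t = noise_tex.shape[-1]
    # Nearest-texel lookup: each pixel copies one texel, so the values stay
    # unit-variance Gaussian (interpolating between texels would shrink the variance).
    cols = np.clip(np.rint(uv[..., 0] * (t - 1)).astype(int), 0, t - 1)
    rows = np.clip(np.rint(uv[..., 1] * (t - 1)).astype(int), 0, t - 1)
    projected = noise_tex[:, rows, cols]  # (C, H, W)

    # Pixels that miss the mesh get independent noise.
    background = rng.standard_normal(projected.shape)
    return np.where(mask[None], projected, background)
```

Calling this once per view with the same noise_tex gives each view its own latent noise while keeping the noise on the object surface shared across views.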
prompt: a monster working in volcano.
prompt: a colossal, stone golem with vines and moss growing over its surface walking in a mysterious jungle.
prompt: a stormtrooper walking on the beach, Van-Gogh style.
prompt: a stormtrooper walking on moon.
 

Results

  1. Generate the first frame with SDXL (left).
  2. Generate the short video (right) from the depth map sequence (middle) with I2VGen-XL.
 
prompt: a horned demon with clawed hands walking in the hell
prompt: a colossal, stone golem with vines and moss growing over its surface walking in a mysterious jungle.
prompt: a cybernetic giant with mechanical limbs, covered in rusted plates walking in an abandoned city.
prompt: a stormtrooper walking in the milky way.
prompt: a stormtrooper walking on the beach, Van-Gogh style.
prompt: a stormtrooper walking on moon.