Denoise (DDIM step)
- scheduler.config.prediction_type == "v_prediction" (bake_texture, modified pipeline)
[Consistency] left: modified, right: ctrl-adapter
[Failed] scheduler.config.prediction_type == "epsilon"
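The two prediction types differ only in how the model output is converted to a clean-sample estimate before the DDIM update. A minimal NumPy sketch of that conversion plus the deterministic step (the scheduler interface and `alphas_cumprod` values are stand-ins, not the actual pipeline code):

```python
import numpy as np

def ddim_step(x_t, model_out, alpha_t, alpha_prev, prediction_type):
    """One deterministic DDIM step (eta = 0).

    alpha_t / alpha_prev are cumulative alpha-bar values at the current
    and previous timesteps (i.e. scheduler.alphas_cumprod[t]).
    """
    if prediction_type == "v_prediction":
        # v = sqrt(a)*eps - sqrt(1-a)*x0  =>  recover x0_hat and eps_hat
        pred_x0 = np.sqrt(alpha_t) * x_t - np.sqrt(1 - alpha_t) * model_out
        pred_eps = np.sqrt(alpha_t) * model_out + np.sqrt(1 - alpha_t) * x_t
    elif prediction_type == "epsilon":
        pred_eps = model_out
        pred_x0 = (x_t - np.sqrt(1 - alpha_t) * pred_eps) / np.sqrt(alpha_t)
    else:
        raise ValueError(prediction_type)
    # x_{t-1} = sqrt(a_prev) * x0_hat + sqrt(1 - a_prev) * eps_hat
    return np.sqrt(alpha_prev) * pred_x0 + np.sqrt(1 - alpha_prev) * pred_eps
```

With `alpha_prev = 1.0` the step returns the clean-sample estimate directly, which is a quick sanity check that both branches invert the forward process consistently.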
Denoise Code
Denoise step (rewrite)
First frame guidance
White background
(SyncMVD instead uses a random background, so the background pixels act as shuffled noise in the latent input.)
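Both choices reduce to the same alpha composite of the rendered foreground over a solid or random background; only the background fill differs. A sketch with illustrative names (not the pipeline's actual helpers):

```python
import numpy as np

def composite_background(fg, mask, background="white", rng=None):
    """Composite a rendered foreground over a background.

    fg:   (H, W, 3) float image in [0, 1]
    mask: (H, W) foreground mask in [0, 1]
    background: "white" (ours) or "random" (SyncMVD-style noise pixels)
    """
    if background == "white":
        bg = np.ones_like(fg)
    elif background == "random":
        rng = rng or np.random.default_rng()
        bg = rng.random(fg.shape)  # random pixels behave like noise in latents
    else:
        raise ValueError(background)
    m = mask[..., None]
    return m * fg + (1 - m) * bg
```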
[Debug] Video results w/o bake_texture (single view 32 frames)
(a) modified pipeline, (b) original pipeline, (c) generated background, (d) composited first frame
[Debug] Video results w/o bake_texture (8 views 8 frames, modified pipeline, target_fps=4, guidance_scale=9.0)
More Results
- First row: Ctrl-Adapter.
- Second row: Ctrl-Adapter w/ noise initialized by UV projection.
Code
Path:
./VideoMVD/i2vgen_xl/pipelines/i2vgen_xl_controlnet_adapter_pipeline_latent.py
Debug:
inference_latent.py
Content:
noise init w/ uv projection(self.uvp.prepare_latents): outputs/2024-07-16_05-57-40→2024-07-16_23-35-18
MeshRasterizer.raster_settings.RasterizationSettings.faces_per_pixel: outputs/2024-07-17_09-07-33
self.uvp.load_anim(): DELETED
success_test_case: outputs/2024-07-17_09-25-33→2024-07-17_10-50-18: monster*2,mech_man*2
Integration into our pipeline (multiview video): data/monster/MVD_15Jul2024-175650→
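The UV-projection noise init (`self.uvp.prepare_latents`) samples one noise texture in UV space and looks it up per view through each view's UV coordinates, so pixels that see the same surface point start from the same latent noise across views. A minimal sketch; the UV maps, texture size, and function name are assumptions for illustration:

```python
import numpy as np

def uv_projected_noise(uv_maps, tex_size=64, channels=4, seed=0):
    """Initialize per-view latent noise from a shared UV-space noise texture.

    uv_maps: list of (H, W, 2) arrays of per-view UV coords in [0, 1).
    Returns one (H, W, channels) noise image per view; pixels mapping to
    the same texel share identical noise across views.
    """
    rng = np.random.default_rng(seed)
    tex = rng.standard_normal((tex_size, tex_size, channels))  # shared texture
    views = []
    for uv in uv_maps:
        # nearest-neighbour lookup into the shared noise texture
        iu = (uv[..., 0] * tex_size).astype(int) % tex_size
        iv = (uv[..., 1] * tex_size).astype(int) % tex_size
        views.append(tex[iv, iu])
    return views
```

This is what makes corresponding pixels in different views denoise consistently: they begin from the same noise sample rather than independent draws.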
Results
- Generate the first frame (left) with SDXL.
- Generate the short video (right) from the depth-map sequence (middle) with I2VGen-XL.