Extending Text2Video-Zero for Multi-ControlNet

Date

2023

Authors

Backen, Ben

Publisher

University of Oregon

Abstract

This research paper presents an extension to the Text2Video-Zero (T2V0) generative model, augmenting the synthesis of video from textual and video inputs. The project focuses on enhancing the functionality and accessibility of T2V0 by integrating Stable Diffusion’s (SD) support for multiple ControlNets, implementing frame-wise masking for selective ControlNet application, and introducing memory optimizations to enable running the model on consumer-grade hardware. The paper also provides a high-level overview of SD, explores experimental features, and offers practical tips for generating videos using these tools. Additionally, we include a demonstration video showcasing T2V0 with Multi-ControlNet. The video highlights the early potential of text-to-video models for storytelling. Ultimately, the study strives to expand the capabilities and accessibility of T2V0, increasing users' control over their generated outputs while upholding the democratic principles of open-source AI.
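To make the Multi-ControlNet idea concrete, below is a minimal sketch of how multiple ControlNets can be attached to a Stable Diffusion pipeline using the Hugging Face diffusers library, together with two common memory optimizations for consumer-grade GPUs. This is an illustration of the underlying mechanism only, not the paper's T2V0 integration or frame-wise masking; the model IDs, conditioning scales, and placeholder images are assumptions chosen for the example.

```python
# Sketch: Multi-ControlNet with Stable Diffusion in diffusers.
# Model IDs and parameter values are illustrative, not taken from the paper.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Two ControlNets, e.g. edge guidance and pose guidance.
canny = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pose = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)

# Passing a list of ControlNets enables Multi-ControlNet conditioning.
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=[canny, pose],
    torch_dtype=torch.float16,
)

# Memory optimizations helpful on consumer-grade hardware.
pipe.enable_model_cpu_offload()   # keep idle submodules on the CPU
pipe.enable_attention_slicing()   # compute attention in smaller chunks

# Placeholder conditioning images; in practice these would be a real
# Canny edge map and pose map extracted per frame.
canny_image = Image.new("RGB", (512, 512))
pose_image = Image.new("RGB", (512, 512))

frame = pipe(
    "an astronaut riding a horse, cinematic lighting",
    image=[canny_image, pose_image],
    controlnet_conditioning_scale=[1.0, 0.8],  # per-ControlNet influence
    num_inference_steps=25,
).images[0]
```

In this setup each ControlNet receives its own conditioning image and weight, which is the mechanism that selective, frame-wise application (as described in the abstract) builds on.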

Description

15 pages

Keywords

text-to-video, Stable Diffusion, ControlNet, machine learning, generative models

Citation