
Methodology Overview

• Core Components:
  • Diffusion Models: Utilized state-of-the-art diffusion models for generating videos from textual
    descriptions, capturing complex motion and textures.

  • Classifier-Free Guidance (CFG): Implemented CFG for dynamic control over the conditioning
    strength, enhancing the model’s ability to adhere to descriptive prompts while fostering creative
    interpretations.

  • SmoothLoss Function: Introduced a novel loss function to ensure temporal smoothness and
    background stability across video frames.
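
The slides do not state the SmoothLoss formula. As a rough illustration of what a temporal smoothness term can look like, the sketch below penalizes the mean squared difference between consecutive frames. The function name temporal_smoothness_penalty, the (T, H, W, C) frame layout, and the squared-difference form are all assumptions made for illustration, not the authors' definition.

```python
import numpy as np

def temporal_smoothness_penalty(video):
    """Mean squared difference between consecutive frames.

    Hypothetical stand-in for a temporal smoothness term; the actual
    SmoothLoss is not specified in the slides.
    video: array-like of shape (T, H, W, C), T = number of frames (assumed layout).
    """
    video = np.asarray(video, dtype=np.float64)
    if video.shape[0] < 2:
        # A single frame has no frame-to-frame change to penalize.
        return 0.0
    # Difference between each frame and the one before it.
    diffs = video[1:] - video[:-1]
    return float(np.mean(diffs ** 2))
```

A perfectly static clip scores 0, and larger frame-to-frame changes raise the penalty. A term of this kind would usually be added to the main denoising loss with a weighting factor, which is also an assumption here.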

• Process Flow:

Data Preparation → Model Training → Inference

Figure 2: Process Flow Diagram
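
The conditioning-strength control that classifier-free guidance provides at inference can be sketched in a few lines. The snippet assumes the common formulation in which the model predicts noise twice per step, once with the text prompt and once without, and the two predictions are blended. The names cfg_combine and guidance_scale are illustrative and do not come from the slides.

```python
import numpy as np

def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Blend unconditional and prompt-conditioned noise predictions.

    Assumed convention: guidance_scale = 0 ignores the prompt,
    1 reproduces the conditional prediction, and values above 1
    push the result further toward the prompt.
    """
    eps_uncond = np.asarray(eps_uncond, dtype=np.float64)
    eps_cond = np.asarray(eps_cond, dtype=np.float64)
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```

Raising guidance_scale trades diversity for closer adherence to the prompt, which is the "dynamic control" the Core Components list refers to.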

24/04/2024 8
Proposed Approach

[Diagram labels: Input Prompt; Input Video + Caption; Noising; Denoising; 3D-UNET Architecture; Temporal Component; Classifier-Free Guidance; Smooth Loss; Output Video]

Figure 3: Proposed Approach Architecture
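
The Noising step in Figure 3 is not detailed in the slides. A minimal sketch, assuming a standard DDPM-style forward process with a linear beta schedule, is shown below. The function names, the schedule values, and the (T, H, W, C) clip layout are assumptions for illustration only.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bar, rng):
    """Sample a noised clip x_t from a clean clip x0 at timestep t.

    x_t = sqrt(alpha_bar[t]) * x0 + sqrt(1 - alpha_bar[t]) * eps
    Returns both x_t and the noise eps, since eps is the usual
    regression target for the denoising network.
    """
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

# Usage: noise a toy 2-frame, 4x4 RGB clip at timestep 500.
rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
clip = np.zeros((2, 4, 4, 3))
noised_clip, noise = add_noise(clip, 500, alpha_bar, rng)
```

In this setup, the Denoising block would be trained to predict the returned noise from the noised clip, the timestep, and the caption.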

