HunyuanVideo
13B open-weight cinematic text-to-video.
Open Source · 24–48 GB VRAM
- Min VRAM: 24 GB
- GPU class: Workstation GPU
- Quant: FP16
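As a rough sanity check on the VRAM figures, the memory needed just to hold the weights can be estimated from the parameter count and the bytes per parameter. The sketch below uses the 13B and FP16 values from the listing. The FP32 and INT8 rows are illustrative comparisons rather than figures from the listing, and `weight_memory_gib` is a hypothetical helper name. The estimate covers weights only, so it says nothing about activations, text encoders, the VAE, or offloading.

```python
def weight_memory_gib(params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in GiB (2**30 bytes)."""
    return params * bytes_per_param / 2**30


PARAMS = 13e9  # "13B" from the listing

# Bytes per parameter for a few common precisions. FP16 is the listed one;
# the others are assumptions added for comparison only.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

for name, nbytes in BYTES_PER_PARAM.items():
    print(f"{name}: {weight_memory_gib(PARAMS, nbytes):.1f} GiB")
```

At FP16 the weights alone come to about 24.2 GiB, so actual usage on a given card depends on how the runtime handles everything else.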
Actually Free · No Signup · Open Source · Watermark-Free
Add motion to any SD checkpoint via a motion module.
Requires: ComfyUI, AUTOMATIC1111 (stable-diffusion-webui)
Runs locally · Mid GPU (12 GB)
SDXL variant pushes to 16+ GB VRAM for longer sequences.
AnimateDiff was the first widely adopted technique for turning frozen image-diffusion checkpoints into video models. It plugs a learned motion module into your SD 1.5 or SDXL checkpoint and animates the latent, producing 16–32-frame clips that preserve the source model's style. It is the grandparent of every Comfy video workflow.
Apache 2.0.
13B open-weight cinematic text-to-video.
Open-weight video diffusion from Alibaba.
Open-source text-to-video diffusion from THUDM.