AI Tools Finder

CogVideoX 5B

Open-source text-to-video diffusion from THUDM.

Open Source · 12–24 GB VRAM · Runs locally
Actually Free · No Signup · Open Source · Watermark-Free · Hobbyist-Friendly · Plugin

Requires: ComfyUI

Updated 2026-04-19

Hardware requirements

Runs locally · High-end GPU (16–24 GB)

12–24 GB VRAM

Min VRAM: 12 GB
Rec. VRAM: 24 GB
Min RAM: 32 GB
Rec. RAM: 64 GB
Disk: 40 GB
GPU class: High-end GPU

CUDA 12.x · No Apple Silicon · GPU Required · Quant: FP16, FP8

The 2B variant runs on 12 GB of VRAM. The 5B variant fits comfortably in 24 GB.
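As a rough illustration of the guidance above, the sketch below picks a variant from the amount of VRAM available. The function name `pick_variant` and the lookup table are hypothetical, and the sketch treats the 24 GB figure as a hard cutoff for the 5B variant, which is a conservative assumption: quantized or offloaded setups may fit in less.

```python
from typing import Optional

# VRAM figures in GB, taken from this listing:
# the 2B variant at 12 GB, the 5B variant at 24 GB.
# Treating them as hard cutoffs is an assumption made for illustration.
VARIANT_VRAM_GB = {"5B": 24, "2B": 12}


def pick_variant(vram_gb: float) -> Optional[str]:
    """Return the largest variant whose listed VRAM figure fits, or None."""
    # Check the most demanding variant first.
    for name, needed in sorted(VARIANT_VRAM_GB.items(), key=lambda kv: -kv[1]):
        if vram_gb >= needed:
            return name
    return None
```

With these thresholds, a 16 GB card would land on the 2B variant, and anything under 12 GB would get no recommendation.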


What is CogVideoX 5B?

CogVideoX 5B is a 5-billion-parameter text-to-video (T2V) model with strong prompt following and a relatively modest VRAM profile. The 5B variant runs on 24 GB; the 2B variant fits in 12 GB.
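To put the parameter counts in context, the arithmetic below estimates how much memory the weights alone occupy at the two precisions this listing names (FP16 and FP8). It is a back-of-the-envelope sketch, and `weight_footprint_gb` is a hypothetical helper. It counts only the weights of a model with the stated parameter count. Other components, intermediate activations, and framework overhead come on top, so the result is a lower bound and not a prediction of the VRAM figures listed here.

```python
# Bytes per parameter: FP16 stores each weight in 2 bytes, FP8 in 1 byte.
BYTES_PER_PARAM = {"FP16": 2, "FP8": 1}


def weight_footprint_gb(params_billions: float, precision: str) -> float:
    """Approximate size of the weights alone, in decimal GB.

    One billion parameters at one byte each is one decimal GB,
    so the result is simply billions of parameters times bytes per parameter.
    """
    return float(params_billions * BYTES_PER_PARAM[precision])


if __name__ == "__main__":
    for params in (2, 5):
        for precision in ("FP16", "FP8"):
            size = weight_footprint_gb(params, precision)
            print(f"{params}B at {precision}: ~{size:.0f} GB of weights")
```

By this estimate the 5B weights take about 10 GB at FP16 and about 5 GB at FP8, which leaves the remainder of a 24 GB card for everything else the pipeline needs.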

Pros & cons

Pros

  • Lower VRAM requirement than HunyuanVideo or Wan 2.2
  • Image-to-video (I2V) variants available
  • Actively supported in ComfyUI nodes

Cons

  • Shorter clips by default
  • Trails Wan 2.2 in motion realism

What's actually free?

Open weights. Free to run locally.

✓ Actually Free · No Signup · Open Source · Watermark-Free

Alternatives

Wan 2.2

Open-weight video diffusion from Alibaba.

Open Source · 12–48 GB VRAM
Min VRAM: 12 GB
GPU class: Workstation GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

HunyuanVideo

13B open-weight cinematic text-to-video.

Open Source · 24–48 GB VRAM
Min VRAM: 24 GB
GPU class: Workstation GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

LTX-Video

Real-time-ish open video diffusion from Lightricks.

Open Source · 12–16 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free