AI Tools Finder
24 GB VRAM · 26 tools fit this tier

AI tools that run on 24 GB VRAM

Twenty-four gigabytes is the current "new floor" for state-of-the-art local AI. HunyuanVideo and Wan 2.2 14B at FP8 fit. Flux LoRA training is realistic. 70B LLMs at Q4 run with partial CPU offload, since 4-bit weights alone come to about 35 GB. This is the workstation tier for serious local-AI work in 2026.

Typical GPUs in this tier
RTX 3090 / 3090 Ti · RTX 4090 · RTX 5080 · Apple M-series 32–36 GB unified

Capabilities

What you can run

  • HunyuanVideo (GGUF Q4_K_M / FP8)
  • Wan 2.2 14B at FP8
  • Flux LoRA training in Kohya / OneTrainer
  • 70B LLMs at Q4 with partial CPU offload (4-bit weights alone are about 35 GB)
  • Large-batch SDXL & Flux production runs

What stays out of reach

  • Wan 2.2 14B at BF16 (needs 48 GB)
  • Full fine-tuning of 13B+ models without LoRA
  • Production LLM serving for multiple users (single-stream only)
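The fit claims above can be sanity-checked with weights-only arithmetic: parameters times bits per weight, divided by eight. The sketch below is a rough lower bound, not a measurement. It ignores activations, KV cache, text encoders, and framework overhead, and the helper name `weight_gb` is made up for illustration. Real quant formats such as Q4_K_M average somewhat more than 4 bits per weight, so actual files are larger than the 4-bit figure shown.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weights-only footprint in GB (10^9 bytes).

    Excludes activations, KV cache, and runtime overhead, so treat the
    result as a floor rather than a full VRAM requirement.
    """
    return params_billion * bits_per_weight / 8


# Model sizes and precisions taken from the lists above.
examples = [
    ("Wan 2.2 14B at FP8", 14, 8),
    ("Wan 2.2 14B at BF16", 14, 16),
    ("70B LLM at 4 bits per weight", 70, 4),
]

for label, params_billion, bits in examples:
    print(f"{label}: ~{weight_gb(params_billion, bits):.0f} GB of weights")
```

By this estimate, 14B weights at FP8 (about 14 GB) leave headroom on a 24 GB card, while 14B at BF16 (about 28 GB) and 70B at 4 bits (about 35 GB) exceed 24 GB on weights alone, which suggests a lower bit width or partial offload for the latter at this tier.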

Recommended tools for 24 GB VRAM

Sorted by best fit for this tier — tools designed around your VRAM budget first, then by our power-user score.
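One way to read that ordering is sketched below: keep tools whose minimum VRAM fits the budget, put those closest to the budget first, then break ties by score. This is an assumption about how the ranking might work, not the site's actual code. The `Tool` class, `rank_for_tier` function, and every `power_score` value are hypothetical placeholders; only the minimum-VRAM figures come from the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    min_vram_gb: int    # "Min VRAM" value from the listing
    power_score: float  # hypothetical placeholder; scores are not published here


def rank_for_tier(tools, budget_gb):
    """Return tools that fit the budget, closest-to-budget first, then by score."""
    fitting = [t for t in tools if t.min_vram_gb <= budget_gb]
    # Smaller gap to the budget sorts first; higher score wins ties.
    return sorted(fitting, key=lambda t: (budget_gb - t.min_vram_gb, -t.power_score))


TOOLS = [
    Tool("OpenAI Whisper", 2, 0.7),
    Tool("Wan 2.2", 12, 0.9),
    Tool("HunyuanVideo", 24, 0.8),
    Tool("vLLM", 24, 0.9),
    Tool("Modal", 16, 0.6),
]

ranked = [t.name for t in rank_for_tier(TOOLS, 24)]
print(ranked)
```

With these placeholder scores, the 24 GB tools come first, followed by the 16 GB, 12 GB, and 2 GB entries, which matches the descending minimum-VRAM pattern visible in the list.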

vLLM

High-throughput LLM serving for GPUs.

Open Source 24–80 GB VRAM
Min VRAM: 24 GB
GPU class: Datacenter GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

NVIDIA Cosmos

World-foundation models for physical AI.

Open Source 24–80 GB VRAM
Min VRAM: 24 GB
GPU class: Datacenter GPU
Quant: FP16
Actually Free · Watermark-Free · API

HunyuanVideo

13B open-weight cinematic text-to-video.

Open Source 24–48 GB VRAM
Min VRAM: 24 GB
GPU class: Workstation GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Magi-1

Autoregressive video diffusion at 24 GB.

Open Source 24–48 GB VRAM
Min VRAM: 24 GB
GPU class: Workstation GPU
Quant: FP8
Actually Free · No Signup · Open Source · Watermark-Free

Mochi 1

Genmo's 10B open-weight T2V, the first 'genuinely fluid' OSS video model.

Open Source 24–60 GB VRAM
Min VRAM: 24 GB
GPU class: Workstation GPU
Quant: BF16
Actually Free · No Signup · Open Source · Watermark-Free

Modal

Serverless Python for GPU workloads.

Freemium 16–80 GB VRAM
Min VRAM: 16 GB
GPU class: Datacenter GPU
Quant: not specified
Actually Free · Watermark-Free · Hobbyist-Friendly · API

AI-Toolkit (Ostris)

Modern training framework — Flux, SDXL, SD3 LoRAs in YAML.

Open Source 16–24 GB VRAM
Min VRAM: 16 GB
GPU class: High-end GPU
Quant: BF16
Actually Free · No Signup · Open Source · Watermark-Free

TRELLIS

Microsoft Research's structured 3D representation model.

Open Source 16–24 GB VRAM
Min VRAM: 16 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Pyramid Flow

Memory-efficient T2V via pyramidal flow matching.

Open Source 16–24 GB VRAM
Min VRAM: 16 GB
GPU class: High-end GPU
Quant: BF16
Actually Free · No Signup · Open Source · Watermark-Free

Wan 2.2

Open-weight video diffusion from Alibaba.

Open Source 12–48 GB VRAM
Min VRAM: 12 GB
GPU class: Workstation GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Kohya_ss

The standard SDXL/Flux LoRA training UI.

Open Source 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

3D Gaussian Splatting

The INRIA original — train your own splats.

Open Source 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: Mid GPU
Quant: FP32
Actually Free · No Signup · Watermark-Free

Hunyuan3D-2

Tencent's open 3D generator — multi-view, PBR, ready-to-use meshes.

Open Source 12–16 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · Open Source · Watermark-Free · API

LTX-Video

Real-time-ish open video diffusion from Lightricks.

Open Source 12–16 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

OneTrainer

Modern alternative trainer for SD/SDXL/Flux.

Open Source 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Stable Diffusion 3.5 Large

Stability's MMDiT flagship at 8B params.

Open Source 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: Mid GPU
Quant: FP16
Actually Free · Watermark-Free · Hobbyist-Friendly · API

SUPIR

Diffusion-based photorealistic upscaler.

Open Source 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: Mid GPU
Quant: FP16
Actually Free · No Signup · Watermark-Free · Plugin

CogVideoX 5B

Open-source text-to-video diffusion from THUDM.

Open Source 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

FluxGym

Dead-simple Flux LoRA training in a Gradio UI.

Open Source 12–20 GB VRAM
Min VRAM: 12 GB
GPU class: Mid GPU
Quant: FP8
Actually Free · No Signup · Open Source · Watermark-Free

Stable Video Diffusion

Image-to-video diffusion: 14 frames (SVD) or 25 frames (SVD-XT) per clip.

Open Source 12–16 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · Open Source · Watermark-Free · Hobbyist-Friendly

ComfyUI-AnimateDiff-Evolved

Animation motion modules for ComfyUI.

Open Source 10–16 GB VRAM
Min VRAM: 10 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Transformers

The library every LLM ships against first.

Open Source 2–24 GB VRAM
Min VRAM: 2 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

OpenAI Whisper

The reference open-source speech-to-text model.

Open Source 2–10 GB VRAM
Min VRAM: 2 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

faster-whisper

Whisper reimplemented on the CTranslate2 backend: up to 4× faster at the same accuracy.

Open Source 2–6 GB VRAM
Min VRAM: 2 GB
GPU class: Entry GPU
Quant: INT8
Actually Free · No Signup · Open Source · Watermark-Free

FaceFusion

The most active open face-swap toolkit.

Open Source 2–8 GB VRAM
Min VRAM: 2 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Real-ESRGAN

The default OSS upscaler, still.

Open Source 2–6 GB VRAM
Min VRAM: 2 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free