AI Tools Finder
48 GB VRAM · 37 tools fit this tier

AI tools that run on 48 GB VRAM

Forty-eight gigabytes is workstation territory. At this tier you can run the biggest open-weight video models at full precision, serve quantized 70B LLMs through vLLM, and move from LoRA training to full fine-tuning. The RTX A6000 and L40S are the typical homes for this tier.

Typical GPUs in this tier
NVIDIA RTX A6000 · NVIDIA L40 / L40S

Capabilities

What you can run

  • Wan 2.2 14B at BF16
  • HunyuanVideo at full precision
  • 70B LLMs in 4-bit quantization (AWQ or GPTQ) with vLLM
  • Flux full fine-tuning (with gradient checkpointing)
  • Multi-model ComfyUI pipelines without offload
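
A quick way to sanity-check the list above is to multiply parameter count by bytes per parameter. The sketch below is a rough estimate under stated assumptions, not a measurement: it counts weights only (no activations, KV cache, or framework overhead) and uses decimal gigabytes. The `weight_gb` and `fits_weights` helpers and the 80% headroom fraction are illustrative choices, not part of any tool listed here.

```python
# Bytes per parameter for common weight formats.
BYTES_PER_PARAM = {
    "FP32": 4.0,
    "BF16": 2.0,
    "FP16": 2.0,
    "FP8": 1.0,
    "INT8": 1.0,
    "INT4": 0.5,
}


def weight_gb(params_billion: float, fmt: str) -> float:
    """Approximate weight footprint in decimal GB (1 GB = 1e9 bytes).

    One billion parameters at one byte each is one GB, so the result is
    simply billions of parameters times bytes per parameter.
    """
    return params_billion * BYTES_PER_PARAM[fmt]


def fits_weights(params_billion: float, fmt: str,
                 vram_gb: float = 48.0, headroom: float = 0.8) -> bool:
    """True if the weights alone fit in `headroom` of the card's VRAM.

    The 0.8 default is an assumed safety margin that leaves room for
    activations and caches; tune it for your own workload.
    """
    return weight_gb(params_billion, fmt) <= vram_gb * headroom


if __name__ == "__main__":
    # Parameter counts are the ones named on this page.
    for label, size, fmt in [
        ("14B video model", 14, "BF16"),
        ("13B video model", 13, "BF16"),
        ("70B LLM", 70, "FP8"),
        ("70B LLM", 70, "INT4"),
    ]:
        print(f"{label} at {fmt}: {weight_gb(size, fmt):.0f} GB, "
              f"fits 48 GB tier: {fits_weights(size, fmt)}")
```

By this estimate a 14B model at BF16 needs about 28 GB for weights and a 13B model about 26 GB. A 70B model needs about 70 GB at FP8 and about 35 GB at 4-bit, so on a single 48 GB card it is realistic only with 4-bit weights.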

What stays out of reach

  • Datacenter-only workloads (H100/H200 quirks)
  • Truly massive 405B+ LLMs without sharding

Recommended tools for 48 GB VRAM

Sorted by best fit for this tier — tools designed around your VRAM budget first, then by our power-user score.
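
The ordering of the cards below appears to follow descending Min VRAM. Here is a minimal sketch of that filter-and-sort, assuming a tool "fits" a tier when its Min VRAM does not exceed the tier. The `Tool` type and `tools_for_tier` function are hypothetical names. The power-user score is not shown on the page, so ties simply keep their input order.

```python
from typing import List, NamedTuple


class Tool(NamedTuple):
    name: str
    min_vram_gb: int
    max_vram_gb: int


# A handful of entries copied from the cards on this page.
TOOLS = [
    Tool("Real-ESRGAN", 2, 6),
    Tool("Wan 2.2", 12, 48),
    Tool("vLLM", 24, 80),
    Tool("Diffusers", 4, 12),
    Tool("Modal", 16, 80),
]


def tools_for_tier(tools: List[Tool], tier_gb: int) -> List[Tool]:
    """Keep tools whose Min VRAM fits the tier, most demanding first."""
    runnable = [t for t in tools if t.min_vram_gb <= tier_gb]
    # sorted() is stable, so tools with equal Min VRAM keep input order.
    return sorted(runnable, key=lambda t: t.min_vram_gb, reverse=True)


if __name__ == "__main__":
    for tool in tools_for_tier(TOOLS, 48):
        print(f"{tool.name}: {tool.min_vram_gb}-{tool.max_vram_gb} GB")
```

Calling `tools_for_tier(TOOLS, 12)` instead drops vLLM and Modal, whose minimums exceed a 12 GB card.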

vLLM

High-throughput LLM serving for GPUs.

Open Source · 24–80 GB VRAM
Min VRAM: 24 GB
GPU class: Datacenter GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

NVIDIA Cosmos

World-foundation models for physical AI.

Open Source · 24–80 GB VRAM
Min VRAM: 24 GB
GPU class: Datacenter GPU
Quant: FP16
Actually Free · Watermark-Free · API

HunyuanVideo

13B open-weight cinematic text-to-video.

Open Source · 24–48 GB VRAM
Min VRAM: 24 GB
GPU class: Workstation GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Magi-1

Autoregressive video diffusion at 24 GB.

Open Source · 24–48 GB VRAM
Min VRAM: 24 GB
GPU class: Workstation GPU
Quant: FP8
Actually Free · No Signup · Open Source · Watermark-Free

Mochi 1

Genmo's 10B open-weight T2V model, billed as the first 'genuinely fluid' OSS video model.

Open Source · 24–60 GB VRAM
Min VRAM: 24 GB
GPU class: Workstation GPU
Quant: BF16
Actually Free · No Signup · Open Source · Watermark-Free

Modal

Serverless Python for GPU workloads.

Freemium · 16–80 GB VRAM
Min VRAM: 16 GB
GPU class: Datacenter GPU
Quant: not listed
Actually Free · Watermark-Free · Hobbyist-Friendly · API

AI-Toolkit (Ostris)

Modern training framework: Flux, SDXL, and SD3 LoRAs configured in YAML.

Open Source · 16–24 GB VRAM
Min VRAM: 16 GB
GPU class: High-end GPU
Quant: BF16
Actually Free · No Signup · Open Source · Watermark-Free

TRELLIS

Microsoft Research's structured 3D representation model.

Open Source · 16–24 GB VRAM
Min VRAM: 16 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Pyramid Flow

Memory-efficient T2V via pyramidal flow matching.

Open Source · 16–24 GB VRAM
Min VRAM: 16 GB
GPU class: High-end GPU
Quant: BF16
Actually Free · No Signup · Open Source · Watermark-Free

Wan 2.2

Open-weight video diffusion from Alibaba.

Open Source · 12–48 GB VRAM
Min VRAM: 12 GB
GPU class: Workstation GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Kohya_ss

The standard SDXL/Flux LoRA training UI.

Open Source · 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

3D Gaussian Splatting

The INRIA original — train your own splats.

Open Source · 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: Mid GPU
Quant: FP32
Actually Free · No Signup · Watermark-Free

Hunyuan3D-2

Tencent's open 3D generator — multi-view, PBR, ready-to-use meshes.

Open Source · 12–16 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · Open Source · Watermark-Free · API

LTX-Video

Real-time-ish open video diffusion from Lightricks.

Open Source · 12–16 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

OneTrainer

Modern alternative trainer for SD/SDXL/Flux.

Open Source · 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Stable Diffusion 3.5 Large

Stability's MMDiT flagship at 8B params.

Open Source · 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: Mid GPU
Quant: FP16
Actually Free · Watermark-Free · Hobbyist-Friendly · API

SUPIR

Diffusion-based photorealistic upscaler.

Open Source · 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: Mid GPU
Quant: FP16
Actually Free · No Signup · Watermark-Free · Plugin

CogVideoX 5B

Open-source text-to-video diffusion from THUDM.

Open Source · 12–24 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

FluxGym

Dead-simple Flux LoRA training in a Gradio UI.

Open Source · 12–20 GB VRAM
Min VRAM: 12 GB
GPU class: Mid GPU
Quant: FP8
Actually Free · No Signup · Open Source · Watermark-Free

Stable Video Diffusion

Image-to-video diffusion: 14 frames (SVD) or 25 frames (SVD-XT).

Open Source · 12–16 GB VRAM
Min VRAM: 12 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · Open Source · Watermark-Free · Hobbyist-Friendly

ComfyUI-AnimateDiff-Evolved

Animation motion modules for ComfyUI.

Open Source · 10–16 GB VRAM
Min VRAM: 10 GB
GPU class: High-end GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Diffusers

Hugging Face's go-to library for every diffusion model.

Open Source · 4–12 GB VRAM
Min VRAM: 4 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Topaz Video AI

GPU-accelerated upscaling, frame interpolation, and denoising.

Paid · from $299 · 4–12 GB VRAM
Min VRAM: 4 GB
GPU class: Mid GPU
Quant: not listed
Watermark-Free · Hobbyist-Friendly

WhisperX

Whisper + speaker diarisation + word-level timestamps.

Open Source · 4–8 GB VRAM
Min VRAM: 4 GB
GPU class: Entry GPU
Quant: INT8
Actually Free · Open Source · Watermark-Free · Hobbyist-Friendly

XTTS v2 (Coqui)

Multilingual voice cloning from a 6-second reference clip.

Open Source · 4–6 GB VRAM
Min VRAM: 4 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Stable Diffusion WebUI Forge

Optimized A1111 fork for low-VRAM cards.

Open Source · 4–8 GB VRAM
Min VRAM: 4 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Flowframes

Free RIFE-based frame interpolation.

Free · 4–8 GB VRAM
Min VRAM: 4 GB
GPU class: Entry GPU
Quant: not listed
Actually Free · No Signup · Open Source · Watermark-Free

Tabby

Self-hosted, GPU-accelerated coding autocompletion.

Freemium · from $19/mo · 4–8 GB VRAM
Min VRAM: 4 GB
GPU class: Entry GPU
Quant: GGUF
Actually Free · No Signup · Open Source · Watermark-Free

Fooocus

Stable Diffusion XL, dialed to one button.

Open Source · 4–8 GB VRAM
Min VRAM: 4 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Tortoise TTS

Slow, but the quality is worth the wait.

Open Source · 4–8 GB VRAM
Min VRAM: 4 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Transformers

The library every LLM ships against first.

Open Source · 2–24 GB VRAM
Min VRAM: 2 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

OpenAI Whisper

The reference open-source speech-to-text model.

Open Source · 2–10 GB VRAM
Min VRAM: 2 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

faster-whisper

Whisper, up to 4× faster at the same accuracy, on a CTranslate2 backend.

Open Source · 2–6 GB VRAM
Min VRAM: 2 GB
GPU class: Entry GPU
Quant: INT8
Actually Free · No Signup · Open Source · Watermark-Free

FaceFusion

The most active open face-swap toolkit.

Open Source · 2–8 GB VRAM
Min VRAM: 2 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free

Real-ESRGAN

The default OSS upscaler, still.

Open Source · 2–6 GB VRAM
Min VRAM: 2 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · No Signup · Open Source · Watermark-Free