48 GB VRAM · 37 TOOLS FIT THIS TIER

AI tools that run on 48 GB VRAM

Forty-eight gigabytes is workstation territory. At this tier you can run the biggest open-weight video models at BF16, serve quantized 70B LLMs through vLLM, and start full fine-tuning instead of only training LoRAs. The RTX A6000, L40, and L40S are the typical homes for this tier.


TYPICAL GPUS IN THIS TIER

NVIDIA RTX A6000
NVIDIA L40 / L40S
Two 24 GB GPUs where the runtime supports sharding

✓ WHAT YOU CAN RUN

Wan 2.2 14B at BF16
HunyuanVideo at full precision
70B LLMs at 4-bit quantization with vLLM (FP8 weights alone are roughly 70 GB)
Flux full fine-tuning (with gradient checkpointing)
Multi-model ComfyUI pipelines without offload
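
As a rough cross-check on the list above, weight memory can be approximated as parameter count times bytes per parameter. The sketch below is a back-of-the-envelope estimate, not a measurement. The function names and the 6 GB headroom figure are assumptions made for illustration, and real usage adds activations, caches, text encoders, and runtime overhead on top.

```python
# Rough weight-only VRAM estimate: parameters x bytes per parameter.
# Ignores activations, KV cache, text encoders, VAEs, and framework overhead.
# Quantized formats also store scales, so "int4" here is a lower bound.

BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_gb(params_billion: float, precision: str) -> float:
    """Return weight memory in GB (10^9 bytes) for the given precision."""
    return params_billion * BYTES_PER_PARAM[precision]


def fits(params_billion: float, precision: str,
         vram_gb: float = 48.0, headroom_gb: float = 6.0) -> bool:
    """True if weights plus a headroom margin fit in VRAM.

    headroom_gb is a hypothetical planning margin, not a measured value.
    """
    return weight_gb(params_billion, precision) + headroom_gb <= vram_gb


if __name__ == "__main__":
    examples = [
        ("14B video model", 14, "bf16"),
        ("13B video model", 13, "bf16"),
        ("70B LLM", 70, "fp8"),
        ("70B LLM", 70, "int4"),
    ]
    for name, size, prec in examples:
        gb = weight_gb(size, prec)
        print(f"{name:16s} {prec:5s} {gb:6.1f} GB  fits={fits(size, prec)}")
```

By this estimate a 14B model at BF16 needs about 28 GB for weights, while a 70B model needs about 70 GB at one byte per parameter and about 35 GB at 4 bits, before any overhead.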

✕ WHAT STAYS OUT OF REACH

Workloads that depend on datacenter-only hardware features (H100/H200)
Truly massive 405B+ LLMs without sharding


VRAM tier ≠ universal guarantee

These are planning envelopes. Model variant, precision, resolution, frame count, context, cache, runtime, and offload can move the same workload across tiers. The guide shows how to validate a real ComfyUI graph before buying hardware.
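
To illustrate how context and cache alone can move an LLM workload across tiers, here is a minimal sketch of a KV-cache size estimate. The layer count, KV-head count, and head dimension are assumed values for a 70B-class model with grouped-query attention. They are not taken from this page, so substitute the numbers from the config of the model you actually plan to run.

```python
# KV-cache estimate: each token stores one key and one value vector
# per layer, each of size kv_heads * head_dim.

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_tokens: int, batch: int = 1,
                bytes_per_value: float = 2.0) -> float:
    """Return KV-cache size in GB (10^9 bytes).

    bytes_per_value defaults to 2.0 (FP16/BF16 cache).
    """
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value  # K and V
    return per_token * context_tokens * batch / 1e9


if __name__ == "__main__":
    # Assumed shape for a 70B-class model (hypothetical planning input).
    shape = dict(layers=80, kv_heads=8, head_dim=128)
    for ctx in (8_192, 32_768, 131_072):
        print(f"context {ctx:>7,d}: {kv_cache_gb(context_tokens=ctx, **shape):6.2f} GB")
```

Under these assumptions the cache grows from under 3 GB at 8K tokens to more than 40 GB at 128K tokens for a single sequence, which is why the same model can sit comfortably in one tier at short context and spill out of it at long context.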

VRAM decision guide →

Recommended tools for 48 GB VRAM

Sorted by best fit for this tier — tools designed around your VRAM budget first, then by our power-user score.
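
One reading of this ordering that matches the listing on this page is: keep tools whose minimum VRAM fits the tier, then sort by minimum VRAM in descending order. The sketch below shows only that first key. The power-user score is not published here, so ties simply keep their input order, and the `Tool` type and `rank_for_tier` function are hypothetical names, not the site's implementation.

```python
from typing import NamedTuple


class Tool(NamedTuple):
    name: str
    min_gb: int  # lower bound of the tool's listed VRAM range
    max_gb: int  # upper bound of the tool's listed VRAM range


# A few entries copied from the listing, deliberately out of order.
TOOLS = [
    Tool("Diffusers", 4, 12),
    Tool("vLLM", 24, 80),
    Tool("Wan 2.2", 12, 48),
    Tool("HunyuanVideo", 24, 48),
    Tool("Modal", 16, 80),
    Tool("Real-ESRGAN", 2, 6),
]


def rank_for_tier(tools: list[Tool], tier_gb: int = 48) -> list[Tool]:
    """Keep tools that can start within the tier, largest minimum first.

    Python's sort is stable, so tools with equal min_gb keep input order;
    a real ranking would break those ties with a score.
    """
    runnable = [t for t in tools if t.min_gb <= tier_gb]
    return sorted(runnable, key=lambda t: t.min_gb, reverse=True)


if __name__ == "__main__":
    for tool in rank_for_tier(TOOLS):
        print(f"{tool.name:14s} {tool.min_gb}-{tool.max_gb} GB")
```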

vLLM

High-throughput LLM serving for GPUs.

OPEN SOURCE · 24–80 GB VRAM

NVIDIA Cosmos

World-foundation models for physical AI.

OPEN SOURCE · 24–80 GB VRAM

HunyuanVideo

13B open-weight cinematic text-to-video.

OPEN SOURCE · 24–48 GB VRAM

Magi-1

Autoregressive video diffusion at 24 GB.

OPEN SOURCE · 24–48 GB VRAM

Mochi 1

Genmo's 10B open-weight T2V, pitched as the first 'genuinely fluid' OSS video model.

OPEN SOURCE · 24–60 GB VRAM

AI-Toolkit (Ostris)

Modern training framework — Flux, SDXL, SD3 LoRAs in YAML.

OPEN SOURCE · 16–24 GB VRAM

Modal

Serverless Python for GPU workloads.

FREEMIUM · 16–80 GB VRAM

TRELLIS

Microsoft Research's structured 3D representation model.

OPEN SOURCE · 16–24 GB VRAM

Pyramid Flow

Memory-efficient T2V via pyramidal flow matching.

OPEN SOURCE · 16–24 GB VRAM

Wan 2.2

Open-weight video diffusion from Alibaba.

OPEN SOURCE · 12–48 GB VRAM

Kohya_ss

The standard SDXL/Flux LoRA training UI.

OPEN SOURCE · 12–24 GB VRAM

3D Gaussian Splatting

The INRIA original — train your own splats.

OPEN SOURCE · 12–24 GB VRAM

Hunyuan3D-2

Tencent's open 3D generator — multi-view, PBR, ready-to-use meshes.

OPEN SOURCE · 12–16 GB VRAM

LTX-Video

Real-time-ish open video diffusion from Lightricks.

OPEN SOURCE · 12–16 GB VRAM

OneTrainer

Modern alternative trainer for SD/SDXL/Flux.

OPEN SOURCE · 12–24 GB VRAM

Stable Diffusion 3.5 Large

Stability's MMDiT flagship at 8B params.

OPEN SOURCE · 12–24 GB VRAM

SUPIR

Diffusion-based photorealistic upscaler.

OPEN SOURCE · 12–24 GB VRAM

CogVideoX 5B

Open-source text-to-video diffusion from THUDM.

OPEN SOURCE · 12–24 GB VRAM

FluxGym

Dead-simple Flux LoRA training in a Gradio UI.

OPEN SOURCE · 12–20 GB VRAM

Stable Video Diffusion

Image-to-video diffusion: 14 frames (SVD) or 25 frames (SVD-XT).

OPEN SOURCE · 12–16 GB VRAM

ComfyUI-AnimateDiff-Evolved

Animation motion modules for ComfyUI.

OPEN SOURCE · 10–16 GB VRAM

Diffusers

Hugging Face's go-to library for every diffusion model.

OPEN SOURCE · 4–12 GB VRAM

AUTOMATIC1111 (stable-diffusion-webui)

The original SD power-user webUI.

OPEN SOURCE · 4–12 GB VRAM

Topaz Video AI

GPU-accelerated upscaling, frame-interp, denoise.

PAID · $299 · 4–12 GB VRAM

WhisperX

Whisper + speaker diarisation + word-level timestamps.

OPEN SOURCE · 4–8 GB VRAM

XTTS v2 (Coqui)

Multilingual voice cloning from a 6-second reference clip.

OPEN SOURCE · 4–6 GB VRAM

Stable Diffusion WebUI Forge

Optimized A1111 fork for low-VRAM cards.

OPEN SOURCE · 4–8 GB VRAM

Flowframes

Free RIFE-based frame interpolation.

FREE · 4–8 GB VRAM

RVC (Retrieval-based Voice Conversion)

The voice-changer that took over Discord.

OPEN SOURCE · 4–8 GB VRAM

Tabby

Self-hosted, GPU-accelerated coding autocompletion.

FREEMIUM · $19/MO · 4–8 GB VRAM

Fooocus

Stable Diffusion XL, dialed to one button.

OPEN SOURCE · 4–8 GB VRAM

Tortoise TTS

Slow, but the quality is worth the wait.

OPEN SOURCE · 4–8 GB VRAM

Transformers

The library every LLM ships against first.

OPEN SOURCE · 2–24 GB VRAM

OpenAI Whisper

The reference open-source speech-to-text model.

OPEN SOURCE · 2–10 GB VRAM

faster-whisper

Whisper, up to 4× faster at the same accuracy, on a CTranslate2 backend.

OPEN SOURCE · 2–6 GB VRAM

FaceFusion

The most active open face-swap toolkit.

OPEN SOURCE · 2–8 GB VRAM

Real-ESRGAN

The default OSS upscaler, still.

OPEN SOURCE · 2–6 GB VRAM
