[ Hardware-first guides ]

What can you run on your GPU?

Local AI lives and dies by VRAM. Pick your tier and we'll show only the workflows, models, and runners that fit in that budget. Every entry lists the required quantization, the expected speed, and what stays out of reach.
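As a rough illustration of picking a tier, the sketch below rounds a card's VRAM down to the nearest tier on this page. The tier list comes from the headings here; the function name `pick_tier` and the round-down rule are assumptions for illustration, not a description of how this site filters entries.

```python
from bisect import bisect_right

# Tier sizes in GB, taken from the headings on this page.
TIERS_GB = [6, 8, 12, 16, 24, 48, 80]


def pick_tier(vram_gb: float):
    """Return the largest listed tier that does not exceed vram_gb.

    Returns None when the card is below the smallest tier.
    Rounding down is an assumption: a 10 GB card is treated as an 8 GB tier.
    """
    index = bisect_right(TIERS_GB, vram_gb)
    return TIERS_GB[index - 1] if index else None


if __name__ == "__main__":
    for vram in (4, 10, 24, 32):
        print(vram, "GB ->", pick_tier(vram))
```

Rounding down is the conservative choice: a card that sits between two tiers is only promised what the lower tier covers.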

6GB · 45 TOOLS

AI tools that run on 6 GB VRAM

Six gigabytes is a constrained but useful learning tier. Smaller or aggressively optimized image and language-model workflows can run, but model variant, precision, resolution, context, and offload settings decide the real fit. Leave room for runtime overhead instead of sizing from checkpoint files alone.

e.g. GTX 1660 Ti · RTX 2060 · RTX 3050 / 3060 6GB
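The advice to leave room for runtime overhead can be sketched as a back-of-the-envelope check. This is a minimal estimate, assuming the weights dominate memory use: it multiplies parameter count by bits per weight and adds a fixed headroom. The function names and the 1.5 GiB default headroom are placeholders chosen for illustration, not measured values; real overhead depends on the runner, context length, and resolution.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """Rough fit check: weights plus an assumed headroom must fit in VRAM.

    overhead_gib is a placeholder for runtime buffers, caches, and
    activations; it is an assumption, not a measurement.
    """
    return weights_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # A hypothetical 7B-parameter model at 4-bit and at 16-bit on a 6 GiB card.
    for bits in (4, 16):
        size = weights_gib(7, bits)
        print(f"{bits}-bit: {size:.2f} GiB weights, fits in 6 GiB: {fits(7, bits, 6)}")
```

A checkpoint file on disk roughly matches the weights figure, which is why sizing from the file alone tends to underestimate what the card actually needs.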

8GB · 59 TOOLS

AI tools that run on 8 GB VRAM

Eight gigabytes is a practical hobbyist entry tier, not a universal compatibility line. Many image workflows and compact quantized LLMs fit with sensible settings. Official ComfyUI documentation also shows a Wan 2.2 TI2V 5B path designed for 8 GB with native offload; larger video variants remain a different workload.

e.g. RTX 3060 Ti · RTX 3070 · RTX 4060 / 4060 Ti (8 GB)

12GB · 12 TOOLS

AI tools that run on 12 GB VRAM

Twelve gigabytes gives useful headroom for complex image graphs, quantized local LLMs, and selected compact video workflows. It is still a planning tier rather than a guarantee: model architecture, context, resolution, frames, precision, and offload settings can move a workload across the boundary.

e.g. RTX 3060 12GB · RTX 4070 · RTX 4070 SUPER
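One way to see how context alone can move an LLM workload across the boundary is to estimate the key-value cache. The sketch below uses the common estimate for a transformer with grouped KV heads: two tensors (key and value) per layer, per token. The layer count, KV-head count, and head dimension in the example are hypothetical values, not the configuration of any specific model, and real runners may quantize or page the cache.

```python
def kv_cache_gib(context_tokens: int, n_layers: int, n_kv_heads: int,
                 head_dim: int, bytes_per_value: int = 2) -> float:
    """Approximate KV-cache size in GiB for a single sequence.

    The factor 2 accounts for storing both keys and values.
    bytes_per_value=2 assumes a 16-bit cache.
    """
    bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return bytes_per_token * context_tokens / 2**30


if __name__ == "__main__":
    # Hypothetical model shape: 32 layers, 8 KV heads, head dimension 128.
    for context in (8192, 32768):
        print(context, "tokens ->", kv_cache_gib(context, 32, 8, 128), "GiB")
```

With these assumed dimensions, quadrupling the context quadruples the cache, so a model that fits comfortably at a short context can exceed the same card at a long one.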

16GB · 16 TOOLS

AI tools that run on 16 GB VRAM

Sixteen gigabytes reduces compromise for multi-component image graphs and expands quantized LLM and video options. Unified memory on Apple Silicon is not directly interchangeable with discrete VRAM: the operating system and application share the same pool, and backend support differs.

e.g. RTX 4070 Ti SUPER · RTX 4080 (16 GB) · RTX 5070 Ti

24GB · 26 TOOLS

AI tools that run on 24 GB VRAM

Twenty-four gigabytes is a flexible consumer-workstation tier. It supports upstream-documented Flux LoRA configurations and gives heavy image or video graphs more room, but it does not guarantee that every large model, precision, context, or frame sequence stays fully on the GPU.

e.g. RTX 3090 / 3090 Ti · RTX 4090 · RTX A5000

48GB · 37 TOOLS

AI tools that run on 48 GB VRAM

Forty-eight gigabytes is workstation territory. It gives large open-weight video models room to run at higher precision, can serve quantized 70B LLMs through vLLM (at 16-bit, 70B weights alone need roughly 140 GB), and makes fine-tuning beyond LoRA practical for smaller models. The RTX A6000, L40, and L40S are the typical homes for this tier; the RTX 5090 has 32 GB and sits below it.

e.g. NVIDIA RTX A6000 · NVIDIA L40 / L40S · Two 24 GB GPUs where the runtime supports sharding
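To make the 70B case concrete, the sketch below inverts the usual size estimate and asks how many bits per weight a memory budget can afford. The function name and the 4 GiB default reserve for cache and runtime buffers are assumptions for illustration. Treating two 24 GB GPUs as one 48 GiB pool is also a simplification, since each GPU carries its own overhead when a runtime shards a model.

```python
def max_bits_per_weight(params_billion: float, vram_gib: float,
                        reserve_gib: float = 4.0) -> float:
    """Largest average bits per weight whose weights fit in the budget.

    reserve_gib is an assumed allowance for KV cache and runtime buffers.
    Returns 0.0 when the reserve already exceeds the available memory.
    """
    usable_bytes = (vram_gib - reserve_gib) * 2**30
    if usable_bytes <= 0:
        return 0.0
    return usable_bytes * 8 / (params_billion * 1e9)


if __name__ == "__main__":
    for vram in (24, 48, 80):
        bits = max_bits_per_weight(70, vram)
        print(f"70B in {vram} GiB: up to about {bits:.1f} bits per weight")
```

Under these assumptions a 70B model needs a quantized format on a 48 GB card, and 16-bit weights do not fit on a single 80 GB card either.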

80GB · 59 TOOLS

AI tools that run on 80 GB+ datacenter GPUs

Eighty gigabytes and up is datacenter territory, typically rented by the hour rather than owned. This is where you serve 70B+ LLMs at production scale, fully fine-tune large models, and run multi-GPU video diffusion pipelines.

e.g. NVIDIA A100 80GB · NVIDIA H100 80GB · NVIDIA H200 141GB

No GPU? Cloud-only options

When you don't have local hardware (or you need quality above what fits in your card), these cloud APIs cover the same workloads.

ElevenLabs

Cline

Midjourney v7

Replicate

fal.ai

Luma AI

Ideogram

Runway Gen-4

Vast.ai

Beam

Kling 2.1

Meshy

Pika 2.2
