Modal
Serverless Python for GPU workloads.
- Min VRAM: 16 GB
- GPU class: Datacenter GPU
- Quant: n/a
Sixteen gigabytes is where things stop feeling tight. Apple Silicon Macs with 16 GB unified memory hit similar territory. Local video diffusion becomes practical (LTX-Video, CogVideoX 5B), and EXL2 LLM quants open up new throughput options.
Sorted by best fit for this tier — tools designed around your VRAM budget first, then by our power-user score.
Serverless Python for GPU workloads.
Modern training framework — Flux, SDXL, SD3 LoRAs in YAML.
Microsoft Research's structured 3D representation model.
Memory-efficient T2V via pyramidal flow matching.
Open-weight video diffusion from Alibaba.
The standard SDXL/Flux LoRA training UI.
The INRIA original — train your own splats.
Tencent's open 3D generator — multi-view, PBR, ready-to-use meshes.
Real-time-ish open video diffusion from Lightricks.
Modern alternative trainer for SD/SDXL/Flux.
Stability's MMDiT flagship at 8B params.
Diffusion-based photorealistic upscaler.
Open-source text-to-video diffusion from THUDM.
Dead-simple Flux LoRA training in a Gradio UI.
Image-to-video diffusion — 25 frames, 14 or 25 steps.
Animation motion modules for ComfyUI.
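The ordering described above (tools built around this tier's VRAM budget first, then by power-user score) can be sketched as a two-part sort key. This is only an illustration: the `Tool` class, its `min_vram_gb` and `power_score` fields, and the rule that "best fit" means a minimum VRAM equal to the tier are all assumptions, not the site's actual ranking logic.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    min_vram_gb: int    # hypothetical field: minimum VRAM the tool targets
    power_score: float  # hypothetical field: stand-in for the power-user score


def sort_for_tier(tools, tier_gb):
    """Put tools whose minimum VRAM matches the tier first, then sort by score (high to low)."""
    # False sorts before True, so exact-tier tools lead the list.
    return sorted(tools, key=lambda t: (t.min_vram_gb != tier_gb, -t.power_score))


# Placeholder entries, not real scores from the listing.
tools = [Tool("a", 8, 9.0), Tool("b", 16, 5.0), Tool("c", 16, 7.0)]
ranked = [t.name for t in sort_for_tier(tools, 16)]
```

With these placeholder values, the two 16 GB tools come first in score order, followed by the 8 GB tool despite its higher score.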