DATASHEET // TABBY-ML

Tabby

Self-hosted, GPU-accelerated coding autocompletion.

FREEMIUM · $19/MO · 4–8 GB VRAM · Self-hosted server

Actually Free · No Signup · Open Source · Watermark-Free · Hobbyist-OK · API

Visit Tabby · UPDATED 2026-05-09 · DIRECT LINK

tabby.tabbyml.com/

HARDWARE REQUIREMENTS //

Self-hosted server · Entry GPU (6–8 GB)

4–8 GB VRAM

Min VRAM: 4 GB
Rec. VRAM: 8 GB
Min RAM: 8 GB
Rec. RAM: 16 GB
Disk: 15 GB
GPU class: Entry GPU

Apple Silicon ✓ · CPU-Capable · Quant: GGUF, Q4_K_M

Coder models in the 1.5–7B parameter range run comfortably in 4–8 GB of VRAM.
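As a rough sanity check on that sizing claim, the sketch below estimates VRAM use from parameter count. It is back-of-the-envelope arithmetic, not a measurement: the 4.8 bits per weight figure for Q4_K_M and the 1 GB allowance for KV cache and runtime buffers are both assumptions, and `estimate_vram_gb` is a hypothetical helper, not part of Tabby.

```python
def estimate_vram_gb(params_billion: float,
                     bits_per_weight: float = 4.8,
                     overhead_gb: float = 1.0) -> float:
    """Rough VRAM estimate in GB: quantized weights plus a fixed overhead.

    bits_per_weight: assumed average for Q4_K_M (approximate).
    overhead_gb: assumed allowance for KV cache and runtime buffers.
    """
    # billions of parameters * bits per weight / 8 bits per byte = GB of weights
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


# Print estimates for a few common coder-model sizes.
for size in (1.5, 3.0, 7.0):
    print(f"{size}B -> ~{estimate_vram_gb(size):.1f} GB")
```

Under these assumptions a 7B model lands a little above 5 GB, which is consistent with the 8 GB recommendation, while a 1.5B model fits easily within the 4 GB minimum. Longer context windows raise the overhead term, so treat the result as a lower bound.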

[ EDITORIAL PICK ]

Why we recommend Tabby

DERIVED FROM METADATA — NOT SPONSORED

Open source
Source is public — you can audit it, fork it, and you'll never lose access to your workflows if Tabby the company changes direction.
Runs on 4 GB
Fits on entry-level cards (GTX 1660, RTX 3050, RTX 4060). Rare for this category.
Apple Silicon
Native Metal / MPS support — runs on M-series Macs without CUDA gymnastics.
GGUF quantization
Supports GGUF models, including Q4_K_M quantization, so you can dial VRAM use up or down to match your card.

[ EVIDENCE NOTE ]

Documentation-led datasheet

This page summarizes upstream documentation, release information, and editorially reviewed catalogue fields. It is not presented as a hands-on benchmark. Verify changing requirements at the official project; report stale data through our corrections channel.

AT-A-GLANCE SIGNALS //

DERIVED FROM THIS PAGE'S DATA

Install difficulty
Easy
Can run CPU-only, so no CUDA or driver gymnastics are required.
Hardware comfort
Entry-level
Fits on 4 GB cards — GTX 1660 / RTX 3050 territory.
Ecosystem
Strong devkit
Open-source AND ships an API — easy to integrate, possible to host yourself.
Verification
Recent
Catalogue entry last updated 68 days ago — re-verification due soon.

[ MORE IN THIS NICHE ]

Other local LLM runners we rate

Three picks across different tradeoffs — so you don't end up with three near-clones of Tabby.

LIGHTEST HARDWARE //

Transformers

The library every LLM ships against first.

OPEN SOURCE · 2–24 GB VRAM

BEST FREE OPTION //

llama.cpp

The C++ inference engine powering most local LLMs.

OPEN SOURCE · CPU-CAPABLE

TOP QUALITY //

Ollama

One-command local LLM runtime.

OPEN SOURCE · CPU-CAPABLE

What is Tabby?

Tabby is the self-hosted answer to GitHub Copilot. It runs a local server with any code-tuned model (StarCoder, DeepSeek-Coder, Qwen-Coder), and editor plugins for VS Code, JetBrains, Vim, and Neovim talk to it over HTTP. The project is licensed under Apache 2.0; enterprise team features are available.
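To make the "plugins talk to it over HTTP" part concrete, here is a minimal sketch of what such a client request might look like, using only the Python standard library. Several details are assumptions to check against the official API reference for your Tabby version: the server address `http://localhost:8080`, the `/v1/completions` path, the `language` / `segments` request fields, the `choices[0].text` response shape, and the optional bearer token. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: a Tabby server is listening locally on this address.
TABBY_URL = "http://localhost:8080"


def build_completion_request(prefix: str, suffix: str = "",
                             language: str = "python") -> dict:
    """Build the JSON body for a completion request (field names are assumed)."""
    return {"language": language, "segments": {"prefix": prefix, "suffix": suffix}}


def request_completion(prefix: str, suffix: str = "", language: str = "python",
                       base_url: str = TABBY_URL, token: str = "",
                       timeout: float = 10.0) -> str:
    """POST the code around the cursor and return the first suggested completion."""
    body = json.dumps(build_completion_request(prefix, suffix, language)).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: servers with authentication enabled accept a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    req = urllib.request.Request(f"{base_url}/v1/completions", data=body,
                                 headers=headers, method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    # Assumed response shape: {"choices": [{"text": "..."}]}
    choices = payload.get("choices", [])
    return choices[0]["text"] if choices else ""
```

With a server running, calling `request_completion("def fibonacci(n):\n    ")` would return the model's suggested continuation as a string. The editor plugins do essentially this on each keystroke pause, sending the text before and after the cursor.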

Pros & cons

✓ PROS

Self-hosted Copilot — no code leaves your network
Pluggable models — pick your size/quality trade-off
Multi-editor plugins maintained by the project

– CONS

Quality bound to model — DeepSeek-Coder-V2 is the current sweet spot
Heavier setup than Continue + Ollama

What's actually free?

The community edition is Apache 2.0 licensed; team features are paid.

✓ Actually Free · No Signup · Open Source · Watermark-Free

Alternatives

Continue

Open-source Copilot — VS Code & JetBrains, any model.

OPEN SOURCE · CPU-CAPABLE

Ollama

One-command local LLM runtime.

OPEN SOURCE · CPU-CAPABLE