DATASHEET // FASTER-WHISPER

faster-whisper

Whisper, 4× faster, same accuracy. CTranslate2 backend.

OPEN SOURCE2–6 GB VRAMRuns locally

Actually FreeNo SignupOpen SourceWatermark-FreeHobbyist-OK

Visit faster-whisperUPDATED 2026-04-25 · DIRECT LINK

github.com/SYSTRAN/faster-whisper

HARDWARE REQUIREMENTS //

Runs locally · Entry GPU (6–8 GB)

2–6 GB VRAM

Min VRAM

2 GB

Rec. VRAM

6 GB

Min RAM

8 GB

Rec. RAM

16 GB

Disk

8 GB

GPU class

Entry GPU

11.8+Apple Silicon ✓CPU-CapableQuant: INT8, FP16

`large-v3` runs in 4-5 GB VRAM with INT8.

[ EDITORIAL PICK ]

Why we recommend faster-whisper

DERIVED FROM METADATA — NOT SPONSORED

Open source
Source is public — you can audit it, fork it, and you'll never lose access to your workflows if faster-whisper the company changes direction.
Runs on 2 GB
Fits on entry-level cards (GTX 1660, RTX 3050, RTX 4060). Rare for this category.
Apple Silicon
Native Metal / MPS support — runs on M-series Macs without CUDA gymnastics.
Top-tier pick
Power-user score 89/100 — consistently rated highly by people who use this every day, not just benchmark chasers.

[ EVIDENCE NOTE ]

Documentation-led datasheet

This page summarizes upstream documentation, release information, and editorially reviewed catalogue fields. It is not presented as a hands-on benchmark. Verify changing requirements at the official project; report stale data through our corrections channel.

Memory guide →

AT-A-GLANCE SIGNALS //

DERIVED FROM THIS PAGE'S DATA

Install difficulty
Easy
Runs CPU-only — no CUDA / driver gymnastics required.
Hardware comfort
Entry-level
Fits on 2 GB cards — GTX 1660 / RTX 3050 territory.
Ecosystem
Open source
Source is public — auditable and forkable, no vendor lock.
Verification
Recent
Catalogue entry last updated 82 days ago — re-verification due soon.

[ COMMUNITY GUIDES & WORKFLOWS ]

Tutorials & deep-dives for faster-whisper

Hand-picked from YouTube, Reddit, GitHub, and the wider web. Each link goes straight to the source — we don't intercept or rewrite anything.

[ MORE IN THIS NICHE ]

Other local llm runners tools we rate

Three picks across different tradeoffs — so you don't end up with three near-clones of faster-whisper.

LIGHTEST HARDWARE //

Transformers

The library every LLM ships against first.

OPEN SOURCE2–24 GB VRAM

BEST FREE OPTION //

llama.cpp

The C++ inference engine powering most local LLMs.

OPEN SOURCECPU-CAPABLE

TOP QUALITY //

Ollama

One-command local LLM runtime.

OPEN SOURCECPU-CAPABLE

What is faster-whisper?

faster-whisper reimplements Whisper inference on top of CTranslate2 (C++/CUDA), delivering 4× speedup over the reference PyTorch impl at the same word error rate. INT8 quantisation halves VRAM again with no measurable accuracy loss. The default Whisper backend for anyone who's measured it.

Pros & cons

✓ PROS

4× faster than reference Whisper at equal accuracy
INT8 quantisation cuts VRAM in half
Drop-in CLI compatible with the reference

– CONS

No diarisation built in — pair with WhisperX or pyannote
Setup involves CUDA + cuDNN library paths that occasionally fight

What's actually free?

MIT.

✓ Actually FreeNo SignupOpen SourceWatermark-Free

Alternatives

OpenAI Whisper

The reference open-source speech-to-text model.

OPEN SOURCE2–10 GB VRAM

VRAM fit2–10 GB

WhisperX

Whisper + speaker diarisation + word-level timestamps.

OPEN SOURCE4–8 GB VRAM

VRAM fit4–8 GB