Whisper.cpp is an optimized implementation of OpenAI’s Whisper automatic speech recognition (ASR) model, designed for efficient local transcription.
OpenAI’s Whisper is a state-of-the-art speech recognition model that:
Using the original Whisper requires coding skills. For more details, check the Whisper.cpp repository.
Prerequisites
| Model | Parameters | Size on disk (ggml file) | Speed | Quality |
|---|---|---|---|---|
| tiny | 39 M | ~75 MB | Fastest | Basic |
| base | 74 M | ~142 MB | Fast | Good |
| small | 244 M | ~466 MB | Medium | Better |
| medium | 769 M | ~1.5 GB | Slow | High |
| large | 1550 M | ~2.9 GB | Slowest | Best |
```bash
# Transcribe an audio file
./main -m models/ggml-base.bin -f audio.wav

# Specify the language (faster, since auto-detection is skipped)
./main -m models/ggml-base.bin -l en -f audio.wav
```
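
The commands above assume the input is already a WAV file. The `main` example in whisper.cpp expects 16 kHz, 16-bit PCM WAV, so other formats usually need converting first. The following is a minimal sketch of a wrapper that does both steps. It assumes `ffmpeg` is on your `PATH` and that the binary and model live at the paths used above. The names `WHISPER_BIN`, `WHISPER_MODEL`, `DRY_RUN`, `run`, and `transcribe` are hypothetical helpers introduced here for illustration, not part of whisper.cpp.

```bash
#!/usr/bin/env bash
# Sketch: convert an audio file to 16 kHz mono WAV, then transcribe it.
# Assumptions (not from whisper.cpp itself): the variable and function names
# below, and the default paths copied from the commands above.

WHISPER_BIN="${WHISPER_BIN:-./main}"
WHISPER_MODEL="${WHISPER_MODEL:-models/ggml-base.bin}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

transcribe() {
  local input="$1"
  local lang="${2:-}"            # optional language code, e.g. "en"
  local wav="${input}.16k.wav"   # converted copy written next to the input

  # Resample to 16 kHz, downmix to mono, encode as 16-bit PCM.
  run ffmpeg -y -i "$input" -ar 16000 -ac 1 -c:a pcm_s16le "$wav" || return 1

  # Pass -l only when a language was given; otherwise let the tool decide.
  if [ -n "$lang" ]; then
    run "$WHISPER_BIN" -m "$WHISPER_MODEL" -l "$lang" -f "$wav"
  else
    run "$WHISPER_BIN" -m "$WHISPER_MODEL" -f "$wav"
  fi
}
```

After sourcing the script, `transcribe interview.mp3 en` converts and transcribes in one go. Setting `DRY_RUN=1` first prints the two commands without running them, which is a cheap way to check paths. If your build names the binary differently, point `WHISPER_BIN` at it.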