Introduction to LLMs

tiny/base: Quick transcription, acceptable quality
small: Balanced speed and accuracy
medium: High accuracy, moderate speed
large: Best quality, slower processing

Whisper.cpp (Transcription)

Whisper.cpp is an optimized implementation of OpenAI’s Whisper automatic speech recognition (ASR) model, designed for efficient local transcription.

OpenAI’s Whisper is a state-of-the-art speech recognition model that:

Using the original Whisper requires coding skills. Whisper.cpp instead runs from the command line; see the Whisper.cpp repository for details.

Prerequisites

Model	Parameters	Size	Speed	Quality
tiny	39 M	~40 MB	Fast	Basic
base	74 M	~75 MB	Fast	Good
small	244 M	~250 MB	Medium	Better
medium	769 M	~770 MB	Slow	High
large	1550 M	~1.5 GB	Slow	Best
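
The trade-offs in the table can be turned into a small lookup, which helps when a script needs to pick a model for you. The sketch below is illustrative only: the function name pick_model and the priority keywords are hypothetical, and every file name other than models/ggml-base.bin is an assumption that follows the same naming pattern, so check what you actually downloaded.

```shell
#!/bin/sh
# pick_model: map a priority keyword to a model file path.
# Hypothetical helper. File names other than ggml-base.bin are assumed
# to follow the same "ggml-<name>.bin" pattern.
pick_model() {
    case "$1" in
        quick)    echo "models/ggml-base.bin" ;;    # fast, acceptable quality
        balanced) echo "models/ggml-small.bin" ;;   # balanced speed and accuracy
        accurate) echo "models/ggml-medium.bin" ;;  # high accuracy, slower
        best)     echo "models/ggml-large.bin" ;;   # best quality, slowest
        *)        echo "unknown priority: $1" >&2; return 1 ;;
    esac
}
```

For example, ./main -m "$(pick_model balanced)" -f audio.wav would run the small model, assuming that file exists under models/.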

# Transcribe an audio file
./main -m models/ggml-base.bin -f audio.wav

# Specify language (faster)
./main -m models/ggml-base.bin -l en -f audio.wav
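
When there are many recordings, the same command can be wrapped in a loop. This is a minimal sketch, not part of Whisper.cpp itself: the function name transcribe_dir and the DRY_RUN variable are hypothetical, and it assumes the ./main binary and model path shown above. It uses only the -m, -l and -f flags from the commands above.

```shell
#!/bin/sh
# transcribe_dir DIR [MODEL] [LANG]
# Hypothetical helper: run the command shown above on every .wav file in DIR.
# Set DRY_RUN=1 to print the commands instead of executing them.
transcribe_dir() {
    dir=$1
    model=${2:-models/ggml-base.bin}
    lang=${3:-en}
    for f in "$dir"/*.wav; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "./main -m $model -l $lang -f $f"
        else
            ./main -m "$model" -l "$lang" -f "$f"
        fi
    done
}
```

Running it first with DRY_RUN=1 is a cheap way to confirm which files and which model will be used before starting a long batch.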
