Free Desktop App — v1.15

Podcast Idiot Transcriber

Transcribe your podcast episodes to text — completely free, completely private, running entirely on your own computer. No cloud. No API keys. No monthly fees. Ever.

100% Local · Windows · macOS · Linux · Free Forever
Download Free

Transcription that stays on your machine

The Podcast Idiot Transcriber is a desktop application I built to solve a simple problem: I needed accurate transcripts of my podcast episodes without paying for a cloud service every month or handing my audio files off to someone else’s server.

It uses OpenAI Whisper — the same AI transcription engine that powers some of the best commercial services — running entirely on your local computer. After the one-time setup, it works completely offline. Your audio never leaves your machine.
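For the curious, here is a minimal sketch of what running Whisper locally can look like with the open-source `openai-whisper` Python package. This is not the app's actual source: the function names `transcribe_file` and `segments_to_lines` are made up for illustration, and the sketch assumes the package and ffmpeg are already installed.

```python
from typing import Any


def transcribe_file(audio_path: str, model_name: str = "base") -> dict[str, Any]:
    """Transcribe one audio file locally with Whisper (hypothetical helper)."""
    # Imported inside the function so the rest of this sketch loads
    # even on a machine where the package is not installed yet.
    import whisper  # assumption: installed via `pip install openai-whisper`

    # The first call downloads the model weights; later calls reuse the local copy.
    model = whisper.load_model(model_name)
    # The result is a dict with "text", "language", and a list of timed "segments".
    return model.transcribe(audio_path)


def segments_to_lines(segments: list[dict[str, Any]]) -> list[str]:
    """Render Whisper-style segments as '[MM:SS] text' lines."""
    lines = []
    for seg in segments:
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['text'].strip()}")
    return lines
```

Calling `transcribe_file("episode.mp3")["segments"]` and passing the result to `segments_to_lines` would give a simple timestamped transcript, with every step happening on the local machine.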

It also supports speaker diarization — the ability to tell speakers apart — so you get a labeled transcript showing which person said what. Essential for interview-format podcasts.

🔒

Completely Private

Your audio files never leave your computer. No uploads, no cloud processing, no data collection. Ever.

💸

Free Forever

No subscriptions, no API keys to pay for, no usage limits. Download once, use as much as you want.

🎙️

Speaker Labels

Automatically identifies different voices and labels them in the transcript. Perfect for interviews.

📄

All Formats

Outputs TXT, SRT, VTT, JSON, and Podcasting 2.0 JSON — ready to drop into your RSS feed.

GPU Accelerated

Automatically uses your NVIDIA, Apple Silicon, or AMD GPU for dramatically faster transcription.

🖥️

Cross-Platform

One download works on Windows, macOS, and Linux. Installers and uninstallers for all three included.

Simple from start to transcript

1

Download & Install

Download the zip, extract it, and run the installer for your OS. It checks for Python and ffmpeg (see the requirements below), sets up all the other dependencies automatically, and downloads the Whisper AI model (about 150 MB) one time.

2

Open the App

Launch Podcast Idiot Transcriber from your desktop icon, Start Menu, or application launcher. A clean branded interface shows all your options at a glance.

3

Choose Your Audio

Browse for your MP3, WAV, M4A, FLAC, OGG, or AAC file. Select an output folder, or let the app save transcripts beside the original audio.

4

Configure Options

Pick your Whisper model, choose language or auto-detect, toggle speaker labels, choose output formats, and set CPU priority so transcription runs quietly in the background.

5

Click Transcribe

Hit the big red Transcribe button and watch the progress log. When done, all your chosen output files are ready in the output folder.
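The choices made in steps 3 and 4 can be pictured as a small options object plus a rule for where each output file lands. The sketch below is an illustration only, not the app's code: `TranscribeOptions`, `plan_outputs`, and the file naming rule (audio file name plus the suffix from the format table) are assumptions.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional

# Audio types listed in step 3.
SUPPORTED_AUDIO = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac"}

# Suffix per output format, following the format table on this page.
FORMAT_SUFFIX = {
    "txt": ".txt",
    "srt": ".srt",
    "vtt": ".vtt",
    "json": ".json",
    "podcast20": "_podcast20.json",
}


@dataclass
class TranscribeOptions:
    model: str = "base"
    language: Optional[str] = None      # None means auto-detect
    speaker_labels: bool = False
    formats: list = field(default_factory=lambda: ["txt"])
    output_dir: Optional[Path] = None   # None means save beside the audio


def plan_outputs(audio: Path, opts: TranscribeOptions) -> dict:
    """Map each chosen format to the path its transcript would be written to."""
    if audio.suffix.lower() not in SUPPORTED_AUDIO:
        raise ValueError(f"Unsupported audio type: {audio.suffix}")
    folder = opts.output_dir if opts.output_dir is not None else audio.parent
    return {fmt: folder / f"{audio.stem}{FORMAT_SUFFIX[fmt]}" for fmt in opts.formats}
```

With no output folder chosen, `plan_outputs(Path("/shows/ep12.mp3"), ...)` places every transcript in `/shows/`, next to the original audio.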

Every format podcasters need

The app creates all your transcript files in a single pass. Choose which ones you want — or grab all of them.

Format | File | Best For
TXT | .txt | Plain readable transcript for your website, show notes, or personal reference
SRT | .srt | Subtitle file for video versions of your podcast, YouTube, or video editors
VTT | .vtt | WebVTT captions for HTML5 players and web-based podcast players
JSON | .json | Full timestamped segment data for developers or custom integrations
Podcast 2.0 | _podcast20.json | Ready for the <podcast:transcript> tag in your RSS feed

Podcasting 2.0 ready: The Podcast 2.0 JSON format follows the official podcast namespace transcript spec. Upload it to your server and point to it from your RSS feed — compatible with all Podcasting 2.0 apps.
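As a rough illustration of what that file contains, the sketch below builds a Podcasting 2.0 style transcript from Whisper-style segments. The field names (`version`, `segments`, `startTime`, `endTime`, `body`, `speaker`) reflect my reading of the podcast namespace JSON transcript spec, so check them against the spec before relying on them. The function name `to_podcast20` is hypothetical.

```python
import json


def to_podcast20(segments, version: str = "1.0.0") -> str:
    """Serialize timed segments as a Podcasting 2.0 style JSON transcript."""
    out = {"version": version, "segments": []}
    for seg in segments:
        entry = {
            "startTime": round(seg["start"], 3),
            "endTime": round(seg["end"], 3),
            "body": seg["text"].strip(),
        }
        # Speaker is only written when diarization supplied a label.
        if seg.get("speaker"):
            entry["speaker"] = seg["speaker"]
        out["segments"].append(entry)
    return json.dumps(out, ensure_ascii=False, indent=2)
```

Once the file is uploaded, the RSS item would point to it with something like `<podcast:transcript url="https://example.com/ep12_podcast20.json" type="application/json" />`, where the URL is a placeholder for your own server.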

Pick your speed vs. accuracy tradeoff

Whisper comes in five sizes: tiny, base, small, medium, and large. The app defaults to base, a good balance for most podcasts. Switch to a larger model any time for more accurate results on difficult audio.

Model | Size | Speed (1hr, CPU) | Accuracy
tiny | 75 MB | ~5 min | Good
small | 465 MB | ~20–30 min | Great
medium | 1.5 GB | ~45–60 min | Excellent
large | 2.9 GB | ~90–120 min | Best
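The table gives times for a one-hour episode. For other lengths, a back-of-the-envelope estimate is to scale those figures linearly, which is an assumption rather than a measured result. The sketch below does that arithmetic; `estimate_minutes` and its optional `speedup` factor are hypothetical names.

```python
# CPU minutes per hour of audio (low, high), copied from the table above.
CPU_MINUTES_PER_HOUR = {
    "tiny": (5, 5),
    "small": (20, 30),
    "medium": (45, 60),
    "large": (90, 120),
}


def estimate_minutes(model: str, episode_minutes: float, speedup: float = 1.0):
    """Return a (low, high) estimate in minutes, assuming linear scaling."""
    low, high = CPU_MINUTES_PER_HOUR[model]
    # Scale by episode length, then divide by any hardware speedup factor.
    scale = episode_minutes / 60 / speedup
    return (low * scale, high * scale)
```

For example, `estimate_minutes("small", 90)` returns `(30.0, 45.0)`: a 90-minute episode on the small model would take roughly 30 to 45 minutes on CPU under this assumption.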

Have a GPU? The app automatically detects and uses your NVIDIA (CUDA), Apple Silicon (Metal), or AMD GPU. A mid-range NVIDIA GPU can be 8–15× faster than CPU, so with the smaller models a one-hour episode can finish in under two minutes.
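Automatic GPU detection of this kind is commonly done through PyTorch, which Whisper runs on. The following is a sketch under that assumption, not the app's actual detection code, and `pick_device` is a hypothetical name. It falls back to the CPU when PyTorch or a supported GPU is not present.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch can see."""
    try:
        import torch  # assumption: PyTorch is the backend in use
    except ImportError:
        return "cpu"

    # NVIDIA GPUs show up here; ROCm builds of PyTorch report AMD GPUs
    # through the same torch.cuda interface.
    if torch.cuda.is_available():
        return "cuda"

    # Apple Silicon (Metal) is exposed as the "mps" backend in newer PyTorch.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"
```

The returned string could then be handed to the model loader, for example `whisper.load_model("base", device=pick_device())`. How well a given model runs on each backend varies, so treat the result as a starting point.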

Know who said what

Speaker diarization automatically detects different voices and labels them in the transcript. Instead of a wall of text, you get something like this:

[SPEAKER_00] Welcome back to the show. Today we’re talking about transcription tools.
[SPEAKER_01] Thanks for having me. I’ve been looking for something like this for a while.
[SPEAKER_00] Let’s start with why you think transcripts matter for podcasters.

Speaker labeling uses pyannote.audio and requires a free HuggingFace account and token — a two-minute one-time setup. You can also turn it off entirely for faster, label-free transcription.

Works best with two clearly distinct voices and minimal crosstalk — typical interview podcasts transcribe with excellent accuracy.
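One common way to combine the two tools is to give each transcribed segment the speaker whose diarization turn overlaps it the most. The sketch below shows that idea with plain data. It does not call pyannote.audio, it is not necessarily how the app does it, and `label_segments` is a hypothetical name. Turns are assumed to be `(start, end, speaker)` tuples.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns):
    """Prefix each segment's text with the speaker that overlaps it the most.

    segments: dicts with "start", "end", "text" (Whisper-style).
    turns: (start, end, speaker) tuples from a diarization step.
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for t_start, t_end, speaker in turns:
            shared = overlap(seg["start"], seg["end"], t_start, t_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labeled.append(f"[{best_speaker}] {seg['text'].strip()}")
    return labeled
```

Crosstalk is where this simple rule struggles: when two turns overlap a segment almost equally, the label can flip, which fits the advice above about distinct voices and minimal overlap.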

What you need to run it

🪟 Windows 10 / 11
🍎 macOS 12+
🐧 Linux (Ubuntu, Mint, Fedora, Arch…)

🐍

Python 3.10+

The installer checks for Python and guides you if it’s missing. Mac and Linux often have it pre-installed.

🎞️

ffmpeg

Required for audio processing. The Mac installer gets it via Homebrew. Windows and Linux instructions are included.

💾

~500 MB Disk

For the base Whisper model and Python environment. Larger models need up to 3 GB extra.

🧠

4 GB RAM

8 GB or more recommended. Larger Whisper models need more RAM; the model table above lists their sizes.
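A quick way to check the first three requirements by hand is a small preflight script like the one below. It is a sketch using only the Python standard library, not the installer's actual logic, and `preflight` is a hypothetical name. RAM is left out because the standard library has no portable way to read it.

```python
import shutil
import sys


def preflight(install_dir: str = ".", min_python=(3, 10), min_free_mb: int = 500) -> dict:
    """Report whether Python, ffmpeg, and free disk space meet the minimums."""
    free_mb = shutil.disk_usage(install_dir).free / (1024 * 1024)
    return {
        # Running interpreter is at least the required version.
        "python": sys.version_info >= min_python,
        # ffmpeg executable can be found on the PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # Enough room for the base model and Python environment.
        "disk": free_mb >= min_free_mb,
    }
```

Running `preflight()` returns a dict of three booleans, so any `False` entry points at the requirement to fix before installing.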

Ready to transcribe?

Free. Local. Private. No account required.

Download Free — v1.15
Windows · macOS · Linux  |  Includes installer & uninstaller for all platforms