Podcast Transcription Setup

Ultra Quick Start

  1. Install uv (if you don’t have it):

    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  2. Run the self-contained script:

    ./transcribe_episodes.py
    

That’s it! The script automatically manages its own dependencies with uv.

What It Does

Output Structure

transcripts/
├── 01-rizzoli-matt_rodbard.txt
├── 01-rizzoli-matt_rodbard.srt
├── 01-rizzoli-matt_rodbard.json
├── 02-notion-rob_giampietro.txt
├── ...etc
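
As a rough illustration of how the three files per episode could be produced, here is a minimal sketch. It assumes the Whisper result is a dict with a "text" string and a "segments" list whose items carry "start", "end", and "text". The names srt_timestamp, to_srt, and write_outputs are hypothetical and are not the script's own save_transcript() function.

```python
import json
from pathlib import Path


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text from segments with 'start', 'end', and 'text' keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


def write_outputs(result: dict, out_dir: Path, stem: str,
                  formats=("txt", "srt", "json")) -> list:
    """Write the requested formats and return the paths that were written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    if "txt" in formats:
        path = out_dir / f"{stem}.txt"
        path.write_text(result["text"].strip() + "\n", encoding="utf-8")
        written.append(path)
    if "srt" in formats:
        path = out_dir / f"{stem}.srt"
        path.write_text(to_srt(result["segments"]), encoding="utf-8")
        written.append(path)
    if "json" in formats:
        path = out_dir / f"{stem}.json"
        path.write_text(json.dumps(result, ensure_ascii=False, indent=2),
                        encoding="utf-8")
        written.append(path)
    return written
```

Passing a shorter formats tuple, for example ("txt", "srt"), skips the formats you do not need.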

Performance Notes

Customization Options

Use a faster model (lower quality):

model = whisper.load_model("base")  # Much faster

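Add speaker context: Whisper does not label who is speaking, so this is not true speaker detection. An initial_prompt only gives the model context, which can help with names and vocabulary.

# In transcribe_episode function, add:
result = model.transcribe(mp3_path,
                          language='en',
                          word_timestamps=True,
                          verbose=False,
                          initial_prompt="This is a podcast conversation between Craig Mod and a guest.")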

Output only specific formats: Comment out the formats you don’t want in the save_transcript() function.

Troubleshooting

“No module named ‘whisper’”:

pip install openai-whisper

Memory errors: Use a smaller model:

model = whisper.load_model("base")

FFmpeg errors:

# macOS
brew install ffmpeg

# Ubuntu/Debian  
sudo apt install ffmpeg
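
Whisper decodes audio by calling the ffmpeg command-line tool, so a quick check that it is on your PATH can rule out this class of error before a long run. A minimal sketch (find_ffmpeg is a hypothetical helper, not part of the script):

```python
import shutil


def find_ffmpeg(executable: str = "ffmpeg"):
    """Return the full path to the executable, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg not found on PATH; install it before transcribing.")
    else:
        print(f"Using ffmpeg at {path}")
```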

Very slow performance: Consider using faster-whisper:

pip install faster-whisper
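
If you try faster-whisper, note that its API differs from openai-whisper: transcribe() returns a lazy iterable of segment objects plus an info object, rather than a single dict. A minimal sketch, assuming CPU inference with int8 quantization (the function names, model size, and device settings here are illustrative choices, not what the script uses):

```python
def join_segments(segments) -> str:
    """Join segment texts into one string; each segment has a .text attribute."""
    return " ".join(seg.text.strip() for seg in segments)


def transcribe_fast(audio_path: str, model_size: str = "base") -> str:
    """Transcribe one file with faster-whisper and return plain text."""
    # Imported here so this file still loads when faster-whisper is not installed.
    from faster_whisper import WhisperModel

    # int8 on CPU is an assumption; pick device and compute_type for your hardware.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(audio_path, language="en")
    # segments is lazy: the transcription actually runs while we iterate over it.
    return join_segments(segments)
```

Each segment also carries .start and .end times in seconds, so the same objects can feed an SRT writer.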

Integration Ideas

Once you have transcripts, you could:

  1. Add to episode pages - Include transcript sections
  2. Enable search - Make episodes searchable by content
  3. Generate show notes - Extract key topics automatically
  4. Create clips - Find quotable moments with timestamps
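
The search and clip ideas can both start from the .json files. Here is a minimal sketch, assuming each file holds a Whisper result with a "segments" list of "start", "end", and "text" entries (find_mentions and search_transcripts are hypothetical names):

```python
import json
from pathlib import Path


def find_mentions(segments, query: str) -> list:
    """Return (start, end, text) for segments whose text contains the query."""
    needle = query.lower()
    return [
        (seg["start"], seg["end"], seg["text"].strip())
        for seg in segments
        if needle in seg["text"].lower()
    ]


def search_transcripts(transcript_dir, query: str) -> dict:
    """Map each matching JSON file name to its list of matching segments."""
    hits = {}
    for path in sorted(Path(transcript_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        matches = find_mentions(data.get("segments", []), query)
        if matches:
            hits[path.name] = matches
    return hits
```

For example, search_transcripts("transcripts", "walking") would return, per episode, the start and end times in seconds of every segment that mentions the word, which is enough to locate a clip.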
