podguy

Pi-first post-production tooling for podcast and video-podcast editors who want transcripts, chapters, clip candidates, cuts, and social review exports from local media.

What it does

  • Launches pi with podcast-specific skills, prompts, and a startup widget.
  • Transcribes local audio/video files with Whisper-compatible backends.
  • Prepares long transcripts into timecoded chunks for editorial review.
  • Scans video episodes for likely interstitials and non-host inserts.
  • Generates chapters, clip candidates, cut reports, show notes, quotes, and proper noun checks.
  • Cuts selected highlight ranges into review exports for TikTok, Reels, YouTube Shorts, trailers, or social posts.
  • Uploads finished episodes to YouTube with metadata composed from analysis artifacts.

Generated transcripts, scans, thumbnails, notes, and clip exports go under gitignored dist/ by default.

Install

Clone the repo and install the local Node tooling:

git clone https://github.com/modem-dev/podguy.git
cd podguy
npm install

Install the required system tools (shown here with Homebrew on macOS):

brew install uv ffmpeg

Notes:

  • ffmpeg is used for fixtures, transcription backends, and clip cutting.
  • uv runs the Python scripts and optional transcription dependency groups. The first uv command creates a local .venv/ and may download Python automatically; this is normal and happens only once.
  • The video scanner is macOS-only and uses Swift / AVFoundation / Vision. Everything else works cross-platform (Linux: install uv and ffmpeg via your package manager).
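
A quick preflight check can confirm the notes above before a first run. This is a minimal sketch, not part of the repo: the function names are hypothetical, and it only checks that uv and ffmpeg are on PATH and whether the platform is macOS.

```python
import shutil
import sys


def missing_tools(tools=("uv", "ffmpeg"), which=shutil.which):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def scanner_supported(platform=sys.platform):
    """The video scanner is macOS-only, so only 'darwin' qualifies."""
    return platform == "darwin"


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools:", ", ".join(missing))
    else:
        print("uv and ffmpeg found on PATH")
    if not scanner_supported():
        print("Video scanner unavailable here; the other workflows still apply")
```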

Set up a real transcription backend when you are ready to transcribe episodes:

uv sync --group transcribe-mlx      # Apple Silicon
uv sync --group transcribe-faster   # Cross-platform
uv sync --group transcribe-whisper  # OpenAI Whisper package

Quick start

Create an optional show profile:

cp podguy.example.toml podguy.toml

Start podguy from the repo root:

./podguy

On first launch, type /login inside pi to connect your model provider (or use your usual API key setup).

Then ask pi for a concrete episode task:

Analyze "episode-006-draft.mp4" as ep006.

Generate chapters for ep006 in timestamp-title format.

Find likely TikTok/Shorts clips for ep006 and cut vertical review exports.

For broad requests, podguy should ask which of these you want:

  • quick pass: optional video scan + transcript + prepared transcript artifacts + short summary
  • full review: quick pass + chapters + clips + cuts + show notes + quotes + proper noun review
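
The "timestamp-title format" in the example prompts above is not spelled out here. One plausible reading, assumed for this sketch, is one line per chapter with a start time followed by the title. The helper name is hypothetical.

```python
def chapter_line(start_seconds, title):
    """Format one chapter as 'MM:SS Title', or 'H:MM:SS Title' past one hour."""
    total = int(start_seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        stamp = f"{hours}:{minutes:02d}:{secs:02d}"
    else:
        stamp = f"{minutes:02d}:{secs:02d}"
    return f"{stamp} {title}"


if __name__ == "__main__":
    # Placeholder chapters for illustration only.
    for start, title in [(0, "Intro"), (95, "First topic"), (3725, "Wrap-up")]:
        print(chapter_line(start, title))
```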

Common workflows

You don't need to memorize these commands; pi runs them for you when you ask in natural language. They're here for reference and debugging.

Scan a video

swift scripts/scan_podcast.swift "episode-006-draft.mp4" dist/analysis/ep006/scan 0.5
open dist/analysis/ep006/scan/report.html

Key outputs:

  • interstitial_candidates.csv
  • non_host_candidates.csv
  • report.html
  • thumbs/

Scanner results are heuristic review aids, not exact edit points.
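
To look over the candidate files outside the HTML report, a generic CSV reader is enough. The sketch below makes no assumption about the column names, which are not documented here; it reads whatever header row the file has. The function name is hypothetical.

```python
import csv
from pathlib import Path


def load_candidates(csv_path):
    """Read a scanner CSV into a list of dicts keyed by its header row."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    scan_dir = Path("dist/analysis/ep006/scan")
    for name in ("interstitial_candidates.csv", "non_host_candidates.csv"):
        path = scan_dir / name
        if path.exists():
            rows = load_candidates(path)
            columns = list(rows[0]) if rows else []
            print(f"{name}: {len(rows)} rows, columns: {columns}")
```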

Transcribe media

uv run python scripts/transcribe_video.py --list-backends

uv run --group transcribe-mlx python scripts/transcribe_video.py \
  "episode-006-draft.mp4" \
  dist/analysis/ep006/transcript \
  --backend mlx-whisper

Key outputs:

  • segments.json
  • transcript.txt
  • transcript.srt
  • transcript.vtt
  • summary.txt

Use --backend mock only for tests and setup validation.
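
For reference, the relationship between timed segments and an SRT file can be sketched as follows. This assumes each segment is a dict with start and end in seconds plus a text field, as Whisper-style output commonly provides; the actual layout of segments.json is not documented here, and these helpers are illustrative rather than the repo's implementation.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Render segments as SRT cues, numbered from 1 and separated by blank lines."""
    cues = []
    for number, seg in enumerate(segments, start=1):
        cues.append(
            f"{number}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(cues)
```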

Prepare transcript artifacts

uv run python scripts/prepare_transcript_analysis.py \
  dist/analysis/ep006/transcript \
  --output-dir dist/analysis/ep006 \
  --slug ep006 \
  --plain-output-names

Key outputs:

  • dist/analysis/ep006/transcript_chunks.md
  • dist/analysis/ep006/transcript_index.json

These are the main inputs for chaptering and editorial analysis.
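
The idea of timecoded chunks can be illustrated with a fixed time window. This is a sketch under assumptions: the segment fields (start, end, text) and the 300-second window are hypothetical, and scripts/prepare_transcript_analysis.py may chunk differently.

```python
def chunk_segments(segments, window_seconds=300.0):
    """Group time-ordered segments into chunks spanning at most window_seconds.

    A single segment longer than the window still becomes its own chunk.
    """
    groups = []
    current = []
    for seg in segments:
        # Close the current chunk if adding this segment would exceed the window.
        if current and seg["end"] - current[0]["start"] > window_seconds:
            groups.append(current)
            current = []
        current.append(seg)
    if current:
        groups.append(current)
    return [
        {
            "start": group[0]["start"],
            "end": group[-1]["end"],
            "text": " ".join(s["text"].strip() for s in group),
        }
        for group in groups
    ]
```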

Cut selected clips

After pi writes dist/analysis/ep006/clips.md, cut original-aspect review exports:

uv run python scripts/cut_clips.py \
  "episode-006-draft.mp4" \
  dist/analysis/ep006/clips.md \
  dist/analysis/ep006/clips/cuts

For simple vertical Shorts/TikTok/Reels review exports:

uv run python scripts/cut_clips.py \
  "episode-006-draft.mp4" \
  dist/analysis/ep006/clips.md \
  dist/analysis/ep006/clips/shorts \
  --aspect vertical \
  --pad-start 1 \
  --pad-end 1

The cutter writes generated media plus manifest.json. Vertical and square modes use center-crop framing, so treat them as review exports unless the framing has been checked.
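
To see why center-crop framing needs a check, it helps to work out how much of the frame survives. The sketch below computes the largest centered region for a target aspect ratio and formats it as an ffmpeg crop filter (crop=w:h:x:y). The function names are hypothetical and cut_clips.py may compute its crop differently.

```python
def center_crop(width, height, target_w=9, target_h=16):
    """Return (crop_w, crop_h, x, y) for the largest centered target_w:target_h region."""
    if width * target_h > height * target_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = height
        crop_w = height * target_w // target_h
    else:
        # Source is as narrow as or narrower than the target: keep full width.
        crop_w = width
        crop_h = width * target_h // target_w
    # Round down to even sizes, which common 4:2:0 encodes require.
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    x = (width - crop_w) // 2
    y = (height - crop_h) // 2
    return crop_w, crop_h, x, y


def ffmpeg_crop_filter(width, height, target_w=9, target_h=16):
    """Build an ffmpeg crop filter string in the form crop=w:h:x:y."""
    w, h, x, y = center_crop(width, height, target_w, target_h)
    return f"crop={w}:{h}:{x}:{y}"
```

For a 1920x1080 source, the vertical crop is 606 pixels wide, under a third of the frame, so a host seated off-center can easily fall outside it.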

Publish to YouTube

One-time setup:

  • Create a Google Cloud project with the YouTube Data API v3 enabled.
  • Create a Desktop-app OAuth client and save the downloaded JSON to ~/.config/podguy/youtube/client_secret.json.
  • On the OAuth consent screen, keep the app in Testing mode and add your own Google account as a test user. Otherwise the auth flow fails with access_denied.

Then authenticate:

uv sync --group youtube
uv run --group youtube python scripts/youtube_publish.py auth
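
If the auth step fails early, it can help to confirm that the client file is in place and is the right kind of client. The sketch below is a hypothetical helper, not part of the repo. It relies on the fact that Google's downloaded JSON for a Desktop-app client has a top-level "installed" key, while a web client uses "web".

```python
import json
from pathlib import Path

CLIENT_SECRET = Path.home() / ".config/podguy/youtube/client_secret.json"


def check_client_secret(path=CLIENT_SECRET):
    """Return a list of problems found with the OAuth client file (empty if none)."""
    path = Path(path)
    if not path.exists():
        return [f"{path} does not exist"]
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    if "installed" not in data:
        return ["no 'installed' key; was this created as a Desktop-app client?"]
    return []


if __name__ == "__main__":
    for problem in check_client_secret():
        print("Problem:", problem)
```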

Upload an episode (private by default; use --dry-run first to preview the request):

uv run --group youtube python scripts/youtube_publish.py upload \
  "episode-006-final.mp4" \
  --title "Ep 6: Why this market flipped" \
  --description-file dist/analysis/ep006/youtube-description.md \
  --chapters-file dist/analysis/ep006/chapters.md

Other subcommands cover thumbnails, SRT captions, playlists, scheduled publishing (--publish-at), status checks, and metadata updates. Defaults like privacy, category, tags, and a description footer come from the [youtube] section of podguy.toml.

Each upload costs 1600 of the default 10000 daily YouTube API quota units, and videos uploaded through unverified API projects may stay locked private until the project passes a YouTube API audit.
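
The quota figures above translate into a hard daily ceiling on uploads. A small worked calculation, using only the two numbers stated above and ignoring the cost of any other API calls:

```python
DAILY_QUOTA_UNITS = 10_000  # default daily quota, per the figure above
UPLOAD_COST_UNITS = 1_600   # cost of one upload, per the figure above


def uploads_per_day(quota=DAILY_QUOTA_UNITS, cost=UPLOAD_COST_UNITS):
    """Whole uploads that fit in one day's quota, ignoring other API calls."""
    return quota // cost


def units_left_after(uploads, quota=DAILY_QUOTA_UNITS, cost=UPLOAD_COST_UNITS):
    """Quota units remaining after the given number of uploads."""
    return quota - uploads * cost


if __name__ == "__main__":
    n = uploads_per_day()
    print(f"{n} uploads per day, {units_left_after(n)} units left over")
```

That works out to six uploads per day with 400 units to spare for thumbnails, captions, and metadata calls.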

Download real sample media

Use the Cordkillers open-license video-podcast excerpt for local evaluation:

scripts/download_sample_media.sh

This writes to:

dist/test-fixtures/open-license/cordkillers-572/

The default sample is a 3m50s excerpt starting at 00:08:00 of Cordkillers 572, licensed CC BY-SA 4.0. The range includes multiple podcast layouts, lower thirds, chat/sidebar graphics, a Patreon bumper, and an outro/interstitial card. The script writes ATTRIBUTION.md next to the generated media.

Configuration

podguy.toml lets you define show-specific context without changing the workflow:

show_name = "Example Podcast"
show_slug = "example"
hosts = ["Host One", "Host Two"]
tone = "curious, direct, practical"
audience = "builders and technical operators"
chapter_style = "concise descriptive titles"
preferred_review = "quick_pass"

podcast.toml is also accepted as a compatible profile name.

Project layout

Development

Run the full validation surface:

npm run format:check
npm run lint
npm run typecheck
npm test

Run the shell smoke tests directly:

bash scripts/test.sh

CI runs the same checks on macOS because the scanner depends on macOS media APIs.

Contributing

Small, focused PRs are welcome. Before opening a PR, run the validation commands above.

For workflow or heuristic changes, include:

  • media type and OS/backend details
  • expected vs actual output
  • relevant transcript, scan, or manifest paths when available

See CHANGELOG.md for user-visible changes and AGENTS.md for repo maintenance guidance.

Security

This repo does not have a published security policy yet. If you find a sensitive issue, do not open a public issue. Contact the maintainers privately first.

Sponsor

Sponsored by Modem.

License

MIT. See LICENSE.

Support

Use GitHub issues for bugs, questions, and workflow discussion.
