Independent Product | Featured

Podcast-to-Pack: 18-Piece Content Pack from One Episode in Under 2 Minutes

Solo Builder & Product Architect

3 months (V1.0 → V1.2)

Content Output: 18 pieces per episode (10 formats)

Processing Cost: $0.56-0.71/episode (75-95% cheaper than competitors)

Processing Speed: <2 minutes per episode

Award: Strongest Business Case (vs 40+ competing projects)

The Challenge

Recording a podcast takes hours. Repurposing it takes just as long. A single 45-minute episode contains enough material for a full week of social media posts, newsletter content, blog articles, and quote cards, but most podcasters never extract it. They publish the episode and move on, leaving 90% of the content value on the table. The podcasters who do repurpose spend 5-30+ hours per episode manually transcribing, pulling quotes, rewriting for different platforms, and formatting for each channel's requirements. Hiring a content team costs $2,000-5,000/month.
The existing tools, Castmagic ($29-299/mo), Swell AI ($29-79/mo), and Podium (free-$19), are expensive for what they deliver, slow (5-30 minutes per episode), and produce generic AI-sounding outputs with no quality validation. None offered an API for integration, automated quality checks, voice-matched content that preserves the podcaster's style, or platform-specific formatting. The market had a clear gap: a tool that could transform one episode into a complete content pack that is fast, cheap, quality-validated, and sounds like the podcaster, not like ChatGPT.

The Approach

Designed and built the entire system solo: product vision, architecture, backend, frontend, security, and deployment. The core innovation is the Local Sandwich Pattern. Tier 1 (local) handles transcript cleaning and speaker segmentation at $0. Tier 2 (local) extracts show notes, takeaways, and quote cards at $0. Tier 3 (cloud) generates creative content (threads, posts, articles, newsletters) using Claude Sonnet for quality. Tier 4 (local) runs validation and compliance at $0. Result: 60% of operations are free, and the total cost per episode is $0.56-0.71.
The pipeline processes audio through Deepgram Nova-2 with speaker diarization ($0.39/episode), extracts structured data via Gemini 2.5 Flash with FallbackModel auto-failover ($0.009/episode), generates 18 pieces across 10 content types using Claude Sonnet ($0.15-0.25/episode), then validates every piece through 7 automated checkers: compliance, readability, deduplication, quality scoring, fact-checking, platform rules, and quote accuracy. A circuit breaker caps regeneration at 3 attempts per piece to prevent runaway API costs.
The architecture is FastAPI + Pydantic AI with 4 consolidated tools, reduced from an original 34-tool design after the agent began hallucinating tool selections at that scale. PostgreSQL with async SQLAlchemy handles persistence, ARQ + Redis manages background job queues, and Server-Sent Events deliver real-time processing progress to the React 19 frontend.
The Karpathy Loop runs asynchronously between episodes: it analyzes generation quality metrics, detects failure patterns across runs, and optimizes prompts without human intervention.
Security hardening covers BOLA prevention, prompt injection defense (sandwich pattern + 28-pattern sanitizer), PII redaction via Microsoft Presidio, spend cap enforcement per pricing tier, and full GDPR compliance with data export and deletion.
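The validate-then-regenerate loop with its 3-attempt circuit breaker could be sketched roughly as below. This is a minimal illustration, not the project's actual code: `generate_with_circuit_breaker`, `PieceResult`, the two toy validators, and `fake_generator` are all hypothetical names, and the real system runs seven checkers against Claude Sonnet output rather than a stub.

```python
from dataclasses import dataclass
from typing import Callable

MAX_ATTEMPTS = 3  # regeneration cap per piece, as stated in the write-up

# A validator takes the draft text and returns a list of problems (empty = pass).
Validator = Callable[[str], list[str]]


@dataclass
class PieceResult:
    text: str
    attempts: int
    passed: bool
    issues: list[str]


def generate_with_circuit_breaker(
    generate: Callable[[list[str]], str],
    validators: list[Validator],
    max_attempts: int = MAX_ATTEMPTS,
) -> PieceResult:
    """Regenerate a piece until every validator passes or the cap is hit."""
    feedback: list[str] = []
    text = ""
    for attempt in range(1, max_attempts + 1):
        text = generate(feedback)  # one (paid) generation call per attempt
        feedback = [issue for check in validators for issue in check(text)]
        if not feedback:
            return PieceResult(text, attempt, True, [])
    # Circuit breaker: stop paying for retries and surface the remaining issues.
    return PieceResult(text, max_attempts, False, feedback)


# Hypothetical stand-ins for two of the seven checkers.
def platform_rules(text: str) -> list[str]:
    return ["over 280 characters"] if len(text) > 280 else []


def readability(text: str) -> list[str]:
    return ["empty output"] if not text.strip() else []


def fake_generator(feedback: list[str]) -> str:
    # Pretend the model shortens the draft once it is told it was too long.
    return "Short post." if feedback else "x" * 300


result = generate_with_circuit_breaker(fake_generator, [platform_rules, readability])
print(result.attempts, result.passed)  # 2 True
```

Passing the previous attempt's issues back into `generate` is one plausible design; the cap simply bounds the worst-case spend per piece at three generation calls.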
The business model is freemium: a free tier (3 episodes/month) for activation, scaling to Starter ($19/mo, 10 episodes, 84% margin), Creator ($49/mo, 30 episodes, 82% margin), and Pro ($99/mo, 100 episodes, 70% margin). Infrastructure runs on a single Hetzner CX22 VPS at $8.49/month via Coolify, load-tested to support 150 concurrent users. The entire system — 1,087 tests passing (1,045 backend + 42 frontend), strict mypy and pyright, CI/CD with pip-audit and Gitleaks — was built in 3 months. It won the 'Strongest Business Case' award in the Agentic AI Product Management cohort, evaluated on both technical architecture and business viability against 40+ competing projects.
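The write-up does not say how the per-plan margins were derived. As a rough check, the sketch below assumes every included episode is used and a variable cost of $0.30 per episode (the upper end of the cost-to-serve range quoted under Key Learnings), ignoring fixed infrastructure. `COST_PER_EPISODE` and `gross_margin` are illustrative names, not part of the product.

```python
# Plan prices and episode quotas come from the write-up;
# the per-episode cost is an assumption made for this check.
COST_PER_EPISODE = 0.30  # USD, assumed

PLANS = {
    "Starter": (19, 10),   # (monthly price in USD, episodes included)
    "Creator": (49, 30),
    "Pro": (99, 100),
}


def gross_margin(price: float, episodes: int,
                 cost_per_episode: float = COST_PER_EPISODE) -> float:
    """Fraction of plan revenue left after variable per-episode costs."""
    return (price - episodes * cost_per_episode) / price


margins = {
    name: round(gross_margin(price, episodes) * 100)
    for name, (price, episodes) in PLANS.items()
}
print(margins)  # {'Starter': 84, 'Creator': 82, 'Pro': 70}
```

Under those assumptions the figures line up with the quoted 84%, 82%, and 70%. Real margins would be higher whenever subscribers use fewer episodes than their quota.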

Key Learnings

Local Sandwich Pattern cuts costs 75-95% vs competitors — by keeping 60% of pipeline operations local (transcript cleaning, speaker segmentation, quote extraction, all 7 validators), only creative generation hits cloud LLMs. This makes the free tier viable at $0.18-0.30/episode cost to serve.
4 consolidated tools outperform 34 individual tools — the 88% reduction eliminated agent confusion entirely. When the Pydantic AI agent had 34 tools to choose from, it hallucinated tool selections and chained tools incorrectly. Four tools with clear boundaries and explicit docstrings solved the problem.
FAILURE: Initial single-model architecture used Claude Sonnet for everything including extraction. Switching extraction to Gemini 2.5 Flash ($0.009/episode vs $0.08 with Sonnet) cut costs 89% with no quality loss, because extraction is deterministic — it does not need creative intelligence.
7-validator quality system with circuit breaker (max 3 regeneration attempts per piece) catches issues that single-pass LLM review misses. Each validator handles one concern: compliance, readability, deduplication, quality scoring, fact-checking, platform rules, and quote accuracy.
Karpathy Loop running asynchronously between episodes creates a continuous improvement flywheel — it aggregates quality metrics across generation runs, detects recurring failure patterns, and rewrites prompts automatically. No human intervention needed.
Speaker diarization quality is the upstream bottleneck for everything downstream — Deepgram Nova-2's model produced 40% more accurate analysis results than Whisper because it correctly attributes quotes to speakers, which every downstream content piece depends on.
Freemium activation works when cost-to-serve is low enough: the free tier (3 episodes/month) costs $0.54-0.90/month to serve, and comparable SaaS benchmarks suggest 35% conversion to paid within 90 days.
1,087 tests (1,045 backend + 42 frontend) with strict mypy and pyright — type safety at this scale caught 12 integration bugs before they reached production, including a silent data corruption bug in the JSON serialization of multi-speaker transcripts.
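Several of the percentages above follow directly from figures given in the text. The short check below recomputes them; `pct_reduction` is just an illustrative helper.

```python
def pct_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


tools = pct_reduction(34, 4)            # 34 tools consolidated into 4
extraction = pct_reduction(0.08, 0.009)  # Sonnet vs Gemini 2.5 Flash, USD per episode

# Free tier: 3 episodes/month at $0.18-0.30 per episode.
free_low, free_high = 3 * 0.18, 3 * 0.30

print(f"{tools:.0f}% {extraction:.0f}% ${free_low:.2f}-${free_high:.2f}")
# 88% 89% $0.54-$0.90
```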

“Evaluated on both technical architecture and business model viability. The combination of the Local Sandwich Pattern for cost efficiency, production-grade security hardening, and a freemium model with 82-92% gross margins demonstrated the strongest business case in the cohort.”

— Strongest Business Case Award — Agentic AI PM Cohort (40+ participants)