Portfolio

Production AI systems I built and enterprise case studies from Fortune 100 companies including Walmart, Lowe's, and more.

Featured Project
Independent Product

Podcast-to-Pack: 18-Piece Content Pack from One Episode in Under 2 Minutes

Solo Builder & Product Architect

Recording a podcast takes hours. Repurposing it takes just as long. A single 45-minute episode contains enough material for a full week of social media posts, newsletter content, blog articles, and quote cards, but most podcasters never extract it. They publish the episode and move on, leaving 90% of the content value on the table. Those who do repurpose spend 5-30+ hours per episode manually transcribing, pulling quotes, rewriting for different platforms, and formatting for each channel's requirements. Hiring a content team costs $2,000-5,000/month.

The existing tools, Castmagic ($29-299/mo), Swell AI ($29-79/mo), and Podium (free-$19), were expensive for what they delivered, slow (5-30 minutes per episode), and produced generic, AI-sounding output with no quality validation. None offered an API for integration, automated quality checks, voice-matched content that preserves the podcaster's style, or platform-specific formatting. The market had a clear gap: a tool that could transform one episode into a complete content pack that is fast, cheap, quality-validated, and sounds like the podcaster, not like ChatGPT.

Content Output: 18 pieces per episode (10 formats)
Processing Cost: $0.56-0.71/episode (75-95% cheaper than competitors)
Fortune 100 Company

Agentic AI Product Strategy for Fortune 100 Company

Product Leader, AI Strategy | Active engagement

Product Managers were spending 40% of their time on repetitive documentation tasks — writing epics and features in Azure DevOps, creating user stories with acceptance criteria, and drafting process flows. This created PM bottlenecks, inconsistent quality across a 600+ person organization, and slow velocity from concept to dev-ready.

PMs in Rollout: ~18
Agents Operational: 3 of 3
Organization Size: 600+ PMs
Status: Active & Expanding
Open Evaluation

AI Agent Evaluation Framework

Framework Designer | 3 months

AI agents were being deployed without standardized evaluation, leading to ad-hoc assessments, safety concerns, and inconsistent quality. A systematic framework was needed to evaluate AI agents across safety, fairness, and reliability dimensions before deployment.

Evaluation Pillars: 7 (HHH+)
Red Team Protocol: Structured adversarial testing
Production Clients: 2
Status: Active framework
Fortune 100 Company

National Fiber Expansion: Product Strategy for 1.5M New Customers

Group Product Manager | 9 months

The company wanted to expand fiber connectivity from 21 states to all 48 contiguous states through a joint venture partnership. This required new ordering systems, partner integration with a different technical stack, regulatory compliance across states, and a seamless customer experience regardless of infrastructure owner.

Expansion: 21 → 48 states
New Customers: 1.5M potential (over 10 years)
Team Managed: 22 PMs
Milestones On-Time: 94%
Fortune 500 Retailer

Digital Platform Transformation: 39% Revenue Increase, 300M Peak Hits

Head of Product | 7 months

The client's digital platform had three problems: a legacy WebSphere architecture that couldn't scale for peak traffic; price inconsistencies across web, mobile app, and stores that caused cart abandonment; and a batch-based marketing system that couldn't support real-time personalization. More than $2B in projected annual digital revenue was at risk.

Revenue YoY: +39%
Peak Traffic: 300M hits
Page Load: 50% faster
Marketing Forecast: Projected $68M benefit
Independent R&D

Claude Code AI System: Multi-Source Research Agent

Builder & Architect | Active

I needed a system that could research comprehensively from multiple sources (not just Google), detect when sources disagree on facts, generate content in a consistent personal voice, and develop features autonomously: plan, implement, validate, and commit code.

Research Sources: 10 parallel
Research Speed: <90 seconds
Dev Workflow: 4-stage autonomous
Validation: 5-level pyramid
Production System

Second Brain: Personal AI Assistant with 3-Layer Security

Builder & Security Architect | Active

Personal assistants need access to sensitive data (calendar, emails, vault notes) to be useful, but LLMs are vulnerable to prompt injection attacks. I needed a system that could integrate Google Calendar, Slack, and Obsidian vault memory while defending against adversarial prompts that try to leak secrets or manipulate outputs.

Security Incidents: 0 (baseline: 3 failed attempts in first week)
Injection Patterns Blocked: 15 of 15 adversarial tests
Uptime: 94% → 99.8% (after circuit breaker implementation)
Response Accuracy: 91% with safety filters vs 96% without (acceptable trade-off)
Independent Product

MakeYourApp.ai: Sketch + Voice to React App in 60 Seconds

Builder & UX Architect | 2 months

Non-technical users have app ideas but can't prototype them. Wireframing tools require design skills, and no-code builders have steep learning curves. I needed a system where someone could sketch a UI on paper, describe it verbally, and get a working React app in under 60 seconds: no coding, no tutorials, just natural interaction.

Time to Preview: 3 min → 60 sec (67% faster than no-code tools)
UI Extraction Accuracy: 89% component detection (GPT-4o Vision)
Code Quality: 92% of generated components work without edits
User Skill Required: None (no coding) → React apps in 60 sec
Production System

Obsidian-Agent-Post: Voice-Consistent AI Content Pipeline

System Architect & Builder | 4 months

AI-generated content has a credibility problem: it sounds generic, lacks personal voice, and fails silently on quality. Most AI content tools optimize for speed, not substance. The result is 'AI slop': technically correct but indistinguishable from any other LLM output. I needed a system that could generate content across 6 channels (LinkedIn, Blog, Twitter/X, Substack, Medium, GitHub) while maintaining a consistent personal voice, scoring quality across multiple dimensions, and catching issues that single-pass LLM review misses.

Voice Match Rate: 89% avg (31 checkpoints, 4 categories)
CQS Quality Score: 75+ avg across 6 dimensions (up from 69.1 baseline)
Research Speed: 10 sources in <90 sec (vs 4+ hours manual)
Multi-Channel Output: 6 channels per topic, format-adapted