Portfolio

Case studies from my work at AT&T, Walmart, Lowe's, and more. Real results from real projects at enterprise scale.

Fortune 100 Telecom | Featured

Agentic AI Product Strategy for Fortune 100 Telecom

Product Leader, AI Strategy | Active engagement

Product Managers were spending 40% of their time on repetitive documentation tasks — writing epics and features in Azure DevOps, creating user stories with acceptance criteria, and drafting process flows. This created PM bottlenecks, inconsistent quality across a 600+ person organization, and slow velocity from concept to dev-ready.

PMs in Rollout: ~18
Agents Operational: 3 of 3
Organization Size: 600+ PMs
Status: Active & Expanding

Open Evaluation | Featured

AI Agent Evaluation Framework (HHH+)

Framework Designer | 3 months

AI agents were being deployed without standardized evaluation, leading to ad-hoc assessments, safety concerns, and inconsistent quality. Needed a systematic framework to evaluate AI agents across safety, fairness, and reliability dimensions before deployment.

Evaluation Pillars: 7 (HHH+)
Red Team Protocol: Structured adversarial testing
Production Clients: 2
Status: Active framework

Fortune 500 Retailer | Featured

Digital Platform Transformation: 39% Revenue Increase, 300M Peak Hits

Head of Product | 7 months

The client's digital platform faced three problems: a legacy WebSphere architecture that couldn't scale for peak traffic; price inconsistencies across web, mobile app, and stores that caused cart abandonment; and a batch-based marketing system that couldn't do real-time personalization. $2B+ in projected annual digital revenue was at risk.

Revenue YoY: +39%
Peak Traffic: 300M hits
Page Load: 50% faster
Marketing Forecast: Projected $68M benefit

Fortune 100 Telecom

National Fiber Expansion: Product Strategy for 1.5M New Customers

Group Product Manager | 9 months

The company wanted to expand fiber connectivity from 21 states to all 48 contiguous states through a joint venture partnership. This required new ordering systems, partner integration with a different technical stack, regulatory compliance across states, and a seamless customer experience regardless of infrastructure owner.

Expansion: 21 → 48 states
New Customers: 1.5M potential (over 10 years)
Team Managed: 22 PMs
Milestones On-Time: 94%

Personal Project

Claude Code AI System: Multi-Source Research Agent

Builder & Architect | Active

Needed a system that could research comprehensively from multiple sources (not just Google), detect when sources disagree on facts, generate content in a consistent personal voice, and develop features autonomously — plan, implement, validate, and commit code.
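As a rough illustration of the "detect when sources disagree" requirement, here is a minimal Python sketch. It is not the project's code: the `find_disagreements` function and the `(source, fact, value)` claim shape are assumptions made for this example, and real claims would need far more normalization than lowercasing.

```python
from collections import defaultdict


def find_disagreements(
    claims: list[tuple[str, str, str]],
) -> dict[str, dict[str, list[str]]]:
    """Group (source, fact, value) claims and keep facts with more than one value.

    Hypothetical helper: values are compared after trimming and lowercasing only.
    """
    by_fact: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for source, fact, value in claims:
        by_fact[fact][value.strip().lower()].append(source)
    # A fact is contested when sources reported at least two distinct values.
    return {fact: dict(values) for fact, values in by_fact.items() if len(values) > 1}
```

Each contested fact maps every reported value to the sources that stated it, so a downstream step can surface the conflict instead of silently picking one answer.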

Research Sources: 10 parallel
Research Speed: <90 seconds
Dev Workflow: 4-stage autonomous
Validation: 5-level pyramid

Personal Project | Featured

Podcast-to-Pack: AI-Powered Podcast Analysis Platform (Award Winner)

Builder & Product Architect | 3 months

Podcast listeners struggle to extract actionable insights from hours of content. Manual note-taking is slow, inconsistent, and misses connections. Needed a system that could analyze podcasts, extract key insights, identify themes, and generate structured deliverables — all while handling messy real-world audio and maintaining accuracy.

Research Time: 6h → 90min (75% reduction)
Award: Strongest Business Case (Agentic AI PM cohort, business model + tech vs. 40+ competitors)
Architecture: FastAPI + Pydantic AI
User Feedback: 4.8/5 avg rating (n=28 beta users)

Personal Project

Second Brain: Personal AI Assistant with 3-Layer Security

Builder & Security Architect | Active

Personal assistants need access to sensitive data (calendar, emails, vault notes) to be useful, but LLMs are vulnerable to prompt injection attacks. Needed a system that could integrate Google Calendar, Slack, and Obsidian vault memory while defending against adversarial prompts trying to leak secrets or manipulate outputs.
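To make the layered-defense idea concrete, here is a minimal Python sketch. The three layers shown (input pattern screening, delimiting retrieved content as untrusted data, and output redaction) are assumptions chosen for illustration, since the actual layers are not named here. Every pattern and function name is hypothetical.

```python
import re

# Hypothetical patterns; a real deployment would need a broader, tested set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your )?(system prompt|secrets?)", re.IGNORECASE),
]
# Assumed shape of an API key, used only to demonstrate redaction.
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{8,}")


def screen_input(text: str) -> bool:
    """Layer 1: return True if the text matches no known injection pattern."""
    return not any(p.search(text) for p in INJECTION_PATTERNS)


def wrap_untrusted(text: str) -> str:
    """Layer 2: delimit retrieved content so the prompt can treat it as data."""
    return f"<untrusted>\n{text}\n</untrusted>"


def redact_output(text: str) -> str:
    """Layer 3: mask anything that looks like a secret before it is returned."""
    return SECRET_PATTERN.sub("[REDACTED]", text)
```

Pattern matching alone will miss novel attacks, which is why the sketch also redacts on the way out: a prompt that slips past the first layer still cannot return a string shaped like a key.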

Security Incidents: 0 (baseline: 3 failed attempts in first week)
Injection Patterns Blocked: 15 of 15 adversarial tests
Uptime: 94% → 99.8% (after circuit breaker implementation)
Response Accuracy: 91% (with safety filters vs 96% without; acceptable trade-off)

Personal Project | Featured

MakeYourApp.ai: Sketch + Voice to React App in 60 Seconds

Builder & UX Architect | 2 months

Non-technical users have app ideas but can't prototype them. Wireframing tools require design skills, and no-code builders have steep learning curves. Needed a system where someone could sketch a UI on paper, describe it verbally, and get a working React app in under 60 seconds — no coding, no tutorials, just natural interaction.

Time to Preview: 3min → 60sec (67% faster than no-code tools)
UI Extraction Accuracy: 89% component detection (GPT-4o Vision)
Code Quality: 92% of generated components work without edits
User Skill Required: None (no coding); React apps in 60sec

Personal Project | Featured

Obsidian-Agent-Post: Voice-Consistent AI Content Pipeline

System Architect & Builder | 4 months

Creating consistent, high-quality content at scale requires maintaining voice, avoiding generic AI slop, and validating quality before publishing. Manual editing is slow; unchecked LLM output lacks personality. Needed a system with 31 voice checkpoints, 6-dimension quality scoring (CQS), and a Karpathy-style validation loop, all while generating blog posts and LinkedIn content in a consistent personal voice.
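A 6-dimension quality gate of this kind could look roughly like the Python sketch below. The dimension names, the equal weighting, and the 75-point threshold on a 0-100 scale are all assumptions for illustration; only the number of dimensions comes from the description above.

```python
from statistics import mean

# Hypothetical dimension names; only the count of six comes from the project.
DIMENSIONS = ("voice", "clarity", "specificity", "structure", "accuracy", "originality")
PUBLISH_THRESHOLD = 75.0  # assumed cut-off on a 0-100 scale


def composite_score(scores: dict[str, float]) -> float:
    """Average the six dimension scores equally; reject incomplete rubrics."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    return round(mean(scores[d] for d in DIMENSIONS), 1)


def passes_gate(scores: dict[str, float]) -> bool:
    """Return True when the draft clears the assumed publishing threshold."""
    return composite_score(scores) >= PUBLISH_THRESHOLD
```

A draft that fails the gate would go back for another revision pass rather than being published, which is the part a validation loop automates.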

Generic AI Slop Reduction: 86% (pre-checkpoints: 7/10 flagged, post: 1/10)
CQS Quality Score: 69.1 → 75+ avg (6-dimension rubric)
Research Time: Manual 4h → Automated 90sec (over 99% time savings)
Content Consistency: 31 voice checkpoints, 89 test cases validating quality

Open Source

AAMAD Framework: Multi-Agent Development Methodology (PyPI Published)

Framework Creator & Maintainer | Ongoing

AI-assisted development with multiple agents (Product Manager, Backend Dev, Frontend Dev, QA, DevOps) lacks standardized contracts, leading to context drift, missed requirements, and inconsistent quality. Needed a framework with strict persona contracts, reproducibility, provenance tracking, and adapter abstraction for CrewAI, LangGraph, or prompt-based execution.
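The combination of persona contracts and adapter abstraction can be pictured with a short Python sketch. This is not AAMAD's published API: `PersonaContract`, `Adapter`, `EchoAdapter`, and `execute` are hypothetical names, and the stand-in adapter calls no real framework.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class PersonaContract:
    """Hypothetical contract: the keys a persona must return for a task."""
    persona: str
    required_keys: frozenset[str]


class Adapter(Protocol):
    """Any execution backend that can run a persona on a task."""
    def run(self, contract: PersonaContract, task: str) -> dict[str, str]: ...


class EchoAdapter:
    """Stand-in backend for illustration; a real one would call a framework."""
    def run(self, contract: PersonaContract, task: str) -> dict[str, str]:
        return {key: f"{contract.persona}: {key} for {task}" for key in contract.required_keys}


def execute(adapter: Adapter, contract: PersonaContract, task: str) -> dict[str, str]:
    """Run the task and fail loudly if the output drifts from the contract."""
    output = adapter.run(contract, task)
    missing = contract.required_keys - set(output)
    if missing:
        raise ValueError(f"{contract.persona} output missing: {sorted(missing)}")
    return output
```

Because `execute` depends only on the `Adapter` protocol, the same contract check applies whichever backend runs the persona, and drift surfaces as an error instead of a quietly incomplete deliverable.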

Agent Drift Reduction: 83% (pre-AAMAD: 6/10 off-spec, post: 1/10)
Framework Adoption: Published to PyPI, 3 production deployments
Test Coverage: 13 test plans, 89 test cases validating contracts
Adapter Compatibility: 3 adapters (CrewAI, LangGraph, Prompt-based)