Production System

Obsidian-Agent-Post: Voice-Consistent AI Content Pipeline

System Architect & Builder | 4 months
Voice Match Rate: 89% avg (31 checkpoints, 4 categories)
CQS Quality Score: 75+ avg across 6 dimensions (up from 69.1 baseline)
Research Speed: 10 sources in <90 sec (vs 4+ hours manual)
Multi-Channel Output: 6 channels per topic, format-adapted
Karpathy Validation: 80 findings, 0 false positives across 4 rounds
AI Slop Reduction: 90% (7/10 flagged pre-pipeline, 1/10 post)

The Challenge

AI-generated content has a credibility problem: it sounds generic, lacks personal voice, and fails silently on quality. Most AI content tools optimize for speed, not substance. The result is 'AI slop' -- technically correct but indistinguishable from any other LLM output. I needed a system that could generate content across 6 channels (LinkedIn, Blog, Twitter/X, Substack, Medium, GitHub) while maintaining a consistent personal voice, scoring quality across multiple dimensions, and catching issues that single-pass LLM review misses.

The Approach

Built a 5-layer AI content pipeline using a multi-agent development framework with strict contracts:

(1) Multi-Source Research -- 10 parallel sources (HackerNews, Reddit, ArXiv, GitHub, YouTube, RSS, Obsidian vault, Brave, newspaper4k, Google Drive) with semantic deduplication and conflict detection, completing in <90 seconds.
(2) CQS Scoring Engine -- 6-dimension quality rubric (Clarity, Quality, Substance, Voice, Structure, Impact) with hard gates (voice match >=70%, word count compliance) and soft scoring, calibrated against published content.
(3) Voice Consistency -- 31 checkpoints across 4 categories (Core Voice 9, Theoretical Content 10, Practical Content 11, Optional 2) ensuring every post sounds like me, not generic AI.
(4) Karpathy 3-Loop Validation -- 4 rounds with 11 independent audit agents performing structural, adversarial, and logical validation. 80 verified findings, 0 false positives.
(5) Multi-Channel Generation -- single research topic adapted to 6 channel-specific formats (LinkedIn 400-word hooks, Blog 1500-word deep-dives, Twitter threads, Substack newsletters, Medium articles, GitHub READMEs).

Tech: FastAPI + SQLModel + PostgreSQL + Next.js 14.
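To make the hard-gate and soft-score split in the CQS layer concrete, here is a minimal sketch. It is not the pipeline's actual code. The `Draft` type, the `evaluate` function, the 0-100 scale, the equal weighting of the six dimensions, and the word-count range are all assumptions made for illustration. Only the six dimension names, the 70% voice-match gate, and the 75 target come from the description above.

```python
from dataclasses import dataclass

# Dimension names taken from the rubric described above.
DIMENSIONS = ("clarity", "quality", "substance", "voice", "structure", "impact")


@dataclass
class Draft:
    """Hypothetical container for one scored draft."""
    scores: dict[str, float]      # assumed 0-100 score per dimension
    voice_match: float            # fraction of voice checkpoints passed, 0-1
    word_count: int
    word_limit: tuple[int, int]   # assumed (min, max) for the target channel


def evaluate(draft: Draft, min_voice: float = 0.70, min_cqs: float = 75.0):
    """Return (passed, cqs, failed_gates) for a draft.

    Hard gates reject a draft outright; the soft score is an
    equal-weight average of the six dimensions (an assumption).
    """
    failed_gates = []
    if draft.voice_match < min_voice:
        failed_gates.append("voice_match")
    low, high = draft.word_limit
    if not low <= draft.word_count <= high:
        failed_gates.append("word_count")

    cqs = sum(draft.scores[d] for d in DIMENSIONS) / len(DIMENSIONS)
    passed = not failed_gates and cqs >= min_cqs
    return passed, cqs, failed_gates
```

A draft that scores well on every dimension still fails if a hard gate trips, which is the point of keeping the gates separate from the averaged score.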

Key Learnings

  • An LLM cannot self-correct for voice consistency -- explicit checkpoints (31 rules across 4 categories) prevent 90% of generic AI output
  • Multi-dimensional scoring (CQS with 6 dimensions) catches quality issues that single-metric systems and LLM self-review miss entirely
  • Karpathy multi-round validation (4 rounds, 11 independent agents) surfaces issues that single-pass review cannot detect -- convergence pattern: 15, 33, 18, 14 findings per round
  • FAILURE: Initial single-pass generation produced 69.1 avg CQS -- unacceptable quality variance. Added 3-loop validation with hard gates to reach 75+ consistently.
  • Dogfooding proof: every article on sathyan.ai was produced by this pipeline -- the content IS the evidence that the system works
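The per-round finding counts listed above can be tallied in a few lines. This is a small arithmetic sketch, not part of the pipeline. The variable names are made up, and treating round 1 as the equivalent of a single-pass review is an assumption.

```python
from itertools import accumulate

# Findings per validation round, as reported in the learnings above.
findings_per_round = [15, 33, 18, 14]

total = sum(findings_per_round)                  # all verified findings
cumulative = list(accumulate(findings_per_round))  # running total after each round

# Assumption: round 1 stands in for a single-pass review, so everything
# found in later rounds is what a single pass would have missed.
missed_by_single_pass = total - findings_per_round[0]

print(total, cumulative, missed_by_single_pass)
```

The total matches the 80 findings reported for the 4 rounds, and under the stated assumption 65 of them surfaced only after the first round.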
Every article on this site was produced by the system described in this case study. The content you are reading IS the proof that the system works.

Self-referential proof -- dogfooding
