Production System

Obsidian-Agent-Post: Voice-Consistent AI Content Pipeline

System Architect & Builder | 4 months

Voice Match Rate: 89% avg (31 checkpoints, 4 categories)

CQS Quality Score: 75+ avg across 6 dimensions (up from 69.1 baseline)

Research Speed: 10 sources in <90 sec (vs 4+ hours manual)

Multi-Channel Output: 6 channels per topic, format-adapted

Karpathy Validation: 80 findings, 0 false positives across 4 rounds

AI Slop Reduction: 90% (7/10 flagged pre-pipeline, 1/10 post)

The Challenge

AI-generated content has a credibility problem: it sounds generic, lacks personal voice, and fails silently on quality. Most AI content tools optimize for speed, not substance. The result is 'AI slop' -- technically correct but indistinguishable from any other LLM output. I needed a system that could generate content across 6 channels (LinkedIn, Blog, Twitter/X, Substack, Medium, GitHub) while maintaining a consistent personal voice, scoring quality across multiple dimensions, and catching issues that single-pass LLM review misses.

The Approach

Built a 5-layer AI content pipeline using a multi-agent development framework with strict contracts:

(1) Multi-Source Research -- 10 parallel sources (HackerNews, Reddit, ArXiv, GitHub, YouTube, RSS, Obsidian vault, Brave, newspaper4k, Google Drive) with semantic deduplication and conflict detection, completing in <90 seconds.

(2) CQS Scoring Engine -- 6-dimension quality rubric (Clarity, Quality, Substance, Voice, Structure, Impact) with hard gates (voice match >=70%, word count compliance) and soft scoring, calibrated against published content.

(3) Voice Consistency -- 31 checkpoints across 4 categories (Core Voice 9, Theoretical Content 10, Practical Content 11, Optional 2) ensuring every post sounds like me, not generic AI.

(4) Karpathy 3-Loop Validation -- 4 rounds with 11 independent audit agents performing structural, adversarial, and logical validation. 80 verified findings, 0 false positives.

(5) Multi-Channel Generation -- single research topic adapted to 6 channel-specific formats (LinkedIn 400-word hooks, Blog 1500-word deep-dives, Twitter threads, Substack newsletters, Medium articles, GitHub READMEs).

Tech: FastAPI + SQLModel + PostgreSQL + Next.js 14.
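To make the interplay of hard gates and soft scoring in layer (2) concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the pipeline's actual code: the `Draft` class, the `evaluate` function, the 0-100 score scale, the equal weighting of the six dimensions, and the per-channel word limits are all hypothetical. Only the six dimension names, the 70% voice gate, and the 75+ target come from the description above.

```python
from dataclasses import dataclass
from statistics import mean

# Dimension names, the voice gate, and the 75+ target come from the case study.
DIMENSIONS = ("clarity", "quality", "substance", "voice", "structure", "impact")
VOICE_MATCH_GATE = 0.70
TARGET_CQS = 75.0


@dataclass
class Draft:
    """Hypothetical container for one generated draft and its scores."""
    scores: dict[str, float]      # assumed 0-100 scale per dimension
    voice_match: float            # fraction of voice checkpoints passed
    word_count: int
    word_limits: tuple[int, int]  # assumed (min, max) for the target channel


def evaluate(draft: Draft) -> dict:
    """Apply hard gates first, then an equally weighted soft score (assumption)."""
    missing = [d for d in DIMENSIONS if d not in draft.scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")

    # Hard gates: any failure rejects the draft regardless of its score.
    gate_failures = []
    if draft.voice_match < VOICE_MATCH_GATE:
        gate_failures.append("voice_match")
    low, high = draft.word_limits
    if not low <= draft.word_count <= high:
        gate_failures.append("word_count")

    # Soft score: plain average of the six dimensions.
    cqs = mean(draft.scores[d] for d in DIMENSIONS)
    passed = not gate_failures and cqs >= TARGET_CQS
    return {"cqs": cqs, "gate_failures": gate_failures, "passed": passed}
```

Keeping the gates separate from the average means a draft with strong dimension scores still fails if it misses the voice threshold, which is the behavior a hard gate is meant to guarantee.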

Key Learnings

Voice consistency cannot be LLM self-corrected -- explicit checkpoints (31 rules across 4 categories) prevent 90% of generic AI output
Multi-dimensional scoring (CQS with 6 dimensions) catches quality issues that single-metric systems and LLM self-review miss entirely
Karpathy multi-round validation (4 rounds, 11 independent agents) surfaces issues that single-pass review cannot detect -- convergence pattern: 15, 33, 18, 14 findings per round
FAILURE: Initial single-pass generation produced 69.1 avg CQS -- unacceptable quality variance. Added 3-loop validation with hard gates to reach 75+ consistently.
Dogfooding proof: every article on sathyan.ai was produced by this pipeline -- the content IS the evidence that the system works
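The first learning, that voice needs explicit checkpoints and not LLM self-correction, can be sketched as a set of deterministic rules grouped by category. This is a hedged illustration only: the three rules below, the `Checkpoint` class, and the `voice_match` function are hypothetical stand-ins, not the 31 checkpoints the pipeline uses. The category names are taken from the case study.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Checkpoint:
    """One explicit, deterministic voice rule (hypothetical structure)."""
    name: str
    category: str
    check: Callable[[str], bool]


# Illustrative rules only; the real checkpoints are not listed in the case study.
CHECKPOINTS = [
    Checkpoint("uses_first_person", "Core Voice",
               lambda t: re.search(r"\b(I|my|me)\b", t) is not None),
    Checkpoint("no_stock_opener", "Core Voice",
               lambda t: not t.lower().startswith("in today's")),
    Checkpoint("includes_a_number", "Practical Content",
               lambda t: re.search(r"\d", t) is not None),
]


def voice_match(text: str, checkpoints=CHECKPOINTS):
    """Return the overall pass rate and the pass rate per category."""
    if not checkpoints:
        raise ValueError("at least one checkpoint is required")
    passed = defaultdict(int)
    total = defaultdict(int)
    for cp in checkpoints:
        total[cp.category] += 1
        if cp.check(text):
            passed[cp.category] += 1
    rate = sum(passed.values()) / len(checkpoints)
    by_category = {cat: passed[cat] / total[cat] for cat in total}
    return rate, by_category
```

Because each rule is a plain function, the same draft always yields the same match rate, so the resulting number can feed a fixed threshold such as the 70% voice gate.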

“Every article on this site was produced by the system described in this case study. The content you are reading IS the proof that the system works.”

— Self-referential proof -- dogfooding

Read the Deep Dive

The 5D Method: How I Ship AI Products That Actually Work in Production (6 min read)

Context Engineering: The 2025 Skill That Replaced Prompt Engineering (8 min read)