Data-Driven SEO: From Quarterly Audit to Custom Tooling and AI Agent Workflows

Built a repeatable SEO operation for coffey.codes from scratch: quantitative audits via the Search Console MCP, structured-data hardening across every page, a four-engine custom snapshot pipeline (GSC, GA4, Bing, Google Ads), and AI agent workflows that produce editorial reports on demand.

SEO · Google Search Console · GA4 · Bing Webmaster Tools · Google Ads API · MCP · Claude Code · Structured Data · JSON-LD · Node.js

The Challenge

Even a personal site needs to be findable, and coffey.codes had no quantitative picture of how it was actually performing in search. The site was indexed and ranking on something, but the harder questions had never been asked: which queries were earning impressions, where clicks were falling off, whether any pages were regressing. SEO posture was a feeling, not a number. The goal was to turn it into a repeatable, data-driven operation: every quarter, rerun the same analysis, compare it to the previous quarter, and surface drift without anyone having to remember which screenshots were taken when.

Phase 1: Quantitative audit

The Q2 audit was a 365-day pull driven entirely through the Google Search Console MCP from inside Claude Code. Every datapoint came from the API, not from the GSC UI's exports. That constraint mattered: it meant the next quarter could rerun the same calls and produce a directly comparable report instead of one human's screenshot pile. Bing Webmaster Tools and GA4 were added alongside for the three-engine picture.
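The core of such a pull is a single Search Analytics query. A minimal sketch with the official googleapis Node client, assuming a service-account key file and a domain property; the dates, dimensions, and row limit are illustrative, not the audit's actual parameters.

```js
// Sketch: one Search Analytics pull over a 365-day window (assumed values).
import { google } from 'googleapis';

const auth = new google.auth.GoogleAuth({
  keyFile: 'service-account.json', // assumed credential location
  scopes: ['https://www.googleapis.com/auth/webmasters.readonly'],
});

const searchconsole = google.searchconsole({ version: 'v1', auth });

const { data } = await searchconsole.searchanalytics.query({
  siteUrl: 'sc-domain:coffey.codes', // assumed property identifier
  requestBody: {
    startDate: '2024-07-01', // example 365-day window
    endDate: '2025-06-30',
    dimensions: ['query', 'page'],
    rowLimit: 5000,
  },
});

// Each row carries clicks, impressions, ctr, and position for a query/page pair.
console.log(data.rows?.length ?? 0, 'rows');
```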

Audited: 365-day window
Impressions: 294,996
Clicks: 1,152
Avg position: 6.95
Site-wide CTR: 0.39%
Top pages reviewed: 50

What position alone does not tell you

The audit surfaced a counterintuitive finding: for some queries, ranking #1 on coffey.codes earned essentially zero clicks. A page can sit at the top of the SERP while the searcher's intent points at an entirely different kind of result: people searching 'android emulator' want software downloads, not a Flutter article. Meanwhile, the vibe-coding article cluster at position 8 outperformed this site's positions 4-7, with a 3-7% CTR far above the industry baseline. Query intent matters more than position. That finding informed the next phase: not just chasing rank improvements, but pruning effort on pages whose ranking was demonstrably misaligned with what searchers wanted.
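The check behind that finding is easy to rerun against any snapshot. A rough sketch, assuming rows shaped like GSC's query report ({ query, clicks, impressions, ctr, position }); the thresholds are illustrative, not the ones used in the audit.

```js
// Flag queries that rank well but convert impressions to almost no clicks,
// i.e. likely intent mismatches. Thresholds below are assumptions.
const MIN_IMPRESSIONS = 100;

function flagIntentMismatches(rows) {
  return rows
    .filter((r) => r.position <= 3 && r.impressions >= MIN_IMPRESSIONS && r.ctr < 0.005)
    .sort((a, b) => b.impressions - a.impressions)
    .map((r) => ({
      query: r.query,
      position: r.position.toFixed(1),
      ctr: `${(r.ctr * 100).toFixed(2)}%`,
    }));
}
```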

Phase 2: On-page and structured-data work

The audit identified concrete issues: missing dateModified, weak Open Graph cards, no pagination noindex, ambiguous entity signals to Google's Knowledge Graph. Two specs worked through them. The result: every page now ships a coherent JSON-LD graph linking the site, the author, and the publisher across stable @id URIs. Articles emit BlogPosting plus a BreadcrumbList; the homepage layout emits Person plus Organization with sameAs links to every owned profile. Pagination pages are noindexed but keep follow so link equity still flows through.
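A trimmed illustration of that graph shape; the @id fragments, slug, name, date, and profile URLs are placeholders, not the site's actual markup.

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "WebSite",
      "@id": "https://coffey.codes/#website",
      "url": "https://coffey.codes/",
      "publisher": { "@id": "https://coffey.codes/#organization" }
    },
    {
      "@type": "Person",
      "@id": "https://coffey.codes/#person",
      "name": "Author",
      "sameAs": ["https://github.com/example", "https://www.linkedin.com/in/example"]
    },
    {
      "@type": "Organization",
      "@id": "https://coffey.codes/#organization",
      "founder": { "@id": "https://coffey.codes/#person" }
    },
    {
      "@type": "BlogPosting",
      "@id": "https://coffey.codes/articles/example-slug/#article",
      "isPartOf": { "@id": "https://coffey.codes/#website" },
      "author": { "@id": "https://coffey.codes/#person" },
      "publisher": { "@id": "https://coffey.codes/#organization" },
      "dateModified": "2025-07-01"
    },
    {
      "@type": "BreadcrumbList",
      "@id": "https://coffey.codes/articles/example-slug/#breadcrumb",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Articles", "item": "https://coffey.codes/articles/" },
        { "@type": "ListItem", "position": 2, "name": "Example article" }
      ]
    }
  ]
}
```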

Schema entities (site-wide): 2
Schema entities (per article): 3+
Owned-profile sameAs links: 4
Pagination handling: noindex,follow

Phase 3: Custom SEO tooling

After the on-page work landed, the audit workflow was still manual: a Claude Code session pulling data through the MCP each quarter. The next spec turned that into a script. scripts/seo-snapshot.mjs uses the same service-account credentials but bypasses the MCP entirely, pulling Google Search Console, Google Analytics 4, Bing Webmaster Tools, and Google Ads Keyword Planner directly into a single dated JSON snapshot. Snapshots are committed to git because Google's data window is only 16 months, and what isn't snapshotted is lost forever. A companion scripts/seo-snapshot-diff.mjs prints the delta between any two snapshots with ANSI colors and box-drawing characters in the terminal, falling back to plain ASCII for CI and pipes.
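The overall shape is roughly this (a sketch, not the real seo-snapshot.mjs: the pull helpers are empty stand-ins for the four API calls, and the output path is an assumption):

```js
// Sketch of a four-engine snapshot write. Each pull helper would call its API
// with the shared service-account credentials and return plain rows.
import { mkdir, writeFile } from 'node:fs/promises';

async function pullGsc() { /* e.g. searchconsole.searchanalytics.query(...) */ return []; }
async function pullGa4() { /* e.g. analyticsdata.properties.runReport(...) */ return []; }
async function pullBing() { /* Bing Webmaster query stats */ return []; }
async function pullAds() { /* Keyword Planner historical metrics */ return []; }

const date = new Date().toISOString().slice(0, 10);

const snapshot = {
  generatedAt: date,
  gsc: await pullGsc(),
  ga4: await pullGa4(),
  bing: await pullBing(),
  ads: await pullAds(),
};

// Dated, committed JSON: the snapshot outlives Google's 16-month data window.
await mkdir('data/seo-snapshots', { recursive: true }); // assumed path
await writeFile(`data/seo-snapshots/${date}.json`, JSON.stringify(snapshot, null, 2));
```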

Phase 4: AI agent workflows for editorial decisions

The snapshot is the data substrate; the leverage came from four follow-up scripts that turn it into editorial answers. scripts/keyword-audit-articles.mjs flags every article where Google Ads suggests a higher-volume keyword the article could target with light editing. scripts/keyword-discover-topics.mjs produces a ranked editorial backlog seeded from the site's article categories and top GSC queries, filtering out anything already covered by existing slugs. scripts/keyword-validate-lps.mjs classifies each landing page as WELL_TARGETED, UNDER_INVESTED, or OVER_AMBITIOUS. scripts/keyword-probe-url.mjs is a one-shot competitor URL probe. Each report writes dated markdown into docs/strategy/data/. Future Claude Code agents can invoke any of these scripts, ingest the output, and incorporate it into the next quarterly audit doc without human intervention beyond final review. The agent brief in docs/documentation/agents/ documents the full surface, so a fresh agent session can pick up the work without re-discovery.
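As an illustration of the landing-page classification, a hypothetical three-way verdict function; the input fields and thresholds are invented for the sketch and are not the script's actual rules.

```js
// Hypothetical verdict logic: compare a landing page's target-keyword demand
// against what the page is actually capturing.
function verdictForLandingPage({ monthlySearchVolume, avgPosition, clicks }) {
  if (monthlySearchVolume > 5000 && avgPosition > 20) {
    return 'OVER_AMBITIOUS'; // chasing a head term the page cannot win yet
  }
  if (monthlySearchVolume > 500 && clicks < 5) {
    return 'UNDER_INVESTED'; // real demand exists, but the page isn't capturing it
  }
  return 'WELL_TARGETED';
}
```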

Where this leaves things

The site now has a 365-day baseline snapshot in git, a verifiable structured-data graph across every page, and a tool chain that can produce the same data in the same shape every quarter. The Q3 audit ran end-to-end through the new pipeline; the Q4 audit (target August 2026) will be the first full four-engine run with Google Ads keyword volume context enriching the GSC top-queries table. Drift detection is now a diff command. Editorial decisions can cite specific numbers from the latest snapshot instead of intuition. And because every spec was scoped tightly and the scripts share a single auth module, the whole pipeline can be extended (a fifth engine, a new report shape, a different cadence) without touching the existing surface.
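A sketch of what such a shared Google auth helper might look like, assuming the google-auth-library package and application-default credentials; Bing and Google Ads carry their own credentials and are out of scope here.

```js
// auth.mjs (sketch): one place to build authenticated Google clients for the
// scripts. Credentials resolve from GOOGLE_APPLICATION_CREDENTIALS (or gcloud
// application-default credentials); only the scope list varies per script.
import { GoogleAuth } from 'google-auth-library';

export function googleAuth(scopes) {
  return new GoogleAuth({ scopes });
}

// Usage: const auth = googleAuth(['https://www.googleapis.com/auth/webmasters.readonly']);
```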