Article Information
Category: Development
Published: April 18, 2026
Author: Chris de Gruijter
Reading Time: 14 min
I Built an AI Agent That Audits SEO So I Never Miss a Checklist Item Again
Every agency has the same problem: audits are inconsistent. You catch the obvious things — title tags, missing meta descriptions, maybe a broken canonical — but the same subtle issues keep slipping through on every new site. A font preload missing here, a security header absent there, a decorative counter element that creates a second <h1> nobody noticed. I got tired of discovering these things in production and built a dedicated AI agent that runs a systematic, repeatable audit against a full 10-category checklist on any Next.js or Nuxt site I work on. After going through this process across multiple client sites, scores that were sitting at 62–79/100 are now consistently at 100/100. This post documents the full checklist, the quick wins I found repeatedly, and how the agent is structured.
The Problem with Ad-hoc SEO Audits
When you're building and maintaining multiple sites, SEO tends to get reviewed reactively — something looks off in Search Console, a client asks why a page isn't ranking, or a Lighthouse score drops and you dig in. The problem with reactive audits is that you never review everything. You review the symptom, fix it, and move on. The 15 other things that are slightly wrong stay slightly wrong.
I kept a mental checklist that grew over time — things I'd learned the hard way across projects. But mental checklists drift. You remember to check canonicals but forget to verify the canonical points to the correct URL and not just the homepage. You add OG images to the main landing page but miss them on service subpages. The mistakes aren't random; they're systematic gaps that come back on every project.
The fix was to stop relying on memory and build the checklist into a repeatable automated tool — the global-seo-auditor agent. It audits any Next.js or Nuxt site against a structured checklist across 10 categories, reports what passes and what fails, and provides specific fix guidance. Running it takes minutes. The full mental overhead of "did I check X?" disappears.
The Full Audit Checklist
Below is the complete checklist the agent works through. I'll cover each category and explain why the items matter and what failure looks like in practice.
1. SSR & Rendering
- SSR or SSG is configured — client-side rendering means crawlers see an empty page. Every client site runs either full SSR (Nuxt, Next.js App Router) or static generation. There's no acceptable reason for important pages to be CSR-only.
- No hydration errors in the browser console — hydration mismatches cause partial content loss and unpredictable rendering. They're often invisible until you look at the console.
- Primary content is present in raw HTML — curl the page and check. If your H1 or main body text isn't in the raw response, crawlers won't see it regardless of what your browser renders.
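That curl check can be scripted. A minimal sketch, framework-agnostic; the fixture strings below stand in for the output of curl -s against your own pages:

```typescript
// Check that primary content is present in the raw server response,
// i.e. what a crawler sees before any JavaScript runs.
function rawHtmlHasPrimaryContent(html: string, expectedH1: string): boolean {
  // Grab the first <h1> and confirm it contains the expected heading text.
  const match = html.match(/<h1\b[^>]*>([\s\S]*?)<\/h1>/i);
  return match !== null && match[1].includes(expectedH1);
}

// Fixtures standing in for fetched HTML: an SSR page vs. a client-rendered shell.
const ssrHtml = "<html><body><h1>Web Development</h1><p>Body copy.</p></body></html>";
const csrHtml = '<html><body><div id="app"></div></body></html>';
```

The SSR fixture passes; the client-rendered shell with an empty mount point fails, which is exactly the failure mode this checklist item catches.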
2. Meta Tags
- Unique title tag under 60 characters on every page — the word "unique" carries most of the weight here. Templated titles that repeat the same prefix across all pages are a missed opportunity.
- Unique meta description under 160 characters — descriptions don't directly affect ranking but they do affect click-through rate. Duplicate descriptions are a signal of low-effort implementation.
- OG tags present on all pages — not just the homepage. Inner pages get shared too. og:title, og:description, og:image, and og:url all need to be set.
- Twitter card tags — twitter:card (set to summary_large_image), twitter:title, twitter:description, twitter:image. Many implementations set these on the homepage and forget inner pages.
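The uniqueness and length rules above are mechanical enough to check in code. Here is a sketch of that kind of validation; the PageMeta shape and the 60/160 limits mirror this checklist, not any framework API:

```typescript
// Per-page meta as the checklist sees it (shape is this post's convention).
interface PageMeta {
  title: string;
  description: string;
  ogImage?: string;
}

// Flag length violations, duplicates, and missing og:image across all pages.
function auditMeta(pages: Record<string, PageMeta>): string[] {
  const issues: string[] = [];
  const seenTitles = new Set<string>();
  const seenDescriptions = new Set<string>();
  for (const [path, meta] of Object.entries(pages)) {
    if (meta.title.length > 60) issues.push(`${path}: title over 60 chars`);
    if (seenTitles.has(meta.title)) issues.push(`${path}: duplicate title`);
    seenTitles.add(meta.title);
    if (meta.description.length > 160) issues.push(`${path}: description over 160 chars`);
    if (seenDescriptions.has(meta.description)) issues.push(`${path}: duplicate description`);
    seenDescriptions.add(meta.description);
    if (!meta.ogImage) issues.push(`${path}: missing og:image`);
  }
  return issues;
}
```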
3. URL Structure
- Hyphens not underscores in slugs — Google treats underscores as word joiners, not separators. /web-development is two separate words to Google; /web_development is one.
- Trailing slash consistency — every site should have a clear policy (trailing slash or not) and enforce it with redirects. Mixed behaviour creates duplicate content.
- Canonical tag on every page pointing to the correct URL — not the homepage. This is the most commonly broken item I find. The canonical should match the page's own canonical URL exactly, including trailing slash policy.
- No soft 404s — pages that return HTTP 200 but render a "not found" message confuse crawlers and waste crawl budget.
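A helper that derives the canonical from the current path and applies one trailing-slash policy site-wide avoids both failure modes: the homepage fallback and mixed-slash duplicates. A sketch, with SITE_URL as a placeholder for your base URL:

```typescript
const SITE_URL = "https://example.com"; // placeholder base URL

// Build the canonical for a path, enforcing one trailing-slash policy.
function canonicalFor(path: string, trailingSlash = false): string {
  // Drop query string and fragment: canonicals point at the clean URL.
  let clean = path.split(/[?#]/)[0];
  if (!clean.startsWith("/")) clean = "/" + clean;
  // Strip any trailing slashes, then re-apply the policy exactly once.
  clean = clean.replace(/\/+$/, "");
  if (trailingSlash && clean !== "") clean += "/";
  return SITE_URL + (clean === "" ? "/" : clean);
}
```

The key property is that the output depends on the current page's path, never on a hardcoded fallback, so a missing per-page value can't silently canonicalise everything to the homepage.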
4. Crawler Control
- robots.txt exists and includes AI crawler rules — the default robots.txt from most frameworks allows everything. In 2025 and beyond, explicitly disallowing AI training crawlers (GPTBot, CCBot, anthropic-ai, Claude-Web) is a standard practice for client sites that haven't consented to being scraped for training data.
- XML sitemap submitted and accessible — ideally at /sitemap.xml with the URL listed in robots.txt. Verify it actually contains all canonical URLs and isn't returning 404.
- No redirect chains — a 301 that points to another 301 wastes link equity and slows crawling. All redirects should resolve in a single hop.
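Put together, a robots.txt covering these rules might look like this. The domain is a placeholder, and the AI-crawler policy should match the client's preference:

```text
# Opt out of AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Claude-Web
Disallow: /

# Allow everything else (search crawlers unaffected)
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
```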
5. AI Readiness
- llms.txt file present — the emerging standard for telling LLMs what your site is about and how to use its content. Similar to robots.txt but for AI agents. Covered in more detail in my dedicated post on llms.txt.
- Structured data present and valid — helps AI models and search engines understand the semantic content of pages. More on this in the structured data section below.
- Citable, authoritative content — content that directly answers questions with specific facts, names, dates, and numbers. AI summarisation tools and featured snippets both favour this format.
6. Security Headers
Security headers feed into the trust and site-quality checks that technical SEO tools score: a site without HSTS or with a permissive CSP loses points in a technical audit. More importantly, they protect users. These should be set at the Cloudflare Pages or server level, not in meta tags.
- HSTS (Strict-Transport-Security) — tells browsers this site should only be accessed over HTTPS. Recommended: max-age=31536000; includeSubDomains; preload.
- X-Frame-Options: DENY — prevents clickjacking. Every site should have this.
- X-Content-Type-Options: nosniff — prevents MIME type sniffing attacks.
- Content-Security-Policy — the most complex header, but at minimum a restrictive default-src that whitelists only your own domains and trusted third parties.
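On Cloudflare Pages, all four can go in a _headers file at the project root. A sketch; the CSP here is a restrictive starting point to adapt per site, not a drop-in policy:

```text
/*
  Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
  X-Frame-Options: DENY
  X-Content-Type-Options: nosniff
  Content-Security-Policy: default-src 'self'; img-src 'self' data:; style-src 'self' 'unsafe-inline'
```

The /* rule applies the headers to every route; more specific path patterns can override them per section if needed.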
7. Performance / Core Web Vitals
Google uses Core Web Vitals as a ranking signal. The thresholds are assessed at the 75th percentile on mobile:
- LCP ≤ 2.5s — Largest Contentful Paint. Most LCP failures come from unoptimised hero images, render-blocking resources, or slow server response times.
- INP ≤ 200ms — Interaction to Next Paint (replaced FID in 2024). JavaScript-heavy frameworks need to defer non-critical work off the main thread.
- CLS ≤ 0.1 — Cumulative Layout Shift. Most CLS failures come from images without defined dimensions and fonts that load asynchronously and cause text reflow.
8. Structured Data / Schema.org
Schema markup helps search engines and AI models understand what a page is about at a semantic level — not just the keywords it contains. The types I check for on client sites:
- Organization or LocalBusiness — who the company is, what they do, how to contact them. Goes on the homepage.
- BreadcrumbList — improves sitelink appearance in search results and helps crawlers understand site structure.
- FAQPage — targets featured snippet positions. Every FAQ section on the site should have this markup.
- Service — on each service/product page. Signals intent and context to search engines.
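For illustration, a minimal LocalBusiness block as it would be embedded in a script tag with type="application/ld+json" on the homepage. Every value here is a placeholder:

```json
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Agency",
  "url": "https://example.com",
  "description": "Web development agency.",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Amsterdam",
    "addressCountry": "NL"
  }
}
```

Validate the output with Google's Rich Results Test rather than trusting it by eye; a single malformed property can invalidate the whole block.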
9. Images
- Using next/image (Next.js) or NuxtImg (Nuxt) — these components handle responsive sizing, lazy loading, and modern format conversion automatically. Using a raw <img> tag for content images is almost always a mistake.
- Alt text on every image — not just non-empty alt text, but descriptive alt text. "Image of service" is not useful. A screen reader and a crawler both need more than that.
- WebP or AVIF format — JPEG/PNG for content images is a 2018 default. Both Next.js and Nuxt handle format conversion automatically when you use their image components.
- Lazy loading for below-the-fold images — the LCP image should be priority (Next.js) or loading="eager"; everything else should lazy load.
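The alt-text item is another mechanically checkable one. A sketch that flags missing or empty alt attributes in rendered HTML; a real audit would parse component source, but a regex pass illustrates the check:

```typescript
// Return the <img> tags whose alt attribute is missing or effectively empty.
function imagesMissingAlt(html: string): string[] {
  const imgs: string[] = html.match(/<img\b[^>]*>/gi) ?? [];
  return imgs.filter((tag) => {
    const alt = tag.match(/\balt\s*=\s*"([^"]*)"/i);
    // No alt attribute at all, or alt="" / whitespace-only: both fail.
    return alt === null || alt[1].trim().length === 0;
  });
}
```

This catches the attribute-level failures; whether the alt text is actually descriptive is still a judgment call no regex can make.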
10. Fonts
- Using next/font or @nuxt/fonts — framework-native font loading is zero-config optimised. It automatically handles self-hosting, subsetting, and preloading.
- WOFF2 format — the only web font format worth using in 2025. Smaller, better compressed, universally supported.
- Font preloaded in the document head — fonts that aren't preloaded cause a flash of unstyled text (FOUT) and contribute to CLS. With next/font and @nuxt/fonts this is automatic — but only if you haven't explicitly disabled it.
The Quick Wins I Found Repeatedly
After running this checklist across several client sites — sites that were already "live and working" and had been for months — I found the same categories of issues coming up repeatedly. These are the ones that moved scores the most with the least effort:
Security Headers: 15 Minutes, Big Impact
Every single site I audited was missing at least two of the four security headers. HSTS and X-Frame-Options were almost always absent. On Cloudflare Pages, adding these is a one-time headers config change — it literally takes 15 minutes. The score impact in PageSpeed Insights and third-party SEO tools is immediate and significant because security is scored as part of the technical SEO audit.
Canonical Tags Pointing to the Homepage
This one is subtle and easy to miss. If your canonical implementation has a bug — say, it falls back to the base URL when the per-page canonical isn't explicitly set — then every page on your site is canonicalised to the homepage. You're effectively telling Google that every page is a duplicate of your homepage and should be consolidated there. The fix is making sure your canonical implementation reads the current page URL correctly, including for dynamic routes.
preload: false on Fonts
I found this in multiple projects. Someone had added preload: false to a font configuration at some point — possibly to debug a flash of unstyled text, possibly by accident. With font preloading disabled, the WOFF2 file isn't linked in the document head, so the browser discovers it only after it's already parsed the CSS. This delays text rendering and causes CLS. Removing that one flag fixed the font-related CLS score on example.com from 0.18 to 0.02.
OG Images Missing on Inner Pages
Homepage OG images are almost always set correctly. Service pages, blog posts, and location pages frequently aren't. When someone shares a service page on LinkedIn or WhatsApp, it renders with no preview image or a broken placeholder. Beyond the social sharing experience, OG images are part of a complete meta tag implementation and their absence is flagged by automated auditors.
Double H1 Tags from Decorative Elements
This one surprised me the first time. A homepage had a large animated counter — "500+ projects completed" — and the developer had marked it up as an <h1> for visual styling reasons rather than semantic ones. The actual page heading was also an <h1>. Every heading audit tool flagged it. The fix is simple: decorative elements that need large text styling should use a <div> or <span> with CSS applied, not a semantic heading tag.
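The before and after, sketched with an illustrative class name:

```html
<!-- Before: two h1 elements, so every heading audit flags the page -->
<h1>Web Development Services</h1>
<h1 class="stat-counter">500+ projects completed</h1>

<!-- After: one semantic heading, the counter styled purely via CSS -->
<h1>Web Development Services</h1>
<div class="stat-counter">500+ projects completed</div>
```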
Non-Critical JS Components Not Deferred
Chat widgets, cookie consent scripts, analytics, and social proof widgets were being loaded synchronously on page load, blocking the main thread during the critical rendering window. Deferring them — either with defer/async on the script tags or by using dynamic imports with { ssr: false, lazy: true } in Nuxt — removed them from the LCP critical path entirely.
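For plain third-party script tags, the minimal version of the fix looks like this; the widget URLs are placeholders:

```html
<!-- defer: the script downloads in parallel but executes only after HTML
     parsing finishes, keeping it off the LCP critical path -->
<script src="https://widget.example.com/chat.js" defer></script>

<!-- async is an alternative for scripts with no ordering dependencies,
     such as analytics tags -->
<script src="https://analytics.example.com/tag.js" async></script>
```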
Results After Applying the Full Checklist
The score movement after going through this process with a site that had previously been "SEO optimised" by a previous agency:
- Before: PageSpeed mobile performance 62, SEO score 79/100, security headers score 0/4
- After: PageSpeed mobile performance 96, SEO score 100/100, security headers score 4/4
- Time to fix: roughly half a day for the full list on a site I was already familiar with
The "100/100 SEO score" is a PageSpeed Insights metric, not a guarantee of rankings — but it does mean there are no detectable technical issues that would prevent a page from being indexed and ranked correctly. It's a floor, not a ceiling. The ceiling is content quality, authority, and link profile — none of which a technical audit can fix. But getting the floor solid is the prerequisite.
How the Audit Agent Works
The global-seo-auditor agent is a Claude Code sub-agent defined in my agent harness. It has the full checklist baked into its system prompt and knows the specific implementation patterns for both Next.js and Nuxt 4. When I invoke it against a project, it:
- Reads the framework config (next.config.ts or nuxt.config.ts) to understand modules, image optimisation settings, and font configuration
- Checks for the presence and correctness of robots.txt, sitemap.xml, and llms.txt
- Scans the layout and page components for meta tag implementation, canonical handling, and OG tag coverage
- Reviews Cloudflare Pages headers config (or middleware) for security header completeness
- Checks image components for alt text presence, format, and loading strategy
- Looks at font configuration for preloading, WOFF2 usage, and framework-native loading
- Checks structured data implementation against the expected schema types for the site category
- Produces a structured audit report with pass/fail per item, specific file locations, and fix recommendations
The audit is done against the source code directly — not a live URL. This means it catches issues before deployment and works even on localhost. It doesn't replace running PageSpeed Insights on the live site, but it gets the source-level issues handled first so PSI isn't surfacing the same basic problems every time.
The Checklist as a Reference
If you want to run through this manually without the agent, here's a condensed version you can work through as a standalone checklist for any Next.js or Nuxt project:
- SSR/SSG: Confirm framework is configured for server-side or static rendering on all important pages
- Raw HTML: curl the homepage and check that H1 and body content are present in the response
- Titles: Every page has a unique <title> under 60 chars
- Descriptions: Every page has a unique meta description under 160 chars
- OG tags: All four OG tags (title, description, image, url) on every page including inner pages
- Twitter cards: All four Twitter card tags on every page
- Canonicals: Present on every page, pointing to the page's own canonical URL (not the homepage)
- Trailing slash consistency: Enforced site-wide with redirects
- robots.txt: Present, valid, includes sitemap URL, and blocks AI training crawlers as desired
- Sitemap: Present, valid, submitted in GSC, contains all canonical page URLs
- No redirect chains: All redirects resolve in one hop
- llms.txt: Present at root if you want AI agents to understand your site
- Security headers: HSTS, X-Frame-Options, X-Content-Type-Options, CSP all set
- LCP ≤ 2.5s: Measured at p75 mobile in PageSpeed Insights
- INP ≤ 200ms: Measured at p75 mobile in CrUX data or PageSpeed Insights
- CLS ≤ 0.1: Measured at p75 mobile — check fonts, images, and dynamic content
- Structured data: Organization/LocalBusiness, BreadcrumbList, FAQPage (where applicable), Service
- Images: next/image or NuxtImg, descriptive alt text, WebP/AVIF, LCP image not lazy-loaded
- Fonts: next/font or @nuxt/fonts, WOFF2, preloaded (not disabled)
Why an Agent Rather Than a Script or Tool
There are existing tools that run automated SEO audits — Screaming Frog, Ahrefs Site Audit, even the Lighthouse CLI. So why build a custom agent?
The existing tools are crawler-based. They visit live URLs and check what the browser receives. They can't read your font configuration in nuxt.config.ts to see if preload: false was accidentally set. They can't check whether your canonical implementation is using a correct dynamic URL or falling back to a hardcoded base URL. They can't see that your robots.txt is missing AI crawler rules because it was generated by a default template.
Source-code auditing catches the cause, not just the symptom. Crawler-based tools see a missing canonical and report it as "canonical not set". The agent reads the layout file and tells you why it's missing and exactly which line to fix.
The other reason is workflow integration. Running Screaming Frog means deploying, waiting for a crawl, interpreting a CSV. Running the agent means typing one command in the terminal. The friction difference matters for how often audits actually happen — which is the only metric that matters for maintaining site quality over time.
Frequently Asked Questions
How often should I run a technical SEO audit?
At minimum, after any significant site change: framework upgrade, new section added, template modified, or dependency updated. In practice, running a quick check before every major deployment catches regressions before they reach production. A full deep-audit every quarter is reasonable for sites with active content production.
Do security headers affect SEO rankings directly?
Not as a direct ranking signal in Google's documented factors. But they affect the technical SEO scores that tools like PageSpeed Insights report, which agencies and clients use as proxies for site health. More practically, they protect against attacks (clickjacking, MIME sniffing) that could compromise your site and cause ranking drops indirectly. Adding them is a no-cost, high-value change.
What is a canonical tag pointing to the homepage, and why is it a problem?
A canonical tag tells search engines which URL is the "official" version of a page. If your implementation has a bug where the canonical always defaults to your base URL (e.g., https://example.com/), every page on your site tells Google: "the canonical version of this content is the homepage." Google may then consolidate all your pages into the homepage in its index, effectively de-indexing all your subpages. This is catastrophic for rankings and is unfortunately a common implementation bug.
Should I block AI crawlers in robots.txt?
That depends on your client's preference. Blocking AI crawlers (GPTBot, CCBot, anthropic-ai, Claude-Web) prevents your content from being used to train LLMs without consent. It does not affect Google or Bing crawling. Most clients I work with prefer to block AI training crawlers by default, especially for proprietary content. It's worth having the explicit conversation rather than defaulting either way.
What is INP and how does it differ from FID?
INP (Interaction to Next Paint) replaced FID (First Input Delay) as a Core Web Vital in March 2024. FID only measured the delay before the browser started processing the first user interaction. INP measures the full duration of interactions throughout the page's lifetime — click, tap, keyboard input — and reports the worst one observed for the visit; field data is then assessed at the 75th percentile across visits. INP is much harder to pass on JavaScript-heavy pages because it captures ongoing main thread blocking, not just the initial load.
Does a 100/100 SEO score guarantee top rankings?
No. A 100/100 technical SEO score means there are no detectable technical barriers to indexing and ranking. It's a necessary baseline, not a sufficient condition. Rankings depend on content quality, topical authority, backlink profile, and how competitive the keyword landscape is. Think of technical SEO as clearing the path — it doesn't carry you to the destination, but a bad technical foundation will actively hold you back regardless of how good your content is.
How do I check if my canonical tags are working correctly?
The fastest way is to view the page source (Ctrl+U in most browsers) and search for "canonical". The href attribute should match the URL of the page you're currently viewing exactly — same protocol, same domain, same path, same trailing slash policy. If it's the same value on every page of your site (e.g., always https://example.com/), the implementation is broken. You can also use Google Search Console's URL Inspection tool to see what Google found as the canonical for a specific page.