How to audit your site for AI citability in 20 minutes
Run a 20-minute AI citability audit of your site: confirm IndexNow is connected, check robots.txt for GPTBot/ClaudeBot/PerplexityBot blocks, validate your schema.org markup — and test in incognito so your logged-in accounts don't bias the result.
By Nikita Janockin

An AI search visibility audit is a structured review to determine if a brand appears in responses from AI tools such as ChatGPT, Perplexity, and Google AI Overviews. An estimated 20–30% of websites accidentally disallow GPTBot in their robots.txt file, which blocks OpenAI's crawler from reading their content at all.
What is an AI citability audit?
To audit AI citability means to run a structured review of your website and content to determine whether AI tools — ChatGPT, Perplexity, Google AI Overviews — are actually pulling from your pages when they answer questions in your niche. It's not about rankings. It's about whether your content gets extracted, cited, and surfaced as a source. Most SMB owners have no idea this gap exists until they notice competitors showing up in AI answers and they don't.
The distinction from traditional SEO matters more than most people realize. Classic SEO audits chase rankings — keyword positions, backlink profiles, Core Web Vitals. An AI citability audit focuses on a different set of signals: structured data validity, crawler access for AI bots specifically, and whether your content is extractable as a self-contained answer [1]. A page can rank on page one of Google and still never get cited by a single AI engine. That's the gap this audit is designed to close.
What does the audit actually check? At minimum, three layers [1]: whether AI crawlers (GPTBot, PerplexityBot, ClaudeBot) can reach your pages at all; whether your schema.org markup — Article, FAQPage, HowTo — is valid and present; and whether your content is structured so that a passage pulled out of context still answers a question completely. Miss any one of these and you're invisible to AI search regardless of your domain authority.
"Technical basics still matter: verify robots.txt doesn't block the AI crawlers, verify your schema.org markup is valid (Article, FAQPage, HowTo where relevant)." — Nikita Janockin, Founder, OG Traffic
One misconception worth addressing directly: people assume that high domain authority automatically produces AI citations. It doesn't. Content structure is the more decisive variable. Wikipedia gets cited constantly — not only because of its authority, but because its prose is built around self-contained, definitional passages that AI engines can extract cleanly. The structural pattern matters as much as the source's reputation.
There's also a testing trap that kills accurate audits before they start. Don't test your own citability by prompting ChatGPT or Perplexity while logged into your own account. These platforms personalize responses, and they'll bias results toward showing you your own content. Always test in incognito, with a fresh account, or through a tool that queries the platforms independently. Bad testing produces false confidence — which is worse than no data at all.
The audit is also a baseline, not a one-time event. AI search is GEO (Generative Engine Optimization) territory — a discipline that's moving faster than traditional SEO ever did. What gets you cited today may not be sufficient in 90 days as model training cycles update and new competitors publish structured content. Running the audit establishes a benchmark: which pages are cited, which are blocked, which are structurally invisible. Everything after that is iteration against that baseline.
So who actually needs this? Any SMB owner whose customers are using AI tools to research purchases, compare providers, or get recommendations — which, at this point, is most of them. The audit isn't a technical luxury. It's the starting point for knowing whether your content exists in the places your customers are actually looking.
How do you audit AI citability in 20 minutes?
To audit AI citability, check three things in sequence: IndexNow connection, robots.txt permissions for AI crawlers, and schema.org markup validity. These technical checks take under 20 minutes and tell you whether AI engines can find, read, and cite your pages — before you touch a single word of content.
Step 1: Check IndexNow first (5 minutes)
IndexNow is the fastest free signal available. If your site is already connected, the dashboard surfaces which specific pages AI engines are indexing and citing — no paid tool required. This matters more than most site owners realize: IndexNow doesn't just confirm crawlability; it shows citation-level visibility, which is the actual metric you're trying to move. If you're not connected yet, that's your first fix, not your last. Ahrefs and Semrush both offer paid coverage for deeper citation analysis, and OG Traffic is shipping a free AI citability audit tool that removes the need for either — but IndexNow alone gives you a meaningful starting point in minutes [1].
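If you do need to connect or resubmit after a fix, the protocol itself is lightweight: IndexNow is a single JSON POST to a shared endpoint, with ownership proven by a key file at your domain root. A minimal sketch in Python (the domain, key, and URL below are placeholders, not real values):

```python
import json
import urllib.request

def build_indexnow_payload(host, key, urls):
    """Payload shape per the IndexNow protocol; `key` must match the
    contents of the key file hosted at https://{host}/{key}.txt."""
    return {
        "host": host,
        "key": key,
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": urls,
    }

def submit(payload, endpoint="https://api.indexnow.org/indexnow"):
    """POST the payload; a 200/202 response means the URLs were accepted."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
    )
    return urllib.request.urlopen(req)

# Placeholder domain and key -- substitute your own before calling submit().
payload = build_indexnow_payload(
    "example.com",
    "abc123",
    ["https://example.com/blog/ai-citability-audit"],
)
print(json.dumps(payload, indent=2))
```

Most CMS IndexNow plugins send exactly this payload on publish; the manual version is mainly useful for resubmitting pages after a robots.txt or schema fix.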
Step 2: Verify your robots.txt isn't blocking AI crawlers (5 minutes)
Open your robots.txt file directly (yourdomain.com/robots.txt) and scan for disallow rules that target GPTBot, ClaudeBot, PerplexityBot, or similar AI crawler user agents [3]. Blocking these crawlers is the single fastest way to disappear from AI-generated answers — and it happens more often than you'd expect, usually as an accidental side effect of a blanket disallow rule added during a site migration. If you find a block, remove it, resubmit via IndexNow, and move on. Don't overthink this step. It's binary: either the crawlers can get in or they can't.
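The scan is easy to script, too. Python's standard-library robots.txt parser applies the same per-user-agent matching the crawlers use, so a few lines will tell you whether a blanket rule is catching the AI bots. A sketch (the sample robots.txt below is invented to show the classic accidental-block pattern):

```python
from urllib import robotparser

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]

def ai_crawler_access(robots_txt, path="/"):
    """Given raw robots.txt text, report whether each AI crawler may fetch `path`."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return {bot: rp.can_fetch(bot, path) for bot in AI_BOTS}

# Classic accidental block: a blanket disallow added during a migration,
# with only GPTBot explicitly re-allowed afterward.
sample = """
User-agent: *
Disallow: /

User-agent: GPTBot
Allow: /
"""
print(ai_crawler_access(sample))
# GPTBot gets in; ClaudeBot and PerplexityBot are still caught by the wildcard.
```

Run it against the live text of yourdomain.com/robots.txt; any False in the output is the binary failure described above.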
Step 3: Validate your schema.org markup (10 minutes)
Run your key pages through Google's Rich Results Test or Schema Markup Validator and confirm that Article, FAQPage, or HowTo schema is present and error-free where relevant [3]. Missing or broken schema doesn't prevent AI engines from reading your content — but it removes a structured signal they use to classify and excerpt it. Think of schema as a label on a filing cabinet: the document exists either way, but the label tells the system exactly what's inside and where to pull from.
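For a page with no markup at all, a minimal Article block is the usual starting point. A sketch that generates the JSON-LD (the URL and date are placeholders; validate the output in the Rich Results Test before deploying):

```python
import json

def article_jsonld(headline, author, date_published, url):
    """Minimal Article JSON-LD using schema.org's documented field names."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,
        "mainEntityOfPage": url,
    }, indent=2)

# Placeholder URL and date -- substitute your page's real values.
snippet = article_jsonld(
    "How to audit your site for AI citability",
    "Nikita Janockin",
    "2025-01-15",
    "https://example.com/ai-citability-audit",
)
print(snippet)
```

Paste the printed JSON into a `<script type="application/ld+json">` tag in the page's head, then confirm it passes the validator.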
The one mistake that invalidates the whole audit
"Do NOT rely on prompt-testing with your own logged-in ChatGPT/Perplexity accounts — they know it's your project and will bias results toward showing you. Test in incognito, with different accounts, or via the official tool surface that queries the platforms on your behalf." — Nikita Janockin, Founder, OG Traffic
This is the most common audit error. Logged-in sessions are personalized. They'll surface your brand in responses partly because the platform has inferred your interest in it — not because your content is genuinely being cited at scale. Use incognito mode, a separate account, or a tool like atomicagi that queries AI platforms without that personalization layer. The goal is to see what a stranger sees when they ask about your topic.
What does a clean audit actually tell you? It tells you the floor is solid. Technical access, crawler permissions, and structured markup are table stakes — necessary but not sufficient. If all three check out and you're still not getting cited, the problem is content structure, not infrastructure. That's a different fix, covered in the next section.
Why is your site not appearing in AI search results?
Most sites are invisible to AI search engines for three fixable reasons: they're blocking AI crawlers in robots.txt, they lack schema markup that signals content structure, or their prose isn't organized in ways that AI engines can extract as self-contained answers. None of these require a developer. All of them show up in a 20-minute audit.
Start with the number that should alarm you. An estimated 20–30% of websites accidentally disallow GPTBot in their robots.txt file, which means OpenAI's crawler can't read a single page on those sites [2]. That's not a content quality problem — it's a configuration error that no amount of good writing can overcome. Check your robots.txt right now (it lives at yourdomain.com/robots.txt) and confirm that GPTBot, ClaudeBot, and PerplexityBot are not listed under Disallow. This takes under two minutes and it's the single fastest fix available.
Schema markup is the second culprit, and it's subtler. When you audit AI citability on a page, you're essentially asking: "Can a machine read this content without guessing?" Schema.org markup — specifically Article, FAQPage, and HowTo types — gives AI engines an explicit map of what your content is and how it's structured. Missing schema doesn't make your page invisible, but it puts you at a disadvantage against competitors whose pages are explicitly labeled. Google's Rich Results Test and Schema.org's validator are both free and return results in seconds.
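Before reaching for the validators, a quick script can at least confirm the markup exists and parses. This sketch scans raw HTML for ld+json blocks and reports their declared types (presence and parseability only, not full schema validity):

```python
import json
import re

def find_jsonld_types(html):
    """Return the @type of every ld+json block found in raw page HTML."""
    blocks = re.findall(
        r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
        html, re.DOTALL | re.IGNORECASE)
    types = []
    for raw in blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            types.append("INVALID JSON")  # present but broken -- still a finding
            continue
        items = data if isinstance(data, list) else [data]
        types.extend(str(item.get("@type", "?")) for item in items)
    return types

sample = '''<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": []}
</script></head></html>'''
print(find_jsonld_types(sample))  # ['FAQPage']
```

An empty list on a key page means there's nothing for the validators to check: that's your finding.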
"Missing schema, bot blocking, and weak structural patterns — no TL;DR, no self-contained passages, no clear question-answer pairing — are the three most common reasons high-authority sites still don't get cited." — Nikita Janockin, Founder, OG Traffic
The third reason is the one most site owners don't expect. Content structure matters independently of content quality. Wikipedia gets cited constantly by ChatGPT, Perplexity, and Google AI Overviews — not only because of its domain authority, but because every article opens with a definitional summary, uses clear heading hierarchies, and writes paragraphs that hold meaning when extracted in isolation. That's not an accident. It's a structural pattern you can replicate. If your posts bury the answer in paragraph six, AI engines will skip you and cite whoever answered in paragraph one.
What does this mean practically? Look at which competitors in your niche are already getting cited by AI, then reverse-engineer their structure — not their topic, their format. Are they opening with a direct answer? Using FAQ sections? Writing short, declarative paragraphs? Those patterns are the signal. Copying the substance without the structure won't move the needle.
One honest caveat on testing: don't prompt your own logged-in ChatGPT or Perplexity account to check whether your site appears. These platforms have enough context about your browsing behavior to bias results in your favor. Test in incognito mode, use a separate account, or use a tool that queries the platforms independently. IndexNow is a good free starting point for surfacing AI citability signals; atomicagi covers citation tracking across AI platforms. OG Traffic is shipping a free AI citability audit tool specifically for this workflow, so you won't need a paid Ahrefs or Semrush subscription just to answer a basic diagnostic question.
Fix the crawl access first. Then validate schema. Then look at structure. In that order.
How can you improve your content for AI citations?
To improve your content for AI citations, focus on three areas: content structure that mirrors already-cited sources, schema markup that matches your site type, and source-backed claims throughout your prose. These aren't separate tactics — they compound. Get all three right and you give AI engines a clear, low-friction path to pulling your content into a response.
Start by studying who's already getting cited — then reverse-engineer their structure.
Wikipedia gets cited constantly, and it's not just because of domain authority. It's because of how the prose is built: short definitional leads, self-contained sections, clear question-answer pairing. Research from Princeton's KDD 2024 GEO study confirms that content with cited statistics and sources sees 30–40% higher AI visibility [2]. That's not a formatting trick — it's a signal that the content is grounded. The practical implication: every factual claim in your post needs a source attached to it, not because AI engines read footnotes, but because sourced writing forces the kind of precision that makes content extractable.
"We analyzed a large number of websites that receive AI traffic and now shape our own content around the patterns they share. Wikipedia is the obvious example — yes, very high domain authority, but look at how their prose is structured, not just their backlinks." — Nikita Janockin, Founder, OG Traffic
Schema markup is the fastest structural fix most sites skip.
For e-commerce sites, adding FAQPage schema to your top 10 pages is the highest-leverage move [3]. Service businesses should prioritize local schema on location and service pages [3]. Neither requires a developer — most CMS platforms support schema plugins, and Google's Rich Results Test validates the output in under two minutes. The mistake most site owners make is treating schema as a one-time checkbox. It's not. Schema tells AI crawlers exactly what type of content they're reading, which directly affects whether a passage gets extracted as an answer or skipped entirely.
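To make the FAQPage suggestion concrete, the markup reduces to question-and-answer pairs. A sketch that builds it from plain tuples (the questions are invented examples; most CMS schema plugins emit the same structure from a form):

```python
import json

def faq_jsonld(pairs):
    """Build FAQPage JSON-LD from (question, answer) string pairs."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }, indent=2)

# Invented Q&A pairs for a hypothetical e-commerce page.
print(faq_jsonld([
    ("Do you ship internationally?", "Yes, to over 40 countries."),
    ("What is your return window?", "30 days from delivery, no questions asked."),
]))
```

Note that each answer should stand alone, for the same extractability reason discussed throughout this article.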
Don't test your own citability while logged into your accounts.
This is a trap. ChatGPT and Perplexity personalize results, which means testing in your own logged-in session will bias the output toward showing your content. Always test in incognito, with a fresh account, or through a tool surface that queries the platforms neutrally. IndexNow is a free starting point — it surfaces which pages are being indexed and flagged for AI citability. For paid coverage, Ahrefs and Semrush both have citation-tracking features. OG Traffic is shipping a free AI citability audit tool specifically so non-technical owners don't need a paid subscription just to answer the basic question: is my content getting cited?
The underlying requirement is substance, not formatting magic.
Here's the contrarian reality: anyone selling "GEO optimization" — GEO stands for Generative Engine Optimization, the practice of structuring content to appear in AI-generated responses — as a bypass for actually knowing your domain is selling you something that won't hold. Structure matters. Schema matters. But AI engines are trained on human judgment about what's useful. If your content doesn't say something distinctive, no amount of FAQPage markup will get it cited.
"The whole thing still reduces to: provide distinctive, extremely valuable content for the end reader. The format matters (structure, schema, TL;DR), but the underlying requirement is substance." — Nikita Janockin, Founder, OG Traffic
What's the single metric to watch after making these changes? Appearance in Google AI Overviews for your target queries. That's the clearest leading indicator — more reliable than ranking position alone — that the structural fixes are working.
Which tools are best for measuring AI search visibility?
To audit AI citability effectively, you need a short stack of tools — not a sprawling dashboard. Start with IndexNow (free, direct signal), layer in atomicagi for structural analysis, and use Ahrefs or Semrush if budget allows. The highest-signal metric to watch is appearance in Google AI Overviews for your target queries. That's your clearest proof the fixes are working.
Start with IndexNow — it's free and surprisingly direct.
IndexNow isn't just a crawl-submission tool. Once your site is connected, it surfaces which specific pages are being picked up by AI-adjacent indexing pipelines — making it the fastest free first step for any citability audit. Most SMB owners skip it entirely because they assume it's only for Google Search Console parity. It's not. The practical recommendation: connect IndexNow before you touch anything else, then use its data to prioritize which pages need structural fixes first. Five minutes of setup, zero cost, immediate signal.
"Start with IndexNow — if your site is connected to IndexNow, it directly surfaces AI citability and which specific pages are getting cited. That's the easiest + free first step." — Nikita Janockin, Founder, OG Traffic
Atomicagi fills the gap IndexNow can't.
Where IndexNow tells you what is being seen, atomicagi tells you why something might be getting ignored. It analyzes content structure — heading clarity, entity density, question-answer pairing — the exact patterns that AI engines use to decide whether a passage is citation-worthy. Think of it as a structural mirror: you paste in a URL, and it reflects back what an AI crawler actually encounters. That's a different diagnostic than a standard SEO audit, and it's worth running on your top five pages before anything else.
Paid tools have coverage, but don't confuse coverage with signal.
Ahrefs and Semrush both offer AI visibility tracking now, and for sites with larger content libraries, that breadth matters [1]. But a common mistake is treating their AI visibility scores as ground truth. They're proxies — useful for trend-spotting across dozens of pages, less useful for diagnosing why a specific page isn't getting cited. Use them for portfolio-level monitoring, not page-level diagnosis.
One critical caveat that almost nobody mentions: don't test your own citability by prompting ChatGPT or Perplexity while logged into your regular account.
This matters more than most people realize. Personalization layers in both platforms skew results toward content you've previously engaged with — which means you'll see false positives and under-diagnose real gaps.
What's missing from this tool landscape? Honest answer: a lot.
There's no single free tool that checks robots.txt for GPTBot and ClaudeBot blocking, validates schema.org markup, and scores citation structure in one pass. That gap is exactly why OG Traffic is shipping a free AI citability audit tool — so you don't have to stitch together three separate checks manually. Until it's live, the working stack is: IndexNow for indexing signal, atomicagi for structural analysis, and a manual robots.txt check to confirm you're not accidentally blocking AI crawlers. Simple. Unglamorous. It works.
What metrics should you track in an AI audit?
Track three metrics when you audit AI citability: (1) appearance in Google AI Overviews for your target queries, (2) citation count in Perplexity for those same queries, and (3) branded-mention frequency in ChatGPT responses. Together, these give you a cross-platform signal that's actionable within a 30–60 day window.
Start with Google AI Overviews. It's the highest-signal metric available right now — not because Google dominates search (though it does), but because AI Overviews are generated from a relatively small, curated pool of sources. If your page appears there for a target query, that's confirmation the content structure, schema, and entity density are working. If it doesn't, you have a specific, testable gap. Check this manually in incognito mode for each priority query, or use a rank-tracking tool like Semrush or Ahrefs that surfaces AI Overview inclusion as a separate data point.
"The single highest-signal metric is appearance in Google AI Overviews for your target queries. That's the clearest leading indicator that the fixes are working." — Nikita Janockin, Founder, OG Traffic
Perplexity citation count is your second instrument. Unlike Google AI Overviews, Perplexity shows its sources inline — which means you can count them, track them, and compare them against competitors. Run the same five to ten target queries in Perplexity weekly and log which domains appear. This isn't glamorous work, but it's precise. One practical implication most audits miss: a competitor appearing consistently across multiple queries signals structural patterns worth studying, not just domain authority worth envying.
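The weekly log doesn't need tooling; appending rows to a CSV is enough to see the competitor patterns emerge. A minimal sketch (the query and domains are invented; swap the in-memory buffer for a real file in practice):

```python
import csv
import io
from datetime import date

def log_citations(fileobj, query, cited_domains, day=None):
    """Append one row per domain Perplexity cited inline for this query."""
    day = day or date.today().isoformat()
    writer = csv.writer(fileobj)
    for domain in cited_domains:
        writer.writerow([day, query, domain])

# In practice: open("citations.csv", "a", newline="") instead of StringIO.
buf = io.StringIO()
log_citations(buf, "best crm for small business",
              ["wikipedia.org", "competitor.com"], day="2025-01-06")
print(buf.getvalue())
```

A few weeks of rows is enough to make the repeat citations, and the domains worth reverse-engineering, obvious at a glance.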
Branded mentions in ChatGPT are the hardest to measure reliably, but don't skip them. The key caveat here is testing methodology. Don't run these queries from your own logged-in ChatGPT account — the model has context about your project and will bias toward surfacing you. Test in incognito, with a fresh account, or through a tool surface that queries the platform neutrally. This is the same reason atomicagi and similar monitoring tools exist: to remove the observer effect from your measurement.
Here's the uncomfortable truth about all three metrics: they're unstable by design. Semrush research found that 40–60% of cited sources change month-to-month in Google AI Mode and ChatGPT [3]. That churn rate means a single audit snapshot is nearly worthless. What you're actually building is a monitoring cadence — check AI Overview inclusion every two to four weeks, log Perplexity citations weekly, and run ChatGPT brand-mention tests monthly from a neutral account.
This means the audit isn't a one-time project. It's a recurring diagnostic. Set a calendar reminder. Treat citation volatility the same way you'd treat ranking fluctuations in traditional SEO: expected, manageable, and only meaningful when tracked over time.
One final note on tooling. IndexNow surfaces which pages are being crawled and indexed in ways that correlate with AI citability — it's free and worth connecting before you spend anything on paid coverage. OG Traffic is shipping a free AI citability audit tool specifically so non-technical founders don't have to stitch together Semrush, manual Perplexity checks, and incognito ChatGPT tests just to get a baseline read. Until then, the three-metric framework above — AI Overviews, Perplexity citations, ChatGPT brand mentions — gives you everything you need to know whether your content is being found, cited, and trusted by AI systems.
FAQ
What is the difference between AI citability and traditional SEO?
Traditional SEO optimizes for rankings — getting your page to appear on page one of search results. AI citability focuses on whether AI tools like ChatGPT, Perplexity, and Google AI Overviews actually quote or reference your content in their responses [1]. The practical difference shows up in what you optimize: structured data, crawler access, and content extractability matter more than backlink counts [1]. Think structure and substance, not just authority.
How often should I audit my site for AI citability?
More often than you'd expect. Semrush research found that 40–60% of cited sources change month-to-month in Google AI Mode and ChatGPT [15], meaning a site cited today may be invisible next month. Plan for a full audit every 30–60 days, with lighter spot-checks in between. Track your appearance in Google AI Overviews as your primary signal [13] — it's the clearest early indicator that your fixes are holding.
What are Answer Capsules and how do they relate to AI citability?
Answer Capsules are self-contained passages within your content — tight, standalone sections that directly answer a specific question without requiring surrounding context. AI models extract these naturally when generating responses. Sites missing this structure (no TL;DRs, no clear question-answer pairings) are among the most common non-cited pages [6]. Write every key section as if it could stand alone in an AI response, and your extractability improves immediately.
How do Core Web Vitals impact AI output appearance?
Core Web Vitals affect how quickly AI crawlers like GPTBot and ClaudeBot can access and process your pages, but they are not a primary citability signal. The bigger technical blockers are robots.txt misconfigurations — roughly 20–30% of sites accidentally block AI crawlers entirely [7] — and missing or invalid schema markup [5]. Fix crawler access and schema first; page speed is a secondary concern once those fundamentals are in place.
What is the biggest misconception about AI citability?
That it's an entirely new discipline requiring "GEO magic" separate from good SEO. It isn't. The underlying requirement is the same: provide distinctive, genuinely valuable content for the reader. Format and structure matter — schema, TL;DRs, cited statistics — and research shows content with sourced data sees 30–40% higher AI visibility [9]. But substance drives everything. Anyone promising AI citation shortcuts that bypass real domain expertise is selling snake oil.