How to Track GPTBot, ClaudeBot, and PerplexityBot Hits on Your Site
AI answer engines only recommend brands they can crawl. Here's how to confirm GPTBot, ClaudeBot, PerplexityBot and others are actually reading your pages, and what to do when they aren't.
Aeranko Team
AI Search Optimization

ChatGPT, Claude, Perplexity, and Gemini all recommend brands every day. But they only recommend brands they have read. If GPTBot can't fetch your pricing page, ChatGPT has nothing to cite when a user asks what you cost. If ClaudeBot is blocked by your CDN, you'll never show up in Claude's answers no matter how good your content is. Most teams discover this the expensive way: checking SEO rankings and feeling fine while their AI visibility quietly rots.
This guide shows you how to track AI-crawler traffic on your own site, spot gaps before they cost you deals, and turn the data into an AEO priority list.
Which AI Crawlers Should You Watch?
There are three groups worth tracking, and they're not equally important.
Indexing crawlers build the training corpus each AI model learns from. These include GPTBot (OpenAI), ClaudeBot and anthropic-ai (Anthropic), Google-Extended (Google Gemini / AI Overviews), Applebot-Extended (Apple Intelligence), and CCBot (Common Crawl, which feeds many open-source models). If an indexer can't read your page, it won't exist in the model's training data.
Live-retrieval crawlers fetch pages in real time when a user asks a question. ChatGPT-User, Perplexity-User, OAI-SearchBot, and PerplexityBot fall here. These are far more important for short-term visibility because they read your site while the user is waiting for an answer.
Meta crawlers (Bingbot, Bytespider, Meta-ExternalAgent, FacebookBot) power hybrid AI surfaces like Copilot, TikTok's AI, and Meta AI. Often overlooked, but they matter if your audience overlaps with those platforms.
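If you want to classify these yourself, the three groups above boil down to a lookup table keyed by user-agent token. Here's a minimal TypeScript sketch; the substring-matching approach and the token list are ours (real user-agent strings carry extra version and URL info, so exact matching won't work):

```typescript
// Map known AI crawler user-agent tokens to the three groups described above.
type CrawlerGroup = "indexing" | "live-retrieval" | "meta";

const CRAWLER_GROUPS: Record<string, CrawlerGroup> = {
  "GPTBot": "indexing",
  "ClaudeBot": "indexing",
  "anthropic-ai": "indexing",
  "Google-Extended": "indexing",
  "Applebot-Extended": "indexing",
  "CCBot": "indexing",
  "ChatGPT-User": "live-retrieval",
  "Perplexity-User": "live-retrieval",
  "OAI-SearchBot": "live-retrieval",
  "PerplexityBot": "live-retrieval",
  "Bingbot": "meta",
  "Bytespider": "meta",
  "Meta-ExternalAgent": "meta",
  "FacebookBot": "meta",
};

// Return the group for a raw User-Agent header, or null for ordinary browsers.
export function classifyCrawler(userAgent: string): CrawlerGroup | null {
  const ua = userAgent.toLowerCase();
  for (const [token, group] of Object.entries(CRAWLER_GROUPS)) {
    if (ua.includes(token.toLowerCase())) return group;
  }
  return null;
}
```

A browser user agent falls through to `null`, which is what lets you separate AI traffic from human traffic in the first place.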
The Three Ways Teams Try to Track This (And Why Two Fail)
Option 1: Grepping server logs
The classic method. SSH in, grep user-agents, count rows. Works if you have one server and two crawlers to care about. Breaks the moment you're on Vercel, Cloud Run, or any edge platform where logs are ephemeral, partial, or paid-tier-only. It also buries the data in a format nobody on your marketing team can act on.
Option 2: Reading Cloudflare / Vercel bot reports
Better. Cloudflare's AI Audit and Vercel's bot analytics tell you crawler volume. But they bucket all AI crawlers together, don't separate training from real-time retrieval, and don't connect crawler activity to page-level AEO priorities. You'll see that GPTBot visited 4,000 times. You won't see which pages it fetched or which ones returned a 404.
Option 3: A dedicated crawler-tracking middleware
The right answer. A small piece of code that inspects every incoming request, flags the AI crawlers, and ships the event to a dashboard keyed by page + platform + timestamp. That's what Aeranko Ship does: one install, zero config, live data on every deploy.
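As a rough illustration of what such middleware does per request (this is a framework-agnostic sketch, not Aeranko Ship's actual code; the crawler list and the collector endpoint in the comment are placeholders):

```typescript
// Sketch of the per-request path in crawler-tracking middleware:
// match the user agent, build an event keyed by page + platform + timestamp.
const AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "ChatGPT-User"];

interface CrawlerEvent {
  crawler: string;
  path: string;
  timestamp: string;
}

// Inspect one request; return the event to ship, or null for human traffic.
export function toCrawlerEvent(userAgent: string, path: string, now = new Date()): CrawlerEvent | null {
  const crawler = AI_CRAWLERS.find((c) => userAgent.includes(c));
  if (!crawler) return null;
  return { crawler, path, timestamp: now.toISOString() };
}

// In a Next.js-style middleware you would call this per request and POST the
// event to your collector without blocking the response, e.g.:
//   const event = toCrawlerEvent(req.headers.get("user-agent") ?? "", req.nextUrl.pathname);
//   if (event) fetch("https://collector.example.com/events", { method: "POST", body: JSON.stringify(event) });
```

The important design choice is firing the event asynchronously so tracking never adds latency to the crawler's response.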
The Four Metrics That Matter
Once you have data flowing, you need to know what to look at. Four signals separate "AI can see you" from "you're invisible":
- Crawler coverage by platform. Are GPTBot, ClaudeBot, PerplexityBot, and Google-Extended all reaching your site? Missing even one narrows your answer-engine footprint by 25%.
- Page-level crawler heatmap. Which pages get the most AI traffic? If your pricing and docs pages are being hit daily but your case studies aren't, that's a content-structure gap to fix.
- Error-rate per crawler. A 4xx or 5xx to an AI bot is invisible in Google Search Console. Most teams ship broken robots.txt rules and only find out months later when they notice they've disappeared from Perplexity.
- Trend over time. Did GPTBot double its visits after you published a comparison page? Did ClaudeBot's volume drop after a CDN rule change? Weekly deltas tell you what the models are reacting to.
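To make the error-rate metric concrete, here's a hedged sketch of how it falls out of the raw events; the Hit shape is an assumption for illustration, not a real Aeranko type:

```typescript
// One logged crawler hit: which bot, and what HTTP status it received.
interface Hit {
  crawler: string;
  status: number;
}

// Share of 4xx/5xx responses per crawler, as a fraction between 0 and 1.
export function errorRateByCrawler(hits: Hit[]): Map<string, number> {
  const totals = new Map<string, { total: number; errors: number }>();
  for (const hit of hits) {
    const entry = totals.get(hit.crawler) ?? { total: 0, errors: 0 };
    entry.total += 1;
    if (hit.status >= 400) entry.errors += 1;
    totals.set(hit.crawler, entry);
  }
  const rates = new Map<string, number>();
  for (const [crawler, { total, errors }] of totals) rates.set(crawler, errors / total);
  return rates;
}
```

A crawler sitting at a 100% error rate is the "broken robots.txt shipped months ago" failure mode made visible.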
Make Sure You're Not Accidentally Blocking Them
Before tracking, check you're not blocking. The five most common ways teams accidentally hide from AI:
- robots.txt denying GPTBot, ClaudeBot, or Google-Extended without realizing it (often copied from an old Shopify or WordPress template)
- CDN / WAF bot-protection rules treating AI crawlers as scrapers and serving 403
- Next.js / Nuxt middleware rate-limiting all non-browser agents
- Authentication walls on pages that don't actually require auth (product pages behind a cookie gate)
- Cache configuration returning stale 5xx errors to bots that retry once and never come back
Fix those first. Your tracker then tells you whether the fix worked; the crawler hit graph should start climbing within a few days.
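If you generate robots.txt programmatically, making the allow-list explicit removes the "copied template" failure mode. A sketch of one way to do that; the helper and its parameters are illustrative, not a standard API:

```typescript
// Emit robots.txt text that explicitly allows the named AI crawlers,
// then applies any disallow rules only to everyone else.
export function buildRobotsTxt(aiCrawlers: string[], disallowForOthers: string[] = []): string {
  const lines: string[] = [];
  for (const crawler of aiCrawlers) {
    lines.push(`User-agent: ${crawler}`, "Allow: /", "");
  }
  lines.push("User-agent: *");
  for (const path of disallowForOthers) lines.push(`Disallow: ${path}`);
  return lines.join("\n");
}
```

Serving this from a route (or a Next.js robots.ts file) keeps the crawler allow-list in version control, so a regression shows up in code review instead of a flatlined graph.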
Tracking With Aeranko Ship (2 Minutes)
Aeranko Ship is the observability layer we built for exactly this. Install it once and every AI crawler hit streams into a dashboard grouped by platform, page, and status. It also generates the four AEO files (llms.txt, robots.ts, sitemap.ts, metadata) so you stop leaking visibility to misconfigurations.
Three ways to install:
# Option A: npm, one-line in your proxy.ts / middleware.ts
npm i @aeranko/ship
# Option B: CLI, opens a browser for pairing + auto-injects everything
npx @aeranko/cli install
Option C: Vercel Marketplace or GitHub App. One-click integration that injects the API key and opens a PR to wire it up. Best for non-technical teams.
After install, open /dashboard/crawlers to see the four metrics above rendered live.
What You'll Actually Do With the Data
Tracking is only useful if it changes a decision. In practice, Aeranko customers use crawler data three ways:
Triage content. Pages AI bots visit weekly are your AEO surface. Add a definition block, canonical URL, and Schema.org Article or FAQPage markup to those first. Ignore the long tail until crawlers find it.
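For the Schema.org step, a small helper can emit FAQPage JSON-LD to drop into a script tag on your most-crawled pages. The helper name is ours; the @type structure follows the Schema.org FAQPage vocabulary:

```typescript
// Build Schema.org FAQPage JSON-LD from question/answer pairs.
export function faqJsonLd(questions: { q: string; a: string }[]): string {
  return JSON.stringify({
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: questions.map(({ q, a }) => ({
      "@type": "Question",
      name: q,
      acceptedAnswer: { "@type": "Answer", text: a },
    })),
  });
}
```

Embed the output in a `<script type="application/ld+json">` tag on the page itself so crawlers pick it up in a single fetch.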
Prove AEO work to leadership. "We added citation-ready content last month and GPTBot visits jumped 4.2x" is a better story than "we think AEO is working." Screenshots of the crawler trend line close that loop.
Catch regressions. A Vercel deploy changed your middleware? A new CDN rule? The crawler graph will flatline within 12 hours. Without tracking, you'd discover the drop a quarter later in revenue.
Start Small, Scale the Signal
You don't need a full AEO platform to start. You need one line of middleware and an honest look at what's hitting your site. Most teams we onboard discover within 48 hours that one or two crawlers are either missing entirely or returning errors, and the fix takes an afternoon.
Install Aeranko Ship, watch a week of data, then decide which AEO investments make sense. If you want to see the dashboard before installing, run a free audit here. It works without any tracker and tells you whether the AI engines already recommend you.
See how your brand ranks in AI search — for free
Run a free AI audit and get your visibility score across ChatGPT, Perplexity, Gemini, and Google AI Overviews in 60 seconds.
Run Free AI Audit