AI Crawler Check
Free Bot Analysis Tool

Changelog

Version history and updates for AI Crawler Check.

v4.3 30 Blog Posts: E-E-A-T, GEO vs SEO & AI Bot Troubleshooting March 27, 2026
  • 30 BLOG POSTS: Published 3 new E-E-A-T-focused articles: E-E-A-T for AI Search, GEO vs SEO Complete Comparison, and Why AI Bots Can’t Crawl Your Website
  • 12 NEW IMAGES: 3 hero images + 9 inline diagrams covering trust pyramids, GEO/SEO frameworks, diagnostic flowcharts, and robots.txt fixes
  • E-E-A-T AUTHORITY: Deep-dive into Experience, Expertise, Authoritativeness & Trustworthiness signals for AI search citation. Original frameworks and expert analysis
  • INTERNAL LINKS: 80+ strategic internal links across 3 posts. Heavy tool CTAs, cross-referencing to crawler guides, robots.txt generator, schema markup, and audit checklist
  • WAF FIX: Fixed false-positive bot blocking when target sites use Cloudflare WAF, using a multi-UA fallback strategy with WAF detection
v4.2 3 New Blog Posts: Google-Agent, WebMCP & llms.txt Guide March 26, 2026
  • 27 BLOG POSTS: Published 3 new in-depth articles — Google-Agent & Project Mariner, WebMCP Guide, How to Create llms.txt
  • 12 NEW IMAGES: 3 hero images + 9 inline diagrams covering agentic browsing, WebMCP architecture, and llms.txt structure
  • TOPICS: Agentic Engine Optimization (AEO), WebMCP Tool Contracts, user-triggered-agents.json, Web Bot Auth, llms.txt templates for 4 website types
v4.1 Google-Agent, 24 Blog Posts & Microsoft Clarity March 25, 2026
  • NEW BOT: Google-Agent (Project Mariner) — Google’s brand-new agentic AI crawler that navigates sites, fills forms, and takes actions on behalf of users. Monitor its activity to measure how important WebMCP is for your site
  • 155 BOTS: Bot database expanded from 154 to 155. Google-Agent added to the Google Bots category with a full directory page, FAQs, and structured data. Now 248 SEO-optimized directory pages (161 bot + 8 category + 72 operator)
  • 24 BLOG POSTS: Published 12 new articles (posts 13–24) covering robots.txt creation, Applebot-Extended, Meta-ExternalAgent, AI SEO Audit Checklist, Cohere AI Crawler, and more
  • E-E-A-T FOCUS: All new posts emphasize Experience, Expertise, Authoritativeness, and Trustworthiness with author bylines, expert recommendations, and data-backed analysis
  • ANCHOR TEXT: Strategic homepage anchor text distribution — “AI crawler checker free” (14x), “AI crawler checker online” (12x), “web crawler tool free” (6x), “AI crawlers” (4x)
  • CLARITY: Microsoft Clarity analytics (w0nkoay9xt) added to all pages — heatmaps, session recordings, and user behavior tracking across homepage, blog, tools, directory, and legal pages
  • 48 NEW IMAGES: 12 hero images + 36 inline diagrams/infographics for blog posts 13–24. Custom illustrations for each article topic
v4.0 Focused AI Bot Checker — Simplified & Faster March 18, 2026
  • FOCUS: Removed homepage signals parsing (Schema, H1, Meta Description, Canonical, OG tags) — tool now focuses solely on AI bot access & infrastructure
  • SIMPLIFIED SCORE: New 2-component scoring: Bot Access (65pts) + AI Infrastructure (35pts) = 100pts. Removed Content (25) and Technical (10) components
  • REMOVED: Browser Rendering API fallback, WAF/bot protection detection, homepage HTML parsing, extractSchemas(), extractMeta(), countTag() — eliminates all false-negative complaints from WAF-blocked sites
  • NO MORE WAF ISSUES: Since homepage is no longer scanned, sites behind Cloudflare/LiteSpeed/Wordfence WAFs get accurate results with zero workarounds needed
  • SMARTER RECS: Only actionable recommendations — unblock AI bots, create llms.txt/llms-full.txt, upgrade partial to full access. No more schema/meta/author suggestions
  • CTA: “Want a Full AI Visibility Audit?” card replaces Homepage Signals section — directs users to paid GEO audit for deeper analysis
  • FASTER: Removed homepage fetch + retry + Browser Rendering — checks complete ~3s faster on average
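The two-component score from the SIMPLIFIED SCORE entry can be sketched roughly as below. Only the 65/35 split comes from the entry; the input ratios, the clamping, and all names (`ScoreInput`, `totalScore`) are assumptions made for illustration, not the tool's actual code.

```typescript
// Weights taken from the v4.0 entry; everything else is illustrative.
const BOT_ACCESS_MAX = 65;
const AI_INFRA_MAX = 35;

interface ScoreInput {
  botAccessRatio: number; // hypothetical: share of bot-access credit earned, 0..1
  aiInfraRatio: number;   // hypothetical: share of AI-infrastructure credit earned, 0..1
}

// Keep a ratio inside the 0..1 range so bad input cannot exceed the caps.
function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// Combine the two components into a single 0..100 score.
function totalScore(input: ScoreInput): number {
  const botAccess = BOT_ACCESS_MAX * clamp01(input.botAccessRatio);
  const aiInfra = AI_INFRA_MAX * clamp01(input.aiInfraRatio);
  return Math.round(botAccess + aiInfra);
}
```

With this shape, a site that allows every bot but has no llms.txt files would land at 65 under the assumed inputs.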
v3.9 Browser Rendering Fallback — Bypass WAF with Headless Chrome March 18, 2026
  • BREAKTHROUGH: When WAF blocks homepage scan, tool now falls back to Cloudflare Browser Rendering API — headless Chromium renders the page like a real browser, bypassing CAPTCHAs and WAF challenges
  • RESULT: WAF-protected sites (LiteSpeed, Cloudflare, Wordfence, Sucuri) now get full homepage analysis — Schema, H1, Meta Description, Canonical, OG tags all detected correctly
  • 3-TIER FETCH: Homepage fetch now uses three strategies: (1) Chrome UA fetch → (2) Safari UA retry → (3) Headless Browser Rendering fallback
  • UX: Green “Browser Rendering Used” badge shows when headless browser was needed to bypass WAF
  • TYPED: Added Env type bindings for CF_API_TOKEN and CF_ACCOUNT_ID environment variables
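The 3-TIER FETCH chain above is a plain ordered fallback. A minimal sketch follows, assuming each tier is injected as a function; the strategy names and the `FetchResult` shape are hypothetical, and the real tiers (UA-specific fetches and the Browser Rendering call) are deliberately left out.

```typescript
interface FetchResult {
  ok: boolean;  // false when the response looks blocked or unusable
  html: string;
}

interface Strategy {
  name: string; // hypothetical label, e.g. "chrome-ua"
  run: (url: string) => Promise<FetchResult>;
}

// Try each strategy in order and return the first usable result.
async function fetchWithFallback(
  url: string,
  strategies: Strategy[],
): Promise<{ html: string; usedStrategy: string } | null> {
  for (const strategy of strategies) {
    try {
      const result = await strategy.run(url);
      if (result.ok) {
        return { html: result.html, usedStrategy: strategy.name };
      }
    } catch {
      // A thrown error is treated like a blocked response: move to the next tier.
    }
  }
  return null; // every tier failed
}
```

Returning the name of the winning strategy is what would let a UI show something like the “Browser Rendering Used” badge.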
v3.8.2 Honest WAF Reporting & UX Overhaul March 18, 2026
  • HONESTY: WAF-blocked signals now show ⚠ “Unable to verify” with yellow triangles instead of pretending they pass or fail — no more misleading results
  • UX: New 2-column layout in WAF warning: “✓ Accurate Results” vs “⚠ Unable to Verify” — users instantly see what’s reliable
  • NEW: Links to Google Rich Results Test & Schema.org Validator in WAF warning — guides users to verify schema manually
  • SMART: Recommendations now skip “Add FAQ Schema” and “Add Author signal” when WAF blocks scan — avoids recommending things the site may already have
  • UX: WAF-blocked signal rows show impact as “N/A” in yellow instead of misleading Critical/High labels
  • CLEANUP: Removed temporary debug endpoint /api/debug-signals
v3.8.1 Enhanced WAF Detection & Smarter Scoring March 18, 2026
  • ENHANCED: WAF/Bot detection now covers 15+ providers, including Cloudflare, LiteSpeed, Wordfence, Sucuri, Imperva, Akamai, StackPath, AWS WAF, hCaptcha, and reCAPTCHA
  • NEW: Auto-retry with Safari UA when initial Chrome UA triggers bot protection — bypasses simpler WAF rules
  • IMPROVED: Bot-protected sites now score 19/25 content (was 13/25) — assumes H1, meta desc, schema exist (standard for CMS sites behind WAF)
  • UX: Homepage signals show yellow ⚠ warning icons for WAF-blocked items instead of misleading red ✗ marks
  • IMPROVED: Homepage fetch now sends full browser headers (Accept, Sec-Fetch-*, Accept-Language) — reduces WAF trigger rate
v3.8 Schema Detection Fix & Bot Protection Awareness March 18, 2026
  • FIXED: Schema detection now traverses @graph arrays (WordPress/RankMath/Yoast) — previously only checked root-level @type
  • FIXED: extractSchemas() rewritten as recursive collectSchemaTypes() — handles nested @graph, mainEntity, and arrays at any depth
  • NEW: Bot Protection Detection — identifies WAF/CAPTCHA pages (LiteSpeed, Cloudflare, Wordfence) and flags botProtection: true in results
  • IMPROVED: Bot-protected sites now receive partial content credit (13/25) instead of 0 — no longer penalized for CAPTCHA pages
  • PERF: Homepage HTML cached from STEP 0 probe — eliminated duplicate fetch, saving ~200ms per check
  • IMPROVED: extractMeta() now handles multi-line attributes (WordPress meta tags with newlines between properties)
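The recursive traversal described in the two FIXED entries can be sketched as follows. This is an illustration under assumptions, not the tool's `collectSchemaTypes()` source: it walks every property rather than only `@graph` and `mainEntity`, and the `Json` type is introduced here for the example.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Collect every @type value found anywhere in a parsed JSON-LD document.
function collectSchemaTypes(node: Json, found: Set<string> = new Set()): Set<string> {
  if (Array.isArray(node)) {
    // Arrays (including @graph) are walked item by item.
    for (const item of node) collectSchemaTypes(item, found);
  } else if (node !== null && typeof node === "object") {
    const type = node["@type"];
    if (typeof type === "string") {
      found.add(type);
    } else if (Array.isArray(type)) {
      // @type may also be a list of type names.
      for (const t of type) if (typeof t === "string") found.add(t);
    }
    // Recurse into every other property so nested entities are not missed.
    for (const [key, value] of Object.entries(node)) {
      if (key !== "@type") collectSchemaTypes(value, found);
    }
  }
  return found;
}
```

A root-level check would see no `@type` at all on a document whose entities live under `@graph`, which is the failure mode the fix addresses.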
v3.7 AI Files Detection Overhaul & Score Rebalance March 17, 2026
  • Fixed False Positives: llms.txt and llms-full.txt detection now uses strict validation — redirect: manual prevents 301/302 homepage redirects from being counted as “Found”
  • Content Validation: AI files must return 200 OK with non-HTML content. Responses containing HTML (custom 404 pages, redirected homepages) are correctly rejected
  • Smart Redirect Follow: Follows one redirect only if the target URL ends with the same filename (e.g., shopify.com/llms.txt → www.shopify.com/llms.txt). Homepage redirects are rejected
  • Removed ai.txt: /ai.txt has been removed from all checks — no actual standard exists for this file. Only llms.txt and llms-full.txt (official llmstxt.org standard) are checked
  • Score Rebalanced: Bot Access (40), AI Infrastructure (25), Content (25), Technical (10). Removed 5pts from ai.txt, redistributed to Content quality signals
  • Social Preview: All pages now have OG image + Twitter Card meta tags for rich social previews when sharing on Facebook, LinkedIn, X/Twitter, Slack
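The detection rules above (no automatic redirects, 200 OK with non-HTML content, at most one redirect to the same filename) can be expressed as a small decision function. This is a sketch under assumptions: the `ProbeResponse` shape, the verdict names, and the HTML sniffing heuristic are hypothetical. In a real check the response would come from a `fetch` call made with `redirect: "manual"`.

```typescript
interface ProbeResponse {
  status: number;
  contentType: string | null; // Content-Type header, if present
  location: string | null;    // Location header on redirects
  body: string;
}

type Verdict = "found" | "not-found" | "follow-redirect";

// Heuristic: treat the response as HTML if the header or the first bytes say so.
function looksLikeHtml(res: ProbeResponse): boolean {
  const type = (res.contentType ?? "").toLowerCase();
  const head = res.body.trimStart().slice(0, 200).toLowerCase();
  return type.includes("text/html") || head.startsWith("<!doctype html") || head.startsWith("<html");
}

function classifyAiFile(requestUrl: string, res: ProbeResponse, redirectsFollowed: number): Verdict {
  const fileName = new URL(requestUrl).pathname.split("/").pop() ?? "";
  if (res.status >= 300 && res.status < 400) {
    // Only one redirect is followed, and only to the same filename.
    if (redirectsFollowed >= 1 || !res.location) return "not-found";
    const target = new URL(res.location, requestUrl);
    return target.pathname.endsWith("/" + fileName) ? "follow-redirect" : "not-found";
  }
  if (res.status !== 200) return "not-found";
  // Custom 404 pages and redirected homepages usually arrive as HTML.
  return looksLikeHtml(res) ? "not-found" : "found";
}
```

The caller would re-run the probe on the redirect target with `redirectsFollowed` set to 1 whenever the verdict is `follow-redirect`.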
v3.6 Bot Database Accuracy & Versioned Bots March 17, 2026
  • 154 Bots: Added 3 new versioned AI bots: ChatGPT-User/2.0, MistralAI-User-1.0, Perplexity-User-1.0 — matching CrawlerCheck.com’s 34-bot AI directory
  • Deprecated Bot Labels: Claude-Web and anthropic-ai are now marked as Deprecated per Anthropic’s official docs (replaced by Claude-User, Claude-SearchBot, ClaudeBot)
  • ClaudeBot Relabeled: Changed from “Claude AI (Legacy)” to “Claude AI Training” to accurately reflect its official purpose per Anthropic docs
  • Versioned Bot Matching: New parser logic handles versioned user-agents (e.g., ChatGPT-User/2.0 inherits ChatGPT-User rules from robots.txt). Same for -1.0 suffixes
  • Cross-Verified: AI bot list verified against official sources: OpenAI Bots docs, Anthropic Crawler docs, CrawlerCheck.com directory, Search Engine Journal’s 2025 list
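The Versioned Bot Matching entry can be illustrated with a short lookup that strips a trailing version before falling back. The regular expression, the `RuleGroups` map, and the fallback to the `*` group are assumptions for this sketch, not the tool's actual parser.

```typescript
// Strip a trailing version such as "/2.0" or "-1.0" from a user-agent token.
function baseAgentName(agent: string): string {
  return agent.replace(/[\/-]\d+(\.\d+)*$/, "");
}

// Hypothetical shape: rule lines keyed by lowercase user-agent token from robots.txt.
type RuleGroups = Map<string, string[]>;

// Look up rules for an agent: exact token first, then the unversioned base, then "*".
function rulesFor(agent: string, groups: RuleGroups): string[] | undefined {
  const exact = groups.get(agent.toLowerCase());
  if (exact) return exact;
  const base = groups.get(baseAgentName(agent).toLowerCase());
  if (base) return base;
  return groups.get("*");
}
```

Under these assumptions, `ChatGPT-User/2.0` picks up the `ChatGPT-User` group when no group names the versioned token explicitly.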
v3.5 Partial Access & Report Naming March 17, 2026
  • Partial = Accessible: Bots with partial access (e.g., /admin blocked but / allowed) now count as “accessible” in overview. Only fully BLOCKED bots are counted as denied. This prevents misleading “0% allowed” for sites that only block admin paths
  • Category Overview Rework: Shows “X full · Y partial” breakdown instead of just “X/Y allowed”. Percentage now reflects real accessibility
  • Bot Card Labels: Status now shows “FULLY ALLOWED”, “PARTIALLY ALLOWED”, or “BLOCKED” for clarity
  • Report Naming: PDF/Excel exports renamed from “AI Visibility Report” to “AI Crawler Check Report” — accurate to the tool’s function
  • Score Adjustment: PARTIAL bots now receive 75% score credit (was 50%). Reflects that partial access still means content is crawlable
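The overview and scoring changes above can be sketched together. The three status labels and the 75% partial credit come from the entries; the `Overview` shape, the averaging, and the function name are assumptions for illustration.

```typescript
type AccessStatus = "FULLY ALLOWED" | "PARTIALLY ALLOWED" | "BLOCKED";

// Credit per status; the 0.75 for partial access is the figure from the entry above.
const CREDIT: Record<AccessStatus, number> = {
  "FULLY ALLOWED": 1,
  "PARTIALLY ALLOWED": 0.75,
  "BLOCKED": 0,
};

interface Overview {
  full: number;
  partial: number;
  blocked: number;
  accessiblePercent: number; // full + partial as a share of all bots
  creditRatio: number;       // average score credit across bots, 0..1
  label: string;
}

function summarize(statuses: AccessStatus[]): Overview {
  const full = statuses.filter((s) => s === "FULLY ALLOWED").length;
  const partial = statuses.filter((s) => s === "PARTIALLY ALLOWED").length;
  const blocked = statuses.length - full - partial;
  const total = statuses.length || 1; // avoid dividing by zero on an empty list
  const credit = statuses.reduce((sum, s) => sum + CREDIT[s], 0);
  return {
    full,
    partial,
    blocked,
    accessiblePercent: Math.round(((full + partial) / total) * 100),
    creditRatio: credit / total,
    label: `${full} full · ${partial} partial`,
  };
}
```

For a site that only blocks /admin for every bot, all statuses are partial, so the overview reports 100% accessible rather than “0% allowed”.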
v3.4 Deep Verification & Accuracy Guarantee March 17, 2026
  • Double Verification: robots.txt is now fetched twice and results compared. If content changed between fetches (CDN cache, etc.), the freshest version is used
  • Enforced Backend Processing: Minimum 8-second server-side processing time ensures all checks genuinely complete. No shortcuts, no cached guesses
  • Frontend Sync: Animation steps now match real backend processing. 10+ second minimum display with step-by-step progress visible to the user
  • Error UX Improved: Invalid domains immediately show clear error messages. No more fake results for non-existent websites
  • Version Tag: Reports now include processingTime to prove thorough analysis
v3.3 Domain Validation & Integrity March 17, 2026
  • Critical Fix: Added mandatory domain reachability check (STEP 0). Invalid or non-existent domains (e.g., example.com.sgsg) now return a clear error instead of fake results
  • DNS Validation: The tool now verifies the website actually responds before analyzing. DNS failures, connection timeouts, and Cloudflare 5xx errors are properly detected
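  • robots.txt Validation: Content is now validated to ensure it’s actually a robots.txt file (not a custom error page returning 200). Status handling: 404 = all allowed (per RFC 9309), 403 = all blocked (stricter than RFC 9309, which treats 4xx responses as “unavailable” and lets crawlers access any resource)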
  • Minimum Display Time: Results always show for at least 6 seconds with progressive step animation to confirm thorough analysis
  • Error UX: Clear, actionable error messages for unreachable domains, DNS failures, and timeout scenarios
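The robots.txt validation step can be sketched as a mapping from the HTTP response to a policy. The 404 and 403 handling follows the entry above; treating every other non-200 status as invalid, the HTML sniffing, and the directive check are assumptions for this sketch.

```typescript
type RobotsPolicy = "parse" | "all-allowed" | "all-blocked" | "invalid";

function robotsPolicyFor(status: number, contentType: string | null, body: string): RobotsPolicy {
  if (status === 404) return "all-allowed"; // no robots.txt: nothing is disallowed
  if (status === 403) return "all-blocked"; // the tool's stated behavior
  if (status !== 200) return "invalid";     // assumption: other statuses are not trusted

  // Reject custom error pages that return 200 with HTML.
  const type = (contentType ?? "").toLowerCase();
  const head = body.trimStart().slice(0, 200).toLowerCase();
  if (type.includes("text/html") || head.startsWith("<!doctype html") || head.startsWith("<html")) {
    return "invalid";
  }

  // Accept an empty file, or one with at least one recognizable directive line.
  const hasDirective = /^\s*(user-agent|allow|disallow|sitemap)\s*:/im.test(body);
  return hasDirective || body.trim() === "" ? "parse" : "invalid";
}
```

Only the `parse` outcome hands the body on to the rule parser; the other three short-circuit the check.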
v3.2 Accuracy Fix — robots.txt Only March 17, 2026
  • Critical Fix: Completely removed HTTP simulation. robots.txt is now the SOLE source of truth for bot access status. This eliminates ALL false positives caused by WAF/CDN/IP-based blocking (Cloudflare, Akamai, etc.)
  • Why: HTTP 403 from datacenter IPs does NOT mean the site owner blocks a bot; it’s often generic bot-protection. Only robots.txt reflects the site owner’s explicit intent
  • Partial Explained: Added tooltip explaining what “Partial” means on every bot card
  • Bot Purpose Labels: Every bot card shows its purpose — Training, Indexing, User Request, Both, Scraping, Monitoring, Social
  • Check Timing: Increased analysis animation to 14+ seconds for user trust
  • Directory UI Fix: Fixed broken layout in Web Crawler & Bot Directory pages
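With robots.txt as the sole source of truth, a bot's status can be derived from the rules in its matching group alone. The sketch below is a simplified heuristic under assumptions: it ignores wildcards, the `Rule` shape and status names are hypothetical, and it is not the tool's actual classifier.

```typescript
interface Rule {
  type: "allow" | "disallow";
  path: string; // value after the colon; "" means the directive is empty
}

type Access = "ALLOWED" | "PARTIAL" | "BLOCKED";

function classifyAccess(rules: Rule[]): Access {
  // An empty Disallow (or Allow) value restricts nothing, so it is dropped.
  const disallows = rules.filter((r) => r.type === "disallow" && r.path !== "").map((r) => r.path);
  const allows = rules.filter((r) => r.type === "allow" && r.path !== "").map((r) => r.path);

  // RFC 9309: when equivalent Allow and Disallow rules match, Allow should win.
  const rootBlocked = disallows.includes("/") && !allows.includes("/");
  if (rootBlocked) {
    // Any remaining Allow rule carves an exception out of the site-wide block.
    return allows.length > 0 ? "PARTIAL" : "BLOCKED";
  }
  // Root is reachable; any other Disallow path makes access partial.
  const otherDisallows = disallows.filter((p) => p !== "/");
  return otherDisallows.length > 0 ? "PARTIAL" : "ALLOWED";
}
```

No HTTP request made as the bot enters this decision, which is why a WAF returning 403 to datacenter IPs cannot change the result.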
v3.1 Accuracy & UX Overhaul March 17, 2026
  • Critical Fix: Resolved false-positive blocking reports caused by WAF/CDN detection (e.g., Cloudflare-protected sites returning HTTP 403 for all datacenter IPs)
  • Accuracy: robots.txt is now the canonical source of truth. HTTP simulation only upgrades severity, never overrides a robots.txt ALLOWED status
  • Bot Purpose Labels: Every bot now shows its purpose — Training, Indexing, User Request, Both (Training + User Request), Scraping, Monitoring, Social
  • Partial Explained: PARTIAL status now includes an inline explanation showing which paths are blocked vs allowed
  • New Logo: Custom SVG logo for AI Crawler Check replaces old BH branding in header
  • Full Navigation: Directory, Tools, Generator, Validator, Batch Checker links on every page
  • Check Animation: Progressive step-by-step loading with realistic timing for better user trust
  • Changelog: This page — track all updates and version history
v3.0 SEO Tools Suite March 17, 2026
  • Robots.txt Generator: Create custom robots.txt with 160+ bot presets, 6 one-click presets, per-bot toggles
  • Robots.txt Validator: Paste or fetch robots.txt, analyze against 160+ bots, SEO Safety Score
  • Batch URL Checker: Check up to 20 URLs at once with CSV export and formatted reports
  • Bot Directory: 248 SEO-optimized pages — 161 bot detail pages, 8 category pages, 72 operator pages
  • SEO: JSON-LD structured data, FAQ rich snippets, XML sitemap with 248+ URLs
v2.0 151 Bots & 8 Categories (Initial) March 2026
  • Expanded to 151 bots across 8 categories (AI bots, search engines, Google bots, SEO tools, social bots, scrapers, cloud services, other agents)
  • PDF export with branded report
  • Meta-robots and X-Robots-Tag analysis
  • AI Visibility Score with 4-component breakdown
v1.0 Initial Launch March 2026
  • First release with basic robots.txt checking
  • Core AI bot database and analysis engine