Version history and updates for AI Crawler Check.
v4.3
30 Blog Posts: E-E-A-T, GEO vs SEO & AI Bot Troubleshooting
March 27, 2026
- 30 BLOG POSTS: Published 3 new E-E-A-T focused articles — E-E-A-T for AI Search, GEO vs SEO Complete Comparison, Why AI Bots Can’t Crawl Your Website
- 12 NEW IMAGES: 3 hero images + 9 inline diagrams covering trust pyramids, GEO/SEO frameworks, diagnostic flowcharts, and robots.txt fixes
- E-E-A-T AUTHORITY: Deep-dive into Experience, Expertise, Authoritativeness & Trustworthiness signals for AI search citation. Original frameworks and expert analysis
- INTERNAL LINKS: 80+ strategic internal links across 3 posts. Heavy tool CTAs, cross-referencing to crawler guides, robots.txt generator, schema markup, and audit checklist
- WAF FIX: Fixed false-positive bot blocking when target sites use Cloudflare WAF. Multi-UA fallback strategy with WAF detection
v4.2
3 New Blog Posts: Google-Agent, WebMCP & llms.txt Guide
March 26, 2026
- 27 BLOG POSTS: Published 3 new in-depth articles — Google-Agent & Project Mariner, WebMCP Guide, How to Create llms.txt
- 12 NEW IMAGES: 3 hero images + 9 inline diagrams covering agentic browsing, WebMCP architecture, and llms.txt structure
- TOPICS: Agentic Engine Optimization (AEO), WebMCP Tool Contracts, user-triggered-agents.json, Web Bot Auth, llms.txt templates for 4 website types
v4.1
Google-Agent, 24 Blog Posts & Microsoft Clarity
March 25, 2026
- NEW BOT: Google-Agent (Project Mariner) — Google’s brand-new agentic AI crawler that navigates sites, fills forms, and takes actions on behalf of users. Monitor its activity to measure how important WebMCP is for your site
- 155 BOTS: Bot database expanded from 154 → 155. Google-Agent added to Google Bots category with full directory page, FAQs, and structured data. Now 248 SEO-optimized directory pages (161 bot + 8 category + 72 operator)
- 24 BLOG POSTS: Published 12 new articles (posts 13–24) covering robots.txt creation, Applebot-Extended, Meta-ExternalAgent, AI SEO Audit Checklist, Cohere AI Crawler, and more
- E-E-A-T FOCUS: All new posts emphasize Experience, Expertise, Authoritativeness, and Trustworthiness with author bylines, expert recommendations, and data-backed analysis
- ANCHOR TEXT: Strategic homepage anchor text distribution — “AI crawler checker free” (14x), “AI crawler checker online” (12x), “web crawler tool free” (6x), “AI crawlers” (4x)
- CLARITY: Microsoft Clarity analytics (w0nkoay9xt) added to all pages — heatmaps, session recordings, and user behavior tracking across homepage, blog, tools, directory, and legal pages
- 48 NEW IMAGES: 12 hero images + 36 inline diagrams/infographics for blog posts 13–24. Custom illustrations for each article topic
v4.0
Focused AI Bot Checker — Simplified & Faster
March 18, 2026
- FOCUS: Removed homepage signals parsing (Schema, H1, Meta Description, Canonical, OG tags) — tool now focuses solely on AI bot access & infrastructure
- SIMPLIFIED SCORE: New 2-component scoring: Bot Access (65pts) + AI Infrastructure (35pts) = 100pts. Removed Content (25) and Technical (10) components
- REMOVED: Browser Rendering API fallback, WAF/bot protection detection, homepage HTML parsing, extractSchemas(), extractMeta(), countTag() — eliminates all false-negative complaints from WAF-blocked sites
- NO MORE WAF ISSUES: Since homepage is no longer scanned, sites behind Cloudflare/LiteSpeed/Wordfence WAFs get accurate results with zero workarounds needed
- SMARTER RECS: Only actionable recommendations — unblock AI bots, create llms.txt/llms-full.txt, upgrade partial to full access. No more schema/meta/author suggestions
- CTA: “Want a Full AI Visibility Audit?” card replaces Homepage Signals section — directs users to paid GEO audit for deeper analysis
- FASTER: Removed homepage fetch + retry + Browser Rendering — checks complete ~3s faster on average
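The two-component score in the SIMPLIFIED SCORE entry can be sketched roughly as below. Only the 65/35 split comes from these notes. The names computeScore and infraFraction, the averaging across bots, and the reuse of the 75% partial credit from v3.5 are assumptions made for illustration, not the tool's actual code.

```typescript
type BotStatus = "allowed" | "partial" | "blocked";

interface ScoreInput {
  botStatuses: BotStatus[]; // one entry per checked bot
  infraFraction: number;    // hypothetical: share (0..1) of AI infrastructure checks passed
}

const BOT_ACCESS_MAX = 65; // from the v4.0 notes
const AI_INFRA_MAX = 35;   // from the v4.0 notes
const PARTIAL_CREDIT = 0.75; // from the v3.5 notes; assumed to still apply

function credit(status: BotStatus): number {
  if (status === "allowed") return 1;
  if (status === "partial") return PARTIAL_CREDIT;
  return 0;
}

function computeScore(input: ScoreInput): number {
  const n = input.botStatuses.length;
  // Average credit across bots (assumed weighting: every bot counts equally).
  const accessFraction =
    n === 0 ? 0 : input.botStatuses.reduce((sum, s) => sum + credit(s), 0) / n;
  // Clamp the infrastructure share to the 0..1 range.
  const infra = Math.min(1, Math.max(0, input.infraFraction));
  return Math.round(accessFraction * BOT_ACCESS_MAX + infra * AI_INFRA_MAX);
}
```

With every bot allowed and all infrastructure checks passed, the sketch returns the full 100 points.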
v3.9
Browser Rendering Fallback — Bypass WAF with Headless Chrome
March 18, 2026
- BREAKTHROUGH: When WAF blocks homepage scan, tool now falls back to Cloudflare Browser Rendering API — headless Chromium renders the page like a real browser, bypassing CAPTCHAs and WAF challenges
- RESULT: WAF-protected sites (LiteSpeed, Cloudflare, Wordfence, Sucuri) now get full homepage analysis — Schema, H1, Meta Description, Canonical, OG tags all detected correctly
- 3-TIER FETCH: Homepage fetch now uses three strategies: (1) Chrome UA fetch → (2) Safari UA retry → (3) Headless Browser Rendering fallback
- UX: Green “Browser Rendering Used” badge shows when headless browser was needed to bypass WAF
- TYPED: Added Env type bindings for the CF_API_TOKEN and CF_ACCOUNT_ID environment variables
v3.8.2
Honest WAF Reporting & UX Overhaul
March 18, 2026
- HONESTY: WAF-blocked signals now show ⚠ “Unable to verify” with yellow triangles instead of pretending they pass or fail — no more misleading results
- UX: New 2-column layout in WAF warning: “✓ Accurate Results” vs “⚠ Unable to Verify” — users instantly see what’s reliable
- NEW: Links to Google Rich Results Test & Schema.org Validator in WAF warning — guides users to verify schema manually
- SMART: Recommendations now skip “Add FAQ Schema” and “Add Author signal” when WAF blocks scan — avoids recommending things the site may already have
- UX: WAF-blocked signal rows show impact as “N/A” in yellow instead of misleading Critical/High labels
- CLEANUP: Removed temporary debug endpoint /api/debug-signals
v3.8.1
Enhanced WAF Detection & Smarter Scoring
March 18, 2026
- ENHANCED: WAF/Bot detection now covers 15+ providers — Cloudflare, LiteSpeed, Wordfence, Sucuri, Imperva, Akamai, StackPath, AWS WAF, hCaptcha, reCAPTCHA
- NEW: Auto-retry with Safari UA when initial Chrome UA triggers bot protection — bypasses simpler WAF rules
- IMPROVED: Bot-protected sites now score 19/25 content (was 13/25) — assumes H1, meta desc, schema exist (standard for CMS sites behind WAF)
- UX: Homepage signals show yellow ⚠ warning icons for WAF-blocked items instead of misleading red ✗ marks
- IMPROVED: Homepage fetch now sends full browser headers (Accept, Sec-Fetch-*, Accept-Language) — reduces WAF trigger rate
v3.8
Schema Detection Fix & Bot Protection Awareness
March 18, 2026
- FIXED: Schema detection now traverses @graph arrays (WordPress/RankMath/Yoast); previously it only checked the root-level @type
- FIXED: extractSchemas() rewritten as recursive collectSchemaTypes(), which handles nested @graph, mainEntity, and arrays at any depth
- NEW: Bot Protection Detection identifies WAF/CAPTCHA pages (LiteSpeed, Cloudflare, Wordfence) and flags botProtection: true in results
- IMPROVED: Bot-protected sites now receive partial content credit (13/25) instead of 0 — no longer penalized for CAPTCHA pages
- PERF: Homepage HTML cached from STEP 0 probe — eliminated duplicate fetch, saving ~200ms per check
- IMPROVED: extractMeta() now handles multi-line attributes (WordPress meta tags with newlines between properties)
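A recursive collector of the kind named in the FIXED entry might look like the sketch below. The function name comes from these notes, but the signature, the Json helper type, and the traversal details are assumptions. It is a minimal illustration of walking @graph, mainEntity, and nested arrays, not the tool's implementation.

```typescript
// Minimal JSON value type for parsed JSON-LD (helper introduced for this sketch).
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function collectSchemaTypes(node: Json, found: Set<string> = new Set<string>()): Set<string> {
  if (Array.isArray(node)) {
    // Arrays at any depth: visit every element.
    for (const item of node) collectSchemaTypes(item, found);
  } else if (node !== null && typeof node === "object") {
    const t = node["@type"];
    if (typeof t === "string") {
      found.add(t);
    } else if (Array.isArray(t)) {
      // "@type" may itself be a list of type names.
      for (const x of t) if (typeof x === "string") found.add(x);
    }
    // Recurse into every other property, which covers @graph and mainEntity.
    for (const key of Object.keys(node)) {
      if (key !== "@type") collectSchemaTypes(node[key], found);
    }
  }
  return found;
}
```

Called on a parsed Yoast-style document whose types sit inside @graph, the sketch collects the nested types that a root-level @type check would miss.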
v3.7
AI Files Detection Overhaul & Score Rebalance
March 17, 2026
- Fixed False Positives: llms.txt and llms-full.txt detection now uses strict validation. Fetching with redirect: manual prevents 301/302 homepage redirects from being counted as “Found”
- Content Validation: AI files must return 200 OK with non-HTML content. Responses containing HTML (custom 404 pages, redirected homepages) are correctly rejected
- Smart Redirect Follow: Follows one redirect only if the target URL ends with the same filename (e.g., shopify.com/llms.txt → www.shopify.com/llms.txt). Homepage redirects are rejected
- Removed ai.txt: /ai.txt has been removed from all checks — no actual standard exists for this file. Only llms.txt and llms-full.txt (official llmstxt.org standard) are checked
- Score Rebalanced: Bot Access (40), AI Infrastructure (25), Content (25), Technical (10). Removed 5pts from ai.txt, redistributed to Content quality signals
- Social Preview: All pages now have OG image + Twitter Card meta tags for rich social previews when sharing on Facebook, LinkedIn, X/Twitter, Slack
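The llms.txt validation rules in the first three entries above can be sketched as a pure decision function. The names ProbeResponse, judgeAiFile, and looksLikeHtml are hypothetical, and the HTML sniffing heuristic is an assumption. The sketch only illustrates the stated rules: 200 OK, non-HTML content, and a redirect followed only when the target keeps the same filename.

```typescript
interface ProbeResponse {
  status: number;
  contentType: string | null; // Content-Type header, if any
  location: string | null;    // Location header, if any
  body: string;
}

type AiFileVerdict = "found" | "follow-redirect" | "not-found";

// Assumed heuristic: treat a body that opens like an HTML document as HTML.
function looksLikeHtml(body: string): boolean {
  const head = body.slice(0, 512).replace(/^\s+/, "").toLowerCase();
  return head.startsWith("<!doctype html") || head.startsWith("<html") ||
    head.includes("<head") || head.includes("<body");
}

function judgeAiFile(requestUrl: string, res: ProbeResponse): AiFileVerdict {
  if (res.status >= 300 && res.status < 400 && res.location) {
    const fileName = new URL(requestUrl).pathname.split("/").pop() ?? "";
    let target: URL;
    try {
      target = new URL(res.location, requestUrl); // resolves relative Location values
    } catch {
      return "not-found";
    }
    // Follow only when the redirect target ends with the same filename.
    return fileName !== "" && target.pathname.endsWith("/" + fileName)
      ? "follow-redirect"
      : "not-found";
  }
  if (res.status !== 200) return "not-found";
  if ((res.contentType ?? "").toLowerCase().includes("text/html")) return "not-found";
  if (looksLikeHtml(res.body)) return "not-found";
  return "found";
}
```

A caller would fetch with redirect set to "manual", pass the response through this function, and follow at most one "follow-redirect" verdict before giving up.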
v3.6
Bot Database Accuracy & Versioned Bots
March 17, 2026
- 154 Bots: Added 3 new versioned AI bots: ChatGPT-User/2.0, MistralAI-User-1.0, Perplexity-User-1.0 — matching CrawlerCheck.com’s 34-bot AI directory
- Deprecated Bot Labels: Claude-Web and anthropic-ai are now marked as Deprecated per Anthropic’s official docs (replaced by Claude-User, Claude-SearchBot, ClaudeBot)
- ClaudeBot Relabeled: Changed from “Claude AI (Legacy)” to “Claude AI Training” to accurately reflect its official purpose per Anthropic docs
- Versioned Bot Matching: New parser logic handles versioned user-agents (e.g., ChatGPT-User/2.0 inherits ChatGPT-User rules from robots.txt). Same for -1.0 suffixes
- Cross-Verified: AI bot list verified against official sources: OpenAI Bots docs, Anthropic Crawler docs, CrawlerCheck.com directory, Search Engine Journal’s 2025 list
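Versioned matching as described in the Versioned Bot Matching entry could work along these lines. The helper names baseUserAgent and findRuleGroup, the suffix pattern, and the wildcard fallback are assumptions for illustration; the real parser may differ.

```typescript
// Strip a trailing version such as "/2.0" or "-1.0" from a user-agent token.
function baseUserAgent(token: string): string {
  return token.replace(/[\/-]\d+(\.\d+)*$/, "");
}

// Look up the robots.txt rule group for a bot: exact token first,
// then the unversioned base token, then the "*" wildcard group.
function findRuleGroup<T>(botToken: string, groups: Map<string, T>): T | undefined {
  const lowered = new Map<string, T>();
  groups.forEach((value, key) => lowered.set(key.toLowerCase(), value));

  const exact = lowered.get(botToken.toLowerCase());
  if (exact !== undefined) return exact;

  const base = lowered.get(baseUserAgent(botToken).toLowerCase());
  if (base !== undefined) return base;

  return lowered.get("*");
}
```

In this sketch ChatGPT-User/2.0 picks up the rules written for ChatGPT-User, and a versioned bot with no matching group falls back to the wildcard rules.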
v3.5
Partial Access & Report Naming
March 17, 2026
- Partial = Accessible: Bots with partial access (e.g., /admin blocked but / allowed) now count as “accessible” in overview. Only fully BLOCKED bots are counted as denied. This prevents misleading “0% allowed” for sites that only block admin paths
- Category Overview Rework: Shows “X full · Y partial” breakdown instead of just “X/Y allowed”. Percentage now reflects real accessibility
- Bot Card Labels: Status now shows “FULLY ALLOWED”, “PARTIALLY ALLOWED”, or “BLOCKED” for clarity
- Report Naming: PDF/Excel exports renamed from “AI Visibility Report” to “AI Crawler Check Report” — accurate to the tool’s function
- Score Adjustment: PARTIAL bots now receive 75% score credit (was 50%). Reflects that partial access still means content is crawlable
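The three status labels and the “X full · Y partial” overview can be illustrated with a small sketch. The BotRules shape and the functions classifyAccess and summarize are hypothetical, and the classification is deliberately simplified (it does not resolve Allow/Disallow precedence path by path).

```typescript
type AccessLabel = "FULLY ALLOWED" | "PARTIALLY ALLOWED" | "BLOCKED";

interface BotRules {
  disallow: string[]; // Disallow paths that apply to this bot
  allow: string[];    // Allow paths that apply to this bot
}

function classifyAccess(rules: BotRules): AccessLabel {
  // An empty Disallow value places no restriction, so ignore it.
  const disallow = rules.disallow.filter((p) => p.trim() !== "");
  if (disallow.length === 0) return "FULLY ALLOWED";
  const blocksRoot = disallow.includes("/");
  const hasAllow = rules.allow.some((p) => p.trim() !== "");
  // Simplification: "Disallow: /" with no Allow override counts as blocked.
  if (blocksRoot && !hasAllow) return "BLOCKED";
  return "PARTIALLY ALLOWED";
}

function summarize(labels: AccessLabel[]): { text: string; accessiblePct: number } {
  const full = labels.filter((l) => l === "FULLY ALLOWED").length;
  const partial = labels.filter((l) => l === "PARTIALLY ALLOWED").length;
  // Partial counts as accessible; only BLOCKED bots count as denied.
  const accessiblePct =
    labels.length === 0 ? 0 : Math.round(((full + partial) / labels.length) * 100);
  return { text: `${full} full · ${partial} partial`, accessiblePct };
}
```

A site that only blocks /admin is labeled PARTIALLY ALLOWED here and still counts toward the accessible percentage.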
v3.4
Deep Verification & Accuracy Guarantee
March 17, 2026
- Double Verification: robots.txt is now fetched twice and results compared. If content changed between fetches (CDN cache, etc.), the freshest version is used
- Enforced Backend Processing: Minimum 8-second server-side processing time ensures all checks genuinely complete. No shortcuts, no cached guesses
- Frontend Sync: Animation steps now match real backend processing. 10+ second minimum display with step-by-step progress visible to the user
- Error UX Improved: Invalid domains show immediately clear error messages. No more fake results for non-existent websites
- Version Tag: Reports now include processingTime to prove thorough analysis
v3.3
Domain Validation & Integrity
March 17, 2026
- Critical Fix: Added mandatory domain reachability check (STEP 0). Invalid or non-existent domains (e.g., example.com.sgsg) now return a clear error instead of fake results
- DNS Validation: The tool now verifies the website actually responds before analyzing. DNS failures, connection timeouts, and Cloudflare 5xx errors are properly detected
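- robots.txt Validation: Content is now validated to ensure it’s actually a robots.txt file (not a custom error page returning 200). The tool treats 404 as all allowed and 403 as all blocked; note that RFC 9309 itself classes every 4xx response as “unavailable”, which permits crawling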
- Minimum Display Time: Results always show for at least 6 seconds with progressive step animation to confirm thorough analysis
- Error UX: Clear, actionable error messages for unreachable domains, DNS failures, and timeout scenarios
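The robots.txt validation step can be sketched as a mapping from the HTTP response to a verdict. The status handling follows the behavior stated in this changelog (404 as all allowed, 403 as all blocked). The function name, the "invalid" verdict for other statuses, and the content heuristics are assumptions.

```typescript
type RobotsVerdict = "all-allowed" | "all-blocked" | "parse" | "invalid";

function interpretRobotsResponse(status: number, body: string): RobotsVerdict {
  if (status === 404) return "all-allowed"; // no robots.txt: nothing is restricted
  if (status === 403) return "all-blocked"; // as stated in the v3.3 notes
  if (status !== 200) return "invalid";     // assumed handling for other statuses

  // A 200 response may still be a custom error page rather than robots.txt.
  const isHtml = /^\s*<(!doctype|html)/i.test(body);
  const hasDirective = /^\s*(user-agent|disallow|allow|sitemap)\s*:/im.test(body);
  if (isHtml) return "invalid";
  // Non-empty text without a single recognizable directive is rejected too.
  if (!hasDirective && body.trim() !== "") return "invalid";
  return "parse";
}
```

An empty 200 body is passed on for parsing in this sketch, since an empty robots.txt restricts nothing.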
v3.2
Accuracy Fix — robots.txt Only
March 17, 2026
- Critical Fix: Completely removed HTTP simulation. robots.txt is now the SOLE source of truth for bot access status. This eliminates ALL false positives caused by WAF/CDN/IP-based blocking (Cloudflare, Akamai, etc.)
- Why: HTTP 403 from datacenter IPs does NOT mean the site owner blocks a bot — it's often generic bot-protection. Only robots.txt reflects the site owner's explicit intent
- Partial Explained: Added tooltip explaining what "Partial" means on every bot card
- Bot Purpose Labels: Every bot card shows its purpose — Training, Indexing, User Request, Both, Scraping, Monitoring, Social
- Check Timing: Increased analysis animation to 14+ seconds for user trust
- Directory UI Fix: Fixed broken layout in Web Crawler & Bot Directory pages
v3.1
Accuracy & UX Overhaul
March 17, 2026
- Critical Fix: Resolved false-positive blocking reports caused by WAF/CDN detection (e.g., Cloudflare-protected sites returning HTTP 403 for all datacenter IPs)
- Accuracy: robots.txt is now the canonical source of truth. HTTP simulation only upgrades severity, never overrides a robots.txt ALLOWED status
- Bot Purpose Labels: Every bot now shows its purpose — Training, Indexing, User Request, Both (Training + User Request), Scraping, Monitoring, Social
- Partial Explained: PARTIAL status now includes an inline explanation showing which paths are blocked vs allowed
- New Logo: Custom SVG logo for AI Crawler Check replaces old BH branding in header
- Full Navigation: Directory, Tools, Generator, Validator, Batch Checker links on every page
- Check Animation: Progressive step-by-step loading with realistic timing for better user trust
- Changelog: This page — track all updates and version history
v3.0
SEO Tools Suite
March 17, 2026
- Robots.txt Generator: Create custom robots.txt with 160+ bot presets, 6 one-click presets, per-bot toggles
- Robots.txt Validator: Paste or fetch robots.txt, analyze against 160+ bots, SEO Safety Score
- Batch URL Checker: Check up to 20 URLs at once with CSV export and formatted reports
- Bot Directory: 248 SEO-optimized pages — 161 bot detail pages, 8 category pages, 72 operator pages
- SEO: JSON-LD structured data, FAQ rich snippets, XML sitemap with 248+ URLs
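As a rough illustration of the per-bot toggles in the Robots.txt Generator, the sketch below turns a list of toggles into robots.txt text. The BotToggle shape and generateRobotsTxt are hypothetical names, and the output layout (one group per bot, optional Sitemap line) is an assumption rather than the generator's real format.

```typescript
interface BotToggle {
  userAgent: string; // robots.txt user-agent token, e.g. "GPTBot"
  allowed: boolean;  // true allows the whole site, false blocks it
}

function generateRobotsTxt(toggles: BotToggle[], sitemapUrl?: string): string {
  const lines: string[] = [];
  for (const t of toggles) {
    lines.push(`User-agent: ${t.userAgent}`);
    lines.push(t.allowed ? "Allow: /" : "Disallow: /");
    lines.push(""); // blank line separates groups
  }
  if (sitemapUrl) lines.push(`Sitemap: ${sitemapUrl}`);
  // Normalize to exactly one trailing newline.
  return lines.join("\n").trim() + "\n";
}
```

Each toggle becomes its own group, so blocking one bot never changes the rules emitted for another.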
v2.0
151 Bots & 8 Categories (Initial)
March 2026
- Expanded the bot database to 151 bots across 8 categories (AI bots, search engines, Google bots, SEO tools, social bots, scrapers, cloud services, other agents)
- PDF export with branded report
- Meta-robots and X-Robots-Tag analysis
- AI Visibility Score with 4-component breakdown
v1.0
Initial Launch
March 2026
- First release with basic robots.txt checking
- Core AI bot database and analysis engine