Operated by OpenAI
OAI-SearchBot is OpenAI's crawler for search features (originally SearchGPT, now ChatGPT search). Unlike GPTBot (which collects training data), this bot fetches real-time information to answer user queries and to provide citations that link back to source sites.
OAI-SearchBot is the search crawler operated by OpenAI. It fetches web content so that sites can be surfaced and linked in ChatGPT's search results; according to OpenAI, it is not used to build training datasets for large language models (that is GPTBot's role). Because OAI-SearchBot does not influence your ranking in traditional search engines, the user-agent string OAI-SearchBot can be blocked via robots.txt, meta tags (noai), or the emerging llms.txt standard without any Google SEO penalty; the trade-off is that your pages will no longer appear in ChatGPT search results. Robots.txt is voluntary; for hard enforcement, combine it with server-level IP blocking.
To opt out of having your content appear in ChatGPT search results, block the user-agent OAI-SearchBot in robots.txt (the full directive is shown below); this carries no SEO penalty in traditional search engines. Matching of the User-agent token is case-insensitive, and robots.txt is fetched from the root of each subdomain separately, so each subdomain needs its own file.
OAI-SearchBot is verifiable via reverse-DNS lookup on the crawling IP addresses. You can safely allow it unless you have a specific reason to block it (e.g., keeping content out of ChatGPT search results, or reducing crawl load). Understanding OAI-SearchBot's purpose helps you decide whether to allow or block it.
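Reverse-DNS verification is usually done as forward-confirmed reverse DNS: look up the PTR record for the IP, check the hostname's domain, then resolve that hostname forward and confirm it returns the original IP. A minimal Python sketch, assuming `.openai.com` as the legitimate hostname suffix (check OpenAI's published crawler documentation for the actual domains and IP ranges):

```python
import socket

# Assumed suffix for legitimate OpenAI crawler hostnames; verify against
# OpenAI's official crawler documentation before relying on it.
ALLOWED_SUFFIXES = (".openai.com",)

def hostname_is_allowed(hostname: str, suffixes=ALLOWED_SUFFIXES) -> bool:
    """Pure check: does the PTR hostname end in an allowed suffix?"""
    return hostname.endswith(suffixes)

def verify_crawler_ip(ip: str, suffixes=ALLOWED_SUFFIXES) -> bool:
    """Forward-confirmed reverse DNS: PTR lookup, suffix check, then a
    forward lookup that must resolve back to the original IP."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)           # reverse lookup
        if not hostname_is_allowed(hostname, suffixes):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]  # forward confirm
        return ip in forward_ips
    except (socket.herror, socket.gaierror):                # no PTR / lookup failed
        return False
```

The suffix check alone is not enough (an attacker can set a PTR record like `openai.com.evil.net`); the forward-confirmation step is what makes spoofing hard.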
The exact user-agent string is OAI-SearchBot. This is the string you must use in robots.txt, Nginx, Apache, or Cloudflare firewall rules to target this bot. User-agent matching in robots.txt is case-insensitive, but the string must be spelled correctly. You can verify that a request genuinely comes from OAI-SearchBot by performing a reverse-DNS lookup on the source IP; legitimate bots resolve back to their operator's domain. To block the bot entirely, add the following to your /robots.txt file:
```
User-agent: OAI-SearchBot
Disallow: /
```

This instructs OAI-SearchBot not to crawl any path on your site. The Disallow: / directive covers the entire domain, including subfolders. To block only specific sections, replace / with the path (e.g., Disallow: /blog/). Note: robots.txt is publicly readable; any bot or human can inspect it at yourdomain.com/robots.txt.

To confirm whether the bot visits your site, search your server logs for OAI-SearchBot (case-insensitive grep: grep -i "OAI-SearchBot" /var/log/nginx/access.log), or filter by user-agent in your log-analysis tool (GoAccess, AWStats, etc.). Google Search Console's crawl stats only cover Googlebot variants, not OAI-SearchBot. To slow the bot down instead of blocking it, request a crawl delay:

```
User-agent: OAI-SearchBot
Crawl-delay: 10
```

(a 10-second delay between requests; note that not all crawlers honor Crawl-delay).
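You can sanity-check rules like these offline with Python's standard-library robots.txt parser. A small sketch using the Disallow: /blog/ example above (example.com is a placeholder domain):

```python
from urllib import robotparser

# Rules mirroring the partial-block example: only /blog/ is off-limits.
rules = """\
User-agent: OAI-SearchBot
Disallow: /blog/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Blocked: /blog/ is a prefix of the requested path.
print(rp.can_fetch("OAI-SearchBot", "https://example.com/blog/post"))  # False
# Allowed: no rule matches other paths.
print(rp.can_fetch("OAI-SearchBot", "https://example.com/about"))      # True
# Other crawlers are unaffected, consistent with "no SEO penalty".
print(rp.can_fetch("Googlebot", "https://example.com/blog/post"))      # True
```

In production you would call `rp.set_url("https://example.com/robots.txt")` and `rp.read()` to fetch the live file instead of parsing a string.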
Instead of a blanket Disallow: /, you can restrict OAI-SearchBot to specific paths:

```
User-agent: OAI-SearchBot
Disallow: /private/
Disallow: /staging/
Allow: /
```

This allows OAI-SearchBot everywhere except the listed paths. Path matching in robots.txt uses prefix matching: Disallow: /private/ blocks /private/page.html but NOT /public/private/.

Beyond robots.txt, you can layer additional measures:

1. Add <meta name="OAI-SearchBot" content="noai, noimageai, noindex"> to your pages.
2. Add a llms.txt file at your domain root (emerging standard).
3. Use Cloudflare WAF or Nginx to return 403 for this user-agent.
4. Consider IP blocklists for OpenAI's known crawler IP ranges.

The main blocking mechanisms at a glance:

• **Meta tag**: <meta name="OAI-SearchBot" content="noindex">
• **X-Robots-Tag HTTP header**: X-Robots-Tag: noai, noimageai
• **llms.txt**: Add a /llms.txt file (similar to robots.txt but for LLMs)
• **Server block**: Return 403 or 429 for this user-agent via WAF or Nginx
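The server-block option can be sketched for Nginx as follows (a hypothetical snippet, not a drop-in config: the map block belongs in the http context, and example.com is a placeholder):

```nginx
# Flag requests whose User-Agent contains "OAI-SearchBot".
# "~*" makes the regex match case-insensitive.
map $http_user_agent $block_ai_bot {
    default           0;
    "~*OAI-SearchBot" 1;
}

server {
    listen 80;
    server_name example.com;  # placeholder

    if ($block_ai_bot) {
        return 403;  # or 429 to signal rate limiting instead
    }
}
```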
Using multiple layers provides the strongest protection.