coccocbot-web is the web crawler operated by Cốc Cốc, a popular browser and search engine in Vietnam.
coccocbot-web is a search engine crawler: it fetches pages to build the index behind Cốc Cốc's search results. Blocking coccocbot-web via robots.txt or at the server level has no effect on Google rankings, but it will remove your pages from Cốc Cốc search, which matters if you serve a Vietnamese audience. If you see excessive crawl volume from this bot in your logs, you can throttle it with Crawl-delay or block it outright; just be aware that a hard block also drops your site from Cốc Cốc's index.
The user-agent token is coccocbot-web. Matching is case-insensitive, and robots.txt is fetched separately from the root of each subdomain.
coccocbot-web is verifiable via reverse-DNS lookup on the crawling IP addresses, so spoofed requests are easy to detect. You can safely allow it unless you have a specific reason to block it, for example if Vietnamese search traffic is irrelevant to you or crawl volume is a concern.
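A minimal verification sketch using standard DNS tools; the IP address below is a documentation placeholder, and the coccoc.com hostname pattern is an assumption you should confirm against Cốc Cốc's own documentation:

# Step 1: reverse lookup on the IP you see in your access logs
# (203.0.113.50 is a placeholder address, not a real coccocbot IP)
dig -x 203.0.113.50 +short
# A legitimate bot should return a hostname under the operator's domain,
# e.g. something ending in .coccoc.com (assumed pattern)

# Step 2: forward-confirm that the returned hostname resolves back to the same IP
# (crawler.coccoc.com is a hypothetical example hostname)
dig +short crawler.coccoc.com

If the forward lookup does not return the original IP, treat the request as spoofed.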
The exact user-agent string is coccocbot-web. This is the string you must use in robots.txt, Nginx, Apache, or Cloudflare firewall rules to target this bot. User-agent matching in robots.txt is case-insensitive, but the string must be spelled correctly. You can verify that a request genuinely comes from coccocbot-web by performing a reverse-DNS lookup on the source IP; legitimate bots resolve back to their operator's domain. To block coccocbot-web from your entire site, add the following to your /robots.txt file:
User-agent: coccocbot-web
Disallow: /

This instructs coccocbot-web not to crawl any path on your site. The Disallow: / directive covers the entire domain, including subfolders. To block only specific sections, replace / with the path (e.g., Disallow: /blog/). Note: robots.txt is publicly readable; any bot or human can inspect it at yourdomain.com/robots.txt.

To confirm whether coccocbot-web is visiting your site, search your access logs for its user-agent (case-insensitive grep: grep -i "coccocbot-web" /var/log/nginx/access.log), or filter by user-agent in your log analysis tool (GoAccess, AWStats, etc.).

To slow the bot down rather than block it, add a Crawl-delay directive:

User-agent: coccocbot-web
Crawl-delay: 10

(a 10-second delay between requests)
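To gauge crawl volume before picking a Crawl-delay value, a quick shell sketch; it assumes Nginx's default combined log format, where the client IP is the first field:

# Count requests per source IP for this user-agent, highest first
grep -i "coccocbot-web" /var/log/nginx/access.log \
  | awk '{print $1}' \
  | sort | uniq -c | sort -rn | head

A handful of requests per day needs no action; thousands per hour justify the rate-limiting options below.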
Instead of blocking everything with Disallow: /, you can restrict coccocbot-web to specific paths:

User-agent: coccocbot-web
Disallow: /private/
Disallow: /staging/
Allow: /

This allows coccocbot-web everywhere except the listed paths. Path matching in robots.txt uses prefix matching: Disallow: /private/ blocks /private/page.html but NOT /public/private/.

If coccocbot-web is crawling too aggressively, you have several options:

1. Add Crawl-delay: 30 below the User-agent directive in robots.txt.
2. Rate-limit the user-agent via Nginx's limit_req_zone or Apache's mod_ratelimit (see the sketch after this list).
3. Block it outright at Cloudflare WAF with rule: http.user_agent contains "coccocbot-web".
4. Use fail2ban to auto-block IPs exceeding request thresholds.
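For option 2, a minimal Nginx sketch of user-agent rate limiting; the zone name, rate, and burst values are illustrative assumptions to tune for your own traffic:

# In the http {} block: map requests from coccocbot-web to a rate-limit key.
# Requests from other user-agents get an empty key and are not limited.
map $http_user_agent $coccoc_limit_key {
    default         "";
    ~*coccocbot-web $binary_remote_addr;
}

# One shared zone: 10 MB of state, at most 6 requests per minute per IP.
limit_req_zone $coccoc_limit_key zone=coccoc:10m rate=6r/m;

server {
    listen 80;
    server_name example.com;

    location / {
        # Allow short bursts of up to 10 queued requests;
        # anything beyond that gets a 503 by default.
        limit_req zone=coccoc burst=10;
        # ... normal static/proxy configuration ...
    }
}

This throttles only the matched user-agent while leaving regular visitors untouched, which is usually preferable to an outright block for a search crawler.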