AI Crawler Check
Free Bot Analysis Tool

Data Scrapers Directory

Data aggregation and research crawlers. Blocking them has no SEO impact.

51 total
41 safe
8 caution
2 aggressive

Amazon Kendra

Safe
User-agent: Amazon Kendra
Operator: Amazon

Arquivo-web-crawler

Safe
User-agent: Arquivo-web-crawler
Operator: Unknown

BLEXBot

Safe
User-agent: BLEXBot
Operator: WebMeUp

Barkrowler

Safe
User-agent: Barkrowler
Operator: Babbar

Bravest

Safe
User-agent: Bravest
Operator: Brave

Bytespider

Aggressive
User-agent: Bytespider
Operator: ByteDance

CCBot

Caution
User-agent: CCBot
Operator: Common Crawl

Cotoyogi

Safe
User-agent: Cotoyogi
Operator: Unknown

Crawl4AI

Safe
User-agent: Crawl4AI
Operator: Unknown

Crawlspace

Safe
User-agent: Crawlspace
Operator: Unknown

Diffbot

Safe
User-agent: Diffbot
Operator: Diffbot

Echobot Bot

Safe
User-agent: Echobot Bot
Operator: Dealfront

EchoboxBot

Safe
User-agent: EchoboxBot
Operator: Echobox

Factset_spyderbot

Caution
User-agent: Factset_spyderbot
Operator: FactSet

Firecrawl

Safe
User-agent: Firecrawl
Operator: Firecrawl

FriendlyCrawler

Safe
User-agent: FriendlyCrawler
Operator: Unknown

ICC-Crawler

Caution
User-agent: ICC-Crawler
Operator: ICC

ISSCyberRiskCrawler

Safe
User-agent: ISSCyberRiskCrawler
Operator: ISS

ImagesiftBot

Safe
User-agent: ImagesiftBot
Operator: Unknown

JenkersBot

Safe
User-agent: JenkersBot
Operator: Jenkers

Kangaroo Bot

Safe
User-agent: Kangaroo Bot
Operator: Unknown

LivelapBot

Safe
User-agent: LivelapBot
Operator: Livelap

MauiBot

Safe
User-agent: MauiBot
Operator: Unknown

MoodleBot

Safe
User-agent: MoodleBot
Operator: Moodle

NewsNow

Safe
User-agent: NewsNow
Operator: NewsNow

NovaAct

Safe
User-agent: NovaAct
Operator: Unknown

Poseidon Research Crawler

Safe
User-agent: Poseidon Research Crawler
Operator: Unknown

QualifiedBot

Safe
User-agent: QualifiedBot
Operator: Qualified

Scrapy

Safe
User-agent: Scrapy
Operator: Unknown

SeekportBot

Safe
User-agent: SeekportBot
Operator: Seekport

Seekr

Safe
User-agent: Seekr
Operator: Seekr

SeekrBot

Safe
User-agent: SeekrBot
Operator: Seekr

TaraGroup Intelligent Bot

Safe
User-agent: TaraGroup Intelligent Bot
Operator: TaraGroup

Timpibot

Safe
User-agent: Timpibot
Operator: Timpi

Turnitin

Safe
User-agent: Turnitin
Operator: Turnitin

VelenPublicWebCrawler

Safe
User-agent: VelenPublicWebCrawler
Operator: Velen

Webzio-Extended

Caution
User-agent: Webzio-Extended
Operator: Webzio

coccocbot-web

Safe
User-agent: coccocbot-web
Operator: Unknown

crawler4j

Safe
User-agent: crawler4j
Operator: Unknown

hada.news

Caution
User-agent: https://hada.news
Operator: Unknown

iaskspider

Safe
User-agent: iaskspider
Operator: Unknown

iaskspider/2.0

Caution
User-agent: iaskspider/2.0
Operator: Unknown

imediaethics.org

Caution
User-agent: https://www.imediaethics.org
Operator: Unknown

imgproxy

Safe
User-agent: imgproxy
Operator: Open Source

magpie-crawler

Caution
User-agent: magpie-crawler
Operator: Brandwatch

netEstate Imprint Crawler

Safe
User-agent: netEstate Imprint Crawler
Operator: Unknown

news-please

Aggressive
User-agent: news-please
Operator: Open Source

omgili

Safe
User-agent: omgili
Operator: Unknown

omgilibot

Safe
User-agent: omgilibot
Operator: Unknown

yacy

Safe
User-agent: yacy
Operator: YaCy Community

yacybot

Safe
User-agent: yacybot
Operator: YaCy Community

FAQ

How many Data Scrapers are tracked?
We currently track 51 Data Scrapers with user-agent strings, safety ratings, and blocking rules.
Which Data Scrapers should I never block?
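None of the Data Scrapers in this category has a 'critical' impact rating, so there are none you must avoid blocking. Bots with a 'critical' rating should never be blocked, because blocking them removes your site from their search results.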
Are all Data Scrapers safe?
Out of 51 bots in this category: 41 safe, 8 caution, 2 aggressive.
How do I block all Data Scrapers at once?
Generate a combined robots.txt rule for all Data Scrapers with our bulk robots.txt generator, or manually add a User-agent line followed by Disallow: / for each bot.
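
As a rough illustration of the manual approach, the sketch below builds one User-agent plus Disallow: / group per bot. The function name build_robots_block and the AGGRESSIVE list are hypothetical names chosen for this example, not part of any generator on this site; the two tokens come from the directory above. Keep in mind that robots.txt is advisory, so a crawler that ignores it would need to be blocked at the server or firewall level instead.

```python
from typing import Iterable


def build_robots_block(user_agents: Iterable[str]) -> str:
    """Return robots.txt text with one 'Disallow: /' group per user agent.

    Hypothetical helper for illustration: each agent gets its own
    'User-agent' line followed by 'Disallow: /', with a blank line
    between groups.
    """
    groups = [f"User-agent: {ua}\nDisallow: /" for ua in user_agents]
    if not groups:
        return ""
    return "\n\n".join(groups) + "\n"


# The two bots rated "Aggressive" in the directory above (assumed starting
# point; extend the list with any other user-agent tokens you want to block).
AGGRESSIVE = ["Bytespider", "news-please"]

if __name__ == "__main__":
    # Print the rules so they can be pasted into robots.txt.
    print(build_robots_block(AGGRESSIVE), end="")
```

Writing each bot as its own group keeps the output easy to audit and to trim later, at the cost of a longer file than a single group listing several User-agent lines.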