Question 1

What is the official user-agent string for news-please?

Accepted Answer

The official user-agent string for news-please is: news-please. Use this exact token in robots.txt, Nginx, Apache, or Cloudflare firewall rules to target this bot. User-agent matching in robots.txt is case-insensitive, but the token must be spelled correctly. Because any client can send this header, a user-agent match alone does not prove that a request comes from news-please. A reverse-DNS lookup on the source IP can confirm a bot's identity only when its operator publishes the domain its crawlers resolve to.

Question 2

Is news-please safe?

Accepted Answer

🔴 **news-please is classified as Aggressive.** This bot has been observed ignoring robots.txt directives, crawling at excessive rates that impact server performance, or collecting data in ways that violate standard web etiquette. **We strongly recommend blocking this bot** at both the robots.txt level AND server level (Nginx/Apache/Cloudflare WAF). A robots.txt block alone may be insufficient if the bot does not respect it.

Question 3

Will blocking news-please hurt my SEO?

Accepted Answer

✅ **Minimal Impact** — Blocking news-please has no meaningful effect on your search engine rankings or organic traffic.

Question 4

How do I block news-please in robots.txt?

Accepted Answer

Add the following lines to your /robots.txt file:

    User-agent: news-please
    Disallow: /

This instructs news-please not to crawl any path on your site. The Disallow: / directive covers every path on the host that serves the file, including subfolders; each subdomain needs its own robots.txt. To block only specific sections, replace / with the path (e.g., Disallow: /blog/). Note: robots.txt is publicly readable, so any bot or human can inspect it at yourdomain.com/robots.txt.
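
One way to sanity-check the rule before publishing it is to parse it with Python's standard-library urllib.robotparser. This is a minimal sketch: example.com is a placeholder domain and is_allowed is an illustrative helper name. It shows how a compliant client would read the rule, not how news-please itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The rule from the answer above, as it would appear in /robots.txt.
ROBOTS_TXT = """\
User-agent: news-please
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant client with this user-agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder, not a real target.
    print(is_allowed("news-please", "https://example.com/blog/post"))
    print(is_allowed("SomeOtherBot", "https://example.com/blog/post"))
```

With only this rule in the file, the check fails for news-please and passes for any other user-agent, because no wildcard group is defined.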

Question 5

Does news-please respect robots.txt?

Accepted Answer

⚠️ news-please may not always respect robots.txt. For guaranteed blocking, combine robots.txt with server-level rules (Nginx if/return 403, Apache SetEnvIf, or Cloudflare WAF).
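
The server-level options named above are configuration for Nginx, Apache, or Cloudflare. As an application-level analogue of the same idea (reject on a user-agent match with a 403), here is a hedged Python sketch written as WSGI middleware. The names block_user_agent and demo_app are hypothetical, and this is a substitute technique for sites that run a Python application, not a replacement for the web-server rules.

```python
from typing import Callable, Iterable

BLOCKED_TOKEN = "news-please"  # user-agent token from the answers above


def block_user_agent(app: Callable, token: str = BLOCKED_TOKEN) -> Callable:
    """Wrap a WSGI app so requests whose User-Agent contains token get a 403."""
    needle = token.lower()

    def middleware(environ: dict, start_response: Callable) -> Iterable[bytes]:
        # WSGI exposes the User-Agent header as HTTP_USER_AGENT.
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if needle in user_agent:
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)

    return middleware


def demo_app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """Hypothetical stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


application = block_user_agent(demo_app)
```

The match is a case-insensitive substring test, so it also catches variants such as a version suffix. It only stops clients that send the token honestly; a crawler that changes its user-agent passes through.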

Question 6

How do I verify if news-please is crawling my site?

Accepted Answer

Search your web server access logs for the string news-please, for example with a case-insensitive grep: grep -i "news-please" /var/log/nginx/access.log. You can also filter by user-agent in a log analysis tool such as GoAccess or AWStats. Google Search Console's Crawl Stats report covers only Google's own crawlers, so it will not show news-please activity.
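
To go one step beyond grep and see which addresses the matching requests come from, a short Python sketch along these lines could help. It assumes the access log uses the common "combined" format, where the client IP is the first field and the user-agent is the last double-quoted field; count_hits_by_ip is an illustrative name.

```python
import re
from collections import Counter
from typing import Iterable

# Assumes the combined log format: client IP first, user-agent as the
# last double-quoted field on the line.
LINE_RE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"\s*$')


def count_hits_by_ip(lines: Iterable[str], token: str = "news-please") -> Counter:
    """Count requests per client IP whose user-agent contains token."""
    needle = token.lower()
    hits: Counter = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and needle in match.group("agent").lower():
            hits[match.group("ip")] += 1
    return hits
```

The function accepts any iterable of lines, so an open file handle for /var/log/nginx/access.log can be passed directly and the result inspected with most_common(). Lines that do not fit the assumed format are skipped rather than raising an error.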
