feat: block search engines and all crawlers in robots.txt
Build & Deploy / build-and-deploy (push) Successful in 2m1s
Added a `User-agent: *` / `Disallow: /` catch-all plus explicit rules for Googlebot, Bingbot, DuckDuckBot, Yandex, Baidu and others.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
robots.txt +21

@@ -4,6 +4,27 @@
# Reference: https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website/
# See: https://github.com/MattWilcox/native-base/blob/45f6e7a837104f5ad83a5c7e280fb9a4eb126219/robots.txt

# Block all crawlers by default
User-agent: *
Disallow: /

# Search engines (explicit, for clarity)
User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-News
User-agent: Googlebot-Video
User-agent: AdsBot-Google
User-agent: Bingbot
User-agent: Slurp
User-agent: DuckDuckBot
User-agent: Baiduspider
User-agent: YandexBot
User-agent: Sogou
User-agent: Exabot
User-agent: ia_archiver
Disallow: /

# AI scrapers
User-agent: CCBot
User-agent: ChatGPT-User
User-agent: GPTBot
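A quick way to sanity-check rules like these before deploying is Python's stdlib `urllib.robotparser`. The sketch below feeds a condensed version of the committed rules (the catch-all group plus a grouped per-bot block, which `robotparser` supports via multiple `User-agent` lines per group) directly to `parse()`, so no network access is needed; the agent names and path are illustrative.

```python
# Sanity-check robots.txt rules locally with the stdlib parser.
# The text below condenses the committed rules: a catch-all
# Disallow plus a grouped explicit block; agents/paths are examples.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Every agent, named or not, should be denied every path.
for agent in ("Googlebot", "GPTBot", "SomeRandomCrawler"):
    print(agent, parser.can_fetch(agent, "/any/page"))
```

With the `User-agent: *` catch-all in place, `can_fetch` returns `False` for every agent, including ones not listed explicitly, which is why the named groups are redundant but useful for clarity.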