Everything you'd want to know before you install.
A detailed look at how AI Crawler Manager — PrestaShop 8 & 9 works, why we built it the way we did, and the thinking behind the features above.
Why manage AI bots in 2026
In two years, AI crawlers have gone from a curiosity to the top bandwidth consumer on many e-commerce sites. OpenAI GPTBot, Anthropic ClaudeBot, Google-Extended, Applebot-Extended, PerplexityBot, ByteDance Bytespider and twenty more harvest your product pages, descriptions, prices, customer reviews and blog posts every day. They do so for three purposes: training the next generation of large language models, powering real-time answers in conversational assistants (ChatGPT, Claude, Perplexity), and feeding the new AI search engines.
The manual robots.txt problem
Blocking an AI bot via robots.txt requires knowing its exact user-agent (sometimes several per vendor, some changing without notice), keeping that list up to date, and understanding that not all bots respect robots.txt. Bytespider is notorious for ignoring it, and the legacy anthropic-ai agent only partially respects it. Without a dedicated tool, the administrator juggles text files, scattered documentation and server logs.
What AI Crawler Manager does
The module ships with 30+ AI bots pre-configured, each with its correct user-agents as of May 2026, its official documentation and its usage category (training, assistant, search, crawl). The administrator allows or blocks each bot via a visual switch, applies a one-click preset, previews the resulting robots.txt and writes it safely thanks to sentinel markers that preserve existing manual directives.
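To illustrate the sentinel-marker idea, here is a minimal Python sketch, not the module's actual PHP code. The marker text and the function names `render_rules` and `merge_robots` are hypothetical. It assumes the managed rules live between one BEGIN line and one END line, and that everything outside those lines is left untouched.

```python
# Hypothetical marker text; the module's real markers may differ.
BEGIN = "# BEGIN AI Crawler Manager"
END = "# END AI Crawler Manager"


def render_rules(decisions):
    """Build robots.txt groups from a {user_agent: allowed} mapping."""
    lines = []
    for agent, allowed in sorted(decisions.items()):
        lines.append(f"User-agent: {agent}")
        lines.append("Allow: /" if allowed else "Disallow: /")
        lines.append("")  # blank line between groups
    return "\n".join(lines).rstrip("\n")


def merge_robots(existing, decisions):
    """Replace the managed block if present, otherwise append it.

    Text outside the two markers is copied through unchanged, so manual
    directives survive every rewrite.
    """
    block = f"{BEGIN}\n{render_rules(decisions)}\n{END}"
    start = existing.find(BEGIN)
    end = existing.find(END, start + len(BEGIN)) if start != -1 else -1
    if start != -1 and end != -1:
        return existing[:start] + block + existing[end + len(END):]
    if existing and not existing.endswith("\n"):
        existing += "\n"
    separator = "\n" if existing else ""
    return existing + separator + block + "\n"


manual = "User-agent: *\nDisallow: /admin/\n"
print(merge_robots(manual, {"GPTBot": False, "ClaudeBot": True}))
```

Running the merge a second time with different decisions rewrites only the text between the markers, which is what makes the write safe to repeat.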
HTTP blocking for stubborn bots
For bots that ignore robots.txt, the actionDispatcherBefore hook detects the user-agent on the very first request and returns an HTTP 403 status before any PrestaShop processing. The server saves CPU cycles, the database is never queried, and the bot is actually blocked.
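The decision taken inside that hook boils down to a substring match on the User-Agent header. The sketch below shows that logic in Python for readability. The module itself runs in PHP, and the token list, the function name `status_for` and the sample user-agent strings are illustrative assumptions.

```python
# Illustrative token list; in the module this would come from the bots
# the administrator has switched to "blocked".
BLOCKED_TOKENS = ("bytespider", "anthropic-ai")


def status_for(user_agent, blocked=BLOCKED_TOKENS):
    """Return 403 when the User-Agent contains a blocked token, else None.

    None means "let the request continue to normal processing".
    Matching is case-insensitive because bots vary their casing.
    """
    ua = (user_agent or "").lower()
    for token in blocked:
        if token in ua:
            return 403
    return None


# Simplified sample headers, not the bots' full real user-agent strings.
print(status_for("Mozilla/5.0 (compatible; Bytespider)"))
print(status_for("Mozilla/5.0 (X11; Linux x86_64) Firefox/125.0"))
```

Because the check needs only the request header, it can run before any database connection or template rendering is involved.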
Statistics from two sources
First source: real-time tracking via the PrestaShop hook, which records every detected AI visit with URL, IP, user-agent, HTTP status and timestamp. Second source: importing the Apache or Nginx access log file in combined format, with safe incremental parsing (byte offset stored, never any re-reading). The module auto-detects common paths (/var/log, /home/logs on o2switch, /home/user/access-logs on cPanel).
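Incremental parsing with a stored byte offset can be sketched as follows. This is an assumption-laden Python illustration rather than the module's code. The function name `read_new_hits`, the token list and the returned dictionary keys are hypothetical, and the handling of a shrunken file (treated as a log rotation) is a guess at reasonable behavior.

```python
import re

# Combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def read_new_hits(path, offset, tokens=("gptbot", "claudebot")):
    """Parse only the lines appended since `offset`.

    Returns (hits, new_offset); the caller stores new_offset for next time.
    """
    hits = []
    with open(path, "rb") as fh:
        fh.seek(0, 2)  # jump to the end to learn the current size
        if offset > fh.tell():
            offset = 0  # assumption: a smaller file means it was rotated
        fh.seek(offset)
        while True:
            line = fh.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a line still being written
            offset = fh.tell()
            match = COMBINED.match(line.decode("utf-8", "replace"))
            if match and any(t in match["ua"].lower() for t in tokens):
                parts = match["request"].split()
                hits.append({
                    "ip": match["ip"],
                    "time": match["time"],
                    "url": parts[1] if len(parts) > 1 else "",
                    "status": int(match["status"]),
                    "ua": match["ua"],
                })
    return hits, offset
```

Opening the file in binary mode keeps the offset an exact byte count, and stopping at a line without a trailing newline avoids recording a half-written entry.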
Per-path granularity
For cases where you want to allow a bot on certain areas only (for example Anthropic on product pages so it recommends them in Claude, but not on the blog so you don't give up your editorial content), the Rules tab lets you define per-URL-path allow or block rules with wildcard and end-of-string patterns, exactly like a classic robots.txt.
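As an illustration of how wildcard and end-of-string patterns behave, the Python sketch below translates a robots.txt-style pattern into a regular expression and evaluates a list of rules. The function names and the sample rules are hypothetical. The precedence used here (longest matching pattern wins, ties favor allow) follows common robots.txt practice and is an assumption about the module, not something the description states.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt-style path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern only has to match the start of the path.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_allowed(path, rules, default=True):
    """Evaluate (pattern, allow) rules against a URL path.

    Assumed precedence: the longest matching pattern wins and a tie
    favors allow. With no matching rule, `default` applies.
    """
    best = None
    for pattern, allow in rules:
        if pattern_to_regex(pattern).match(path):
            key = (len(pattern), allow)
            if best is None or key > best:
                best = key
    return default if best is None else best[1]


# Hypothetical rules for one bot: products open, blog closed, PDFs closed.
rules = [("/blog/", False), ("/*.pdf$", False), ("/blog/press/", True)]
for path in ("/product/42", "/blog/post-1", "/blog/press/2026", "/files/catalog.pdf"):
    print(path, is_allowed(path, rules))
```

The trailing `$` matters for the PDF rule: `/files/catalog.pdf` is blocked, while `/files/catalog.pdf?x=1` no longer ends in `.pdf` and falls back to the default.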
Solid architecture
PSR-4 under namespace DataFirefly\AiCrawlerManager, embedded custom autoloader (no composer install required at deployment), 5 tables with utf8mb4 and proper indexes, 6 admin controllers under AdminParentConfigure, separate Smarty templates, minimal CSS and JS (native canvas chart, no external dependencies), FR and EN translations included. PrestaShop 8.0 to 9.x compatible via legacy ModuleAdminController.