
AI Crawler Manager — PrestaShop 8 & 9

Take back control of the AI bots scraping your store

AI crawlers harvest your product pages to train LLMs, power conversational assistants and feed AI search engines. With AI Crawler Manager you take back control: 30+ AI bots tracked and identifiable in one click, robots.txt visual builder, HTTP 403 blocking for bots that ignore robots.txt, real-time AI crawl statistics.

PrestaShop 8 · PrestaShop 9 · Multilingual · Multistore · GDPR
  • 30-day refund
  • 12 months updates
  • 24h support
www.datafirefly.com/en/
v1.0.0 · updated 2026-05-26
What it does

The short version.

01

30+ AI bots tracked in 2026

OpenAI (GPTBot, ChatGPT-User, OAI-SearchBot), Anthropic (ClaudeBot, Claude-Web, anthropic-ai), Google-Extended, Apple (Applebot-Extended), Perplexity, ByteDance (Bytespider), Meta (Meta-ExternalAgent), Mistral, xAI, Cohere, Amazon, Common Crawl, You.com, Diffbot, DuckAssistBot, Kagi, and more.

02

Robots.txt visual builder

Switch each bot between allowed and blocked, apply a one-click preset (block training, strict, block all, allow all, block Bytespider), preview the robots.txt live, then write the file without touching your existing directives.

03

HTTP 403 blocking

Some bots ignore robots.txt (Bytespider, legacy anthropic-ai). HTTP blocking returns a 403 status before the page even renders, which saves server resources and actually stops the scraping.

04

Per-path selective blocking

For example, allow ClaudeBot on product pages but block it on the blog. Rules support the wildcard (*) and end-of-string ($) patterns of a classic robots.txt.

05

Crawl statistics

Dashboard with KPIs (30d visits, distinct bots, blocked hits), daily traffic chart, top bots, top crawled URLs, recent visits log with IP and status.

06

Apache and Nginx log import

Reads your access log file in combined format to retroactively count AI visits, even ones from before installation. Incremental parsing with stored offset: no duplicates, safe re-reads. Auto-detects common paths (o2switch, cPanel, Apache, Nginx).

The long version

Everything you'd want to know before you install.

A detailed look at how AI Crawler Manager — PrestaShop 8 & 9 works, why we built it the way we did, and the thinking behind the features above.

§ 01

Why manage AI bots in 2026

In two years, AI crawlers have gone from a curiosity to a top bandwidth consumer on many e-commerce sites. OpenAI GPTBot, Anthropic ClaudeBot, Google-Extended, Applebot-Extended, PerplexityBot, ByteDance Bytespider and twenty more harvest your product pages, descriptions, prices, customer reviews and blog posts every day. They do it for three reasons: training the next generation of large language models, powering real-time answers in conversational assistants (ChatGPT, Claude, Perplexity), and feeding the new AI search engines.

§ 02

The manual robots.txt problem

Blocking an AI bot via robots.txt requires knowing its exact user-agent (sometimes several per vendor, some changing without notice), keeping that list up to date, and understanding that not all bots respect robots.txt. Bytespider is notorious for ignoring it, and legacy anthropic-ai only partially respects it. Without a dedicated tool, the administrator juggles text files, scattered documentation and server logs.

§ 03

What AI Crawler Manager does

The module ships with 30+ AI bots preconfigured with their correct user-agents as of May 2026, a link to their official documentation and their usage category (training, assistant, search, crawl). The administrator allows or blocks each bot via a visual switch, applies a one-click preset, previews the resulting robots.txt and writes it safely: sentinel markers delimit the module's section of the file, so existing manual directives are preserved.
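To illustrate the sentinel-marker idea, here is a minimal sketch in Python (not the module's actual PHP code). The marker strings and the function names `build_block` and `merge_robots` are assumptions made for this example; the module's real markers are not documented on this page.

```python
import re

# Hypothetical marker strings, chosen for this sketch only.
BEGIN = "# BEGIN AI Crawler Manager"
END = "# END AI Crawler Manager"


def build_block(blocked_agents):
    """Render one 'Disallow: /' group per blocked user-agent, wrapped in the markers."""
    lines = [BEGIN]
    for agent in blocked_agents:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append(END)
    return "\n".join(lines)


def merge_robots(existing, blocked_agents):
    """Replace the managed block if it exists, otherwise append it.

    Everything outside the two markers (the manual directives) is kept.
    """
    block = build_block(blocked_agents)
    pattern = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)
    if pattern.search(existing):
        # A function replacement avoids any backslash interpretation in `block`.
        return pattern.sub(lambda _match: block, existing, count=1)
    prefix = existing.rstrip("\n")
    return (prefix + "\n\n" if prefix else "") + block + "\n"
```

Calling `merge_robots` twice with the same bot list yields the same file, and changing the list only rewrites the lines between the two markers.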

§ 04

HTTP blocking for stubborn bots

For bots that ignore robots.txt, the actionDispatcherBefore hook checks the user-agent on the very first request and returns an HTTP 403 status before PrestaShop dispatches any controller. No page is rendered and the page's database queries never run, so the server saves CPU cycles and the bot is actually blocked.
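The decision itself can be as simple as a substring check on the User-Agent header. The sketch below shows that logic in Python for illustration; `BLOCKED_TOKENS` and `should_block` are hypothetical names, and the real module runs the equivalent check in PHP inside the hook.

```python
# Illustrative subset of user-agent tokens, taken from the bots named in this page.
BLOCKED_TOKENS = ("Bytespider", "anthropic-ai")


def should_block(user_agent, blocked_tokens=BLOCKED_TOKENS):
    """Return True when the User-Agent header contains a blocked token.

    The comparison is case-insensitive; a missing header is never blocked.
    """
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in blocked_tokens)


def early_response(user_agent):
    """Return (status, body) to send immediately, or None to let the request continue."""
    if should_block(user_agent):
        return 403, "Forbidden"
    return None
```

Matching on a substring rather than the full header is a deliberate choice in this sketch: crawlers usually embed their token inside a longer Mozilla-style string.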

§ 05

Statistics from two sources

First source: real-time tracking via the PrestaShop hook, which records every detected AI visit with URL, IP, user-agent, HTTP status and timestamp. Second source: importing the Apache or Nginx access log file in combined format, with safe incremental parsing (the byte offset is stored, so lines already imported are never counted twice). The module auto-detects common paths (/var/log, /home/logs on o2switch, /home/user/access-logs on cPanel).
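Offset-based incremental parsing can be sketched as follows. This is an illustration in Python under stated assumptions, not the module's code: `import_log`, `LINE_RE` and `AI_TOKENS` are names invented for the example, the token list is a small subset, and log rotation is not handled.

```python
import re

# Combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)
AI_TOKENS = ("GPTBot", "ClaudeBot", "Bytespider")  # illustrative subset


def import_log(path, offset=0):
    """Parse complete lines found after `offset`; return (ai_hits, new_offset).

    Each hit is a tuple (ip, url, status, user_agent).
    """
    hits = []
    with open(path, "rb") as handle:
        handle.seek(offset)
        while True:
            line = handle.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partial line still being written
            offset = handle.tell()  # advance only past complete lines
            match = LINE_RE.match(line.decode("utf-8", errors="replace"))
            if match and any(token in match["agent"] for token in AI_TOKENS):
                parts = match["request"].split()
                url = parts[1] if len(parts) > 1 else ""
                hits.append((match["ip"], url, int(match["status"]), match["agent"]))
    return hits, offset
```

Storing the returned offset and passing it to the next call means a second run over an unchanged file returns no hits, and a run after new traffic returns only the new lines.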

§ 06

Per-path granularity

For cases where you want to allow a bot on certain areas only (for example Anthropic on product pages so Claude can recommend them, but not on the blog so you don't give up your editorial content), the Rules tab lets you define per-URL-path allow or block rules with wildcard (*) and end-of-string ($) patterns, exactly like a classic robots.txt.
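As a rough illustration of how such patterns can be evaluated, the Python sketch below translates a robots.txt-style pattern into a regular expression and resolves conflicts by picking the longest matching pattern, with allow winning ties. That precedence follows the usual robots.txt convention and is an assumption here, not a documented behavior of the module; `pattern_to_regex` and `is_allowed` are hypothetical names.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern: '*' matches any run, trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Evaluate `rules`, a list of (pattern, allow) tuples, against a URL path.

    Assumed precedence: the longest matching pattern wins, allow wins ties,
    and a path matched by no rule is allowed.
    """
    best = None
    for pattern, allow in rules:
        if pattern_to_regex(pattern).match(path):
            key = (len(pattern), allow)
            if best is None or key > best:
                best = key
    return True if best is None else best[1]


# Example: a bot allowed everywhere except the blog.
rules = [("/", True), ("/blog/", False)]
```

With these rules, `is_allowed("/blog/my-post", rules)` is false while a product URL such as `/shoes/12-sneaker.html` stays allowed.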

§ 07

Solid architecture

PSR-4 under the namespace DataFirefly\AiCrawlerManager, embedded custom autoloader (no composer install required at deployment), 5 tables with utf8mb4 and proper indexes, 6 admin controllers under AdminParentConfigure, separate Smarty templates, minimal CSS and JS (native canvas chart, no external dependencies), FR and EN translations included. PrestaShop 8.0 to 9.x compatible via legacy ModuleAdminController.