DataFirefly Cleanup — Complete guide
Installation, six cleaners, audit/dry-run/execute modes, cron task and troubleshooting for the PrestaShop database cleanup module.
Overview
DataFirefly Cleanup is an administration module for PrestaShop 8 and 9 that safely cleans your database: stale statistics, abandoned carts, old logs, outdated searches, orphan metadata and orphan images. Each cleaner offers three modes (audit, dry-run and execute), and the module estimates the space gain in MB before any action.
The module changes neither your theme nor PrestaShop core files. It creates a single table (the cleanup history) and one admin tab.
Installation
- Download dfcleanup.zip from your DataFirefly account.
- In your PrestaShop back office, go to Modules > Module Manager > Upload a module.
- Drop the ZIP or select it. Installation creates the history table, the admin tab and the cron token.
- Open Advanced Parameters > DataFirefly Cleanup.
Requirements: PrestaShop 8.0+ or 9.0+, PHP 8.0+, MySQL 5.7+ or MariaDB 10.3+.
The dashboard
The main screen shows three information blocks at the top:
- Database size: the total space used by your tables (data + indexes), computed via information_schema.
- Potential gain: the estimated reclaimable space if all cleaners were executed.
- Reclaimable percentage: the ratio between the two.
Below, the top 10 biggest tables shows you where your disk space actually goes. Statistics tables (ps_connections, ps_page_viewed) almost always top the list on an active store.
The six cleaners
Statistics
Cleans ps_connections (and its child tables connections_page and connections_source), ps_page_viewed, ps_referrer_cache, ps_pagenotfound and orphan guests. Default retention: 90 days. This is usually the highest-gain cleaner — stats tables grow with every visit.
Abandoned carts
Removes carts that are older than the retention period (30 days by default) and have no associated order, plus orphan cart_product and cart_cart_rule rows and expired cart rules.
A cart converted into an order is never deleted: every query verifies the absence of an order through a join on ps_orders. Your order data is untouchable.
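The rule above (delete only carts that have no order and are older than the retention) can be sketched as an anti-join. This is a minimal illustration on an in-memory SQLite database with simplified stand-ins for ps_cart and ps_orders; the column names, the query text and the helper names are assumptions, not the module's actual SQL.

```python
import sqlite3
from datetime import datetime, timedelta

RETENTION_DAYS = 30  # default retention quoted in the text

# Hypothetical query: the LEFT JOIN keeps every cart, and "o.id_cart IS NULL"
# then keeps only the carts that never became an order.
ABANDONED_SQL = """
SELECT c.id_cart
FROM ps_cart c
LEFT JOIN ps_orders o ON o.id_cart = c.id_cart
WHERE o.id_cart IS NULL
  AND c.date_upd < ?
ORDER BY c.id_cart
"""


def find_abandoned_carts(conn, now):
    """Return ids of carts with no order that are older than the retention."""
    cutoff = (now - timedelta(days=RETENTION_DAYS)).strftime("%Y-%m-%d %H:%M:%S")
    return [row[0] for row in conn.execute(ABANDONED_SQL, (cutoff,))]


def build_demo_db():
    """Create simplified stand-in tables (not the real PrestaShop schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ps_cart (id_cart INTEGER PRIMARY KEY, date_upd TEXT)")
    conn.execute("CREATE TABLE ps_orders (id_order INTEGER PRIMARY KEY, id_cart INTEGER)")
    conn.executemany(
        "INSERT INTO ps_cart VALUES (?, ?)",
        [
            (1, "2024-01-01 00:00:00"),  # old, never ordered
            (2, "2024-01-01 00:00:00"),  # old, converted into an order
            (3, "2024-05-28 00:00:00"),  # recent, never ordered
        ],
    )
    conn.execute("INSERT INTO ps_orders VALUES (100, 2)")
    return conn


conn = build_demo_db()
abandoned = find_abandoned_carts(conn, datetime(2024, 6, 1))
```

In this demo only cart 1 qualifies: cart 2 is protected by its order and cart 3 is still inside the retention window.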
Application logs
Prunes ps_log with severity-weighted retention: informational entries and warnings (severity 1-2) are deleted after the configured retention (30 days by default), while errors and critical errors (severity 3-4) are kept twice as long.
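The severity-weighted retention described above amounts to a small lookup. The sketch below is one possible reading of it; the function name and the rejection of severities outside 1-4 are assumptions.

```python
def log_retention_days(severity: int, base_days: int = 30) -> int:
    """Return how many days a log entry of this severity is kept.

    Severity 1-2 (informational, warning) uses the configured retention;
    severity 3-4 (error, critical) is kept twice as long.
    """
    if severity not in (1, 2, 3, 4):
        raise ValueError(f"unexpected severity: {severity}")
    return base_days if severity <= 2 else 2 * base_days
```

With the default of 30 days, warnings expire after 30 days and errors after 60.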
Stale searches
Cleans the ps_statssearch history (60 days by default) and orphan search index rows (search_index, search_word) pointing at deleted products.
Orphan metadata
Targets rows whose parent no longer exists: product_lang, product_shop, product_attribute, category_product, stock_available, specific_price, customization, soft-deleted addresses with no order, image_lang and image_shop. No retention concept here: an orphan is an orphan.
Orphan images
Two parts: ps_image entries whose product no longer exists (always active), and an optional filesystem scan that walks the product image directory looking for JPG files with no database entry. The scan is capped at 200,000 files for safety.
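A capped filesystem scan of this kind could look roughly like the sketch below. It is illustrative only: it assumes the image id is the leading run of digits in the file name (for example 123.jpg or 123-home_default.jpg), that the cap counts every file visited, and that the set of known ids has already been loaded from ps_image. The function name is hypothetical.

```python
import os
import re

MAX_FILES = 200_000  # safety cap quoted in the text

# Assumption: the image id is the leading run of digits in the file name.
_ID_RE = re.compile(r"^(\d+)")


def find_orphan_jpgs(root, known_image_ids, max_files=MAX_FILES):
    """Walk `root` and return (orphan_paths, truncated).

    `truncated` is True when the scan stopped early because of the cap.
    """
    orphans = []
    scanned = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if scanned >= max_files:
                return orphans, True
            scanned += 1
            if not name.lower().endswith(".jpg"):
                continue  # only JPG files are considered
            match = _ID_RE.match(name)
            if match and int(match.group(1)) not in known_image_ids:
                orphans.append(os.path.join(dirpath, name))
    return orphans, False
```

Returning a truncated flag lets the caller report that the result is partial instead of silently presenting an incomplete list.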
The three modes
| Mode | Database writes | Use |
|---|---|---|
| Audit | None | Count affected rows and estimate the gain. Always run this first. |
| Dry-run | History only | Simulate execution and keep a dated trace of the scope. |
| Execute | Actual delete | Delete in 5,000-row batches (configurable), with micro-pauses between batches. |
Before any Execute: back up your database. Cleanup is irreversible. Recommended workflow: Audit → Dry-run → Backup → Execute → OPTIMIZE TABLE.
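The batching used by Execute mode can be pictured as a loop that deletes a bounded number of rows, pauses, and repeats until nothing is left. The sketch below demonstrates the idea on SQLite with a hypothetical demo_log table; the pause length is an assumption. MySQL accepts DELETE ... LIMIT directly on a single table, whereas a default SQLite build does not, hence the rowid subquery here.

```python
import sqlite3
import time


def delete_in_batches(conn, cutoff, batch_size=5000, pause_seconds=0.05):
    """Delete rows older than `cutoff` in bounded batches; return the total."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM demo_log WHERE rowid IN ("
            " SELECT rowid FROM demo_log WHERE date_add < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # short transactions keep locks brief
        deleted = cur.rowcount
        total += deleted
        if deleted < batch_size:
            return total  # last (possibly partial) batch done
        time.sleep(pause_seconds)  # micro-pause between batches


# Demo: 12 old rows and 3 recent ones, deleted 5 at a time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_log (id INTEGER PRIMARY KEY, date_add TEXT)")
conn.executemany(
    "INSERT INTO demo_log (date_add) VALUES (?)",
    [("2024-01-01",)] * 12 + [("2024-06-01",)] * 3,
)
deleted = delete_in_batches(conn, "2024-03-01", batch_size=5, pause_seconds=0)
remaining = conn.execute("SELECT COUNT(*) FROM demo_log").fetchone()[0]
```

Smaller batches mean more round trips but shorter locks, which is why the batch size is worth lowering on constrained hosting.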
OPTIMIZE TABLE
Deleting rows does not immediately return space to the system: InnoDB keeps the space inside the table file. The OPTIMIZE TABLE after execution checkbox rebuilds the cleaned tables to give physical space back to the disk (requires innodb_file_per_table, enabled by default on modern installs). Reserve it for off-peak hours: the operation briefly locks each table.
Cron task
The Scheduled cleanup (cron) panel of the dashboard lets you automate cleanups.
Configuration
- Enable cron — global switch. When disabled, the endpoint answers 503 even with a valid token.
- Mode — audit, dry-run (default, risk-free), execute, or execute + OPTIMIZE.
- Cleaners to run — checkboxes. Default: stats, cart, log, search. Metadata and image are opt-in.
URL and token
The public endpoint is /module/dfcleanup/cron?token=YOUR_TOKEN. The token (32 hex characters) is generated at install and verified in constant time. The Regenerate token button immediately invalidates the old URL.
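The token handling can be illustrated with a short sketch. The module itself is PHP, so this is an analogy rather than its code: secrets.token_hex(16) yields 32 hex characters and hmac.compare_digest performs a constant-time comparison. The function names are hypothetical, and the 200 returned on success is an assumption; the 503 and 403 codes are the ones this guide documents.

```python
import hmac
import secrets


def generate_token() -> str:
    """Return a random token of 32 hexadecimal characters (16 random bytes)."""
    return secrets.token_hex(16)


def cron_status(enabled: bool, supplied: str, expected: str) -> int:
    """Return the HTTP status a cron call would receive in this sketch."""
    if not enabled:
        return 503  # cron disabled: refused even with a valid token
    # Constant-time comparison, so timing does not leak matching prefixes.
    if not hmac.compare_digest(supplied.encode(), expected.encode()):
        return 403  # token mismatch
    return 200  # assumption: the status on success is not stated in the guide
```

Checking the global switch first matches the documented behavior that a disabled endpoint answers 503 regardless of the token.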
Scheduling
Two options:
- PrestaShop cronjobs module: if installed, the task self-registers there (hook actionRetrieveCronJobs), scheduled at 3:00 AM daily. Change the schedule from the cronjobs module configuration.
- System crontab: copy the line shown in the admin:
0 3 * * * /usr/bin/curl -s 'https://your-shop.com/module/dfcleanup/cron?token=XXXX' > /dev/null 2>&1
Per-call overrides
You can override the mode and cleaners for a single call, without touching the configuration:
?token=XXXX&mode=audit
?token=XXXX&mode=execute&cleaners=stats,log
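If you generate these URLs from a script, a small helper keeps the query string consistent. The sketch below uses only the parameters shown above (token, mode, cleaners); the helper name and the your-shop.com base URL are placeholders.

```python
from urllib.parse import urlencode

BASE_URL = "https://your-shop.com/module/dfcleanup/cron"  # placeholder domain


def cron_url(token, mode=None, cleaners=None, base=BASE_URL):
    """Build a cron URL with optional per-call overrides."""
    params = {"token": token}
    if mode is not None:
        params["mode"] = mode
    if cleaners:
        params["cleaners"] = ",".join(cleaners)
    # safe="," keeps the comma-separated cleaner list readable (no %2C).
    return base + "?" + urlencode(params, safe=",")
```

For example, cron_url("XXXX", mode="execute", cleaners=["stats", "log"]) produces the second override shown above, appended to the base URL.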
The Run cron now button immediately executes the current configuration — handy for testing without waiting for the next tick.
Settings
- Batch size — rows deleted per query (default 5,000, min 100, max 100,000). Lower it on constrained shared hosting, raise it on a beefy dedicated server.
- History retention — how long the module keeps its own history entries (180 days by default).
- Per-cleaner retention — in days. 0 = disables the time filter (orphan cleaners ignore this setting).
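The bounds and the special meaning of 0 can be summarized in a few lines. This is a sketch under assumptions: whether the module clamps out-of-range batch sizes or rejects them is not stated, and both function names are hypothetical.

```python
from datetime import datetime, timedelta

BATCH_MIN, BATCH_MAX, BATCH_DEFAULT = 100, 100_000, 5_000


def clamp_batch_size(value):
    """Force a batch size into the documented 100..100,000 range (assumed clamp)."""
    return max(BATCH_MIN, min(BATCH_MAX, int(value)))


def retention_cutoff(now, retention_days):
    """Return the date before which rows are eligible, or None for no time filter."""
    if retention_days == 0:
        return None  # 0 disables the time filter
    return now - timedelta(days=retention_days)
```

A None cutoff signals to the caller that the date condition should simply be left out of the query.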
History
Every action (audit, dry-run, execute — manual or cron) is recorded: cleaner, mode, affected rows, freed bytes, per-table details as JSON, operator (admin email, cron or cron (manual)), date. The history table is purged automatically according to the configured retention.
Troubleshooting
Timeout on big deletes
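The module disables the PHP time limit during execution, but some hosts enforce limits at the web-server level. In that case, lower the batch size, run the cleaners one at a time, or trigger the cron endpoint with curl from the system crontab. A curl call avoids browser-side timeouts, but the request still passes through the web server, so server-level limits can still apply.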
Cron endpoint answers 403
The supplied token doesn’t match. Check that your crontab URL is up to date — a regenerated token invalidates the old URL.
Cron endpoint answers 503
Cron is disabled in the module settings. Enable it from the Scheduled cleanup panel.
Displayed gain differs from actual freed space
The gain is a proportional estimate (deleted_rows / total_rows × table_size). The actual space returned to disk depends on OPTIMIZE TABLE and fragmentation. The estimate is deliberately conservative.
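The proportional estimate above is simple arithmetic. The sketch below applies it to one table; the function names are hypothetical and the MB conversion assumes 1 MB = 1024 × 1024 bytes.

```python
def estimate_gain_bytes(deleted_rows, total_rows, table_size_bytes):
    """Estimate reclaimable bytes as deleted_rows / total_rows * table_size."""
    if total_rows <= 0:
        return 0  # empty table: nothing to reclaim
    ratio = min(deleted_rows, total_rows) / total_rows
    return int(ratio * table_size_bytes)


def to_mb(num_bytes):
    """Convert bytes to MB, assuming 1 MB = 1024 * 1024 bytes."""
    return round(num_bytes / (1024 * 1024), 2)
```

For instance, deleting 250,000 of 1,000,000 rows in a 400 MB table gives an estimate of 100 MB, whatever the actual size of the deleted rows.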
Technical notes
- The module uses defensive schema detection (tableExists / columnExists via information_schema): it adapts to PS 8 / PS 9 differences and skips missing tables.
- Single-table deletes are batched with LIMIT; multi-table deletes (joins) run as a single statement, since MySQL does not allow LIMIT on that syntax.
- The cron token is compared via hash_equals (constant time) to resist timing attacks.
- Multi-store compatible. Interface in FR/EN/ES/DE.