Predictive SEO — Complete documentation
Connect Google Search Console, configure your AI provider, understand the prediction engine and exploit detected seasonal opportunities.
Overview
DataFirefly Predictive SEO connects your PrestaShop store to Google Search Console, applies an embedded ML prediction engine to the search history and automatically identifies upcoming seasonal peaks. For each detected opportunity, you can generate a structured content brief in one click via Mistral, OpenAI or Claude.
Requirements
- PrestaShop 8.0+ or PrestaShop 9.x
- PHP 8.1 minimum
- MySQL 5.7+ or MariaDB 10.3+
- A Google account with access to the Search Console property for the store
- An API key from an AI provider (Mistral, OpenAI or Anthropic — Mistral by default, around €0.002 per brief)
- At least 60 to 90 days of GSC history for reliable predictions
Installation
Install from ZIP
- Download the dfpredictiveseo.zip archive from your DataFirefly customer area.
- In the PrestaShop back-office, go to Modules → Module Manager.
- Click Upload a module and drop the ZIP.
- The module installs automatically, creates the 7 dfpseo_* tables, registers the 6 tabs under IMPROVE and generates a unique cron token.
- Once installed, find the module under Improve → Predictive SEO.
On update, the module runs its upgrade-X.Y.Z.php scripts. Data and configuration are preserved.
Database schema
Installation creates 7 tables with the dfpseo_ prefix:
- dfpseo_keyword: keywords tracked from GSC
- dfpseo_history: daily history (impressions, clicks, CTR, position)
- dfpseo_forecast: daily predictions with confidence intervals
- dfpseo_opportunity: detected seasonal opportunities
- dfpseo_recommendation: AI-generated content briefs
- dfpseo_seasonality: seasonal indices (day-of-week × month) per keyword
- dfpseo_sync_log: GSC sync log
Google Search Console setup
The module uses the standard OAuth2 protocol. You create an OAuth Client in Google Cloud, paste the credentials into the settings, and trigger the authorization flow from the Connect button.
Step 1 — Create a Google Cloud project
- Go to console.cloud.google.com and sign in with the Google account that has access to your Search Console property.
- Click the project selector at the top-left, then New Project.
- Name it for example DataFirefly Predictive SEO and create it.
Step 2 — Enable the Search Console API
- In the left menu, go to APIs & Services → Library.
- Search for Search Console API and click Enable.
Step 3 — Configure the OAuth consent screen
- Go to APIs & Services → OAuth consent screen.
- User type: External.
- Fill in app name, support email, authorized domain (your store).
- Under Scopes, add https://www.googleapis.com/auth/webmasters.readonly (read-only Search Console).
- In test mode, add your email under Test users. You can move to production later without any module change.
Step 4 — Create the OAuth Client
- Go to APIs & Services → Credentials.
- Click Create credentials → OAuth client ID.
- Application type: Web application.
- Name: free choice (e.g. Predictive SEO Production).
- Authorized redirect URI: copy the URI displayed in the module settings (Improve → Predictive SEO → Settings → Google Search Console → Redirect URI). Format: https://your-store.com/module/dfpredictiveseo/settings/oauth_callback
- Click Create. Google displays the client_id and client_secret.
Step 5 — Connect the module
- In Predictive SEO settings, paste the client_id and client_secret.
- Save.
- Click Connect to Google Search Console.
- You’re redirected to the Google consent page. Authorize read-only access.
- Back in the back-office, the module has stored the encrypted refresh_token and is ready to sync.
- Then select the Search Console property to track in the dropdown (the module auto-detects it after connection).
The redirect URI must match exactly, otherwise Google returns a redirect_uri_mismatch error.
AI provider setup
The module supports 3 AI providers for generating content briefs. You only need one, and you bring your own API key for the chosen provider — DataFirefly takes no fee on usage.
Mistral (default, recommended)
- Model: mistral-small-latest
- Indicative cost: around €0.002 per generated brief
- Create the key: console.mistral.ai → API Keys
- Paste the key into Settings → AI Provider → Mistral API Key
OpenAI
- Model: gpt-4o-mini
- Indicative cost: around €0.005 per brief
- Create the key: platform.openai.com → API keys
- Paste the key into Settings → AI Provider → OpenAI API Key
Anthropic (Claude)
- Model: claude-3-5-haiku-latest
- Indicative cost: around €0.004 per brief
- Create the key: console.anthropic.com → API Keys
- Paste the key into Settings → AI Provider → Anthropic API Key
Then select the active provider from the Active AI Provider dropdown. If you change providers, already-generated briefs are not auto-regenerated.
Data synchronization
First sync
Once the GSC connection is established, run a first manual sync: Settings → Run sync. The module pulls the last 90 days of history for the selected property — up to 250,000 rows per sync, with the date × query × page dimensions. The first sync can take 30 seconds to 2 minutes depending on volume.
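As a rough illustration of what a pull on the date × query × page dimensions involves, the Python sketch below builds paginated request bodies for the Search Console API's searchanalytics.query method. The API caps rowLimit at 25,000 rows per request, so reaching 250,000 rows takes up to ten pages. How the module actually pages through results is not documented here, so the helper name and the paging strategy are assumptions.

```python
from datetime import date, timedelta

API_MAX_ROWS = 25_000      # per-request cap of the Search Console API
SYNC_MAX_ROWS = 250_000    # per-sync cap stated in this documentation


def build_query_bodies(end: date, days: int = 90, max_rows: int = SYNC_MAX_ROWS):
    """Return one searchanalytics.query request body per page.

    Hypothetical helper: the module's real paging logic may differ.
    """
    start = end - timedelta(days=days - 1)  # inclusive window of `days` days
    bodies = []
    for start_row in range(0, max_rows, API_MAX_ROWS):
        bodies.append({
            "startDate": start.isoformat(),
            "endDate": end.isoformat(),
            "dimensions": ["date", "query", "page"],
            "rowLimit": API_MAX_ROWS,
            "startRow": start_row,
        })
    return bodies


bodies = build_query_bodies(date(2025, 6, 30))
print(len(bodies), bodies[0]["startDate"], bodies[-1]["startRow"])
```

In practice a client would stop paging as soon as a page comes back with fewer rows than rowLimit, so small stores need far fewer than ten requests.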
Daily cron
For automatic syncs, configure a daily cron (recommended: early morning, 4-6 AM Paris time) calling the module’s secured endpoint.
The exact URL and token are displayed in Settings → Cron. Generic format:
https://your-store.com/module/dfpredictiveseo/cron/sync?token=YOUR_GENERATED_TOKEN
Example crontab line (Unix cron):
0 5 * * * curl -s "https://your-store.com/module/dfpredictiveseo/cron/sync?token=YOUR_TOKEN" > /dev/null 2>&1
The token is stored in the module configuration (DFPSEO_CRON_TOKEN). If compromised, regenerate it from Settings → Regenerate cron token.
Sync pipeline
Each sync runs in sequence:
- GSC pull on the last 90 rolling days (date/query/page dimensions)
- Insert/update into dfpseo_keyword and dfpseo_history
- Recompute seasonal indices (per keyword with enough history)
- Generate forecasts over the configured horizon
- Detect opportunities over the future window
- Log into dfpseo_sync_log
Dashboard
The dashboard (Improve → Predictive SEO → Dashboard) gathers the key indicators:
- 4 KPI cards: tracked keywords, upcoming opportunities, predicted clicks over 14 days, last GSC sync
- Main chart: aggregated history curve (90 days) + forecast (configured horizon, 30 days by default), with 95% confidence band
- Top opportunities: the next 10 seasonal peaks sorted by score
- Sync log: the last 5 syncs with their status
Connection status
Two badges at the top of the dashboard indicate the integration status: GSC connected (green/red) and AI Provider configured (green/red). If either is red, follow the direct link to the corresponding settings.
Keywords & forecasts
Keyword list
The Keywords tab displays the native PrestaShop grid with all synced queries. Columns: query, landing page, 30-day impressions, 30-day clicks, CTR, average position, last update. You can filter, sort and export.
Detailed keyword view
Clicking a keyword opens its profile:
- Individual history + personal forecast curve
- 95% confidence interval around the prediction
- 12 × 7 seasonal heatmap (month × day-of-week)
- Computed seasonal indices
- List of opportunities linked to this keyword
Seasonality heatmap
The heatmap visualizes the multiplicative seasonal indices. How to read it:
- Cell at 1.00 → average traffic on this month × day combination
- Cell at 1.50 → traffic 50% above average (seasonal peak)
- Cell at 0.60 → traffic 40% below average (trough)
Cells are colored from pale blue (trough) to deep blue and orange-red (peak). One glance is enough to spot the weeks worth exploiting.
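The reading rule above is simply the index minus 1, expressed as a percentage. A minimal Python sketch follows; the function name is hypothetical and not part of the module.

```python
def describe_index(index: float) -> str:
    """Turn a multiplicative seasonal index into a readable deviation."""
    deviation = round((index - 1.0) * 100)
    if deviation == 0:
        return "average traffic"
    direction = "above" if deviation > 0 else "below"
    return f"{abs(deviation)}% {direction} average"


for value in (1.00, 1.50, 0.60):
    print(value, "->", describe_index(value))
```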
Seasonal opportunities
Automatic detection
An opportunity is detected when, over a 14-day future window (configurable via DFPSEO_OPPORTUNITY_LOOKAHEAD_DAYS):
- The prediction exceeds the keyword’s baseline × 1.25 (peak threshold)
- AND the seasonal index for the month × day combination is above 1.10
Contiguous peaks (gap ≤ 2 days) are grouped into a single opportunity covering the full window.
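The two conditions and the grouping rule can be sketched in Python as follows. This is an illustration under assumptions, not the module's code: the input shape (a list of day, prediction and seasonal index tuples) is invented for the example, and the gap is measured as the difference between two consecutive peak dates.

```python
from datetime import date, timedelta

PEAK_THRESHOLD = 1.25      # default of DFPSEO_PEAK_THRESHOLD
SEASONAL_THRESHOLD = 1.10  # default of DFPSEO_SEASONAL_THRESHOLD
MAX_GAP_DAYS = 2           # peaks at most 2 days apart are merged (assumed reading)


def detect_opportunities(forecast, baseline):
    """Group peak days into (start, end) windows.

    `forecast` is a list of (day, predicted_value, seasonal_index) tuples
    sorted by day; this shape is an assumption made for the example.
    """
    peak_days = [
        day for day, predicted, index in forecast
        if predicted > baseline * PEAK_THRESHOLD and index > SEASONAL_THRESHOLD
    ]
    windows = []
    for day in peak_days:
        if windows and (day - windows[-1][1]).days <= MAX_GAP_DAYS:
            windows[-1][1] = day        # extend the current window
        else:
            windows.append([day, day])  # start a new window
    return [(start, end) for start, end in windows]


first_day = date(2025, 11, 20)
predicted = [110, 130, 140, 120, 135, 100, 100, 100, 100, 150, 160, 100, 100, 100]
indices = [1.0, 1.2, 1.3, 1.2, 1.15, 1.0, 1.0, 1.0, 1.0, 1.05, 1.4, 1.0, 1.0, 1.0]
forecast = [
    (first_day + timedelta(days=i), p, s)
    for i, (p, s) in enumerate(zip(predicted, indices))
]
windows = detect_opportunities(forecast, baseline=100)
print(windows)
```

In the sample data, the day predicted at 150 is not a peak because its seasonal index (1.05) is below the 1.10 threshold, which shows why both conditions are required.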
Opportunity score
The score combines three factors:
score = expected_clicks × lift × confidence
- expected_clicks: sum of predicted clicks over the window
- lift: peak / baseline ratio
- confidence: width of the prediction interval (the narrower, the higher the score)
A score above 80 indicates a high-potential opportunity with a strong seasonal signal. A score between 40 and 80 indicates a moderate opportunity. Below 40, the signal is too weak or too uncertain to justify priority action.
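A worked example of the formula and the score bands, as a Python sketch. How the module converts the interval width into the confidence factor is not documented, so confidence is treated here as a given number between 0 and 1; the function names and sample figures are made up for illustration.

```python
def opportunity_score(daily_clicks, peak, baseline, confidence):
    """score = expected_clicks × lift × confidence.

    `confidence` is assumed to be a number in [0, 1] supplied by the caller.
    """
    expected_clicks = sum(daily_clicks)  # predicted clicks over the window
    lift = peak / baseline               # peak / baseline ratio
    return expected_clicks * lift * confidence


def priority(score):
    """Map a score to the bands described in the text."""
    if score > 80:
        return "high"
    if score >= 40:
        return "moderate"
    return "low"


score = opportunity_score([12, 15, 18, 15], peak=18, baseline=10, confidence=0.8)
print(round(score, 1), priority(score))
```

Here 60 expected clicks, a lift of 1.8 and a confidence of 0.8 multiply to about 86.4, which falls in the high-potential band.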
Opportunity workflow
Each opportunity has a status:
- New — freshly detected, awaiting decision
- In progress — a brief has been generated, in editorial work
- Done — content published, opportunity exploited
- Dismissed — decision not to process (false positive, out of strategy)
AI recommendations
Generate a brief
From any opportunity, click Generate brief. The module sends a request to the active AI provider with the keyword context (volume, seasonality, current position, target page) and receives a structured JSON brief containing:
- summary — strategic summary of the brief
- meta_description — SEO meta description ready to paste (150-160 characters)
- search_intent — dominant search intent (informational / transactional / navigational / commercial)
- outline — detailed h1/h2/h3 plan of the article or page
- keywords_to_include — semantic keywords to include
- internal_links — internal linking suggestions toward other pages on the site
- rationale — strategic rationale for the recommendation
Generation takes 1 to 3 seconds depending on the provider.
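If you post-process briefs outside the module, a quick sanity check on the fields listed above might look like the Python sketch below. The field names come from this documentation; the types of individual fields are not specified here, so only presence, the search intent value and the meta description length are checked. The function name is hypothetical.

```python
import json

REQUIRED_KEYS = {
    "summary", "meta_description", "search_intent", "outline",
    "keywords_to_include", "internal_links", "rationale",
}
INTENTS = {"informational", "transactional", "navigational", "commercial"}


def check_brief(raw: str) -> list:
    """Return a list of problems; an empty list means the brief looks complete."""
    brief = json.loads(raw)
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(brief))]
    if brief.get("search_intent") not in INTENTS:
        problems.append("unexpected search_intent")
    length = len(brief.get("meta_description", ""))
    if not 150 <= length <= 160:
        problems.append(f"meta_description is {length} characters")
    return problems


sample = json.dumps({
    "summary": "Seasonal landing page for winter boots",
    "meta_description": "m" * 155,  # placeholder text of a valid length
    "search_intent": "commercial",
    "outline": [],
    "keywords_to_include": [],
    "internal_links": [],
    "rationale": "Peak expected in late November",
})
print(check_brief(sample))
```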
Approval workflow
Each brief goes through the following statuses:
- Pending — generated, awaiting review
- Approved — validated for editorial work
- Published — content live (mark manually)
- Rejected — declined (poor quality or off-topic)
- Draft — being edited
The workflow lets you keep a clear track of what’s been processed.
Technical architecture
Stack
- PSR-4 architecture, namespace DfPredictiveSeo → src/
- Symfony controllers extending FrameworkBundleAdminController
- Doctrine DBAL repositories (no ObjectModel)
- GSC accessed via direct REST through cURL + OAuth2 (no google/apiclient, to stay lightweight)
- No mandatory Composer dependency at install (PSR-4 autoloader is bundled)
ML pipeline
- Multiplicative seasonal decomposition: day-of-week and month indices computed via 28-day centered moving average and 10% trimmed mean
- OLS regression on log(impressions+1) to model the log-linear trend
- Forecast: exp(log_prediction) × seasonal_index_day × seasonal_index_month
- 95% intervals: Student approximation on regression residual error, progressively widening with the horizon
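The trend and forecast steps can be sketched in a few lines of Python. This is a simplified illustration, not the module's PHP implementation: the seasonal indices are taken as inputs rather than computed, the series is not deseasonalized before fitting, the confidence interval is omitted, and all function names are hypothetical.

```python
import math
from datetime import date, timedelta


def fit_log_trend(impressions):
    """OLS fit of log(impressions + 1) against the day number 0, 1, 2, ..."""
    n = len(impressions)
    xs = range(n)
    ys = [math.log(v + 1) for v in impressions]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (
        sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
        / sum((x - x_mean) ** 2 for x in xs)
    )
    return y_mean - slope * x_mean, slope  # (intercept, slope)


def forecast_impressions(impressions, first_day, horizon, dow_index, month_index):
    """exp(log_prediction) × day-of-week index × month index, per future day.

    dow_index holds 7 values (Monday first), month_index 12 values (January first).
    The formula follows the documentation as written, without subtracting the +1.
    """
    intercept, slope = fit_log_trend(impressions)
    n = len(impressions)
    out = []
    for step in range(horizon):
        day = first_day + timedelta(days=step)
        log_prediction = intercept + slope * (n + step)
        out.append(
            math.exp(log_prediction)
            * dow_index[day.weekday()]
            * month_index[day.month - 1]
        )
    return out


history = [99] * 30                         # flat history, so the trend is flat
dow = [1.0, 1.0, 1.0, 1.0, 1.0, 1.5, 1.0]   # Saturdays 50% above average
months = [1.0] * 11 + [1.2]                 # December 20% above average
prediction = forecast_impressions(history, date(2025, 12, 1), 7, dow, months)
print([round(v) for v in prediction])
```

With a flat history the trend term stays constant, so every difference between forecast days comes from the seasonal indices. That makes the multiplicative structure easy to see.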
Cron endpoint
The endpoint is public but token-protected. PHP example for programmatic calls:
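$token = 'your_cron_token';
$url = 'https://your-store.com/module/dfpredictiveseo/cron/sync?token=' . urlencode($token);
// file_get_contents() returns false on network errors and HTTP errors such as 403
$response = file_get_contents($url);
if ($response === false) {
    throw new RuntimeException('Cron endpoint unreachable or request rejected');
}
$data = json_decode($response, true);
// $data['status'] = 'ok' | 'error'
// $data['keywords_synced'] = number of keywords updated
// $data['opportunities_detected'] = number of newly detected opportunities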
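For monitoring scripts written outside PHP, the same response can be checked in a few lines. The Python sketch below only relies on the three fields listed above; the shape of an error payload beyond its status is not documented, so the whole payload is included in the exception message. The function name is hypothetical.

```python
import json


def summarize_sync(body: str) -> str:
    """Summarize the JSON body returned by the cron endpoint."""
    data = json.loads(body)
    if data.get("status") != "ok":
        raise RuntimeError(f"sync failed: {data}")
    return (
        f"{data['keywords_synced']} keywords synced, "
        f"{data['opportunities_detected']} new opportunities"
    )


sample_body = '{"status": "ok", "keywords_synced": 120, "opportunities_detected": 3}'
print(summarize_sync(sample_body))
```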
Configuration variables
The module stores 18 configuration keys in the ps_configuration table, including:
- DFPSEO_GSC_CLIENT_ID, DFPSEO_GSC_CLIENT_SECRET, DFPSEO_GSC_REFRESH_TOKEN (encrypted), DFPSEO_GSC_PROPERTY
- DFPSEO_AI_PROVIDER, DFPSEO_AI_MISTRAL_KEY, DFPSEO_AI_OPENAI_KEY, DFPSEO_AI_ANTHROPIC_KEY
- DFPSEO_FORECAST_HORIZON_DAYS (default: 30)
- DFPSEO_OPPORTUNITY_LOOKAHEAD_DAYS (default: 14)
- DFPSEO_PEAK_THRESHOLD (default: 1.25)
- DFPSEO_SEASONAL_THRESHOLD (default: 1.10)
- DFPSEO_CRON_TOKEN (generated at install)
- DFPSEO_LAST_SYNC, DFPSEO_LAST_FORECAST
Troubleshooting
redirect_uri_mismatch error during GSC connection
The redirect URI configured in Google Cloud doesn’t exactly match the one expected by the module. Check:
- Protocol: https:// not http://
- No trailing slash: ...oauth_callback not ...oauth_callback/
- Subdomain: www. or not depending on your store, must match
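When the difference is hard to spot by eye, the three checks above can be automated. The Python sketch below compares the URI entered in Google Cloud with the one shown in the module settings; it is a standalone diagnostic aid with a hypothetical function name, not something the module ships.

```python
from urllib.parse import urlsplit


def redirect_uri_problems(configured: str, expected: str) -> list:
    """List the differences between the Google Cloud URI and the module's URI."""
    a, b = urlsplit(configured), urlsplit(expected)
    problems = []
    if a.scheme != b.scheme:
        problems.append(f"protocol: {a.scheme}:// instead of {b.scheme}://")
    if a.netloc != b.netloc:
        problems.append(f"host: {a.netloc} instead of {b.netloc}")
    if a.path != b.path:
        if a.path.rstrip("/") == b.path.rstrip("/"):
            problems.append("trailing slash differs")
        else:
            problems.append(f"path: {a.path} instead of {b.path}")
    return problems


expected_uri = "https://your-store.com/module/dfpredictiveseo/settings/oauth_callback"
configured_uri = "http://www.your-store.com/module/dfpredictiveseo/settings/oauth_callback/"
for problem in redirect_uri_problems(configured_uri, expected_uri):
    print(problem)
```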
GSC sync returns no keywords
- Check that the property selected in settings is the one receiving SEO traffic (not an empty domain property)
- Check that the connected Google account is owner or authorized user on that property
- The property must have at least a few days of indexed history (Google Search Console publishes data with a 2-3 day delay)
Predictions seem unreliable
- Verify you have at least 60-90 days of history. Below that, seasonal indices can’t be correctly estimated
- For erratic keywords (low signal, high noise), the 95% interval is intentionally widened and the module explicitly displays low confidence
- For more accurate predictions over long horizons (60-90 days), wait until you’ve accumulated 6-12 months of history. The engine improves over time
Cron endpoint returns 403
The token passed in query string doesn’t match DFPSEO_CRON_TOKEN. Check the exact value in Settings → Cron. If in doubt, regenerate the token and update your crontab.
AI brief isn’t generated
- Check that you pasted a valid API key for the selected active provider
- Check your balance on the provider’s console (Mistral, OpenAI or Anthropic)
- If the provider returns a rate limit error, wait a few minutes and retry
- Generated briefs are stored in dfpseo_recommendation; on error, the failure is also logged in that table with the error status
FAQ
Does the module work without Google Search Console?
No, GSC is the primary data source. The module needs at least 60 to 90 days of history to produce reliable predictions and compute seasonality. If your store has just launched, wait until you have at least 2 months of indexed data before installing the module.
How much does an AI brief cost?
It depends on the chosen provider. With Mistral (default), expect around €0.002 per brief. OpenAI GPT-4o-mini around €0.005. Claude Haiku around €0.004. You bring your own API key and pay the provider directly — DataFirefly takes no fee.
How many keywords can the module track?
There’s no hard-coded limit. A standard sync pulls up to 250,000 rows (date × query × page) per call — covering nearly all stores. Beyond that, increase the PHP memory allocated to the cron worker.
Is the module compatible with other SEO modules?
Yes, Predictive SEO never writes to product pages or meta — it only reads GSC and produces recommendations. It’s fully compatible with all existing SEO modules (DataFirefly or third-party).
Is my GSC data stored on DataFirefly?
No. All data stays on your server, in your PrestaShop database. The module calls the Google API directly with your OAuth credentials, and calls the AI providers directly with your API key. No data transits through DataFirefly servers.
Which prediction horizon is most reliable?
The 7-14 day horizon is highly reliable (narrow 95% interval). The 30-60 day horizon is indicative (widened interval). Beyond 90 days, predictions become too uncertain for operational decisions — the module still computes them but flags low confidence.