PrestaShop · Artificial Intelligence

Topic Cluster Detector

Semantic mapping of your catalog and detection of missing pillar pages

The module analyzes your catalog through semantic clustering to surface the actual thematic groupings of your products. For each cluster that lacks a parent page, it generates a complete pillar page draft: title, slug, meta description, H2 outline and priority score.

PrestaShop 8 & 9 · 5 languages · AI + TF-IDF
  • 30-day refund
  • 12 months of updates
  • 24h support
www.datafirefly.com/en/
v1.0.0 · updated 2026-05-27
What it does

The short version.

01

Real semantic clustering

Spherical k-means with k-means++ init on L2-normalized vectors. You see the actual thematic groupings of your products, not your official category tree.

02

3 modes to choose from

Free, local TF-IDF; OpenAI text-embedding-3-small embeddings; or Mistral embeddings. Switch in one click.

03

Missing pillar page detection

The module cross-references each detected cluster against your existing CMS pages and categories. Any cluster that scores below the match threshold is flagged as a pillar gap.

04

Ready-to-publish SEO draft

For each gap, the module generates a title, slug, meta description, complete H2 markdown outline and target keywords. The priority score is based on cluster size and cohesion.

05

5 native languages

Independent analysis per language with built-in French, English, Spanish, German and Italian stop-words, plus an e-commerce noise dictionary (size, color, shipping, price...).

06

Embeddings cache

Vectors are cached by text hash, so unchanged texts are not billed again by the API on later runs.
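As an illustration of caching by text hash, here is a minimal Python sketch. It is not the module's code: the class name `EmbeddingCache`, the in-memory dictionary, the `api_calls` counter and the choice to include the model name in the hash are all assumptions made for the example.

```python
import hashlib


class EmbeddingCache:
    """Hypothetical cache: one stored vector per (model, text) hash."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn   # callable that would hit the paid API
        self._store = {}            # assumption: in-memory store for the sketch
        self.api_calls = 0          # counts cache misses only

    @staticmethod
    def key(text, model):
        # Assumption: the model name is hashed too, so vectors from
        # different providers never collide on the same text.
        return hashlib.sha256(f"{model}\n{text}".encode("utf-8")).hexdigest()

    def get(self, text, model):
        k = self.key(text, model)
        if k not in self._store:
            self._store[k] = self._embed_fn(text)
            self.api_calls += 1
        return self._store[k]
```

With a cache like this, a second run over an unchanged catalog triggers no new embedding calls; only products whose text changed produce a new hash and therefore a new request.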

The long version

Everything you'd want to know before you install.

A detailed look at how Topic Cluster Detector works, why we built it the way we did, and the thinking behind the features above.

§ 01

Why a Topic Cluster Detector?

Your catalog is a semantic gold mine often left untapped. Your products naturally form thematic groupings that search engines try to understand. Without a structuring pillar page for each cluster, Google struggles to identify your expertise on the topic, and your product pages cannibalize each other on informational queries. Topic Cluster Detector identifies these opportunities automatically.

§ 02

How does semantic clustering work?

The module builds a weighted text for each product (name weighted ×3, meta ×2, categories ×2, description ×1), tokenizes it with per-language stop-words, vectorizes it as TF-IDF or dense embeddings, L2-normalizes the vectors, then applies spherical k-means with k-means++ initialization. Clusters emerge from real semantic similarities, not from your category tree.
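The TF-IDF path of this pipeline can be sketched in a few dozen lines of Python. This is an illustrative sketch, not the module's implementation: the function names, the whitespace tokenizer, the smoothed IDF formula and the fixed iteration count are assumptions; only the field weights and the overall sequence of steps come from the description above.

```python
import math
import random
from collections import Counter

# Field weights from the description: name x3, meta x2, categories x2, description x1.
FIELD_WEIGHTS = {"name": 3, "meta": 2, "categories": 2, "description": 1}


def weighted_counts(product, stop_words):
    """Term counts where each field contributes according to its weight."""
    counts = Counter()
    for field, weight in FIELD_WEIGHTS.items():
        for tok in product.get(field, "").lower().split():  # assumption: naive tokenizer
            if tok not in stop_words:
                counts[tok] += weight
    return counts


def tfidf_unit_vectors(products, stop_words=frozenset()):
    """Sparse TF-IDF vectors (dicts), L2-normalized to unit length."""
    docs = [weighted_counts(p, stop_words) for p in products]
    df = Counter(t for d in docs for t in d)  # document frequency per term
    n = len(docs)
    vectors = []
    for d in docs:
        # Assumption: smoothed IDF, ln((1 + n) / (1 + df)) + 1.
        v = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in d.items()}
        norm = math.sqrt(sum(x * x for x in v.values())) or 1.0
        vectors.append({t: x / norm for t, x in v.items()})
    return vectors


def dot(a, b):
    """Dot product of two sparse vectors; equals cosine for unit vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(x * b.get(t, 0.0) for t, x in a.items())


def spherical_kmeans(vectors, k, iters=20, seed=0):
    """Cluster unit vectors by cosine similarity; returns one label per vector."""
    rng = random.Random(seed)
    # k-means++ seeding: for unit vectors, squared Euclidean distance is
    # 2 * (1 - cosine), so sampling proportionally to 1 - cosine matches it.
    centroids = [dict(rng.choice(vectors))]
    while len(centroids) < k:
        weights = []
        for v in vectors:
            d = 1.0 - max(dot(v, c) for c in centroids)
            weights.append(d if d > 1e-12 else 0.0)
        if sum(weights) > 0:
            pick = rng.choices(vectors, weights=weights)[0]
        else:
            pick = rng.choice(vectors)  # all points coincide with a centroid
        centroids.append(dict(pick))
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by cosine similarity.
        labels = [max(range(k), key=lambda j: dot(v, centroids[j])) for v in vectors]
        # Update step: sum the members, then project back onto the unit sphere.
        for j in range(k):
            total = Counter()
            for v, lab in zip(vectors, labels):
                if lab == j:
                    total.update(v)
            norm = math.sqrt(sum(x * x for x in total.values()))
            if norm > 0:  # an empty cluster keeps its previous centroid
                centroids[j] = {t: x / norm for t, x in total.items()}
    return labels
```

Re-normalizing each centroid after the update is what makes the k-means "spherical": both points and centroids stay on the unit sphere, so the assignment step compares directions rather than magnitudes. The sketch runs a fixed number of iterations instead of testing for convergence, to keep it short.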

§ 03

Which mode to choose: TF-IDF or embeddings?

TF-IDF is free, instant, requires no API call, and works very well for lexically homogeneous catalogs (one domain, one vocabulary). OpenAI or Mistral embeddings capture richer semantics, understand synonyms and lexical variants, and excel on diverse catalogs or those with narrative descriptions. You can test both and compare.
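To see why a purely lexical representation struggles with synonyms, consider this small Python sketch. It uses raw term counts rather than full TF-IDF (an assumption made for brevity; IDF weighting does not change the outcome for terms that never co-occur), and the helper name `bow_cosine` is hypothetical.

```python
import math
from collections import Counter


def bow_cosine(a, b):
    """Cosine similarity between two texts using raw term counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    num = sum(ca[t] * cb[t] for t in ca)
    den = (math.sqrt(sum(v * v for v in ca.values()))
           * math.sqrt(sum(v * v for v in cb.values())))
    return num / den if den else 0.0
```

Here `bow_cosine("sofa", "couch")` is 0.0 because the two texts share no token, even though they describe the same kind of product. A dense embedding model would typically place such synonyms close together, which is the advantage the embedding modes offer on catalogs with varied vocabulary.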

§ 04

How are missing pillar pages detected?

The module retrieves your published CMS pages and category landing pages (with descriptions). For each cluster, it computes a fuzzy match score between the cluster's top terms and existing content (title weight 1.0, meta 0.5, body 0.2). Below the configured threshold (0.45 by default), the cluster is flagged as a pillar gap: you have the product material but not the structuring parent page.
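The gap test can be sketched as follows in Python. The field weights (1.0, 0.5, 0.2) and the 0.45 default threshold come from the description above; everything else is an assumption for illustration, in particular the use of exact token overlap as a stand-in for the fuzzy matching, the capped weighted sum, and the function names.

```python
import re

# Weights and threshold as stated in the description.
FIELD_WEIGHTS = {"title": 1.0, "meta": 0.5, "body": 0.2}
DEFAULT_THRESHOLD = 0.45


def _tokens(text):
    return set(re.findall(r"\w+", text.lower()))


def match_score(top_terms, page):
    """Assumed scoring: weighted share of top terms found per field, capped at 1."""
    terms = {t.lower() for t in top_terms}
    if not terms:
        return 0.0
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        overlap = len(terms & _tokens(page.get(field, ""))) / len(terms)
        score += weight * overlap
    return min(1.0, score)


def is_pillar_gap(top_terms, pages, threshold=DEFAULT_THRESHOLD):
    """A cluster is a gap when no existing page reaches the threshold."""
    best = max((match_score(top_terms, p) for p in pages), default=0.0)
    return best < threshold
```

Under these assumptions, a page that mentions every top term only in its body scores 0.2 and does not count as a pillar, while a page with two of three top terms in its title scores about 0.67 and does.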

§ 05

What does the generated draft for each gap contain?

For each missing pillar page, the module generates an H1 title, URL-safe slug, meta description, complete H2 markdown outline (introduction, what is, how to choose, comparison, best products, use cases, mistakes to avoid, FAQ, CTA), target keyword list and priority score combining cluster size and semantic cohesion.
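A rough Python sketch of how such a draft could be assembled is shown below. The section list mirrors the outline described above; the heading wording, the slug rules, the equal-weight priority formula and all function names are assumptions, and the meta description is left out because the source does not say how it is written.

```python
import re
import unicodedata

# Section order from the description; the exact heading text is an assumption.
OUTLINE_SECTIONS = [
    "Introduction", "What is {topic}?", "How to choose", "Comparison",
    "Best products", "Use cases", "Mistakes to avoid", "FAQ", "CTA",
]


def slugify(title):
    """URL-safe slug: strip accents, lowercase, hyphenate everything else."""
    ascii_text = (unicodedata.normalize("NFKD", title)
                  .encode("ascii", "ignore").decode("ascii"))
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def priority_score(size, cohesion, largest):
    """Hypothetical formula: equal blend of relative size and cohesion, in [0, 1]."""
    return 0.5 * (size / largest) + 0.5 * cohesion


def build_draft(topic, keywords, size, cohesion, largest):
    outline = "\n".join(f"## {s.format(topic=topic)}" for s in OUTLINE_SECTIONS)
    return {
        "title": topic,
        "slug": slugify(topic),
        "outline": outline,
        "keywords": list(keywords),
        "priority": priority_score(size, cohesion, largest),
    }
```

For example, `build_draft("Trail running shoes", ["trail", "running"], size=10, cohesion=0.8, largest=20)` yields the slug `trail-running-shoes`, a nine-heading outline and, with this assumed formula, a priority of 0.65.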

§ 06

Who is this module for?

This module is for e-commerce merchants who invest in long-tail SEO, content managers in charge of pillar/cluster strategies, SEO consultants on audit missions, and brands whose large catalogs have poor editorial coverage. It also serves as a diagnostic tool to identify category overlaps or internal linking opportunities.