Everything you'd want to know before you install.
A detailed look at how Topic Cluster Detector works, why we built it the way we did, and the thinking behind the features above.
Why a Topic Cluster Detector?
Your catalog is a semantic gold mine that often goes untapped. Your products naturally form thematic groupings that search engines try to understand. Without a pillar page anchoring each cluster, Google struggles to recognize your expertise on the topic, and your product pages cannibalize each other on informational queries. Topic Cluster Detector identifies these opportunities automatically.
How does semantic clustering work?
The module builds a weighted text for each product (name counted three times, meta and categories twice, description once), tokenizes it with per-language stop-word lists, vectorizes it as TF-IDF or dense embeddings, L2-normalizes the vectors, then runs spherical k-means with k-means++ initialization. Clusters emerge from real semantic similarities, not from your category tree.
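As a rough illustration of the pipeline described above, here is a minimal pure-Python sketch of the TF-IDF path. It is not the module's actual code: the field names, the tiny stop-word list, the smoothed IDF formula, and all function names are assumptions made for the example. Only the field weights (3, 2, 2, 1), the L2 normalization, and the spherical k-means with k-means++ seeding come from the description.

```python
import math
import random
from collections import Counter

# Weights come from the text above; field names and stop-words are assumptions.
FIELD_WEIGHTS = {"name": 3, "meta": 2, "categories": 2, "description": 1}
STOP_WORDS = {"en": {"a", "an", "and", "for", "of", "the", "with"}}


def weighted_tokens(product, lang="en"):
    """Repeat each field's tokens according to its weight, minus stop-words."""
    tokens = []
    for field, weight in FIELD_WEIGHTS.items():
        words = product.get(field, "").lower().split()
        tokens.extend([w for w in words if w not in STOP_WORDS[lang]] * weight)
    return tokens


def normalize(vec):
    """Scale a sparse vector (dict of term -> weight) to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def tfidf_vectors(docs):
    """Smoothed TF-IDF followed by L2 normalization, one sparse dict per doc."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    return [
        normalize({t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
                   for t, c in Counter(doc).items()})
        for doc in docs
    ]


def cosine(a, b):
    """Dot product of two unit vectors, which equals their cosine similarity."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())


def spherical_kmeans(vectors, k, iterations=20, seed=0):
    rng = random.Random(seed)
    # k-means++ seeding: draw the next centroid proportionally to the squared
    # cosine distance from the nearest centroid chosen so far.
    centroids = [rng.choice(vectors)]
    while len(centroids) < k:
        d2 = [max(0.0, 1.0 - max(cosine(v, c) for c in centroids)) ** 2
              for v in vectors]
        if sum(d2) > 0:
            centroids.append(rng.choices(vectors, weights=d2)[0])
        else:
            centroids.append(rng.choice(vectors))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins the centroid with highest cosine.
        labels = [max(range(k), key=lambda j: cosine(v, centroids[j]))
                  for v in vectors]
        # Update step: sum the members, then project back onto the unit sphere.
        for j in range(k):
            total = Counter()
            for v, label in zip(vectors, labels):
                if label == j:
                    total.update(v)
            if total:  # an empty cluster keeps its previous centroid
                centroids[j] = normalize(total)
    return labels


# Hypothetical mini-catalog used only to exercise the sketch.
products = [
    {"name": "Espresso coffee beans", "description": "mild roast"},
    {"name": "Espresso coffee beans", "description": "dark roast"},
    {"name": "Camping tent", "description": "two person"},
    {"name": "Camping tent", "description": "family size"},
]
vectors = tfidf_vectors([weighted_tokens(p) for p in products])
labels = spherical_kmeans(vectors, k=2)
```

On this toy catalog the two coffee products end up in one cluster and the two tents in the other, whatever the category tree says. Because the vectors are unit length, maximizing cosine similarity is what distinguishes the spherical variant from ordinary Euclidean k-means.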
Which mode to choose: TF-IDF or embeddings?
TF-IDF is free and instant, requires no API call, and works very well for lexically homogeneous catalogs (one domain, one vocabulary). OpenAI or Mistral embeddings capture richer semantics, understand synonyms and lexical variants, and excel on diverse catalogs or those with narrative descriptions. You can test both and compare.
How are missing pillar pages detected?
The module retrieves your published CMS pages and category landing pages (with descriptions). For each cluster, it computes a fuzzy match score between the cluster's top terms and existing content (title weight 1.0, meta 0.5, body 0.2). Below the configured threshold (0.45 by default), the cluster is flagged as a pillar gap: you have the product material but no parent page to tie it together.
What does the generated draft for each gap contain?
For each missing pillar page, the module generates an H1 title, a URL-safe slug, a meta description, a complete H2 markdown outline (introduction, what is, how to choose, comparison, best products, use cases, mistakes to avoid, FAQ, CTA), a target keyword list, and a priority score combining cluster size and semantic cohesion.
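To give an idea of what such a draft might look like as data, here is a small sketch. The section list mirrors the outline named above; the title and meta description templates, the slug routine, and above all the priority formula (relative cluster size multiplied by cohesion) are hypothetical, since the description does not specify how the module computes them.

```python
import re
import unicodedata

# Section names follow the outline listed above.
OUTLINE_SECTIONS = [
    "Introduction", "What is", "How to choose", "Comparison", "Best products",
    "Use cases", "Mistakes to avoid", "FAQ", "CTA",
]


def slugify(text):
    """Strip accents, lowercase, and join alphanumeric runs with hyphens."""
    ascii_text = (unicodedata.normalize("NFKD", text)
                  .encode("ascii", "ignore").decode("ascii"))
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def priority_score(cluster_size, cohesion, largest_cluster_size):
    """Hypothetical formula: relative cluster size times cohesion, in [0, 1]."""
    return round((cluster_size / largest_cluster_size) * cohesion, 3)


def build_draft(topic, keywords, cluster_size, cohesion, largest_cluster_size):
    """Assemble the draft fields for one pillar gap (templates are assumed)."""
    return {
        "h1": f"{topic.capitalize()}: a buying guide",
        "slug": slugify(topic),
        "meta_description": f"How to choose {topic}: comparison, best products "
                            f"and mistakes to avoid."[:160],
        "outline": "\n".join(f"## {section}" for section in OUTLINE_SECTIONS),
        "keywords": list(keywords),
        "priority": priority_score(cluster_size, cohesion, largest_cluster_size),
    }


draft = build_draft("camping tents", ["camping tent", "family tent"],
                    cluster_size=12, cohesion=0.5, largest_cluster_size=24)
```

Here `draft["slug"]` is `camping-tents` and `draft["priority"]` is `0.25`. Under the assumed formula, a large but loosely related cluster and a small but tight one can receive similar scores.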
Who is this module for?
It is aimed at e-commerce merchants who invest in long-tail SEO, content managers in charge of pillar/cluster strategies, SEO consultants running audits, and brands whose large catalogs are poorly covered editorially. The module also serves as a diagnostic tool for spotting category overlaps and internal linking opportunities.