
Vector Search Native — Complete Documentation

Installation, AI provider configuration, indexing, REST API, hooks and troubleshooting for the WooCommerce semantic search plugin.

Module version: 1.0.0

Overview

Vector Search Native turns WooCommerce product search into a semantic engine. Instead of matching keywords letter by letter, the plugin converts every product and every query into numeric vectors (embeddings) through an AI model, then computes their similarity of meaning. The result: a customer typing “lightweight cotton summer jacket” finds your “summer linen blazer”, even though the two phrases share only the word “summer”.

The plugin supports three embedding providers (OpenAI, Voyage AI, and Cohere), switchable in one click. Incremental indexing only calls the API when a product’s content has actually changed, and the plugin falls back automatically to the native keyword search whenever vector search returns too few results.

Requirements

  • WordPress 6.2 or later
  • WooCommerce 7.0 or later (tested up to 9.4)
  • PHP 8.0 or later
  • An API key from one of the three providers: OpenAI, Voyage AI, or Cohere
  • Working WP-Cron (or a system cron hitting wp-cron.php)

No special MySQL extension, no Redis server, no Elasticsearch required. Similarity is computed in pure PHP, making the plugin compatible with any standard shared hosting.

Installation

  1. In your WordPress back office, go to Plugins → Add New → Upload Plugin.
  2. Select the vector-search-native.zip file, then click Install Now.
  3. Click Activate. The plugin automatically creates its two tables (wp_vsn_embeddings and wp_vsn_index_queue) and schedules its cron task.
  4. A new Vector Search menu appears under WooCommerce.

Provider configuration

Go to WooCommerce → Vector Search. The “Embedding provider” section lists the three available providers. Select yours in the “Active provider” dropdown, paste your API key in the matching block, pick a model, then click Test connection. A green message confirming the vector dimension (e.g. “Connection OK. Embedding dimension: 1536”) validates the configuration.

Which model to pick

  • OpenAI text-embedding-3-small (1536d): the best quality-price ratio, recommended default.
  • OpenAI text-embedding-3-large (3072d): maximum quality, roughly 6× more expensive.
  • Voyage voyage-3 (1024d): excellent retrieval, trained for search.
  • Cohere embed-multilingual-v3.0 (1024d): the pick for multilingual EN/FR/ES/DE/IT catalogs.

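Switching providers or models makes existing vectors incompatible: each model produces its own embedding space, often with a different dimension, so scores between old product vectors and new query vectors are meaningless. After a change, always run a full reindex.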
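To see which stored vectors a model change leaves behind, a check along the following lines can help. This is a minimal sketch, not the plugin's code: the function name, the constant, and the in-memory array of vectors are all hypothetical, and the dimensions are the ones listed above.

```php
<?php
// Hypothetical helper, not part of the plugin: lists stored vectors whose
// length differs from the dimension of the newly selected model.
const EXAMPLE_MODEL_DIMENSIONS = [
    'text-embedding-3-small'  => 1536,
    'text-embedding-3-large'  => 3072,
    'voyage-3'                => 1024,
    'embed-multilingual-v3.0' => 1024,
];

/**
 * @param array<int, float[]> $vectors Product ID => stored embedding.
 * @return int[] IDs whose vector length does not match the model.
 */
function example_mismatched_ids( array $vectors, string $model ): array {
    if ( ! array_key_exists( $model, EXAMPLE_MODEL_DIMENSIONS ) ) {
        throw new InvalidArgumentException( "Unknown model: $model" );
    }
    $expected   = EXAMPLE_MODEL_DIMENSIONS[ $model ];
    $mismatched = [];
    foreach ( $vectors as $product_id => $vector ) {
        if ( count( $vector ) !== $expected ) {
            $mismatched[] = $product_id;
        }
    }
    return $mismatched;
}

$stored = [
    10 => array_fill( 0, 1536, 0.0 ), // indexed with text-embedding-3-small
    11 => array_fill( 0, 1024, 0.0 ), // indexed with a 1024d model
];
print_r( example_mismatched_ids( $stored, 'voyage-3' ) ); // lists product 10 only
```

A matching length is not proof of compatibility: voyage-3 and embed-multilingual-v3.0 both produce 1024 dimensions but cannot be compared with each other, which is why a full reindex remains the safe rule.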

Initial indexing

  1. Still on the Vector Search page, click Queue all products for reindex. All published products enter the queue.
  2. Click Auto-process until done. The plugin processes the queue in batches (25 products by default) until empty, live from your browser.
  3. The “Indexed”, “Queued”, and “Stuck” counters update in real time.

You can also let WP-Cron do the work in the background: the vsn_process_queue task runs on the configured interval (5 minutes by default) and progressively drains the queue.

Indicative cost: roughly €0.02 for 1,000 products with OpenAI text-embedding-3-small. SHA-256 incremental indexing guarantees an unchanged product never triggers an API call, even if the queue revisits it.
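The idea behind hash-based incremental indexing can be sketched as follows. This is an illustration under assumptions, not the plugin's actual code: the function name is hypothetical, and it assumes the hash of the last indexed text is stored alongside each embedding.

```php
<?php
// Hypothetical sketch, not the plugin's actual code: decide whether a product
// needs a new embedding by comparing SHA-256 hashes of its indexed text.

/**
 * @param string      $indexed_text Text that would be sent to the provider.
 * @param string|null $stored_hash  Hash saved at the last successful indexing, or null.
 */
function example_needs_embedding( string $indexed_text, ?string $stored_hash ): bool {
    $current_hash = hash( 'sha256', $indexed_text );
    // A product with no stored hash has never been indexed.
    return null === $stored_hash || ! hash_equals( $stored_hash, $current_hash );
}

$text = "Summer linen blazer\nLightweight, unlined, breathable.";
$hash = hash( 'sha256', $text );

var_dump( example_needs_embedding( $text, null ) );            // bool(true): never indexed
var_dump( example_needs_embedding( $text, $hash ) );           // bool(false): unchanged
var_dump( example_needs_embedding( $text . ' New!', $hash ) ); // bool(true): content edited
```

Only the cases that return true would cost an API call; an unchanged product is skipped however often the queue revisits it.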

How search works

Once indexing completes, the WooCommerce product search (frontend and standard widgets) is automatically intercepted. The plugin:

  1. Converts the visitor’s query into a vector via the active provider (with a 10-minute cache).
  2. Computes cosine similarity against all stored product vectors.
  3. Keeps products above the minimum similarity threshold (0.30 by default), up to the maximum candidate count (200 by default).
  4. Injects the relevance-sorted IDs into the WordPress query.
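Steps 2 to 4 can be sketched in a few lines of plain PHP. This is a minimal illustration, not the plugin's implementation: the function names are hypothetical, vectors are assumed to be plain arrays of floats keyed by product ID, and the defaults mirror the documented ones (0.30 and 200).

```php
<?php
// Hypothetical sketch of steps 2-4, not the plugin's actual code.

/** Cosine similarity of two equal-length vectors; 0.0 if either has zero norm. */
function example_cosine( array $a, array $b ): float {
    if ( count( $a ) !== count( $b ) ) {
        throw new InvalidArgumentException( 'Vectors must have the same dimension.' );
    }
    $dot = $norm_a = $norm_b = 0.0;
    foreach ( $a as $i => $value ) {
        $dot    += $value * $b[ $i ];
        $norm_a += $value * $value;
        $norm_b += $b[ $i ] * $b[ $i ];
    }
    if ( 0.0 === $norm_a || 0.0 === $norm_b ) {
        return 0.0;
    }
    return $dot / ( sqrt( $norm_a ) * sqrt( $norm_b ) );
}

/**
 * @param float[]             $query   Query embedding.
 * @param array<int, float[]> $vectors Product ID => product embedding.
 * @return int[] Product IDs, most similar first.
 */
function example_rank( array $query, array $vectors, float $min_similarity = 0.30, int $max_candidates = 200 ): array {
    $scores = [];
    foreach ( $vectors as $product_id => $vector ) {
        $score = example_cosine( $query, $vector );
        if ( $score >= $min_similarity ) {
            $scores[ $product_id ] = $score;
        }
    }
    arsort( $scores ); // highest score first, product IDs kept as keys
    return array_slice( array_keys( $scores ), 0, $max_candidates );
}

// Toy 3-dimensional vectors; real embeddings have 1024 to 3072 dimensions.
$ids = example_rank( [ 1.0, 0.0, 0.0 ], [
    101 => [ 0.9, 0.1, 0.0 ],
    102 => [ 0.0, 1.0, 0.0 ],
    103 => [ 0.5, 0.5, 0.0 ],
] );
print_r( $ids ); // 101 first, then 103; 102 scores 0 and is discarded
```

The scan visits every stored vector once per query, so its cost grows linearly with catalog size and vector dimension.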

If the result count falls below the fallback threshold (3 by default), the plugin steps aside and lets the native WooCommerce keyword search run normally. Your visitors never see an empty page because of an AI-side problem.
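The fallback rule itself is a simple count check. A minimal sketch, assuming a hypothetical function that receives the ranked product IDs (an empty list when the provider call failed):

```php
<?php
// Hypothetical sketch, not the plugin's actual code: keep the vector results
// or hand the query back to the native keyword search.

/**
 * @param int[] $ranked_ids Product IDs returned by the vector search.
 * @return int[]|null IDs to inject into the query, or null to fall back.
 */
function example_ids_or_fallback( array $ranked_ids, int $fallback_threshold = 3 ): ?array {
    if ( count( $ranked_ids ) < $fallback_threshold ) {
        return null; // too few vector results: let the keyword search run
    }
    return $ranked_ids;
}

var_dump( example_ids_or_fallback( [ 101, 103 ] ) );       // NULL: 2 results is below 3
var_dump( example_ids_or_fallback( [] ) );                 // NULL: e.g. provider unreachable
print_r( example_ids_or_fallback( [ 101, 103, 107 ] ) );   // the three IDs are kept
```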

Advanced settings

Indexed content

The “Content to index” section lets you pick the fields included in the embedding: short description, long description, SKU, categories, tags, and attributes. The product title is always indexed. Fewer fields can sharpen relevance on some catalogs; including all of them maximizes recall.
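As an illustration of how field selection shapes the text that gets embedded, here is a minimal sketch. The function name, the field keys, and the newline-joined layout are assumptions made for the example; the plugin's real format may differ.

```php
<?php
// Hypothetical sketch, not the plugin's actual code: assemble the text to
// embed from a product's fields, always keeping the title.

/**
 * @param array<string, string|string[]> $product Field name => value (lists for categories, tags, attributes).
 * @param string[]                        $enabled Optional fields ticked in "Content to index".
 */
function example_build_indexed_text( array $product, array $enabled ): string {
    $parts = [ trim( (string) ( $product['title'] ?? '' ) ) ];
    foreach ( $enabled as $field ) {
        $value = $product[ $field ] ?? '';
        if ( is_array( $value ) ) {
            $value = implode( ', ', $value );
        }
        $value = trim( strip_tags( (string) $value ) ); // drop HTML from descriptions
        if ( '' !== $value ) {
            $parts[] = $value;
        }
    }
    return implode( "\n", array_filter( $parts, fn ( $part ) => '' !== $part ) );
}

$product = [
    'title'             => 'Summer linen blazer',
    'short_description' => '<p>Lightweight and unlined.</p>',
    'sku'               => 'BLZ-001',
    'categories'        => [ 'Jackets', 'Summer' ],
];
echo example_build_indexed_text( $product, [ 'short_description', 'categories' ] );
```

Here the SKU is left out because it is not in the enabled list, which is the kind of trimming that can sharpen relevance on catalogs where SKUs carry no meaning.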

Thresholds and candidates

  • Minimum similarity (0.0 – 1.0): below this cosine score, a product is discarded. Raise toward 0.4 – 0.5 to filter aggressively, lower toward 0.2 to widen.
  • Max candidates: maximum number of products handed back to WooCommerce after ranking. Pagination then applies normally.
  • Fallback threshold: if vector search returns fewer results than this, the native keyword search runs instead.

Queue and cron

  • Cron interval: queue processing frequency (1, 5, 15 minutes, or hourly).
  • Batch size (1 – 100): products processed per tick. Increase cautiously to avoid provider rate limits.
  • Each failed product is retried up to 5 times, with the last error message stored in the database. The “Stuck” counter flags products that have exhausted their attempts.
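The retry rule can be pictured with a small state update per failure. This is a sketch under assumptions, not the plugin's schema: the array keys, status names, and function name are hypothetical; only the limit of 5 attempts and the stored last error come from the description above.

```php
<?php
// Hypothetical sketch, not the plugin's actual code: track failed attempts for
// one queue entry and mark it "stuck" after the fifth failure.

const EXAMPLE_MAX_ATTEMPTS = 5;

/**
 * @param array{attempts:int, last_error:?string, status:string} $entry
 * @return array{attempts:int, last_error:?string, status:string}
 */
function example_record_failure( array $entry, string $error ): array {
    $entry['attempts']++;
    $entry['last_error'] = $error; // only the most recent message is kept
    $entry['status']     = $entry['attempts'] >= EXAMPLE_MAX_ATTEMPTS ? 'stuck' : 'queued';
    return $entry;
}

$entry = [ 'attempts' => 0, 'last_error' => null, 'status' => 'queued' ];
for ( $i = 1; $i <= 5; $i++ ) {
    $entry = example_record_failure( $entry, "HTTP 429 on attempt $i" );
}
print_r( $entry ); // attempts 5, last error from attempt 5, status "stuck"
```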

REST API

Five endpoints are exposed under /wp-json/vsn/v1/, all restricted to users with the manage_woocommerce capability:

  • POST /reindex — queues every product.
  • POST /process — processes one batch immediately.
  • GET /stats — returns the counters (total, indexed, queued, stuck).
  • POST /test — tests an API key (parameters: provider, api_key, model).
  • POST /clear — wipes the embeddings index entirely.

Example full reindex from a deployment script:

curl -X POST https://your-shop.com/wp-json/vsn/v1/reindex \
  -u "admin:APPLICATION_PASSWORD"
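The same kind of call can be made from a PHP deployment script. The sketch below is an illustration, not part of the plugin: both function names are hypothetical, it assumes a WordPress application password for a user with the manage_woocommerce capability, and it relies on PHP's built-in HTTP stream wrapper (allow_url_fopen and OpenSSL enabled).

```php
<?php
// Hypothetical deployment-script helpers, not part of the plugin.

/**
 * Build the pieces of an authenticated request to a plugin endpoint.
 *
 * @return array{url:string, method:string, headers:string[]}
 */
function example_vsn_request( string $site_url, string $endpoint, string $method, string $user, string $app_password ): array {
    return [
        'url'     => rtrim( $site_url, '/' ) . '/wp-json/vsn/v1/' . ltrim( $endpoint, '/' ),
        'method'  => strtoupper( $method ),
        // HTTP Basic authentication, the same scheme curl's -u flag uses.
        'headers' => [ 'Authorization: Basic ' . base64_encode( $user . ':' . $app_password ) ],
    ];
}

/** Send the request and return the raw response body. */
function example_vsn_send( array $request ): string {
    $context = stream_context_create( [ 'http' => [
        'method'        => $request['method'],
        'header'        => implode( "\r\n", $request['headers'] ),
        'ignore_errors' => true, // return the body even on 4xx/5xx responses
        'timeout'       => 30,
    ] ] );
    $body = file_get_contents( $request['url'], false, $context );
    if ( false === $body ) {
        throw new RuntimeException( 'Request failed: ' . $request['url'] );
    }
    return $body;
}

$request = example_vsn_request( 'https://your-shop.com/', '/stats', 'get', 'admin', 'APPLICATION_PASSWORD' );
echo $request['url'], "\n"; // https://your-shop.com/wp-json/vsn/v1/stats
// echo example_vsn_send( $request ); // uncomment to call the shop for real
```

Keeping request construction separate from sending makes it easy to log or dry-run a deployment step before it touches the live index.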

Developer hooks

vsn_indexed_text

Customizes the text sent to the provider for each product. Perfect for injecting ACF fields or business metadata:

add_filter( 'vsn_indexed_text', function ( $text, $product ) {
    $material = get_post_meta( $product->get_id(), 'material', true );
    if ( $material ) {
        $text .= "\nMaterial: " . $material;
    }
    return $text;
}, 10, 2 );

vsn_should_engage

Fine-grained control over when vector search engages:

// Disable vector search for single-word queries.
add_filter( 'vsn_should_engage', function ( $engage, $query ) {
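    $s = trim( (string) $query->get( 's' ) );
    // Count whitespace-separated terms; str_word_count() ignores digits and most non-ASCII letters.
    if ( count( preg_split( '/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY ) ) < 2 ) {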
        return false;
    }
    return $engage;
}, 10, 2 );

Multilingual shops

With WPML or Polylang, each translation is a separate WordPress product: each one is therefore embedded independently, in its own language. Two recommendations:

  • Use a multilingual model (Cohere embed-multilingual-v3.0 or Voyage voyage-multilingual-2) so queries and product sheets are projected into the same semantic space regardless of language.
  • After adding a new language or running a bulk translation campaign, launch a full reindex to cover the new products.

Troubleshooting

Products stay "Stuck"

A product turns "Stuck" after 5 consecutive failures. Common causes: invalid or expired API key, provider rate limit, or network timeout. Verify your key with Test connection, fix the issue, then click Queue all products for reindex — this resets the attempt counters.

Search seems unchanged

  • Check that the Enabled box is ticked in the general settings.
  • Check that the "Indexed" counter matches your product count.
  • If you use a third-party search plugin (FiboSearch, SearchWP…), it may short-circuit the standard WordPress query before interception. Disable it or contact us for an integration adjustment.

The queue doesn't drain on its own

WP-Cron only fires on visits. On a low-traffic site, configure a system cron:

*/5 * * * * curl -s https://your-shop.com/wp-cron.php > /dev/null 2>&1
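Once a system cron is in place, a common companion step in WordPress is to turn off visit-triggered cron in wp-config.php. This is a general WordPress setting rather than something the plugin requires, so treat it as an optional sketch:

```php
// In wp-config.php, above the "That's all, stop editing!" line.
// Stops WordPress from spawning wp-cron.php on page visits, so only the
// system cron entry triggers scheduled tasks such as vsn_process_queue.
define( 'DISABLE_WP_CRON', true );
```

Requests made directly to wp-cron.php, like the crontab line above, still run scheduled tasks when this constant is set.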

Uninstalling

Deactivating the plugin suspends the cron but keeps the data. Deleting the plugin from the Plugins page triggers uninstall.php, which removes both MySQL tables, the settings option, and the scheduled tasks. No leftovers in the database.

FAQ

Can I use the plugin without an API key?

No — semantic search requires a provider. Without a key, the plugin stays inert and the native WooCommerce search keeps working normally.

Are API keys exposed client-side?

No. All provider calls happen server-side, from PHP. The key never appears in the HTML or in browser requests.

What catalog size is supported?

The PHP cosine scan stays very fast up to roughly 50,000 products on standard shared hosting. Beyond that, contact us to discuss a dedicated approximate-nearest-neighbor index integration.
