LLMs.txt & AEO Shopware — Complete guide
Install, configure and operate LLMs.txt & AEO: llms.txt / llms-full.txt / robots-ai.txt endpoints, Schema.org structured data (FAQPage, HowTo, Speakable, enriched Product), AEO custom fields, CLI and PSR-6 cache for Shopware 6.7.
Overview
DataFirefly LLMs.txt & AEO is a Shopware 6.7 plugin that makes your store visible and understandable to AI answer engines (ChatGPT, Claude, Perplexity, Gemini). It works on three complementary levels:
- llms.txt / llms-full.txt — two files compliant with the llmstxt.org specification, automatically generated at the root of each sales channel, in every active language.
- Schema.org JSON-LD — automatic structured data injection on every page: Organization, enriched Product, BreadcrumbList, FAQPage, HowTo and Speakable.
- AI crawler control: a /robots-ai.txt endpoint with individual control over 9 bots (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, Bingbot, Meta-ExternalAgent, CCBot, cohere-ai).
Requirements: Shopware 6.7.0+, PHP 8.2+, MySQL 8.0+ or MariaDB 10.6+. The plugin works with the standard storefront and custom themes (Twig inheritance).
Installation
Via ZIP (recommended)
- Download DataFireflyLlmsAeo.zip from your customer account.
- Shopware Administration → Extensions → My extensions → Upload extension.
- Click Install, then Activate.
- Clear the cache: Settings → System → Caches & indexes, or via CLI:
bin/console cache:clear
Via CLI
unzip DataFireflyLlmsAeo.zip -d custom/plugins/
bin/console plugin:refresh
bin/console plugin:install --activate DataFireflyLlmsAeo
bin/console cache:clear
On activation, the plugin automatically installs the datafirefly_aeo custom field set on products, categories, CMS pages and manufacturers. No manual migration required.
Building the administration assets
If the administration module does not appear under Marketing after activation:
bin/console bundle:dump
./bin/build-administration.sh
bin/console cache:clear
Configuration
Configuration lives in Settings → System → Extensions → DataFirefly llms.txt & AEO. It is sales-channel scoped: select a specific channel in the top selector to override the global values.
“General” card
- Enable module — global switch (per sales channel).
- Site author — used in the llms.txt header.
- Site description — llms.txt header blockquote; describe your store in 1-2 AI-oriented sentences.
- Cache lifetime — TTL in seconds (default: 3600).
“llms.txt” card
- Include CMS pages, categories, brands and/or products.
- Maximum number of products listed in the index.
- Include inactive products — off by default; keep it off in production.
“AEO & Schema.org” card
- Individual toggles: Organization, enriched Product, BreadcrumbList, FAQPage, HowTo, Speakable.
- Organization logo and URL — override the sales channel values.
- Phone, contact email, social profiles: these feed the Organization schema (contactPoint, sameAs).
“AI crawlers” card
For each of the 9 bots, three modes:
- Allowed: full access (no restrictive directive).
- Denied: a "Disallow: /" rule for that bot.
- Selective: Disallow rules on the paths you list (one per line, e.g. /checkout/, /account/).
The content of /robots-ai.txt is not automatically merged into your main robots.txt. Copy its content into your robots.txt, or add a server rewrite rule (see robots.txt integration).
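To make the three modes concrete, here is a minimal Python sketch of how a per-bot configuration could map to a directive block. It is an illustration only, not the plugin's PHP code: the function name, the tuple layout and the use of "Allow: /" for the Allowed mode are assumptions, and the actual output of /robots-ai.txt may differ in detail.

```python
from typing import Dict, List, Tuple

# Hypothetical shape of the "AI crawlers" card: bot name -> (mode, paths).
BotRules = Dict[str, Tuple[str, List[str]]]


def render_robots_ai(rules: BotRules) -> str:
    """Render one User-agent group per bot according to its mode."""
    groups = []
    for bot, (mode, paths) in rules.items():
        lines = [f"User-agent: {bot}"]
        if mode == "allowed":
            # Assumption: full access is expressed as an explicit Allow rule.
            lines.append("Allow: /")
        elif mode == "denied":
            lines.append("Disallow: /")
        elif mode == "selective":
            # One Disallow rule per configured path.
            lines.extend(f"Disallow: {path}" for path in paths)
        else:
            raise ValueError(f"unknown mode for {bot}: {mode}")
        groups.append("\n".join(lines))
    # Groups are separated by a blank line.
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(render_robots_ai({
        "GPTBot": ("denied", []),
        "ClaudeBot": ("selective", ["/checkout/", "/account/"]),
    }))
```

Compare the result with what your store actually serves at /robots-ai.txt before relying on it.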
The three endpoints
| URL | Content | Headers |
|---|---|---|
| /llms.txt | Synthetic index: Pages, Categories, Brands, Products, Optional | text/plain; charset=UTF-8, X-Robots-Tag: noindex, public cache |
| /llms-full.txt | Full content: cleaned descriptions, SKU, EAN, brand, grouped specs, FAQ | same |
| /robots-ai.txt | User-agent directive block for the 9 AI crawlers | same |
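For orientation, the llmstxt.org format that /llms.txt follows is a Markdown file: an H1 title, a blockquote summary, then H2 sections containing link lists. The Python sketch below renders that shape. The function name, argument layout and sample data are assumptions made for illustration; the plugin's real generator runs inside Shopware and may order or annotate entries differently.

```python
from typing import Dict, List, Tuple

# (link title, absolute URL, optional note shown after the link)
Link = Tuple[str, str, str]


def render_llms_txt(title: str, description: str,
                    sections: Dict[str, List[Link]]) -> str:
    """Render an llms.txt-style index: H1, blockquote, H2 link lists."""
    lines = [f"# {title}", "", f"> {description}", ""]
    for heading, links in sections.items():
        if not links:
            continue  # skip sections with nothing to list
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            entry = f"- [{name}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    return "\n".join(lines).rstrip("\n") + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "Demo Store",
        "Outdoor gear for hikers.",
        {"Products": [("Tent", "https://your-store.tld/tent", "2-person tent")]},
    ))
```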
Quick verification after installation:
curl -I https://your-store.tld/llms.txt
curl -I https://your-store.tld/llms-full.txt
curl -I https://your-store.tld/robots-ai.txt
Each sales channel exposes its own files on its own domain, in every active language (localised URLs follow the channel’s domain configuration).
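If you want to script the check instead of reading curl output, the following Python sketch compares a response's headers with the values listed in the table above. It assumes that "public cache" means a Cache-Control header containing "public"; the function name is hypothetical, and the headers mapping can come from any HTTP client.

```python
from typing import List, Mapping


def check_endpoint_headers(headers: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the headers look right."""
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}
    problems = []
    if not h.get("content-type", "").lower().startswith("text/plain"):
        problems.append("Content-Type is not text/plain")
    if "noindex" not in h.get("x-robots-tag", "").lower():
        problems.append("X-Robots-Tag does not contain noindex")
    # Assumption: "public cache" is signalled through Cache-Control.
    if "public" not in h.get("cache-control", "").lower():
        problems.append("Cache-Control is not public")
    return problems


if __name__ == "__main__":
    sample = {
        "Content-Type": "text/plain; charset=UTF-8",
        "X-Robots-Tag": "noindex",
        "Cache-Control": "public, max-age=3600",
    }
    print(check_endpoint_headers(sample))
```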
AEO custom fields
The datafirefly_aeo set is available on products, categories, CMS pages and manufacturers, in the Custom fields tab of each entity.
| Field | Type | Usage |
|---|---|---|
| datafirefly_aeo_summary | Text | 1-2 sentence summary used in llms.txt instead of the truncated description |
| datafirefly_aeo_faq | JSON | Structured FAQ, injected as FAQPage JSON-LD |
| datafirefly_aeo_howto | JSON | Structured tutorial, injected as HowTo JSON-LD |
| datafirefly_aeo_speakable | Text | Short text for voice assistants (30-40 speakable words) |
| datafirefly_aeo_exclude | Boolean | Excludes the entity from llms.txt and llms-full.txt |
FAQ field format
[
{
"q": "How long does delivery take?",
"a": "Standard delivery takes 2 to 4 business days."
},
{
"q": "What is your return policy?",
"a": "You have 30 days to return an unused product."
}
]
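As an illustration of where this field ends up, the Python sketch below validates the q/a array and maps it to a Schema.org FAQPage object. The Question / acceptedAnswer / Answer structure is standard Schema.org; the function name and the validation rules are assumptions, and the plugin's own mapping (in PHP) may add or omit properties.

```python
import json
from typing import Any, Dict


def faq_to_jsonld(raw: str) -> Dict[str, Any]:
    """Turn the FAQ field's JSON text into a FAQPage JSON-LD dictionary."""
    items = json.loads(raw)
    if not isinstance(items, list):
        raise ValueError("FAQ field must be a JSON array")
    entities = []
    for index, item in enumerate(items):
        # Each entry needs a non-empty question ("q") and answer ("a").
        if not isinstance(item, dict) or not item.get("q") or not item.get("a"):
            raise ValueError(f"FAQ entry {index} needs non-empty 'q' and 'a'")
        entities.append({
            "@type": "Question",
            "name": item["q"],
            "acceptedAnswer": {"@type": "Answer", "text": item["a"]},
        })
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": entities,
    }


if __name__ == "__main__":
    field = '[{"q": "How long does delivery take?", "a": "2 to 4 business days."}]'
    print(json.dumps(faq_to_jsonld(field), indent=2))
```

Running a check like this before saving the field helps catch malformed JSON, which would otherwise prevent the FAQPage block from being generated as intended.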
HowTo field format
{
"name": "How to install the product",
"totalTime": "PT15M",
"steps": [
{ "name": "Preparation", "text": "Unpack the components." },
{ "name": "Assembly", "text": "Follow the provided diagram." },
{ "name": "Verification", "text": "Test the operation." }
]
}
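The same idea applies to the HowTo field. This Python sketch maps the name / totalTime / steps object to a Schema.org HowTo with numbered HowToStep entries. The function name and the decision to add a "position" to each step are assumptions for illustration, not a description of the plugin's exact output.

```python
import json
from typing import Any, Dict


def howto_to_jsonld(raw: str) -> Dict[str, Any]:
    """Turn the HowTo field's JSON text into a HowTo JSON-LD dictionary."""
    data = json.loads(raw)
    if not isinstance(data, dict) or not data.get("name"):
        raise ValueError("HowTo field must be a JSON object with a 'name'")
    steps = data.get("steps") or []
    if not steps:
        raise ValueError("HowTo field needs at least one step")
    doc: Dict[str, Any] = {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": data["name"],
        "step": [
            # Assumption: steps are numbered from 1 in the order given.
            {"@type": "HowToStep", "position": number,
             "name": step["name"], "text": step["text"]}
            for number, step in enumerate(steps, start=1)
        ],
    }
    if data.get("totalTime"):
        # ISO 8601 duration, e.g. "PT15M" for 15 minutes.
        doc["totalTime"] = data["totalTime"]
    return doc


if __name__ == "__main__":
    field = ('{"name": "How to install the product", "totalTime": "PT15M", '
             '"steps": [{"name": "Preparation", "text": "Unpack the components."}]}')
    print(json.dumps(howto_to_jsonld(field), indent=2))
```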
Shopware custom fields are translatable: fill in the FAQ in each language via the language selector on the product page. The plugin reads the value in the language of the request context.
Schema.org structured data
The plugin injects JSON-LD into the <head> via the storefront/layout/meta.html.twig template (Twig inheritance, compatible with custom themes). Generated schemas:
- Organization (every page): name, logo, URL, contactPoint, sameAs (social profiles).
- Enriched Product (product pages): gtin13 (from EAN), mpn, sku, brand (manufacturer), additionalProperty (specs grouped by property group), aggregateRating (from native Shopware reviews when present).
- BreadcrumbList: complete breadcrumb of the current page.
- FAQPage: when the datafirefly_aeo_faq field is filled on the page’s entity.
- HowTo: when the datafirefly_aeo_howto field is filled.
- Speakable: CSS selectors h1, .product-detail-name, .product-detail-description-text, .cms-element-text, [data-speakable], plus the dedicated field text.
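To show how the enriched Product properties fit together, here is a small Python sketch that assembles such an object from a plain dictionary. The input keys (ean, manufacturer, properties, ratings) are hypothetical stand-ins for Shopware product data, and the averaging of ratings is an assumption; only the Schema.org property names come from the list above.

```python
import json
from typing import Any, Dict


def product_to_jsonld(product: Dict[str, Any]) -> Dict[str, Any]:
    """Build an enriched Product JSON-LD dictionary from simple product data."""
    doc: Dict[str, Any] = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product["name"],
        "sku": product["sku"],
    }
    ean = str(product.get("ean") or "")
    if len(ean) == 13 and ean.isdigit():
        doc["gtin13"] = ean  # only emit gtin13 for a 13-digit EAN
    if product.get("mpn"):
        doc["mpn"] = product["mpn"]
    if product.get("manufacturer"):
        doc["brand"] = {"@type": "Brand", "name": product["manufacturer"]}
    # One PropertyValue per property group, values joined into one string.
    specs = [
        {"@type": "PropertyValue", "name": group, "value": ", ".join(values)}
        for group, values in product.get("properties", {}).items()
    ]
    if specs:
        doc["additionalProperty"] = specs
    ratings = product.get("ratings") or []
    if ratings:
        doc["aggregateRating"] = {
            "@type": "AggregateRating",
            "ratingValue": round(sum(ratings) / len(ratings), 1),
            "reviewCount": len(ratings),
        }
    return doc


if __name__ == "__main__":
    print(json.dumps(product_to_jsonld({
        "name": "Tent", "sku": "SW-1001", "ean": "4006381333931",
        "manufacturer": "Acme", "properties": {"Color": ["Green", "Grey"]},
        "ratings": [5, 4],
    }), indent=2))
```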
Recommended validation after going live:
- Schema.org Validator — paste a product page URL.
- Google Rich Results Test.
Administration module
Under Marketing → DataFirefly llms.txt & AEO:
- Live preview of llms.txt or llms-full.txt, in monospace rendering.
- Sales channel selector — preview each channel independently.
- One-click cache invalidation (per channel or global).
- Open public URL and copy to clipboard.
CLI commands and automation
datafirefly:llms-txt:generate
# Generate the llms.txt of a sales channel (printed to standard output)
bin/console datafirefly:llms-txt:generate --sales-channel=<id>
# Full version, written to a file, bypassing the cache
bin/console datafirefly:llms-txt:generate --sales-channel=<id> --full --output=/tmp/llms-full.txt --no-cache
datafirefly:llms-txt:warm
# Warm the cache for all sales channels x all active languages
bin/console datafirefly:llms-txt:warm
# Force regeneration even if the cache is still valid
bin/console datafirefly:llms-txt:warm --force
# Warm only llms.txt (skip llms-full.txt)
bin/console datafirefly:llms-txt:warm --skip-full
Recommended cron
# Daily warming at 03:15
15 3 * * * cd /var/www/shopware && php bin/console datafirefly:llms-txt:warm --quiet
A Shopware scheduled task is also registered on activation: if your Messenger worker and the scheduled task runner are running, the cache warms itself automatically without a system cron.
robots.txt integration
Two approaches to expose the AI directives in your main robots.txt:
Manual copy
Open /robots-ai.txt, copy the generated block and paste it into your existing robots.txt. Repeat after each bot configuration change.
Server rewrite (recommended when robots.txt is fully managed by the plugin)
# nginx
location = /robots.txt {
rewrite ^ /robots-ai.txt last;
}
# Apache (.htaccess)
RewriteEngine On
RewriteRule ^robots\.txt$ /robots-ai.txt [L]
Only use the full rewrite if you have no other robots.txt directives to preserve (sitemap, existing SEO exclusions). When in doubt, prefer the manual copy of the AI block.
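If you repeat the manual copy often, it can be scripted. The Python sketch below inserts the AI block into an existing robots.txt between two marker comments and replaces it on later runs, leaving sitemap lines and other directives untouched. The marker strings and the function name are hypothetical; fetching /robots-ai.txt and writing the file back are left to your deployment tooling.

```python
BEGIN = "# BEGIN AI crawlers"
END = "# END AI crawlers"


def merge_ai_block(robots_txt: str, ai_block: str) -> str:
    """Insert or replace the marked AI block inside an existing robots.txt."""
    block = f"{BEGIN}\n{ai_block.strip()}\n{END}"
    start = robots_txt.find(BEGIN)
    end = robots_txt.find(END)
    if start != -1 and end > start:
        # A marked block already exists: swap it, keep everything around it.
        return robots_txt[:start] + block + robots_txt[end + len(END):]
    # First run: append the block after the existing directives.
    base = robots_txt.rstrip("\n")
    separator = "\n\n" if base else ""
    return f"{base}{separator}{block}\n"


if __name__ == "__main__":
    existing = "User-agent: *\nDisallow: /account/\nSitemap: https://your-store.tld/sitemap.xml\n"
    print(merge_ai_block(existing, "User-agent: GPTBot\nDisallow: /\n"))
```

Because the block is replaced rather than appended, running the merge after each bot configuration change does not create duplicates.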
Cache and performance
- PSR-6 cache on Shopware’s cache.object pool, tagged datafirefly_llms_aeo.
- Keys scoped per sales channel + language: each combination has its own entry.
- Configurable TTL (default 3600 s).
- Invalidation: admin button (per channel or global), warm --force command, or natural expiration.
- Compatible with clustered cache (Redis): tag-based invalidation works on all taggable adapters.
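The caching behaviour described above can be modelled in a few lines. This Python sketch is an analogy, not the plugin's PSR-6 code: it keeps one entry per (file, sales channel, language) combination, expires entries after the TTL, supports a forced regeneration, and invalidates per channel or globally. The class and method names are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# (file name, sales channel id, language id): one cache entry per combination.
Key = Tuple[str, str, str]


class ScopedCache:
    def __init__(self, ttl_seconds: int = 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Key, Tuple[float, str]] = {}

    def get_or_generate(self, key: Key, generate: Callable[[], str],
                        force: bool = False) -> str:
        """Return the cached value, regenerating on miss, expiry or force."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and not force and entry[0] > now:
            return entry[1]
        value = generate()
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, sales_channel_id: Optional[str] = None) -> None:
        """Drop one channel's entries, or every entry when no id is given."""
        self._entries = {
            key: entry for key, entry in self._entries.items()
            if sales_channel_id is not None and key[1] != sales_channel_id
        }


if __name__ == "__main__":
    cache = ScopedCache(ttl_seconds=3600)
    key = ("llms.txt", "channel-a", "en-GB")
    print(cache.get_or_generate(key, lambda: "# Demo Store"))
```

Scoping the key by channel and language is what lets one channel be invalidated without regenerating the files of the others.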
Troubleshooting
Endpoints return 404
- Check the plugin is activated (not just installed).
- Clear the HTTP cache and the application cache: bin/console cache:clear.
- If you use a reverse proxy/CDN, purge it too.
“Attempted to call an undefined method named getHeader” error on navigation pages
Bug fixed in version 1.0.1: on some Shopware 6.7 installations, NavigationPage does not expose getHeader(). Update to 1.0.1 (defensive extraction of the active category). If you are already on 1.0.1 and still see the error, clear the PHP opcode cache (opcache_reset or PHP-FPM restart).
The admin module does not appear under Marketing
Build the administration assets (see Installation) then force-reload the browser (Ctrl+Shift+R).
llms.txt is empty or incomplete
- Check the inclusion toggles (CMS pages / categories / brands / products) in the “llms.txt” card.
- Check the product limit is not set to 0.
- Check the datafirefly_aeo_exclude field on the missing entities.
- Invalidate the cache and reload.
JSON-LD does not appear in the page source
- Check “Enable module” and the Schema.org toggles are active for the right sales channel.
- If your theme overrides storefront/layout/meta.html.twig without {{ parent() }} on the relevant block, the injection is lost: restore the parent call.
Changelog
1.0.1 — 2026-05-21
- Fix: defensive extraction of the active category on navigation pages (getHeader() error on some 6.7 installations).
1.0.0 — 2026-05-21
- Initial release: llms.txt + llms-full.txt, 6 JSON-LD schemas, robots-ai.txt (9 bots), AEO custom fields, Vue 3 admin module, 2 CLI commands, scheduled task, FR/EN/DE snippets.