Shopify Knowledge Base App vs. On-Site Structured Data: What Each One Actually Does

Learn the difference between Shopify Knowledge Base and schema markup for AI visibility. See how ChatGPT, Google AI Overviews, Bing, and Perplexity discover Shopify stores.

Shopify's push into agentic commerce has introduced a new sales channel where AI models - not human shoppers - are the first point of contact with your products. The Shopify Knowledge Base app is one piece of that system, but it only covers one of the two distinct paths AI uses to find and recommend your store. This guide breaks down how each path works, what they actually do under the hood, and where they overlap or leave gaps.

Five things to know before reading further:

  1. Shopify’s Knowledge Base app sends FAQ and policy data directly to partnered AI platforms like OpenAI and Microsoft through Shopify’s internal pipeline, not through your storefront HTML.
  2. Knowledge Base FAQs are not visible on your storefront and are not readable by search engines or web crawlers.
  3. Structured data (JSON-LD schema) is a separate system that lives inside your store’s HTML and can be read by Google, Bing, Perplexity, and other AI systems that crawl the web.
  4. These two systems are complementary. One powers Shopify’s direct AI integrations, while the other supports open-web discovery.
  5. If your FAQs only exist inside Shopify’s Knowledge Base app, they are invisible to Google AI Overviews, Perplexity, Bing’s web index, voice assistants, and other web-based AI systems.

What Shopify's agentic commerce system actually does

Shopify has built a direct data pipeline between its platform and external AI models. The technical infrastructure has two main components: the Agentic Catalog (for products) and the Policy and FAQs MCP endpoint (for store information).

The Agentic Catalog works like this: Shopify aggregates product data from every store on its platform into a Global Catalog hosted at catalog.shopify.com. AI agents can query this catalog using three tools - search_catalog for keyword-based product discovery, lookup_catalog for retrieving specific products by ID, and get_product for full variant/pricing/checkout details.
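As a rough sketch of what such a query might look like on the wire: MCP is built on JSON-RPC 2.0, and tools are invoked with a tools/call request. The tool name search_catalog comes from the description above. The argument name (query) and the build_tool_call helper are assumptions made for illustration, not Shopify's documented schema.

```python
import json
from itertools import count

_request_ids = count(1)  # each JSON-RPC request carries its own id


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP-style JSON-RPC 2.0 `tools/call` request body."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# The "query" argument name is hypothetical; check the tool's published schema.
request = build_tool_call("search_catalog", {"query": "at-home DNA test kit"})
print(json.dumps(request, indent=2))
```

The same envelope would carry lookup_catalog or get_product calls; only the tool name and arguments change.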

When someone asks ChatGPT "find me an at-home DNA test kit," the AI queries this global catalog and can surface products from any Shopify store that has its catalog properly set up. The results include pricing, images, availability, and a checkout link that opens in a new browser tab.

There is also a Storefront Catalog scoped to individual stores, which powers single-store AI agents and chatbots.

The key technical detail: this entire system runs on what Shopify calls UCP (Universal Commerce Protocol), which is essentially an MCP server that AI agents connect to directly. The data never passes through your store's public HTML. It travels from Shopify's backend to the AI platform's backend.

As of April 2026, in-chat checkout inside ChatGPT has been removed. OpenAI pulled back from that feature. Product recommendations still happen inside the chat, but the actual purchase redirects to the store's checkout page in a new tab.

How the Knowledge Base app fits into this system

The Shopify Knowledge Base is a free first-party app that handles the non-product side of this pipeline - store policies, shipping info, return rules, and custom FAQs.

What it does when you install it

When you install the app, it automatically scans your store's backend settings and generates a set of default FAQs. The sources it pulls from include:

  • Your language settings and customer account configuration
  • Your shipping and delivery zones
  • Your return rules and store policies

You can override any of these auto-generated answers or add your own custom FAQs manually. Each FAQ is a short question-and-answer pair (Shopify recommends 1-2 sentences per answer). All of them are stored as metaobjects in your Shopify admin.

Here is what happens to those FAQs after you save them:

  • They get fed into Shopify's AI pipeline through the store-level MCP endpoint (yourdomain.com/api/mcp).
  • When an AI agent receives a customer question like "what's your return policy?", it calls the search_shop_policies_and_faqs tool, matches the query against your stored FAQs, and returns the answer along with source references.
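The matching step can be pictured with a toy example. This is not Shopify's implementation (how the tool matches queries internally is not documented here). It is a minimal sketch that assumes plain word overlap, a hypothetical Faq record with question, answer, and source fields, and made-up sample answers.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Faq:
    question: str
    answer: str
    source: str  # hypothetical reference to where the answer came from


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_match(query: str, faqs: list[Faq]) -> Faq | None:
    """Return the FAQ sharing the most words with the query, or None if nothing overlaps."""
    query_words = tokens(query)
    scored = [(len(query_words & tokens(faq.question)), faq) for faq in faqs]
    score, faq = max(scored, key=lambda pair: pair[0], default=(0, None))
    return faq if score > 0 else None


# Sample data invented for the example.
faqs = [
    Faq("What is your return policy?", "Returns are accepted within 30 days.", "policy:returns"),
    Faq("How long does shipping take?", "Orders ship in 2-3 business days.", "policy:shipping"),
]
match = best_match("what's your return policy?", faqs)
if match is not None:
    print(match.answer, f"(source: {match.source})")
```

A query with no overlapping words returns None, which corresponds to the kind of question that would go unanswered.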

The app also includes a query log that tracks every question AI agents ask about your store. You can see which questions were answered successfully, which ones went unanswered, and which AI platform asked them.

The unanswered questions section is the most actionable part. It shows you exactly where your store has information gaps that may be costing you AI-referred sales in this channel.

What the Knowledge Base app does not do

This is where the distinction matters.

The Knowledge Base FAQs do not appear anywhere on your storefront. A human visitor browsing your store will never see them. There is no customer-facing FAQ page generated by this app.

The FAQs do not produce any structured data markup in your store's HTML. No JSON-LD, no FAQPage schema, no microdata. The metaobjects feed Shopify's internal pipeline only.

Here is what falls entirely outside the app's scope:

  • Breadcrumbs and breadcrumb schema (BreadcrumbList)
  • Collection hierarchy and site architecture
  • Related searches and internal linking
  • Product-level structured data beyond Shopify's standard taxonomy
  • Any on-site SEO architecture

The app's reach is limited to AI platforms connected through Shopify's partnership agreements - currently ChatGPT and Microsoft Copilot. Gemini integration has been mentioned but is not confirmed as live.

Google's own search engine, including AI Overviews, does not read from Shopify's Knowledge Base metaobjects. Neither do Perplexity, Brave Search, or any other system that discovers information by crawling the web.

How AI systems discover information through the open web

The second path is fundamentally different in architecture.

The RAG pipeline - how AI answers actually get built

Every AI system that generates answers from web content works by reading and indexing publicly accessible web pages. This includes Google's AI Overviews, Bing/Copilot's web-grounded responses, Perplexity, Brave AI Search, and voice assistants.

The process follows a pattern called RAG (Retrieval-Augmented Generation):

  • The AI receives a user question.
  • It searches for relevant web content across its index.
  • It retrieves that content and uses it to ground its answer.
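The three steps above can be sketched in a few lines. This is a toy illustration only: it ranks pages by word overlap instead of a real search index, stops at building the grounded prompt rather than calling a model, and uses a hypothetical two-page index with invented text.

```python
import re


def words(text: str) -> set:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, index: dict, top_k: int = 1) -> list:
    """Rank indexed pages by word overlap with the question (a stand-in for real search)."""
    question_words = words(question)
    ranked = sorted(
        index,
        key=lambda url: len(question_words & words(index[url])),
        reverse=True,
    )
    return ranked[:top_k]


def build_grounded_prompt(question: str, index: dict) -> str:
    """Attach the retrieved passages so the answer is grounded in them, with sources."""
    sources = retrieve(question, index)
    context = "\n".join(f"[{url}] {index[url]}" for url in sources)
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"


# Hypothetical index of crawled pages (URL -> extracted text).
index = {
    "https://example.com/returns": "Returns are accepted within 30 days of delivery.",
    "https://example.com/shipping": "Orders ship within two business days.",
}
prompt = build_grounded_prompt("How many days do I have for returns?", index)
print(prompt)
```

The point of the sketch is that the answer can only draw on whatever text the retrieval step was able to find and read.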

This is where structured data markup (specifically JSON-LD schema) plays a direct role.

When your store's HTML includes proper schema markup - FAQPage, Product, BreadcrumbList, Organization, and other types - you are giving these AI systems a machine-readable layer of verified facts about your store. Instead of forcing the AI to parse and interpret your page's visual content (which is error-prone), structured data hands the information over in a format the AI can consume directly.
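To make "consume directly" concrete, here is a minimal sketch of how a crawler-side script could lift JSON-LD out of a page without interpreting any of the visible content. It uses only Python's standard library. The sample HTML and the JsonLdExtractor class are made up for illustration, and real crawlers are considerably more defensive.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect every <script type="application/ld+json"> block on a page."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None  # None means we are not inside a JSON-LD script

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._buffer = None


html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Store"}
</script>
</head><body><h1>Example Store</h1></body></html>"""

extractor = JsonLdExtractor()
extractor.feed(html)
print(extractor.blocks[0]["@type"], extractor.blocks[0]["name"])
```

No layout parsing or text interpretation is involved: the facts arrive as a ready-made data structure.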

What the search engines have said publicly

Microsoft has been the most explicit. At SMX Munich in March 2025, Fabrice Canel (Principal Program Manager at Bing) confirmed that schema markup directly helps Microsoft's LLMs understand content. This applies to both Bing search and Copilot's web-grounded responses.

Google's position is slightly different in wording but points in the same direction.

Ryan Levering, Google's Software Engineer for Structured Data, stated at Google Search Central Live in March 2025 that Google's systems "run much better with structured data" because it is "computationally cheaper than extracting it."

Google officially says no special schema is required for AI features. But their own representatives consistently confirm that structured data helps their systems understand pages - and understanding pages is how content gets selected for AI-generated answers.

What the experimental data shows

A controlled experiment by Search Engine Land tested three identical single-page sites with different levels of schema implementation:

  • Well-implemented schema: The only page to appear in AI Overviews. Achieved Position 3 (highest of the three).
  • Poorly implemented schema: Ranked for 10 keywords, peaked at Position 8. Zero AI Overview appearances.
  • No schema at all: Not indexed despite being crawled. Zero rankings.

The takeaway is straightforward. The open-web channel reaches every AI system that reads websites, and structured data is the format these systems rely on to confidently understand and cite your content.

Two distribution channels, not two competing tools

This is the core point that most conversations about this topic miss.

Shopify's Knowledge Base and on-site structured data are not two versions of the same thing. They operate on completely different distribution channels with different technical mechanisms, different reach, and different audiences.

Shopify's Knowledge Base pipeline:

  • Data format: Metaobjects stored in Shopify admin
  • Distribution: Shopify's proprietary feed to partnered AI platforms
  • Reach: ChatGPT, Microsoft Copilot (confirmed), Gemini (announced)
  • Visibility to Google/Bing web crawlers: None
  • Visibility to store visitors: None
  • Content scope: Store policies, shipping, returns, custom FAQ pairs

On-site structured data (JSON-LD schema):

  • Data format: JSON-LD markup rendered in the store's HTML
  • Distribution: The open web - accessible to any system that crawls your site
  • Reach: Google (including AI Overviews), Bing (including Copilot's web-grounded RAG), Perplexity, Brave AI Search, voice assistants, and any future AI crawler
  • Visibility to Google/Bing web crawlers: Full
  • Visibility to store visitors: Indirect (powers rich results, breadcrumb displays, FAQ widgets)
  • Content scope: Products, FAQs, breadcrumbs, organization info, reviews, collections, entity relationships

A store that only uses the Knowledge Base app has its FAQ content available in ChatGPT and Copilot, but that same content is completely invisible to Google's AI Overviews, Bing's organic web results, Perplexity, and every other web-based AI discovery system.

A store that only has on-site structured data is fully visible across the open web but is not feeding Shopify's direct AI pipeline.

The strongest position is having both.

Where Risify fits in this picture

Risify is a Shopify app built specifically for the open-web channel - the side that Shopify's Knowledge Base does not cover.

The core difference: when you create content through Risify, it gets rendered directly into your store's HTML as structured data (JSON-LD schema) and as visible on-page elements. That means the content is readable by every search engine and AI system that crawls the web, and it is also visible to human visitors browsing your store.

FAQs with structured data output

When you add FAQs through Risify, two things happen simultaneously:

  • The FAQs appear as visible FAQ sections on your actual collection and product pages, so your customers can read them.
  • The same FAQs are output as FAQPage schema (JSON-LD) in the page's HTML, so Google, Bing, Perplexity, and every other web-crawling AI system can read them too.
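For reference, FAQPage markup follows the schema.org vocabulary: a list of Question items, each with an acceptedAnswer. The sketch below builds that structure from question-and-answer pairs and wraps it in a script tag. It illustrates the general format, not Risify's actual output, and the helper names and sample FAQ are assumptions.

```python
import json


def faq_page_jsonld(faqs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }


def as_script_tag(data):
    """Serialize the object into the script element that sits in the page's HTML."""
    # Escape "</" so FAQ text can never close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Sample FAQ invented for the example.
faqs = [("How long does shipping take?", "Orders ship within two business days.")]
schema = faq_page_jsonld(faqs)
tag = as_script_tag(schema)
print(tag)
```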

This is the part that overlaps with Shopify's Knowledge Base in terms of content - but the distribution is completely different. Knowledge Base FAQs reach ChatGPT and Copilot through Shopify's internal feed. Risify FAQs reach the entire open web through your store's HTML.

If you use both, the same FAQ content reaches both channels.

Breadcrumbs and breadcrumb schema

Risify generates BreadcrumbList schema and visible breadcrumb navigation based on your store's collection hierarchy.

This matters for AI discovery because breadcrumbs communicate entity relationships. A product that sits inside "At-Home DNA Tests > Paternity DNA Test" tells an AI system exactly what category that product belongs to and how it relates to other products on the site. Without this structure, the AI has to infer those relationships from page content alone - which is slower, less reliable, and more likely to produce inaccurate citations.
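In schema.org terms, that hierarchy is expressed as a BreadcrumbList whose ListItem entries carry an explicit position. A minimal sketch using the example trail above follows. The URLs are placeholders, and the helper is illustrative rather than a description of how Risify builds its markup.

```python
import json


def breadcrumb_jsonld(trail):
    """Build a schema.org BreadcrumbList from ordered (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Placeholder URLs; the names come from the example in the text.
trail = [
    ("At-Home DNA Tests", "https://example.com/collections/at-home-dna-tests"),
    ("Paternity DNA Test", "https://example.com/collections/paternity-dna-test"),
]
breadcrumbs = breadcrumb_jsonld(trail)
print(json.dumps(breadcrumbs, indent=2))
```

The position values are what state the parent-child order explicitly, so nothing has to be inferred from page layout.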

Shopify has no native equivalent for this. The Knowledge Base app does not handle breadcrumbs, collection hierarchy, or any site architecture.

Related searches and internal linking

Related searches create internal linking structures and contextual navigation that reinforce your site's topical architecture.

For AI systems using RAG, the more clearly your site communicates the relationships between pages and topics, the more confidently the AI can cite your store. Internal links between related collections and products are signals that help AI systems map the scope of what your store covers.

Product schema

Product-level structured data - pricing, availability, reviews, variants - rendered as JSON-LD in your store's HTML.

This is the same type of data that Shopify's Agentic Catalog sends through its proprietary pipeline to ChatGPT and Copilot. Risify delivers it through the open-web channel instead, making it available to Google, Bing, and every other web crawler.

AI-generated content

Collection descriptions and other content that gives AI crawlers more topical context about what your store sells and how your products relate to each other. Richer page content means more material for AI systems to ground their answers in when citing your store.

Why this distinction matters

None of the functionality above exists in Shopify's Knowledge Base app. The Knowledge Base was built for a specific purpose: feeding Shopify's internal AI pipeline with policy and FAQ data. Risify was built for the open-web channel - structured data, site architecture, and on-page content that every search engine and AI crawler can access.

What this means for your store in practice

If you are a Shopify store owner trying to figure out what to do, here is the practical breakdown.

Step 1: Install the Shopify Knowledge Base app

It is free, it takes minutes to set up, and it gets your store's basic policy information into Shopify's AI pipeline.

  • Review the auto-generated FAQs and fix any that are inaccurate.
  • Add custom FAQs for questions your customers frequently ask.
  • Monitor the query log over time and fill gaps when unanswered questions appear.

This gets you covered for the ChatGPT and Copilot channel.

Step 2: Set up on-site structured data for the open-web channel

This is where your FAQs, product data, breadcrumb hierarchy, and entity relationships need to exist as JSON-LD schema in your store's actual HTML.

This is what Google reads. This is what Bing's web crawler reads. This is what Perplexity, Brave AI Search, and every other web-crawling AI system reads.

If you are already using Risify, you have this covered. Your FAQs are being output as structured data, your breadcrumbs are generating BreadcrumbList schema, and your product pages carry proper Product schema.

If you are not using any structured data tool, you have a gap in your AI discoverability that the Knowledge Base app alone cannot fill.

The overlap point: FAQ content

If you write a strong FAQ for your Knowledge Base app, that same question and answer should also exist as structured data in your store's HTML. The content is the same - the distribution channels are different.
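One way to keep the two copies aligned is a simple comparison script. The sketch below assumes you can list your Knowledge Base questions by hand or from an export (the app's export options are not covered here) and that you have the FAQPage object from your page. All names and sample data are hypothetical.

```python
def normalize(question: str) -> str:
    """Ignore case and extra whitespace when comparing questions."""
    return " ".join(question.lower().split())


def missing_from_site(knowledge_base_questions, faq_page_schema):
    """List Knowledge Base questions that have no match in the on-site FAQPage schema."""
    on_site = {normalize(item["name"]) for item in faq_page_schema.get("mainEntity", [])}
    return [q for q in knowledge_base_questions if normalize(q) not in on_site]


# Hypothetical inputs: questions from the Knowledge Base, schema from the storefront.
knowledge_base_questions = ["What is your return policy?", "Do you ship to Canada?"]
faq_page_schema = {
    "@type": "FAQPage",
    "mainEntity": [{"@type": "Question", "name": "what is your  return policy?"}],
}
gaps = missing_from_site(knowledge_base_questions, faq_page_schema)
print(gaps)
```

Any question the script lists exists in one channel only.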

Having it in both places means it reaches both the proprietary AI pipeline and the open web.

The parts with no Knowledge Base equivalent

Breadcrumbs, collection hierarchy, related searches, product schema, entity relationships - these only exist through on-site structured data. There is no Shopify-native alternative for getting this information to Google or web-crawling AI systems.

This is where the two systems stop overlapping entirely. The Knowledge Base handles policy and FAQ data for Shopify's AI pipeline. Everything related to your store's SEO architecture and how search engines understand your site's structure lives on the open-web side.

The bottom line

Shopify's agentic commerce push and the Knowledge Base app are real, functional, and worth using. They solve a specific problem: getting your store's information into a direct feed for ChatGPT and Copilot.

But they cover one channel. The open web - where Google, Bing, Perplexity, AI Overviews, voice assistants, and every future AI crawler operate - requires structured data in your store's HTML. That channel has broader reach, deeper research backing, and no dependency on a single platform's partnership agreements.

The stores that will be most visible to AI systems are the ones that cover both channels: Shopify's proprietary pipeline through the Knowledge Base app, and the open web through proper on-site structured data and SEO architecture.

They are not the same thing. They were never designed to be. And treating them as interchangeable means leaving one entire discovery channel uncovered.
