— STUDY · STRUCTURED DATA & AI
Structured Data in the AI Era: from rich results to citations in AI answers
Structured data isn't a magic ranking factor. It's the infrastructure that makes your content readable, disambiguable and citable — from rich results all the way to citations in AI answers.
A Websem study on how the stakes of structured data have shifted in the AI era: we no longer chase SERP widgets that come and go, we build a semantic layer anchored in entity identity — for the Knowledge Graph, AI Overviews, ChatGPT and Perplexity.
- Structured data doesn't directly influence rankings (reconfirmed by Google in 2025–2026), but it's the lever for rich results, entity recognition and citations in AI answers.
- The AI era changed the stakes, not the fundamentals: an AI answer cites 2–7 sources, not 10 links. The fight is to be a source the model understands and trusts enough to cite.
- 2026 brought a cleanup of schema types (FAQ deprecated on 7 May 2026, HowTo earlier). The markup stays valid — the strategy shifts from widget to meaning and entity identity.
Why structured data matters more now than in 2020
Search engines have moved from keyword matching to entity understanding. Google processes a query by identifying the entities involved — people, organizations, places, concepts — and the relationships between them. The Knowledge Graph holds over 500 billion facts about more than 5 billion entities. Being represented correctly in this graph is the precondition for appearing in Knowledge Panels, in AI Overviews citations and in AI Mode answers.
The large language models that power AI Overviews, ChatGPT, Perplexity and Claude have the same need: to understand quickly and unambiguously who publishes, what the page says and how the facts connect. Structured data is the cleanest way to deliver this meaning — a layer of “meaning metadata” that doesn't depend on how well the machine interprets the visual layout.
Being represented correctly in the Knowledge Graph is no longer a bonus. It's the precondition for being cited by AI engines.
Three overlapping layers of optimization
Structured data plays a different role in each. Confusing them means optimizing for one layer and missing the other two.
| Layer | Where | The role of structured data |
|---|---|---|
| Classic SEO | The blue links in Google | Eligibility for rich results and disambiguation (Product, Article, Breadcrumb). |
| AI Overviews / AI Mode | The generative box in Google's SERP | Entity recognition and trust — markup helps the model pick you as a source. |
| Conversational AI engines | ChatGPT, Perplexity, Claude, Copilot | Citability: structured content, easy to extract and attribute correctly. |
In March 2025, both Google and Microsoft publicly confirmed that they use schema markup in their generative features, and ChatGPT confirmed that it uses structured data to decide which products appear in results. Markup has become relevant for the entire discovery ecosystem, not just for a single engine.
Fundamentals: Schema.org and JSON-LD
The vocabulary, the grammar and the format — in short, so we speak the same language.
Structured data is a standardized way of annotating content so machines understand the meaning, not just the text. Instead of “guessing” that “4.7” is a rating and “199 lei” a price, the machine is given these facts explicitly. Schema.org is the shared vocabulary of types (Product, Article, Organization, Person…) and properties (name, author, sameAs…), founded by Google, Microsoft, Yahoo and Yandex and developed through an open community process.
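As a rough illustration of "giving the machine the facts explicitly", the sketch below builds a Schema.org Product object for the rating and price mentioned above. The product name and review count are invented for the example, and the currency code RON assumes that "lei" means the Romanian leu.

```python
import json

# Hypothetical product: the name and reviewCount are placeholders,
# only the 4.7 rating and the 199 lei price come from the text above.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Product",
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": 4.7,      # stated as a rating, not left to guesswork
        "reviewCount": 12,       # assumed value for illustration
    },
    "offers": {
        "@type": "Offer",
        "price": "199.00",       # stated as a price
        "priceCurrency": "RON",  # assumption: "lei" = Romanian leu
    },
}

# Serialize to the JSON-LD text that would be embedded in the page.
product_json = json.dumps(product, indent=2, ensure_ascii=False)
print(product_json)
```

The values must match what the visitor actually sees on the page; the markup describes the content, it does not replace it.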
JSON-LD is the format Google recommends (Microdata and RDFa are also supported) and the de facto standard. It lives in a <script type="application/ld+json"> block separate from the visual HTML, so it's easy to generate, maintain and audit. The critical condition in the AI era: it must be rendered server-side, because some AI crawlers don't fully execute JavaScript, so markup injected only client-side is invisible to them.
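One way to check the server-side condition is to look at the raw HTML exactly as a crawler that runs no JavaScript would receive it. The sketch below is a minimal check using only Python's standard library; the two HTML snippets are invented test inputs, and `extract_jsonld` is a helper name chosen for this example, not part of any tool mentioned here.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append(json.loads("".join(self._buffer)))


def extract_jsonld(raw_html: str) -> list:
    """Return every JSON-LD object present in the HTML without running any JavaScript."""
    parser = JsonLdExtractor()
    parser.feed(raw_html)
    parser.close()
    return parser.blocks


# Invented inputs: one page with markup in the initial HTML, one that relies on client-side JS.
server_rendered = """<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Organization", "name": "Example Org"}</script>
</head><body><h1>Hello</h1></body></html>"""

client_only = """<html><head><script src="/app.js"></script></head>
<body><div id="root"></div></body></html>"""

print(len(extract_jsonld(server_rendered)))
print(len(extract_jsonld(client_only)))
```

The first page yields one block and the second yields none: whatever `/app.js` would inject later never reaches a parser that only reads the initial response.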
The 2026 cleanup: not “the death of schema”
Google deprecated the visual display of a few types — but the usefulness of the markup remained.
On 7 May 2026 Google deprecated FAQ rich results; HowTo had disappeared earlier, along with a few other features. It's easy to misread this as “schema no longer matters.” In reality, FAQ markup stays valid and useful: AI engines still extract the question-answer pairs, and other engines can display them. What disappeared is the visual widget in Google's SERP, not the meaning conveyed.
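For reference, the question-answer pairs mentioned above are expressed with the Schema.org FAQPage, Question and Answer types. The sketch below builds that structure from a list of pairs; `build_faq_page` is a hypothetical helper name, and the sample pair is taken from this page's own FAQ.

```python
import json


def build_faq_page(pairs):
    """Build a Schema.org FAQPage object from (question, answer) tuples."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# The pairs should mirror the questions and answers visible on the page.
faq = build_faq_page([
    ("Which structured data format should I use?", "JSON-LD."),
])
print(json.dumps(faq, indent=2, ensure_ascii=False))
```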
The right strategy shifts from “schema that produces a widget” to schema that communicates meaning and entity identity.
A frequently cited data point: in one test, the answer accuracy of a GPT-4-class model rose from ~16% to ~54% when the underlying content was accompanied by structured data. The figure is illustrative, not a universal law — but the direction is clear: structure helps the machine get it right.
The Websem semantic layer
What we actually build on our sites and our clients' — anchored in entity identity.
- Organization + Person as the “entity home,” with sameAs pointing to the official profiles and knowsAbout: the central node that the Knowledge Graph and LLMs recognize.
- Links through @id: every provider, author, publisher refers to an entity that actually exists, so the graph is explicit.
- Article, Service, BreadcrumbList: for content, offerings and structure, aligned with the visible content (parity, no invented data).
- Server-side rendering, production URLs and external validation (Schema.org Validator, Rich Results Test) — no widgets on their way out.
We no longer chase markup tricks for SERP widgets that come and go. We build a layer of meaning that belongs to you.
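The entity-home idea can be sketched as a single JSON-LD @graph in which author, publisher and worksFor point to nodes by @id, plus a small check that no reference is left dangling. This is an illustration under assumptions, not Websem's actual markup: the domain, names, profile URL and the `dangling_references` helper are all invented for the example.

```python
import json

SITE = "https://example.com"  # hypothetical domain

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Organization",
            "@id": f"{SITE}/#organization",
            "name": "Example Org",
            "url": SITE,
            "sameAs": ["https://www.linkedin.com/company/example-org"],
            "knowsAbout": ["Structured data", "Technical SEO"],
        },
        {
            "@type": "Person",
            "@id": f"{SITE}/#founder",
            "name": "Jane Doe",
            "worksFor": {"@id": f"{SITE}/#organization"},
        },
        {
            "@type": "Article",
            "@id": f"{SITE}/study/#article",
            "headline": "Example headline",
            "author": {"@id": f"{SITE}/#founder"},
            "publisher": {"@id": f"{SITE}/#organization"},
        },
    ],
}


def dangling_references(doc):
    """Return @id values that are referenced but never defined as a node in @graph."""
    defined = {node["@id"] for node in doc["@graph"] if "@id" in node}
    referenced = set()

    def walk(value):
        if isinstance(value, dict):
            # A dict whose only key is @id is a pure reference to another node.
            if set(value) == {"@id"}:
                referenced.add(value["@id"])
            for child in value.values():
                walk(child)
        elif isinstance(value, list):
            for item in value:
                walk(item)

    walk(doc["@graph"])
    return sorted(referenced - defined)


print(dangling_references(graph))
print(json.dumps(graph, indent=2, ensure_ascii=False))
```

An empty list from `dangling_references` means every linked entity actually exists in the graph. Running the same check in a build step is one way to keep the graph explicit as pages are added.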
What to remember
- Structured data isn't a direct ranking factor, but it's the infrastructure for rich results, entity recognition and AI citations.
- The AI era changed the stakes: be one of the 2–7 sources the model understands and trusts enough to cite.
- FAQ/HowTo no longer produce a widget in Google, but the markup stays valid and useful for AI.
- The priority: entity identity (Organization + Person + sameAs, linked through @id), rendered server-side, in parity with the content.
Frequently asked questions
Is structured data a ranking factor in Google?
Not directly. Google (via John Mueller) reconfirmed in 2025–2026 that schema markup is not a direct ranking factor. But valid markup is the major lever for rich results, for entity recognition in the Knowledge Graph and for being cited in AI answers. Pages with comprehensive structured data are more likely to be cited by AI engines and, according to several analyses, earn more clicks in the classic SERP.
If Google deprecated FAQ and HowTo, does schema still make sense?
Yes. In 2026 Google deprecated the visual display of FAQ rich results (7 May 2026) and HowTo (earlier), plus a few other features. The markup, however, stays valid and useful — for the AI engines that extract question-answer pairs and for other search engines. What disappeared is the widget in Google's SERP, not the usefulness of the markup. The strategy shifts from “schema that produces a widget” to “schema that communicates meaning and entity identity.”
Which schema types matter most in 2026?
The ones that build entity identity: Organization and Person (with sameAs pointing to the official profiles), linked through @id, plus Article/BlogPosting for content, Service for offerings and BreadcrumbList for structure. These communicate who publishes, what the page says and how the facts connect — exactly what both the Knowledge Graph and LLMs need in order to understand you and cite you with confidence.
Which structured data format should I use?
JSON-LD. It's the format Google recommends (Microdata and RDFa are also supported) and the industry's de facto standard. It lives in a <script type="application/ld+json"> block separate from the visual HTML, so it's easy to generate, maintain and audit without touching the presentation markup. Important: it must be rendered server-side, because some AI crawlers don't fully execute JavaScript.
Want a semantic layer that makes you citable by AI engines?
We audit the structured data you have (and what's missing), build your entity identity and link it through @id — rendered server-side, validated, in parity with the content. No strings attached.
Dan Cristian Alexandrescu
Founder, Websem
Builds semantic layers anchored in entity identity — for ranking in Google and citability in AI engines, measured on real results.
- SEO in the AI era: the anatomy of the new Google SERP. 37 modules occupy 60–80% of the commercial screen. The complete map of the new SERP and what “ranking” means when nobody clicks anymore.
- From position to presence: SEO and GEO as one system. Classic SEO and optimizing for AI engines don't exclude each other; they complement each other. A system that wins in both Google and AI Overviews/ChatGPT.