Glossary of chatbots & conversational AI agents.

Definitions for the technical terms behind a business conversational assistant — from model architecture (RAG, knowledge base, fine-tuning) to in-conversation behavior (persona, hallucination, handoff) and operations (CRM integration, sentiment, compliance). A citable index with semantic relations and structured markup, built for indexing in ChatGPT, Perplexity, Claude, and Google AI Overviews.

Updated: Jun 6, 2026
Terms: 24
Table: 1
Questions: 6

01 · Index

Comparison

02 · Comparison table

Technical terms for chatbots and conversational AI agents, grouped by category.
Term	Category	Short definition	Relation
LLM	Architecture & models	The language model that generates the answers.	Root term
RAG	Architecture & models	Generation grounded in your documents.	Combines: LLM + Knowledge base
Knowledge base	Architecture & models	The assistant's source of truth.	Feeds: RAG
Embeddings	Architecture & models	The numeric representation of a text's meaning.	Underpins: RAG
Vector database	Architecture & models	The database for semantic search.	Stores: Embeddings
Fine-tuning	Architecture & models	Retraining the model on your own data.	Alternative/complement to: RAG
System prompt	Architecture & models	The instructions that fix the role and the rules.	Configures: LLM
Context window	Architecture & models	How much the model "remembers" within one exchange.	Limit of: LLM
Persona	Conversation	The assistant's personality and tone.	Defined in: System prompt
Conversație multi-turn	Conversation	An exchange across several turns with context preserved.	Requires: Memorie conversațională
Memorie conversațională	Conversation	Retaining information across turns.	Enables: Conversație multi-turn
Intent	Conversation	The real goal behind the message.	Output of: NLU
NLU	Conversation	Understanding natural language.	Detects: Intent
Halucinație	Conversation	A plausible but false answer.	Countered by: Grounding
Grounding	Conversation	Anchoring the answer in verifiable sources.	Antidote to: Halucinație
Guardrails	Conversation	The behavioral limits imposed on the assistant.	Applied via: System prompt
Fallback	Conversation	What the assistant does when it doesn't know.	Triggers: Handoff
Handoff	Operations & integration	The transfer to a human, with context.	Follows: Fallback
Lead capture & calificare	Operations & integration	Collecting and qualifying prospects.	Feeds: Integrare CRM
Integrare CRM	Operations & integration	Sending data into the team's system.	Receives from: Lead capture & calificare
Sentiment analysis	Operations & integration	Reading the conversation's emotional tone.	Input for: Handoff
Latență	Operations & integration	The time to the first response.	Performance metric
Canal conversațional	Operations & integration	Where the assistant lives (web, WhatsApp, Messenger).	Web / WhatsApp / Messenger
GDPR & data retention	Compliance	Lawful processing of data.	Spans all categories

Definitions

03 · Terms

Architecture & models

LLM

Large Language ModelRoot term

A large-scale language model, trained on vast amounts of text, that generates answers by predicting the next word. It is the engine that produces a conversational assistant's language. Examples used in implementations: GPT-4o, Claude, Llama, Mistral.

Appeared: The Transformer architecture — 2017 (Google); consumer LLMs — from 2022 (ChatGPT).
In implementation: We pick the model per case: GPT-4o for conversational reasoning, Claude for long documents and nuanced tone, open-source models (Llama, Mistral) for on-premise or tight budgets.

Related termsRAG Fine-tuning Context window

Applied resources

Sonya — AI personalities trained on your brand

RAG

Retrieval-Augmented GenerationCombines: LLM + Knowledge base

A technique where, before answering, the assistant searches the relevant fragments from your documents and hands them to the model as context. The answer is thus anchored in real data, not in the model's generic memory — the primary method for reducing hallucinations.

Appeared: Concept formalized in 2020 (Facebook AI / Meta paper), a standard in business assistants since 2023.
In implementation: We split documents into fragments, turn them into embeddings, store them in a vector database and, on every question, inject the matching fragments into the prompt.

Applied resources

Sonya — a knowledge base on your documents

Knowledge base

Knowledge baseFeeds: RAG

The collection of sources the assistant "knows" — product catalog, prices, FAQs, manuals, internal policies. It is the source of truth from which every answer is built. The cleaner and more complete it is, the more precisely the assistant answers.

Appeared: A pre-AI customer-support concept; redefined as a source for RAG from 2023.
In implementation: We import catalog, documents, FAQs and manuals; we structure them and refresh them periodically. The Starter package covers up to 50 documents; Business and Enterprise — unlimited knowledge base.

Related termsRAG Embeddings Grounding

Applied resources

Sonya — import catalog, documents, FAQ

Embeddings

Semantic vectorsUnderpins: RAG

The representation of a text as a list of numbers (a vector) that captures meaning, not the exact words. Texts with close meaning have close vectors. It enables semantic search: the assistant finds the relevant fragment even when the user uses different words than the document.

Appeared: Word2Vec — 2013; sentence/document-level embeddings went mainstream from 2022.
In implementation: We turn every fragment of the knowledge base into embeddings and compare them with the question's embedding to retrieve the most relevant passages.

Related termsRAG Vector database

Vector database

Vector databaseStores: Embeddings

A database optimized for storing and quickly searching embeddings. It answers the question "which fragments are semantically similar to this query?" in milliseconds, even across millions of documents. The "long-term memory" component of a RAG system.

Appeared: A category that consolidated in 2022–2023 (Pinecone, Weaviate, pgvector, Qdrant).
In implementation: We choose the solution by volume and hosting requirements; for EU/on-premise deployments we use self-hosted options, GDPR-compliant.

Related termsEmbeddings RAG

Fine-tuning

Fine-tuningAlternative/complement to: RAG

Retraining a model on your own examples so it adopts a specific style, format or behavior. Unlike RAG (which adds knowledge at answer time), fine-tuning changes the model itself. Useful for tone and repetitive tasks; more expensive to maintain.

Appeared: A standard transfer-learning practice; applied commercially to LLMs from 2023.
In implementation: Reserved for the Enterprise package, when the brand needs a dedicated model. For most cases, RAG + a well-written system prompt achieve similar results, more cheaply.

Related termsLLM RAG System prompt

System prompt

System promptConfigures: LLM

The set of instructions, invisible to the user, that fix the assistant's role, tone, rules and limits before any conversation. This is where the persona is defined, what it may and may not say, and how it escalates. The backbone of behavior.

Appeared: A standard mechanism of chat LLMs from 2022.
In implementation: Here we encode the persona, guardrails and the fallback rules. A poorly written system prompt yields a generic assistant; a well-written one often removes the need for fine-tuning.

Related termsPersona Guardrails Fine-tuning

Context window

Context windowLimit of: LLM

The maximum amount of text (measured in tokens) a model can process in a single exchange — including the instructions, the conversation history and the injected documents. When the conversation exceeds it, the model "forgets" the oldest part unless a memory mechanism exists.

Appeared: A concept from the earliest LLMs; windows grew sharply from 2023.
In implementation: We manage the window by summarizing history and through selective RAG, so only the relevant context fits — keeping latency low.

Conversation

Persona

Brand voice · AI personalityDefined in: System prompt

The assistant's distinct personality — tone, vocabulary, signature lines, the way it reacts to jokes and objections. It turns an animated FAQ into an interlocutor that speaks like the brand. The difference between a generic chatbot and an assistant that actually sells lies, first of all, in the persona.

Appeared: A conversation-design discipline that solidified alongside generative AI assistants, 2023–2024.
In implementation: We define the persona in a workshop with the team (Day 1): voice, target audience, objective. Live examples: Dona (ironic, pharma), Hanna (consultative, HVAC), Sierra (adventurous, premium auto).

Related termsSystem prompt Guardrails

Applied resources

Conversație multi-turn

Multi-turn conversationRequires: Memorie conversațională

A conversation that unfolds across several successive turns, where the assistant understands references to earlier messages and fills missing information by asking naturally. The opposite of a bot that treats each message in isolation and forgets what was said one turn ago.

Appeared: A standard capability of LLM assistants from 2022.
In implementation: The assistant keeps the thread: if you ask "and for menopause?" after a discussion about supplements, it understands what you mean without you restating everything.

Memorie conversațională

Context / memoryEnables: Conversație multi-turn

The mechanism by which the assistant retains relevant information throughout a conversation (short-term memory) or across sessions (long-term memory). It enables continuity: the user need not repeat context, and answers stay coherent from one turn to the next.

Appeared: Memory techniques for conversational agents, mature from 2023.
In implementation: We combine the recent conversation history with RAG over profile/history, keeping only what is relevant so we don't fill the context window.

Intent

IntentOutput of: NLU

The real goal behind a message — what the user actually wants, beyond the words used. "How much is it?", "do you have discounts?" and "is it expensive?" can share the same intent. Reading intent correctly determines whether the assistant answers the question asked or the one missed.

Appeared: A central concept in classic chatbots (from ~2016); reinterpreted by LLMs, which infer it without rigid rules.
In implementation: In modern assistants, intent is inferred by the model in context, not hardcoded into a decision tree — hence the flexibility on nuanced questions.

Related termsNLU Conversație multi-turn

NLU

Natural Language UnderstandingDetects: Intent

A system's ability to understand natural language — the meaning, intent and entities in a freely written message, with typos, slang or unexpected phrasing. The component that separates an assistant that "gets it" from a button menu that demands exact wording.

Appeared: An NLP subfield with a long history; revolutionized by LLMs from 2022.
In implementation: In AI assistants, NLU comes natively from the LLM: we no longer train separate intent classifiers, relying instead on the model's contextual understanding.

Related termsIntent LLM

Halucinație

HallucinationCountered by: Grounding

An answer that sounds plausible and self-assured but is false or invented — a wrong price, a non-existent policy, a fictional source. The biggest business risk of an AI assistant: a hallucination destroys trust exactly when the customer was ready to buy.

Appeared: A term popularized alongside consumer LLMs, 2022–2023.
In implementation: We counter it through RAG (anchoring in real data), grounding, clear knowledge limits and fallback rules: when it doesn't know, the assistant says so and escalates. It never invents.

Related termsGrounding RAG Fallback Guardrails

Grounding

Source groundingAntidote to: Halucinație

The practice of forcing the assistant to build its answers exclusively from verifiable sources — your documents — and, ideally, to cite where the information comes from. A "grounded" answer can be verified; an ungrounded one is just a well-phrased guess from the model.

Appeared: A principle that solidified with the production adoption of RAG, 2023.
In implementation: We configure the assistant to answer from the knowledge base, not from generic knowledge, and to explicitly acknowledge when a piece of information is missing.

Related termsRAG Halucinație Knowledge base

Guardrails

Behavior limitsApplied via: System prompt

The rules that set what the assistant may and may not do — forbidden topics, mandatory tone, the duty not to invent, not to pose as human, to escalate in certain situations. The "red line" that protects the brand from an assistant gone off the rails.

Appeared: An AI-safety discipline applied commercially from 2023.
In implementation: Our red lines: we don't build assistants without personality, we don't allow lying/hallucination, the assistant always states it is AI, and we process data only in a GDPR-compliant way.

Related termsSystem prompt Halucinație Fallback

Fallback

Fallback behaviorTriggers: Handoff

What the assistant does when it reaches the limit of its knowledge or competence. Instead of inventing, it runs a predefined plan: it says it doesn't know and suggests human contact, automatically escalates to a live agent, or collects contact details and promises a follow-up.

Appeared: A conversation-design concept, essential in the age of hallucinations.
In implementation: We define the exact fallback variant per brand, in the Discovery phase. Good fallback is invisible to the customer — it reads as care, not failure.

Related termsHandoff Halucinație Guardrails

Operations & integration

Handoff

Escalation to a human agentFollows: Fallback

The transfer of the conversation from the AI assistant to a human agent, keeping all context intact — the history, the question and the collected data. The customer doesn't start over. It triggers when the conversation exceeds the AI's scope or when sentiment indicates frustration.

Appeared: A standard feature in customer-engagement platforms; integrated with AI from 2023.
In implementation: Available from the Business package. The transfer is instant and complete — the agent picks up exactly where the assistant left off.

Related termsFallback Sentiment analysis

Applied resources

Sonya — live handoff to a human agent

Lead capture & calificare

Lead capture & qualificationFeeds: Integrare CRM

Collecting contact details through natural conversation (not a dry form) and qualifying the prospect by your criteria — budget, need, timing — before passing them to the sales team. The assistant turns curious visitors into qualified quote requests, with no human effort.

Appeared: A marketing-automation practice reinvented conversationally from 2023.
In implementation: The assistant asks naturally, qualifies by your rules and sends the lead straight to the CRM. Available from the Business package.

Related termsIntegrare CRM Intent

Applied resources

Sonya — lead capture and qualification

Integrare CRM

CRM integrationReceives from: Lead capture & calificare

Connecting the assistant to the customer-relationship management system, so leads, conversations and sentiment flow automatically to where the team works. It eliminates manual copy-paste and ensures no prospect generated by the assistant is lost.

Appeared: Native AI–CRM integrations, mature from 2023–2024.
In implementation: We connect HubSpot, Pipedrive, Salesforce, Monday or a custom API (Enterprise). The lead, the transcript and the score land in the right pipeline.

Applied resources

Sonya — CRM integrations

Sentiment analysis

Sentiment analysisInput for: Handoff

Automatic detection of the conversation's emotional tone — satisfied, neutral, frustrated. It lets the assistant adapt its answer and escalate to a human before an unhappy customer leaves. In aggregate, it shows which topics generate recurring frustration.

Appeared: A classic NLP technique, built into modern conversational dashboards.
In implementation: Sentiment feeds the handoff decision and lands in the analytics dashboard as actionable data — not dead reports.

Related termsHandoff Integrare CRM

Latență

LatencyPerformance metric

The time elapsed between the user's question and the assistant's first response. High latency kills the conversation: the customer perceives the hesitation as a malfunction. The practical goal for a business assistant is a response perceived as almost instant.

Appeared: A universal performance metric; critical for conversational UX.
In implementation: We optimize latency per channel; the operational target is a response under 3 seconds. The choice of model and management of the context window are the main levers.

Related termsLLM Context window

Canal conversațional

ChannelWeb / WhatsApp / Messenger

The place where the assistant lives and talks to customers — a website widget, WhatsApp Business, Facebook Messenger, Instagram Direct. The same persona, the same knowledge, on any channel. For the Romanian market, WhatsApp is often the preferred channel for communicating with brands.

Appeared: An omnichannel strategy adapted to AI assistants, 2023–2024.
In implementation: An embeddable widget on any site technology (no re-engineering), plus WhatsApp Business API, Messenger and Instagram from the Business package. 24/7 response, the same voice everywhere.

Related termsPersona Latență

Applied resources

Sonya — web widget, WhatsApp, Messenger

Compliance

Frequently asked questions

04 · FAQ

What is the difference between a classic chatbot and a conversational AI assistant?

A classic chatbot runs on decision trees: buttons, menus and pre-written answers triggered by exact keywords. A conversational AI assistant uses an LLM that understands natural language (NLU), remembers context (multi-turn conversation), infers the real intent and answers with nuance. The first demands exact phrasing; the second gets what you want even when you ask imperfectly.

What is RAG and why does it matter for a business chatbot?

RAG (Retrieval-Augmented Generation) is the technique where the assistant first searches the relevant fragments from your documents and only then formulates the answer, anchored in those sources. It matters because it is the primary method for preventing hallucinations: the assistant answers from your real knowledge base, not from generic knowledge, so it doesn't invent non-existent prices or policies.

How do you stop an AI assistant from inventing information (hallucinating)?

Through four combined mechanisms: RAG and grounding (the answer is built only from verifiable sources), clear knowledge limits set in the system prompt, guardrails that forbid unsupported claims, and fallback rules whereby, when it doesn't know, the assistant says so and escalates to a human. We never allow answers to be invented.

What does fine-tuning mean and do I need it?

Fine-tuning means retraining the model on your own examples so it adopts a specific style or behavior. For most projects it isn't necessary: a well-defined persona in the system prompt, plus RAG over your documents, achieve similar results at a much lower cost. Fine-tuning becomes relevant at the Enterprise level, when the brand needs a dedicated model.

On which channels can the assistant run?

On a widget embedded in the website (any technology, no re-engineering), on WhatsApp Business, on Facebook Messenger and Instagram Direct. The same persona and the same knowledge base everywhere, with 24/7 response. For the Romanian market, WhatsApp is frequently the preferred channel for communicating with brands.

Is customer data safe and GDPR-compliant?

Yes. All data is processed in the EU, with end-to-end encryption and explicit consent requested at the start of every session. The retention period (retention) is configurable to your requirements, and your data is not used to train public models. See the GDPR & data retention term for details.

Sources · next steps

Sources: the RAG paper (Meta AI, 2020), GPT-4o and Claude model documentation, Regulation (EU) 2016/679 (GDPR), Websem's own observations from 2024–2026 implementations.

We build conversational assistants with a distinct personality, trained on your data, live on website, WhatsApp and Messenger.

See the service →

Glossary of chatbots & conversational AI agents.

Contents

Comparison

Definitions

Architecture & models

Conversation

Operations & integration

Compliance

Frequently asked questions

What is the difference between a classic chatbot and a conversational AI assistant?

What is RAG and why does it matter for a business chatbot?

How do you stop an AI assistant from inventing information (hallucinating)?

What does fine-tuning mean and do I need it?

On which channels can the assistant run?

Is customer data safe and GDPR-compliant?