# What is RAG, and why does it matter for AI chatbots?

> RAG (retrieval-augmented generation) is what separates an AI that makes things up from an AI that answers from your own content. Here's what it means for SMBs.

Published: 2026-04-22


When a business considers an AI chatbot for its website, the first question is usually: "How do we keep it from making things up?" That's a fair concern. Language models like the ones behind ChatGPT and Claude are trained on enormous amounts of public text, and they're extremely good at generating fluent text that sounds right. But they know nothing about your specific company's prices, your opening hours, or your return policy. If you ask them to answer anyway, they guess.

The solution that has come to dominate over the last few years is called **retrieval-augmented generation**, or RAG for short. It's the technique modern AI chatbots use to stay grounded in your own material instead of inventing answers.

## What RAG actually does

RAG is a simple idea, but it changes a lot. Instead of sending a question straight to a language model and hoping for the best, the system does two things:

1. **Retrieves**: It first searches a database of your content (web pages, documents, FAQs) for the snippets most relevant to the question.
2. **Generates**: It then gives the language model both the question and the retrieved snippets and asks it to phrase an answer based on them.

The result is an answer that's phrased as fluently as a human response, but anchored in your own material. If the answer isn't in the source material, the model can say so instead of guessing.
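To make the two steps concrete, here's a minimal Python sketch. Everything in it is illustrative: the tiny corpus, the word-overlap scoring (a stand-in for real vector search), and the prompt wording are not how any particular product does it.

```python
# Toy retrieve-then-generate sketch. The corpus, the word-overlap scoring,
# and the prompt format are all illustrative placeholders.

CORPUS = [
    "Opening hours: Monday to Friday, 9:00 to 17:00.",
    "You can return items within 30 days with a receipt.",
    "Shipping is free on orders over 50 EUR.",
]

def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank snippets by word overlap with the question and keep the top k."""
    q_words = set(question.lower().split())
    ranked = sorted(corpus, key=lambda s: -len(q_words & set(s.lower().split())))
    return ranked[:k]

def build_prompt(question: str, snippets: list[str]) -> str:
    """Assemble the grounded prompt that would be sent to a language model."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer using ONLY the context below. "
        "If the answer is not there, say you don't know.\n"
        f"Context:\n{context}\nQuestion: {question}"
    )

question = "How do I return an item?"
snippets = retrieve(question, CORPUS)
prompt = build_prompt(question, snippets)
```

A production system swaps `retrieve` for a vector-database lookup and sends `prompt` to a language model, but the shape of the flow is the same.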

## How a chatbot builds its knowledge base

For RAG to work, the AI first has to read and structure your content. That typically goes like this:

1. **Crawling**: A crawler fetches all relevant pages on your website. It can be automatic via your sitemap or controlled by patterns you choose.
2. **Chunking**: Each page is split into smaller pieces (typically 1,000 to 2,000 characters) with a small overlap, so no relevant context is lost between chunks.
3. **Embedding**: Each chunk is sent through an *embedding model*, which converts the text into a vector. The vector is a numeric representation that captures the text's meaning.
4. **Storage**: The vectors are stored in a *vector database* that can find chunks most similar to a question in milliseconds.
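The chunking step can be sketched as a sliding window. The sizes below match the ranges mentioned above, but the function itself is a simplified illustration, not a real pipeline:

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks so no context is lost at chunk borders."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, re-reading the overlap region
    return chunks

page = "A" * 2500  # stand-in for the text of a crawled page
chunks = chunk_text(page)
```

Because of the overlap, the end of each chunk reappears at the start of the next, so a sentence straddling a border still lands intact in at least one chunk.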

When a visitor asks a question, the process repeats: the question is converted to a vector, the vector database finds the most relevant chunks, and a language model builds the answer. The whole thing typically takes under a second.
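The matching at question time boils down to a similarity score between vectors. Here's a toy version with made-up 3-dimensional vectors; real embeddings have hundreds of dimensions and live in a vector database, but the comparison works the same way:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors: higher means closer in meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings; real models use hundreds of dimensions.
index = {
    "return-policy chunk": [0.9, 0.1, 0.0],
    "opening-hours chunk": [0.1, 0.9, 0.2],
}
question_vector = [0.8, 0.2, 0.0]  # pretend embedding of a returns question

best_chunk = max(index, key=lambda name: cosine(question_vector, index[name]))
```

The chunk whose vector points in nearly the same direction as the question's vector wins, even when the two texts share no exact words.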

## Why RAG beats the alternatives for SMBs

Before RAG, the common approaches to chatbots were:

- **Rule-based bots**: You manually wrote a decision tree of pre-defined answers. Scales poorly and breaks the moment a customer phrases something slightly differently.
- **Fine-tuned models**: You trained a language model on your own data. Expensive, slow to update, and the model could still invent things.
- **Language model only**: You sent questions straight to ChatGPT or similar. Sounds fluent, but can't answer anything specific about your business and is prone to hallucinations.

RAG combines the strengths: you don't need to write manual rules or train a model. You point the system at your content, and it understands questions phrased naturally and answers based on what's on the site. When you update your content, the answers update on the next re-crawl, with nobody needing to touch code.

## Where RAG isn't enough

RAG has limits, and it's worth being honest about them:

- **Only what's written down**: If an answer isn't in your content, RAG can't conjure it. A good implementation should say "I don't know" rather than guess.
- **Not transactions**: RAG is information retrieval, not action. It can explain your return policy, but can't create a return request itself or fetch a specific order's status.
- **The context window is limited**: The language model typically gets only the 5 to 10 most relevant text chunks as context. If answering a question would require reading the whole site at once, that's still a challenge.

For most SMBs, these limits aren't a problem. Most visitors ask information-seeking questions, not transactional ones.
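That "say I don't know" behavior is often implemented as a simple score threshold on the best retrieved chunk. A minimal sketch, where the 0.75 cutoff and the fallback wording are arbitrary illustrations:

```python
def answer_or_fallback(best_score: float, draft_answer: str,
                       threshold: float = 0.75) -> str:
    """Refuse to answer when retrieval confidence is too low, instead of guessing.
    The 0.75 cutoff is an arbitrary illustration; real systems tune this value."""
    if best_score < threshold:
        return "I don't know. Please use the contact form for this one."
    return draft_answer

confident = answer_or_fallback(0.92, "You can return items within 30 days.")
unsure = answer_or_fallback(0.31, "Some weakly supported guess.")
```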

## How do you tell if a chatbot actually uses RAG well?

When you evaluate vendors, ask about these four things:

1. **Where does it pull answers from?** A good RAG system can show you which documents or pages an answer is based on.
2. **What does it do when an answer isn't there?** It should say "I don't know" or point to contact info, not guess.
3. **How quickly do answers update when you change the source material?** Best is automatic re-crawling on a schedule you choose.
4. **Which language model is used?** Different models have different quality and price. The best systems use multiple models, so simple questions are handled by cheap models and complex questions go to better ones.
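The multi-model routing in point 4 can be as simple as a heuristic on the question. A sketch with hypothetical model names and a made-up complexity rule:

```python
def pick_model(question: str, retrieved_chunks: int) -> str:
    """Route a question to a cheap or a strong model.
    The model names and the heuristic are made up for illustration."""
    is_complex = len(question.split()) > 20 or retrieved_chunks > 5
    return "strong-model" if is_complex else "cheap-model"
```

Real routers can be smarter (e.g. a classifier), but even a crude rule like this keeps the bulk of simple traffic on the cheap model.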

## How Clarifier uses RAG

[Clarifier](/) is built on RAG from the ground up. We crawl your website automatically, let you upload PDF and Word documents to the knowledge base, and use a combination of language models so you get good quality without burning the budget on expensive calls.

Want to see what it looks like in practice for your industry? We have pages about Clarifier for [online stores](/for/e-commerce), [consultancies](/for/konsulenter), [SaaS companies](/for/saas), and [public institutions](/for/kommuner).
