Glossar

Retrieval Augmented Generation (RAG)

Written by AX Semantics | May 6, 2025 10:15:00 PM
Retrieval Augmented Generation (RAG) is an approach that combines Large Language Models (LLMs) with information retrieval, i.e. the lookup of information in external data sources. Instead of relying solely on its internal training data, a RAG model accesses an external knowledge base or a fixed set of documents to answer a query.
How it works:

  • A system searches an external data source (e.g. a document collection) for information that is relevant to a user request.
  • The information found is passed to the generative language model as additional context together with the original query.
  • The language model then generates a response that utilises and integrates the newly retrieved knowledge.
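The three steps above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the document texts, the word-overlap scoring and the prompt template are assumptions chosen for brevity; real systems typically use vector embeddings for retrieval and then send the assembled prompt to an LLM.

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Step 1: rank documents by naive word overlap with the query
    (a stand-in for embedding-based similarity search)."""
    q_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Step 2: pass the retrieved passages to the language model
    as additional context alongside the original query."""
    ctx = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{ctx}\n"
        f"Question: {query}"
    )

# Toy document collection (assumed example data).
docs = [
    "RAG combines retrieval with a generative language model.",
    "The company picnic is on Friday.",
    "Retrieved passages are added to the prompt as context.",
]

query = "How does RAG use retrieved context?"
prompt = build_prompt(query, retrieve(query, docs))
# Step 3 would be: send `prompt` to an LLM, which generates a
# response that integrates the newly retrieved knowledge.
print(prompt)
```

The irrelevant document (the picnic note) scores zero overlap and is filtered out, so only relevant passages reach the model.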

Advantages:

  • Up-to-date: The model can access current, domain-specific or internal company data without retraining.
  • Fact-based: Generated content can be grounded in verifiable sources.
  • Reliability: Helps reduce the generation of false or fabricated information (‘hallucinations’) and increases the relevance of responses.

Areas of application:

  • Creation of answers based on a specific knowledge base or authoritative documents.
  • Knowledge-based chatbots.
  • Answer engines such as perplexity.ai, Google AI Overview, Microsoft Copilot and SearchGPT.