Basics of Information Retrieval

Foundations

January 21, 2026

In the previous article, we defined Context Engineering as the systematic management of the information an AI model sees. However, before we can manage that information, we must first find it.

This is the role of Information Retrieval (or IR). In the context of AI agents and Retrieval-Augmented Generation (RAG), IR acts as the foundational layer that identifies relevant facts before the model ever generates a response.

Data Retrieval vs Information Retrieval

To build effective AI systems, we first have to distinguish between finding raw data and finding meaningful information.

Standard Data Retrieval is deterministic and binary. When you run a simple SQL query such as SELECT x FROM y WHERE id = 42, the system either finds that exact match or returns nothing. It is a precise operation based on explicit parameters.
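To make the "match or nothing" behavior concrete, here is a minimal sketch using Python's built-in sqlite3 module. The documents table, its rows, and the fetch_title helper are hypothetical and exist only for illustration.

```python
import sqlite3
from typing import Optional

# Hypothetical in-memory table, created only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO documents (id, title) VALUES (?, ?)",
    [(1, "Heart attack symptoms"), (2, "Air conditioning basics")],
)


def fetch_title(doc_id: int) -> Optional[str]:
    """Return the title for an exact ID match, or None if no row exists."""
    row = conn.execute(
        "SELECT title FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    return row[0] if row else None


print(fetch_title(2))   # Air conditioning basics
print(fetch_title(99))  # None
```

There is no notion of a "close" answer here: ID 99 does not exist, so the lookup returns nothing rather than the nearest row.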

Information Retrieval, by contrast, is probabilistic. Because IR deals with unstructured text where exact matches are rare, the goal is not a simple yes or no answer. Instead, the system produces a relevance ranking: a list of results sorted by an estimated score of how well each one satisfies a specific "information need".
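The difference shows up in the shape of the output: a ranked list with scores rather than a single hit or miss. The sketch below uses a deliberately naive score (the fraction of query terms found in each document) as a stand-in for a real relevance model. The function names and sample documents are assumptions made for illustration.

```python
def overlap_score(query: str, document: str) -> float:
    """Toy relevance score: fraction of query terms that appear in the document."""
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def rank(query: str, documents: list) -> list:
    """Return (score, document) pairs sorted from most to least relevant."""
    scored = [(overlap_score(query, doc), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


docs = [
    "how to reset a router",
    "reset your password in three steps",
    "choosing a strong password",
]

ranked = rank("reset password", docs)
for score, doc in ranked:
    print(f"{score:.2f}  {doc}")
```

Every document receives a score, including partial matches, and the caller decides how far down the list to read.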

Keyword vs Semantic

While IR has traditionally relied on keyword matching, modern AI has introduced search based on semantic meaning. Most production-grade systems now utilize a hybrid of these two methods.

Keyword Search, or lexical retrieval, looks for literal term matches between a query and a document. This is typically achieved through an Inverted Index, which maps each term to the documents that contain it. Algorithms like BM25 then score each match using term frequency, inverse document frequency (so rarer terms count for more), and document length normalization.
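As a rough sketch of how these pieces fit together, the snippet below builds a tiny inverted index and scores documents with a BM25-style formula. The corpus is invented, tokenization is plain whitespace splitting, and the parameter values k1=1.5 and b=0.75 are commonly used defaults assumed here. The IDF shown is one common smoothed variant, not the only one in use.

```python
import math
from collections import Counter, defaultdict

# Hypothetical corpus for illustration.
docs = {
    "d1": "heart attack symptoms and treatment",
    "d2": "air conditioning repair guide",
    "d3": "heart health and diet and exercise",
}

tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}

# Inverted index: term -> {doc_id: term frequency in that document}
index = defaultdict(dict)
for doc_id, tokens in tokenized.items():
    for term, tf in Counter(tokens).items():
        index[term][doc_id] = tf

N = len(docs)
avg_len = sum(len(tokens) for tokens in tokenized.values()) / N


def bm25(query: str, k1: float = 1.5, b: float = 0.75) -> list:
    """Return (doc_id, score) pairs for documents matching any query term."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no document
        df = len(postings)
        # Smoothed inverse document frequency: rarer terms weigh more.
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)
        for doc_id, tf in postings.items():
            # Length normalization: longer documents are penalized slightly.
            norm = 1 - b + b * len(tokenized[doc_id]) / avg_len
            scores[doc_id] += idf * tf * (k1 + 1) / (tf + k1 * norm)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


results = bm25("heart attack")
print(results)
```

Only documents that share at least one literal term with the query appear in the results. A query whose terms are absent from the index returns an empty list.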

The primary limitation of keyword search is the "vocabulary mismatch" problem. If a user searches for "myocardial infarction" but the source text uses "heart attack", the two phrases share no terms, so a keyword system will fail to bridge that gap.

Semantic Search addresses this by focusing on intent rather than spelling. Machine learning models convert text into Embeddings, which are high-dimensional numerical vectors. The system then measures how close the query vector is to each document vector, often using Cosine Similarity.

This allows the system to understand that "how to cool a room" is conceptually related to "air conditioning" regardless of the specific words used.
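The comparison step can be sketched in a few lines. The three-dimensional vectors below are hand-written stand-ins, not the output of any real embedding model (real embeddings typically have hundreds or thousands of dimensions). Only the cosine similarity calculation itself is meant to carry over.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # avoid division by zero for empty or all-zero vectors
    return dot / (norm_a * norm_b)


# Invented toy vectors, chosen by hand for illustration only.
toy_embeddings = {
    "how to cool a room": [0.9, 0.1, 0.2],
    "air conditioning": [0.8, 0.2, 0.1],
    "tax filing deadline": [0.1, 0.9, 0.3],
}

query_vector = toy_embeddings["how to cool a room"]
for text in ("air conditioning", "tax filing deadline"):
    similarity = cosine_similarity(query_vector, toy_embeddings[text])
    print(f"{similarity:.3f}  {text}")
```

With these toy values, "air conditioning" scores far higher than "tax filing deadline" even though neither phrase shares a word with the query.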

Precision and Recall

Engineering context for AI systems requires managing a constant balance between two primary metrics: Recall and Precision.

Recall is the fraction of all relevant documents that the system actually retrieves. High recall means vital facts are not missed.

Precision is the fraction of retrieved documents that are actually relevant. High precision means the results contain little irrelevant noise.
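Both metrics reduce to simple set arithmetic once you know which documents were retrieved and which are truly relevant. The document IDs below are made up for illustration.

```python
def precision_recall(retrieved: list, relevant: list) -> tuple:
    """Compute (precision, recall) from retrieved and relevant document IDs."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


retrieved = ["d1", "d2", "d3", "d4"]  # what the system returned
relevant = ["d1", "d3", "d5"]         # what it should have returned

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four retrieved documents are relevant (precision 0.50), and two of the three relevant documents were found (recall of about 0.67). Retrieving more documents can raise recall, but usually at the cost of precision.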

The balance between the two is critical because AI models operate within a limited context window. If a retrieval system provides high recall but low precision, the window becomes cluttered with irrelevant passages, a problem often described as "context rot".

This irrelevant information forces the model to expend its limited attention budget on noise, which often leads to inconsistent reasoning or hallucinations. Effective IR is not about finding the most data, but rather finding the most grounded and pertinent information.
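One simple way to act on this is to filter ranked results before they reach the model. The sketch below keeps only passages above a score threshold and stops adding once a size budget is reached. The select_context function, the threshold, and the use of word counts as a stand-in for tokens are all assumptions made for illustration.

```python
def select_context(ranked: list, max_words: int, min_score: float) -> list:
    """Pick passages from a score-sorted list within a word budget."""
    selected, used = [], 0
    for score, text in ranked:
        if score < min_score:
            break  # the list is sorted, so everything after is weaker
        words = len(text.split())
        if used + words > max_words:
            continue  # skip passages that would overflow the budget
        selected.append(text)
        used += words
    return selected


# Hypothetical ranked results as (score, passage) pairs, best first.
ranked_passages = [
    (0.92, "Reset the router by holding the button for ten seconds."),
    (0.81, "The reset button is on the back panel."),
    (0.30, "Our company was founded in 1999."),
]

context = select_context(ranked_passages, max_words=20, min_score=0.5)
print(context)
```

The low-scoring passage about the company's founding never reaches the context window, which leaves the budget for the two passages that actually address the query.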

Looking Ahead

In the next article, we will explore the "math of meaning" by diving deeper into Embeddings and explaining how text is actually transformed into a searchable vector space.
