Foundations
Basics of Information Retrieval
In the previous article, we defined Context Engineering as the systematic management of the information an AI model sees. However, before we can manage that information, we must first find it.
This is the role of Information Retrieval (or IR). In the context of AI agents and Retrieval-Augmented Generation (RAG), IR acts as the foundational layer that identifies relevant facts before the model ever generates a response.
DR vs IR
To build effective AI systems, we first have to distinguish between finding raw data and finding meaningful information.
Standard Data Retrieval is deterministic and binary. When you execute a simple SQL query such as SELECT x FROM Y for a specific ID, the system either finds that exact match or returns nothing. It is a precise operation based on explicit parameters.
Information Retrieval, by contrast, is probabilistic. Because IR deals with unstructured text where exact matches are often rare, the goal is not a simple yes or no answer. Instead, the system provides a relevance ranking. It returns a list of results sorted by an estimated score of how likely each one is to satisfy a specific "information need".
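To make the contrast concrete, here is a minimal sketch in Python. The records dictionary, the function names, and the word-overlap score are all assumptions made up for illustration; real IR systems use far more refined scoring.

```python
# Toy contrast: exact-match data retrieval vs ranked information retrieval.
# The records and the scoring rule below are hypothetical, for illustration only.
records = {
    101: "reset your password",
    102: "update billing address",
}


def data_retrieval(record_id):
    """Deterministic lookup: the ID either exists or it does not."""
    return records.get(record_id)


def information_retrieval(query, docs):
    """Rank every document by a crude score: the share of query words it contains."""
    query_terms = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        overlap = len(query_terms & set(text.lower().split()))
        scored.append((overlap / len(query_terms), doc_id))
    # Highest score first; every document gets a position in the ranking.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


exact_hit = data_retrieval(101)
exact_miss = data_retrieval(999)
ranking = information_retrieval("forgot my password", records)
```

The lookup returns either the record or None, while the ranking returns every document with a score, leaving the caller to decide how far down the list is still useful.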
Keyword vs Semantic
While IR has traditionally relied on keyword matching, modern AI has introduced search based on semantic meaning. Most production-grade systems now utilize a hybrid of these two methods.
Keyword Search, or lexical retrieval, looks for literal term matches between a query and a document. This is typically achieved through an Inverted Index, which maps each term to the documents that contain it. Algorithms like BM25 score each document using term frequency (how often a query term appears in the document), inverse document frequency (how rare the term is across the collection), and a normalization for document length.
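The following is a minimal sketch of an inverted index with BM25-style scoring, using only the Python standard library. The three documents are invented for illustration, tokenization is a naive whitespace split, and the parameters k1=1.5 and b=0.75 are commonly used defaults rather than values taken from any particular system.

```python
import math
from collections import Counter, defaultdict

# Hypothetical mini-corpus, for illustration only.
docs = {
    "d1": "the patient suffered a heart attack",
    "d2": "heart rate monitors for runners",
    "d3": "how to train for a marathon",
}

# Naive tokenization: lowercase and split on whitespace.
tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}

# Inverted index: term -> {doc_id: term frequency in that document}
inverted_index = defaultdict(dict)
for doc_id, tokens in tokenized.items():
    for term, tf in Counter(tokens).items():
        inverted_index[term][doc_id] = tf

avg_len = sum(len(tokens) for tokens in tokenized.values()) / len(tokenized)


def bm25(query, k1=1.5, b=0.75):
    """Return (doc_id, score) pairs for documents matching any query term."""
    scores = defaultdict(float)
    n_docs = len(tokenized)
    for term in query.lower().split():
        postings = inverted_index.get(term, {})
        if not postings:
            continue  # a term absent from the index contributes nothing
        df = len(postings)
        # Rarer terms receive a larger inverse document frequency weight.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings.items():
            # Longer-than-average documents are penalized slightly.
            length_norm = 1 - b + b * len(tokenized[doc_id]) / avg_len
            scores[doc_id] += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


results = bm25("heart attack")
```

Here d1 ranks above d2 because it matches both query terms, and "attack" is rarer in the collection than "heart". A query whose terms never appear in the index returns an empty list.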
The primary limitation of keyword search is the "vocabulary mismatch" problem. If a user searches for "myocardial infarction" but the source text uses "heart attack", a purely keyword-based system will fail to bridge that gap, even though both phrases describe the same condition.
Semantic Search addresses this by focusing on intent rather than spelling. Machine learning models convert text into Embeddings, which are high-dimensional numerical vectors. The system then measures how close the query vector is to each document vector, often using Cosine Similarity.
This allows the system to understand that "how to cool a room" is conceptually related to "air conditioning" regardless of the specific words used.
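As a rough illustration of the comparison step, the sketch below computes cosine similarity in plain Python. The three-dimensional vectors are hand-written stand-ins with made-up values, not the output of any embedding model; real embeddings typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for document embeddings (assumption).
toy_vectors = {
    "air conditioning": [0.9, 0.1, 0.0],
    "pasta recipes": [0.0, 0.2, 0.9],
}

# Made-up vector standing in for the query "how to cool a room" (assumption).
query_vector = [0.8, 0.3, 0.1]

# Rank documents from most to least similar to the query.
ranked = sorted(
    toy_vectors,
    key=lambda name: cosine_similarity(query_vector, toy_vectors[name]),
    reverse=True,
)
```

With these toy values, "air conditioning" ranks first because its vector points in nearly the same direction as the query vector, even though the two strings share no words.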
Precision and Recall
Engineering context for AI systems requires managing a constant balance between two primary metrics: Recall and Precision.
Recall measures the share of all relevant documents that the system actually retrieves, ensuring that vital facts are not missed.
Precision measures the share of retrieved documents that are actually relevant, ensuring the results are not padded with irrelevant noise.
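Both metrics can be computed from two sets: what the system returned and what was actually relevant. The sketch below uses made-up document IDs, and the function name is an illustrative choice.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: the system returns four documents, two of which are
# relevant, while a third relevant document (d7) is missed entirely.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d7"]
precision, recall = precision_recall(retrieved, relevant)
```

In this example half of the returned documents are noise (precision 0.5) and one of the three relevant documents was never found (recall 2/3). Returning more documents could raise recall, but usually at the cost of precision.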
The balance between the two is critical because AI models operate within a limited context window. If a retrieval system provides high recall but low precision, the window fills with irrelevant passages, a condition sometimes called "context rot".
This irrelevant information forces the model to expend its limited attention budget on noise, which often leads to inconsistent reasoning or hallucinations. Effective IR is not about finding the most data, but rather finding the most grounded and pertinent information.
Looking Ahead
In the next article, we will explore the "math of meaning" by diving deeper into the concept of Embeddings, explaining how text is actually transformed into a searchable vector space.
