Search
Airweave’s search subsystem provides a unified query interface across heterogeneous data sources inside a collection. The pipeline transforms natural‑language queries into ranked results via a deterministic sequence of specialized operators that can be individually configured or disabled for fine‑grained control. This page summarizes each operator, its defaults, and minimal examples.
Operators
Query Expansion
Query expansion mitigates vocabulary mismatch by generating semantically equivalent reformulations of the original query, improving recall across documents that express similar concepts with different terminology. Practically, the system expands the query into semantically adjacent variants, sampling a broader neighborhood in vector and keyword space.
Configuration Parameter: expansion_strategy
- auto (default): Dynamically selects expansion based on OpenAI API availability
- llm: Generates up to 4 query variations using GPT-4o-mini
- no_expansion: Disables expansion for exact query matching
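As a minimal sketch, the strategy is selected per request through the advanced endpoint described under API Endpoints below; the base URL, collection id, and auth header here are placeholders, not confirmed values:

```python
import requests

# Placeholder URL, collection id, and API key; substitute your own.
response = requests.post(
    "https://api.airweave.ai/collections/my-collection/search",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={
        "query": "How do I rotate credentials?",
        "expansion_strategy": "llm",  # "auto" (default) or "no_expansion"
    },
)
print(response.json())
```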
Embedding Generation
Maps queries to dense and/or sparse representations depending on the search method. Dense embeddings permit semantic similarity; sparse embeddings approximate lexical BM25. In hybrid mode both representations are produced and fused downstream.
Configuration Parameter: search_method
- hybrid (default): Combines dense semantic vectors with sparse BM25 representations via Reciprocal Rank Fusion
- neural: Utilizes transformer-based embeddings for pure semantic similarity
- keyword: Employs BM25 term frequency statistics for lexical matching
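To illustrate how hybrid mode fuses the dense and sparse result lists, here is a generic Reciprocal Rank Fusion sketch; the constant k = 60 is the conventional value from the RRF literature, not a value confirmed for Airweave:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document's fused score is the sum of 1 / (k + rank) over
    every list in which it appears (rank is 1-based).
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense_ranking = ["doc_a", "doc_b", "doc_c"]   # from neural embeddings
sparse_ranking = ["doc_b", "doc_d", "doc_a"]  # from BM25
print(reciprocal_rank_fusion([dense_ranking, sparse_ranking]))
# doc_a and doc_b rise to the top because both lists agree on them
```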
Metadata Filtering
Qdrant’s native filtering system enables precise subsetting of the search space based on structured metadata fields. This operation applies boolean predicates to document attributes before similarity computation, ensuring efficient retrieval within constrained domains.
Configuration Parameter: filter (Qdrant Filter object)
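A sketch of passing a Qdrant Filter object in the request body, following Qdrant's must/should/must_not condition schema; the source_name field and the URL/auth values are illustrative assumptions:

```python
import requests

# Placeholder URL, collection id, and API key; "source_name" is an
# illustrative metadata field, not a guaranteed payload key.
response = requests.post(
    "https://api.airweave.ai/collections/my-collection/search",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={
        "query": "open incidents",
        "filter": {
            "must": [
                {"key": "source_name", "match": {"value": "GitHub"}},
            ]
        },
    },
)
print(response.json())
```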
Query Interpretation (Beta)
This advanced natural language understanding component automatically extracts structured constraints from unstructured queries. The system employs large language models to identify temporal expressions, entity references, and status indicators, dynamically generating appropriate filter predicates.
Configuration Parameter: enable_query_interpretation
- false (default): Disabled to prevent unintended filtering
- true: Enables automatic filter extraction from natural language
Implementation Example: the sketch below enables interpretation on the advanced search endpoint; the base URL, collection id, and auth header are placeholders.
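```python
import requests

# Placeholder URL, collection id, and API key; substitute your own.
response = requests.post(
    "https://api.airweave.ai/collections/my-collection/search",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={
        # "last week" and "open" are the kind of temporal and status
        # phrases the interpreter can turn into filter predicates.
        "query": "open tickets from last week about billing",
        "enable_query_interpretation": True,
    },
)
print(response.json())
```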
Query interpretation is in beta. Extracted filters may occasionally be overly restrictive, potentially excluding relevant results. Monitor result counts when using this feature.
Recency Bias
Temporal relevance scoring adjusts document rankings based on creation or modification timestamps. Internally, a decay configuration is derived from the collection’s observed time span (subject to any active filter) and composed into Qdrant formula scoring. This preserves relevance while enforcing temporal preference. The penalty is defined as:

penalty = recency_bias × (1 − recency)

where recency_bias ∈ [0, 1] is the configured weight and recency ∈ [0, 1] maps oldest→0 and newest→1 within the observed span.
Configuration Parameter: recency_bias (float: 0.0-1.0)
- 0.3 (default): Moderate recency preference
- 0.0: Pure semantic similarity without temporal influence
- 1.0: Maximum recency weight within semantic matches
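An illustrative re-implementation of the penalty above (not Airweave's internal code, which composes a decay configuration into Qdrant formula scoring):

```python
def recency_penalty(timestamp: float, oldest: float, newest: float,
                    recency_bias: float = 0.3) -> float:
    """Penalty per the formula above: recency_bias * (1 - recency),
    where recency maps oldest -> 0 and newest -> 1 in the observed span."""
    if newest == oldest:
        return 0.0  # degenerate span: no temporal signal to apply
    recency = (timestamp - oldest) / (newest - oldest)
    return recency_bias * (1.0 - recency)

# A document from the middle of the span incurs half the maximum penalty.
print(recency_penalty(timestamp=50.0, oldest=0.0, newest=100.0))  # 0.15
```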
Vector Search
The core retrieval mechanism performs approximate nearest neighbor search in high-dimensional embedding spaces. This operation orchestrates the actual database query, handling multi-vector fusion, result deduplication, and score normalization.
Pagination parameters control result set size and navigation through large result collections:
- limit (int: 1-1000, default: 20): Maximum documents per response
- offset (int: ≥0, default: 0): Skip count for pagination
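A sketch of paging through a large result set with limit and offset; the URL, auth header, and the assumed results key in the response body are placeholders:

```python
import requests

# Placeholder URL, collection id, and API key; page 20 results at a time.
url = "https://api.airweave.ai/collections/my-collection/search"
headers = {"x-api-key": "YOUR_API_KEY"}
page_size, offset = 20, 0
all_results = []
while True:
    resp = requests.post(
        url,
        headers=headers,
        json={"query": "quarterly report", "limit": page_size, "offset": offset},
    )
    page = resp.json().get("results", [])  # assumed response shape
    if not page:
        break
    all_results.extend(page)
    offset += page_size
```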
Score Threshold Filtering
Post-retrieval filtering eliminates results below a specified similarity threshold, ensuring minimum quality standards for returned documents. This parameter acts as a quality gate rather than a separate operation.
Configuration Parameter: score_threshold (float: 0.0-1.0)
- None (default): No score filtering
- 0.7: Returns only high-confidence matches
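The gate is equivalent to a simple post-filter over scored hits, sketched here with an assumed result shape of {id, score}:

```python
def apply_score_threshold(results: list[dict],
                          score_threshold: float | None) -> list[dict]:
    """Mirror of the quality gate: drop hits scoring below the threshold."""
    if score_threshold is None:
        return results  # default: no filtering
    return [r for r in results if r["score"] >= score_threshold]

hits = [{"id": "a", "score": 0.91}, {"id": "b", "score": 0.42}]
print(apply_score_threshold(hits, 0.7))  # [{'id': 'a', 'score': 0.91}]
```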
Result Reranking
Reranking employs large language models to perform pairwise relevance assessment between query and retrieved documents. This computationally intensive operation refines the initial ranking by considering deeper semantic relationships and contextual nuances.
Configuration Parameter: enable_reranking
- true (default): Applies LLM-based relevance scoring
- false: Retains original vector similarity ranking
Reranking adds approximately 10 seconds to query latency due to LLM processing. Disable for time-sensitive applications.
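Per the latency note above, a time-sensitive caller might disable reranking per request (placeholder URL, collection id, and auth):

```python
import requests

# Placeholder URL, collection id, and API key; skip LLM reranking
# to keep latency low for interactive use.
response = requests.post(
    "https://api.airweave.ai/collections/my-collection/search",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={"query": "deployment checklist", "enable_reranking": False},
)
print(response.json())
```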
Response Generation
Synthesizes a natural language answer from the retrieved context. The completion model is invoked only when explicitly requested and requires an OpenAI API key in the environment. Source attribution is encouraged via structured prompts.
Configuration Parameter: response_type
- raw (default): Returns structured JSON with document payloads
- completion: Generates coherent natural language summary via LLM
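A sketch of requesting a generated answer instead of raw payloads (placeholder URL and auth; as noted above, completion mode requires an OpenAI API key in the server environment):

```python
import requests

# Placeholder URL, collection id, and API key.
response = requests.post(
    "https://api.airweave.ai/collections/my-collection/search",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={"query": "Summarize our refund policy", "response_type": "completion"},
)
print(response.json())  # natural-language answer rather than raw payloads
```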
API Endpoints
- GET /collections/{id}/search: Simple search using all operator defaults.
- POST /collections/{id}/search: Advanced search exposing all operators for full configurability. Use this when you need precise control over search behavior.
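For comparison with the POST sketches above, the simple endpoint takes only a query string; the URL and auth header remain placeholders:

```python
import requests

# Placeholder URL, collection id, and API key; all operators
# run with their documented defaults.
resp = requests.get(
    "https://api.airweave.ai/collections/my-collection/search",
    headers={"x-api-key": "YOUR_API_KEY"},
    params={"query": "customer churn analysis"},
)
print(resp.status_code, resp.json())
```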
Default Configuration
The system employs empirically optimized defaults for common use cases.
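Collecting the per-operator defaults documented on this page into one sketch (the filter default is assumed to be unset; field names are as used above):

```python
# Effective defaults, as documented for each operator above.
DEFAULT_SEARCH_CONFIG = {
    "expansion_strategy": "auto",          # query expansion
    "search_method": "hybrid",             # embedding generation
    "filter": None,                        # metadata filtering (assumed unset)
    "enable_query_interpretation": False,  # beta; off by default
    "recency_bias": 0.3,                   # moderate temporal preference
    "limit": 20,                           # results per response
    "offset": 0,                           # pagination start
    "score_threshold": None,               # no quality gate
    "enable_reranking": True,              # LLM reranking on
    "response_type": "raw",                # structured JSON payloads
}
```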