Search
Airweave lets you search across all your connected data sources through one unified interface. When you query a collection, Airweave runs a multi-step search pipeline that combines AI understanding with keyword precision. You can start with the defaults or configure each step for full control.
Want to try out our search right now? Head to our interactive API documentation where you can test search queries directly in your browser!
Quick Reference
Here are the default settings Airweave uses. You can override any of these in your queries.
Which endpoint to use
Choose the GET endpoint for simple searches and the POST endpoint when you need advanced configuration.
GET /collections/{readable_id}/search
Best for quick searches with default settings. Just pass your query and go.
POST /collections/{readable_id}/search
Use this when you need to customize any of the options above.
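As a rough sketch of that choice, the helper below builds a GET request when only a query is supplied and a POST request when any option is overridden. The base URL `https://api.example.com`, the `query` parameter name, and the `build_search_request` helper are assumptions made for illustration and are not confirmed parts of the Airweave API. The request is only constructed, never sent.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# Hypothetical base URL; replace with the real API host.
BASE_URL = "https://api.example.com"


def build_search_request(readable_id: str, query: str, **options) -> Request:
    """Build (but do not send) a search request for a collection."""
    url = f"{BASE_URL}/collections/{quote(readable_id, safe='')}/search"
    if not options:
        # Defaults only: a GET with the query in the query string.
        return Request(f"{url}?{urlencode({'query': query})}", method="GET")
    # Any override: a POST with a JSON body carrying the options.
    body = json.dumps({"query": query, **options}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


simple = build_search_request("my-collection", "open issues")
advanced = build_search_request("my-collection", "open issues", recency_bias=0.5)
print(simple.get_method(), simple.full_url)
print(advanced.get_method(), advanced.full_url)
```

Authentication headers are left out on purpose, since this page does not describe them.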
How Airweave search works
Each search runs through a multi-step pipeline. Understanding the stages helps explain why different parameters exist and when to use them:
- Query expansion: Generate variations of the user query to capture synonyms and related terms.
- Retrieval: Use keyword, neural, or hybrid methods to fetch candidate documents.
- Filtering: Apply structured metadata filters before or during retrieval.
- Recency bias: Optionally weight results toward fresher content.
- Reranking: Use AI to reorder the top results for higher precision.
- Answer generation: Return raw documents or synthesize a natural language response.
Defaults are designed to work out of the box, and you can override any stage as needed.
Parameters
Query Expansion
Expands your query to catch related terms and synonyms that may not appear verbatim in your documents. This improves recall when wording differs but meaning is the same.
Options: expansion_strategy
- auto (default): Uses AI to expand queries when available
- llm: Always uses AI to create up to 4 query variations
- no_expansion: Search only for your exact query
Search Method
The search method determines how Airweave searches your data. Different methods balance semantic understanding and keyword precision. You can use AI to understand meaning, traditional keyword matching, or both.
Options: search_method
- hybrid (default): Best of both worlds, finds results by meaning AND exact keywords
- neural: AI-powered search that understands what you mean, not just what you type
- keyword: Traditional search that looks for exact word matches
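To make the idea of hybrid search concrete, here is a minimal sketch of one common way to merge a keyword ranking and a neural ranking: reciprocal rank fusion (RRF). This page does not say how Airweave combines the two result sets, so RRF, the constant `k = 60`, and the document IDs are assumptions chosen for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list contribute more.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["doc_a", "doc_b", "doc_c"]  # exact word matches
neural_hits = ["doc_b", "doc_d", "doc_a"]   # semantic matches
fused = reciprocal_rank_fusion([keyword_hits, neural_hits])
print(fused)
```

Documents that appear in both lists (`doc_a`, `doc_b`) end up ahead of documents found by only one method, which is the behavior the hybrid option describes.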
Filtering Results
Applies structured filters before search, ensuring only relevant subsets are scanned. Useful for large datasets or when results must match specific attributes like source, date, or status.
Parameter: filter
Example 1: Filter by source
Example 2: Multiple filters
Example 3: Exclude results
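The three cases above (filter by source, combine filters, exclude results) can be illustrated conceptually with a small local sketch. The `must` / `must_not` dictionaries, the `matches` helper, and the field names `source` and `status` are hypothetical; they show how structured filters narrow the candidate set and should not be read as Airweave's actual filter schema.

```python
from typing import Optional


def matches(doc: dict, must: Optional[dict] = None,
            must_not: Optional[dict] = None) -> bool:
    """Return True if doc has every `must` value and no `must_not` value."""
    must = must or {}
    must_not = must_not or {}
    required_ok = all(doc.get(field) == value for field, value in must.items())
    excluded = any(doc.get(field) == value for field, value in must_not.items())
    return required_ok and not excluded


docs = [
    {"id": 1, "source": "github", "status": "open"},
    {"id": 2, "source": "github", "status": "closed"},
    {"id": 3, "source": "slack", "status": "open"},
]

# Filter by source.
by_source = [d["id"] for d in docs if matches(d, must={"source": "github"})]
# Multiple filters: every condition must hold.
multiple = [d["id"] for d in docs
            if matches(d, must={"source": "github", "status": "open"})]
# Exclude results that match a condition.
not_closed = [d["id"] for d in docs if matches(d, must_not={"status": "closed"})]

print(by_source, multiple, not_closed)
```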
Query Interpretation
This feature is currently in beta. It can occasionally filter too narrowly, so verify result counts.
Query interpretation allows Airweave to automatically extract structured filters from a natural language query. Instead of manually defining metadata filters, you can simply describe what you are looking for, and Airweave will translate that description into filter conditions.
This feature is useful when you want to let end users search in plain English, for example “open GitHub issues from last week” or “critical bugs reported this month”. Airweave analyzes the query, identifies entities like dates, sources, or statuses, and applies them as filters.
Options: enable_query_interpretation
- false (default): You control all filters manually
- true: AI extracts filters from your natural language query
Temporal Relevance
Temporal relevance adjusts the results ranking to prefer newer documents. This is valuable for time-sensitive data like messages, customer feedback, tickets, or news.
The scoring formula adjusts results based on age:
S_final = S_similarity × (1 − β + β × d(t))
where:
- S_final = final relevance score
- S_similarity = semantic similarity score
- β = recency bias parameter (0 to 1)
- d(t) = time decay factor (0 = oldest, 1 = newest)
Options: recency_bias (0.0 to 1.0)
- 0.3 (default): Slightly prefer newer content
- 0.0: Don’t care about dates, just find the best matches
- 1.0: Heavily prioritize the newest content
Use this when freshness matters, for example to prioritize the latest bug reports or recent customer complaints over historical ones.
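The scoring formula in this section can be worked through with a short sketch. `final_score` implements the formula as written; `linear_decay` is an assumption, since the page only states that d(t) is 0 for the oldest document and 1 for the newest, not how intermediate ages are mapped.

```python
def linear_decay(timestamp: float, oldest: float, newest: float) -> float:
    """Map a timestamp to [0, 1]: 0 for the oldest document, 1 for the newest.

    Linear interpolation is an illustrative assumption.
    """
    if newest == oldest:
        return 1.0  # a single point in time: treat everything as newest
    return (timestamp - oldest) / (newest - oldest)


def final_score(similarity: float, recency_bias: float, decay: float) -> float:
    """S_final = S_similarity * (1 - beta + beta * d(t))."""
    if not 0.0 <= recency_bias <= 1.0:
        raise ValueError("recency_bias must be between 0.0 and 1.0")
    return similarity * (1.0 - recency_bias + recency_bias * decay)


# Two documents with the same similarity score but different ages.
oldest_doc = final_score(0.8, recency_bias=0.3, decay=0.0)
newest_doc = final_score(0.8, recency_bias=0.3, decay=1.0)
print(oldest_doc, newest_doc)
```

With the default bias of 0.3, the oldest document keeps 70% of its similarity score while the newest keeps all of it. With a bias of 0.0 age has no effect, and with 1.0 the oldest document's score drops to zero.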
Pagination
Control how many results you get and navigate through large result sets.
Parameters:
- limit: How many results to return (1-1000, default: 20)
- offset: How many results to skip (for pagination, default: 0)
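A minimal sketch of walking through a large result set with `limit` and `offset` follows. The `fetch_page` callable stands in for a real search call and `fake_fetch` serves results from memory; both are hypothetical helpers used only so the loop can run without a network connection.

```python
from typing import Callable, Iterator


def iter_results(fetch_page: Callable[[int, int], list],
                 limit: int = 20) -> Iterator:
    """Yield every result by advancing offset one page at a time."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            return  # a short (or empty) page means there is nothing left
        offset += limit


# Stand-in for a real search call: 45 fake results held in memory.
FAKE_RESULTS = [f"result-{i}" for i in range(45)]


def fake_fetch(limit: int, offset: int) -> list:
    return FAKE_RESULTS[offset:offset + limit]


all_results = list(iter_results(fake_fetch, limit=20))
print(len(all_results))
```

Here the loop requests offsets 0, 20, and 40, and stops after the third page because it returns fewer than `limit` items.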
Filter by Relevance Score
Set a minimum relevance score to filter out weak matches. Useful when you only want high-quality results.
Options: score_threshold (0.0 to 1.0)
- None (default): Return all matches
- 0.7-0.8: Return only high-confidence matches
Use this when you need very reliable matches and can tolerate lower recall, for example in compliance or legal document retrieval.
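The effect of a threshold can be sketched locally as follows. The `score` field name and the inclusive comparison (`>=`) are assumptions for illustration; the page does not specify the response shape or whether the boundary value itself is kept.

```python
from typing import Optional


def apply_score_threshold(results: list,
                          score_threshold: Optional[float] = None) -> list:
    """Keep only results whose score meets the threshold (None keeps all)."""
    if score_threshold is None:
        return list(results)
    if not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold must be between 0.0 and 1.0")
    return [r for r in results if r["score"] >= score_threshold]


results = [
    {"id": "a", "score": 0.92},
    {"id": "b", "score": 0.74},
    {"id": "c", "score": 0.41},
]

everything = apply_score_threshold(results)
confident = apply_score_threshold(results, score_threshold=0.7)
print([r["id"] for r in confident])
```

Raising the threshold trades recall for precision: the weak match `c` is dropped, and a stricter value such as 0.8 would drop `b` as well.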
AI Reranking
AI reranking takes the top set of results from the initial search and reorders them using a large language model. This improves accuracy in cases where keyword or semantic similarity alone might be misleading.
Options: enable_reranking
- true (default): AI reviews and reorders results for best relevance
- false: Skip reranking for faster results
Reranking adds about 10 seconds to your search. Turn it off if you need fast results.
Generate AI Answers
Airweave can return either raw results or a synthesized answer. When response_type is set to completion, a large language model generates a natural language response based on the top results, including sources when available.
Options: response_type
- raw (default): Get the actual results, recommended when you want full control.
- completion: Get a synthesized answer generated from the top search results.
Complete example
Here’s everything together in one search:
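As a hedged sketch rather than the official example, the snippet below assembles a request body that sets every parameter documented on this page. The option names and default values are taken from the sections above; the `query` field name and the `build_search_body` helper are assumptions, and `filter` is passed through untouched because its schema is not shown here.

```python
import json

# Defaults as stated in the parameter sections of this page.
DEFAULTS = {
    "expansion_strategy": "auto",
    "search_method": "hybrid",
    "enable_query_interpretation": False,
    "recency_bias": 0.3,
    "limit": 20,
    "offset": 0,
    "score_threshold": None,
    "enable_reranking": True,
    "response_type": "raw",
}


def build_search_body(query: str, **overrides) -> dict:
    """Merge overrides onto the documented defaults and check basic ranges."""
    unknown = set(overrides) - set(DEFAULTS) - {"filter"}
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    body = {"query": query, **DEFAULTS, **overrides}
    if not 0.0 <= body["recency_bias"] <= 1.0:
        raise ValueError("recency_bias must be between 0.0 and 1.0")
    if not 1 <= body["limit"] <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    return body


body = build_search_body(
    "critical bugs reported this month",
    expansion_strategy="llm",
    search_method="hybrid",
    enable_query_interpretation=True,
    recency_bias=0.7,
    limit=10,
    score_threshold=0.7,
    enable_reranking=False,
    response_type="completion",
)
print(json.dumps(body, indent=2))
```

A body like this would be sent to the POST endpoint, since it overrides the defaults.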
Try these examples live in our interactive API documentation. You can execute real searches and see responses instantly!