Search | Airweave

Search API Updated (October 2025)

The search API has been updated. The legacy API continues to work, but we recommend migrating to the new API.


What Changed?

Endpoints:

~~GET /collections/{id}/search~~ → Still works but deprecated
POST /collections/{id}/search → Accepts both old and new schemas

Request Schema:

Legacy Field	New Field	Change
`response_type` (`"raw"` | `"completion"`)	`generate_answer` (boolean)	Enum → Boolean
`expansion_strategy` (`"auto"` | `"llm"` | `"no_expansion"`)	`expand_query` (boolean)	Enum → Boolean
`enable_query_interpretation`	`interpret_filters`	Renamed
`search_method`	`retrieval_strategy`	Renamed
`recency_bias`	`temporal_relevance`	Renamed
`enable_reranking`	`rerank`	Renamed
`score_threshold`	(removed)	Deprecated
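The mapping in the table can be applied mechanically to an existing payload. The sketch below is one way to do that with plain dictionaries; `migrate_search_request` is a hypothetical helper, not part of the SDK. The table does not say how each enum value maps to a boolean, so the code assumes that `"completion"` means `generate_answer: true` and that only `"no_expansion"` disables query expansion.

```python
from typing import Any

# Straight renames, copied from the table above.
RENAMED = {
    "enable_query_interpretation": "interpret_filters",
    "search_method": "retrieval_strategy",
    "recency_bias": "temporal_relevance",
    "enable_reranking": "rerank",
}


def migrate_search_request(legacy: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a legacy search payload that uses the new field names."""
    new: dict[str, Any] = {}
    for key, value in legacy.items():
        if key == "response_type":
            # Assumption: "completion" -> True, "raw" -> False.
            new["generate_answer"] = value == "completion"
        elif key == "expansion_strategy":
            # Assumption: "auto" and "llm" both enable expansion.
            new["expand_query"] = value != "no_expansion"
        elif key == "score_threshold":
            # Removed in the new schema, so it is dropped.
            continue
        else:
            # Renamed fields get their new name; everything else passes through.
            new[RENAMED.get(key, key)] = value
    return new
```

Fields that are already in the new schema (`query`, `filter`, `limit`, `offset`) pass through unchanged, so the helper is safe to run on payloads that are partly migrated.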

Full Comparison

from airweave import AirweaveSDK

client = AirweaveSDK(api_key="YOUR_API_KEY")

# Old GET endpoint with query params
response = await client.collections.search_collection(
    readable_id="my-collection",
    query="customer issues",
    response_type="completion",  # ❌ legacy field
    limit=50,
    recency_bias=0.5,            # ❌ legacy field
)

# Old POST with verbose schema
from airweave.schemas.search import SearchRequest

request = SearchRequest(
    query="deployment procedures",
    response_type="completion",           # ❌ use generate_answer
    expansion_strategy="auto",            # ❌ use expand_query
    enable_reranking=True,                # ❌ use rerank
    enable_query_interpretation=True,     # ❌ use interpret_filters
    search_method="hybrid",               # ❌ use retrieval_strategy
    recency_bias=0.3,                     # ❌ use temporal_relevance
)

Migration Steps

Step 1: Update Request Schema

request = SearchRequest(
    query="test",
    response_type="completion",
    expansion_strategy="auto",
    enable_reranking=True,
    search_method="hybrid",
)

Step 2: Update Response Handling

response = await client.collections.search_collection(...)

# Old response structure
if response.status == "success":
    if response.response_type == "completion":
        print(response.completion)
    else:
        print(response.results)

Step 3: Remove Deprecated Fields

The new response no longer includes:

status field
response_type field
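Without `status` and `response_type`, response handling can branch on the payload itself. The sketch below illustrates the idea on a plain dictionary. It assumes the new response still exposes `completion` and `results` under the same names as the legacy one, which this page does not confirm; `summarize_response` is a hypothetical helper.

```python
from typing import Any


def summarize_response(response: dict[str, Any]) -> str:
    """Describe a search response without relying on removed fields."""
    # Assumption: `completion` is present only when an answer was generated.
    completion = response.get("completion")
    if completion:
        return completion
    # Assumption: raw hits are returned under `results`.
    results = response.get("results") or []
    return f"{len(results)} result(s)"
```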

REST API Migration

GET Endpoint (Deprecated)

$ curl -X GET "https://api.airweave.ai/collections/{id}/search?query=test&response_type=completion" \
>   -H "x-api-key: your-api-key"

POST Endpoint Schema

{
  "query": "test",
  "response_type": "completion",
  "expansion_strategy": "auto",
  "enable_reranking": true,
  "search_method": "hybrid"
}

Detecting Deprecation

When using the legacy API, you’ll receive HTTP headers indicating deprecation:

X-API-Deprecation: true
X-API-Deprecation-Message: ...
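A client can check for these headers on every response and log a warning while the migration is in progress. The following is a minimal sketch that works on any mapping of header names to values; `deprecation_notice` is a hypothetical helper, and the header names are the two shown above.

```python
from typing import Mapping, Optional


def deprecation_notice(headers: Mapping[str, str]) -> Optional[str]:
    """Return the deprecation message if the response is flagged, else None."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    if normalized.get("x-api-deprecation", "").strip().lower() != "true":
        return None
    # The message header may be absent; fall back to an empty string.
    return normalized.get("x-api-deprecation-message", "")
```

Calling it with the headers of a legacy response returns the message text, which can then be passed to `warnings.warn` or a logger.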

Airweave lets you search across all your connected data sources through one unified interface. When you query a collection, Airweave runs a multi-step search pipeline that combines AI understanding with keyword precision. You can start with the defaults or configure each step for full control.

Want to try out our search right now? Head to our interactive API documentation where you can test search queries directly in your browser!

Quick Reference

Here are the default settings Airweave uses. You can override any of these in your queries.

Parameter	Default	Description
`expand_query`	`true`	Generate query variations for better recall
`retrieval_strategy`	`hybrid`	Combines AI semantic search with keyword matching
`interpret_filters`	`false`	Extract filters from natural language (off by default, so you set filters manually)
`rerank`	`true`	LLM-based result reordering (adds ~10s latency)
`temporal_relevance`	`0.3`	Weight toward recent content (0.0-1.0)
`generate_answer`	`true`	Generate AI completion from results
`limit`	`1000`	Maximum results to return
`offset`	`0`	Results to skip for pagination
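To see which settings a request ends up with, you can overlay your overrides on these defaults. The sketch below mirrors the table in a dictionary (values copied from the table above) and merges overrides on top. `effective_settings` is a hypothetical helper for illustration; the server applies its own defaults whether or not you send them.

```python
from typing import Any

# Defaults copied from the Quick Reference table.
SEARCH_DEFAULTS: dict[str, Any] = {
    "expand_query": True,
    "retrieval_strategy": "hybrid",
    "interpret_filters": False,
    "rerank": True,
    "temporal_relevance": 0.3,
    "generate_answer": True,
    "limit": 1000,
    "offset": 0,
}


def effective_settings(query: str, **overrides: Any) -> dict[str, Any]:
    """Merge caller overrides over the documented defaults."""
    # `filter` has no default but is a valid request field.
    unknown = set(overrides) - set(SEARCH_DEFAULTS) - {"filter"}
    if unknown:
        raise ValueError(f"unknown search parameters: {sorted(unknown)}")
    return {"query": query, **SEARCH_DEFAULTS, **overrides}
```

Rejecting unknown keys catches typos such as `reranks=False` before they are silently ignored.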

Which endpoint to use

Simple Search (Deprecated)

GET /collections/{readable_id}/search

⚠️ Deprecated: This endpoint is maintained for backwards compatibility only. Use POST for new integrations.

Advanced Search (Recommended)

POST /collections/{readable_id}/search

Recommended: Full control. Use this for all new integrations.

How Airweave search works

Each search runs through a multi-step pipeline. Understanding the stages helps explain why different parameters exist and when to use them:

Query expansion: Generate variations of the user query to capture synonyms and related terms.
Retrieval: Use keyword, neural, or hybrid methods to fetch candidate documents.
Filtering: Apply structured metadata filters before or during retrieval.
Temporal relevance: Optionally weight results toward fresher content.
Reranking: Use AI to reorder the top results for higher precision.
Answer generation: Return raw documents or synthesize a natural language response.

Defaults are designed to work out of the box, and you can override any stage as needed.

Parameters

Query Expansion

Expands your query to catch related terms and synonyms that may not appear verbatim in your documents. This improves recall when wording differs but meaning is the same.

Parameter: expand_query (boolean)

true (default): Generate query variations for better recall
false: Search only for your exact query

$ curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
>   -H 'x-api-key: YOUR_API_KEY' \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "customer churn analysis",
>     "expand_query": true
>   }'
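The same request can be issued from Python without the SDK. The sketch below builds the POST request with the standard library only and does not send it; `build_search_request` is a hypothetical helper, and the URL and headers are taken from the curl example above.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.airweave.ai"


def build_search_request(
    collection_id: str, api_key: str, payload: dict
) -> urllib.request.Request:
    """Build (but do not send) a POST request to the search endpoint."""
    # Quote the ID so unexpected characters cannot alter the URL path.
    path_id = urllib.parse.quote(collection_id, safe="")
    return urllib.request.Request(
        f"{API_BASE}/collections/{path_id}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(
    "your-collection-id",
    "YOUR_API_KEY",
    {"query": "customer churn analysis", "expand_query": True},
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

The other POST examples on this page differ only in the payload, so the same helper covers them.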

Search Method

The retrieval strategy determines how Airweave matches your query against your data: by meaning (neural), by exact terms (keyword), or both (hybrid). Each option strikes a different balance between semantic understanding and keyword precision.

Parameter: retrieval_strategy

hybrid (default): Best of both worlds - finds results by meaning AND exact keywords
neural: AI-powered search that understands what you mean, not just what you type
keyword: Traditional search that looks for exact word matches

$ curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
>   -H 'x-api-key: YOUR_API_KEY' \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "authentication flow security vulnerabilities",
>     "retrieval_strategy": "hybrid"
>   }'

Filtering Results

Applies structured filters before search, ensuring only relevant subsets are scanned. Useful for large datasets or when results must match specific attributes like source, date, or status.

Parameter: filter

Example 1: Filter by source

$ curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
>   -H 'x-api-key: YOUR_API_KEY' \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "deployment issues",
>     "filter": {
>       "must": [{
>         "key": "source_name",
>         "match": {"value": "GitHub"}
>       }]
>     }
>   }'

Example 2: Multiple filters

$ curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
>   -H 'x-api-key: YOUR_API_KEY' \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "customer feedback",
>     "filter": {
>       "must": [
>         {
>           "key": "source_name",
>           "match": {"any": ["Zendesk", "Intercom", "Slack"]}
>         },
>         {
>           "key": "created_at",
>           "range": {
>             "gte": "2024-01-01T00:00:00Z"
>           }
>         }
>       ]
>     }
>   }'

Example 3: Exclude results

$ curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
>   -H 'x-api-key: YOUR_API_KEY' \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "bug reports",
>     "filter": {
>       "must_not": [{
>         "key": "status",
>         "match": {"any": ["resolved", "closed", "done"]}
>       }]
>     }
>   }'
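When filters are assembled in code, small builder functions keep the nesting readable. The sketch below reproduces only the shapes used in the three examples above (`match.value`, `match.any`, `range.gte`, `must`, `must_not`); the function names are hypothetical, and other operators the API may support are not covered.

```python
from typing import Any, Iterable


def match_value(key: str, value: Any) -> dict[str, Any]:
    """Condition: field equals a single value (Example 1)."""
    return {"key": key, "match": {"value": value}}


def match_any(key: str, values: Iterable[Any]) -> dict[str, Any]:
    """Condition: field equals any of the given values (Examples 2 and 3)."""
    return {"key": key, "match": {"any": list(values)}}


def range_gte(key: str, gte: str) -> dict[str, Any]:
    """Condition: field is greater than or equal to a bound (Example 2)."""
    return {"key": key, "range": {"gte": gte}}


def build_filter(
    must: Iterable[dict] = (), must_not: Iterable[dict] = ()
) -> dict[str, Any]:
    """Combine conditions into a filter object, omitting empty clauses."""
    result: dict[str, Any] = {}
    must, must_not = list(must), list(must_not)
    if must:
        result["must"] = must
    if must_not:
        result["must_not"] = must_not
    return result
```

For instance, `build_filter(must_not=[match_any("status", ["resolved", "closed", "done"])])` yields the filter from Example 3.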

Query Interpretation

This feature is currently in beta. It can occasionally filter too narrowly, so verify result counts.

Query interpretation allows Airweave to automatically extract structured filters from a natural language query. Instead of manually defining metadata filters, you can simply describe what you are looking for, and Airweave will translate that description into filter conditions.

This feature is useful when you want to let end users search in plain English, for example “open GitHub issues from last week” or “critical bugs reported this month”. Airweave analyzes the query, identifies entities like dates, sources, or statuses, and applies them as filters.

Parameter: interpret_filters (boolean)

false (default): You control all filters manually
true: AI extracts filters from your natural language query

$ curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
>   -H 'x-api-key: YOUR_API_KEY' \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "open asana tickets from last week",
>     "interpret_filters": true
>   }'

Temporal Relevance

Learn more about this topic in our blog post: Deep Dive on Temporal Relevance.

Temporal relevance adjusts the results ranking to prefer newer documents. This is valuable for time-sensitive data like messages, customer feedback, tickets, or news.

The scoring formula adjusts results based on age:

S_final = S_similarity × (1 − β + β × d(t))

where:

S_final = final relevance score
S_similarity = semantic similarity score
β = temporal relevance weight, set by temporal_relevance (0 to 1)
d(t) = time decay factor (0 = oldest, 1 = newest)
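A direct transcription of the formula makes the effect of β easy to check by hand. The sketch below implements only the equation above; how Airweave computes d(t) from document timestamps is not described here, so the decay value is taken as an input.

```python
def temporal_score(similarity: float, beta: float, decay: float) -> float:
    """Compute S_final = S_similarity * (1 - beta + beta * d(t))."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be between 0 and 1")
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must be between 0 and 1")
    return similarity * (1.0 - beta + beta * decay)


# An older, closer match versus a newer, weaker match.
old_doc = temporal_score(similarity=0.9, beta=0.3, decay=0.0)
new_doc = temporal_score(similarity=0.7, beta=0.3, decay=1.0)
```

With β = 0.3 the oldest document keeps 70% of its similarity score while the newest keeps all of it, so in this example the newer document (about 0.70) ranks above the older one (about 0.63). With β = 0 the order reverts to pure similarity.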

Parameter: temporal_relevance (0.0 to 1.0)

0.3 (default): Slightly prefer newer content
0.0: Don’t care about dates, just find the best matches
1.0: Heavily prioritize the newest content

$ curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
>   -H 'x-api-key: YOUR_API_KEY' \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "project updates",
>     "temporal_relevance": 0.7
>   }'

Use this when freshness matters, for example to prioritize the latest bug reports or recent customer complaints over historical ones.

Pagination

Control how many results you get and navigate through large result sets.

Parameters:

limit: How many results to return (1-1000, default: 20)
offset: How many results to skip (for pagination, default: 0)

# Simple search with pagination
$ curl -X GET 'https://api.airweave.ai/collections/your-collection-id/search?query=data%20retention%20policies&limit=50&offset=50' \
>   -H 'x-api-key: YOUR_API_KEY'
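To walk through a large result set, increase `offset` by `limit` on each request. The sketch below generates the successive payloads for the POST endpoint; `page_payloads` and its `total` argument (how many results you want to walk through) are hypothetical and not part of the API.

```python
from typing import Any, Iterator


def page_payloads(query: str, total: int, limit: int = 50) -> Iterator[dict[str, Any]]:
    """Yield one request payload per page until `total` results are covered."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    # Each page starts where the previous one ended.
    for offset in range(0, total, limit):
        yield {"query": query, "limit": limit, "offset": offset}


pages = list(page_payloads("data retention policies", total=120, limit=50))
```

The second payload in `pages` has `limit=50` and `offset=50`, matching the curl example above.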

AI Reranking

AI reranking takes the top set of results from the initial search and reorders them using a large language model. This improves accuracy in cases where keyword or semantic similarity alone might be misleading.

Parameter: rerank (boolean)

true (default): AI reviews and reorders results for best relevance
false: Skip reranking for faster results

Reranking adds about 10 seconds to your search. Turn it off if you need fast results.

$ curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
>   -H 'x-api-key: YOUR_API_KEY' \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "user authentication methods",
>     "rerank": false
>   }'

Generate AI Answers

Airweave can return either raw results or a synthesized answer. When enabled, a large language model generates a natural language response based on the top results, including sources when available.

Parameter: generate_answer (boolean)

true (default): Generate an AI-synthesized answer from the top search results
false: Return only raw results

$ curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
>   -H 'x-api-key: YOUR_API_KEY' \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "What are our customer refund policies?",
>     "generate_answer": true
>   }'

Complete example

Here’s everything together in one search using the new API:

$ curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
>   -H 'x-api-key: YOUR_API_KEY' \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "customer feedback about pricing",
>     "expand_query": true,
>     "retrieval_strategy": "hybrid",
>     "filter": {
>       "must": [{
>         "key": "source_name",
>         "match": {"any": ["Zendesk", "Slack"]}
>       }]
>     },
>     "temporal_relevance": 0.5,
>     "rerank": true,
>     "generate_answer": false,
>     "limit": 50,
>     "offset": 0
>   }'

Legacy API Example

If you’re still using the legacy API, here’s how the same query looks (deprecated):

$ curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
>   -H 'x-api-key: YOUR_API_KEY' \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "customer feedback about pricing",
>     "expansion_strategy": "auto",
>     "search_method": "hybrid",
>     "filter": {
>       "must": [{
>         "key": "source_name",
>         "match": {"any": ["Zendesk", "Slack"]}
>       }]
>     },
>     "recency_bias": 0.5,
>     "enable_reranking": true,
>     "response_type": "raw",
>     "limit": 50,
>     "offset": 0
>   }'
> # Response will include X-API-Deprecation header

Ready to search?

Try these examples live in our interactive API documentation. You can execute real searches and see responses instantly!