Search

Search API Updated (October 2025)

The search API has been updated. The legacy API continues to work, but we recommend migrating to the new API.

What Changed?

Endpoints:

  • GET /collections/{id}/search → Still works but deprecated
  • POST /collections/{id}/search → Accepts both old and new schemas

Request Schema:

Legacy Field                                            New Field                    Change
response_type ("raw" | "completion")                    generate_answer (boolean)    Enum → Boolean
expansion_strategy ("auto" | "llm" | "no_expansion")    expand_query (boolean)       Enum → Boolean
enable_query_interpretation                             interpret_filters            Renamed
search_method                                           retrieval_strategy           Renamed
recency_bias                                            temporal_relevance           Renamed
enable_reranking                                        rerank                       Renamed
score_threshold                                         (removed)                    Deprecated

Full Comparison

from airweave import AirweaveSDK

client = AirweaveSDK(api_key="YOUR_API_KEY")

# Old GET endpoint with query params
response = await client.collections.search_collection(
    readable_id="my-collection",
    query="customer issues",
    response_type="completion",  # ❌
    limit=50,
    recency_bias=0.5,
)

# Old POST with verbose schema
from airweave.schemas.search import SearchRequest

request = SearchRequest(
    query="deployment procedures",
    response_type="completion",  # ❌
    expansion_strategy="auto",  # ❌
    enable_reranking=True,  # ✅
    enable_query_interpretation=True,  # ❌
    search_method="hybrid",  # ❌
    recency_bias=0.3,  # ❌
)

Migration Steps

Step 1: Update Request Schema

request = SearchRequest(
    query="test",
    response_type="completion",
    expansion_strategy="auto",
    enable_reranking=True,
    search_method="hybrid",
)
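The field mapping in the "Request Schema" table can be applied mechanically. The sketch below converts a legacy request body (as a plain dict) into the new schema. The helper name `migrate_request` is hypothetical and not part of the SDK, and treating both "auto" and "llm" as `expand_query: true` is an assumption; the source only shows "auto" paired with `true` and "raw" paired with `generate_answer: false`.

```python
from typing import Any, Dict

# Legacy field -> new field, for fields that were only renamed.
RENAMED_FIELDS = {
    "enable_query_interpretation": "interpret_filters",
    "search_method": "retrieval_strategy",
    "recency_bias": "temporal_relevance",
    "enable_reranking": "rerank",
}


def migrate_request(legacy: Dict[str, Any]) -> Dict[str, Any]:
    """Translate a legacy search request body into the new schema (hypothetical helper)."""
    new: Dict[str, Any] = {}
    for key, value in legacy.items():
        if key == "response_type":
            # Enum -> boolean: "completion" asks for a generated answer.
            new["generate_answer"] = value == "completion"
        elif key == "expansion_strategy":
            # Enum -> boolean. Assumption: "auto" and "llm" both enable expansion.
            new["expand_query"] = value != "no_expansion"
        elif key == "score_threshold":
            # Removed from the new schema, so it is dropped.
            continue
        else:
            # Rename if needed; unchanged fields (query, limit, filter, ...) pass through.
            new[RENAMED_FIELDS.get(key, key)] = value
    return new
```

Applied to the legacy request above, this would yield `query`, `generate_answer`, `expand_query`, `rerank`, and `retrieval_strategy`.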

Step 2: Update Response Handling

response = await client.collections.search_collection(...)

# Old response structure
if response.status == "success":
    if response.response_type == "completion":
        print(response.completion)
    else:
        print(response.results)

Step 3: Remove Deprecated Fields

The new response no longer includes:

  • status field
  • response_type field

REST API Migration

GET Endpoint (Deprecated)

curl -X GET "https://api.airweave.ai/collections/{id}/search?query=test&response_type=completion" \
  -H "x-api-key: your-api-key"

POST Endpoint Schema

{
  "query": "test",
  "response_type": "completion",
  "expansion_strategy": "auto",
  "enable_reranking": true,
  "search_method": "hybrid"
}

Detecting Deprecation

When using the legacy API, you’ll receive HTTP headers indicating deprecation:

X-API-Deprecation: true
X-API-Deprecation-Message: ...
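These headers can be checked in client code to flag any call sites that still use the legacy API. The following is a small sketch that works on any mapping of response headers. The function name `deprecation_notice` is hypothetical, and the header names are the two listed above.

```python
from typing import Mapping, Optional


def deprecation_notice(headers: Mapping[str, str]) -> Optional[str]:
    """Return the deprecation message if the response came from the legacy API, else None."""
    # HTTP header names are case-insensitive, so normalize before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    if lowered.get("x-api-deprecation", "").strip().lower() != "true":
        return None
    # The message header may be missing; return an empty string in that case.
    return lowered.get("x-api-deprecation-message", "")
```

Logging the returned message during a migration makes it easier to find requests that still send legacy fields.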

Airweave lets you search across all your connected data sources through one unified interface. When you query a collection, Airweave runs a multi-step search pipeline that combines AI understanding with keyword precision. You can start with the defaults or configure each step for full control.

Want to try out our search right now? Head to our interactive API documentation where you can test search queries directly in your browser!

Quick Reference

Here are the default settings Airweave uses. You can override any of these in your queries.

Parameter            Default   Description
expand_query         true      Generate query variations for better recall
retrieval_strategy   hybrid    Combines AI semantic search with keyword matching
interpret_filters    false     Extract filters from natural language (you control manually by default)
rerank               true      LLM-based result reordering (adds ~10s latency)
temporal_relevance   0.3       Weight toward recent content (0.0-1.0)
generate_answer      true      Generate AI completion from results
limit                1000      Maximum results to return
offset               0         Results to skip for pagination
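To make these defaults visible in client code, a request body can be built from them with explicit overrides. This is a sketch only: `build_search_payload` is a hypothetical helper, the values are copied from the table above, and the API applies its own defaults whether or not you send them.

```python
from typing import Any, Dict

# Defaults copied from the Quick Reference table.
SEARCH_DEFAULTS: Dict[str, Any] = {
    "expand_query": True,
    "retrieval_strategy": "hybrid",
    "interpret_filters": False,
    "rerank": True,
    "temporal_relevance": 0.3,
    "generate_answer": True,
    "limit": 1000,
    "offset": 0,
}


def build_search_payload(query: str, **overrides: Any) -> Dict[str, Any]:
    """Merge caller overrides onto the documented defaults."""
    # Reject names outside the new schema (for example legacy `recency_bias`).
    unknown = set(overrides) - set(SEARCH_DEFAULTS) - {"filter"}
    if unknown:
        raise ValueError(f"Unknown search parameters: {sorted(unknown)}")
    return {"query": query, **SEARCH_DEFAULTS, **overrides}
```

For example, `build_search_payload("customer churn analysis", rerank=False)` keeps every default except reranking. Rejecting unknown names is a design choice that surfaces leftover legacy fields early.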

How Airweave search works

Each search runs through a multi-step pipeline. Knowing the stages explains why each parameter exists and when to use it:

  1. Query expansion: Generate variations of the user query to capture synonyms and related terms.
  2. Retrieval: Use keyword, neural, or hybrid methods to fetch candidate documents.
  3. Filtering: Apply structured metadata filters before or during retrieval.
  4. Temporal relevance: Optionally weight results toward fresher content.
  5. Reranking: Use AI to reorder the top results for higher precision.
  6. Answer generation: Return raw documents or synthesize a natural language response.

Defaults are designed to work out of the box, and you can override any stage as needed.

Parameters

Query Expansion

Expands your query to catch related terms and synonyms that may not appear verbatim in your documents. This improves recall when wording differs but meaning is the same.

Parameter: expand_query (boolean)

  • true (default): Generate query variations for better recall
  • false: Search only for your exact query

curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "customer churn analysis",
    "expand_query": true
  }'

Search Method

The retrieval strategy determines how Airweave matches your query against your data. Each strategy balances semantic understanding against keyword precision: you can search by meaning, by exact keywords, or by both.

Parameter: retrieval_strategy

  • hybrid (default): Best of both worlds - finds results by meaning AND exact keywords
  • neural: AI-powered search that understands what you mean, not just what you type
  • keyword: Traditional search that looks for exact word matches

curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "authentication flow security vulnerabilities",
    "retrieval_strategy": "hybrid"
  }'

Filtering Results

Applies structured filters before search, ensuring only relevant subsets are scanned. Useful for large datasets or when results must match specific attributes like source, date, or status.

Parameter: filter

Example 1: Filter by source

curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "deployment issues",
    "filter": {
      "must": [{
        "key": "source_name",
        "match": {"value": "GitHub"}
      }]
    }
  }'

Example 2: Multiple filters

curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "customer feedback",
    "filter": {
      "must": [
        {
          "key": "source_name",
          "match": {"any": ["Zendesk", "Intercom", "Slack"]}
        },
        {
          "key": "created_at",
          "range": {
            "gte": "2024-01-01T00:00:00Z"
          }
        }
      ]
    }
  }'

Example 3: Exclude results

curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "bug reports",
    "filter": {
      "must_not": [{
        "key": "status",
        "match": {"any": ["resolved", "closed", "done"]}
      }]
    }
  }'
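The three filter shapes above (exact match, match any, and range) are easy to mistype by hand. The sketch below shows a few small builders that produce the same JSON structures. All function names are hypothetical; only the `must`, `must_not`, `key`, `match`, and `range` keys come from the examples.

```python
from typing import Any, Dict, List, Optional

Condition = Dict[str, Any]


def match_value(key: str, value: str) -> Condition:
    """Condition that requires `key` to equal `value` (Example 1)."""
    return {"key": key, "match": {"value": value}}


def match_any(key: str, values: List[str]) -> Condition:
    """Condition that requires `key` to equal any of `values` (Examples 2 and 3)."""
    return {"key": key, "match": {"any": list(values)}}


def created_since(iso_timestamp: str, key: str = "created_at") -> Condition:
    """Range condition for items at or after an ISO 8601 timestamp (Example 2)."""
    return {"key": key, "range": {"gte": iso_timestamp}}


def build_filter(
    must: Optional[List[Condition]] = None,
    must_not: Optional[List[Condition]] = None,
) -> Dict[str, List[Condition]]:
    """Assemble the `filter` object, omitting empty clauses."""
    result: Dict[str, List[Condition]] = {}
    if must:
        result["must"] = list(must)
    if must_not:
        result["must_not"] = list(must_not)
    return result
```

For instance, `build_filter(must_not=[match_any("status", ["resolved", "closed", "done"])])` reproduces the filter from Example 3.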

Query Interpretation

This feature is currently in beta. It can occasionally filter too narrowly, so verify result counts.

Query interpretation allows Airweave to automatically extract structured filters from a natural language query. Instead of manually defining metadata filters, you can simply describe what you are looking for, and Airweave will translate that description into filter conditions.

This feature is useful when you want to let end users search in plain English, for example “open GitHub issues from last week” or “critical bugs reported this month”. Airweave analyzes the query, identifies entities like dates, sources, or statuses, and applies them as filters.

Parameter: interpret_filters (boolean)

  • false (default): You control all filters manually
  • true: AI extracts filters from your natural language query

curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "open asana tickets from last week",
    "interpret_filters": true
  }'

Temporal Relevance

Learn more about this topic in our blog post: Deep Dive on Temporal Relevance.

Temporal relevance adjusts the results ranking to prefer newer documents. This is valuable for time-sensitive data like messages, customer feedback, tickets, or news.

The scoring formula adjusts results based on age:

S_final = S_similarity × (1 − β + β × d(t))

where:

  • S_final = final relevance score
  • S_similarity = semantic similarity score
  • β = temporal_relevance value (0 to 1), called recency_bias in the legacy API
  • d(t) = time decay factor (0 = oldest, 1 = newest)
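To get a feel for the formula, it helps to plug in numbers. The sketch below evaluates it directly. It is not Airweave's implementation: the function name is hypothetical, and how d(t) is derived from document timestamps is not specified here, so the decay value is passed in as an input.

```python
def temporal_score(similarity: float, beta: float, decay: float) -> float:
    """Evaluate S_final = S_similarity * (1 - beta + beta * d(t))."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta (temporal_relevance) must be between 0 and 1")
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay d(t) must be between 0 (oldest) and 1 (newest)")
    return similarity * (1.0 - beta + beta * decay)


# With the default beta of 0.3 and a similarity of 0.8:
oldest = temporal_score(0.8, beta=0.3, decay=0.0)  # scaled by 0.7
newest = temporal_score(0.8, beta=0.3, decay=1.0)  # similarity is unchanged
```

With β = 0.3, the oldest document keeps 70% of its similarity score (about 0.56 here) while the newest keeps all of it. With β = 0 the age has no effect, and with β = 1 the oldest document scores 0.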

Parameter: temporal_relevance (0.0 to 1.0)

  • 0.3 (default): Slightly prefer newer content
  • 0.0: Don’t care about dates, just find the best matches
  • 1.0: Heavily prioritize the newest content

curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "project updates",
    "temporal_relevance": 0.7
  }'

Use this when freshness matters, for example to rank the latest bug reports or recent customer complaints above historical ones.

Pagination

Control how many results you get and navigate through large result sets.

Parameters:

  • limit: How many results to return (1-1000, default: 20)
  • offset: How many results to skip (for pagination, default: 0)

# Simple search with pagination
curl -X GET 'https://api.airweave.ai/collections/your-collection-id/search?query=data%20retention%20policies&limit=50&offset=50' \
  -H 'x-api-key: YOUR_API_KEY'
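Walking through a large result set means repeating the request with an increasing `offset`. The sketch below shows that loop in isolation. The `fetch_page` callable is a hypothetical stand-in for whatever function performs the actual search call, and stopping on a short page is an assumption, since the response format for signaling the last page is not described here.

```python
from typing import Any, Callable, Dict, Iterator, List

# fetch_page(limit, offset) -> list of results for that page (hypothetical signature).
FetchPage = Callable[[int, int], List[Dict[str, Any]]]


def iter_results(fetch_page: FetchPage, page_size: int = 50) -> Iterator[Dict[str, Any]]:
    """Yield every result by advancing `offset` in steps of `page_size`."""
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        # Assumption: a page shorter than `page_size` means there is nothing left.
        if len(page) < page_size:
            return
        offset += page_size
```

Because it is a generator, callers can stop early (for example after the first 200 results) without fetching the remaining pages.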

AI Reranking

AI reranking takes the top set of results from the initial search and reorders them using a large language model. This improves accuracy in cases where keyword or semantic similarity alone might be misleading.

Parameter: rerank (boolean)

  • true (default): AI reviews and reorders results for best relevance
  • false: Skip reranking for faster results

Reranking adds about 10 seconds to your search. Turn it off if you need fast results.

curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "user authentication methods",
    "rerank": false
  }'

Generate AI Answers

Airweave can return either raw results or a synthesized answer. When enabled, a large language model generates a natural language response based on the top results, including sources when available.

Parameter: generate_answer (boolean)

  • true (default): Generate an AI-synthesized answer from the top search results
  • false: Return only raw results

curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "What are our customer refund policies?",
    "generate_answer": true
  }'

Complete example

Here’s everything together in one search using the new API:

curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "customer feedback about pricing",
    "expand_query": true,
    "retrieval_strategy": "hybrid",
    "filter": {
      "must": [{
        "key": "source_name",
        "match": {"any": ["Zendesk", "Slack"]}
      }]
    },
    "temporal_relevance": 0.5,
    "rerank": true,
    "generate_answer": false,
    "limit": 50,
    "offset": 0
  }'
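The same request can be prepared from Python without the SDK. The sketch below uses only the standard library and stops at building the request object. The function name is hypothetical; the URL pattern and headers are taken from the curl command above.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "https://api.airweave.ai"


def build_search_request(
    collection_id: str, api_key: str, payload: Dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a POST request to the collection search endpoint."""
    return urllib.request.Request(
        url=f"{BASE_URL}/collections/{collection_id}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen` and decode the body with `json.load`. Since reranking is documented to add roughly 10 seconds, a generous `timeout` argument is advisable.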

Legacy API Example

If you’re still using the legacy API, here’s how the same query looks (deprecated):

curl -X POST 'https://api.airweave.ai/collections/your-collection-id/search' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "customer feedback about pricing",
    "expansion_strategy": "auto",
    "search_method": "hybrid",
    "filter": {
      "must": [{
        "key": "source_name",
        "match": {"any": ["Zendesk", "Slack"]}
      }]
    },
    "recency_bias": 0.5,
    "enable_reranking": true,
    "response_type": "raw",
    "limit": 50,
    "offset": 0
  }'
# Response will include X-API-Deprecation header

Ready to search?

Try these examples live in our interactive API documentation. You can execute real searches and see responses instantly!