Search

POST /collections/{id}/search/instant

Direct vector search. Use when speed is critical (about 0.5 seconds).

The only parameter unique to instant is retrieval_strategy, which controls how the vector database matches your query:

  • hybrid (default) — Combines semantic and keyword search via Reciprocal Rank Fusion. Best for most queries.
  • semantic — Dense vector cosine similarity. Finds conceptually similar content even when wording differs.
  • keyword — BM25 text matching. Only returns content with your exact terms. Use for error codes, identifiers, or known phrases.
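To make the hybrid strategy concrete, here is a minimal sketch of Reciprocal Rank Fusion in its textbook form. It is not Airweave's implementation: the function name, the sample document IDs, and the constant k=60 (a conventional choice) are all assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each document scores 1 / (k + rank) per list it appears in,
    with ranks starting at 1. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the semantic and keyword retrievers.
semantic_hits = ["doc-a", "doc-b", "doc-c"]
keyword_hits = ["doc-b", "doc-d", "doc-a"]
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why hybrid tends to be a safe default when it is unclear whether a query is conceptual or exact-match.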

In classic and agentic search, the retrieval strategy is chosen automatically.

POST /collections/{id}/search/classic

AI-optimized search strategy. Sensible default for most use cases (about 2 seconds).

An LLM analyzes your query and generates an optimized search strategy.

POST /collections/{id}/search/agentic

Agent that navigates through your collection to find the best results. Use when recall matters more than latency (under 2 minutes).

An AI agent iteratively searches your data using tool calling. It searches with multiple strategies, reads full documents, navigates entity hierarchies (parent/child/sibling), and builds a comprehensive result set.

Two parameters unique to agentic:

  • thinking — Enables extended chain-of-thought reasoning before tool calls. Better search strategies, but slower and uses more tokens. Useful for complex or ambiguous queries.
  • limit — Unlike instant/classic where the vector database always returns up to limit results, the agent collects results based on relevance. It may return fewer if it decides there aren’t enough matches. Setting a limit caps the maximum — if the agent collects more, results are truncated. When null (default), there is no cap.
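The sketch below shows one way to assemble a request path and body per tier. The endpoint paths come from this page; the body field names (query, filter, limit, thinking, retrieval_strategy) are assumed from the parameter names and the echoed fields in the streaming started event, and build_search_request is a hypothetical helper, not part of any SDK.

```python
TIERS = {"instant", "classic", "agentic"}


def build_search_request(collection_id, tier, query, **options):
    """Return (path, body) for a search call. Hypothetical helper."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")

    # filter and limit are accepted by every tier.
    allowed = {"filter", "limit"}
    if tier == "instant":
        allowed.add("retrieval_strategy")  # hybrid | semantic | keyword
    if tier == "agentic":
        allowed.add("thinking")

    unknown = set(options) - allowed
    if unknown:
        raise ValueError(f"{sorted(unknown)} not supported by {tier} search")

    body = {"query": query, **options}
    return f"/collections/{collection_id}/search/{tier}", body


path, body = build_search_request(
    "my-collection",
    "agentic",
    "What authentication methods do we support?",
    thinking=True,
    limit=None,  # no cap: the agent decides how many results to return
)
print(path, body)
```

Rejecting tier-specific options early keeps a typo such as passing thinking to instant search from silently being ignored.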

Streaming

POST /collections/{id}/search/agentic/stream

Real-time SSE events as the agent works. Events are delivered as data: {json}\n\n messages. The stream terminates after a done or error event.
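A minimal sketch of consuming this stream, assuming each event's JSON arrives on a single data: line as described above. The function works on any iterable of text lines, so it can be fed from whichever HTTP client you use; the sample lines are made up for illustration.

```python
import json


def iter_sse_events(lines):
    """Yield decoded JSON events from an iterable of SSE text lines.

    Stops after the first done or error event, mirroring the stream's
    termination rule.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # blank separator lines, comments, other SSE fields
        event = json.loads(line[len("data:"):].strip())
        yield event
        if event.get("type") in ("done", "error"):
            return


# Hypothetical sample stream; the last line is never reached.
sample = [
    'data: {"type": "started", "tier": "agentic"}\n',
    "\n",
    'data: {"type": "done", "results": []}\n',
    "\n",
    'data: {"type": "thinking"}\n',
]
events = list(iter_sse_events(sample))
print([event["type"] for event in events])
```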

Emitted once when the search begins.

{
  "type": "started",
  "request_id": "req-abc123",
  "tier": "agentic",
  "collection_readable_id": "my-collection",
  "query": "What authentication methods do we support?",
  "thinking": true,
  "filter": null,
  "limit": null
}

Emitted once per iteration after the LLM responds. The thinking field contains extended reasoning (when enabled); the text field contains conversational output produced before tool calls.

{
  "type": "thinking",
  "thinking": "The user is asking about authentication methods. I should search for docs about auth, SSO, API keys...",
  "text": "Searching for authentication documentation...",
  "duration_ms": 2340,
  "diagnostics": {
    "iteration": 0,
    "prompt_tokens": 4521,
    "completion_tokens": 892
  }
}

Emitted after each tool the agent calls. diagnostics.arguments holds the full tool input and diagnostics.stats holds the output. The shape of stats depends on which tool was called:

{
  "type": "tool_call",
  "tool_name": "read",
  "duration_ms": 42,
  "diagnostics": {
    "iteration": 1,
    "tool_call_id": "call_r1",
    "arguments": {
      "entity_ids": ["page-auth", "page-oauth", "page-sso"]
    },
    "stats": {
      "found": 3,
      "not_found": 0,
      "entities": [
        {
          "entity_id": "page-auth",
          "name": "Authentication Overview",
          "entity_type": "NotionPageEntity",
          "source_name": "notion"
        }
      ],
      "context_label": "3 entities, chunks 0-2"
    }
  }
}

{
  "type": "tool_call",
  "tool_name": "add_to_results",
  "duration_ms": 1,
  "diagnostics": {
    "iteration": 1,
    "tool_call_id": "call_a1",
    "arguments": {
      "entity_ids": ["page-auth", "page-oauth"]
    },
    "stats": {
      "added": 2,
      "already_collected": 0,
      "not_found": 0,
      "total_collected": 2
    }
  }
}

{
  "type": "tool_call",
  "tool_name": "remove_from_results",
  "duration_ms": 1,
  "diagnostics": {
    "iteration": 3,
    "tool_call_id": "call_rm1",
    "arguments": {
      "entity_ids": ["page-unrelated"]
    },
    "stats": {
      "added": 0,
      "already_collected": 0,
      "not_found": 0,
      "total_collected": 5
    }
  }
}

{
  "type": "tool_call",
  "tool_name": "count",
  "duration_ms": 23,
  "diagnostics": {
    "iteration": 0,
    "tool_call_id": "call_c1",
    "arguments": {
      "filter_groups": [
        {
          "conditions": [
            {
              "field": "airweave_system_metadata.source_name",
              "operator": "equals",
              "value": "github"
            }
          ]
        }
      ]
    },
    "stats": {
      "count": 312
    }
  }
}

{
  "type": "tool_call",
  "tool_name": "get_children",
  "duration_ms": 67,
  "diagnostics": {
    "iteration": 2,
    "tool_call_id": "call_gc1",
    "arguments": {
      "entity_id": "chan-engineering",
      "limit": 50
    },
    "stats": {
      "result_count": 34,
      "context_label": "children of chan-engineering",
      "first_results": [
        {
          "entity_id": "msg-001",
          "name": "Auth migration plan",
          "entity_type": "SlackMessageEntity",
          "source_name": "slack"
        }
      ]
    }
  }
}

{
  "type": "tool_call",
  "tool_name": "get_siblings",
  "duration_ms": 55,
  "diagnostics": {
    "iteration": 2,
    "tool_call_id": "call_gs1",
    "arguments": {
      "entity_id": "page-oauth",
      "limit": 50
    },
    "stats": {
      "result_count": 8,
      "context_label": "siblings of page-oauth under db-docs",
      "first_results": [
        {
          "entity_id": "page-api-keys",
          "name": "API Key Management",
          "entity_type": "NotionPageEntity",
          "source_name": "notion"
        }
      ]
    }
  }
}

{
  "type": "tool_call",
  "tool_name": "get_parent",
  "duration_ms": 18,
  "diagnostics": {
    "iteration": 2,
    "tool_call_id": "call_gp1",
    "arguments": {
      "entity_id": "page-oauth"
    },
    "stats": {
      "result_count": 1,
      "context_label": "parent of page-oauth",
      "first_results": [
        {
          "entity_id": "db-docs",
          "name": "Documentation",
          "entity_type": "NotionDatabaseEntity",
          "source_name": "notion"
        }
      ]
    }
  }
}

{
  "type": "tool_call",
  "tool_name": "review_results",
  "duration_ms": 2,
  "diagnostics": {
    "iteration": 4,
    "tool_call_id": "call_rr1",
    "arguments": {},
    "stats": {
      "total_collected": 12,
      "entity_count": 12,
      "first_results": [
        {
          "entity_id": "page-auth",
          "name": "Authentication Overview",
          "entity_type": "NotionPageEntity",
          "source_name": "notion",
          "relevance_score": 0.97
        }
      ]
    }
  }
}

{
  "type": "tool_call",
  "tool_name": "return_results_to_user",
  "duration_ms": 1,
  "diagnostics": {
    "iteration": 5,
    "tool_call_id": "call_ret1",
    "arguments": {},
    "stats": {
      "accepted": true,
      "total_collected": 12,
      "warning": null
    }
  }
}

Emitted after the agent’s collected results are reranked for final ordering.

{
  "type": "reranking",
  "duration_ms": 890,
  "diagnostics": {
    "input_count": 12,
    "output_count": 12,
    "model": "cohere/rerank-v4.0-pro",
    "top_relevance_score": 0.98,
    "bottom_relevance_score": 0.41,
    "first_results": [
      {
        "entity_id": "page-auth",
        "name": "Authentication Overview",
        "entity_type": "NotionPageEntity",
        "source_name": "notion",
        "relevance_score": 0.98
      }
    ]
  }
}

Final event. Contains the full result set and run diagnostics.

{
  "type": "done",
  "results": ["..."],
  "duration_ms": 34521,
  "diagnostics": {
    "total_iterations": 6,
    "all_seen_entity_ids": ["page-auth", "page-oauth", "page-sso", "..."],
    "all_read_entity_ids": ["page-auth", "page-oauth", "page-sso"],
    "all_collected_entity_ids": ["page-auth", "page-oauth", "pr-auth-456"],
    "max_iterations_hit": false,
    "total_llm_retries": 0,
    "stagnation_nudges_sent": 0,
    "prompt_tokens": 28450,
    "completion_tokens": 5230,
    "cache_creation_input_tokens": 12000,
    "cache_read_input_tokens": 8500
  }
}

Emitted when the search fails. Also terminates the stream.

{
  "type": "error",
  "message": "Context window too full for useful work after emergency compression",
  "duration_ms": 15230
}

Filters

Filters constrain search results by metadata. They work across all three tiers.

In classic and agentic search, the AI also generates its own filters internally. Your filters are AND'd into every search it performs, so they act as constraints that cannot be bypassed.

Structure

Filters use a two-level structure:

  • Conditions within a group are combined with AND
  • Multiple groups are combined with OR

This allows expressions like: (A AND B) OR (C AND D)

{
  "filter": [
    {
      "conditions": [
        { "field": "airweave_system_metadata.source_name", "operator": "equals", "value": "slack" },
        { "field": "airweave_system_metadata.entity_type", "operator": "equals", "value": "SlackMessageEntity" }
      ]
    }
  ]
}
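The two-level semantics can be sketched as a small evaluator: all() inside a group, any() across groups. This is only an illustration of the logic, not how the backend applies filters; the entity is assumed to be a nested dict, and only the equals and in operators are modelled.

```python
def get_field(entity, dotted_path):
    """Resolve a dotted field name such as 'a.b' against nested dicts."""
    value = entity
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


# Only two operators are modelled in this sketch.
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "in": lambda actual, expected: actual in expected,
}


def matches(entity, filter_groups):
    """True if any group matches; a group matches if all its conditions do."""
    def condition_ok(cond):
        compare = OPERATORS[cond["operator"]]
        return compare(get_field(entity, cond["field"]), cond["value"])

    return any(
        all(condition_ok(cond) for cond in group["conditions"])
        for group in filter_groups
    )


slack_messages_only = [
    {
        "conditions": [
            {"field": "airweave_system_metadata.source_name", "operator": "equals", "value": "slack"},
            {"field": "airweave_system_metadata.entity_type", "operator": "equals", "value": "SlackMessageEntity"},
        ]
    }
]
# Hypothetical entity shaped like the filterable fields.
entity = {
    "name": "Auth migration plan",
    "airweave_system_metadata": {"source_name": "slack", "entity_type": "SlackMessageEntity"},
}
print(matches(entity, slack_messages_only))
```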

Filterable Fields

| Field | Type | Description |
|---|---|---|
| entity_id | text | Entity identifier |
| name | text | Entity display name |
| created_at | date | Creation timestamp |
| updated_at | date | Last update timestamp |
| breadcrumbs.entity_id | text | Parent entity ID in the hierarchy |
| breadcrumbs.name | text | Parent entity name |
| breadcrumbs.entity_type | text | Parent entity type |
| airweave_system_metadata.entity_type | text | Entity type (e.g., SlackMessageEntity) |
| airweave_system_metadata.source_name | text | Source name (e.g., slack, notion) |
| airweave_system_metadata.original_entity_id | text | Original entity ID (same across chunks) |
| airweave_system_metadata.chunk_index | numeric | Chunk index for chunked documents |
| airweave_system_metadata.sync_id | text | Sync ID |
| airweave_system_metadata.sync_job_id | text | Sync job ID |

Operators

| Operator | Works on | Description |
|---|---|---|
| equals | all | Exact match |
| not_equals | all | Not equal |
| contains | text only | Substring match |
| greater_than | date, numeric | > |
| less_than | date, numeric | < |
| greater_than_or_equal | date, numeric | >= |
| less_than_or_equal | date, numeric | <= |
| in | all | Matches any value in the list |
| not_in | all | Matches none of the values |

Examples

Filter by source:

{
  "filter": [
    {
      "conditions": [
        { "field": "airweave_system_metadata.source_name", "operator": "equals", "value": "github" }
      ]
    }
  ]
}

Filter by time range (ISO 8601 timestamps required):

{
  "filter": [
    {
      "conditions": [
        { "field": "created_at", "operator": "greater_than_or_equal", "value": "2025-01-01T00:00:00Z" },
        { "field": "created_at", "operator": "less_than", "value": "2025-02-01T00:00:00Z" }
      ]
    }
  ]
}
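When the range comes from application code, the timestamps can be generated rather than typed by hand. The sketch below builds the same half-open range (start inclusive, end exclusive) from Python datetime objects; iso_utc and created_between are hypothetical helper names.

```python
from datetime import datetime, timezone


def iso_utc(dt):
    """Format a timezone-aware datetime as an ISO 8601 UTC timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def created_between(start, end):
    """Filter groups matching start <= created_at < end."""
    return [
        {
            "conditions": [
                {"field": "created_at", "operator": "greater_than_or_equal", "value": iso_utc(start)},
                {"field": "created_at", "operator": "less_than", "value": iso_utc(end)},
            ]
        }
    ]


january = created_between(
    datetime(2025, 1, 1, tzinfo=timezone.utc),
    datetime(2025, 2, 1, tzinfo=timezone.utc),
)
print(january)
```

Using less_than on the first instant of the next month avoids having to know how many days the month has.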

Filter by multiple sources (using in):

{
  "filter": [
    {
      "conditions": [
        { "field": "airweave_system_metadata.source_name", "operator": "in", "value": ["slack", "notion", "github"] }
      ]
    }
  ]
}

Combine groups with OR — Slack messages OR Notion pages:

{
  "filter": [
    {
      "conditions": [
        { "field": "airweave_system_metadata.source_name", "operator": "equals", "value": "slack" },
        { "field": "airweave_system_metadata.entity_type", "operator": "equals", "value": "SlackMessageEntity" }
      ]
    },
    {
      "conditions": [
        { "field": "airweave_system_metadata.source_name", "operator": "equals", "value": "notion" },
        { "field": "airweave_system_metadata.entity_type", "operator": "equals", "value": "NotionPageEntity" }
      ]
    }
  ]
}

Navigate hierarchy — find all entities inside a parent:

{
  "filter": [
    {
      "conditions": [
        { "field": "breadcrumbs.entity_id", "operator": "equals", "value": "parent-entity-id-here" }
      ]
    }
  ]
}

Validation Rules

  • Date fields (created_at, updated_at) require ISO 8601 timestamps (e.g., 2025-01-15T00:00:00Z)
  • Ordering operators (greater_than, less_than, etc.) only work on date and numeric fields
  • contains only works on text fields
  • in and not_in require array values
  • Scalar operators (equals, contains, etc.) require a single value, not an array
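These rules can be checked client-side before a request is sent. The following is a rough sketch, not the server's validator: the error wording is invented, and any field other than the date and numeric ones listed on this page is treated as text.

```python
from datetime import datetime

DATE_FIELDS = {"created_at", "updated_at"}
NUMERIC_FIELDS = {"airweave_system_metadata.chunk_index"}
ORDERING_OPS = {"greater_than", "less_than", "greater_than_or_equal", "less_than_or_equal"}
LIST_OPS = {"in", "not_in"}


def _is_iso8601(value):
    if not isinstance(value, str):
        return False
    try:
        # fromisoformat() accepts a trailing "Z" only from Python 3.11 on.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def validate_condition(cond):
    """Return a list of rule violations (empty when the condition is valid)."""
    field, op, value = cond["field"], cond["operator"], cond["value"]
    is_date = field in DATE_FIELDS
    is_text = not is_date and field not in NUMERIC_FIELDS
    errors = []
    if op in ORDERING_OPS and is_text:
        errors.append(f"{op} only works on date and numeric fields")
    if op == "contains" and not is_text:
        errors.append("contains only works on text fields")
    if op in LIST_OPS and not isinstance(value, list):
        errors.append(f"{op} requires an array value")
    if op not in LIST_OPS and isinstance(value, list):
        errors.append(f"{op} requires a single value, not an array")
    if is_date:
        values = value if isinstance(value, list) else [value]
        if not all(_is_iso8601(v) for v in values):
            errors.append(f"{field} requires ISO 8601 timestamps")
    return errors


print(validate_condition({"field": "name", "operator": "less_than", "value": "x"}))
```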

Response Format

All three tiers return the same SearchV2Response with a results array. See the API Reference for the full response schema and interactive examples.

Configuring the LLM provider chain

Self-hosted only

This section is only relevant to self-hosted deployments. The managed service ships with providers configured.

Classic and Agentic search call an LLM; Instant search does not. A backend with no LLM configured still answers Instant queries, while Classic and Agentic requests return HTTP 503 until an API key is set.

Default chain

Out of the box, Airweave tries providers in this order:

  1. together:zai-glm-5
  2. anthropic:claude-sonnet-4.6

The first provider with an API key set that responds successfully handles the request. Subsequent entries are tried only on failure.

Setting API keys

Set at least one of the following environment variables on the backend:

| Env var | Provider |
|---|---|
| TOGETHER_API_KEY | Together |
| ANTHROPIC_API_KEY | Anthropic |
| MISTRAL_API_KEY | Mistral |
| GROQ_API_KEY | Groq |
| CEREBRAS_API_KEY | Cerebras |

If none are set, the backend boots normally; Classic/Agentic search return 503 Service Unavailable with a message listing these variables.

Overriding the chain

Set LLM_FALLBACK_CHAIN to a comma-separated list of provider:model pairs. Example:

LLM_FALLBACK_CHAIN=cerebras:gpt-oss-120b,anthropic:claude-sonnet-4.6

Supported providers: cerebras, groq, anthropic, together, mistral. The full list of models per provider lives in backend/airweave/adapters/llm/registry.py.

The parser validates three things at startup:

  • Every provider is a known provider.
  • Every model is a known model.
  • Every (provider, model) combination exists in the registry (e.g. together:mistral-large is rejected because mistral-large is hosted on Mistral, not Together).

Misconfiguration is caught at startup with an error that lists the accepted values.
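The three startup checks can be sketched as follows. This is not the code in registry.py: REGISTRY here is a tiny stand-in containing only the provider and model names mentioned on this page (Groq is left empty because no Groq model is named here), and the error messages are invented.

```python
# Illustrative stand-in for the real registry; contents are assumptions.
REGISTRY = {
    "together": {"zai-glm-5"},
    "anthropic": {"claude-sonnet-4.6"},
    "cerebras": {"gpt-oss-120b"},
    "mistral": {"mistral-large"},
    "groq": set(),
}


def parse_fallback_chain(raw):
    """Parse 'provider:model,provider:model' into a list of pairs."""
    known_models = set().union(*REGISTRY.values())
    chain = []
    for entry in raw.split(","):
        provider, sep, model = entry.strip().partition(":")
        if not sep:
            raise ValueError(f"expected provider:model, got {entry!r}")
        if provider not in REGISTRY:
            raise ValueError(f"unknown provider {provider!r}; accepted: {sorted(REGISTRY)}")
        if model not in known_models:
            raise ValueError(f"unknown model {model!r}; accepted: {sorted(known_models)}")
        if model not in REGISTRY[provider]:
            raise ValueError(f"{model!r} is not available on {provider!r}")
        chain.append((provider, model))
    return chain


chain = parse_fallback_chain("cerebras:gpt-oss-120b,anthropic:claude-sonnet-4.6")
print(chain)
```

Checking the provider and the model separately before checking the pair lets the error say which part of the entry is wrong.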

Fallback semantics

  • Providers without an API key are silently skipped when the chain is built.
  • Providers whose initialization raises are logged and skipped.
  • If the resulting chain is empty, the backend wires a null LLM — instant search still works; Classic/Agentic return 503.
  • When a call fails in a chained provider, the next one is tried; a circuit breaker temporarily removes providers that recently failed.
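The fallback behaviour listed above can be sketched as a small class. Everything here is an assumption made for illustration: the FallbackChain name, the fixed 30-second cooldown standing in for the circuit breaker, and providers modelled as plain callables.

```python
import time


class NoProviderAvailable(RuntimeError):
    """Raised when no provider can serve the call (the backend would return 503)."""


class FallbackChain:
    def __init__(self, providers, api_keys, cooldown_s=30.0, clock=time.monotonic):
        # Providers without an API key are skipped when the chain is built.
        self.providers = [(name, call) for name, call in providers if api_keys.get(name)]
        self.cooldown_s = cooldown_s  # assumed value, not documented
        self.clock = clock
        self.tripped_until = {}

    def complete(self, prompt):
        if not self.providers:
            raise NoProviderAvailable("no LLM configured")
        for name, call in self.providers:
            if self.clock() < self.tripped_until.get(name, float("-inf")):
                continue  # circuit open: provider failed recently
            try:
                return call(prompt)
            except Exception:
                # Trip the breaker, then fall through to the next provider.
                self.tripped_until[name] = self.clock() + self.cooldown_s
        raise NoProviderAvailable("all providers failed or are cooling down")


calls = []


def failing(prompt):
    calls.append("together")
    raise RuntimeError("simulated outage")


def working(prompt):
    calls.append("anthropic")
    return "ok: " + prompt


demo = FallbackChain(
    [("together", failing), ("mistral", working), ("anthropic", working)],
    api_keys={"together": "key", "anthropic": "key"},  # no Mistral key set
    clock=lambda: 0.0,  # frozen clock keeps the demo deterministic
)
first = demo.complete("hello")
second = demo.complete("again")
print(first, second, calls)
```

On the second call the first provider is skipped entirely because its breaker is still open, which is why it appears only once in calls.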