Add New Connector

Create custom connectors for any unsupported data source.

1. Fork the repository

Fork the repository on GitHub, then clone your fork and start the development environment.

git clone https://github.com/YOUR_USERNAME/airweave.git
cd airweave
./start.sh

2. Create authentication schema

Define what credentials your source needs in backend/airweave/platform/configs/auth.py.

class GhibliAuthConfig(AuthConfig):
    """Studio Ghibli authentication credentials schema."""

    # No authentication needed for this public API
    pass
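
For a source that does require credentials, the schema would declare them as fields. The following is a minimal sketch, assuming AuthConfig behaves like a pydantic model; the AuthConfig stand-in, ExampleApiKeyAuthConfig, and has_valid_credentials are hypothetical names used only for illustration and are not part of Airweave.

```python
from pydantic import BaseModel, Field, ValidationError


class AuthConfig(BaseModel):
    """Stand-in for Airweave's AuthConfig base (assumed to be a pydantic model)."""


class ExampleApiKeyAuthConfig(AuthConfig):
    """Hypothetical credentials schema for a source that requires an API key."""

    api_key: str = Field(
        ...,  # required: there is no sensible default for a secret
        title="API Key",
        description="Key used to authenticate against the source's API",
    )


def has_valid_credentials(raw: dict) -> bool:
    """Return True if `raw` satisfies the schema, False otherwise."""
    try:
        ExampleApiKeyAuthConfig(**raw)
    except ValidationError:
        return False
    return True
```

Declaring the key as a required field means a missing credential is rejected when the schema is validated, rather than surfacing later as a failed API request.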

3. Create source configuration

Define optional configuration options in backend/airweave/platform/configs/config.py.

class GhibliConfig(SourceConfig):
    """Studio Ghibli configuration schema."""

    include_rt_scores: bool = Field(
        default=True,
        title="Include RT Scores",
        description="Whether to include Rotten Tomatoes scores in the metadata",
    )

4. Define entity schemas

Create entity schemas in backend/airweave/platform/entities/your_source.py (ghibli.py in this example) that define the structure of the data in the source.

from datetime import datetime
from typing import List, Optional

from pydantic import Field

from airweave.platform.entities._base import ChunkEntity


class GhibliFilmEntity(ChunkEntity):
    """Schema for a Studio Ghibli film."""

    film_id: str = Field(..., description="Unique ID of the film")
    title: str = Field(..., description="Title of the film")
    original_title: str = Field(..., description="Original Japanese title")
    director: str = Field(..., description="Director of the film")
    producer: str = Field(..., description="Producer of the film")
    release_date: str = Field(..., description="Release date")
    running_time: str = Field(..., description="Running time in minutes")
    rt_score: Optional[str] = Field(None, description="Rotten Tomatoes score")
    people: List[str] = Field(default_factory=list, description="Characters in the film")
    species: List[str] = Field(default_factory=list, description="Species in the film")
    locations: List[str] = Field(default_factory=list, description="Locations in the film")
    vehicles: List[str] = Field(default_factory=list, description="Vehicles in the film")

Key points about entities:

  • Inherit from ChunkEntity for searchable content (documents, posts, issues)
  • Inherit from FileEntity for downloadable files (PDFs, images, attachments)
  • Use Field(...) for required fields, Field(default=...) for optional ones
  • Add source-specific fields that are relevant for search and metadata
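
The difference between required and optional fields can be seen in a small sketch. It assumes the entity base classes behave like pydantic models; the ChunkEntity stand-in below (with only entity_id and content) and ExampleIssueEntity are hypothetical and exist only to make the example self-contained.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class ChunkEntity(BaseModel):
    """Stand-in for Airweave's ChunkEntity; only the fields this guide relies on."""

    entity_id: str
    content: str


class ExampleIssueEntity(ChunkEntity):
    """Hypothetical entity for an issue-tracker source."""

    # Required: constructing the entity without a title fails validation.
    title: str = Field(..., description="Issue title")
    # Optional: defaults to None when the source omits it.
    assignee: Optional[str] = Field(None, description="Assigned user, if any")
    # Optional list: default_factory gives each instance its own empty list.
    labels: List[str] = Field(default_factory=list, description="Issue labels")


issue = ExampleIssueEntity(entity_id="1", content="Crash on startup", title="Crash")

try:
    ExampleIssueEntity(entity_id="2", content="No title given")
    missing_title_rejected = False
except ValidationError:
    missing_title_rejected = True
```

Using default_factory=list instead of a literal default keeps the intent explicit: every entity starts with a fresh list rather than sharing one.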

5. Implement source

Create your source connector in backend/airweave/platform/sources/your_source.py.

from typing import AsyncGenerator, Optional, Dict, Any
import httpx

from airweave.platform.entities.ghibli import GhibliFilmEntity
from airweave.platform.decorators import source
from airweave.platform.sources._base import BaseSource
from airweave.platform.auth.schemas import AuthType

@source(
    name="Studio Ghibli",
    short_name="ghibli",
    auth_type=AuthType.none,
    auth_config_class="GhibliAuthConfig",
    config_class="GhibliConfig",
    labels=["Entertainment", "API"]
)
class GhibliSource(BaseSource):
    """Studio Ghibli source implementation."""

    @classmethod
    async def create(
        cls,
        credentials=None,
        config: Optional[Dict[str, Any]] = None
    ) -> "GhibliSource":
        """Create a new Ghibli source instance."""
        instance = cls()
        instance.config = config or {}
        return instance

    async def generate_entities(self) -> AsyncGenerator[GhibliFilmEntity, None]:
        """Generate entities from the Ghibli API."""
        async with httpx.AsyncClient() as client:
            response = await client.get("https://ghibli.rest/films")
            response.raise_for_status()
            films = response.json()

            for film in films:
                yield GhibliFilmEntity(
                    entity_id=film["id"],
                    film_id=film["id"],
                    title=film["title"],
                    original_title=film["original_title"],
                    content=film["description"],  # Required by ChunkEntity
                    director=film["director"],
                    producer=film["producer"],
                    release_date=film["release_date"],
                    running_time=film["running_time"],
                    rt_score=film["rt_score"] if self.config.get("include_rt_scores", True) else None,
                    people=film.get("people", []),
                    species=film.get("species", []),
                    locations=film.get("locations", []),
                    vehicles=film.get("vehicles", [])
                )

Key implementation points:

  • Import your custom entity classes
  • Use the @source() decorator with your auth and config classes
  • Implement create() classmethod that handles credentials and config
  • Implement generate_entities() that yields your custom entity objects
  • Handle authentication based on your auth type
  • Use config options to customize behavior
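
One way to keep config handling easy to check is to move the record-to-entity mapping into a plain function that needs no network access. The sketch below is an optional refactoring, not how Airweave requires sources to be written; film_to_entity_fields is a hypothetical helper name, and it reads rt_score with .get so that a missing score yields None, which differs slightly from the code above.

```python
from typing import Any, Dict


def film_to_entity_fields(film: Dict[str, Any], config: Dict[str, Any]) -> Dict[str, Any]:
    """Map one raw film record to keyword arguments for GhibliFilmEntity."""
    # The option defaults to True, matching the default in GhibliConfig.
    include_rt = config.get("include_rt_scores", True)
    return {
        "entity_id": film["id"],
        "film_id": film["id"],
        "title": film["title"],
        "original_title": film["original_title"],
        "content": film["description"],
        "director": film["director"],
        "producer": film["producer"],
        "release_date": film["release_date"],
        "running_time": film["running_time"],
        # Drop the score entirely when the option is switched off.
        "rt_score": film.get("rt_score") if include_rt else None,
        "people": film.get("people", []),
        "species": film.get("species", []),
        "locations": film.get("locations", []),
        "vehicles": film.get("vehicles", []),
    }
```

Inside generate_entities() the loop body would then reduce to yield GhibliFilmEntity(**film_to_entity_fields(film, self.config)), and the mapping can be exercised with a hand-written dictionary.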

6. Test your source

Verify everything works by running Airweave and creating a test connection.

  1. Start Airweave: Your connector appears automatically in the dashboard
  2. Create a collection and add your new source
  3. Test the connection and verify data syncs correctly
  4. Search your data to confirm everything works end-to-end
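
Before going through the dashboard, a quick local check can catch obvious mistakes in generate_entities(). The sketch below assumes only that a source exposes an async generate_entities() generator as in step 5; FakeSource and collect_entities are hypothetical names, and FakeSource stands in for a real source so the example runs without Airweave or network access.

```python
import asyncio
from typing import Any, AsyncGenerator, Dict, List


class FakeSource:
    """Hypothetical stand-in with the same generate_entities() shape as a source."""

    async def generate_entities(self) -> AsyncGenerator[Dict[str, Any], None]:
        for i in range(3):
            yield {"entity_id": str(i), "title": f"Item {i}"}


async def collect_entities(source: Any, limit: int = 10) -> List[Any]:
    """Drain up to `limit` entities from source.generate_entities()."""
    collected: List[Any] = []
    async for entity in source.generate_entities():
        collected.append(entity)
        if len(collected) >= limit:
            break  # stop early so a large source is not fully synced
    return collected


if __name__ == "__main__":
    for entity in asyncio.run(collect_entities(FakeSource())):
        print(entity)
```

To try this against the connector from step 5, FakeSource() could be replaced with an instance obtained from await GhibliSource.create() inside an async function.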

File Structure

Your complete implementation will create files in these locations:

backend/airweave/platform/
├── configs/
│   ├── auth.py          # Add YourSourceAuthConfig class
│   └── config.py        # Add YourSourceConfig class
├── entities/
│   └── your_source.py   # Define entity schemas for your data
└── sources/
    └── your_source.py   # Your source implementation

Notes

Pay attention to the kind of authentication your data source uses. Airweave supports many auth types, including API keys, various OAuth2 flows, and database connections. Check your data source’s API documentation to determine the correct AuthType to use in your @source() decorator.

For OAuth2 sources, you’ll also need to add your integration to the dev.integrations.yaml file.

Contributing