Word
Configuration
The Microsoft Word source connector integrates with the Microsoft Graph API.
It synchronizes Word documents from Microsoft OneDrive and SharePoint. Documents are processed through Airweave’s file-handling pipeline, which:
- Downloads the .docx/.doc file
- Converts to markdown for text extraction
- Chunks content for vector search
- Indexes for semantic search
The connector handles OAuth token refresh and rate limiting when accessing Word documents.
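As a rough illustration of what rate-limit handling can look like, here is a minimal sketch, not Airweave's actual implementation. It relies on the fact that Microsoft Graph signals throttling with HTTP 429 and a `Retry-After` header; the helper names `retry_delay` and `call_with_retry`, the `send` callable, and the backoff defaults are all assumptions made for this example.

```python
import time


def retry_delay(retry_after, attempt, base=1.0, cap=60.0):
    """Return the number of seconds to wait before retrying a throttled request."""
    # Prefer the server's hint when it is a plain number of seconds.
    if retry_after is not None:
        try:
            return min(float(retry_after), cap)
        except ValueError:
            pass  # e.g. an HTTP-date value; fall back to exponential backoff
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
    return min(base * (2 ** attempt), cap)


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it stops returning HTTP 429 or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers, body);
    `max_attempts` is assumed to be at least 1.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        # Stop on any non-throttled response, or when no attempts remain.
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        sleep(retry_delay(headers.get("Retry-After"), attempt))
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting, for example by substituting `list.append` to record the delays.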
Authentication
This connector uses OAuth 2.0 authentication. You can connect through the Airweave UI or API using the OAuth flow.
Supported authentication methods:
- OAuth Browser Flow (recommended for UI)
- OAuth Token (for programmatic access)
- Auth Provider (enterprise SSO)
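For context on what an OAuth token is ultimately used for, the sketch below shows how a bearer token is typically attached to a Microsoft Graph request. It does not show Airweave's own API for supplying a token. The helper name `build_graph_request` is hypothetical, and the example only constructs the request without sending it.

```python
import urllib.parse
import urllib.request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_graph_request(access_token, path, params=None):
    """Build (but do not send) an authenticated Microsoft Graph GET request."""
    url = f"{GRAPH_BASE}{path}"
    if params:
        # Keep "$" unescaped so OData options such as $top stay readable.
        url += "?" + urllib.parse.urlencode(params, safe="$")
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )


# Example: list the first ten items in the signed-in user's OneDrive root.
request = build_graph_request("<access-token>", "/me/drive/root/children", {"$top": 10})
```

Sending the request (for instance with `urllib.request.urlopen(request)`) requires a valid token with the appropriate Graph permissions.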
Configuration Options
This connector does not have any additional configuration options.
Data Models
The following data models are available for this connector:
WordDocumentEntity
Schema for a Microsoft Word document as a file entity.
Represents Word documents (.docx, .doc) stored in OneDrive or SharePoint. Extends FileEntity to leverage Airweave’s file processing pipeline, which will:
- Download the Word document
- Convert it to markdown using document converters
- Chunk the content for indexing
Reference: https://learn.microsoft.com/en-us/graph/api/resources/driveitem
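To make the mapping from a Graph driveItem to a Word document entity more concrete, here is a minimal sketch. The class `WordDocumentSketch` and its field names are hypothetical stand-ins, not the real WordDocumentEntity schema; the dictionary keys (`id`, `name`, `webUrl`, `size`, `lastModifiedDateTime`, `file`, `folder`) are standard driveItem properties.

```python
from dataclasses import dataclass
from typing import Optional

WORD_EXTENSIONS = (".docx", ".doc")


@dataclass
class WordDocumentSketch:
    """Hypothetical stand-in for WordDocumentEntity; field names are assumptions."""
    item_id: str
    name: str
    web_url: Optional[str]
    size: Optional[int]
    last_modified: Optional[str]


def to_word_document(item: dict) -> Optional[WordDocumentSketch]:
    """Map a driveItem dict to a sketch entity, or None if it is not a Word file."""
    name = item.get("name", "")
    # Folders carry a "folder" facet instead of "file"; skip them,
    # along with files whose extension is not .docx or .doc.
    if "file" not in item or not name.lower().endswith(WORD_EXTENSIONS):
        return None
    return WordDocumentSketch(
        item_id=item["id"],
        name=name,
        web_url=item.get("webUrl"),
        size=item.get("size"),
        last_modified=item.get("lastModifiedDateTime"),
    )
```

Filtering on the file extension is one simple choice; checking the `mimeType` inside the `file` facet would be an alternative.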