Word

Configuration

The Microsoft Word source connector integrates with the Microsoft Graph API.

The connector synchronizes Word documents from Microsoft OneDrive and SharePoint. Documents are processed through Airweave’s file handling pipeline, which:

  • Downloads the .docx/.doc file
  • Converts to markdown for text extraction
  • Chunks content for vector search
  • Indexes for semantic search

The connector handles OAuth token refresh and rate limiting when accessing Word documents.
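The pipeline steps listed above can be illustrated with a minimal sketch. This is not Airweave's implementation: the helper names `is_word_document` and `chunk_markdown`, the paragraph-based splitting strategy, and the `max_chars` limit are all assumptions chosen for illustration. The sketch covers only the selection of Word files by extension and the chunking of already-converted markdown.

```python
from typing import List

# Extensions named on this page; the matching logic itself is an assumption.
WORD_EXTENSIONS = (".docx", ".doc")


def is_word_document(filename: str) -> bool:
    """Return True if the file name ends in a Word extension (case-insensitive)."""
    return filename.lower().endswith(WORD_EXTENSIONS)


def chunk_markdown(markdown: str, max_chars: int = 1000) -> List[str]:
    """Group paragraphs into chunks of at most max_chars characters.

    Hypothetical chunking strategy: paragraphs are separated by blank lines,
    and a single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for paragraph in markdown.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            # Either the paragraph still fits, or it starts a new chunk.
            current = candidate
        else:
            # The paragraph would overflow the chunk: close it and start a new one.
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph boundaries keeps headings and sentences intact within a chunk, which tends to help semantic search; a production pipeline would likely also account for token counts and overlap between chunks.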

Authentication

This connector uses OAuth 2.0 authentication. You can connect through the Airweave UI or API using the OAuth flow.

Supported authentication methods:

  • OAuth Browser Flow (recommended for UI)
  • OAuth Token (for programmatic access)
  • Auth Provider (enterprise SSO)
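Whichever method is used, the result is an OAuth 2.0 access token that is sent to Microsoft Graph as a bearer token. The sketch below shows the general shape of such a request. It is an illustration under assumptions: the helper `build_graph_request`, the example path, and the placeholder token are hypothetical, and how Airweave stores or passes the token internally is not described on this page. The request is built but not sent.

```python
import urllib.request

# Public base URL of the Microsoft Graph v1.0 API.
GRAPH_BASE_URL = "https://graph.microsoft.com/v1.0"


def build_graph_request(path: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request to Microsoft Graph with a bearer token."""
    return urllib.request.Request(
        url=f"{GRAPH_BASE_URL}{path}",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
        method="GET",
    )


# Example: list the items in the root of the signed-in user's OneDrive.
# "<access-token>" is a placeholder, not a real credential.
request = build_graph_request("/me/drive/root/children", "<access-token>")
```

Passing `request` to `urllib.request.urlopen` would perform the call. Access tokens expire, so a long-running sync needs to obtain a fresh token before retrying a request that fails with an authorization error.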

Configuration Options

This connector does not have any additional configuration options.

Data Models

The following data models are available for this connector:

Schema for a Microsoft Word document as a file entity.

Represents Word documents (.docx, .doc) stored in OneDrive/SharePoint. Extends FileEntity to leverage Airweave’s file processing pipeline, which will:

  • Download the Word document
  • Convert it to markdown using document converters
  • Chunk the content for indexing

Reference: https://learn.microsoft.com/en-us/graph/api/resources/driveitem

| Field | Type | Description |
|---|---|---|
| title | str | The title/name of the document. |
| web_url | Optional[str] | URL to open the document in Word Online. |
| content_download_url | Optional[str] | Direct download URL for the document content. |
| created_by | Optional[Dict[str, Any]] | Identity of the user who created the document. |
| last_modified_by | Optional[Dict[str, Any]] | Identity of the user who last modified the document. |
| parent_reference | Optional[Dict[str, Any]] | Information about the parent folder/drive location. |
| drive_id | Optional[str] | ID of the drive containing this document. |
| folder_path | Optional[str] | Full path to the parent folder. |
| description | Optional[str] | Description of the document if available. |
| shared | Optional[Dict[str, Any]] | Information about sharing status of the document. |
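To make the field list concrete, the sketch below maps a Microsoft Graph driveItem payload onto the fields above. It is an assumption-laden illustration, not the connector's code: the class name `WordDocumentSketch` and the function `from_drive_item` are hypothetical, it uses a plain dataclass rather than Airweave's FileEntity, and the choice of which driveItem property feeds each field is inferred from the field descriptions.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class WordDocumentSketch:
    """Hypothetical stand-in for the entity described in the table above."""

    title: str
    web_url: Optional[str] = None
    content_download_url: Optional[str] = None
    created_by: Optional[Dict[str, Any]] = None
    last_modified_by: Optional[Dict[str, Any]] = None
    parent_reference: Optional[Dict[str, Any]] = None
    drive_id: Optional[str] = None
    folder_path: Optional[str] = None
    description: Optional[str] = None
    shared: Optional[Dict[str, Any]] = None


def from_drive_item(item: Dict[str, Any]) -> WordDocumentSketch:
    """Map a driveItem JSON object to the sketch entity (assumed mapping)."""
    parent = item.get("parentReference") or {}
    return WordDocumentSketch(
        title=item["name"],
        web_url=item.get("webUrl"),
        content_download_url=item.get("@microsoft.graph.downloadUrl"),
        created_by=item.get("createdBy"),
        last_modified_by=item.get("lastModifiedBy"),
        parent_reference=item.get("parentReference"),
        drive_id=parent.get("driveId"),
        folder_path=parent.get("path"),
        description=item.get("description"),
        shared=item.get("shared"),
    )


# Made-up sample payload using driveItem property names.
sample_item = {
    "name": "Quarterly Report.docx",
    "webUrl": "https://example.com/report",
    "parentReference": {"driveId": "drive-123", "path": "/drive/root:/Reports"},
}
document = from_drive_item(sample_item)
```

Only `title` is required here, mirroring the table, where every other field is optional; missing properties in the payload simply become `None`.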