Overview
The Microsoft 365 Connector retrieves data from Microsoft 365 services (OneDrive, OneNote, Teams, SharePoint) and registers it in the Fess index.
This feature requires the fess-ds-microsoft365 plugin.
Supported Services
OneDrive: User drives, group drives, shared documents
OneNote: Notebooks (sites, users, groups)
Teams: Channels, messages, chats
SharePoint Document Libraries: Document library metadata
SharePoint Lists: Lists and list items
SharePoint Pages: Site pages, news articles
Prerequisites
Plugin installation is required
Azure AD application registration is required
Microsoft Graph API permission configuration and admin consent are required
Java 21 or higher, Fess 15.2.0 or higher
Installing the Plugin
Method 1: Direct JAR file placement
Method 2: Build from source
Restart Fess after installation.
Configuration
Configure in the admin console under “Crawler” -> “Data Store” -> “Create New”.
Basic Settings
| Item | Example |
|---|---|
| Name | Microsoft 365 OneDrive |
| Handler Name | OneDriveDataStore / OneNoteDataStore / TeamsDataStore / SharePointDocLibDataStore / SharePointListDataStore / SharePointPageDataStore |
| Enabled | On |
Common Parameter Configuration
Common Parameter List
| Parameter | Required | Description |
|---|---|---|
| tenant | Yes | Azure AD tenant ID |
| client_id | Yes | App registration client ID |
| client_secret | Yes | App registration client secret |
| number_of_threads | No | Number of parallel processing threads (default: 1) |
| ignore_error | No | Continue processing on errors (default: false) |
| max_content_length | No | Maximum size of content to retrieve (default: -1, unlimited) |
| cache_size | No | Cache size for user/group information (default: 10000) |
| proxy_host | No | HTTP proxy host |
| proxy_port | No | HTTP proxy port |
| proxy_username | No | Proxy authentication username |
| proxy_password | No | Proxy authentication password |
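The common parameters above can be assembled programmatically before pasting them into the data store configuration. The following is a minimal sketch, assuming the parameter field accepts one `key=value` pair per line (the form used elsewhere on this page, e.g. `ignore_error=true`). The helper name `render_parameters` and all values are hypothetical placeholders.

```python
# Names of the required common parameters, taken from the table above.
REQUIRED = ("tenant", "client_id", "client_secret")


def render_parameters(params: dict[str, object]) -> str:
    """Render a dict as key=value lines, failing fast on missing required keys."""
    missing = [key for key in REQUIRED if not params.get(key)]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    lines = []
    for key, value in params.items():
        if isinstance(value, bool):
            value = str(value).lower()  # write booleans as true/false
        lines.append(f"{key}={value}")
    return "\n".join(lines)


# All values below are placeholders, not real credentials.
example = render_parameters({
    "tenant": "00000000-0000-0000-0000-000000000000",
    "client_id": "11111111-1111-1111-1111-111111111111",
    "client_secret": "replace-with-client-secret",
    "number_of_threads": 1,
    "ignore_error": False,
})
print(example)
```

Checking for the three required keys up front surfaces a missing credential before the crawl starts rather than as an authentication error in the logs.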
Azure AD Application Registration
1. Register an Application in Azure Portal
Open Azure Active Directory at https://portal.azure.com:
Click “App registrations” -> “New registration”
Enter the application name
Select the supported account types
Click “Register”
2. Create Client Secret
In “Certificates & secrets”:
Click “New client secret”
Set the description and expiration
Copy the secret value (it cannot be viewed later)
3. Add API Permissions
In “API permissions”:
Click “Add a permission”
Select “Microsoft Graph”
Select “Application permissions”
Add the required permissions (see below)
Click “Grant admin consent”
Required Permissions by Data Store
OneDriveDataStore
Required permissions:
Files.Read.All
Conditional permissions:
User.Read.All: when user_drive_crawler=true
Group.Read.All: when group_drive_crawler=true
Sites.Read.All: when shared_documents_drive_crawler=true
OneNoteDataStore
Required permissions:
Notes.Read.All
Conditional permissions:
User.Read.All: when user_note_crawler=true
Group.Read.All: when group_note_crawler=true
Sites.Read.All: when site_note_crawler=true
TeamsDataStore
Required permissions:
Team.ReadBasic.All
Group.Read.All
Channel.ReadBasic.All
ChannelMessage.Read.All
ChannelMember.Read.All
User.Read.All
Conditional permissions:
Chat.Read.All: when specifying chat_id
Files.Read.All: when append_attachment=true
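The permission rules for these three data stores can be expressed as a small lookup, which is handy for checking an app registration before a crawl. This is a sketch built only from the lists above. The function name `required_permissions` is hypothetical, and because parameter defaults are not listed here, it considers only parameters that are set explicitly.

```python
# Permissions that are always required, per data store handler.
BASE = {
    "OneDriveDataStore": ["Files.Read.All"],
    "OneNoteDataStore": ["Notes.Read.All"],
    "TeamsDataStore": [
        "Team.ReadBasic.All", "Group.Read.All", "Channel.ReadBasic.All",
        "ChannelMessage.Read.All", "ChannelMember.Read.All", "User.Read.All",
    ],
}

# Parameter name -> extra permission needed when that parameter is in effect.
CONDITIONAL = {
    "OneDriveDataStore": {
        "user_drive_crawler": "User.Read.All",
        "group_drive_crawler": "Group.Read.All",
        "shared_documents_drive_crawler": "Sites.Read.All",
    },
    "OneNoteDataStore": {
        "user_note_crawler": "User.Read.All",
        "group_note_crawler": "Group.Read.All",
        "site_note_crawler": "Sites.Read.All",
    },
    "TeamsDataStore": {
        "chat_id": "Chat.Read.All",
        "append_attachment": "Files.Read.All",
    },
}


def is_set(name: str, value: str) -> bool:
    """chat_id counts when non-empty; every other parameter is a true/false flag."""
    if name == "chat_id":
        return bool(value.strip())
    return value.strip().lower() == "true"


def required_permissions(handler: str, params: dict[str, str]) -> list[str]:
    """Return the Graph application permissions needed for the given settings."""
    perms = list(BASE[handler])
    for name, permission in CONDITIONAL[handler].items():
        if is_set(name, params.get(name, "")) and permission not in perms:
            perms.append(permission)
    return perms


print(required_permissions("OneDriveDataStore", {"user_drive_crawler": "true"}))
```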
Script Configuration
OneDrive
Available fields:
file.name: File name
file.description: File description
file.contents: Text content
file.mimetype: MIME type
file.filetype: File type
file.created: Creation date and time
file.last_modified: Last modified date and time
file.size: File size
file.web_url: URL to open in browser
file.url: File URL
file.id: Drive item ID
file.ctag: Change tag (cTag)
file.etag: Entity tag (eTag)
file.webdav_url: WebDAV URL
file.parent_id: Parent folder ID
file.parent_name: Parent folder name
file.parent_path: Parent folder path
file.roles: Access permissions
Note
In addition to the above, Microsoft Graph metadata fields such as file.createdby_user, file.last_modifiedby_user, file.image, file.video, and file.special_folder are also available.
OneNote
Available fields:
notebook.name: Notebook name
notebook.contents: Integrated content of sections and pages
notebook.size: Content size (character count)
notebook.created: Creation date and time
notebook.last_modified: Last modified date and time
notebook.web_url: URL to open in browser
notebook.roles: Access permissions
Teams
Available fields:
message.title: Message title
message.content: Message content
message.body: Message body (raw data including HTML)
message.subject: Message subject
message.summary: Message summary
message.importance: Importance
message.from: Sender information
message.created_date_time: Creation date and time
message.last_modified_date_time: Last modified date and time
message.last_edited_date_time: Last edited date and time
message.deleted_date_time: Deletion date and time
message.web_url: URL to open in browser
message.id: Message ID
message.etag: Entity tag
message.locale: Locale
message.chat_id: Chat ID
message.reply_to_id: Reply-to message ID
message.channel_identity: Channel identity information (team ID and channel ID)
message.mentions: Mention information
message.attachments: Attachment information
message.replies: Reply messages
message.hosted_contents: Inline content (images, etc.)
message.roles: Access permissions
Top-level fields (set only for channel messages):
team: Team (Microsoft Graph Group object)
channel: Channel (Microsoft Graph Channel object)
parent: Parent message (set when the message is a reply)
Additional Parameters by Data Store
OneDrive
OneNote
Teams
SharePoint Document Libraries
SharePoint Lists
SharePoint Pages
Usage Examples
Crawling All OneDrive Drives
Parameters:
Script:
Crawling Teams Messages from a Specific Team
Parameters:
Script:
Troubleshooting
Authentication Errors
Symptom: Authentication failed or Insufficient privileges
Check:
Verify that the tenant ID, client ID, and client secret are correct
Verify that the required API permissions are granted in Azure Portal
Verify that admin consent has been granted
Check the client secret expiration
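To check the tenant ID, client ID, and client secret outside of Fess, the credentials can be tested directly against the Microsoft identity platform token endpoint using the OAuth 2.0 client credentials flow. The sketch below only builds the request; it is an illustration for manual verification, not a description of how the plugin authenticates internally. The function name `build_token_request` and all values are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant: str, client_id: str, client_secret: str) -> Request:
    """Build a client credentials token request for Microsoft Graph."""
    url = f"https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # .default requests the application permissions already consented to.
        "scope": "https://graph.microsoft.com/.default",
    }).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values; substitute the real ones from the app registration.
request = build_token_request(
    "00000000-0000-0000-0000-000000000000",
    "11111111-1111-1111-1111-111111111111",
    "replace-with-client-secret",
)
print(request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` should return JSON containing an `access_token` when all three values are valid. An HTTP error response instead carries an error description that usually indicates which value was rejected, for example an expired secret.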
API Rate Limit Errors
Symptom: 429 Too Many Requests
Resolution:
Reduce number_of_threads (set to 1 or 2)
Increase the crawl interval
Set ignore_error=true to continue processing
Cannot Retrieve Data
Symptom: The crawl succeeds but 0 documents are indexed
Check:
Verify that the target data exists
Verify that the API permissions are correctly configured
Check the user/group drive crawler settings
Check the logs for error messages
Crawling Large Volumes of Data
Resolution:
Split into multiple data stores (per site, per drive, etc.)
Use scheduled settings to distribute the load
Adjust number_of_threads for parallel processing
Crawl only specific folders/sites
Reference Information
Data Store Connector Overview
Google Workspace Connector
Data Store Crawling - Data Store Configuration Guide