Data Store Connector Overview

Overview

Fess Data Store Connectors provide functionality to retrieve and index content from data sources other than websites and file systems.

Using Data Store Connectors, you can make data searchable from the following sources:

Cloud storage (Box, Dropbox, Google Drive, OneDrive)
Collaboration tools (Confluence, Jira, Slack)
Databases (MySQL, PostgreSQL, Oracle, etc.)
Other systems (Git, Salesforce, Elasticsearch, etc.)

Available Connectors

Fess provides connectors for various data sources. Many connectors are available as plugins and can be installed as needed.

Cloud Storage

Connector	Plugin	Description
Box Connector	fess-ds-box	Crawl files and folders from Box.com
Dropbox Connector	fess-ds-dropbox	Crawl files and folders from Dropbox
Google Workspace Connector	fess-ds-gsuite	Crawl Google Drive
Microsoft 365 Connector	fess-ds-microsoft365	Crawl OneDrive, SharePoint, etc.

Collaboration Tools

Connector	Plugin	Description
Atlassian Connector	fess-ds-atlassian	Crawl Confluence and Jira
Slack Connector	fess-ds-slack	Crawl Slack messages and files

Development & Operations Tools

Connector	Plugin	Description
Git Connector	fess-ds-git	Crawl source code from Git repositories
Elasticsearch/OpenSearch Connector	fess-ds-elasticsearch	Retrieve data from Elasticsearch/OpenSearch
Salesforce Connector	fess-ds-salesforce	Crawl Salesforce objects

Databases & Files

Connector	Plugin	Description
Database Connector (Database Search)	fess-ds-db	Retrieve data from JDBC-compatible databases
CSV Connector	fess-ds-csv	Retrieve data from CSV files
JSON Connector	fess-ds-json	Retrieve data from JSON files

Installing Connectors

Installing Plugins

Data Store Connector plugins can be installed from the admin UI.

Log in to the admin console
Go to “System” -> “Plugin”
Click the “Install” button
Select the plugin from the “Remote” tab (or upload a JAR file from the “Local” tab)
Click “Install”
Restart Fess

Data Store Configuration Basics

Data Store Connector configuration is done in the admin console under “Crawler” -> “Data Store”.

Common Configuration Items

Configuration items common to all Data Store Connectors:

Item	Description
Name	Identifier name for the configuration
Handler Name	Handler name for the connector (e.g., `CsvDataStore`)
Description	A free-text description of the configuration
Parameters	Connector-specific configuration parameters (key=value format)
Script	Index field mapping script
Permissions	Access permissions for crawled documents (e.g., `{role}guest`)
Virtual Hosts	Virtual host keys to restrict which site this config applies to
Sort Order	Display order in the configuration list
Boost	Search result priority
Enabled	Whether to enable this configuration

Parameter Configuration

Parameters are specified in key=value format, separated by newlines:

Script Configuration

Scripts map retrieved data to Fess index fields. The left side of each line is a Fess index field; the right side is the field obtained from the connector.

The following example is for a CSV connector whose header columns are link, subject, and body:

Note

Field names referenced in scripts differ per connector. Box/Dropbox/Google Drive/OneDrive reference the fetched object with a prefix: file.*; Slack uses message.*; Jira uses issue.*. CSV, JSON, and Database connectors do NOT use a prefix; fields are referenced directly:

CSV: header column names (when has_header_line=true), or cell1, cell2, … (1-based column index); plus csvfile and csvfilename.
JSON: JSON object field names.
Database: column names (aliases) from the SELECT result.

Refer to each connector’s individual documentation for details.

Authentication Configuration

Many Data Store Connectors require authentication (OAuth 2.0, API keys, service accounts, etc.). The specific authentication parameters differ for each connector. Refer to the individual connector documentation for the required settings.

Common Parameters

The following parameter is inherited from AbstractDataStore and available in all connectors:

Parameter	Default	Description
`readInterval`	`0`	Delay in milliseconds between processing each record. Increase this value to reduce load on the external service.
`script_type`	`groovy`	Type of script engine used for index field mapping. Only `groovy` is available by default.

Troubleshooting

Connector Not Displayed

Verify that the plugin is correctly installed
Restart Fess
Check logs for errors

Authentication Errors

Verify that authentication information is correct
Check token expiration
Verify that required permissions are granted
Verify that API access is allowed on the service side

Cannot Retrieve Data

Verify that parameter format is correct
Check access permissions for target folders/files
Check filter settings
Check logs for detailed error messages

Debug Configuration

When investigating issues, adjust the log level. Datastore crawling runs in the crawler process, so you must edit the crawler’s log configuration file:

app/WEB-INF/env/crawler/resources/log4j2.xml:

Reference Information

Data Store Crawling - Data Store Configuration Guide
Plugin - Plugin Management Guide
DataConfig API - Data Store Configuration API