Overview
The Google Workspace Connector provides functionality to retrieve files from Google Drive (formerly G Suite) and register them in the Fess index.
This feature requires the fess-ds-gsuite plugin.
Supported Services
Google Drive (My Drive, Shared Drives)
Google Docs, Spreadsheets, Slides, Forms, etc.
Prerequisites
Plugin installation is required
A Google Cloud Platform project must be created
A service account must be created and credentials obtained
Domain-wide delegation must be configured for Google Workspace
Plugin Installation
Method 1: Place JAR file directly
# Download from Maven Central
wget https://repo1.maven.org/maven2/org/codelibs/fess/fess-ds-gsuite/X.X.X/fess-ds-gsuite-X.X.X.jar
# Place the file
cp fess-ds-gsuite-X.X.X.jar $FESS_HOME/app/WEB-INF/lib/
# or
cp fess-ds-gsuite-X.X.X.jar /usr/share/fess/app/WEB-INF/lib/
Method 2: Install from admin console
Open “System” -> “Plugins”
Upload the JAR file
Restart Fess
Configuration
Configure in the admin console under “Crawler” -> “Data Store” -> “Create New”.
Basic Settings
| Item | Example |
|---|---|
| Name | Company Google Drive |
| Handler Name | GSuiteDataStore |
| Enabled | On |
Parameter Configuration
private_key=-----BEGIN PRIVATE KEY-----\nMIIEvgIBADANBgkqhkiG9w0BAQE...\n-----END PRIVATE KEY-----\n
private_key_id=46812a3b4c5d6e7f8g9h0i1j2k3l4m5n6o7p8q9r
client_email=fess-crawler@your-project.iam.gserviceaccount.com
Parameter List
| Parameter | Required | Description |
|---|---|---|
private_key | Yes | Service account private key (PEM format, newlines as \n) |
private_key_id | Yes | Private key ID |
client_email | Yes | Service account email address |
Script Configuration
title=file.name
content=file.description + "\n" + file.contents
mimetype=file.mimetype
created=file.created_time
last_modified=file.modified_time
url=file.url
thumbnail=file.thumbnail_link
content_length=file.size
filetype=file.filetype
role=file.roles
filename=file.name
Available Fields
| Field | Description |
|---|---|
file.name | File name |
file.description | File description |
file.contents | File text content |
file.mimetype | File MIME type |
file.filetype | File type |
file.created_time | Creation date/time |
file.modified_time | Last modified date/time |
file.web_view_link | Link to open in browser |
file.url | File URL |
file.thumbnail_link | Thumbnail link (valid for short period) |
file.size | File size (bytes) |
file.roles | Access permissions |
For details, see the Google Drive Files API.
Google Cloud Platform Configuration
1. Create a Project
Access https://console.cloud.google.com/:
Create a new project
Enter a project name
Select organization and location
2. Enable Google Drive API
In “APIs & Services” -> “Library”:
Search for “Google Drive API”
Click “Enable”
3. Create a Service Account
In “APIs & Services” -> “Credentials”:
Select “Create credentials” -> “Service account”
Enter a service account name (e.g., fess-crawler)
Click “Create and continue”
Skip role assignment
Click “Done”
4. Create Service Account Key
For the created service account:
Click on the service account
Open the “Keys” tab
Click “Add key” -> “Create new key”
Select JSON format
Save the downloaded JSON file
5. Enable Domain-wide Delegation
In the service account settings:
Check “Enable domain-wide delegation”
Click “Save”
Copy the “OAuth 2 Client ID”
Credential Configuration
Retrieve Information from JSON File
The downloaded JSON file:
{
"type": "service_account",
"project_id": "your-project-id",
"private_key_id": "46812a3b4c5d6e7f8g9h0i1j2k3l4m5n6o7p8q9r",
"private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvgIBADANBgk...\n-----END PRIVATE KEY-----\n",
"client_email": "fess-crawler@your-project.iam.gserviceaccount.com",
"client_id": "123456789012345678901",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/..."
}
Set the following information in parameters:
private_key_id->private_key_idprivate_key->private_key(keep newlines as\n)client_email->client_email
Private Key Format
private_key preserves newlines as \n:
private_key=-----BEGIN PRIVATE KEY-----\nMIIEvgIBADANBgkqhkiG...\n-----END PRIVATE KEY-----\n
Usage Examples
Crawl Entire Google Drive
Parameters:
private_key=-----BEGIN PRIVATE KEY-----\nMIIEvgIBADANBgkqhkiG9w0BAQE...\n-----END PRIVATE KEY-----\n
private_key_id=46812a3b4c5d6e7f8g9h0i1j2k3l4m5n6o7p8q9r
client_email=fess-crawler@your-project-123456.iam.gserviceaccount.com
Script:
title=file.name
content=file.description + "\n" + file.contents
mimetype=file.mimetype
created=file.created_time
last_modified=file.modified_time
url=file.web_view_link
thumbnail=file.thumbnail_link
content_length=file.size
filetype=file.filetype
filename=file.name
Crawl with Permissions
Parameters:
private_key=-----BEGIN PRIVATE KEY-----\nMIIEvgIBADANBgkqhkiG9w0BAQE...\n-----END PRIVATE KEY-----\n
private_key_id=46812a3b4c5d6e7f8g9h0i1j2k3l4m5n6o7p8q9r
client_email=fess-crawler@your-project-123456.iam.gserviceaccount.com
default_permissions={role}drive-users
Script:
title=file.name
content=file.description + "\n" + file.contents
mimetype=file.mimetype
created=file.created_time
last_modified=file.modified_time
url=file.web_view_link
role=file.roles
filename=file.name
Crawl Only Specific File Types
Google Docs only:
if (file.mimetype == "application/vnd.google-apps.document") {
title=file.name
content=file.description + "\n" + file.contents
mimetype=file.mimetype
created=file.created_time
last_modified=file.modified_time
url=file.web_view_link
}
Troubleshooting
Authentication Error
Symptom: 401 Unauthorized or 403 Forbidden
Check:
Verify service account credentials are correct:
Is
private_keyformatted with\nfor newlines?Is
private_key_idcorrect?Is
client_emailcorrect?
Verify Google Drive API is enabled
Verify domain-wide delegation is configured
Verify authorization in Google Workspace admin console
Verify OAuth scope is correct (
https://www.googleapis.com/auth/drive.readonly)
Domain-wide Delegation Error
Symptom: Not Authorized to access this resource/api
Resolution:
Verify authorization in Google Workspace admin console:
Is the Client ID registered correctly?
Is the OAuth scope correct? (
https://www.googleapis.com/auth/drive.readonly)
Verify domain-wide delegation is enabled for the service account
Cannot Retrieve Files
Symptom: Crawl succeeds but 0 files found
Check:
Verify files exist in Google Drive
Verify service account has read permissions
Verify domain-wide delegation is configured correctly
Verify access to target user’s Drive is possible
API Quota Error
Symptom: 403 Rate Limit Exceeded or 429 Too Many Requests
Resolution:
Check quota in Google Cloud Platform
Increase crawl interval
Request quota increase if needed
Private Key Format Error
Symptom: Invalid private key format
Resolution:
Verify newlines are correctly formatted as \n:
# Correct
private_key=-----BEGIN PRIVATE KEY-----\nMIIEvgIBADANBgkqhkiG9w0BAQE...\n-----END PRIVATE KEY-----\n
# Incorrect (contains actual newlines)
private_key=-----BEGIN PRIVATE KEY-----
MIIEvgIBADANBgkqhkiG9w0BAQE...
-----END PRIVATE KEY-----
Large Number of Files
Symptom: Crawl takes too long or times out
Resolution:
Split into multiple data stores
Distribute load with scheduled crawling
Adjust crawl interval
Crawl only specific folders
Permissions and Access Control
Reflect Google Drive Sharing Permissions
Reflect Google Drive sharing settings in Fess permissions:
Parameters:
private_key=-----BEGIN PRIVATE KEY-----\nMIIEvgIBADANBgkqhkiG9w0BAQE...\n-----END PRIVATE KEY-----\n
private_key_id=46812a3b4c5d6e7f8g9h0i1j2k3l4m5n6o7p8q9r
client_email=fess-crawler@your-project-123456.iam.gserviceaccount.com
default_permissions={role}drive-users
Script:
title=file.name
content=file.description + "\n" + file.contents
role=file.roles
mimetype=file.mimetype
created=file.created_time
last_modified=file.modified_time
url=file.web_view_link
file.roles contains Google Drive sharing information.
Reference Information
Data Store Connector Overview - Data Store Connector Overview
Microsoft 365 Connector - Microsoft 365 Connector
Box Connector - Box Connector
Data Store Crawling - Data Store Configuration Guide