Overview
The Elasticsearch/OpenSearch Connector retrieves documents from an Elasticsearch or OpenSearch cluster and registers them in the Fess index.
This feature requires the fess-ds-elasticsearch plugin.
Supported Versions
Elasticsearch 7.x / 8.x
OpenSearch 1.x / 2.x
Prerequisites
Plugin installation is required
Read access to the Elasticsearch/OpenSearch cluster is required
Query execution permissions are required
Plugin Installation
Method 1: Place JAR file directly
Method 2: Install from admin console
Open “System” -> “Plugins”
Upload the JAR file
Restart Fess
Configuration
Configure the data store from the admin console via “Crawler” -> “Data Store” -> “Create New”.
Basic Settings
| Item | Example |
|---|---|
| Name | External Elasticsearch |
| Handler Name | ElasticsearchDataStore |
| Enabled | On |
Parameter Settings
Basic connection:
Authenticated connection:
Multiple hosts:
Parameter List
Script Settings
Basic mapping:
Accessing nested fields:
Available Fields
source.<field_name> - Elasticsearch document _source field
id - Document ID
index - Index name
score - Search score
version - Document version
seqNo - Sequence number
primaryTerm - Primary term
clusterAlias - Cluster alias (for cross-cluster search)
hit - SearchHit object (advanced usage)
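As a rough illustration of where these values come from, the sketch below maps one raw search hit, as returned by the Elasticsearch/OpenSearch REST search API, onto the field names listed above. It is not the plugin's implementation; the function name `hit_to_script_fields` and the sample document are hypothetical.

```python
from typing import Any, Dict


def hit_to_script_fields(hit: Dict[str, Any], cluster_alias: str = "") -> Dict[str, Any]:
    """Map one raw search-hit JSON object onto the script field names.

    Illustration only: the connector works with its own SearchHit object.
    Note that _version, _seq_no and _primary_term appear in a REST response
    only when the search request asks for them, so they may be None here.
    """
    return {
        "source": hit.get("_source", {}),       # source.<field_name>
        "id": hit.get("_id"),                   # document ID
        "index": hit.get("_index"),             # index name
        "score": hit.get("_score"),             # search score
        "version": hit.get("_version"),         # document version
        "seqNo": hit.get("_seq_no"),            # sequence number
        "primaryTerm": hit.get("_primary_term"),  # primary term
        "clusterAlias": cluster_alias,          # set for cross-cluster search
    }


# Hypothetical hit, shaped like an entry of hits.hits in a search response.
sample_hit = {
    "_index": "articles",
    "_id": "42",
    "_score": 1.0,
    "_source": {"title": "Hello", "body": "World"},
}
fields = hit_to_script_fields(sample_hit)
```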
Query Configuration
Retrieve All Documents
By default, all documents are retrieved. If the query parameter is not specified, match_all is used.
Filtering with Specific Conditions
Range query:
Multiple conditions:
Note
The query parameter accepts only the query body. The outer {"query":...} wrapper is not needed. Search-level options such as sort cannot be specified in this parameter.
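A small pre-check can catch the mistakes described in this note before the value is pasted into the data store configuration. The following is a minimal sketch, not part of Fess or the plugin: `normalize_query` and `SEARCH_LEVEL_KEYS` are hypothetical names, and the list of rejected keys is an assumption based on common search-level options.

```python
import json
from typing import Any, Dict, Optional

# Assumed set of search-level options that do not belong in a query body.
SEARCH_LEVEL_KEYS = {"sort", "aggs", "aggregations", "size", "from", "_source"}


def normalize_query(raw: Optional[str]) -> Dict[str, Any]:
    """Return a bare query body suitable for the query parameter.

    - An empty value falls back to match_all, mirroring the documented default.
    - An outer {"query": ...} wrapper is stripped.
    - Search-level options such as sort or aggs are rejected.
    """
    if raw is None or not raw.strip():
        return {"match_all": {}}

    parsed = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(parsed, dict):
        raise ValueError("query must be a JSON object")

    forbidden = SEARCH_LEVEL_KEYS.intersection(parsed)
    if forbidden:
        raise ValueError(f"search-level options not allowed: {sorted(forbidden)}")

    if set(parsed) == {"query"}:
        return parsed["query"]
    return parsed
```

For example, `normalize_query('{"query": {"term": {"status": "published"}}}')` returns only the inner `term` clause, while a value containing `sort` raises `ValueError`.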
Retrieving Specific Fields Only
Limiting fields with fields parameter
To retrieve all fields, do not specify fields or leave it empty.
Usage Examples
Basic Index Crawl
Parameters:
Script:
Authenticated Cluster Crawl
Parameters:
Script:
Multiple Indices Crawl
Parameters:
Script:
OpenSearch Cluster Crawl
Parameters:
Script:
Crawl with Limited Fields
Parameters:
Script:
Load Balancing with Multiple Hosts
Parameters:
Script:
Troubleshooting
Connection Error
Symptom: Connection refused or No route to host
Check:
Verify host URL is correct (protocol, hostname, port)
Verify Elasticsearch/OpenSearch is running
Check firewall settings
For HTTPS, verify certificate is valid
Authentication Error
Symptom: 401 Unauthorized or 403 Forbidden
Check:
Verify username and password are correct
Verify user has appropriate permissions:
Read permission on index
Scroll API usage permission
If Elasticsearch Security (X-Pack) is enabled, verify proper configuration
Index Not Found
Symptom: index_not_found_exception
Check:
Verify index name is correct (including case)
Verify index exists:
Verify wildcard pattern is correct (e.g., logs-*)
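One way to verify that a concrete index exists is a HEAD request against the index name, which the REST API answers with 200 or 404. The sketch below uses only the Python standard library and assumes an unauthenticated cluster; the base URL, the index name, and the helper names `index_url` and `index_exists` are placeholders, not plugin settings.

```python
import urllib.error
import urllib.parse
import urllib.request


def index_url(base_url: str, index: str) -> str:
    """Build the URL of a single index, e.g. http://localhost:9200/my_index."""
    return base_url.rstrip("/") + "/" + urllib.parse.quote(index, safe="")


def index_exists(base_url: str, index: str, timeout: float = 10.0) -> bool:
    """Return True if the index exists, False on 404; other errors propagate."""
    request = urllib.request.Request(index_url(base_url, index), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # e.g. 401/403: see the Authentication Error section
```

Calling `index_exists("http://localhost:9200", "my_index")` requires a reachable cluster; a 401 or 403 response surfaces as `urllib.error.HTTPError` rather than as `False`.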
Query Error
Symptom: parsing_exception or search_phase_execution_exception
Check:
Verify query JSON is correct
Verify query is compatible with Elasticsearch/OpenSearch version
Verify field names are correct
Test query directly on Elasticsearch/OpenSearch:
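To test a query outside Fess, the same query body can be sent to the `_search` endpoint of the target index. This is a minimal sketch using the Python standard library against an unauthenticated cluster; the host, index name, and function names are placeholders. Unlike the query parameter, a direct search request does need the outer `{"query": ...}` wrapper, which the helper adds.

```python
import json
import urllib.request
from typing import Any, Dict


def build_search_request(base_url: str, index: str, query: Dict[str, Any]) -> urllib.request.Request:
    """Wrap a bare query body in a search request that returns at most one hit."""
    body = json.dumps({"query": query, "size": 1}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(request: urllib.request.Request, timeout: float = 10.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response.

    An invalid query is answered with an HTTP error status, which urllib
    raises as urllib.error.HTTPError; err.read() then holds the error JSON.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_search_request("http://localhost:9200", "my_index", {"match_all": {}})
# result = run_search(request)  # requires a reachable cluster
```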
Scroll Timeout
Symptom: No search context found or Scroll timeout
Solution:
Increase scroll
Decrease size
Check cluster resources
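The two settings interact: the scroll keep-alive only has to cover the time needed to process one batch, and the batch length is set by size. The back-of-the-envelope sketch below makes that trade-off concrete. The function names and the per-document processing time are assumptions for illustration; real timings depend on the cluster and on the Fess script.

```python
import math


def scroll_batches(total_docs: int, size: int) -> int:
    """Number of scroll requests needed to read total_docs with a given size."""
    return math.ceil(total_docs / size)


def batch_fits_keepalive(size: int, seconds_per_doc: float, keepalive_seconds: float) -> bool:
    """True if one batch can be processed before the scroll context expires."""
    return size * seconds_per_doc < keepalive_seconds


# Assumed workload: 0.1 s of processing per document, 60 s keep-alive.
too_large = batch_fits_keepalive(size=1000, seconds_per_doc=0.1, keepalive_seconds=60)
safe = batch_fits_keepalive(size=100, seconds_per_doc=0.1, keepalive_seconds=60)
```

Under these assumed numbers a batch of 1000 documents takes about 100 seconds and outlives a 60-second keep-alive, while a batch of 100 finishes in about 10 seconds, at the cost of ten times as many scroll requests.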
Large Data Crawl
Symptom: Crawl is slow or times out
Solution:
Adjust size (too large can slow down)
Limit fields with fields
Filter documents with query
Split into multiple data stores (by index, time range, etc.)
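Splitting by time range amounts to giving each data store its own range query. The sketch below generates one query body per calendar month, each usable as the query parameter of a separate data store. The function name and the field name `timestamp` are hypothetical; substitute the date field of the actual index.

```python
from datetime import date
from typing import Any, Dict, List


def monthly_range_queries(field: str, start: date, end: date) -> List[Dict[str, Any]]:
    """Return one range query per calendar month covering [start, end).

    The first window begins on the first day of start's month; each window
    is half-open (gte / lt), so consecutive windows never overlap.
    """
    queries = []
    current = date(start.year, start.month, 1)
    while current < end:
        if current.month == 12:
            following = date(current.year + 1, 1, 1)
        else:
            following = date(current.year, current.month + 1, 1)
        queries.append(
            {"range": {field: {"gte": current.isoformat(), "lt": following.isoformat()}}}
        )
        current = following
    return queries


queries = monthly_range_queries("timestamp", date(2024, 11, 1), date(2025, 2, 1))
```

The half-open windows mean a document on a month boundary is crawled by exactly one data store.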
Out of Memory
Symptom: OutOfMemoryError
Solution:
Decrease size
Limit fields with fields
Increase Fess heap size
Exclude large fields (binary data, etc.)
SSL/TLS Connection
Self-Signed Certificate
Warning
Use properly signed certificates in production environments.
For self-signed certificates, add certificate to Java keystore:
Client Certificate Authentication
For client certificate authentication, additional parameter configuration is required. Refer to Elasticsearch client documentation for details.
Advanced Query Examples
Query with Aggregation
Note
The query parameter accepts only the query body. Aggregations (aggs), sort, and other search-level options cannot be specified. Only documents are retrieved.
Script Fields
Note
Elasticsearch/OpenSearch script fields are not included in _source, so they cannot be accessed via the source.* prefix. To use script fields, access them via the hit object using hit.getFields().
Reference
Data Store Connector Overview
Database Connector
Data Store Crawling - Data Store Configuration Guide