Introduction
Important corporate data is stored not only on file servers and cloud storage, but also in SaaS applications and databases. Customer information in Salesforce, product master data in internal databases, list data managed in CSV files – these data sets are typically searchable only within their respective systems.
This article covers scenarios for importing SaaS and database data into the Fess index, enabling cross-source searching alongside other documents.
Target Audience
Those who want to include SaaS and database information in search results
Those who want to learn how to use data store plugins
Those who want to build a search platform spanning multiple data sources
Scenario
A sales organization has data distributed across the following systems.
The goal is to enable cross-source searching of all this data with Fess, so that information needed for sales activities can be found from a single search box.
Salesforce Data Integration
To make Salesforce data searchable in Fess, use the Salesforce data store plugin.
Plugin Installation
Navigate to [System] > [Plugins] in the admin panel
Install
fess-ds-salesforce
Connection Settings
Integration with Salesforce requires configuring a Connected App.
Salesforce-Side Preparation
Create a Connected App in the Salesforce settings
Enable OAuth settings
Obtain the consumer key and secret
Fess-Side Configuration
Navigate to [Crawler] > [Data Store] > [Create New]
Handler Name: Select SalesforceDataStore
Configure parameters and scripts
Label: Set
salesforce
Parameter Configuration Example
base_url=https://login.salesforce.com
auth_type=oauth_password
username=user@example.com
password=your-password
security_token=your-security-token
client_id=your-consumer-key
client_secret=your-consumer-secret
Script Configuration Example
url=url
title=title
content=content
last_modified=last_modified
For auth_type, specify oauth_password (username/password authentication) or oauth_token (JWT Bearer token authentication). When using JWT authentication, set the RSA private key in private_key.
Selecting Target Data
Salesforce contains many objects, but not all of them need to be searchable. Focus on the objects that the sales team searches for frequently.
Database Integration
To make internal database data searchable, use the database data store plugin.
Plugin Installation
Install the fess-ds-db plugin. This plugin can connect to various databases (MySQL, PostgreSQL, Oracle, SQL Server, etc.) via JDBC.
Configuration
Navigate to [Crawler] > [Data Store] > [Create New]
Handler Name: Select DatabaseDataStore
Configure parameters and scripts
Label: Set
database
Parameter Configuration Example
driver=com.mysql.cj.jdbc.Driver
url=jdbc:mysql://db-server:3306/mydb?useSSL=true
username=fess_reader
password=your-password
sql=SELECT product_id, product_name, description, price, CONCAT('https://internal-app/products/', product_id) AS url FROM products WHERE status = 'active'
Script Configuration Example
url=url
title=product_name
content=description
The results of the SQL query specified in sql are crawled. In scripts, use SQL column names (or column labels) to map to Fess index fields.
SQL Query Design
Key points when designing the SQL query for the sql parameter:
Include a
urlcolumn that serves as the link destination in search results (e.g.,CONCAT('https://.../', id) AS url)Include columns that serve as the searchable body text
Use a
WHEREclause to exclude unnecessary data (e.g.,status = 'active')
In scripts, use the SQL column names directly to map to Fess index fields.
CSV File Integration
CSV file data can also be made searchable.
Configuration
Use the fess-ds-csv plugin or the CSV data store functionality.
Navigate to [Crawler] > [Data Store] > [Create New]
Handler Name: Select CsvDataStore
Configure parameters and scripts
Label: Set
csv-data
Parameter Configuration Example
directories=/opt/fess/csv-data
file_encoding=UTF-8
has_header_line=true
separator_character=,
Script Configuration Example (use column names when a header line is present)
url="https://internal-app/contacts/" + id
title=company_name
content=company_name + " " + contact_name + " " + email
When has_header_line=true, column names from the header line can be used in scripts. When there is no header line, columns are referenced by number, such as cell1, cell2, cell3. Scripts can contain Groovy expressions, including string concatenation.
If CSV files are updated regularly, fix the file placement location and set a crawl schedule so that the latest data is automatically reflected in the index.
Cross-Source Searching
Once all data source configurations are complete, you can experience cross-source searching.
Search Example
Searching for “ABC Corporation” returns results such as:
Salesforce account information (Account)
Proposals from the file server (PDF)
Product purchase history from the database
Trade show attendee list from CSV
Users can find the information they need without having to be aware of where it is stored.
Filtering by Label
When there are many search results, use labels to narrow them down.
salesforce: Salesforce data onlydatabase: Database data onlycsv-data: CSV data onlyshared-files: File server documents only
Operational Considerations
Data Freshness
SaaS and database data may be updated frequently. Set the crawl frequency appropriately to keep search results fresh.
Database Connection Security
When connecting directly to a database, pay careful attention to security.
Use a read-only database user
Restrict connections to the Fess server’s IP address
Do not grant access permissions to unnecessary tables
Be careful with password management
Summary
This article covered scenarios for making Salesforce, database, and CSV file data searchable with Fess.
CRM data integration using the Salesforce data store plugin
Internal DB integration using the database data store plugin
List data integration using the CSV data store
Field mapping and SQL query design
Leveraging labels in cross-source searching
By eliminating data silos, you can achieve an environment where all information sources are searchable from a single platform. This concludes the Practical Solutions section. Starting from the next part, we will cover the Architecture and Scaling section, beginning with multi-tenant design.