Overview
The WebConfig API manages Fess web crawl configurations. Each configuration defines the crawl start URLs, crawl depth, inclusion and exclusion patterns, and related crawler settings.
Base URL
Note
All endpoints require administrator privileges and a valid access token. Refer to Admin API Overview for authentication details.
Endpoint List
| Method | Path | Description |
|---|---|---|
| GET | /settings | List web crawl configurations |
| GET | /setting/{id} | Get web crawl configuration |
| POST | /setting | Create web crawl configuration |
| PUT | /setting | Update web crawl configuration |
| DELETE | /setting/{id} | Delete web crawl configuration |
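The endpoint table above can be turned into a small request builder. The following is a minimal sketch using only the Python standard library. The `base_url` argument, the `fess.invalid` host in the usage note, and the `Authorization: Bearer ...` header format are assumptions made for illustration; the real base URL and token header are described in the Admin API Overview. The function only builds a request object and never sends it.

```python
import json
import urllib.parse
import urllib.request

# Method and path for each operation, copied from the endpoint table.
ENDPOINTS = {
    "list": ("GET", "/settings"),
    "get": ("GET", "/setting/{id}"),
    "create": ("POST", "/setting"),
    "update": ("PUT", "/setting"),
    "delete": ("DELETE", "/setting/{id}"),
}


def build_request(base_url, token, operation, config_id=None, body=None):
    """Return an unsent urllib Request for one WebConfig operation."""
    method, path = ENDPOINTS[operation]
    if "{id}" in path:
        if config_id is None:
            raise ValueError(f"{operation} requires config_id")
        # Percent-encode the ID so it is safe as a single path segment.
        path = path.replace("{id}", urllib.parse.quote(config_id, safe=""))
    # ASSUMPTION: bearer-style token header; check the Admin API Overview.
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

For example, `build_request("http://fess.invalid/base", "TOKEN", "get", config_id="abc")` yields a GET request for `/setting/abc` under the hypothetical base URL. Passing the result to `urllib.request.urlopen` would send it.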
List Web Crawl Configurations
Request
Note
The list endpoint accepts PUT as well as GET.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| page | Integer | No | Page number (1-based, default: 1) |
| size | Integer | No | Number of items per page (default: 25, follows the paging.page.size setting) |
| name | String | No | Filter by configuration name |
| urls | String | No | Filter by crawl URL |
| description | String | No | Filter by description |
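As an illustration of the parameters above, the sketch below builds the path and query string for a list request. It assumes the parameters are sent as URL query parameters on the GET form of the endpoint; `list_query` is a hypothetical helper name, and the defaults simply mirror the documented ones.

```python
from urllib.parse import urlencode


def list_query(page=1, size=25, name=None, urls=None, description=None):
    """Build the /settings path with paging and optional filters."""
    if page < 1:
        raise ValueError("page is 1-based")
    if size < 1:
        raise ValueError("size must be at least 1")
    params = {"page": page, "size": size}
    # Only include filters the caller actually supplied.
    for key, value in (("name", name), ("urls", urls), ("description", description)):
        if value:
            params[key] = value
    return "/settings?" + urlencode(params)
```

The returned string is relative to the API base URL, so it can be appended to whatever base the deployment uses.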
Response
total represents the total number of configurations matching the filter conditions.
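Because total counts every matching configuration rather than just the current page, it can be combined with size to work out how many pages to request. A minimal sketch, with `page_count` as a hypothetical helper name:

```python
def page_count(total, size=25):
    """Number of pages needed to list `total` items at `size` per page."""
    if total < 0:
        raise ValueError("total must not be negative")
    if size < 1:
        raise ValueError("size must be at least 1")
    # Integer ceiling division: a partial last page still counts as a page.
    return (total + size - 1) // size
```

A client paging through all results would request pages 1 through `page_count(total, size)`.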
Get Web Crawl Configuration
Request
Response
Note
The response includes createdBy, createdTime, updatedBy, updatedTime, and versionNo, which are automatically populated by the server when a configuration is created or updated. versionNo is required when updating a configuration (see “Update Web Crawl Configuration” below).
Create Web Crawl Configuration
Request
Request Body
Field Description
| Field | Required | Description |
|---|---|---|
| name | Yes | Configuration name (up to 200 characters) |
| description | No | Configuration description (up to 1000 characters) |
| urls | Yes | Crawl start URLs (newline-separated for multiple URLs). Specify using http: or https: |
| includedUrls | No | Regex pattern for URLs to include in crawling |
| excludedUrls | No | Regex pattern for URLs to exclude from crawling |
| includedDocUrls | No | Regex pattern for URLs to include in indexing |
| excludedDocUrls | No | Regex pattern for URLs to exclude from indexing |
| configParameter | No | Additional configuration parameters (key=value format, one entry per line) |
| depth | No | Crawl depth (0 or greater) |
| maxAccessCount | No | Maximum access count (0 or greater) |
| userAgent | Yes | User-Agent string (up to 200 characters) |
| numOfThread | Yes | Number of parallel threads (1 or greater) |
| intervalTime | Yes | Access interval in milliseconds (0 or greater) |
| boost | Yes | Search result boost value |
| available | Yes | Enable/disable (string "true" / "false") |
| sortOrder | Yes | Display order (0 or greater) |
| permissions | No | Access permission roles (newline-separated for multiple values) |
| virtualHosts | No | Virtual hosts (newline-separated for multiple values) |
Note
Audit fields such as createdBy, createdTime, updatedBy, and updatedTime are automatically set by the server and do not need to be included in the request body.
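The field constraints above can be checked on the client before a request is sent. The following sketch assembles a create body and validates the documented limits. The helper name `build_create_body`, its default values (one thread, 1000 ms interval, boost 1.0), and the example URLs and User-Agent in the usage note are assumptions chosen for illustration, not server defaults.

```python
def build_create_body(name, urls, user_agent, num_of_thread=1,
                      interval_time=1000, boost=1.0, available=True,
                      sort_order=0, **optional):
    """Validate the documented constraints and return a request body dict."""
    errors = []
    if not name or len(name) > 200:
        errors.append("name is required (up to 200 characters)")
    url_list = [u.strip() for u in urls if u.strip()]
    if not url_list or any(not u.startswith(("http:", "https:")) for u in url_list):
        errors.append("urls must be non-empty and use http: or https:")
    if not user_agent or len(user_agent) > 200:
        errors.append("userAgent is required (up to 200 characters)")
    if num_of_thread < 1:
        errors.append("numOfThread must be 1 or greater")
    if interval_time < 0 or sort_order < 0:
        errors.append("intervalTime and sortOrder must be 0 or greater")
    for key in ("depth", "maxAccessCount"):
        if key in optional and optional[key] < 0:
            errors.append(f"{key} must be 0 or greater")
    if errors:
        raise ValueError("; ".join(errors))
    body = {
        "name": name,
        "urls": "\n".join(url_list),  # multiple URLs are newline-separated
        "userAgent": user_agent,
        "numOfThread": num_of_thread,
        "intervalTime": interval_time,
        "boost": boost,
        "available": "true" if available else "false",  # sent as a string
        "sortOrder": sort_order,
    }
    body.update(optional)  # e.g. depth=3, excludedUrls="..."
    return body
```

A call such as `build_create_body("Docs", ["https://docs.example.com/"], "ExampleBot/1.0", depth=3)` returns a dict that can be serialized with `json.dumps`. Audit fields are deliberately left out, since the server sets them.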
Response
Update Web Crawl Configuration
Request
Request Body
An update request requires two fields in addition to those used at creation time: id, which identifies the target configuration, and versionNo. Set versionNo to the current value returned in the GET response.
Additional Fields for Update
| Field | Required | Description |
|---|---|---|
| id | Yes | ID of the configuration to update (up to 1000 characters) |
| versionNo | Yes | Current version number of the configuration to update. Use the versionNo value from the GET response. |
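A typical update is a read-modify-write: fetch the configuration, change some fields, and send everything back with the id and versionNo that were just read. The sketch below covers only the body-building step. It assumes `current` is the configuration object already extracted from a GET response (the response envelope is not reproduced here), and `build_update_body` is a hypothetical helper name.

```python
# Audit fields are set by the server, so they are not sent back.
AUDIT_FIELDS = ("createdBy", "createdTime", "updatedBy", "updatedTime")


def build_update_body(current, changes):
    """Merge `changes` into a fetched configuration for a PUT request."""
    for key in ("id", "versionNo"):
        if key not in current:
            raise KeyError(f"fetched configuration is missing {key}")
    body = {k: v for k, v in current.items() if k not in AUDIT_FIELDS}
    body.update(changes)
    # Always carry over the fetched id and versionNo, even if `changes`
    # tried to override them.
    body["id"] = current["id"]
    body["versionNo"] = current["versionNo"]
    return body
```

Fetching the configuration again before each update keeps versionNo in step with the server's current value.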
Response
Delete Web Crawl Configuration
Request
Response
URL Pattern Examples
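includedUrls, excludedUrls, includedDocUrls, and excludedDocUrls accept regular expressions. Backslashes appear doubled in the patterns below, which is the form needed when a pattern is written inside a JSON string.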
| Pattern | Description |
|---|---|
| .*example\\.com.* | All URLs containing example.com |
| https://example\\.com/docs/.* | Only URLs under /docs/ |
| .*\\.(pdf\|doc\|docx)$ | PDF, DOC, DOCX files |
| .*\\?.* | URLs with query parameters |
| .*/(login\|logout\|admin)/.* | URLs containing specific paths |
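Patterns like these can be tried out locally before they go into a configuration. The sketch below uses Python's `re` module as a stand-in, which is an assumption: Fess runs on Java, whose regex engine differs from Python's in some details, although the constructs used here are common to both. It also assumes the pattern must match the whole URL, and that the doubled backslashes in the table are JSON string escaping, so the regex itself contains a single backslash.

```python
import json
import re


def matches(pattern, url):
    """Return True if `pattern` matches the entire URL."""
    return re.fullmatch(pattern, url) is not None


def as_json_field(field, pattern):
    """Serialize one pattern field; json.dumps doubles the backslashes."""
    return json.dumps({field: pattern})


# Raw strings hold the regex exactly as the regex engine should see it.
DOCS_ONLY = r"https://example\.com/docs/.*"
OFFICE_FILES = r".*\.(pdf|doc|docx)$"
AUTH_PATHS = r".*/(login|logout|admin)/.*"
```

For instance, `matches(OFFICE_FILES, "https://example.com/a/file.pdf")` is true, while the same URL with `?x=1` appended is not matched because the pattern is anchored at the end. `as_json_field("excludedUrls", OFFICE_FILES)` produces the doubled-backslash form used in a request body.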
Usage Examples
Corporate Site Crawl Configuration
Documentation Site Crawl Configuration
Reference
Admin API Overview - Admin API Overview
FileConfig API - File Crawl Configuration API
DataConfig API - Data Store Configuration API
Web Crawling - Web Crawl Configuration Guide