Overview
The FailureUrl API manages the failure URLs recorded by the Fess crawler. It lets you list, retrieve, and delete entries for URLs that encountered errors during crawling.
Base URL
Endpoint List
| Method | Path | Description |
|---|---|---|
| GET | /logs | List failure URLs |
| GET | /log/{id} | Get failure URL |
| DELETE | /log/{id} | Delete failure URL |
| DELETE | /all | Delete all failure URLs |
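The endpoint table can be sketched as a small request builder. This is a minimal illustration, not part of Fess: `build_request`, `ENDPOINTS`, and the `base_url` and `headers` arguments are hypothetical names, and the caller must supply the real base URL and any authentication headers the Admin API requires.

```python
from urllib.parse import quote
from urllib.request import Request

# Method and relative path for each operation, copied from the endpoint table.
ENDPOINTS = {
    "list": ("GET", "/logs"),
    "get": ("GET", "/log/{id}"),
    "delete": ("DELETE", "/log/{id}"),
    "delete_all": ("DELETE", "/all"),
}


def build_request(base_url, operation, entry_id=None, headers=None):
    """Return an unsent urllib Request for one FailureUrl operation.

    base_url is an assumption supplied by the caller (the API's base URL);
    headers is an optional dict, for example for authentication.
    """
    method, path = ENDPOINTS[operation]
    if "{id}" in path:
        if not entry_id:
            raise ValueError(f"operation '{operation}' requires entry_id")
        # Percent-encode the ID so it is safe as a single path segment.
        path = path.replace("{id}", quote(str(entry_id), safe=""))
    return Request(base_url.rstrip("/") + path, method=method, headers=headers or {})
```

The returned object can be passed to `urllib.request.urlopen` once the base URL and credentials are known. Building the request separately from sending it makes the method and path easy to check before a destructive call such as `delete_all`.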
List Failure URLs
Request
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| size | Integer | No | Number of items per page (default: 20) |
| page | Integer | No | Page number (starts at 1, default: 1) |
| url | String | No | URL filter (wildcards * ? supported) |
| errorCountMin | Integer | No | Lower bound for the error count (greater than or equal to the specified value) |
| errorCountMax | Integer | No | Upper bound for the error count (less than or equal to the specified value) |
| errorName | String | No | Error name filter (wildcard match against the stored fully-qualified class name; * ? supported) |
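As a rough sketch of how these parameters combine, the helper below assembles a query string for the list request. The function name `build_list_query` and its snake_case arguments are hypothetical; only the parameter names it emits (size, page, url, errorCountMin, errorCountMax, errorName) come from the table above. The range checks are client-side conveniences, not documented server behavior.

```python
from urllib.parse import urlencode


def build_list_query(size=None, page=None, url=None,
                     error_count_min=None, error_count_max=None,
                     error_name=None):
    """Build the query string for GET /logs, omitting unset parameters."""
    if page is not None and page < 1:
        raise ValueError("page starts at 1")
    if (error_count_min is not None and error_count_max is not None
            and error_count_min > error_count_max):
        raise ValueError("error_count_min must not exceed error_count_max")
    params = {
        "size": size,
        "page": page,
        "url": url,
        "errorCountMin": error_count_min,
        "errorCountMax": error_count_max,
        "errorName": error_name,
    }
    # Drop unset values so the server-side defaults apply.
    return urlencode({k: v for k, v in params.items() if v is not None})


# Second page of 50 entries whose exception class is in java.net.
print(build_list_query(size=50, page=2, error_name="java.net.*"))
# prints: size=50&page=2&errorName=java.net.%2A
```

`urlencode` percent-encodes the wildcard characters, so patterns containing `*` or `?` are not confused with URL syntax.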
Response
Response Fields
| Field | Description |
|---|---|
| id | Failure URL ID |
| url | Failed URL |
| threadName | Thread name |
| errorName | Error name (fully-qualified class name of the exception that occurred; e.g. java.net.ConnectException) |
| errorLog | Error log (exception message or stack trace) |
| errorCount | Number of error occurrences (a numeric value as a string) |
| lastAccessTime | Last access time (epoch milliseconds as a string) |
| configId | Crawl configuration ID |
Note
All response fields are returned as JSON strings, including the numeric ones: errorCount is a number represented as a string, and lastAccessTime is epoch milliseconds represented as a string. Convert them before comparing or sorting numerically.
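Because every field arrives as a string, clients usually convert the numeric ones before using them. The following is a minimal sketch; `normalize_entry` is a hypothetical helper, and the sample entry is invented for illustration rather than taken from a real response.

```python
from datetime import datetime, timedelta, timezone


def normalize_entry(entry):
    """Return a copy of a failure URL entry with typed numeric fields.

    errorCount becomes an int; lastAccessTime (epoch milliseconds as a
    string) becomes a timezone-aware UTC datetime.
    """
    normalized = dict(entry)
    normalized["errorCount"] = int(entry["errorCount"])
    # Split into seconds and milliseconds to avoid float rounding.
    seconds, millis = divmod(int(entry["lastAccessTime"]), 1000)
    normalized["lastAccessTime"] = (
        datetime.fromtimestamp(seconds, tz=timezone.utc)
        + timedelta(milliseconds=millis)
    )
    return normalized


# Invented sample entry; field names follow the Response Fields table.
sample = {
    "id": "example-id",
    "url": "https://example.com/missing",
    "errorName": "java.net.ConnectException",
    "errorCount": "3",
    "lastAccessTime": "1700000000000",
}
entry = normalize_entry(sample)
print(entry["errorCount"] + 1)               # prints: 4
print(entry["lastAccessTime"].isoformat())   # prints: 2023-11-14T22:13:20+00:00
```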
Get Failure URL
Request
Response
Delete Failure URL
Request
Response
Delete All Failure URLs
Deletes all failure URLs. There are no parameters.
Request
Response
Error Types
errorName stores the fully-qualified class name of the exception that occurred during crawling, exactly as captured. It is not a fixed enumeration; any class name may appear depending on the exception that was raised. The following are representative examples.
| Error Name (example) | Description |
|---|---|
| java.net.ConnectException | Connection refused (cannot connect to the server) |
| java.net.UnknownHostException | Host name could not be resolved (DNS error) |
| java.net.SocketTimeoutException | Connection or read timeout |
| javax.net.ssl.SSLException | SSL/TLS handshake or certificate error |
| java.io.IOException | I/O error |
| org.codelibs.fess.exception.ContentNotFoundException | URL that returned an HTTP status code configured in crawler.failure.url.status.codes (default: 403, 404, 410) |
| org.codelibs.fess.crawler.exception.MaxLengthExceededException | Content exceeded the maximum length |
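Since errorName is not a fixed enumeration, client code that labels errors needs a fallback for class names it does not recognize. One possible sketch follows; `KNOWN_ERRORS` and `describe_error` are hypothetical names, and the lookup is an exact match on the class name, so subclasses of the listed exceptions fall through to the fallback.

```python
# Descriptions copied from the representative examples in the table above.
KNOWN_ERRORS = {
    "java.net.ConnectException": "Connection refused (cannot connect to the server)",
    "java.net.UnknownHostException": "Host name could not be resolved (DNS error)",
    "java.net.SocketTimeoutException": "Connection or read timeout",
    "javax.net.ssl.SSLException": "SSL/TLS handshake or certificate error",
    "java.io.IOException": "I/O error",
    "org.codelibs.fess.exception.ContentNotFoundException":
        "HTTP status code listed in crawler.failure.url.status.codes",
    "org.codelibs.fess.crawler.exception.MaxLengthExceededException":
        "Content exceeded the maximum length",
}


def describe_error(error_name):
    """Return a readable label for an errorName value.

    Unknown class names are reported by their simple (unqualified) name.
    """
    if error_name in KNOWN_ERRORS:
        return KNOWN_ERRORS[error_name]
    simple_name = error_name.rsplit(".", 1)[-1]
    return f"Other exception: {simple_name}"


print(describe_error("java.net.UnknownHostException"))
# prints: Host name could not be resolved (DNS error)
print(describe_error("com.example.CustomException"))
# prints: Other exception: CustomException
```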
Usage Examples
List Failure URLs
Filter by Error Count
Filter by Error Name
Get Failure URL
Delete Failure URL
Delete All Failure URLs
Aggregate by Error Type
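One way to aggregate on the client side is to group listed entries by errorName. The sketch below assumes the entries have already been fetched and extracted from the list response as a list of dictionaries; how that list is nested in the response body is not shown here. `aggregate_by_error` is a hypothetical helper, and the sample data is invented.

```python
from collections import defaultdict


def aggregate_by_error(entries):
    """Group failure URL entries by errorName.

    Returns a list of (errorName, url_count, total_errors) tuples sorted
    by url_count, largest first. errorCount is parsed from its string form.
    """
    url_counts = defaultdict(int)
    error_totals = defaultdict(int)
    for entry in entries:
        name = entry["errorName"]
        url_counts[name] += 1
        error_totals[name] += int(entry["errorCount"])
    rows = [(name, url_counts[name], error_totals[name]) for name in url_counts]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Invented sample data using the documented field names.
entries = [
    {"url": "https://example.com/a", "errorName": "java.net.ConnectException", "errorCount": "2"},
    {"url": "https://example.com/b", "errorName": "java.io.IOException", "errorCount": "1"},
    {"url": "https://example.com/c", "errorName": "java.net.ConnectException", "errorCount": "5"},
]
for name, urls, errors in aggregate_by_error(entries):
    print(f"{name}: {urls} URLs, {errors} errors")
# prints:
# java.net.ConnectException: 2 URLs, 7 errors
# java.io.IOException: 1 URLs, 1 errors
```

For large result sets, the entries would need to be collected page by page using the size and page parameters before aggregating.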
Reference
Admin API Overview - Admin API Overview
CrawlingInfo API - Crawling Info API
JobLog API - Job Log API
Failure URL - Failure URL Management Guide