Overview
The Chat API is a v2 API for programmatic access to Fess’s AI search mode (RAG chat) feature. You can obtain LLM-generated answers (completions) based on search results.
This API provides the following three endpoints:
| Endpoint | Description |
|---|---|
| POST /chat | Batch (non-streaming) RAG chat completion. |
| POST /chat/stream | Streaming RAG chat completion (Server-Sent Events). |
| DELETE /chat/sessions/{session_id} | Clears the conversation history of a chat session. |
For the base URL, common response envelope, and error codes, see API Overview.
Local environment example:
Prerequisites
To use the Chat API, the following configuration is required:
The AI search mode (RAG chat) feature must be enabled (rag.chat.enabled=true)
An LLM provider must be configured
If the feature is disabled (rag.chat.enabled=false), requests return an invalid_request error.
For detailed configuration, see AI Mode Configuration and LLM Integration Overview.
Authentication and CSRF
All Chat API endpoints are state-changing requests (POST / DELETE), so the X-Fess-CSRF-Token header is required. For information on how to obtain the CSRF token and details about authentication and sessions, see API Overview.
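As a rough illustration of the header requirement above, the following sketch builds (but does not send) a request that carries the CSRF token. The function name build_request, the placeholder host fess.example, and the token value are assumptions for illustration only; how the real base URL and token are obtained is described in API Overview.

```python
import json
import urllib.request


def build_request(base_url, method, path, csrf_token, payload=None):
    """Build (but do not send) a Chat API request carrying the CSRF token.

    base_url and csrf_token are supplied by the caller; nothing here
    fetches them from the server.
    """
    headers = {"X-Fess-CSRF-Token": csrf_token}
    data = None
    if payload is not None:
        # JSON bodies are sent as UTF-8 with Content-Type: application/json.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=data,
        method=method,
        headers=headers,
    )


# Hypothetical values, for illustration only.
request = build_request(
    "http://fess.example/base", "POST", "/chat", "token-from-api-overview",
    {"message": "hello"},
)
```

The request object can then be passed to urllib.request.urlopen together with whatever session cookie handling the deployment requires.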
Rate Limiting
POST /chat, POST /chat/stream, and DELETE /chat/sessions/{session_id} are subject to a per-user rate limit.
Default: 30 requests per minute (per user)
Configuration key: api.v2.chat.rate.limit.per.user.per.minute
Setting the value to 0 or less disables the rate limit.
When the rate limit is exceeded, a rate_limited error (HTTP 429) is returned. The Retry-After header is set to a fixed value of 60 (seconds). This rate limit is shared between POST /chat, POST /chat/stream, and DELETE /chat/sessions/{session_id}.
Note
The rate limit applies only when the user can be identified. For anonymous calls where no session is established and the user ID cannot be resolved, the rate limit is skipped.
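A client might react to the rate_limited error roughly as follows. This is a minimal sketch, not part of Fess: the names retry_delay and call_with_retry are hypothetical, and send stands for any caller-supplied function that performs one request and returns (status, headers, body).

```python
import time

DEFAULT_RETRY_AFTER_SECONDS = 60  # the fixed Retry-After value documented above


def retry_delay(status, headers):
    """Return the seconds to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        return max(0, int(lowered.get("retry-after", "")))
    except ValueError:
        # Missing or non-numeric header: fall back to the documented value.
        return DEFAULT_RETRY_AFTER_SECONDS


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it is not rate limited or attempts run out.

    Assumes max_attempts >= 1. Returns the (status, body) of the last call.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        delay = retry_delay(status, headers)
        if delay is None or attempt == max_attempts:
            return status, body
        sleep(delay)
```

Passing sleep as a parameter keeps the helper easy to exercise without actually waiting.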
POST /chat
Performs synchronous chat completion. Sessions are identified by session_id. If session_id is omitted, the server creates a session and returns it in the response’s session_id.
Invalid values passed in fields.label or extra_queries are silently removed from the resolved request and do not surface in the response envelope.
Endpoint
Request Body
A JSON body with Content-Type: application/json.
The request body size limit is 32 KiB. Exceeding it results in a payload_too_large error (HTTP 413).
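To avoid a round trip that ends in payload_too_large, a client can check the encoded size before sending. The sketch below assumes the limit is measured on the UTF-8 encoded JSON body; encode_body is a hypothetical helper name.

```python
import json

MAX_BODY_BYTES = 32 * 1024  # 32 KiB, the documented request body limit


def encode_body(payload):
    """Serialise payload to UTF-8 JSON, refusing bodies over the limit."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    # Count bytes, not characters: non-ASCII text takes several bytes each.
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(
            f"request body is {len(body)} bytes; limit is {MAX_BODY_BYTES}"
        )
    return body
```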
Request example:
Response
On success (HTTP 200, ChatResponse)
The response is stored in the common envelope response. session_id is always present.
ChatSource
Response example:
HTTP Status Codes
| Code | Description |
|---|---|
| 200 | Request successful. |
| 400 | Invalid request (missing message, message exceeding the maximum length, rag.chat.enabled=false, etc.). |
| 403 | Missing or expired CSRF token. |
| 405 | HTTP method not allowed. |
| 413 | The request body exceeds the size limit (32 KiB). |
| 415 | The Content-Type is not application/json, is missing, or the charset is not UTF-8. |
| 429 | Rate limit exceeded. |
| 500 | Internal server error. |
cURL Example
POST /chat/stream
Performs streaming chat completion. The request body is the same as POST /chat (ChatRequest).
A successful response is a series of named events in text/event-stream format (Server-Sent Events). Each event consists of event: <name> and data: <JSON>.
Validation failures before the stream begins still return a JSON envelope (same error codes as POST /chat). Invalid values in fields.label or extra_queries are silently removed and do not appear in the response envelope or SSE events.
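Because a pre-stream failure arrives as a JSON envelope rather than as events, a client has to decide how to read the body before parsing it. One possible check, sketched under the assumption that only a successful stream is labelled text/event-stream, looks at the response Content-Type; is_event_stream is a hypothetical helper.

```python
def is_event_stream(content_type):
    """Return True when the Content-Type header announces an SSE stream."""
    # Drop parameters such as "; charset=UTF-8" and compare case-insensitively.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    return media_type == "text/event-stream"
```

When this returns False, the body would be read as a single JSON envelope instead of being fed to an SSE parser.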
Endpoint
SSE Events
| Event | Description (payload) |
|---|---|
| phase | Pipeline phase transition ({ phase, status, message?, keywords?, hit_count?, ... }). message and keywords are emitted on onPhaseStart. Additional optional fields (e.g., hit_count) flow from the onPhaseComplete payload. |
| chunk | Fragment of generated text ({ content }). |
| sources | Retrieved sources ({ sources: [ChatSource] }). |
| retry | Back-off for transient failure ({ phase, operation, attempt, max_attempts, sleep_ms, cause? }). |
| waiting | Progress of a long-running phase ({ phase, reason, elapsed_ms, timeout_ms }). |
| fallback | Query rewrite or strategy fallback ({ phase, reason, original_query?, new_query? }). |
| warning | Recoverable warning ({ phase, code, detail? }). |
| done | Stream end ({ session_id, html_content? }). |
| error | Terminal mid-stream failure ({ phase?, message, error_code }). The message field contains the same string as error_code. Clients should localize based on error_code. |
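The events in the table can be consumed with a small parser along these lines. This is a sketch using only the payload fields listed above; the function names and the sample lines are made up for illustration and are not captured server output.

```python
import json


def parse_sse(lines):
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith(":"):
            continue  # comment line, ignored
        else:
            field, _, value = line.partition(":")
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
    # An event not followed by a blank line is discarded.


def collect_answer(events):
    """Join chunk events into the answer text; return (text, session_id)."""
    parts = []
    for name, payload in events:
        if name == "chunk":
            parts.append(payload["content"])
        elif name == "error":
            raise RuntimeError(payload["error_code"])
        elif name == "done":
            return "".join(parts), payload["session_id"]
        # Other events (phase, sources, retry, ...) are skipped in this sketch.
    raise RuntimeError("stream ended without a done event")


# Hand-written sample lines, for illustration only.
sample = [
    'event: chunk\n', 'data: {"content": "Hello, "}\n', '\n',
    'event: sources\n', 'data: {"sources": []}\n', '\n',
    'event: chunk\n', 'data: {"content": "world"}\n', '\n',
    'event: done\n', 'data: {"session_id": "s-1"}\n', '\n',
]
text, session_id = collect_answer(parse_sse(sample))
```

Raising on error_code rather than message follows the guidance that clients should localize based on error_code.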
SSE stream example:
HTTP Status Codes
When pre-stream validation fails, the following error codes are returned in a JSON envelope.
| Code | Description |
|---|---|
| 200 | SSE stream started (success). |
| 400 | Invalid request (missing message, rag.chat.enabled=false, etc.). |
| 403 | Missing or expired CSRF token. |
| 405 | HTTP method not allowed. |
| 413 | The request body exceeds the size limit (32 KiB). |
| 415 | The Content-Type is not application/json, is missing, or the charset is not UTF-8. |
| 429 | Rate limit exceeded. |
| 500 | Internal server error. |
cURL Example
DELETE /chat/sessions/{session_id}
Clears the conversation history of the specified chat session. The session is identified by the session_id in the path.
On success, cleared: true is returned. When no matching active session exists, a not_found error (HTTP 404) is returned.
Endpoint
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| session_id | string | ID of the session to clear. minLength 1, maxLength 128, pattern ^[A-Za-z0-9._-]+$. |
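The constraints on session_id can be checked on the client before the request is sent. The sketch below folds the length limits into the pattern; session_path is a hypothetical helper and the returned path is relative to the base URL.

```python
import re

# Same character class as the documented pattern, with the 1-128 length
# limit folded in. fullmatch() is used so a trailing newline is rejected.
_SESSION_ID = re.compile(r"[A-Za-z0-9._-]{1,128}")


def session_path(session_id):
    """Return the DELETE path for session_id, or raise ValueError if invalid."""
    if not _SESSION_ID.fullmatch(session_id):
        raise ValueError("session_id must match ^[A-Za-z0-9._-]+$ (1-128 chars)")
    # Every allowed character is URL-safe, so no percent-encoding is needed.
    return "/chat/sessions/" + session_id
```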
Response
On success (HTTP 200, ChatClearResponse)
The response is stored in the common envelope response. session_id and cleared are always present.
| Field | Type | Description |
|---|---|---|
| session_id | string | Session ID. |
| cleared | boolean | Always true (when the session was found and cleared). |
Response example:
HTTP Status Codes
| Code | Description |
|---|---|
| 200 | Session cleared. |
| 400 | Invalid request (e.g., session_id does not match the pattern ^[A-Za-z0-9._-]+$ or the length limit of 1–128 characters, or rag.chat.enabled=false). |
| 403 | Missing or expired CSRF token. |
| 404 | No matching active session found. |
| 405 | HTTP method not allowed. |
| 429 | Rate limit exceeded. |
| 500 | Internal server error. |
cURL Example
Security
Security considerations when using the Chat API:
Authentication: The v2 API uses session-based authentication. See API Overview for details.
CSRF: State-changing requests require the X-Fess-CSRF-Token header.
Rate Limiting: Per-user rate limiting (default 30 requests per minute) is applied to prevent DoS attacks. The configuration key is api.v2.chat.rate.limit.per.user.per.minute.
References
AI Mode Configuration - AI search mode feature configuration
LLM Integration Overview - LLM integration overview
AI Search Mode - End user chat search guide
API Overview - Base URL, common response envelope, error codes, authentication, and CSRF token