Overview
Ollama is an open-source platform for running Large Language Models (LLMs) locally. It is configured as the default LLM provider for Fess and is well suited to private environments.
With Ollama, you can use the AI search mode without sending any data to external services.
Key Features
Local Execution: Data is not sent externally, ensuring privacy
Various Models: Supports multiple models including Llama, Mistral, Gemma, and CodeLlama
Cost Efficiency: No API costs (only hardware costs)
Customization: Can use custom fine-tuned models
Supported Models
Main models available with Ollama:
llama3.3:70b - Meta’s Llama 3.3 (70B parameters)
gemma3:4b - Google’s Gemma 3 (4B parameters, default)
mistral:7b - Mistral AI’s Mistral (7B parameters)
codellama:13b - Meta’s Code Llama (13B parameters)
phi3:3.8b - Microsoft’s Phi-3 (3.8B parameters)
Note
For the latest list of available models, see Ollama Library.
Prerequisites
Before using Ollama, verify the following.
Ollama Installation: Download and install from https://ollama.com/
Model Download: Download the model you want to use to Ollama
Ollama Server Running: Verify Ollama is running
Installing Ollama
Linux/macOS
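On Linux, Ollama provides a one-line install script; on macOS you can install the app from the website or, as sketched here, via Homebrew:

```bash
# Linux: official install script
curl -fsSL https://ollama.com/install.sh | sh

# macOS: Homebrew package (or download the app from https://ollama.com/)
brew install ollama
```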
Windows
Download and run the installer from the official website.
Docker
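The official image serves the API on port 11434; a named volume keeps downloaded models across container restarts:

```bash
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```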
Downloading Models
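Models are fetched with ollama pull. For example, to download the default model used by Fess and confirm it is available:

```bash
ollama pull gemma3:4b
ollama list   # the configured model should appear in this list
```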
Basic Configuration
Add the following settings to app/WEB-INF/conf/fess_config.properties.
Minimal Configuration
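A minimal example, assuming Ollama runs locally on its default port. These values match the defaults listed under Configuration Options, so they mainly make the setup explicit:

```properties
rag.llm.ollama.api.url=http://localhost:11434
rag.llm.ollama.model=gemma3:4b
```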
Recommended Configuration (Production)
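A sketch of a production-leaning configuration; the larger model and longer timeout are illustrative choices, not requirements:

```properties
rag.llm.ollama.api.url=http://localhost:11434
rag.llm.ollama.model=mistral:7b
# Allow slower responses from larger models (milliseconds)
rag.llm.ollama.timeout=120000
```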
Configuration Options
All configuration options available for the Ollama client.
| Property | Description | Default |
|---|---|---|
| rag.llm.ollama.api.url | Ollama server base URL | http://localhost:11434 |
| rag.llm.ollama.model | Model name to use (must be downloaded to Ollama) | gemma3:4b |
| rag.llm.ollama.timeout | Request timeout (in milliseconds) | 60000 |
Network Configuration
Docker Configuration
Example configuration when running both Fess and Ollama in Docker.
docker-compose.yml:
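A minimal sketch of the two services (the search engine and other Fess dependencies are omitted for brevity). The Fess image name and the in-container path for fess_config.properties are assumptions; adjust them to your deployment. The mounted properties file is assumed to set rag.llm.ollama.api.url=http://ollama:11434.

```yaml
services:
  fess:
    image: ghcr.io/codelibs/fess:latest   # assumption: adjust to your Fess image/tag
    ports:
      - "8080:8080"
    volumes:
      # Assumption: this local file sets rag.llm.ollama.api.url=http://ollama:11434;
      # the in-container path may differ depending on the image.
      - ./fess_config.properties:/usr/share/fess/app/WEB-INF/conf/fess_config.properties
    depends_on:
      - ollama

  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama-data:/root/.ollama   # persist downloaded models across restarts

volumes:
  ollama-data:
```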
Note
In Docker Compose environments, use ollama as the hostname (not localhost).
Remote Ollama Server
When running Ollama on a separate server from Fess:
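For example, in app/WEB-INF/conf/fess_config.properties (the hostname is a placeholder). Note that on the Ollama host you typically need to set the OLLAMA_HOST environment variable to 0.0.0.0 so the server listens on more than the loopback interface.

```properties
# Hostname/port are placeholders for your Ollama server
rag.llm.ollama.api.url=http://ollama.internal.example.com:11434
```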
Warning
Ollama does not have authentication by default, so when making it externally accessible, consider network-level security measures (firewall, VPN, etc.).
Model Selection Guide
Guidelines for selecting models based on intended use.
| Model | Size | Required VRAM | Use Case |
|---|---|---|---|
| phi3:3.8b | Small | 4GB+ | Lightweight environments, simple Q&A |
| gemma3:4b | Small-Medium | 6GB+ | Well-balanced general use (default) |
| mistral:7b | Medium | 8GB+ | When high-quality responses are needed |
| llama3.3:70b | Large | 48GB+ | Highest quality responses, complex reasoning |
GPU Support
Ollama supports GPU acceleration. Using an NVIDIA GPU significantly improves inference speed.
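For example, with the NVIDIA Container Toolkit installed, the Docker deployment can expose all GPUs to the container:

```bash
# Expose all NVIDIA GPUs to the Ollama container
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```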
Troubleshooting
Connection Errors
Symptom: Chat functionality returns errors, or the LLM is shown as unavailable
Check the following:
Verify Ollama is running:
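For example (assuming the default port), the root endpoint responds with “Ollama is running” when the server is up:

```bash
curl http://localhost:11434/
```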
Verify the model is downloaded:
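The locally available models can be listed with:

```bash
ollama list
```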
Check firewall settings
Model Not Found
Symptom: “Configured model not found in Ollama” appears in logs
Solutions:
Verify the model name is correct (it may need to include the :latest tag)
Download the required model:
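For example, for the default model:

```bash
ollama pull gemma3:4b
```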
Timeout
Symptom: Requests time out
Solutions:
Extend timeout duration:
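For example, in app/WEB-INF/conf/fess_config.properties (120 seconds shown; pick a value that suits your model):

```properties
rag.llm.ollama.timeout=120000
```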
Consider switching to a smaller model or a GPU-enabled environment
Debug Settings
When investigating issues, adjust Fess log levels to output detailed Ollama-related logs.
app/WEB-INF/classes/log4j2.xml:
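A sketch of a debug logger added inside the existing <Loggers> section. The logger name and appender reference are assumptions; match them to the package names in your Ollama-related log output and to an appender actually defined in your log4j2.xml:

```xml
<Loggers>
  <!-- Assumed logger name; adjust to the package seen in your logs -->
  <Logger name="org.codelibs.fess.rag" level="debug" additivity="false">
    <AppenderRef ref="AppFile"/>  <!-- assumed appender name -->
  </Logger>
</Loggers>
```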
References
LLM Integration Overview
AI Mode Configuration