File Crawling

Overview

The File Crawling configuration page allows you to manage settings for crawling files in the file system or shared folders on the network.

Management Method

Display Method

To open the list page for File Crawling settings, click on “[Crawler > File System]” in the left menu.

image0

To edit, click on the setting name.

Creating Settings

To open the File Crawling configuration page, click on the “Create New” button.

image1

Setting Items

The name of the setting.

Specifies the starting location for crawling (e.g., file:/ or smb://).

Paths that match the regular expression (Java format) specified in this field are crawled by the Fess crawler.

Paths that match the regular expression (Java format) specified in this field are not crawled by the Fess crawler.

Paths that match the regular expression (Java format) specified in this field are indexed and can appear in search results.

Paths that match the regular expression (Java format) specified in this field are not indexed and are excluded from search results.
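
The include and exclude fields above take Java regular expressions, so it can help to try a pattern against sample paths before saving it. The following is a minimal sketch, not Fess's own implementation: the class PathFilterSketch and its method accepts are hypothetical names, and it assumes that a pattern must match the whole path and that an exclude match always overrides an include match.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Fess: illustrates include/exclude
// filtering of paths with Java regular expressions.
class PathFilterSketch {
    private final List<Pattern> includes;
    private final List<Pattern> excludes;

    PathFilterSketch(List<String> includes, List<String> excludes) {
        this.includes = includes.stream().map(Pattern::compile).collect(Collectors.toList());
        this.excludes = excludes.stream().map(Pattern::compile).collect(Collectors.toList());
    }

    // Assumption: a path is accepted when it matches at least one include
    // pattern (or no include patterns are given) and matches no exclude pattern.
    // Matcher.matches() requires the pattern to match the entire path.
    boolean accepts(String path) {
        boolean included = includes.isEmpty()
                || includes.stream().anyMatch(p -> p.matcher(path).matches());
        boolean excluded = excludes.stream().anyMatch(p -> p.matcher(path).matches());
        return included && !excluded;
    }

    public static void main(String[] args) {
        PathFilterSketch filter = new PathFilterSketch(
                List.of("file:/home/share/.*"),        // include everything under /home/share
                List.of(".*\\.tmp", ".*/\\.git/.*"));  // exclude temp files and .git directories

        System.out.println(filter.accepts("file:/home/share/docs/report.pdf")); // true
        System.out.println(filter.accepts("file:/home/share/docs/draft.tmp"));  // false
        System.out.println(filter.accepts("file:/var/log/syslog"));             // false
    }
}
```

Because the sketch matches the whole path, a pattern such as \.tmp on its own would not exclude anything here; it needs the leading .* shown above.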

Specifies additional configuration parameters for the crawl.

Specifies the depth of the file system structure to crawl.

Specifies the maximum number of paths to index.

Specifies the number of threads to use for this setting.

Specifies how long each thread waits between crawling paths.

The boost value represents the priority of documents indexed by this setting.

Specifies the permissions for this setting. To display search results to users belonging to the developer group, specify {group}developer. User-level specification is {user}username, role-level specification is {role}rolename, and group-level specification is {group}groupname.
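
The permission format described above is a type prefix in braces followed by a name. As an illustration only, the sketch below splits such an entry into its type and name; PermissionSketch and parse are hypothetical names, and the code does not reflect how Fess itself stores or evaluates permissions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Fess: splits a permission entry such as
// "{group}developer" into its type ("group") and its name ("developer").
class PermissionSketch {
    // Accepts only the three prefixes named in the text: user, group, role.
    private static final Pattern ENTRY = Pattern.compile("\\{(user|group|role)\\}(.+)");

    // Returns a two-element array: [type, name].
    static String[] parse(String entry) {
        Matcher m = ENTRY.matcher(entry.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized permission entry: " + entry);
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parsed = parse("{group}developer");
        System.out.println(parsed[0] + " -> " + parsed[1]); // group -> developer
    }
}
```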

Specifies the hostname of the virtual host. For more information, refer to the Virtual Host section of the Configuration Guide.

When this setting is enabled, the default crawler job will include this setting in the crawl.

You can enter a description.

Deleting Settings

On the list page, click the setting name, then click the delete button to display the confirmation screen. Clicking the delete button on the confirmation screen removes the setting.

Example

Crawling Local Files

If you want to crawl files under /home/share, the settings would be as follows:

Item   Value
Name   Share Directory
Paths  file:/home/share

Other parameters can be left as default.

Crawling a Windows Shared Folder

If you want to crawl files under \\SERVER\SharedFolder, the configuration should be as follows:

Item   Value
Name   Shared Folder
Paths  smb://SERVER/SharedFolder/
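
Windows usually displays a share as a UNC path with backslashes, while the Paths field expects an smb:// URL. The sketch below shows one way to convert between the two forms. SmbPathSketch and uncToSmbUrl are hypothetical names, and the code assumes a simple server and share name with no characters that need URL encoding.

```java
// Hypothetical helper, not part of Fess: converts a Windows UNC path such as
// \\SERVER\SharedFolder into the smb:// form used in the Paths field.
class SmbPathSketch {
    static String uncToSmbUrl(String unc) {
        if (!unc.startsWith("\\\\")) {
            throw new IllegalArgumentException("Not a UNC path: " + unc);
        }
        // Drop the leading double backslash and switch to forward slashes.
        String rest = unc.substring(2).replace('\\', '/');
        // End with a slash so the URL refers to the folder itself.
        if (!rest.endsWith("/")) {
            rest += "/";
        }
        // Assumption: no URL encoding is applied, so names containing spaces
        // or other special characters would need extra handling.
        return "smb://" + rest;
    }

    public static void main(String[] args) {
        System.out.println(uncToSmbUrl("\\\\SERVER\\SharedFolder")); // smb://SERVER/SharedFolder/
    }
}
```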

If a username and password are required to access the shared folder, create a file authentication setting under [Crawler > File Authentication] in the left menu. The configuration will be as follows:

Item      Value
Hostname  SERVER
Scheme    SAMBA
Username  (Enter your username)
Password  (Enter your password)