|
a/src/doc/man/recoll.conf.5 |
|
b/src/doc/man/recoll.conf.5 |
1 |
.\" $Id: recoll.conf.5,v 1.5 2007-07-13 10:18:49 dockes Exp $ (C) 2005 J.F.Dockes\$
|
1 |
.\" $Id: recoll.conf.5,v 1.5 2007-07-13 10:18:49 dockes Exp $ (C) 2005 J.F.Dockes\$
|
2 |
.TH RECOLL.CONF 5 "8 January 2006"
|
2 |
.TH RECOLL.CONF 5 "8 January 2006"
|
3 |
.SH NAME
|
3 |
.SH NAME
|
4 |
recoll.conf \- main personal configuration file for Recoll
|
4 |
recoll.conf \- main personal configuration file for Recoll
|
5 |
.SH DESCRIPTION
|
5 |
.SH DESCRIPTION
|
6 |
This file defines the indexation configuration for the Recoll full-text search
|
6 |
This file defines the index configuration for the Recoll full-text search
|
7 |
system.
|
7 |
system.
|
8 |
.LP
|
8 |
.LP
|
9 |
The system-wide configuration file is normally located inside
|
9 |
The system-wide configuration file is normally located inside
|
10 |
/usr/[local]/share/recoll/examples. Any parameter set in the common file
|
10 |
/usr/[local]/share/recoll/examples. Any parameter set in the common file
|
11 |
may be overridden by setting it in the personal configuration file, by default:
|
11 |
may be overridden by setting it in the personal configuration file, by default:
|
12 |
.IR $HOME/.recoll/recoll.conf
|
12 |
.IR $HOME/.recoll/recoll.conf
|
13 |
.LP
|
13 |
.LP
|
14 |
Please note that while we try to keep this manual page reasonably up to date, it
|
14 |
Please note that while we try to keep this manual page reasonably up to date, it
|
15 |
will frequently lag the current state of the software. The best source of
|
15 |
will frequently lag the current state of the software. The best source of
|
16 |
information about the configuration are the comments in the configuration
|
16 |
information about the configuration are the comments in the system-wide
|
17 |
file.
|
17 |
configuration file.
|
18 |
|
18 |
|
19 |
.LP
|
19 |
.LP
|
20 |
A short extract of the file might look as follows:
|
20 |
A short extract of the file might look as follows:
|
21 |
.IP
|
21 |
.IP
|
22 |
.nf
|
22 |
.nf
|
|
... |
|
... |
42 |
Empty lines or lines beginning with # are ignored.
|
42 |
Empty lines or lines beginning with # are ignored.
|
43 |
.LP
|
43 |
.LP
|
44 |
Assignment lines are in the form 'name = value'.
|
44 |
Assignment lines are in the form 'name = value'.
|
45 |
.LP
|
45 |
.LP
|
46 |
Section lines allow redefining a parameter for a directory subtree. Some of
|
46 |
Section lines allow redefining a parameter for a directory subtree. Some of
|
47 |
the parameters used for indexaction are looked up hierarchically from the
|
47 |
the parameters used for indexing are looked up hierarchically from the
|
48 |
more to the less specific. Not all parameters can be meaningfully
|
48 |
more to the less specific. Not all parameters can be meaningfully
|
49 |
redefined, this is specified for each in the next section.
|
49 |
redefined, this is specified for each in the next section.
|
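The hierarchical lookup described above can be sketched as follows. This is only an illustration of the "more to less specific" rule, not Recoll's actual code: the `lookup` function, the `sections` mapping (directory path to overriding parameters) and the example paths are all hypothetical.

```python
import posixpath


def lookup(global_params, sections, name, path):
    """Return the value of `name` that applies to `path`.

    `sections` maps a directory path to the parameters redefined for that
    subtree (hypothetical structure). The search starts at the most specific
    directory and walks up; the global value is used as a last resort.
    """
    current = posixpath.normpath(path)
    while True:
        value = sections.get(current, {}).get(name)
        if value is not None:
            return value
        parent = posixpath.dirname(current)
        if parent == current:  # reached the top ("/" or "")
            break
        current = parent
    return global_params.get(name)


# Hypothetical configuration: one global value, one subtree override.
global_params = {"defaultcharset": "utf-8"}
sections = {"/home/me/mail": {"defaultcharset": "iso-8859-1"}}

in_mail = lookup(global_params, sections, "defaultcharset",
                 "/home/me/mail/inbox/msg1")
elsewhere = lookup(global_params, sections, "defaultcharset",
                   "/home/me/docs/a.txt")
```

Here `in_mail` takes the value set for the `/home/me/mail` subtree, while `elsewhere` falls back to the global setting.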
50 |
.LP
|
50 |
.LP
|
51 |
The tilde character (~) is expanded in file names to the name of the user's
|
51 |
The tilde character (~) is expanded in file names to the name of the user's
|
52 |
home directory.
|
52 |
home directory.
|
|
... |
|
... |
55 |
embedded spaces can be quoted with double-quotes.
|
55 |
embedded spaces can be quoted with double-quotes.
|
56 |
.SH OPTIONS
|
56 |
.SH OPTIONS
|
57 |
.TP
|
57 |
.TP
|
58 |
.BI "topdirs = " directories
|
58 |
.BI "topdirs = " directories
|
59 |
Specifies the list of directories to index (recursively).
|
59 |
Specifies the list of directories to index (recursively).
|
60 |
.TP
|
|
|
61 |
.BI "dbdir = " directory
|
|
|
62 |
The name of the Xapian database directory. It will be created if needed
|
|
|
63 |
when the database is initialized. If this is not an absolute pathname, it
|
|
|
64 |
will be taken relative to the configuration directory.
|
|
|
65 |
.TP
|
60 |
.TP
|
66 |
.BI "skippedNames = " patterns
|
61 |
.BI "skippedNames = " patterns
|
67 |
A space-separated list of patterns for names of files or directories that
|
62 |
A space-separated list of patterns for names of files or directories that
|
68 |
should be completely ignored. The list defined in the default file is:
|
63 |
should be completely ignored. The list defined in the default file is:
|
69 |
.sp
|
64 |
.sp
|
|
... |
|
... |
76 |
.I topdirs
|
71 |
.I topdirs
|
77 |
.TP
|
72 |
.TP
|
78 |
.BI "skippedPaths = " patterns
|
73 |
.BI "skippedPaths = " patterns
|
79 |
A space-separated list of patterns for paths the indexer should not descend
|
74 |
A space-separated list of patterns for paths the indexer should not descend
|
80 |
into. Together with topdirs, this allows pruning the indexed tree to one's
|
75 |
into. Together with topdirs, this allows pruning the indexed tree to one's
|
81 |
content. daemSkippedPaths can be used to define a specific value for the
|
76 |
content.
|
82 |
real time indexing monitor.
|
77 |
.B daemSkippedPaths
|
|
|
78 |
can be used to define a specific value for the real time indexing monitor.
|
|
|
79 |
.TP
|
|
|
80 |
.BI "skippedPathsFnmPathname = " 0/1
|
|
|
81 |
The values in the *skippedPaths variables are matched by default with
|
|
|
82 |
fnmatch(3), with the FNM_PATHNAME and FNM_LEADING_DIR flags. This means
|
|
|
83 |
that '/' characters must be matched explicitly. You can set
|
|
|
84 |
skippedPathsFnmPathname to 0 to disable the use of FNM_PATHNAME (meaning
|
|
|
85 |
that /*/dir3 will match /dir1/dir2/dir3).
|
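The effect of FNM_PATHNAME can be illustrated with a rough emulation. Python's `fnmatch` module has no flag support, so the sketch below approximates the two modes by hand; `path_is_skipped` is a hypothetical helper, and it assumes that FNM_LEADING_DIR stays in effect when FNM_PATHNAME is disabled, which the text above does not state.

```python
from fnmatch import fnmatchcase


def path_is_skipped(path, pattern, fnm_pathname=True):
    """Approximate fnmatch(3) matching of `pattern` against `path`.

    fnm_pathname=True : '*' never matches '/', so the match is done one
                        path component at a time (FNM_PATHNAME).
    fnm_pathname=False: '*' may span '/' characters.
    In both modes the pattern may match a leading part of the path that is
    followed by '/' (a stand-in for FNM_LEADING_DIR, assumed here).
    """
    parts = path.split("/")
    if fnm_pathname:
        pat_parts = pattern.split("/")
        if len(parts) < len(pat_parts):
            return False
        return all(fnmatchcase(c, p) for c, p in zip(parts, pat_parts))
    # Try every prefix of the path that ends on a component boundary.
    return any(
        fnmatchcase("/".join(parts[:i]), pattern)
        for i in range(1, len(parts) + 1)
    )


strict = path_is_skipped("/dir1/dir2/dir3", "/*/dir3", fnm_pathname=True)
loose = path_is_skipped("/dir1/dir2/dir3", "/*/dir3", fnm_pathname=False)
```

With the default setting, `/*/dir3` does not match `/dir1/dir2/dir3` because `*` cannot cross a `/`; with the flag disabled it does.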
83 |
.TP
|
86 |
.TP
|
84 |
.BI "followLinks = " boolean
|
87 |
.BI "followLinks = " boolean
|
85 |
Specifies if the indexer should follow
|
88 |
Specifies if the indexer should follow
|
86 |
symbolic links while walking the file tree. The default is
|
89 |
symbolic links while walking the file tree. The default is
|
87 |
to ignore symbolic links to avoid multiple indexing of
|
90 |
to ignore symbolic links to avoid multiple indexing of
|
|
... |
|
... |
91 |
.I topdirs
|
94 |
.I topdirs
|
92 |
members by using sections. It can not be changed below the
|
95 |
members by using sections. It can not be changed below the
|
93 |
.I topdirs
|
96 |
.I topdirs
|
94 |
level.
|
97 |
level.
|
95 |
.TP
|
98 |
.TP
|
96 |
.BI "loglevel = " value
|
99 |
.BI "indexedmimetypes = " list
|
97 |
Verbosity level for recoll and recollindex. A value of 4 lists quite a lot of
|
100 |
Recoll normally indexes any file which it knows how to read. This list lets
|
98 |
debug/information messages. 3 lists only errors.
|
101 |
you restrict the indexed mime types to what you specify. If the variable is
|
99 |
.B daemloglevel
|
102 |
unspecified or the list empty (the default), all supported types are
|
100 |
can be used to specify a different value for the real-time indexing daemon.
|
103 |
processed.
|
|
|
104 |
.TP
|
|
|
105 |
.BI "compressedfilemaxkbs = " value
|
|
|
106 |
Size limit for compressed (.gz or .bz2) files. These need to be
|
|
|
107 |
decompressed in a temporary directory for identification, which can be very
|
|
|
108 |
wasteful if 'uninteresting' big compressed files are present. Negative
|
|
|
109 |
means no limit, 0 means no processing of any compressed file. Defaults
|
|
|
110 |
to \-1.
|
|
|
111 |
.TP
|
|
|
112 |
.BI "textfilemaxmbs = " value
|
|
|
113 |
Maximum size for text files. Very big text files are often uninteresting
|
|
|
114 |
logs. Set to -1 to disable (default 20MB).
|
|
|
115 |
.TP
|
|
|
116 |
.BI "textfilepagekbs = " value
|
|
|
117 |
If this is set to other than -1, text files will be indexed as multiple
|
|
|
118 |
documents of the given page size. This may be useful if you do want to
|
|
|
119 |
index very big text files as it will both reduce memory usage at index time
|
|
|
120 |
and help with loading data to the preview window. A size of a few megabytes
|
|
|
121 |
would seem reasonable (default: 1000, i.e. 1 MB).
|
|
|
122 |
.TP
|
|
|
123 |
.BI "membermaxkbs = " "value in kilobytes"
|
|
|
124 |
This defines the maximum size for an archive member (zip, tar or rar at
|
|
|
125 |
the moment). Bigger entries will be skipped. Current default: 50000 (50 MB).
|
|
|
126 |
.TP
|
|
|
127 |
.BI "indexallfilenames = " boolean
|
|
|
128 |
Recoll indexes file names into a special section of the database to allow
|
|
|
129 |
specific file names searches using wild cards. This parameter decides if
|
|
|
130 |
file name indexing is performed only for files with mime types that would
|
|
|
131 |
qualify them for full text indexing, or for all files inside
|
|
|
132 |
the selected subtrees, independent of mime type.
|
|
|
133 |
.TP
|
|
|
134 |
.BI "usesystemfilecommand = " boolean
|
|
|
135 |
Decide if we use the
|
|
|
136 |
.B "file \-i"
|
|
|
137 |
system command as a final step for determining the mime type for a file
|
|
|
138 |
(the main procedure uses suffix associations as defined in the
|
|
|
139 |
.B mimemap
|
|
|
140 |
file). This can be useful for files with suffixless names, but it will
|
|
|
141 |
also cause the indexing of many bogus "text" files.
|
101 |
.TP
|
142 |
.TP
|
102 |
.BI "logfilename = " file
|
143 |
.BI "processbeaglequeue = " 0/1
|
103 |
Where should the messages go. 'stderr' can be used as a special value.
|
144 |
If this is set, process the directory where Beagle Web browser plugins copy
|
104 |
.B daemlogfilename
|
145 |
visited pages for indexing. Of course, Beagle MUST NOT be running, else
|
105 |
can be used to specify a different value for the real-time indexing daemon.
|
146 |
things will behave strangely.
|
|
|
147 |
.TP
|
|
|
148 |
.BI "beaglequeuedir = " "directory path"
|
|
|
149 |
The path to the Beagle indexing queue. This is hard-coded in the Beagle
|
|
|
150 |
plugin as ~/.beagle/ToIndex so there should be no need to change it.
|
|
|
151 |
.TP
|
|
|
152 |
.BI "indexStripChars = " 0/1
|
|
|
153 |
Decide if we strip characters of diacritics and convert them to lower-case
|
|
|
154 |
before terms are indexed. If we don't, searches sensitive to case and
|
|
|
155 |
diacritics can be performed, but the index will be bigger, and some
|
|
|
156 |
marginal weirdness may sometimes occur. The default is a stripped index
|
|
|
157 |
(indexStripChars = 1) for now. When using multiple indexes for a search,
|
|
|
158 |
this parameter must be defined identically for all. Changing the value
|
|
|
159 |
implies an index reset.
|
|
|
160 |
.TP
|
|
|
161 |
.BI "maxTermExpand = " value
|
|
|
162 |
Maximum expansion count for a single term (e.g.: when using wildcards). The
|
|
|
163 |
default of 10000 is reasonable and will avoid queries that appear frozen
|
|
|
164 |
while the engine is walking the term list.
|
|
|
165 |
.TP
|
|
|
166 |
.BI "maxXapianClauses = " value
|
|
|
167 |
Maximum number of elementary clauses we can add to a single Xapian
|
|
|
168 |
query. In some cases, the result of term expansion can be multiplicative,
|
|
|
169 |
and we want to avoid using excessive memory. The default of 100 000 should
|
|
|
170 |
be both high enough in most cases and compatible with current typical
|
|
|
171 |
hardware configurations.
|
|
|
172 |
.TP
|
|
|
173 |
.BI "nonumbers = " 0/1
|
|
|
174 |
If this is set to true, no terms will be generated for numbers. For example
|
|
|
175 |
"123", "1.5e6", 192.168.1.4, would not be indexed ("value123" would still
|
|
|
176 |
be). Numbers are often quite interesting to search for, and this should
|
|
|
177 |
probably not be set except for special situations, e.g. scientific documents
|
|
|
178 |
with huge amounts of numbers in them. This can only be set for a whole
|
|
|
179 |
index, not for a subtree.
|
|
|
180 |
.TP
|
|
|
181 |
.BI "nocjk = " boolean
|
|
|
182 |
If this is set to true, specific East Asian (Chinese, Korean, Japanese)
|
|
|
183 |
characters/word splitting is turned off. This will save a small amount of
|
|
|
184 |
cpu if you have no CJK documents. If your document base does include such
|
|
|
185 |
text but you are not interested in searching it, setting
|
|
|
186 |
.I nocjk
|
|
|
187 |
may be a significant time and space saver.
|
|
|
188 |
.TP
|
|
|
189 |
.BI "cjkngramlen = " value
|
|
|
190 |
This lets you adjust the size of n-grams used for indexing CJK text. The
|
|
|
191 |
default value of 2 is probably appropriate in most cases. A value of 3
|
|
|
192 |
would allow more precision and efficiency on longer words, but the index
|
|
|
193 |
will be approximately twice as large.
|
106 |
.TP
|
194 |
.TP
|
107 |
.BI "indexstemminglanguages = " languages
|
195 |
.BI "indexstemminglanguages = " languages
|
108 |
A list of languages for which the stem expansion databases will be
|
196 |
A list of languages for which the stem expansion databases will be
|
109 |
built. See recollindex(1) for possible values.
|
197 |
built. See recollindex(1) for possible values.
|
110 |
.TP
|
198 |
.TP
|
111 |
.BI "defaultcharset = " charset
|
199 |
.BI "defaultcharset = " charset
|
112 |
The name of the character set used for files that do not contain a
|
200 |
The name of the character set used for files that do not contain a
|
113 |
character set definition (ie: plain text files). This can be redefined for
|
201 |
character set definition (ie: plain text files). This can be redefined for
|
114 |
any subdirectory.
|
202 |
any subdirectory.
|
|
|
203 |
.TP
|
|
|
204 |
.BI "unac_except_trans = " "list of utf-8 groups"
|
|
|
205 |
This is a list of characters, encoded in UTF-8, which should be handled
|
|
|
206 |
specially when converting text to unaccented lowercase. For example, in
|
|
|
207 |
Swedish, the letter "a with diaeresis" has full alphabet citizenship and
|
|
|
208 |
should not be turned into an a.
|
|
|
209 |
.br
|
|
|
210 |
Each element in the space-separated list has the special character as first
|
|
|
211 |
element and the translation following. The handling of both the lowercase
|
|
|
212 |
and upper-case versions of a character should be specified, as belonging
|
|
|
213 |
to the list will turn off both standard accent and case processing.
|
|
|
214 |
.br
|
|
|
215 |
Note that the translation is not limited to a single character.
|
|
|
216 |
.br
|
|
|
217 |
This parameter cannot be redefined for subdirectories, it is global,
|
|
|
218 |
because there is no way to do otherwise when querying. If you have document
|
|
|
219 |
sets which would need different values, you will have to index and query
|
|
|
220 |
them separately.
|
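How such an exception list interacts with accent and case folding can be sketched as below. This is an approximation built on Python's `unicodedata`, not Recoll's own unaccenting code; `parse_exceptions` and `unac_fold` are hypothetical names, and the example value only covers "a with diaeresis".

```python
import unicodedata


def parse_exceptions(value):
    """Split a space-separated list of groups into {special char: translation}."""
    return {group[0]: group[1:] for group in value.split()}


def unac_fold(text, exceptions):
    """Strip diacritics and lowercase `text`, except for listed characters.

    A character found in `exceptions` is replaced by its translation and
    receives neither accent nor case processing.
    """
    out = []
    for ch in text:
        if ch in exceptions:
            out.append(exceptions[ch])
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        out.append(base.lower())
    return "".join(out)


# Both cases of "a with diaeresis" (U+00E4, U+00C4) map to the lowercase form.
exceptions = parse_exceptions("\u00e4\u00e4 \u00c4\u00e4")
folded = unac_fold("\u00c4lg \u00c9t\u00e9", exceptions)
```

In this example the leading U+00C4 becomes U+00E4 and keeps its diaeresis, while the accented E characters are reduced to a plain lowercase "e".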
|
|
221 |
.TP
|
|
|
222 |
.BI "maildefcharset = " "character set name"
|
|
|
223 |
This can be used to define the default character set specifically for email
|
|
|
224 |
messages which don't specify it. This is mainly useful for readpst (libpst)
|
|
|
225 |
dumps, which are utf-8 but do not say so.
|
|
|
226 |
.TP
|
|
|
227 |
.BI "localfields = " "fieldname = value:..."
|
|
|
228 |
This allows setting fields for all documents under a given
|
|
|
229 |
directory. Typical usage would be to set an "rclaptg" field, to be used in
|
|
|
230 |
mimeview to select a specific viewer. If several fields are to be set, they
|
|
|
231 |
should be separated with a colon (':') character (which there is currently
|
|
|
232 |
no way to escape). E.g.: localfields= rclaptg=gnus:other = val, then select
|
|
|
233 |
specific viewer with mimetype|tag=... in mimeview.
|
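The value format described above (colon-separated `name = value` pairs, with no escaping of the colon) can be read with a few lines of Python. `parse_localfields` is a hypothetical helper written for illustration and makes no claim about how Recoll itself parses the value.

```python
def parse_localfields(value):
    """Parse 'name = value:name2 = value2' into a dict.

    Fields are separated by ':'; since the colon cannot be escaped,
    a plain split is enough. Whitespace around names and values is dropped.
    """
    fields = {}
    for item in value.split(":"):
        name, sep, field_value = item.partition("=")
        if not sep:
            continue  # ignore fragments without an '=' sign
        fields[name.strip()] = field_value.strip()
    return fields


fields = parse_localfields("rclaptg=gnus:other = val")
```

With the example from the text, `fields` holds one entry for `rclaptg` and one for `other`.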
|
|
234 |
.TP
|
|
|
235 |
.BI "dbdir = " directory
|
|
|
236 |
The name of the Xapian database directory. It will be created if needed
|
|
|
237 |
when the database is initialized. If this is not an absolute pathname, it
|
|
|
238 |
will be taken relative to the configuration directory.
|
|
|
239 |
.TP
|
|
|
240 |
.BI "idxstatusfile = " "file path"
|
|
|
241 |
The name of the scratch file where the indexer process updates its
|
|
|
242 |
status. Default: idxstatus.txt inside the configuration directory.
|
115 |
.TP
|
243 |
.TP
|
116 |
.BI "maxfsoccuppc = " percentnumber
|
244 |
.BI "maxfsoccuppc = " percentnumber
|
117 |
Maximum file system occupation before we
|
245 |
Maximum file system occupation before we
|
118 |
stop indexing. The value is a percentage, corresponding to
|
246 |
stop indexing. The value is a percentage, corresponding to
|
119 |
what the "Capacity" df output column shows. The default
|
247 |
what the "Capacity" df output column shows. The default
|
120 |
value is 0, meaning no checking.
|
248 |
value is 0, meaning no checking.
|
|
|
249 |
.TP
|
|
|
250 |
.BI "mboxcachedir = " "directory path"
|
|
|
251 |
The directory where mbox message offsets cache files are held. This is
|
|
|
252 |
normally $RECOLL_CONFDIR/mboxcache, but it may be useful to share a
|
|
|
253 |
directory between different configurations.
|
|
|
254 |
.TP
|
|
|
255 |
.BI "mboxcacheminmbs = " "value in megabytes"
|
|
|
256 |
The minimum mbox file size over which we cache the offsets. There is really no sense in caching offsets for small files. The default is 5 MB.
|
|
|
257 |
.TP
|
|
|
258 |
.BI "webcachedir = " "directory path"
|
|
|
259 |
This is only used by the Beagle web browser plugin indexing code, and
|
|
|
260 |
defines where the cache for visited pages will live. Default:
|
|
|
261 |
$RECOLL_CONFDIR/webcache
|
|
|
262 |
.TP
|
|
|
263 |
.BI "webcachemaxmbs = " "value in megabytes"
|
|
|
264 |
This is only used by the Beagle web browser plugin indexing code, and
|
|
|
265 |
defines the maximum size for the web page cache. Default: 40 MB.
|
121 |
.TP
|
266 |
.TP
|
122 |
.BI "idxflushmb = " megabytes
|
267 |
.BI "idxflushmb = " megabytes
|
123 |
Threshold (megabytes of new text data)
|
268 |
Threshold (megabytes of new text data)
|
124 |
where we flush from memory to disk index. Setting this can
|
269 |
where we flush from memory to disk index. Setting this can
|
125 |
help control memory usage. A value of 0 means no explicit
|
270 |
help control memory usage. A value of 0 means no explicit
|
126 |
flushing, letting Xapian use its own default, which is
|
271 |
flushing, letting Xapian use its own default, which is
|
127 |
flushing every 10000 documents (or XAPIAN_FLUSH_THRESHOLD), meaning that
|
272 |
flushing every 10000 documents (or XAPIAN_FLUSH_THRESHOLD), meaning that
|
128 |
memory usage depends on average document size. The default value is 10.
|
273 |
memory usage depends on average document size. The default value is 10.
|
129 |
.TP
|
274 |
.TP
|
|
|
275 |
.BI "autodiacsens = " 0/1
|
|
|
276 |
If the index is not stripped, decide if we automatically trigger diacritics
|
|
|
277 |
sensitivity if the search term has accented characters (not in
|
|
|
278 |
unac_except_trans). Else you need to use the query language and the D
|
|
|
279 |
modifier to specify diacritics sensitivity. Default is no.
|
|
|
280 |
.TP
|
|
|
281 |
.BI "autocasesens = " 0/1
|
|
|
282 |
If the index is not stripped, decide if we automatically trigger character
|
|
|
283 |
case sensitivity if the search term has upper-case characters in any but
|
|
|
284 |
the first position. Else you need to use the query language and the C
|
|
|
285 |
modifier to specify character-case sensitivity. Default is yes.
|
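The rule for automatic case sensitivity can be expressed as a small predicate. This is a sketch of the rule as worded above, with hypothetical names; it ignores the explicit C query-language modifier.

```python
def triggers_case_sensitivity(term, autocasesens=True, index_stripped=False):
    """Return True if `term` should be searched case-sensitively.

    Only meaningful for a non-stripped index. An upper-case character in
    any position but the first triggers sensitivity, so a merely
    capitalized word does not.
    """
    if index_stripped or not autocasesens:
        return False
    return any(ch.isupper() for ch in term[1:])


capitalized = triggers_case_sensitivity("Recoll")
mixed = triggers_case_sensitivity("reColl")
```

A capitalized term such as "Recoll" stays case-insensitive, while "reColl" triggers a case-sensitive search.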
|
|
286 |
.TP
|
|
|
287 |
.BI "loglevel = " value
|
|
|
288 |
Verbosity level for recoll and recollindex. A value of 4 lists quite a lot of
|
|
|
289 |
debug/information messages. 3 lists only errors.
|
|
|
290 |
.B daemloglevel
|
|
|
291 |
can be used to specify a different value for the real-time indexing daemon.
|
|
|
292 |
.TP
|
|
|
293 |
.BI "logfilename = " file
|
|
|
294 |
Where should the messages go. 'stderr' can be used as a special value.
|
|
|
295 |
.B daemlogfilename
|
|
|
296 |
can be used to specify a different value for the real-time indexing daemon.
|
|
|
297 |
.TP
|
|
|
298 |
.BI "mondelaypatterns = " "list of patterns"
|
|
|
299 |
This allows specifying wildcard path patterns (processed with fnmatch(3) with
|
|
|
300 |
0 flag), to match files which change too often and for which a delay should
|
|
|
301 |
be observed before re-indexing. This is a space-separated list, each entry
|
|
|
302 |
being a pattern and a time in seconds, separated by a colon. You can use
|
|
|
303 |
double quotes if a path entry contains white space. Example:
|
|
|
304 |
.sp
|
|
|
305 |
mondelaypatterns = *.log:20 "this one has spaces*:10"
|
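The entry format of the example above (space-separated entries, double quotes around entries containing white space, a colon before the delay) maps well onto Python's `shlex` module. The parser below is a hypothetical sketch for reading such a value, not the code Recoll uses.

```python
import shlex


def parse_mondelaypatterns(value):
    """Parse a mondelaypatterns value into (pattern, seconds) pairs.

    shlex.split handles the double-quoted entries; the delay is whatever
    follows the last ':' of each entry.
    """
    entries = []
    for item in shlex.split(value):
        pattern, _, seconds = item.rpartition(":")
        entries.append((pattern, int(seconds)))
    return entries


delays = parse_mondelaypatterns('*.log:20 "this one has spaces*:10"')
```

For the sample value, `delays` contains the pattern `*.log` with a 20 second delay and the quoted pattern with a 10 second delay.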
|
|
306 |
.TP
|
|
|
307 |
.BI "monixinterval = " "value in seconds"
|
|
|
308 |
Minimum interval (seconds) for processing the indexing queue. The real time
|
|
|
309 |
monitor does not process each event when it comes in, but will wait this
|
|
|
310 |
time for the queue to accumulate to diminish overhead and in order to
|
|
|
311 |
aggregate multiple events to the same file. Default: 30 seconds.
|
|
|
312 |
.TP
|
|
|
313 |
.BI "monauxinterval = " "value in seconds"
|
|
|
314 |
Period (in seconds) at which the real time monitor will regenerate the
|
|
|
315 |
auxiliary databases (spelling, stemming) if needed. The default is one
|
|
|
316 |
hour.
|
|
|
317 |
.TP
|
|
|
318 |
.BI "monioniceclass, monioniceclassdata"
|
|
|
319 |
These allow defining the ionice class and data used by the indexer (default
|
|
|
320 |
class 3, no data).
|
|
|
321 |
.TP
|
|
|
322 |
.BI "filtermaxseconds = " "value in seconds"
|
|
|
323 |
Maximum filter execution time, after which it is aborted. Some postscript
|
|
|
324 |
programs just loop...
|
|
|
325 |
.TP
|
130 |
.BI "filtersdir = " directory
|
326 |
.BI "filtersdir = " directory
|
131 |
A directory to search for the external filter scripts used to index some
|
327 |
A directory to search for the external filter scripts used to index some
|
132 |
types of files. The value should not be changed, except if you want to
|
328 |
types of files. The value should not be changed, except if you want to
|
133 |
modify one of the default scripts. The value can be redefined for any
|
329 |
modify one of the default scripts. The value can be redefined for any
|
134 |
subdirectory.
|
330 |
subdirectory.
|
|
... |
|
... |
136 |
.BI "iconsdir = " directory
|
332 |
.BI "iconsdir = " directory
|
137 |
The name of the directory where
|
333 |
The name of the directory where
|
138 |
.B recoll
|
334 |
.B recoll
|
139 |
result list icons are stored. You can change this if you want different
|
335 |
result list icons are stored. You can change this if you want different
|
140 |
images.
|
336 |
images.
|
141 |
.TP
|
|
|
142 |
.BI "guesscharset = " boolean
|
|
|
143 |
Try to guess the character set of files if no internal value is available
|
|
|
144 |
(ie: for plain text files). This does not work well in general, and should
|
|
|
145 |
probably not be used.
|
|
|
146 |
.TP
|
|
|
147 |
.BI "usesystemfilecommand = " boolean
|
|
|
148 |
Decide if we use the
|
|
|
149 |
.B "file \-i"
|
|
|
150 |
system command as a final step for determining the mime type for a file
|
|
|
151 |
(the main procedure uses suffix associations as defined in the
|
|
|
152 |
.B mimemap
|
|
|
153 |
file). This can be useful for files with suffixless names, but it will
|
|
|
154 |
also cause the indexation of many bogus "text" files.
|
|
|
155 |
.TP
|
|
|
156 |
.BI "indexedmimetypes = " list
|
|
|
157 |
Recoll normally indexes any file which it knows how to read. This list lets
|
|
|
158 |
you restrict the indexed mime types to what you specify. If the variable is
|
|
|
159 |
unspecified or the list empty (the default), all supported types are
|
|
|
160 |
processed.
|
|
|
161 |
.TP
|
|
|
162 |
.BI "compressedfilemaxkbs = " value
|
|
|
163 |
Size limit for compressed (.gz or .bz2) files. These need to be
|
|
|
164 |
decompressed in a temporary directory for identification, which can be very
|
|
|
165 |
wasteful if 'uninteresting' big compressed files are present. Negative
|
|
|
166 |
means no limit, 0 means no processing of any compressed file. Defaults
|
|
|
167 |
to \-1.
|
|
|
168 |
.TP
|
|
|
169 |
.BI "indexallfilenames = " boolean
|
|
|
170 |
Recoll indexes file names into a special section of the database to allow
|
|
|
171 |
specific file names searches using wild cards. This parameter decides if
|
|
|
172 |
file name indexing is performed only for files with mime types that would
|
|
|
173 |
qualify them for full text indexation, or for all files inside
|
|
|
174 |
the selected subtrees, independent of mime type.
|
|
|
175 |
.TP
|
337 |
.TP
|
176 |
.BI "idxabsmlen = " value
|
338 |
.BI "idxabsmlen = " value
|
177 |
Recoll stores an abstract for each indexed file inside the database. The
|
339 |
Recoll stores an abstract for each indexed file inside the database. The
|
178 |
text can come from an actual 'abstract' section in the document or will
|
340 |
text can come from an actual 'abstract' section in the document or will
|
179 |
just be the beginning of the document. It is stored in the index so that it
|
341 |
just be the beginning of the document. It is stored in the index so that it
|
|
... |
|
... |
196 |
.BI "noaspell = " boolean
|
358 |
.BI "noaspell = " boolean
|
197 |
If this is set, the aspell dictionary generation is turned off. Useful for
|
359 |
If this is set, the aspell dictionary generation is turned off. Useful for
|
198 |
cases where you don't need the functionality or when it is unusable because
|
360 |
cases where you don't need the functionality or when it is unusable because
|
199 |
aspell crashes during dictionary generation.
|
361 |
aspell crashes during dictionary generation.
|
200 |
.TP
|
362 |
.TP
|
201 |
.BI "nocjk = " boolean
|
363 |
.BI "mhmboxquirks = " flags
|
202 |
If this set to true, specific east asian (Chinese Korean Japanese)
|
364 |
This allows defining location-related quirks for the mailbox
|
203 |
characters/word splitting is turned off. This will save a small amount of
|
365 |
handler. Currently only the tbird flag is defined, and it should be set for
|
204 |
cpu if you have no CJK documents. If your document base does include such
|
366 |
directories which hold Thunderbird data, as their folder format is weird.
|
205 |
text but you are not interested in searching it, setting
|
367 |
|
206 |
.I nocjk
|
|
|
207 |
may be a significant time and space saver.
|
|
|
208 |
.TP
|
|
|
209 |
.BI "cjkngramlen = " value
|
|
|
210 |
This lets you adjust the size of n-grams used for indexing CJK text. The
|
|
|
211 |
default value of 2 is probably appropriate in most cases. A value of 3
|
|
|
212 |
would allow more precision and efficiency on longer words, but the index
|
|
|
213 |
will be approximately twice as large.
|
|
|
214 |
.SH SEE ALSO
|
368 |
.SH SEE ALSO
|
215 |
.PP
|
369 |
.PP
|
216 |
recollindex(1) recoll(1)
|
370 |
recollindex(1) recoll(1)
|