|
a/src/README |
|
b/src/README |
|
... |
|
... |
31 |
|
31 |
|
32 |
2.1. Introduction
|
32 |
2.1. Introduction
|
33 |
|
33 |
|
34 |
2.2. Index storage
|
34 |
2.2. Index storage
|
35 |
|
35 |
|
|
|
36 |
2.2.1. Index formats
|
|
|
37 |
|
36 |
2.2.1. Security aspects
|
38 |
2.2.2. Security aspects
|
37 |
|
39 |
|
38 |
2.3. The indexing configuration
|
40 |
2.3. The indexing configuration
|
39 |
|
41 |
|
40 |
2.4. Periodic indexing
|
42 |
2.4. Periodic indexing
|
41 |
|
43 |
|
|
... |
|
... |
306 |
The index data directory (xapiandb) only contains data that can be
|
308 |
The index data directory (xapiandb) only contains data that can be
|
307 |
completely rebuilt by an index run, and it can always be destroyed safely.
|
309 |
completely rebuilt by an index run, and it can always be destroyed safely.
|
308 |
|
310 |
|
309 |
----------------------------------------------------------------------
|
311 |
----------------------------------------------------------------------
|
310 |
|
312 |
|
|
|
313 |
2.2.1. Index formats
|
|
|
314 |
|
|
|
315 |
Xapian has had two possible index formats for quite some time. The "old"
|
|
|
316 |
one named Quartz, and the new one named Flint. Xapian 0.9 used Quartz by
|
|
|
317 |
default, but could use Flint if a specific environment variable
|
|
|
318 |
(XAPIAN_PREFER_FLINT) was set. Xapian 1.0 still supports Quartz but will
|
|
|
319 |
use Flint by default for new index creations.
|
|
|
320 |
|
|
|
321 |
The number of disk accesses performed during indexing has been much
|
|
|
322 |
optimized in the new Flint engine and you may see indexing times improved
|
|
|
323 |
by 50% in some cases (compared to Quartz), typically for big indexes where
|
|
|
324 |
disk accesses dominate the indexing time. There is also a more modest
|
|
|
325 |
improvement of index size.
|
|
|
326 |
|
|
|
327 |
Xapian will not automatically convert an existing index from the Quartz to
|
|
|
328 |
the Flint format. If you have an older index and want to take advantage of
|
|
|
329 |
the new format (which can be done without setting the environment variable
|
|
|
330 |
as of Recoll 1.8.2 and Xapian 1.0.0), you will have to explicitly delete
|
|
|
331 |
the old index, then run a normal indexing process.
|
|
|
332 |
|
|
|
333 |
Unfortunately, using the -z option to recollindex is not sufficient to
|
|
|
334 |
change the format; you have to delete all files inside the index directory
|
|
|
335 |
(typically ~/.recoll/xapiandb) before starting indexing.
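
The manual step described above can be sketched as a small shell helper.
This is only an illustration, not part of Recoll: the function name
clear_index_dir is hypothetical, the path in the usage comment is the
typical default mentioned above, and the find options assume a GNU or BSD
find. Check the path carefully before running anything like this, since it
deletes everything inside the directory it is given.

```shell
# Hypothetical helper (not shipped with Recoll): empty an index directory
# so that the next indexing run has to create a fresh index.
clear_index_dir() {
    dir="$1"
    # Refuse to do anything if the argument is not an existing directory.
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Delete everything inside the directory, but keep the directory itself.
    find "$dir" -mindepth 1 -delete
}

# Possible use, assuming the default index location:
#   clear_index_dir "$HOME/.recoll/xapiandb" && recollindex
```

Keeping the directory itself and only removing its contents matches the
instruction above to delete all files inside the index directory; the
index is then rebuilt by the normal recollindex run.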
|
|
|
336 |
|
|
|
337 |
----------------------------------------------------------------------
|
|
|
338 |
|
311 |
2.2.1. Security aspects
|
339 |
2.2.2. Security aspects
|
312 |
|
340 |
|
313 |
The Recoll index does not hold copies of the indexed documents. But it
|
341 |
The Recoll index does not hold copies of the indexed documents. But it
|
314 |
does hold enough data to allow for an almost complete reconstruction. If
|
342 |
does hold enough data to allow for an almost complete reconstruction. If
|
315 |
confidential data is indexed, access to the database directory should be
|
343 |
confidential data is indexed, access to the database directory should be
|
316 |
restricted.
|
344 |
restricted.
|
|
... |
|
... |
745 |
Spelling/Phonetic
|
773 |
Spelling/Phonetic
|
746 |
|
774 |
|
747 |
In this mode, you enter the term as you think it is spelled, and
|
775 |
In this mode, you enter the term as you think it is spelled, and
|
748 |
Recoll will do its best to find index terms that sound like your
|
776 |
Recoll will do its best to find index terms that sound like your
|
749 |
entry. This mode uses the Aspell spelling application, which must
|
777 |
entry. This mode uses the Aspell spelling application, which must
|
750 |
be installed on your system for things to work. The language which
|
778 |
be installed on your system for things to work (if your documents
|
|
|
779 |
contain non-ascii characters, Recoll needs an aspell version newer
|
|
|
780 |
than 0.60 for UTF-8 support). The language which is used to build
|
751 |
is used to build the dictionary out of the index terms (which is
|
781 |
the dictionary out of the index terms (which is done at the end of
|
752 |
done at the end of an indexing pass) is the one defined by your
|
782 |
an indexing pass) is the one defined by your NLS environment.
|
753 |
NLS environment. Weird things will probably happen if languages
|
783 |
Weird things will probably happen if languages are mixed up.
|
754 |
are mixed up.
|
|
|
755 |
|
784 |
|
756 |
Note that in cases where Recoll does not know the beginning of the string
|
785 |
Note that in cases where Recoll does not know the beginning of the string
|
757 |
to search for (i.e. a wildcard expression like *coll), the expansion can
|
786 |
to search for (i.e. a wildcard expression like *coll), the expansion can
|
758 |
take quite a long time because the full index term list will have to be
|
787 |
take quite a long time because the full index term list will have to be
|
759 |
processed. The expansion is currently limited at 200 results for wildcards
|
788 |
processed. The expansion is currently limited at 200 results for wildcards
|
|
... |
|
... |
1251 |
of file, is encountered. Some of the parameters used for indexing are
|
1280 |
of file, is encountered. Some of the parameters used for indexing are
|
1252 |
looked up hierarchically from the current directory location upwards. Not
|
1281 |
looked up hierarchically from the current directory location upwards. Not
|
1253 |
all parameters can be meaningfully redefined, this is specified for each
|
1282 |
all parameters can be meaningfully redefined, this is specified for each
|
1254 |
in the next section.
|
1283 |
in the next section.
|
1255 |
|
1284 |
|
1256 |
The tilde character (~) is expanded in file names to the name of the
|
1285 |
When found at the beginning of a file path, the tilde character (~) is
|
1257 |
user's home directory.
|
1286 |
expanded to the name of the user's home directory, as a shell would do.
|
1258 |
|
1287 |
|
1259 |
White space is used for separation inside lists. List elements with
|
1288 |
White space is used for separation inside lists. List elements with
|
1260 |
embedded spaces can be quoted using double-quotes.
|
1289 |
embedded spaces can be quoted using double-quotes.
|
1261 |
|
1290 |
|
1262 |
----------------------------------------------------------------------
|
1291 |
----------------------------------------------------------------------
|
|
... |
|
... |
1398 |
|
1427 |
|
1399 |
iconsdir
|
1428 |
iconsdir
|
1400 |
|
1429 |
|
1401 |
The name of the directory where recoll result list icons are
|
1430 |
The name of the directory where recoll result list icons are
|
1402 |
stored. You can change this if you want different images.
|
1431 |
stored. You can change this if you want different images.
|
|
|
1432 |
|
|
|
1433 |
aspellLanguage
|
|
|
1434 |
|
|
|
1435 |
Language definitions to use when creating the aspell dictionary.
|
|
|
1436 |
The value must match a set of aspell language definition files.
|
|
|
1437 |
You can type "aspell config" to see where these are installed
|
|
|
1438 |
(look for data-dir). The default if the variable is not set is to
|
|
|
1439 |
use your desktop national language environment to guess the value.
|
|
|
1440 |
|
|
|
1441 |
noaspell
|
|
|
1442 |
|
|
|
1443 |
If this is set, the aspell dictionary generation is turned off.
|
|
|
1444 |
Useful for cases where you don't need the functionality or when it
|
|
|
1445 |
is unusable because aspell crashes during dictionary generation.
|
1403 |
|
1446 |
|
1404 |
----------------------------------------------------------------------
|
1447 |
----------------------------------------------------------------------
|
1405 |
|
1448 |
|
1406 |
4.4.2. The mimemap file
|
1449 |
4.4.2. The mimemap file
|
1407 |
|
1450 |
|