|
a/src/README |
|
b/src/README |
|
... |
|
... |
172 |
|
172 |
|
173 |
5.3.3. Installation
|
173 |
5.3.3. Installation
|
174 |
|
174 |
|
175 |
5.4. Configuration overview
|
175 |
5.4. Configuration overview
|
176 |
|
176 |
|
177 |
5.4.1. Main configuration file
|
177 |
5.4.1. The main configuration file, recoll.conf
|
178 |
|
178 |
|
179 |
5.4.2. The fields file
|
179 |
5.4.2. The fields file
|
180 |
|
180 |
|
181 |
5.4.3. The mimemap file
|
181 |
5.4.3. The mimemap file
|
182 |
|
182 |
|
|
... |
|
... |
414 |
read. This is sometimes not desirable, and there are ways to either
|
414 |
read. This is sometimes not desirable, and there are ways to either
|
415 |
exclude some types, or on the contrary to define a positive list of types
|
415 |
exclude some types, or on the contrary to define a positive list of types
|
416 |
to be indexed. In the latter case, any type not in the list will be
|
416 |
to be indexed. In the latter case, any type not in the list will be
|
417 |
ignored.
|
417 |
ignored.
|
418 |
|
418 |
|
419 |
Excluding types can be done by adding name patterns to the skippedNames
|
419 |
Excluding types can be done by adding wildcard name patterns to the
|
420 |
list, which can be done from the GUI Index configuration menu. It is also
|
420 |
skippedNames list, which can be done from the GUI Index configuration
|
421 |
possible to exclude a mime type independently of the file name by
|
421 |
menu. It is also possible to exclude a mime type independently of the file
|
422 |
associating it with the rclnull filter. This can be done by editing the
|
422 |
name by associating it with the rclnull filter. This can be done by
|
423 |
mimeconf configuration file.
|
423 |
editing the mimeconf configuration file.
|
424 |
|
424 |
|
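As an illustration of how wildcard name patterns behave, the following Python sketch checks file names against a few entries taken from the default skippedNames list. Python's fnmatch module is used here only as a stand-in for the matching done by the indexer, and the is_skipped helper is a hypothetical name, not part of Recoll.

```python
import fnmatch

def is_skipped(name, patterns):
    """Return True if a file or directory name matches one of the patterns."""
    return any(fnmatch.fnmatchcase(name, pat) for pat in patterns)

# A few entries copied from the default skippedNames value
patterns = "#* bin CVS Cache cache* caughtspam tmp .thumbnails .svn *~".split()
```

Note that the patterns apply to the simple file or directory name, not to the full path.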
425 |
In order to define a positive list, you need to edit the main
|
425 |
In order to define a positive list, you need to edit the main
|
426 |
configuration file (recoll.conf) and set the indexedmimetypes
|
426 |
configuration file (recoll.conf) and set the indexedmimetypes
|
427 |
configuration variable. Example:
|
427 |
configuration variable. Example:
|
428 |
|
428 |
|
|
... |
|
... |
625 |
As a cost for added capability, a raw index will be slightly bigger than a
|
625 |
As a cost for added capability, a raw index will be slightly bigger than a
|
626 |
stripped one (around 10%). Also, searches will be more complex, so
|
626 |
stripped one (around 10%). Also, searches will be more complex, so
|
627 |
probably slightly slower, and the feature is still young, so that a
|
627 |
probably slightly slower, and the feature is still young, so that a
|
628 |
certain amount of weirdness cannot be excluded.
|
628 |
certain amount of weirdness cannot be excluded.
|
629 |
|
629 |
|
|
|
630 |
One of the most adverse consequences of using a raw index is that some
|
|
|
631 |
phrase and proximity searches may become impossible: because each term
|
|
|
632 |
needs to be expanded, and all combinations searched for, the
|
|
|
633 |
multiplicative expansion may become unmanageable.
|
|
|
634 |
|
630 |
2.3.3. The index configuration GUI
|
635 |
2.3.3. The index configuration GUI
|
631 |
|
636 |
|
632 |
Most parameters for a given index configuration can be set from a recoll
|
637 |
Most parameters for a given index configuration can be set from a recoll
|
633 |
GUI running on this configuration (either as default, or by setting
|
638 |
GUI running on this configuration (either as default, or by setting
|
634 |
RECOLL_CONFDIR or the -c option.)
|
639 |
RECOLL_CONFDIR or the -c option.)
|
|
... |
|
... |
857 |
indexing can generate a significant load on the system when files such as
|
862 |
indexing can generate a significant load on the system when files such as
|
858 |
email folders change. Also, monitoring large file trees by itself
|
863 |
email folders change. Also, monitoring large file trees by itself
|
859 |
significantly taxes system resources. You probably do not want to enable
|
864 |
significantly taxes system resources. You probably do not want to enable
|
860 |
it if your system is short on resources. Periodic indexing is adequate in
|
865 |
it if your system is short on resources. Periodic indexing is adequate in
|
861 |
most cases.
|
866 |
most cases.
|
|
|
867 |
|
|
|
868 |
Increasing resources for inotify
|
|
|
869 |
|
|
|
870 |
On Linux systems, monitoring a big tree may imply increasing the resources
|
|
|
871 |
available to inotify, which are normally defined in /etc/sysctl.conf.
|
|
|
872 |
|
|
|
873 |
### inotify
|
|
|
874 |
#
|
|
|
875 |
# cat /proc/sys/fs/inotify/max_queued_events - 16384
|
|
|
876 |
# cat /proc/sys/fs/inotify/max_user_instances - 128
|
|
|
877 |
# cat /proc/sys/fs/inotify/max_user_watches - 16384
|
|
|
878 |
#
|
|
|
879 |
# -- Change to:
|
|
|
880 |
#
|
|
|
881 |
fs.inotify.max_queued_events=32768
|
|
|
882 |
fs.inotify.max_user_instances=256
|
|
|
883 |
fs.inotify.max_user_watches=32768
|
|
|
884 |
|
862 |
|
885 |
|
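Before changing /etc/sysctl.conf, it can be useful to check the values currently in effect. The following Python sketch reads them from /proc/sys/fs/inotify and formats them as sysctl.conf lines. The read_inotify_limits and sysctl_lines helpers are illustrative names, not part of Recoll.

```python
import os

INOTIFY_KEYS = ("max_queued_events", "max_user_instances", "max_user_watches")

def read_inotify_limits(base_dir="/proc/sys/fs/inotify"):
    """Return a dict of the inotify limits found under base_dir.

    Keys whose file is missing or unreadable (for example on a
    non-Linux system) are silently skipped.
    """
    limits = {}
    for key in INOTIFY_KEYS:
        try:
            with open(os.path.join(base_dir, key)) as f:
                limits[key] = int(f.read().strip())
        except (OSError, ValueError):
            continue
    return limits

def sysctl_lines(limits):
    """Format a dict of limits as lines suitable for /etc/sysctl.conf."""
    return ["fs.inotify.%s=%d" % (k, v) for k, v in sorted(limits.items())]
```

Calling sysctl_lines(read_inotify_limits()) yields the current settings in the same form as the lines shown above.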
863 |
2.8.1. Slowing down the reindexing rate for fast changing files
|
886 |
2.8.1. Slowing down the reindexing rate for fast changing files
|
864 |
|
887 |
|
865 |
When using the real time monitor, it may happen that some files need to be
|
888 |
When using the real time monitor, it may happen that some files need to be
|
866 |
indexed, but change so often that they impose an excessive load for the
|
889 |
indexed, but change so often that they impose an excessive load for the
|
|
... |
|
... |
2700 |
4.3.2. Python interface
|
2723 |
4.3.2. Python interface
|
2701 |
|
2724 |
|
2702 |
4.3.2.1. Introduction
|
2725 |
4.3.2.1. Introduction
|
2703 |
|
2726 |
|
2704 |
Recoll versions after 1.11 define a Python programming interface, both for
|
2727 |
Recoll versions after 1.11 define a Python programming interface, both for
|
2705 |
searching and indexing.
|
2728 |
searching and indexing. The indexing portion has seen little use, but the
|
|
|
2729 |
searching one is used in the Recoll Ubuntu Unity Lens and Recoll Web UI.
|
2706 |
|
2730 |
|
2707 |
The API is inspired by the Python database API specification, version 1.0
|
2731 |
The API is inspired by the Python database API specification. There were
|
|
|
2732 |
two major changes in recent Recoll versions:
|
|
|
2733 |
|
|
|
2734 |
o The basis for the Recoll API changed from Python database API version
|
2708 |
for Recoll versions up to 1.18, version 2.0 for Recoll versions 1.19 and
|
2735 |
1.0 (Recoll versions up to 1.18.1), to version 2.0 (Recoll 1.18.2 and
|
2709 |
later. The package structure changed with Recoll 1.19 too. We will mostly
|
2736 |
later).
|
2710 |
describe the new API and package structure here. A paragraph at the end of
|
2737 |
o The recoll module became a package (with an internal recoll module) as
|
2711 |
this section will explain a few differences and ways to write code
|
2738 |
of Recoll version 1.19, in order to add more functions. For existing
|
|
|
2739 |
code, this only changes the way the interface must be imported.
|
|
|
2740 |
|
|
|
2741 |
We will mostly describe the new API and package structure here. A
|
|
|
2742 |
paragraph at the end of this section will explain a few differences and
|
2712 |
compatible with both versions.
|
2743 |
ways to write code compatible with both versions.
|
2713 |
|
2744 |
|
2714 |
The Python interface can be found in the source package, under
|
2745 |
The Python interface can be found in the source package, under
|
2715 |
python/recoll.
|
2746 |
python/recoll.
|
2716 |
|
2747 |
|
2717 |
The python/recoll/ directory contains the usual setup.py. After
|
2748 |
The python/recoll/ directory contains the usual setup.py. After
|
|
... |
|
... |
2720 |
|
2751 |
|
2721 |
cd recoll-xxx/python/recoll
|
2752 |
cd recoll-xxx/python/recoll
|
2722 |
python setup.py build
|
2753 |
python setup.py build
|
2723 |
python setup.py install
|
2754 |
python setup.py install
|
2724 |
|
2755 |
|
|
|
2756 |
|
|
|
2757 |
The normal Recoll installer installs the Python API along with the main
|
|
|
2758 |
code.
|
|
|
2759 |
|
|
|
2760 |
When installing from a repository, and depending on the distribution, the
|
|
|
2761 |
Python API can sometimes be found in a separate package.
|
2725 |
|
2762 |
|
2726 |
4.3.2.2. Recoll package
|
2763 |
4.3.2.2. Recoll package
|
2727 |
|
2764 |
|
2728 |
The recoll package contains two modules:
|
2765 |
The recoll package contains two modules:
|
2729 |
|
2766 |
|
|
... |
|
... |
2764 |
|
2801 |
|
2765 |
Db.query(), Db.cursor()
|
2802 |
Db.query(), Db.cursor()
|
2766 |
These aliases return a blank Query object for this index.
|
2803 |
These aliases return a blank Query object for this index.
|
2767 |
|
2804 |
|
2768 |
Db.setAbstractParams(maxchars, contextwords)
|
2805 |
Db.setAbstractParams(maxchars, contextwords)
|
2769 |
Set the parameters used to build snippets.
|
2806 |
Set the parameters used to build snippets (sets of keywords in
|
|
|
2807 |
context text fragments). maxchars defines the maximum total size
|
|
|
2808 |
of the abstract. contextwords defines how many terms are shown
|
|
|
2809 |
around the keyword.
|
|
|
2810 |
|
|
|
2811 |
Db.termMatch(match_type, expr, field='', maxlen=-1, casesens=False,
|
|
|
2812 |
diacsens=False, lang='english')
|
|
|
2813 |
Expand an expression against the index term list. Performs the
|
|
|
2814 |
basic function from the GUI term explorer tool. match_type can be
|
|
|
2815 |
either of wildcard, regexp or stem. Returns a list of terms
|
|
|
2816 |
expanded from the input expression.
|
2770 |
|
2817 |
|
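A possible use of Db.termMatch() is sketched below: it lists the index terms beginning with a given prefix. Passing the match type as the string "wildcard" is an assumption based on the names listed above. FakeDb is a stand-in defined only so that the sketch runs without an index; with a real installation the db argument would be a Db object from the recoll module.

```python
import fnmatch

def expand_prefix(db, prefix, maxlen=10):
    """Return the index terms beginning with prefix, via a wildcard expansion."""
    return list(db.termMatch("wildcard", prefix + "*", maxlen=maxlen))

class FakeDb(object):
    """Stand-in for a Db object (hypothetical, for illustration only)."""

    def __init__(self, terms):
        self.terms = sorted(terms)

    def termMatch(self, match_type, expr, field="", maxlen=-1,
                  casesens=False, diacsens=False, lang="english"):
        # Only the wildcard case is imitated here
        found = [t for t in self.terms if fnmatch.fnmatchcase(t, expr)]
        return found if maxlen < 0 else found[:maxlen]
```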
2771 |
The Query class
|
2818 |
The Query class
|
2772 |
|
2819 |
|
2773 |
A Query object (equivalent to a cursor in the Python DB API) is created by
|
2820 |
A Query object (equivalent to a cursor in the Python DB API) is created by
|
2774 |
a Db.query() call. It is used to execute index searches.
|
2821 |
a Db.query() call. It is used to execute index searches.
|
|
... |
|
... |
2792 |
|
2839 |
|
2793 |
Query.fetchone()
|
2840 |
Query.fetchone()
|
2794 |
Fetches the next Doc object from the current search results.
|
2841 |
Fetches the next Doc object from the current search results.
|
2795 |
|
2842 |
|
2796 |
Query.close()
|
2843 |
Query.close()
|
2797 |
Closes the connection. The object is unusable after the call.
|
2844 |
Closes the query. The object is unusable after the call.
|
2798 |
|
2845 |
|
2799 |
Query.scroll(value, mode='relative')
|
2846 |
Query.scroll(value, mode='relative')
|
2800 |
Adjusts the position in the current result set. mode can be
|
2847 |
Adjusts the position in the current result set. mode can be
|
2801 |
relative or absolute.
|
2848 |
relative or absolute.
|
2802 |
|
2849 |
|
2803 |
Query.getgroups()
|
2850 |
Query.getgroups()
|
2804 |
Retrieves the expanded query terms as a list of pairs. Meaningful
|
2851 |
Retrieves the expanded query terms as a list of pairs. Meaningful
|
2805 |
only after executexx. In each pair, the first entry is a list of
|
2852 |
only after executexx. In each pair, the first entry is a list of
|
|
|
2853 |
user terms (of size one for simple terms, or more for group and
|
2806 |
user terms, the second a list of query terms as derived from the
|
2854 |
phrase clauses), the second a list of query terms as derived from
|
2807 |
user terms and used in the Xapian Query. The size of each list is
|
2855 |
the user terms and used in the Xapian Query.
|
2808 |
one for simple terms, or more for group and phrase clauses.
|
|
|
2809 |
|
2856 |
|
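The list of pairs returned by Query.getgroups() can be turned into a lookup table from user terms to expanded query terms. The sketch below assumes only the structure described above (a list of pairs of lists). The function name is illustrative and the sample data in the usage note is made up.

```python
def expansions_by_user_term(groups):
    """Map each tuple of user terms to the sorted query terms derived from it."""
    result = {}
    for user_terms, query_terms in groups:
        # The same user terms may appear several times: merge their expansions
        result.setdefault(tuple(user_terms), set()).update(query_terms)
    return {key: sorted(terms) for key, terms in result.items()}
```

For example, a simple term entered as dog and expanded to dog and dogs would appear under the key ("dog",), while a phrase clause would use a key with several entries.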
2810 |
Query.getxquery()
|
2857 |
Query.getxquery()
|
2811 |
Return the Xapian query description as a Unicode string.
|
2858 |
Return the Xapian query description as a Unicode string.
|
2812 |
Meaningful only after executexx.
|
2859 |
Meaningful only after executexx.
|
2813 |
|
2860 |
|
|
... |
|
... |
2835 |
Query.rowcount
|
2882 |
Query.rowcount
|
2836 |
Number of records returned by the last execute.
|
2883 |
Number of records returned by the last execute.
|
2837 |
|
2884 |
|
2838 |
Query.rownumber
|
2885 |
Query.rownumber
|
2839 |
Next index to be fetched from results. Normally increments after
|
2886 |
Next index to be fetched from results. Normally increments after
|
2840 |
each fetchone() call, but can be set/reset before the call effect
|
2887 |
each fetchone() call, but can be set/reset before the call to
|
2841 |
seeking. Starts at 0.
|
2888 |
effect seeking (equivalent to using scroll()). Starts at 0.
|
2842 |
|
2889 |
|
2843 |
The Doc class
|
2890 |
The Doc class
|
2844 |
|
2891 |
|
2845 |
A Doc object contains index data for a given document. The data is
|
2892 |
A Doc object contains index data for a given document. The data is
|
2846 |
extracted from the index when searching, or set by the indexer program
|
2893 |
extracted from the index when searching, or set by the indexer program
|
|
... |
|
... |
2885 |
addclause(type='and'|'or'|'excl'|'phrase'|'near'|'sub', qstring=string,
|
2932 |
addclause(type='and'|'or'|'excl'|'phrase'|'near'|'sub', qstring=string,
|
2886 |
slack=0, field='', stemming=1, subSearch=SearchData)
|
2933 |
slack=0, field='', stemming=1, subSearch=SearchData)
|
2887 |
|
2934 |
|
2888 |
4.3.2.4. The rclextract module
|
2935 |
4.3.2.4. The rclextract module
|
2889 |
|
2936 |
|
2890 |
Document content is not provided by an index query. To access it, the data
|
2937 |
Index queries do not provide document content (only a partial and
|
2891 |
extraction part of the indexing process must be performed (subdocument
|
2938 |
imprecise reconstruction is performed to show the snippets text). In order
|
2892 |
access and format translation). This is not trivial in general. The
|
2939 |
to access the actual document data, the data extraction part of the
|
|
|
2940 |
indexing process must be performed (subdocument access and format
|
|
|
2941 |
translation). This is not trivial in general. The rclextract module
|
2893 |
rclextract module currently provides a single class which can be used to
|
2942 |
currently provides a single class which can be used to access the data
|
2894 |
access the data content for result documents.
|
2943 |
content for result documents.
|
2895 |
|
2944 |
|
2896 |
Classes
|
2945 |
Classes
|
2897 |
|
2946 |
|
2898 |
The Extractor class
|
2947 |
The Extractor class
|
2899 |
|
2948 |
|
|
... |
|
... |
2903 |
An Extractor object is built from a Doc object, output from a
|
2952 |
An Extractor object is built from a Doc object, output from a
|
2904 |
query.
|
2953 |
query.
|
2905 |
|
2954 |
|
2906 |
Extractor.textextract(ipath)
|
2955 |
Extractor.textextract(ipath)
|
2907 |
Extract document defined by ipath and return a Doc object. The
|
2956 |
Extract document defined by ipath and return a Doc object. The
|
2908 |
doc.text field has the document text as either text/plain or
|
2957 |
doc.text field has the document text converted to either
|
2909 |
text/html according to doc.mimetype.
|
2958 |
text/plain or text/html according to doc.mimetype. The typical use
|
|
|
2959 |
would be as follows:
|
2910 |
|
2960 |
|
2911 |
Extractor.idoctofile()
|
2961 |
qdoc = query.fetchone()
|
|
|
2962 |
extractor = recoll.Extractor(qdoc)
|
|
|
2963 |
doc = extractor.textextract(qdoc.ipath)
|
|
|
2964 |
# use doc.text, e.g. for previewing
|
|
|
2965 |
|
|
|
2966 |
Extractor.idoctofile(ipath, targetmtype, outfile='')
|
2912 |
Extracts document into an output file, which can be given
|
2967 |
Extracts document into an output file, which can be given
|
2913 |
explicitly or will be created as a temporary file to be deleted by
|
2968 |
explicitly or will be created as a temporary file to be deleted by
|
2914 |
the caller.
|
2969 |
the caller. Typical use:
|
|
|
2970 |
|
|
|
2971 |
qdoc = query.fetchone()
|
|
|
2972 |
extractor = recoll.Extractor(qdoc)
|
|
|
2973 |
filename = extractor.idoctofile(qdoc.ipath, qdoc.mimetype)
|
2915 |
|
2974 |
|
2916 |
4.3.2.5. Example code
|
2975 |
4.3.2.5. Example code
|
2917 |
|
2976 |
|
2918 |
The following sample would query the index with a user language string.
|
2977 |
The following sample would query the index with a user language string.
|
2919 |
See the python/samples directory inside the Recoll source for other
|
2978 |
See the python/samples directory inside the Recoll source for other
|
|
... |
|
... |
3222 |
one of the system-specific files in the mk directory to mk/sysconf. If
|
3281 |
one of the system-specific files in the mk directory to mk/sysconf. If
|
3223 |
your system is not known yet, it will tell you as much, and you may want
|
3282 |
your system is not known yet, it will tell you as much, and you may want
|
3224 |
to manually copy and modify one of the existing files (the new file name
|
3283 |
to manually copy and modify one of the existing files (the new file name
|
3225 |
should be the output of uname -s).
|
3284 |
should be the output of uname -s).
|
3226 |
|
3285 |
|
|
|
3286 |
5.3.2.1. Building on Solaris
|
|
|
3287 |
|
|
|
3288 |
We did not test building the GUI on Solaris for recent versions. You will
|
|
|
3289 |
need at least Qt 4.4. There are some hints on an old web site page; they
|
|
|
3290 |
may still be valid.
|
|
|
3291 |
|
|
|
3292 |
Someone did test the 1.19 indexer and Python module build; they do work,
|
|
|
3293 |
with a few minor glitches. Be sure to use GNU make and install.
|
|
|
3294 |
|
3227 |
5.3.3. Installation
|
3295 |
5.3.3. Installation
|
3228 |
|
3296 |
|
3229 |
Either type make install or execute recollinstall prefix, in the root of
|
3297 |
Either type make install or execute recollinstall prefix, in the root of
|
3230 |
the source tree. This will copy the commands to prefix/bin and the sample
|
3298 |
the source tree. This will copy the commands to prefix/bin and the sample
|
3231 |
configuration files, scripts and other shared data to prefix/share/recoll.
|
3299 |
configuration files, scripts and other shared data to prefix/share/recoll.
|
|
... |
|
... |
3257 |
|
3325 |
|
3258 |
The most accurate documentation for the configuration parameters is given
|
3326 |
The most accurate documentation for the configuration parameters is given
|
3259 |
by comments inside the default files, and we will just give a general
|
3327 |
by comments inside the default files, and we will just give a general
|
3260 |
overview here.
|
3328 |
overview here.
|
3261 |
|
3329 |
|
3262 |
For each index, there are two sets of configuration files. System-wide
|
3330 |
By default, for each index, there are two sets of configuration files.
|
3263 |
configuration files are kept in a directory named like
|
3331 |
System-wide configuration files are kept in a directory named like
|
3264 |
/usr/[local/]share/recoll/examples, and define default values, shared by
|
3332 |
/usr/[local/]share/recoll/examples, and define default values, shared by
|
3265 |
all indexes. For each index, a parallel set of files defines the
|
3333 |
all indexes. For each index, a parallel set of files defines the
|
3266 |
customized parameters.
|
3334 |
customized parameters.
|
|
|
3335 |
|
|
|
3336 |
In addition (as of Recoll version 1.19.7), it is possible to specify two
|
|
|
3337 |
additional configuration directories which will be stacked before and
|
|
|
3338 |
after the user configuration directory. These are defined by the
|
|
|
3339 |
RECOLL_CONFTOP and RECOLL_CONFMID environment variables. Values from
|
|
|
3340 |
configuration files inside the top directory will override user ones,
|
|
|
3341 |
values from configuration files inside the middle directory will override
|
|
|
3342 |
system ones and be overridden by user ones. These two variables may be of
|
|
|
3343 |
use to applications which augment Recoll functionality, and need to add
|
|
|
3344 |
configuration data without disturbing the user's files. Please note that
|
|
|
3345 |
the two, currently single, values will probably be interpreted as
|
|
|
3346 |
colon-separated lists in the future: do not use colon characters inside
|
|
|
3347 |
the directory paths.
|
3267 |
|
3348 |
|
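The resulting precedence (top, then user, then middle, then system) can be modelled as a stack of dictionaries. The following Python sketch only illustrates the lookup order described above. It does not read actual configuration files, and the effective_config name and the sample values in the usage note are hypothetical.

```python
from collections import ChainMap

def effective_config(system, user, top=None, mid=None):
    """Stack configuration layers, highest priority first.

    Order: RECOLL_CONFTOP directory, user directory,
    RECOLL_CONFMID directory, system-wide defaults.
    """
    return ChainMap(top or {}, user, mid or {}, system)
```

With system = {"followLinks": "0"}, mid = {"followLinks": "1"} and an empty user dictionary, the lookup returns "1". A value set in the user dictionary would hide the middle one, and a value set in the top dictionary would hide both.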
3268 |
The default location of the configuration is the .recoll directory in your
|
3349 |
The default location of the configuration is the .recoll directory in your
|
3269 |
home. Most people will only use this directory.
|
3350 |
home. Most people will only use this directory.
|
3270 |
|
3351 |
|
3271 |
This location can be changed, or others can be added with the
|
3352 |
This location can be changed, or others can be added with the
|
|
... |
|
... |
3326 |
handle multiple encodings in a single file. In this relatively
|
3407 |
handle multiple encodings in a single file. In this relatively
|
3327 |
unlikely case, you can edit the configuration file as two separate
|
3408 |
unlikely case, you can edit the configuration file as two separate
|
3328 |
text files with appropriate encodings, and concatenate them to create
|
3409 |
text files with appropriate encodings, and concatenate them to create
|
3329 |
the complete configuration.
|
3410 |
the complete configuration.
|
3330 |
|
3411 |
|
3331 |
5.4.1. Main configuration file
|
3412 |
5.4.1. The main configuration file, recoll.conf
|
3332 |
|
3413 |
|
3333 |
recoll.conf is the main configuration file. It defines things like what to
|
3414 |
recoll.conf is the main configuration file. It defines things like what to
|
3334 |
index (top directories and things to ignore), and the default character
|
3415 |
index (top directories and things to ignore), and the default character
|
3335 |
set to use for document types which do not specify it internally.
|
3416 |
set to use for document types which do not specify it internally.
|
3336 |
|
3417 |
|
|
... |
|
... |
3352 |
list. See the followLinks option about following symbolic links
|
3433 |
list. See the followLinks option about following symbolic links
|
3353 |
found under the top elements (not followed by default).
|
3434 |
found under the top elements (not followed by default).
|
3354 |
|
3435 |
|
3355 |
skippedNames
|
3436 |
skippedNames
|
3356 |
|
3437 |
|
3357 |
A space-separated list of patterns for names of files or
|
3438 |
A space-separated list of wildcard patterns for names of files or
|
3358 |
directories that should be completely ignored. The list defined in
|
3439 |
directories that should be completely ignored. The list defined in
|
3359 |
the default file is:
|
3440 |
the default file is:
|
3360 |
|
3441 |
|
3361 |
skippedNames = #* bin CVS Cache cache* caughtspam tmp .thumbnails .svn \
|
3442 |
skippedNames = #* bin CVS Cache cache* caughtspam tmp .thumbnails .svn \
|
3362 |
*~ .beagle .git .hg .bzr loop.ps .xsession-errors \
|
3443 |
*~ .beagle .git .hg .bzr loop.ps .xsession-errors \
|
|
... |
|
... |
3402 |
The values in the *skippedPaths variables are matched by default
|
3483 |
The values in the *skippedPaths variables are matched by default
|
3403 |
with fnmatch(3), with the FNM_PATHNAME and FNM_LEADING_DIR flags.
|
3484 |
with fnmatch(3), with the FNM_PATHNAME and FNM_LEADING_DIR flags.
|
3404 |
This means that '/' characters must be matched explicitly. You
|
3485 |
This means that '/' characters must be matched explicitly. You
|
3405 |
can set skippedPathsFnmPathname to 0 to disable the use of
|
3486 |
can set skippedPathsFnmPathname to 0 to disable the use of
|
3406 |
FNM_PATHNAME (meaning that /*/dir3 will match /dir1/dir2/dir3).
|
3487 |
FNM_PATHNAME (meaning that /*/dir3 will match /dir1/dir2/dir3).
|
|
|
3488 |
|
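The effect of the two flags can be approximated in Python, whose fnmatch module has no equivalent options. The sketch below is an emulation written for illustration, not the code used by the indexer: with fnm_pathname set, patterns are compared one path component at a time, so that '*' never matches a '/'. In both modes, trailing path components are ignored, as with FNM_LEADING_DIR.

```python
import fnmatch

def path_matches(path, pattern, fnm_pathname=True):
    """Approximate fnmatch(3) with FNM_LEADING_DIR and optional FNM_PATHNAME."""
    if fnm_pathname:
        # '*' and '?' must not match '/': compare component by component
        pat_parts = pattern.split("/")
        path_parts = path.split("/")
        if len(pat_parts) > len(path_parts):
            return False
        # FNM_LEADING_DIR: extra trailing path components are ignored
        return all(fnmatch.fnmatchcase(name, pat)
                   for name, pat in zip(path_parts, pat_parts))
    # Without FNM_PATHNAME, '*' may span '/' characters: try the full
    # path and each of its leading directories
    candidates = [path] + [path[:i] for i, ch in enumerate(path)
                           if ch == "/" and i > 0]
    return any(fnmatch.fnmatchcase(c, pattern) for c in candidates)
```

With the default setting, /*/dir3 does not match /dir1/dir2/dir3, because '*' would have to match dir1/dir2. It does match when fnm_pathname is False.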
|
|
3489 |
zipSkippedNames
|
|
|
3490 |
|
|
|
3491 |
A space-separated list of patterns for names of files or
|
|
|
3492 |
directories that should be ignored inside zip archives. This is
|
|
|
3493 |
used directly by the zip filter, and has a function similar to
|
|
|
3494 |
skippedNames, but works independently. Can be redefined for
|
|
|
3495 |
filesystem subdirectories. For versions up to 1.19, you will need
|
|
|
3496 |
to update the Zip filter and install a supplementary Python
|
|
|
3497 |
module. The details are described on the Recoll wiki.
|
3407 |
|
3498 |
|
3408 |
followLinks
|
3499 |
followLinks
|
3409 |
|
3500 |
|
3410 |
Specifies if the indexer should follow symbolic links while
|
3501 |
Specifies if the indexer should follow symbolic links while
|
3411 |
walking the file tree. The default is to ignore symbolic links to
|
3502 |
walking the file tree. The default is to ignore symbolic links to
|
|
... |
|
... |
3594 |
character, which there is currently no way to escape. Also note
|
3685 |
character, which there is currently no way to escape. Also note
|
3595 |
the initial semi-colon. Example: localfields= ;rclaptg=gnus;other
|
3686 |
the initial semi-colon. Example: localfields= ;rclaptg=gnus;other
|
3596 |
= val, then select specifier viewer with mimetype|tag=... in
|
3687 |
= val, then select specifier viewer with mimetype|tag=... in
|
3597 |
mimeview.
|
3688 |
mimeview.
|
3598 |
|
3689 |
|
|
|
3690 |
noxattrfields
|
|
|
3691 |
|
|
|
3692 |
Recoll versions 1.19 and later automatically translate file
|
|
|
3693 |
extended attributes into document fields (to be processed
|
|
|
3694 |
according to the parameters from the fields file). Setting this
|
|
|
3695 |
variable to 1 will disable the behaviour.
|
|
|
3696 |
|
3599 |
metadatacmds
|
3697 |
metadatacmds
|
3600 |
|
3698 |
|
3601 |
This allows executing external commands for each file and storing
|
3699 |
This allows executing external commands for each file and storing
|
3602 |
the output in a Recoll field. This could be used for example to
|
3700 |
the output in Recoll document fields. This could be used for
|
3603 |
index external tag data. The value is a list of field names and
|
3701 |
example to index external tag data. The value is a list of field
|
3604 |
commands, don't forget an initial semi-colon. Example:
|
3702 |
names and commands, don't forget an initial semi-colon. Example:
|
3605 |
|
3703 |
|
3606 |
[/some/area/of/the/fs]
|
3704 |
[/some/area/of/the/fs]
|
3607 |
metadatacmds = ; tags = tmsu tags %f; otherfield = somecmd -xx %f
|
3705 |
metadatacmds = ; tags = tmsu tags %f; otherfield = somecmd -xx %f
|
3608 |
|
3706 |
|
|
|
3707 |
|
|
|
3708 |
As a specially disgusting hack brought by Recoll 1.19.7, if a
|
|
|
3709 |
"field name" begins with rclmulti, the data returned by the
|
|
|
3710 |
command is expected to contain multiple field values, in
|
|
|
3711 |
configuration file format. This allows setting several fields by
|
|
|
3712 |
executing a single command. Example:
|
|
|
3713 |
|
|
|
3714 |
metadatacmds = ; rclmulti1 = somecmd %f
|
|
|
3715 |
|
|
|
3716 |
|
|
|
3717 |
If somecmd returns data in the form of:
|
|
|
3718 |
|
|
|
3719 |
field1 = value1
|
|
|
3720 |
field2 = value for field2
|
|
|
3721 |
|
|
|
3722 |
|
|
|
3723 |
field1 and field2 will be set inside the document metadata.
|
3609 |
|
3724 |
|
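A command used with an rclmulti entry only has to print one name = value line per field. The following Python sketch shows a helper which such a script could use to format its output. The format_fields name is hypothetical, and values are flattened to a single line because the sketch does not handle multi-line values.

```python
def format_fields(fields):
    """Render a dict as 'name = value' lines, one field per line."""
    lines = []
    for name, value in fields.items():
        # Collapse whitespace so that each value stays on one line
        lines.append("%s = %s" % (name, " ".join(str(value).split())))
    return "\n".join(lines) + "\n"
```

A script playing the role of somecmd would compute the field values from the file path received in place of %f, and write the result of format_fields() to its standard output.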
3610 |
5.4.1.3. Parameters affecting where and how we store things:
|
3725 |
5.4.1.3. Parameters affecting where and how we store things:
|
3611 |
|
3726 |
|
3612 |
dbdir
|
3727 |
dbdir
|
3613 |
|
3728 |
|
|
... |
|
... |
3661 |
usage also depends on average document size. The default value is
|
3776 |
usage also depends on average document size. The default value is
|
3662 |
10, and it is probably a bit low. If your system usually has free
|
3777 |
10, and it is probably a bit low. If your system usually has free
|
3663 |
memory, you can try higher values between 20 and 80. In my
|
3778 |
memory, you can try higher values between 20 and 80. In my
|
3664 |
experience, values beyond 100 are always counterproductive.
|
3779 |
experience, values beyond 100 are always counterproductive.
|
3665 |
|
3780 |
|
3666 |
5.4.1.4. Indexing parallelism configuration
|
3781 |
5.4.1.4. Parameters affecting multithread processing
|
3667 |
|
3782 |
|
3668 |
The Recoll indexing process recollindex can use multiple threads to speed
|
3783 |
The Recoll indexing process recollindex can use multiple threads to speed
|
3669 |
up indexing on multiprocessor systems. The work done to index files is
|
3784 |
up indexing on multiprocessor systems. The work done to index files is
|
3670 |
divided in several stages and some of the stages can be executed by
|
3785 |
divided in several stages and some of the stages can be executed by
|
3671 |
multiple threads. The stages are:
|
3786 |
multiple threads. The stages are:
|
|
... |
|
... |
3689 |
integer values). If a value of -1 is used for a given stage, no
|
3804 |
integer values). If a value of -1 is used for a given stage, no
|
3690 |
queue is used, and the thread will go on performing the next
|
3805 |
queue is used, and the thread will go on performing the next
|
3691 |
stage. In practice, deep queues have not been shown to increase
|
3806 |
stage. In practice, deep queues have not been shown to increase
|
3692 |
performance. A value of 0 for the first queue tells Recoll to
|
3807 |
performance. A value of 0 for the first queue tells Recoll to
|
3693 |
perform autoconfiguration (no need for the two other values in
|
3808 |
perform autoconfiguration (no need for the two other values in
|
3694 |
this case)- this is the default configuration.
|
3809 |
this case) - this is the default configuration.
|
3695 |
|
3810 |
|
3696 |
thrTCounts
|
3811 |
thrTCounts
|
3697 |
|
3812 |
|
3698 |
This defines the number of threads used for each stage. If a value
|
3813 |
This defines the number of threads used for each stage. If a value
|
3699 |
of -1 is used for one of the queue depths, the corresponding
|
3814 |
of -1 is used for one of the queue depths, the corresponding
|
|
... |
|
... |
3718 |
sequentially), so the previous approach is preferred. YMMV... The last 2
|
3833 |
sequentially), so the previous approach is preferred. YMMV... The last 2
|
3719 |
values for thrTCounts are ignored.
|
3834 |
values for thrTCounts are ignored.
|
3720 |
|
3835 |
|
3721 |
thrQSizes = 2 -1 -1
|
3836 |
thrQSizes = 2 -1 -1
|
3722 |
thrTCounts = 6 1 1
|
3837 |
thrTCounts = 6 1 1
|
|
|
3838 |
|
|
|
3839 |
The following example would disable multithreading. Indexing will be
|
|
|
3840 |
performed by a single thread.
|
|
|
3841 |
|
|
|
3842 |
thrQSizes = -1 -1 -1
|
3723 |
|
3843 |
|
3724 |
5.4.1.5. Miscellaneous parameters:
|
3844 |
5.4.1.5. Miscellaneous parameters:
|
3725 |
|
3845 |
|
3726 |
autodiacsens
|
3846 |
autodiacsens
|
3727 |
|
3847 |
|