|
a/src/README |
|
b/src/README |
|
... |
|
... |
6 |
|
6 |
|
7 |
Jean-Francois Dockes
|
7 |
Jean-Francois Dockes
|
8 |
|
8 |
|
9 |
<jfd@recoll.org>
|
9 |
<jfd@recoll.org>
|
10 |
|
10 |
|
11 |
Copyright (c) 2005-2014 Jean-Francois Dockes
|
11 |
Copyright (c) 2005-2015 Jean-Francois Dockes
|
12 |
|
12 |
|
13 |
Permission is granted to copy, distribute and/or modify this document
|
13 |
Permission is granted to copy, distribute and/or modify this document
|
14 |
under the terms of the GNU Free Documentation License, Version 1.3 or any
|
14 |
under the terms of the GNU Free Documentation License, Version 1.3 or any
|
15 |
later version published by the Free Software Foundation; with no Invariant
|
15 |
later version published by the Free Software Foundation; with no Invariant
|
16 |
Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the
|
16 |
Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the
|
17 |
license can be found at the following location: GNU web site.
|
17 |
license can be found at the following location: GNU web site.
|
18 |
|
18 |
|
19 |
This document introduces full text search notions and describes the
|
19 |
This document introduces full text search notions and describes the
|
20 |
installation and use of the Recoll application. It currently describes
|
20 |
installation and use of the Recoll application. This version describes
|
21 |
Recoll 1.20.
|
21 |
Recoll 1.21.
|
22 |
|
22 |
|
23 |
----------------------------------------------------------------------
|
23 |
----------------------------------------------------------------------
|
24 |
|
24 |
|
25 |
Table of Contents
|
25 |
Table of Contents
|
26 |
|
26 |
|
|
... |
|
... |
40 |
|
40 |
|
41 |
2.1.2. Configurations, multiple indexes
|
41 |
2.1.2. Configurations, multiple indexes
|
42 |
|
42 |
|
43 |
2.1.3. Document types
|
43 |
2.1.3. Document types
|
44 |
|
44 |
|
|
|
45 |
2.1.4. Indexing failures
|
|
|
46 |
|
45 |
2.1.4. Recovery
|
47 |
2.1.5. Recovery
|
46 |
|
48 |
|
47 |
2.2. Index storage
|
49 |
2.2. Index storage
|
48 |
|
50 |
|
49 |
2.2.1. Xapian index formats
|
51 |
2.2.1. Xapian index formats
|
50 |
|
52 |
|
|
... |
|
... |
105 |
3.1.12. Sorting search results and collapsing
|
107 |
3.1.12. Sorting search results and collapsing
|
106 |
duplicates
|
108 |
duplicates
|
107 |
|
109 |
|
108 |
3.1.13. Search tips, shortcuts
|
110 |
3.1.13. Search tips, shortcuts
|
109 |
|
111 |
|
|
|
112 |
3.1.14. Saving and restoring queries (1.21 and
|
|
|
113 |
later)
|
|
|
114 |
|
110 |
3.1.14. Customizing the search interface
|
115 |
3.1.15. Customizing the search interface
|
111 |
|
116 |
|
112 |
3.2. Searching with the KDE KIO slave
|
117 |
3.2. Searching with the KDE KIO slave
|
113 |
|
118 |
|
114 |
3.2.1. What's this
|
119 |
3.2.1. What's this
|
115 |
|
120 |
|
|
... |
|
... |
161 |
|
166 |
|
162 |
5. Installation and configuration
|
167 |
5. Installation and configuration
|
163 |
|
168 |
|
164 |
5.1. Installing a binary copy
|
169 |
5.1. Installing a binary copy
|
165 |
|
170 |
|
166 |
5.1.1. Installing through a package system
|
|
|
167 |
|
|
|
168 |
5.1.2. Installing a prebuilt Recoll
|
|
|
169 |
|
|
|
170 |
5.2. Supporting packages
|
171 |
5.2. Supporting packages
|
171 |
|
172 |
|
172 |
5.3. Building from source
|
173 |
5.3. Building from source
|
173 |
|
174 |
|
174 |
5.3.1. Prerequisites
|
175 |
5.3.1. Prerequisites
|
|
... |
|
... |
177 |
|
178 |
|
178 |
5.3.3. Installation
|
179 |
5.3.3. Installation
|
179 |
|
180 |
|
180 |
5.4. Configuration overview
|
181 |
5.4. Configuration overview
|
181 |
|
182 |
|
|
|
183 |
5.4.1. Environment variables
|
|
|
184 |
|
182 |
5.4.1. The main configuration file, recoll.conf
|
185 |
5.4.2. The main configuration file, recoll.conf
|
183 |
|
186 |
|
184 |
5.4.2. The fields file
|
187 |
5.4.3. The fields file
|
185 |
|
188 |
|
186 |
5.4.3. The mimemap file
|
189 |
5.4.4. The mimemap file
|
187 |
|
190 |
|
188 |
5.4.4. The mimeconf file
|
191 |
5.4.5. The mimeconf file
|
189 |
|
192 |
|
190 |
5.4.5. The mimeview file
|
193 |
5.4.6. The mimeview file
|
191 |
|
194 |
|
192 |
5.4.6. The ptrans file
|
195 |
5.4.7. The ptrans file
|
193 |
|
196 |
|
194 |
5.4.7. Examples of configuration adjustments
|
197 |
5.4.8. Examples of configuration adjustments
|
195 |
|
198 |
|
196 |
Chapter 1. Introduction
|
199 |
Chapter 1. Introduction
|
197 |
|
200 |
|
198 |
1.1. Giving it a try
|
201 |
1.1. Giving it a try
|
199 |
|
202 |
|
|
... |
|
... |
350 |
documents will only be processed if they have been modified since the last
|
353 |
documents will only be processed if they have been modified since the last
|
351 |
run. On the first execution, all documents will need processing. A full
|
354 |
run. On the first execution, all documents will need processing. A full
|
352 |
index build can be forced later by specifying an option to the indexing
|
355 |
index build can be forced later by specifying an option to the indexing
|
353 |
command (recollindex -z or -Z).
|
356 |
command (recollindex -z or -Z).
|
354 |
|
357 |
|
|
|
358 |
recollindex skips files which caused an error during a previous pass. This
|
|
|
359 |
is a performance optimization, and a new behaviour in version 1.21 (failed
|
|
|
360 |
files were always retried by previous versions). The command line option
|
|
|
361 |
-k can be set to retry failed files, for example after updating a filter.
|
|
|
362 |
|
355 |
The following sections give an overview of different aspects of the
|
363 |
The following sections give an overview of different aspects of the
|
356 |
indexing processes and configuration, with links to detailed sections.
|
364 |
indexing processes and configuration, with links to detailed sections.
|
|
|
365 |
|
|
|
366 |
Depending on your data, temporary files may be needed during indexing,
|
|
|
367 |
some of them possibly quite big. You can use the RECOLL_TMPDIR or TMPDIR
|
|
|
368 |
environment variables to determine where they are created (the default is
|
|
|
369 |
to use /tmp). Using TMPDIR has the nice property that it may also be taken
|
|
|
370 |
into account by auxiliary commands executed by recollindex.
|
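The choice of temporary directory described above can be sketched in
Python. This is an illustration only, not Recoll's own code: the helper
name pick_tmpdir is hypothetical, and the sketch assumes that
RECOLL_TMPDIR is consulted before TMPDIR, that an empty value counts as
unset, and that /tmp is the fallback.

```python
import os


def pick_tmpdir(environ=None):
    """Return the directory where temporary files would be created.

    Hypothetical helper, not part of Recoll. Assumptions: RECOLL_TMPDIR
    is checked first, then TMPDIR, and /tmp is used when neither is set.
    """
    env = os.environ if environ is None else environ
    for name in ("RECOLL_TMPDIR", "TMPDIR"):
        value = env.get(name)
        if value:  # assumption: an empty variable is treated as unset
            return value
    return "/tmp"


if __name__ == "__main__":
    # Show what the current environment would select.
    print(pick_tmpdir())
```

Passing a plain dictionary instead of the real environment makes it easy
to check the order without touching the shell.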
357 |
|
371 |
|
358 |
2.1.1. Indexing modes
|
372 |
2.1.1. Indexing modes
|
359 |
|
373 |
|
360 |
Recoll indexing can be performed along two different modes:
|
374 |
Recoll indexing can be performed along two different modes:
|
361 |
|
375 |
|
|
... |
|
... |
460 |
|
474 |
|
461 |
excludedmimetypes or indexedmimetypes, can be set either by editing the
|
475 |
excludedmimetypes or indexedmimetypes, can be set either by editing the
|
462 |
main configuration file (recoll.conf), or from the GUI index configuration
|
476 |
main configuration file (recoll.conf), or from the GUI index configuration
|
463 |
tool.
|
477 |
tool.
|
464 |
|
478 |
|
|
|
479 |
2.1.4. Indexing failures
|
|
|
480 |
|
|
|
481 |
Indexing may fail for some documents, for a number of reasons: a helper
|
|
|
482 |
program may be missing, the document may be corrupt, we may fail to
|
|
|
483 |
uncompress a file because no file system space is available, etc.
|
|
|
484 |
|
|
|
485 |
Recoll versions prior to 1.21 always retried indexing files which had
|
|
|
486 |
previously caused an error. This guaranteed that anything that may have
|
|
|
487 |
become indexable (for example because a helper had been installed) would
|
|
|
488 |
be indexed. However this was bad for performance because some indexing
|
|
|
489 |
failures may be quite costly (for example failing to uncompress a big file
|
|
|
490 |
because of insufficient disk space).
|
|
|
491 |
|
|
|
492 |
The indexer in Recoll versions 1.21 and later does not retry failed files by
|
|
|
493 |
default. Retrying will only occur if an explicit option (-k) is set on the
|
|
|
494 |
recollindex command line, or if a script executed when recollindex starts
|
|
|
495 |
up says so. The script is defined by a configuration variable
|
|
|
496 |
(checkneedretryindexscript), and makes a rather lame attempt at deciding
|
|
|
497 |
if a helper command may have been installed, by checking if any of the
|
|
|
498 |
common bin directories have changed.
|
|
|
499 |
|
465 |
2.1.4. Recovery
|
500 |
2.1.5. Recovery
|
466 |
|
501 |
|
467 |
In the rare case where the index becomes corrupted (which can signal
|
502 |
In the rare case where the index becomes corrupted (which can signal
|
468 |
itself by weird search results or crashes), the index files need to be
|
503 |
itself by weird search results or crashes), the index files need to be
|
469 |
erased before restarting a clean indexing pass. Just delete the xapiandb
|
504 |
erased before restarting a clean indexing pass. Just delete the xapiandb
|
470 |
directory (see next section), or, alternatively, start the next
|
505 |
directory (see next section), or, alternatively, start the next
|
|
... |
|
... |
783 |
index first. This will not have the "clean start" aspect of -z, but the
|
818 |
index first. This will not have the "clean start" aspect of -z, but the
|
784 |
advantage is that the index will remain available for querying while it is
|
819 |
advantage is that the index will remain available for querying while it is
|
785 |
rebuilt, which can be a significant advantage if it is very big (some
|
820 |
rebuilt, which can be a significant advantage if it is very big (some
|
786 |
installations need days for a full index rebuild).
|
821 |
installations need days for a full index rebuild).
|
787 |
|
822 |
|
|
|
823 |
Option -k will force retrying files which previously failed to be indexed,
|
|
|
824 |
for example because of a missing helper program.
|
|
|
825 |
|
788 |
Of special interest also, maybe, are the -i and -f options. -i allows
|
826 |
Of special interest also, maybe, are the -i and -f options. -i allows
|
789 |
indexing an explicit list of files (given as command line parameters or
|
827 |
indexing an explicit list of files (given as command line parameters or
|
790 |
read on stdin). -f tells recollindex to ignore file selection parameters
|
828 |
read on stdin). -f tells recollindex to ignore file selection parameters
|
791 |
from the configuration. Together, these options allow building a custom
|
829 |
from the configuration. Together, these options allow building a custom
|
792 |
file selection process for some area of the file system, by adding the top
|
830 |
file selection process for some area of the file system, by adding the top
|
|
... |
|
... |
865 |
|
903 |
|
866 |
If you use the daemon completely out of an X11 session, you need to add
|
904 |
If you use the daemon completely out of an X11 session, you need to add
|
867 |
option -x to disable X11 session monitoring (else the daemon will not
|
905 |
option -x to disable X11 session monitoring (else the daemon will not
|
868 |
start).
|
906 |
start).
|
869 |
|
907 |
|
870 |
By default, the messages from the indexing daemon will be discarded. You
|
908 |
By default, the messages from the indexing daemon will be sent to the same
|
|
|
909 |
file as those from the interactive commands (logfilename). You may want to
|
871 |
may want to change this by setting the daemlogfilename and daemloglevel
|
910 |
change this by setting the daemlogfilename and daemloglevel configuration
|
872 |
configuration parameters. Also the log file will only be truncated when
|
911 |
parameters. Also the log file will only be truncated when the daemon
|
873 |
the daemon starts. If the daemon runs permanently, the log file may grow
|
912 |
starts. If the daemon runs permanently, the log file may grow quite big,
|
874 |
quite big, depending on the log level.
|
913 |
depending on the log level.
|
875 |
|
914 |
|
876 |
When building Recoll, the real time indexing support can be customised
|
915 |
When building Recoll, the real time indexing support can be customised
|
877 |
during package configuration with the --with[out]-fam or
|
916 |
during package configuration with the --with[out]-fam or
|
878 |
--with[out]-inotify options. The default is currently to include inotify
|
917 |
--with[out]-inotify options. The default is currently to include inotify
|
879 |
monitoring on systems that support it, and, as of Recoll 1.17, gamin
|
918 |
monitoring on systems that support it, and, as of Recoll 1.17, gamin
|
|
... |
|
... |
944 |
printed is for east-asian languages (Chinese, Japanese, Korean). Words
|
983 |
printed is for east-asian languages (Chinese, Japanese, Korean). Words
|
945 |
composed of single or multiple characters should be entered separated by
|
984 |
composed of single or multiple characters should be entered separated by
|
946 |
white space in this case (they would typically be printed without white
|
985 |
white space in this case (they would typically be printed without white
|
947 |
space).
|
986 |
space).
|
948 |
|
987 |
|
|
|
988 |
Some searches can be quite complex, and you may want to re-use them later,
|
|
|
989 |
perhaps with some tweaking. Recoll versions 1.21 and later can save and
|
|
|
990 |
restore searches, using XML files. See Saving and restoring queries.
|
|
|
991 |
|
949 |
3.1.1. Simple search
|
992 |
3.1.1. Simple search
|
950 |
|
993 |
|
951 |
1. Start the recoll program.
|
994 |
1. Start the recoll program.
|
952 |
|
995 |
|
953 |
2. Possibly choose a search mode: Any term, All terms, File name or Query
|
996 |
2. Possibly choose a search mode: Any term, All terms, File name or Query
|
|
... |
|
... |
1370 |
3.1.8. Complex/advanced search
|
1413 |
3.1.8. Complex/advanced search
|
1371 |
|
1414 |
|
1372 |
The advanced search dialog helps you build more complex queries without
|
1415 |
The advanced search dialog helps you build more complex queries without
|
1373 |
memorizing the search language constructs. It can be opened through the
|
1416 |
memorizing the search language constructs. It can be opened through the
|
1374 |
Tools menu or through the main toolbar.
|
1417 |
Tools menu or through the main toolbar.
|
|
|
1418 |
|
|
|
1419 |
Recoll keeps a history of searches. See Advanced search history.
|
1375 |
|
1420 |
|
1376 |
The dialog has two tabs:
|
1421 |
The dialog has two tabs:
|
1377 |
|
1422 |
|
1378 |
1. The first tab lets you specify terms to search for, and permits
|
1423 |
1. The first tab lets you specify terms to search for, and permits
|
1379 |
specifying multiple clauses which are combined to build the search.
|
1424 |
specifying multiple clauses which are combined to build the search.
|
|
... |
|
... |
1743 |
Printing previews. Entering Ctrl-P in a preview window will print the
|
1788 |
Printing previews. Entering Ctrl-P in a preview window will print the
|
1744 |
currently displayed text.
|
1789 |
currently displayed text.
|
1745 |
|
1790 |
|
1746 |
Quitting. Entering Ctrl-Q almost anywhere will close the application.
|
1791 |
Quitting. Entering Ctrl-Q almost anywhere will close the application.
|
1747 |
|
1792 |
|
|
|
1793 |
3.1.14. Saving and restoring queries (1.21 and later)
|
|
|
1794 |
|
|
|
1795 |
Both simple and advanced query dialogs save recent history, but the amount
|
|
|
1796 |
is limited: old queries will eventually be forgotten. Also, important
|
|
|
1797 |
queries may be difficult to find among others. This is why both types of
|
|
|
1798 |
queries can also be explicitly saved to files, from the GUI menus: File
|
|
|
1799 |
-> Save last query / Load last query
|
|
|
1800 |
|
|
|
1801 |
The default location for saved queries is a subdirectory of the current
|
|
|
1802 |
configuration directory, but saved queries are ordinary files and can be
|
|
|
1803 |
written or moved anywhere.
|
|
|
1804 |
|
|
|
1805 |
Some of the saved query parameters are part of the preferences (e.g.
|
|
|
1806 |
autophrase or the active external indexes), and may differ when the query
|
|
|
1807 |
is loaded from the time it was saved. In this case, Recoll will warn of
|
|
|
1808 |
the differences, but will not change the user preferences.
|
|
|
1809 |
|
1748 |
3.1.14. Customizing the search interface
|
1810 |
3.1.15. Customizing the search interface
|
1749 |
|
1811 |
|
1750 |
You can customize some aspects of the search interface by using the GUI
|
1812 |
You can customize some aspects of the search interface by using the GUI
|
1751 |
configuration entry in the Preferences menu.
|
1813 |
configuration entry in the Preferences menu.
|
1752 |
|
1814 |
|
1753 |
There are several tabs in the dialog, dealing with the interface itself,
|
1815 |
There are several tabs in the dialog, dealing with the interface itself,
|
|
... |
|
... |
1910 |
always implicitly active. If this is not desirable, you can set up your
|
1972 |
always implicitly active. If this is not desirable, you can set up your
|
1911 |
configuration so that it indexes, for example, an empty directory. An
|
1973 |
configuration so that it indexes, for example, an empty directory. An
|
1912 |
alternative indexer may also need to implement a way of purging the index
|
1974 |
alternative indexer may also need to implement a way of purging the index
|
1913 |
from stale data,
|
1975 |
from stale data,
|
1914 |
|
1976 |
|
1915 |
3.1.14.1. The result list format
|
1977 |
3.1.15.1. The result list format
|
|
|
1978 |
|
|
|
1979 |
Newer versions of Recoll (from 1.17) normally use WebKit HTML widgets for
|
|
|
1980 |
the result list and the snippets window (this may be disabled at build
|
|
|
1981 |
time). Total customisation is possible with full support for CSS and
|
|
|
1982 |
Javascript. Conversely, there are limits to what you can do with the older
|
|
|
1983 |
Qt QTextBrowser, but still, it is possible to decide what data each result
|
|
|
1984 |
will contain, and how it will be displayed.
|
1916 |
|
1985 |
|
1917 |
The result list presentation can be exhaustively customized by adjusting
|
1986 |
The result list presentation can be exhaustively customized by adjusting
|
1918 |
two elements:
|
1987 |
two elements:
|
1919 |
|
1988 |
|
1920 |
o The paragraph format
|
1989 |
o The paragraph format
|
1921 |
|
1990 |
|
1922 |
o HTML code inside the header section
|
1991 |
o HTML code inside the header section. For versions 1.21 and later, this
|
|
|
1992 |
is also used for the snippets window.
|
1923 |
|
1993 |
|
1924 |
These can be edited from the Result list tab of the GUI configuration.
|
1994 |
The paragraph format and the header fragment can be edited from the Result
|
|
|
1995 |
list tab of the GUI configuration.
|
1925 |
|
1996 |
|
1926 |
Newer versions of Recoll (from 1.17) use a WebKit HTML object by default
|
1997 |
The header fragment is used both for the result list and the snippets
|
1927 |
(this may be disabled at build time), and total customisation is possible
|
1998 |
window. The snippets list is a table and has a snippets class attribute.
|
1928 |
with full support for CSS and Javascript. Conversely, there are limits to
|
1999 |
Each paragraph in the result list is a table, with class respar, but this
|
1929 |
what you can do with the older Qt QTextBrowser, but still, it is possible
|
2000 |
can be changed by editing the paragraph format.
|
1930 |
to decide what data each result will contain, and how it will be
|
|
|
1931 |
displayed.
|
|
|
1932 |
|
2001 |
|
1933 |
No more detail will be given about the header part (only useful with the
|
|
|
1934 |
WebKit build), if there are restrictions to what you can do, they are
|
|
|
1935 |
beyond this author's HTML/CSS/Javascript abilities... There are a few
|
|
|
1936 |
examples on the page about customising the result list on the Recoll web
|
2002 |
There are a few examples on the page about customising the result list on
|
1937 |
site.
|
2003 |
the Recoll web site.
|
1938 |
|
2004 |
|
1939 |
The paragraph format
|
2005 |
The paragraph format
|
1940 |
|
2006 |
|
1941 |
This is an arbitrary HTML string where the following printf-like %
|
2007 |
This is an arbitrary HTML string where the following printf-like %
|
1942 |
substitutions will be performed:
|
2008 |
substitutions will be performed:
|
|
... |
|
... |
1995 |
example candidate would be the recipient field which is generated by the
|
2061 |
example candidate would be the recipient field which is generated by the
|
1996 |
message input handlers.
|
2062 |
message input handlers.
|
1997 |
|
2063 |
|
1998 |
The default value for the paragraph format string is:
|
2064 |
The default value for the paragraph format string is:
|
1999 |
|
2065 |
|
2000 |
<img src="%I" align="left">%R %S %L <b>%T</b><br>
|
2066 |
"<table class=\"respar\">\n"
|
2001 |
%M %D <i>%U</i> %i<br>
|
2067 |
"<tr>\n"
|
2002 |
%A %K
|
2068 |
"<td><a href='%U'><img src='%I' width='64'></a></td>\n"
|
|
|
2069 |
"<td>%L <i>%S</i> <b>%T</b><br>\n"
|
|
|
2070 |
"<span style='white-space:nowrap'><i>%M</i> %D</span> <i>%U</i> %i<br>\n"
|
|
|
2071 |
"%A %K</td>\n"
|
|
|
2072 |
"</tr></table>\n"
|
2003 |
|
2073 |
|
2004 |
You may, for example, try the following for a more web-like experience:
|
2074 |
You may, for example, try the following for a more web-like experience:
|
2005 |
|
2075 |
|
2006 |
<u><b><a href="P%N">%T</a></b></u><br>
|
2076 |
<u><b><a href="P%N">%T</a></b></u><br>
|
2007 |
%A<font color=#008000>%U - %S</font> - %L
|
2077 |
%A<font color=#008000>%U - %S</font> - %L
|
|
... |
|
... |
2203 |
or lennon and either live or unplugged but not potatoes (in any part of
|
2273 |
or lennon and either live or unplugged but not potatoes (in any part of
|
2204 |
the document).
|
2274 |
the document).
|
2205 |
|
2275 |
|
2206 |
An element is composed of an optional field specification, and a value,
|
2276 |
An element is composed of an optional field specification, and a value,
|
2207 |
separated by a colon (the field separator is the last colon in the
|
2277 |
separated by a colon (the field separator is the last colon in the
|
2208 |
element). Example: Eugenie, author:balzac, dc:title:grandet
|
2278 |
element). Examples: Eugenie, author:balzac, dc:title:grandet
|
|
|
2279 |
dc:title:"eugenie grandet"
|
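The "last colon" rule can be illustrated with a short Python sketch. It
is not Recoll's parser: the function name split_element is hypothetical,
and the sketch does not handle a colon inside a quoted value.

```python
def split_element(element):
    """Split a query element into (field, value).

    Illustrative only: the split happens at the last colon, so
    'dc:title:grandet' yields the field 'dc:title'. An element without
    a colon has no field specification (None is returned for it).
    """
    field, sep, value = element.rpartition(":")
    if not sep:
        return None, element
    return field, value


if __name__ == "__main__":
    for element in ("Eugenie", "author:balzac", "dc:title:grandet"):
        print(element, "->", split_element(element))
```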
2209 |
|
2280 |
|
2210 |
The colon, if present, means "contains". Xesam defines other relations,
|
2281 |
The colon, if present, means "contains". Xesam defines other relations,
|
2211 |
which are mostly unsupported for now (except in special cases, described
|
2282 |
which are mostly unsupported for now (except in special cases, described
|
2212 |
further down).
|
2283 |
further down).
|
2213 |
|
2284 |
|
|
... |
|
... |
2216 |
Beatles OR Lennon. The OR must be entered literally (capitals), and it has
|
2287 |
Beatles OR Lennon. The OR must be entered literally (capitals), and it has
|
2217 |
priority over the AND associations: word1 word2 OR word3 means word1 AND
|
2288 |
priority over the AND associations: word1 word2 OR word3 means word1 AND
|
2218 |
(word2 OR word3) not (word1 AND word2) OR word3. Explicit parenthesis are
|
2289 |
(word2 OR word3) not (word1 AND word2) OR word3. Explicit parenthesis are
|
2219 |
not supported.
|
2290 |
not supported.
|
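The priority of OR over the implicit AND can be shown with a small
Python sketch. It is a simplification and not Recoll's parser: the name
group_terms is hypothetical, and quoted phrases, negation and
parentheses are ignored. The result is a list of OR groups which are
AND-ed together.

```python
def group_terms(query):
    """Group whitespace-separated terms into an AND list of OR groups.

    Only the literal, capitalized token 'OR' acts as an operator, and
    it binds the terms on each side of it more tightly than the
    implicit AND between the other terms.
    """
    groups = []
    join_next = False
    for token in query.split():
        if token == "OR":
            # A leading OR has nothing to attach to and is dropped.
            join_next = bool(groups)
            continue
        if join_next:
            groups[-1].append(token)
        else:
            groups.append([token])
        join_next = False
    return groups


if __name__ == "__main__":
    # word1 AND (word2 OR word3)
    print(group_terms("word1 word2 OR word3"))
```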
2220 |
|
2291 |
|
|
|
2292 |
As of Recoll 1.21, you can use parentheses to group elements, which will
|
|
|
2293 |
sometimes make things clearer, and may allow expressing combinations which
|
|
|
2294 |
would have been difficult otherwise.
|
|
|
2295 |
|
2221 |
An element preceded by a - specifies a term that should not appear. Pure
|
2296 |
An element preceded by a - specifies a term that should not appear.
|
2222 |
negative queries are forbidden.
|
|
|
2223 |
|
2297 |
|
2224 |
As usual, words inside quotes define a phrase (the order of words is
|
2298 |
As usual, words inside quotes define a phrase (the order of words is
|
2225 |
significant), so that title:"prejudice pride" is not the same as
|
2299 |
significant), so that title:"prejudice pride" is not the same as
|
2226 |
title:prejudice title:pride, and is unlikely to find a result.
|
2300 |
title:prejudice title:pride, and is unlikely to find a result.
|
2227 |
|
2301 |
|
|
|
2302 |
Words inside phrases and capitalized words are not stem-expanded.
|
|
|
2303 |
Wildcards may be used anywhere inside a term. Specifying a wild-card on
|
|
|
2304 |
the left of a term can produce a very slow search (or even an incorrect
|
|
|
2305 |
one if the expansion is truncated because of excessive size). Also see
|
|
|
2306 |
More about wildcards.
|
|
|
2307 |
|
2228 |
To save you some typing, recent Recoll versions (1.20 and later) interpret
|
2308 |
To save you some typing, recent Recoll versions (1.20 and later) interpret
|
2229 |
a comma-separated list of terms as an AND list inside the field. Use slash
|
2309 |
a comma-separated list of terms as an AND list inside the field. Use slash
|
2230 |
characters ('/') for an OR list. No white space is allowed. So
|
2310 |
characters ('/') for an OR list. No white space is allowed. So
|
2231 |
|
2311 |
|
2232 |
author:john,lennon
|
2312 |
author:john,lennon
|
|
... |
|
... |
2236 |
|
2316 |
|
2237 |
author:john/ringo
|
2317 |
author:john/ringo
|
2238 |
|
2318 |
|
2239 |
would search for john or ringo.
|
2319 |
would search for john or ringo.
|
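The comma and slash shorthand can be pictured with the following Python
sketch. It only illustrates the rule for plain term lists and is not
Recoll's parser: the name expand_field_list is hypothetical, and mixing
commas and slashes in one value is rejected because the text above does
not say how that case is handled.

```python
def expand_field_list(element):
    """Return (field, operator, terms) for a field:value element.

    A comma-separated value is an AND list, a slash-separated value is
    an OR list. A single term is reported as a one-element AND list.
    """
    field, sep, value = element.rpartition(":")
    if not sep:
        field = None
    if "," in value and "/" in value:
        raise ValueError("mixed ',' and '/' is not covered by this sketch")
    if "/" in value:
        return field, "OR", value.split("/")
    return field, "AND", value.split(",")


if __name__ == "__main__":
    print(expand_field_list("author:john,lennon"))
    print(expand_field_list("author:john/ringo"))
```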
2240 |
|
2320 |
|
2241 |
Modifiers can be set on a phrase clause, for example to specify a
|
2321 |
Modifiers can be set on a double-quote value, for example to specify a
|
2242 |
proximity search (unordered). See the modifier section.
|
2322 |
proximity search (unordered). See the modifier section. No space must
|
|
|
2323 |
separate the final double-quote and the modifier value, e.g. "two
|
|
|
2324 |
one"po10
|
2243 |
|
2325 |
|
2244 |
Recoll currently manages the following default fields:
|
2326 |
Recoll currently manages the following default fields:
|
2245 |
|
2327 |
|
2246 |
o title, subject or caption are synonyms which specify data to be
|
2328 |
o title, subject or caption are synonyms which specify data to be
|
2247 |
searched for in the document title or subject.
|
2329 |
searched for in the document title or subject.
|
|
... |
|
... |
2353 |
text/media/presentation/etc.). The classification of MIME types in
|
2435 |
text/media/presentation/etc.). The classification of MIME types in
|
2354 |
categories is defined in the Recoll configuration (mimeconf), and can
|
2436 |
categories is defined in the Recoll configuration (mimeconf), and can
|
2355 |
be modified or extended. The default category names are those which
|
2437 |
be modified or extended. The default category names are those which
|
2356 |
permit filtering results in the main GUI screen. Categories are OR'ed
|
2438 |
permit filtering results in the main GUI screen. Categories are OR'ed
|
2357 |
like MIME types above. This can't be negated with - either.
|
2439 |
like MIME types above. This can't be negated with - either.
|
2358 |
|
|
|
2359 |
Words inside phrases and capitalized words are not stem-expanded.
|
|
|
2360 |
Wildcards may be used anywhere inside a term. Specifying a wild-card on
|
|
|
2361 |
the left of a term can produce a very slow search (or even an incorrect
|
|
|
2362 |
one if the expansion is truncated because of excessive size). Also see
|
|
|
2363 |
More about wildcards.
|
|
|
2364 |
|
2440 |
|
2365 |
The document input handlers used while indexing have the possibility to
|
2441 |
The document input handlers used while indexing have the possibility to
|
2366 |
create other fields with arbitrary names, and aliases may be defined in
|
2442 |
create other fields with arbitrary names, and aliases may be defined in
|
2367 |
the configuration, so that the exact field search possibilities may be
|
2443 |
the configuration, so that the exact field search possibilities may be
|
2368 |
different for you if someone took care of the customisation.
|
2444 |
different for you if someone took care of the customisation.
|
|
... |
|
... |
3247 |
|
3323 |
|
3248 |
Chapter 5. Installation and configuration
|
3324 |
Chapter 5. Installation and configuration
|
3249 |
|
3325 |
|
3250 |
5.1. Installing a binary copy
|
3326 |
5.1. Installing a binary copy
|
3251 |
|
3327 |
|
3252 |
There are three types of binary Recoll installations:
|
3328 |
Recoll binary copies are always distributed as regular packages for your
|
|
|
3329 |
system. They can be obtained either through the system's normal software
|
|
|
3330 |
distribution framework (e.g. Debian/Ubuntu apt, FreeBSD ports, etc.), or
|
|
|
3331 |
from some type of "backports" repository providing versions newer than the
|
|
|
3332 |
standard ones, or, in some cases, from the Recoll web site.
|
3253 |
|
3333 |
|
3254 |
o Through your system normal software distribution framework (ie,
|
3334 |
There used to exist another form of binary install, as pre-compiled source
|
3255 |
Debian/Ubuntu apt, FreeBSD ports, etc.).
|
3335 |
trees, but these are just less convenient than the packages and don't
|
|
|
3336 |
exist any more.
|
3256 |
|
3337 |
|
3257 |
o From a package downloaded from the Recoll web site.
|
3338 |
The package management tools will usually automatically deal with hard
|
|
|
3339 |
dependencies for packages obtained from a proper package repository. You
|
|
|
3340 |
will have to deal with them by hand for downloaded packages (for example,
|
|
|
3341 |
when dpkg complains about missing dependencies).
|
3258 |
|
3342 |
|
3259 |
o From a prebuilt tree downloaded from the Recoll web site.
|
|
|
3260 |
|
|
|
3261 |
In all cases, the strict software dependancies (ie on Xapian or iconv)
|
|
|
3262 |
will be automatically satisfied, you should not have to worry about them.
|
|
|
3263 |
|
|
|
3264 |
You will only have to check or install supporting applications for the
|
3343 |
In all cases, you will have to check or install supporting applications
|
3265 |
file types that you want to index beyond those that are natively processed
|
3344 |
for the file types that you want to index beyond those that are natively
|
3266 |
by Recoll (text, HTML, email files, and a few others).
|
3345 |
processed by Recoll (text, HTML, email files, and a few others).
|
3267 |
|
3346 |
|
3268 |
You should also maybe have a look at the configuration section (but this
|
3347 |
You should also maybe have a look at the configuration section (but this
|
3269 |
may not be necessary for a quick test with default parameters). Most
|
3348 |
may not be necessary for a quick test with default parameters). Most
|
3270 |
parameters can be more conveniently set from the GUI interface.
|
3349 |
parameters can be more conveniently set from the GUI interface.
|
3271 |
|
|
|
3272 |
5.1.1. Installing through a package system
|
|
|
3273 |
|
|
|
3274 |
If you use a BSD-type port system or a prebuilt package (DEB, RPM,
|
|
|
3275 |
manually or through the system software configuration utility), just
|
|
|
3276 |
follow the usual procedure for your system.
|
|
|
3277 |
|
|
|
3278 |
5.1.2. Installing a prebuilt Recoll
|
|
|
3279 |
|
|
|
3280 |
The unpackaged binary versions on the Recoll web site are just compressed
|
|
|
3281 |
tar files of a build tree, where only the useful parts were kept
|
|
|
3282 |
(executables and sample configuration).
|
|
|
3283 |
|
|
|
3284 |
The executable binary files are built with a static link to libxapian and
|
|
|
3285 |
libiconv, to make installation easier (no dependencies).
|
|
|
3286 |
|
|
|
3287 |
After extracting the tar file, you can proceed with installation as if you
|
|
|
3288 |
had built the package from source (that is, just type make install). The
|
|
|
3289 |
binary trees are built for installation to /usr/local.
|
|
|
3290 |
|
3350 |
|
3291 |
5.2. Supporting packages
|
3351 |
5.2. Supporting packages
|
3292 |
|
3352 |
|
3293 |
Recoll uses external applications to index some file types. You need to
|
3353 |
Recoll uses external applications to index some file types. You need to
|
3294 |
install them for the file types that you wish to have indexed (these are
|
3354 |
install them for the file types that you wish to have indexed (these are
|
|
... |
|
... |
3485 |
o Of course the usual autoconf configure options, like --prefix apply.
|
3545 |
o Of course the usual autoconf configure options, like --prefix apply.
|
3486 |
|
3546 |
|
3487 |
Normal procedure:
|
3547 |
Normal procedure:
|
3488 |
|
3548 |
|
3489 |
cd recoll-xxx
|
3549 |
cd recoll-xxx
|
3490 |
configure
|
3550 |
./configure
|
3491 |
make
|
3551 |
make
|
3492 |
(practices usual hardship-repelling invocations)
|
3552 |
(practices usual hardship-repelling invocations)
|
3493 |
|
3553 |
|
3494 |
|
3554 |
|
3495 |
There is little auto-configuration. The configure script will mainly link
|
3555 |
There is little auto-configuration. The configure script will mainly link
|
|
... |
|
... |
3622 |
handle multiple encodings in a single file. In this relatively
|
3682 |
handle multiple encodings in a single file. In this relatively
|
3623 |
unlikely case, you can edit the configuration file as two separate
|
3683 |
unlikely case, you can edit the configuration file as two separate
|
3624 |
text files with appropriate encodings, and concatenate them to create
|
3684 |
text files with appropriate encodings, and concatenate them to create
|
3625 |
the complete configuration.
|
3685 |
the complete configuration.
|
3626 |
|
3686 |
|
|
|
3687 |
5.4.1. Environment variables
|
|
|
3688 |
|
|
|
3689 |
RECOLL_CONFDIR
|
|
|
3690 |
|
|
|
3691 |
Defines the main configuration directory.
|
|
|
3692 |
|
|
|
3693 |
RECOLL_TMPDIR, TMPDIR
|
|
|
3694 |
|
|
|
3695 |
Locations for temporary files, in this order of priority. The
|
|
|
3696 |
default if none of these is set is to use /tmp. Big temporary
|
|
|
3697 |
files may be created during indexing, mostly for decompressing,
|
|
|
3698 |
and also for processing, e.g. email attachments.
|
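The lookup order described for RECOLL_TMPDIR and TMPDIR can be sketched as follows. This is only an illustration of the documented priority, not Recoll's own code; the function name pick_tmpdir is hypothetical, and treating an empty variable as unset is an assumption.

```python
import os


def pick_tmpdir(env=None):
    """Return the temporary directory following the documented priority.

    RECOLL_TMPDIR is consulted first, then TMPDIR, and /tmp is the fallback.
    Assumption: an empty value is treated the same as an unset variable.
    """
    env = os.environ if env is None else env
    for name in ("RECOLL_TMPDIR", "TMPDIR"):
        value = env.get(name)
        if value:
            return value
    return "/tmp"


if __name__ == "__main__":
    print(pick_tmpdir())
```

Since large temporary files may be created during indexing, pointing RECOLL_TMPDIR at a volume with enough free space is the usual reason to set it.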
|
|
3699 |
|
|
|
3700 |
RECOLL_CONFTOP, RECOLL_CONFMID
|
|
|
3701 |
|
|
|
3702 |
Allow adding configuration directories with priorities below and
|
|
|
3703 |
above the user directory (see the Configuration overview
|
|
|
3704 |
section for details).
|
|
|
3705 |
|
|
|
3706 |
RECOLL_EXTRA_DBS, RECOLL_ACTIVE_EXTRA_DBS
|
|
|
3707 |
|
|
|
3708 |
Help for setting up external indexes. See this paragraph for
|
|
|
3709 |
explanations.
|
|
|
3710 |
|
|
|
3711 |
RECOLL_DATADIR
|
|
|
3712 |
|
|
|
3713 |
Defines a replacement for the default location of Recoll data files,
|
|
|
3714 |
normally found in, e.g., /usr/share/recoll.
|
|
|
3715 |
|
|
|
3716 |
RECOLL_FILTERSDIR
|
|
|
3717 |
|
|
|
3718 |
Defines a replacement for the default location of Recoll filters,
|
|
|
3719 |
normally found in, e.g., /usr/share/recoll/filters.
|
|
|
3720 |
|
|
|
3721 |
ASPELL_PROG
|
|
|
3722 |
|
|
|
3723 |
aspell program to use for creating the spelling dictionary. The
|
|
|
3724 |
result has to be compatible with the libaspell which Recoll is
|
|
|
3725 |
using.
|
|
|
3726 |
|
|
|
3727 |
VARNAME
|
|
|
3728 |
|
|
|
3729 |
Blabla
|
|
|
3730 |
|
3627 |
5.4.1. The main configuration file, recoll.conf
|
3731 |
5.4.2. The main configuration file, recoll.conf
|
3628 |
|
3732 |
|
3629 |
recoll.conf is the main configuration file. It defines things like what to
|
3733 |
recoll.conf is the main configuration file. It defines things like what to
|
3630 |
index (top directories and things to ignore), and the default character
|
3734 |
index (top directories and things to ignore), and the default character
|
3631 |
set to use for document types which do not specify it internally.
|
3735 |
set to use for document types which do not specify it internally.
|
3632 |
|
3736 |
|
|
... |
|
... |
3637 |
|
3741 |
|
3638 |
Most of the following parameters can be changed from the Index
|
3742 |
Most of the following parameters can be changed from the Index
|
3639 |
Configuration menu in the recoll interface. Some can only be set by
|
3743 |
Configuration menu in the recoll interface. Some can only be set by
|
3640 |
editing the configuration file.
|
3744 |
editing the configuration file.
|
3641 |
|
3745 |
|
3642 |
5.4.1.1. Parameters affecting what documents we index:
|
3746 |
5.4.2.1. Parameters affecting what documents we index:
|
3643 |
|
3747 |
|
3644 |
topdirs
|
3748 |
topdirs
|
3645 |
|
3749 |
|
3646 |
Specifies the list of directories or files to index (recursively
|
3750 |
Specifies the list of directories or files to index (recursively
|
3647 |
for directories). You can use symbolic links as elements of this
|
3751 |
for directories). You can use symbolic links as elements of this
|
|
... |
|
... |
3671 |
hidden directories, and you probably want this indexed. One
|
3775 |
hidden directories, and you probably want this indexed. One
|
3672 |
possible solution is to have .* in skippedNames, and add things
|
3776 |
possible solution is to have .* in skippedNames, and add things
|
3673 |
like ~/.thunderbird or ~/.evolution in topdirs.
|
3777 |
like ~/.thunderbird or ~/.evolution in topdirs.
|
3674 |
|
3778 |
|
3675 |
Not even the file names are indexed for patterns in this list. See
|
3779 |
Not even the file names are indexed for patterns in this list. See
|
3676 |
the recoll_noindex variable in mimemap for an alternative approach
|
3780 |
the noContentSuffixes variable for an alternative approach which
|
3677 |
which indexes the file names.
|
3781 |
indexes the file names.
|
|
|
3782 |
|
|
|
3783 |
noContentSuffixes
|
|
|
3784 |
|
|
|
3785 |
This is a list of file name endings (not wildcard expressions, nor
|
|
|
3786 |
dot-delimited suffixes). Only the names of matching files will be
|
|
|
3787 |
indexed (no attempt at MIME type identification, no decompression,
|
|
|
3788 |
no content indexing). This can be redefined for subdirectories,
|
|
|
3789 |
and edited from the GUI. The default value is:
|
|
|
3790 |
|
|
|
3791 |
noContentSuffixes = .md5 .map \
|
|
|
3792 |
.o .lib .dll .a .sys .exe .com \
|
|
|
3793 |
.mpp .mpt .vsd \
|
|
|
3794 |
.img .img.gz .img.bz2 .img.xz .image .image.gz .image.bz2 .image.xz \
|
|
|
3795 |
.dat .bak .rdf .log.gz .log .db .msf .pid \
|
|
|
3796 |
,v ~ #
|
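Because the entries are plain name endings rather than wildcards or dot-delimited suffixes, values such as ~ or ,v match any file name that simply ends with those characters. A minimal sketch of that test, using a subset of the default list; the function name is hypothetical and the case-sensitive comparison is an assumption, not something stated above:

```python
# Subset of the default noContentSuffixes value, for illustration only.
NO_CONTENT_SUFFIXES = (".md5", ".map", ".o", ".log.gz", ".log", ",v", "~", "#")


def index_name_only(filename, endings=NO_CONTENT_SUFFIXES):
    """Return True if only the file name (not the content) would be indexed.

    The test is a plain "ends with" comparison: no wildcard expansion and
    no splitting of the name on dots.
    """
    return filename.endswith(tuple(endings))


if __name__ == "__main__":
    for name in ("notes.txt~", "main.c,v", "syslog.log.gz", "report.pdf"):
        print(name, index_name_only(name))
```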
3678 |
|
3797 |
|
3679 |
skippedPaths and daemSkippedPaths
|
3798 |
skippedPaths and daemSkippedPaths
|
3680 |
|
3799 |
|
3681 |
A space-separated list of patterns for paths of files or
|
3800 |
A space-separated list of patterns for paths of files or
|
3682 |
directories that should be skipped. There is no default in the
|
3801 |
directories that should be skipped. There is no default in the
|
|
... |
|
... |
3792 |
|
3911 |
|
3793 |
The path to the web indexing queue. This is hard-coded in the
|
3912 |
The path to the web indexing queue. This is hard-coded in the
|
3794 |
Firefox plugin as ~/.recollweb/ToIndex so there should be no need
|
3913 |
Firefox plugin as ~/.recollweb/ToIndex so there should be no need
|
3795 |
to change it.
|
3914 |
to change it.
|
3796 |
|
3915 |
|
3797 |
5.4.1.2. Parameters affecting how we generate terms:
|
3916 |
5.4.2.2. Parameters affecting how we generate terms:
|
3798 |
|
3917 |
|
3799 |
Changing some of these parameters will imply a full reindex. Also, when
|
3918 |
Changing some of these parameters will imply a full reindex. Also, when
|
3800 |
using multiple indexes, it may not make sense to search indexes that don't
|
3919 |
using multiple indexes, it may not make sense to search indexes that don't
|
3801 |
share the values for these parameters, because they usually affect both
|
3920 |
share the values for these parameters, because they usually affect both
|
3802 |
search and index operations.
|
3921 |
search and index operations.
|
|
... |
|
... |
3967 |
field2 = value for field2
|
4086 |
field2 = value for field2
|
3968 |
|
4087 |
|
3969 |
|
4088 |
|
3970 |
field1 and field2 will be set inside the document metadata.
|
4089 |
field1 and field2 will be set inside the document metadata.
|
3971 |
|
4090 |
|
3972 |
5.4.1.3. Parameters affecting where and how we store things:
|
4091 |
5.4.2.3. Parameters affecting where and how we store things:
|
3973 |
|
4092 |
|
3974 |
dbdir
|
4093 |
dbdir
|
3975 |
|
4094 |
|
3976 |
The name of the Xapian data directory. It will be created if
|
4095 |
The name of the Xapian data directory. It will be created if
|
3977 |
needed when the index is initialized. If this is not an absolute
|
4096 |
needed when the index is initialized. If this is not an absolute
|
|
... |
|
... |
4026 |
usage also depends on average document size. The default value is
|
4145 |
usage also depends on average document size. The default value is
|
4027 |
10, and it is probably a bit low. If your system usually has free
|
4146 |
10, and it is probably a bit low. If your system usually has free
|
4028 |
memory, you can try higher values between 20 and 80. In my
|
4147 |
memory, you can try higher values between 20 and 80. In my
|
4029 |
experience, values beyond 100 are always counterproductive.
|
4148 |
experience, values beyond 100 are always counterproductive.
|
4030 |
|
4149 |
|
4031 |
5.4.1.4. Parameters affecting multithread processing
|
4150 |
5.4.2.4. Parameters affecting multithread processing
|
4032 |
|
4151 |
|
4033 |
The Recoll indexing process recollindex can use multiple threads to speed
|
4152 |
The Recoll indexing process recollindex can use multiple threads to speed
|
4034 |
up indexing on multiprocessor systems. The work done to index files is
|
4153 |
up indexing on multiprocessor systems. The work done to index files is
|
4035 |
divided into several stages and some of the stages can be executed by
|
4154 |
divided into several stages and some of the stages can be executed by
|
4036 |
multiple threads. The stages are:
|
4155 |
multiple threads. The stages are:
|
|
... |
|
... |
4089 |
The following example would disable multithreading. Indexing will be
|
4208 |
The following example would disable multithreading. Indexing will be
|
4090 |
performed by a single thread.
|
4209 |
performed by a single thread.
|
4091 |
|
4210 |
|
4092 |
thrQSizes = -1 -1 -1
|
4211 |
thrQSizes = -1 -1 -1
|
4093 |
|
4212 |
|
4094 |
5.4.1.5. Miscellaneous parameters:
|
4213 |
5.4.2.5. Miscellaneous parameters:
|
4095 |
|
4214 |
|
4096 |
autodiacsens
|
4215 |
autodiacsens
|
4097 |
|
4216 |
|
4098 |
If the index is not stripped, decide if we automatically trigger
|
4217 |
If the index is not stripped, decide if we automatically trigger
|
4099 |
diacritics sensitivity if the search term has accented characters
|
4218 |
diacritics sensitivity if the search term has accented characters
|
|
... |
|
... |
4118 |
logfilename, daemlogfilename
|
4237 |
logfilename, daemlogfilename
|
4119 |
|
4238 |
|
4120 |
Where the messages should go. 'stderr' can be used as a special
|
4239 |
Where the messages should go. 'stderr' can be used as a special
|
4121 |
value, and is the default. The daem version is specific to the
|
4240 |
value, and is the default. The daem version is specific to the
|
4122 |
indexing monitor daemon.
|
4241 |
indexing monitor daemon.
|
|
|
4242 |
|
|
|
4243 |
checkneedretryindexscript
|
|
|
4244 |
|
|
|
4245 |
This defines the name for a command executed by recollindex when
|
|
|
4246 |
starting indexing. If the exit status of the command is 0,
|
|
|
4247 |
recollindex retries indexing all files which previously could not
|
|
|
4248 |
be indexed because of data extraction errors. The default value is
|
|
|
4249 |
a script which checks if any of the common bin directories have
|
|
|
4250 |
changed (indicating that a helper program may have been
|
|
|
4251 |
installed).
|
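The exit-status convention can be illustrated as follows. This is a sketch of the decision only, not the recollindex implementation; the function name is hypothetical, and the commands used in the usage lines are stand-ins for a real check script.

```python
import subprocess
import sys


def should_retry_failed_files(command):
    """Run the check command; an exit status of 0 means "retry failed files"."""
    return subprocess.run(command).returncode == 0


if __name__ == "__main__":
    # Stand-in commands: one exits with status 0, the other with status 1.
    print(should_retry_failed_files([sys.executable, "-c", "raise SystemExit(0)"]))
    print(should_retry_failed_files([sys.executable, "-c", "raise SystemExit(1)"]))
```

A custom script therefore only needs to exit with 0 when it detects a change that makes a retry worthwhile, and with a non-zero status otherwise.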
4123 |
|
4252 |
|
4124 |
mondelaypatterns
|
4253 |
mondelaypatterns
|
4125 |
|
4254 |
|
4126 |
This allows specifying wildcard path patterns (processed with
|
4255 |
This allows specifying wildcard path patterns (processed with
|
4127 |
fnmatch(3) with 0 flag), to match files which change too often and
|
4256 |
fnmatch(3) with 0 flag), to match files which change too often and
|
|
... |
|
... |
4209 |
This allows defining location-related quirks for the mailbox
|
4338 |
This allows defining location-related quirks for the mailbox
|
4210 |
handler. Currently only the tbird flag is defined, and it should
|
4339 |
handler. Currently only the tbird flag is defined, and it should
|
4211 |
be set for directories which hold Thunderbird data, as their
|
4340 |
be set for directories which hold Thunderbird data, as their
|
4212 |
folder format is weird.
|
4341 |
folder format is weird.
|
4213 |
|
4342 |
|
4214 |
5.4.2. The fields file
|
4343 |
5.4.3. The fields file
|
4215 |
|
4344 |
|
4216 |
This file contains information about dynamic fields handling in Recoll.
|
4345 |
This file contains information about dynamic fields handling in Recoll.
|
4217 |
Some very basic fields have hard-wired behaviour, and, mostly, you should
|
4346 |
Some very basic fields have hard-wired behaviour, and, mostly, you should
|
4218 |
not change the original data inside the fields file. But you can create
|
4347 |
not change the original data inside the fields file. But you can create
|
4219 |
custom fields fitting your data and handle them just like they were native
|
4348 |
custom fields fitting your data and handle them just like they were native
|
|
... |
|
... |
4280 |
[mail]
|
4409 |
[mail]
|
4281 |
# Extract the X-My-Tag mail header, and use it internally with the
|
4410 |
# Extract the X-My-Tag mail header, and use it internally with the
|
4282 |
# mailmytag field name
|
4411 |
# mailmytag field name
|
4283 |
x-my-tag = mailmytag
|
4412 |
x-my-tag = mailmytag
|
4284 |
|
4413 |
|
4285 |
5.4.2.1. Extended attributes in the fields file
|
4414 |
5.4.3.1. Extended attributes in the fields file
|
4286 |
|
4415 |
|
4287 |
Recoll versions 1.19 and later process user extended file attributes as
|
4416 |
Recoll versions 1.19 and later process user extended file attributes as
|
4288 |
document fields by default.
|
4417 |
document fields by default.
|
4289 |
|
4418 |
|
4290 |
Attributes are processed as fields of the same name, after removing the
|
4419 |
Attributes are processed as fields of the same name, after removing the
|
|
... |
|
... |
4292 |
|
4421 |
|
4293 |
The [xattrtofields] section of the fields file allows specifying
|
4422 |
The [xattrtofields] section of the fields file allows specifying
|
4294 |
translations from extended attribute names to Recoll field names. An
|
4423 |
translations from extended attribute names to Recoll field names. An
|
4295 |
empty translation disables use of the corresponding attribute data.
|
4424 |
empty translation disables use of the corresponding attribute data.
|
4296 |
|
4425 |
|
4297 |
5.4.3. The mimemap file
|
4426 |
5.4.4. The mimemap file
|
4298 |
|
4427 |
|
4299 |
mimemap specifies the file name extension to MIME type mappings.
|
4428 |
mimemap specifies the file name extension to MIME type mappings.
|
4300 |
|
4429 |
|
4301 |
For file names without an extension, or with an unknown one, the system's
|
4430 |
For file names without an extension, or with an unknown one, the system's
|
4302 |
file -i command will be executed to determine the MIME type (this can be
|
4431 |
file -i command will be executed to determine the MIME type (this can be
|
|
... |
|
... |
4305 |
The mappings can be specified on a per-subtree basis, which may be useful
|
4434 |
The mappings can be specified on a per-subtree basis, which may be useful
|
4306 |
in some cases. Example: gaim logs have a .txt extension but should be
|
4435 |
in some cases. Example: gaim logs have a .txt extension but should be
|
4307 |
handled specially, which is possible because they are usually all located
|
4436 |
handled specially, which is possible because they are usually all located
|
4308 |
in one place.
|
4437 |
in one place.
|
4309 |
|
4438 |
|
4310 |
mimemap also has a recoll_noindex variable which is a list of suffixes.
|
4439 |
The recoll_noindex mimemap variable has been moved to recoll.conf and
|
4311 |
Matching files will be skipped (which avoids unnecessary decompressions or
|
4440 |
renamed to noContentSuffixes, while keeping the same function, as of
|
4312 |
file executions). This is partially redundant with skippedNames in the
|
4441 |
Recoll version 1.21. For older Recoll versions, see the documentation for
|
4313 |
main configuration file, with a few differences: it will not affect
|
4442 |
noContentSuffixes but use recoll_noindex in mimemap.
|
4314 |
directories, it cannot be made dependent on the file-system location (it
|
|
|
4315 |
is a configuration-wide parameter), and the file names will still be
|
|
|
4316 |
indexed (not even the file names are indexed for patterns in skippedNames).
|
|
|
4317 |
recoll_noindex is used mostly for things known to be unindexable by a
|
|
|
4318 |
given Recoll version. Having it there avoids cluttering the more
|
|
|
4319 |
user-oriented and locally customized skippedNames.
|
|
|
4320 |
|
4443 |
|
4321 |
5.4.4. The mimeconf file
|
4444 |
5.4.5. The mimeconf file
|
4322 |
|
4445 |
|
4323 |
mimeconf specifies how the different MIME types are handled for indexing,
|
4446 |
mimeconf specifies how the different MIME types are handled for indexing,
|
4324 |
and which icons are displayed in the recoll result lists.
|
4447 |
and which icons are displayed in the recoll result lists.
|
4325 |
|
4448 |
|
4326 |
Changing the parameters in the [index] section is probably not a good idea
|
4449 |
Changing the parameters in the [index] section is probably not a good idea
|
|
... |
|
... |
4328 |
|
4451 |
|
4329 |
The [icons] section allows you to change the icons which are displayed by
|
4452 |
The [icons] section allows you to change the icons which are displayed by
|
4330 |
recoll in the result lists (the values are the basenames of the png images
|
4453 |
recoll in the result lists (the values are the basenames of the png images
|
4331 |
inside the iconsdir directory, specified in recoll.conf).
|
4454 |
inside the iconsdir directory, specified in recoll.conf).
|
4332 |
|
4455 |
|
4333 |
5.4.5. The mimeview file
|
4456 |
5.4.6. The mimeview file
|
4334 |
|
4457 |
|
4335 |
mimeview specifies which programs are started when you click on an Open
|
4458 |
mimeview specifies which programs are started when you click on an Open
|
4336 |
link in a result list. For example, HTML is normally displayed using firefox, but
|
4459 |
link in a result list. For example, HTML is normally displayed using firefox, but
|
4337 |
you may prefer Konqueror, your openoffice.org program might be named
|
4460 |
you may prefer Konqueror, your openoffice.org program might be named
|
4338 |
ooffice instead of openoffice, etc.
|
4461 |
ooffice instead of openoffice, etc.
|
|
... |
|
... |
4397 |
In addition to the predefined values above, all strings like %(fieldname)
|
4520 |
In addition to the predefined values above, all strings like %(fieldname)
|
4398 |
will be replaced by the value of the field named fieldname for the
|
4521 |
will be replaced by the value of the field named fieldname for the
|
4399 |
document. This could be used in combination with field customisation to
|
4522 |
document. This could be used in combination with field customisation to
|
4400 |
help with opening the document.
|
4523 |
help with opening the document.
|
4401 |
|
4524 |
|
4402 |
5.4.6. The ptrans file
|
4525 |
5.4.7. The ptrans file
|
4403 |
|
4526 |
|
4404 |
ptrans specifies query-time path translations. These can be useful in
|
4527 |
ptrans specifies query-time path translations. These can be useful in
|
4405 |
multiple cases.
|
4528 |
multiple cases.
|
4406 |
|
4529 |
|
4407 |
The file has a section for any index which needs translations, either the
|
4530 |
The file has a section for any index which needs translations, either the
|
|
... |
|
... |
4416 |
[/path/to/additional/xapiandb]
|
4539 |
[/path/to/additional/xapiandb]
|
4417 |
/server/volume1/docdir = /net/server/volume1/docdir
|
4540 |
/server/volume1/docdir = /net/server/volume1/docdir
|
4418 |
/server/volume2/docdir = /net/server/volume2/docdir
|
4541 |
/server/volume2/docdir = /net/server/volume2/docdir
|
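The effect of translation entries like the two above can be pictured as a prefix substitution on result paths. The sketch below is an assumption-laden illustration, not Recoll's code: the function name is hypothetical, and both the longest-prefix-first order and the matching on whole path components are choices made for this example.

```python
# Translations for one index section, mirroring the sample entries above.
TRANSLATIONS = {
    "/server/volume1/docdir": "/net/server/volume1/docdir",
    "/server/volume2/docdir": "/net/server/volume2/docdir",
}


def translate_path(path, translations=TRANSLATIONS):
    """Replace a leading path prefix according to the translation table.

    Assumptions: prefixes carry no trailing slash, the longest prefix is
    tried first, and a prefix only matches on a path component boundary.
    """
    for old in sorted(translations, key=len, reverse=True):
        if path == old or path.startswith(old + "/"):
            return translations[old] + path[len(old):]
    return path


if __name__ == "__main__":
    print(translate_path("/server/volume1/docdir/report.pdf"))
    print(translate_path("/home/me/notes.txt"))
```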
4419 |
|
4542 |
|
4420 |
|
4543 |
|
4421 |
5.4.7. Examples of configuration adjustments
|
4544 |
5.4.8. Examples of configuration adjustments
|
4422 |
|
4545 |
|
4423 |
5.4.7.1. Adding an external viewer for a non-indexed type
|
4546 |
5.4.8.1. Adding an external viewer for a non-indexed type
|
4424 |
|
4547 |
|
4425 |
Imagine that you have some kind of file which does not have indexable
|
4548 |
Imagine that you have some kind of file which does not have indexable
|
4426 |
content, but for which you would like to have a functional Open link in
|
4549 |
content, but for which you would like to have a functional Open link in
|
4427 |
the result list (when found by file name). The file names end in .blob and
|
4550 |
the result list (when found by file name). The file names end in .blob and
|
4428 |
can be displayed by application blobviewer.
|
4551 |
can be displayed by application blobviewer.
|
|
... |
|
... |
4448 |
MIME type which it already knows, you would just need to edit mimeview.
|
4571 |
MIME type which it already knows, you would just need to edit mimeview.
|
4449 |
The entries you add in your personal file override those in the central
|
4572 |
The entries you add in your personal file override those in the central
|
4450 |
configuration, which you do not need to alter. mimeview can also be
|
4573 |
configuration, which you do not need to alter. mimeview can also be
|
4451 |
modified from the GUI.
|
4574 |
modified from the GUI.
|
4452 |
|
4575 |
|
4453 |
5.4.7.2. Adding indexing support for a new file type
|
4576 |
5.4.8.2. Adding indexing support for a new file type
|
4454 |
|
4577 |
|
4455 |
Let us now imagine that the above .blob files actually contain indexable
|
4578 |
Let us now imagine that the above .blob files actually contain indexable
|
4456 |
text and that you know how to extract it with a command line program.
|
4579 |
text and that you know how to extract it with a command line program.
|
4457 |
Getting Recoll to index the files is easy. You need to perform the above
|
4580 |
Getting Recoll to index the files is easy. You need to perform the above
|
4458 |
alteration, and also to add data to the mimeconf file (typically in
|
4581 |
alteration, and also to add data to the mimeconf file (typically in
|