--- a/src/INSTALL
+++ b/src/INSTALL
@@ -2,1076 +2,3 @@
 More documentation can be found in the doc/ directory or at http://www.recoll.org
 
 
-                               Recoll user manual
-
-                   Chapter 5. Installation and configuration
-
-   Table of Contents
-
-   5.1. Installing a binary copy
-
-   5.2. Supporting packages
-
-   5.3. Building from source
-
-   5.4. Configuration overview
-
-                         5.1. Installing a binary copy
-
-   There are three types of binary Recoll installations:
-
-     * Through your system's normal software distribution framework
-       (Debian/Ubuntu apt, FreeBSD ports, etc.).
-
-     * From a package downloaded from the Recoll web site.
-
-     * From a prebuilt tree downloaded from the Recoll web site.
-
-   In all cases, the strict software dependencies (e.g. on Xapian or iconv)
-   will be satisfied automatically; you should not have to worry about them.
-
-   You will only have to check or install supporting applications for the
-   file types that you want to index beyond those that are natively processed
-   by Recoll (text, HTML, email files, and a few others).
-
-   You may also want to look at the configuration section (though this may
-   not be necessary for a quick test with default parameters). Most
-   parameters can be set more conveniently from the GUI.
-
-5.1.1. Installing through a package system
-
-   If you use a BSD-type port system or a prebuilt package (DEB, RPM,
-   manually or through the system software configuration utility), just
-   follow the usual procedure for your system.
-
-5.1.2. Installing a prebuilt Recoll
-
-   The unpackaged binary versions on the Recoll web site are just compressed
-   tar files of a build tree, where only the useful parts were kept
-   (executables and sample configuration).
-
-   The executable binary files are built with a static link to libxapian and
-   libiconv, to make installation easier (no dependencies).
-
-   After extracting the tar file, you can proceed with installation as if you
-   had built the package from source (that is, just type make install). The
-   binary trees are built for installation to /usr/local.
-
-                            5.2. Supporting packages
-
-   Recoll uses external applications to index some file types. You need to
-   install them for the file types that you wish to have indexed. These are
-   optional run-time dependencies: none is needed for building or running
-   Recoll, except for indexing its specific file type.
-
-   After an indexing pass, the commands that were found missing can be
-   displayed from the recoll File menu. The list is stored in the missing
-   text file inside the configuration directory.
-
-   A list of common file types which need external commands follows. Many of
-   the filters need the iconv command, which is not always listed as a
-   dependency.
-
-   Please note that, due to the relatively dynamic nature of this
-   information, the most up to date version is now kept on the Recoll helper
-   applications page along with links to the home pages or best
-   source/patches pages, and misc tips. The list below is not updated often
-   and may be quite stale.
-
-   For many Linux distributions, most of the commands listed can be installed
-   from the package repositories. However, the packages are sometimes
-   outdated, or not the best version for Recoll, so you should take a look at
-   the Recoll helper applications page if a file type is important to you.
-
-   As of Recoll release 1.14, a number of XML-based formats that were handled
-   by ad hoc filter code now use the xsltproc command, which usually comes
-   with libxslt. These are: abiword, fb2 (ebooks), kword, openoffice, svg.
-
-   Now for the list:
-
-     * Openoffice files need unzip and xsltproc.
-
-     * PDF files need pdftotext which is part of the Xpdf or Poppler
-       packages.
-
-     * Postscript files need pstotext. The original version has an issue with
-       shell characters in file names, which is corrected in recent packages.
-       See the Recoll helper applications page for more detail.
-
-     * MS Word needs antiword. It is also useful to have wvWare installed as
-       it may be used as a fallback for some files which antiword does not
-       handle.
-
-     * MS Excel and PowerPoint need catdoc.
-
-     * MS Open XML (docx) needs xsltproc.
-
-     * Wordperfect files need wpd2html from the libwpd (or libwpd-tools on
-       Ubuntu) package.
-
-     * RTF files need unrtf, which, in its standard version, has much trouble
-       with non-western character sets. Check the Recoll helper applications
-       page.
-
-     * TeX files need untex or detex. Check the Recoll helper applications
-       page for sources if it's not packaged for your distribution.
-
-     * dvi files need dvips.
-
-     * djvu files need djvutxt and djvused from the DjVuLibre package.
-
-     * Audio files: Recoll releases before 1.13 used the id3info command from
-       the id3lib package to extract mp3 tag information, metaflac (standard
-       flac tools) for flac files, and ogginfo (vorbis tools) for ogg files.
-       Releases 1.14 and later use a single Python filter based on mutagen
-       for all audio file types.
-
-     * Pictures: Recoll uses the Exiftool Perl package to extract tag
-       information. Most image file formats are supported. Note that there
-       may not be much interest in indexing the technical tags (image size,
-       aperture, etc.). This is only of interest if you store personal tags
-       or textual descriptions inside the image files.
-
-     * chm: files in Microsoft help format need Python and the pychm module
-       (which needs chmlib).
-
-     * ICS: up to Recoll 1.13, iCalendar files need Python and the icalendar
-       module. icalendar is not needed for newer versions, which use internal
-       code.
-
-     * Zip archives need Python (and the standard zipfile module).
-
-     * Rar archives need Python, the rarfile Python module and the unrar
-       utility.
-
-     * Midi karaoke files need Python and the Midi module.
-
-     * Konqueror webarchive files need Python (uses the tarfile module).
-
-     * mimehtml web archive format (support based on the email filter, which
-       introduces some mild weirdness, but still usable).
-
-   Text, HTML, email folders, and Scribus files are processed internally. Lyx
-   is used to index Lyx files. Many filters need iconv and the standard sed
-   and awk.
-
-                           5.3. Building from source
-
-5.3.1. Prerequisites
-
-   C++ compiler. Up to Recoll version 1.13.04, its absence can manifest
-   itself by strange messages about a missing iconv_open.
-
-   Development files for Xapian core.
-
-     Important: If you are building Xapian for an older CPU (before Pentium 4
-     or Athlon 64), you need to add the --disable-sse flag to the configure
-     command. Otherwise, all Xapian applications will crash with an illegal
-     instruction error.
-
-   Development files for Qt.
-
-   Development files for X11 and zlib.
-
-   Check the Recoll download page for up to date version information.
-
-   You will most probably be able to find a binary package for Qt for your
-   system. You may have to compile Xapian but this is not difficult (if you
-   are using FreeBSD, there is a port).
-
-   You may also need libiconv. Recoll currently uses version 1.9 (this should
-   not be critical). On Linux systems, the iconv interface is part of libc
-   and you should not need to do anything special.
-
-5.3.2. Building
-
-   Recoll has been built on Linux, FreeBSD, Mac OS X, and Solaris. Most
-   versions after 2005 should be ok, and maybe some older ones too (Solaris 8
-   is ok). If you build on another system and need to modify things, I would
-   very much welcome patches.
-
-   Depending on the Qt 3 configuration on your system, you may have to set
-   the QTDIR and QMAKESPECS variables in your environment:
-
-     * QTDIR should point to the directory above the one that holds the qt
-       include files (e.g. if qt.h is /usr/local/qt/include/qt.h, QTDIR
-       should be /usr/local/qt).
-
-     * QMAKESPECS should be set to the name of one of the Qt mkspecs
-       sub-directories (e.g. linux-g++).
-
-   On many Linux systems, QTDIR is set by the login scripts, and QMAKESPECS
-   is not needed because there is a default link in mkspecs/.
-
-   Neither QTDIR nor QMAKESPECS should be needed with Qt 4: configuration
-   details are entirely determined by qmake (which is quite often installed
-   as qmake-qt4).
-
-   Configure options:
-
-     * --without-aspell will disable the code for phonetic matching of search
-       terms.
-
-     * --with-fam or --with-inotify will enable the code for real time
-       indexing. Inotify support is enabled by default on recent Linux
-       systems.
-
-     * --disable-webkit is available from version 1.17 to implement the
-       result list with a Qt QTextBrowser instead of a WebKit widget if you
-       do not or can't depend on the latter.
-
-     * --enable-xattr will enable code to fetch data from file extended
-       attributes. This is only useful if some application stores data in
-       there, and also needs some simple configuration (see comments in the
-       fields configuration file).
-
-     * --enable-camelcase will enable splitting camelCase words. This is not
-       enabled by default as it has the unfortunate side-effect of making
-       some phrase searches quite confusing: e.g. "MySQL manual" would be
-       matched by "MySQL manual" and "my sql manual" but not "mysql manual"
-       (only inside phrase searches).
-
-     * --with-file-command Specify the version of the 'file' command to use
-       (e.g. --with-file-command=/usr/local/bin/file). Can be useful to
-       enable the GNU version on systems where the native one is bad.
-
-     * --disable-qtgui Disable the Qt interface. Will allow building the
-       indexer and the command line search program in absence of a Qt
-       environment.
-
-     * --disable-x11mon Disable X11 connection monitoring inside recollindex.
-       Together with --disable-qtgui, this allows building recoll without Qt
-       and X11.
-
-     * Of course, the usual autoconf configure options, like --prefix, apply.
-
-   Normal procedure:
-
-         cd recoll-xxx
-         ./configure
-         make
-         (practices usual hardship-repelling invocations)
-      
-
-   There is little auto-configuration. The configure script will mainly link
-   one of the system-specific files in the mk directory to mk/sysconf. If
-   your system is not known yet, it will tell you as much, and you may want
-   to manually copy and modify one of the existing files (the new file name
-   should be the output of uname -s).
-
-5.3.3. Installation
-
-   Either type make install or execute recollinstall prefix, in the root of
-   the source tree. This will copy the commands to prefix/bin and the sample
-   configuration files, scripts and other shared data to prefix/share/recoll.
-
-   If the installation prefix given to recollinstall is different from either
-   the system default or the value which was specified when executing
-   configure (as in configure --prefix /some/path), you will have to set the
-   RECOLL_DATADIR environment variable to indicate where the shared data is
-   to be found (e.g. for (ba)sh: export
-   RECOLL_DATADIR=/some/path/share/recoll).
-
-   You can then proceed to configuration.
-
-                          5.4. Configuration overview
-
-   Most of the parameters specific to the recoll GUI are set through the
-   Preferences menu and stored in the standard Qt place
-   ($HOME/.config/Recoll.org/recoll.conf). You probably do not want to edit
-   this by hand.
-
-   Recoll indexing options are set inside text configuration files located in
-   a configuration directory. There can be several such directories, each of
-   which defines the parameters for one index.
-
-   The configuration files can be edited by hand or through the Index
-   configuration dialog (Preferences menu). The GUI tool will try to respect
-   your formatting and comments as much as possible, so it is quite possible
-   to use both ways.
-
-   The most accurate documentation for the configuration parameters is given
-   by comments inside the default files, and we will just give a general
-   overview here.
-
-   For each index, there are two sets of configuration files. System-wide
-   configuration files are kept in a directory named like
-   /usr/[local/]share/recoll/examples, and define default values, shared by
-   all indexes. For each index, a parallel set of files defines the
-   customized parameters.
-
-   The default location of the configuration is the .recoll directory in your
-   home. Most people will only use this directory.
-
-   This location can be changed, or others can be added with the
-   RECOLL_CONFDIR environment variable or the -c option parameter to recoll
-   and recollindex.
-
-   If the .recoll directory does not exist when recoll or recollindex is
-   started, it will be created with a set of empty configuration files.
-   recoll will give you a chance to edit the configuration file before
-   starting indexing. recollindex will proceed immediately. To avoid
-   mistakes, the automatic directory creation will only occur for the default
-   location, not if -c or RECOLL_CONFDIR were used (in the latter cases, you
-   will have to create the directory).
-
-   All configuration files share the same format. For example, a short
-   extract of the main configuration file might look as follows:
-
-         # Space-separated list of directories to index.
-         topdirs =  ~/docs /usr/share/doc
-
-         [~/somedirectory-with-utf8-txt-files]
-         defaultcharset = utf-8
-        
-
-   There are three kinds of lines:
-
-     * Comment (starts with #) or empty.
-
-     * Parameter assignment (name = value).
-
-     * Section definition ([somedirname]).
-
-   Depending on the type of configuration file, section definitions either
-   separate groups of parameters or allow redefining some parameters for a
-   directory sub-tree. They stay in effect until another section definition,
-   or the end of file, is encountered. Some of the parameters used for
-   indexing are looked up hierarchically from the current directory location
-   upwards. Not all parameters can be meaningfully redefined; this is
-   specified for each in the next section.
-
-   When found at the beginning of a file path, the tilde character (~) is
-   expanded to the name of the user's home directory, as a shell would do.
-
-   White space is used for separation inside lists. List elements with
-   embedded spaces can be quoted using double-quotes.
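-
-   As an illustrative sketch (both directory names are hypothetical), a list
-   value mixing plain and quoted elements might look like this:
-
-         # The second element contains a space, so it is double-quoted.
-         topdirs = ~/docs "/home/me/My Documents"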
-
-   Encoding issues. Most of the configuration parameters are plain ASCII. Two
-   particular sets of values may cause encoding issues:
-
-     * File path parameters may contain non-ASCII characters and should use
-       the exact same byte values as found in the file system directory.
-       Usually, this means that the configuration file should use the system
-       default locale encoding.
-
-     * The unac_except_trans parameter should be encoded in UTF-8. If your
-       system locale is not UTF-8, and you also need to specify non-ASCII
-       file paths, this poses a difficulty because common text editors cannot
-       handle multiple encodings in a single file. In this relatively
-       unlikely case, you can edit the configuration file as two separate
-       text files with appropriate encodings, and concatenate them to create
-       the complete configuration.
-
-5.4.1. Main configuration file
-
-   recoll.conf is the main configuration file. It defines things like what to
-   index (top directories and things to ignore), and the default character
-   set to use for document types which do not specify it internally.
-
-   The default configuration will index your home directory. If this is not
-   appropriate, start recoll to create a blank configuration, click Cancel,
-   and edit the configuration file before restarting the command. This will
-   start the initial indexing, which may take some time.
-
-   Most of the following parameters can be changed from the Index
-   Configuration menu in the recoll interface. Some can only be set by
-   editing the configuration file.
-
-  5.4.1.1. Parameters affecting what documents we index:
-
-   topdirs
-
-           Specifies the list of directories or files to index (recursively
-           for directories). You can use symbolic links as elements of this
-           list. See the followLinks option about following symbolic links
-           found under the top elements (not followed by default).
-
-   skippedNames
-
-           A space-separated list of patterns for names of files or
-           directories that should be completely ignored. The list defined in
-           the default file is:
-
- skippedNames = #* bin CVS  Cache cache* caughtspam  tmp .thumbnails .svn \
-                *~ .beagle .git .hg .bzr loop.ps .xsession-errors \
-                .recoll* xapiandb recollrc recoll.conf
-
-           The list can be redefined at any sub-directory in the indexed
-           area.
-
-           The top-level directories are not affected by this list (that is,
-           a directory in topdirs might match and would still be indexed).
-
-           The list in the default configuration does not exclude hidden
-           directories (names beginning with a dot), which means that it may
-           index quite a few things that you do not want. On the other hand,
-           email user agents like thunderbird usually store messages in
-           hidden directories, and you probably want this indexed. One
-           possible solution is to have .* in skippedNames, and add things
-           like ~/.thunderbird or ~/.evolution in topdirs.
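-
-           A minimal sketch of that approach, assuming that topdirs
-           otherwise lists only your home directory, that the default
-           patterns are kept, and that both mail directories exist on your
-           system:
-
- topdirs = ~ ~/.thunderbird ~/.evolution
- skippedNames = #* bin CVS  Cache cache* caughtspam  tmp .thumbnails .svn \
-                *~ .beagle .git .hg .bzr loop.ps .xsession-errors \
-                .recoll* xapiandb recollrc recoll.conf .*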
-
-           Not even the file names are indexed for patterns in this list. See
-           the recoll_noindex variable in mimemap for an alternative approach
-           which indexes the file names.
-
-   skippedPaths and daemSkippedPaths
-
-           A space-separated list of patterns for paths of files or
-           directories that should be skipped. There is no default in the
-           sample configuration file, but the code always adds the
-           configuration and database directories in there.
-
-           skippedPaths is used both by batch and real time indexing.
-           daemSkippedPaths can be used to specify things that should be
-           indexed at startup, but not monitored.
-
-           Example of use for skipping text files only in a specific
-           directory:
-
- skippedPaths = ~/somedir/*.txt
-              
-
-   skippedPathsFnmPathname
-
-           The values in the *skippedPaths variables are matched by default
-           with fnmatch(3), with the FNM_PATHNAME and FNM_LEADING_DIR flags.
-           This means that '/' characters must be matched explicitly. You
-           can set skippedPathsFnmPathname to 0 to disable the use of
-           FNM_PATHNAME (meaning that /*/dir3 will match /dir1/dir2/dir3).
-
-   followLinks
-
-           Specifies if the indexer should follow symbolic links while
-           walking the file tree. The default is to ignore symbolic links to
-           avoid multiple indexing of linked files. No effort is made to
-           avoid duplication when this option is set to true. This option can
-           be set individually for each of the topdirs members by using
-           sections. It cannot be changed below the topdirs level.
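-
-           As a sketch, assuming two hypothetical topdirs members and that
-           boolean parameters take 0/1 values as indexStripChars does, links
-           could be followed under one of them only:
-
- topdirs = ~/docs ~/shared
-
- [~/shared]
- followLinks = 1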
-
-   indexedmimetypes
-
-           Recoll normally indexes any file which it knows how to read. This
-           list lets you restrict the indexed mime types to what you specify.
-           If the variable is unspecified or the list empty (the default),
-           all supported types are processed.
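-
-           For instance, assuming you only care about PDF and HTML
-           documents, and that the list is space-separated like the other
-           list parameters, a restricted list could be written as:
-
- indexedmimetypes = application/pdf text/html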
-
-   compressedfilemaxkbs
-
-           Size limit for compressed (.gz or .bz2) files. These need to be
-           decompressed in a temporary directory for identification, which
-           can be very wasteful if 'uninteresting' big compressed files are
-           present. Negative means no limit, 0 means no processing of any
-           compressed file. Defaults to -1.
-
-   textfilemaxmbs
-
-           Maximum size for text files. Very big text files are often
-           uninteresting logs. Set to -1 to disable (default 20MB).
-
-   textfilepagekbs
-
-           If set to other than -1, text files will be indexed as multiple
-           documents of the given page size. This may be useful if you do
-           want to index very big text files as it will both reduce memory
-           usage at index time and help with loading data to the preview
-           window. A size of a few megabytes would seem reasonable (default:
-           1MB).
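-
-           The three size parameters above can be combined. The values
-           below are arbitrary illustrations, not recommendations, and the
-           units are assumed from the parameter names (kbs, mbs):
-
- # Do not process compressed files bigger than about 50 MB.
- compressedfilemaxkbs = 50000
- # Ignore text files bigger than 100 MB.
- textfilemaxmbs = 100
- # Index big text files as separate documents of about 2 MB each.
- textfilepagekbs = 2000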
-
-   membermaxkbs
-
-           This defines the maximum size in kilobytes for an archive member
-           (zip, tar or rar at the moment). Bigger entries will be skipped.
-
-   indexallfilenames
-
-           Recoll indexes file names in a special section of the database to
-           allow specific file names searches using wild cards. This
-           parameter decides if file name indexing is performed only for
-           files with mime types that would qualify them for full text
-           indexing, or for all files inside the selected subtrees,
-           independently of mime type.
-
-   usesystemfilecommand
-
-           Decide if we use the file -i system command as a final step for
-           determining the mime type for a file (the main procedure uses
-           suffix associations as defined in the mimemap file). This can be
-           useful for files with suffix-less names, but it will also cause
-           the indexing of many bogus "text" files.
-
-   processbeaglequeue
-
-           If this is set, process the directory where Beagle Web browser
-           plugins copy visited pages for indexing. Of course, Beagle MUST
-           NOT be running, else things will behave strangely.
-
-   beaglequeuedir
-
-           The path to the Beagle indexing queue. This is hard-coded in the
-           Beagle plugin as ~/.beagle/ToIndex so there should be no need to
-           change it.
-
-  5.4.1.2. Parameters affecting how we generate terms:
-
-   Changing some of these parameters will imply a full reindex. Also, when
-   using multiple indexes, it may not make sense to search indexes that don't
-   share the values for these parameters, because they usually affect both
-   search and index operations.
-
-   indexStripChars
-
-           Decide if we strip characters of diacritics and convert them to
-           lower-case before terms are indexed. If we don't, searches
-           sensitive to case and diacritics can be performed, but the index
-           will be bigger, and some marginal weirdness may sometimes occur.
-           The default is a stripped index (indexStripChars = 1) for now.
-           When using multiple indexes for a search, this parameter must be
-           defined identically for all. Changing the value implies an index
-           reset.
-
-   maxTermExpand
-
-           Maximum expansion count for a single term (e.g.: when using
-           wildcards). The default of 10000 is reasonable and will avoid
-           queries that appear frozen while the engine is walking the term
-           list.
-
-   maxXapianClauses
-
-           Maximum number of elementary clauses we can add to a single Xapian
-           query. In some cases, the result of term expansion can be
-           multiplicative, and we want to avoid using excessive memory. The
-           default of 100 000 should be both high enough in most cases and
-           compatible with current typical hardware configurations.
-
-   nonumbers
-
-           If this is set to true, no terms will be generated for numbers.
-           For example, "123", "1.5e6" and "192.168.1.4" would not be indexed
-           ("value123" would still be). Numbers are often quite interesting
-           to search for, and this should probably not be set except for
-           special situations, e.g. scientific documents with huge amounts
-           of numbers in them. This can only be set for a whole index, not
-           for a subtree.
-
-   nocjk
-
-           If this is set to true, specific East Asian (Chinese, Japanese,
-           Korean) character/word splitting is turned off. This will save a
-           small amount of cpu if you have no CJK documents. If your
-           document base does include such text but you are not interested
-           in searching it, setting nocjk may be a significant time and
-           space saver.
-
-   cjkngramlen
-
-           This lets you adjust the size of n-grams used for indexing CJK
-           text. The default value of 2 is probably appropriate in most
-           cases. A value of 3 would allow more precision and efficiency on
-           longer words, but the index will be approximately twice as large.
-
-   indexstemminglanguages
-
-           A list of languages for which the stem expansion databases will be
-           built. See recollindex(1) or use the recollindex -l command for
-           possible values. You can add a stem expansion database for a
-           different language by using recollindex -s, but it will be deleted
-           during the next indexing. Only languages listed in the
-           configuration file are permanent.
-
-   defaultcharset
-
-           The name of the character set used for files that do not contain a
-           character set definition (ie: plain text files). This can be
-           redefined for any sub-directory. If it is not set at all, the
-           character set used is the one defined by the nls environment (
-           LC_ALL, LC_CTYPE, LANG), or iso8859-1 if nothing is set.
-
-   unac_except_trans
-
-           This is a list of characters, encoded in UTF-8, which should be
-           handled specially when converting text to unaccented lowercase.
-           For example, in Swedish, the letter a with diaeresis has full
-           alphabet citizenship and should not be turned into an a. Each
-           element in the space-separated list has the special character as
-           first element and the translation following. The handling of both
-           the lowercase and upper-case versions of a character should be
-           specified, as membership in the list turns off both standard
-           accent and case processing. Example for Swedish:
-
- unac_except_trans = åå Åå ää Ää öö Öö
-            
-
-           Note that the translation is not limited to a single character:
-           you could very well have something like üue in the list.
-
-           The default value set for unac_except_trans can't be listed here
-           because I have trouble with SGML and UTF-8, but it only contains
-           ligature decompositions: german ss, oe, ae, fi, fl.
-
-           This parameter can't be defined for subdirectories. It is global,
-           because there is no way to do otherwise when querying. If you have
-           document sets which would need different values, you will have to
-           index and query them separately.
-
-   maildefcharset
-
-           This can be used to define the default character set specifically
-           for email messages which don't specify it. This is mainly useful
-           for readpst (libpst) dumps, which are utf-8 but do not say so.
-
-   localfields
-
-           This allows setting fields for all documents under a given
-           directory. Typical usage would be to set an "rclaptg" field, to be
-           used in mimeview to select a specific viewer. If several fields
-           are to be set, they should be separated with a colon (':')
-           character (which there is currently no way to escape). For
-           example: localfields = rclaptg=gnus:other = val, then select the
-           specific viewer with mimetype|tag=... in mimeview.
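-
-           A sketch of the indexing side only, using a hypothetical
-           directory name as the section:
-
- [~/mail/gnus]
- localfields = rclaptg=gnus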
-
-  5.4.1.3. Parameters affecting where and how we store things:
-
-   dbdir
-
-           The name of the Xapian data directory. It will be created if
-           needed when the index is initialized. If this is not an absolute
-           path, it will be interpreted relative to the configuration
-           directory. The value can have embedded spaces but starting or
-           trailing spaces will be trimmed. You cannot use quotes here.
-
-   idxstatusfile
-
-           The name of the scratch file where the indexer process updates its
-           status. Default: idxstatus.txt inside the configuration directory.
-
-   maxfsoccuppc
-
-           Maximum file system occupation before we stop indexing. The value
-           is a percentage, corresponding to what the "Capacity" df output
-           column shows. The default value is 0, meaning no checking.
-
-   mboxcachedir
-
-           The directory where mbox message offsets cache files are held.
-           This is normally $RECOLL_CONFDIR/mboxcache, but it may be useful
-           to share a directory between different configurations.
-
-   mboxcacheminmbs
-
-           The minimum mbox file size over which we cache the offsets. There
-           is really no sense in caching offsets for small files. The default
-           is 5 MB.
-
-   webcachedir
-
-           This is only used by the Beagle web browser plugin indexing code,
-           and defines where the cache for visited pages will live. Default:
-           $RECOLL_CONFDIR/webcache
-
-   webcachemaxmbs
-
-           This is only used by the Beagle web browser plugin indexing code,
-           and defines the maximum size for the web page cache. Default: 40
-           MB.
-
-   idxflushmb
-
-           Threshold (megabytes of new text data) at which we flush from
-           memory to the disk index. Setting this can help control memory
-           usage. A value of 0 means no explicit flushing: Xapian then uses
-           its own default, which is to flush every 10000 (or
-           XAPIAN_FLUSH_THRESHOLD) documents. This gives little control over
-           memory usage, which depends on the average document size. The
-           default value is 10.
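-
-   As a rough illustration, the storage parameters above could be combined
-   in a personal recoll.conf as follows. The values are only examples (an
-   assumed 90% disk usage limit, and a dbdir name taken relative to the
-   configuration directory), not recommendations:
-
- dbdir = xapiandb
- maxfsoccuppc = 90
- idxflushmb = 10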
-
-  5.4.1.4. Miscellaneous parameters:
-
-   autodiacsens
-
-           If the index is not stripped, decide if we automatically trigger
-           diacritics sensitivity if the search term has accented characters
-           (not in unac_except_trans). Otherwise you need to use the query
-           language and the D modifier to specify diacritics sensitivity.
-           Default is no.
-
-   autocasesens
-
-           If the index is not stripped, decide if we automatically trigger
-           character case sensitivity if the search term has upper-case
-           characters in any but the first position. Otherwise you need to
-           use the query language and the C modifier to specify
-           character-case sensitivity. Default is yes.
-
-   loglevel, daemloglevel
-
-           Verbosity level for recoll and recollindex. A value of 4 lists
-           quite a lot of debug/information messages. 2 only lists errors.
-           The daem* version is specific to the indexing monitor daemon.
-
-   logfilename, daemlogfilename
-
-           Where the messages should go. 'stderr' can be used as a special
-           value, and is the default. The daem* version is specific to the
-           indexing monitor daemon.
-
-   mondelaypatterns
-
-           This allows specifying wildcard path patterns (processed with
-           fnmatch(3) with 0 flag), to match files which change too often and
-           for which a delay should be observed before re-indexing. This is a
-           space-separated list, each entry being a pattern and a time in
-           seconds, separated by a colon. You can use double quotes if a path
-           entry contains white space. Example:
-
- mondelaypatterns = *.log:20 "this one has spaces*:10"
-              
-
-   monixinterval
-
-           Minimum interval (seconds) for processing the indexing queue. The
-           real time monitor does not process each event when it comes in,
-           but will wait this time for the queue to accumulate to diminish
-           overhead and in order to aggregate multiple events to the same
-           file. Default: 30 seconds.
-
-   monauxinterval
-
-           Period (in seconds) at which the real time monitor will regenerate
-           the auxiliary databases (spelling, stemming) if needed. The
-           default is one hour.
-
-   monioniceclass, monioniceclassdata
-
-           These allow defining the ionice class and data used by the indexer
-           (default class 3, no data).
-
-   filtermaxseconds
-
-           Maximum filter execution time, after which it is aborted. Some
-           postscript programs just loop...
-
-   filtersdir
-
-           A directory to search for the external filter scripts used to
-           index some types of files. The value should not be changed, except
-           if you want to modify one of the default scripts. The value can be
-           redefined for any sub-directory.
-
-   iconsdir
-
-           The name of the directory where recoll result list icons are
-           stored. You can change this if you want different images.
-
-   idxabsmlen
-
-           Recoll stores an abstract for each indexed file inside the
-           database. The text can come from an actual 'abstract' section in
-           the document or will just be the beginning of the document. It is
-           stored in the index so that it can be displayed inside the result
-           lists without decoding the original file. The idxabsmlen parameter
-           defines the size of the stored abstract. The default value is 250
-           bytes. The search interface gives you the choice to display this
-           stored text or a synthetic abstract built by extracting text
-           around the search terms. If you always prefer the synthetic
-           abstract, you can reduce this value and save a little space.
-
-   aspellLanguage
-
-           Language definitions to use when creating the aspell dictionary.
-           The value must match a set of aspell language definition files.
-           You can type "aspell config" to see where these are installed
-           (look for data-dir). The default if the variable is not set is to
-           use your desktop national language environment to guess the value.
-
-   noaspell
-
-           If this is set, the aspell dictionary generation is turned off.
-           Useful for cases where you don't need the functionality or when it
-           is unusable because aspell crashes during dictionary generation.
-
-   mhmboxquirks
-
-           This allows defining location-related quirks for the mailbox
-           handler. Currently only the tbird flag is defined, and it should
-           be set for directories which hold Thunderbird data, as their
-           folder format is nonstandard.
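-
-   For illustration only, a few of the parameters above as they might appear
-   in recoll.conf. The log file path and the 60 second filter timeout are
-   made-up values, and the levels simply follow the descriptions above (2
-   for errors only, 4 for verbose output):
-
- loglevel = 2
- logfilename = stderr
- daemloglevel = 4
- daemlogfilename = /tmp/recollindex.log
- filtermaxseconds = 60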
-
-5.4.2. The fields file
-
-   This file contains information about dynamic fields handling in Recoll.
-   Some very basic fields have hard-wired behaviour, and, mostly, you should
-   not change the original data inside the fields file. But you can create
-   custom fields fitting your data and handle them just as if they were
-   native ones.
-
-   The fields file has several sections, which each define an aspect of
-   fields processing. Quite often, you'll have to modify several sections to
-   obtain the desired behaviour.
-
-   We will only give a short description here; you should refer to the
-   comments inside the file for more detailed information.
-
-   Field names should be lowercase alphabetic ASCII.
-
-   [prefixes]
-
-           A field becomes indexed (searchable) by having a prefix defined in
-           this section.
-
-   [stored]
-
-           A field becomes stored (displayable inside results) by having its
-           name listed in this section (typically with an empty value).
-
-   [aliases]
-
-           This section defines lists of synonyms for the canonical names
-           used inside the [prefixes] and [stored] sections.
-
-   filter-specific sections
-
-           Some filters may need specific configuration for handling fields.
-           Only the email message filter currently has such a section (named
-           [mail]). It allows indexing arbitrary email headers in addition to
-           the ones indexed by default. Other such sections may appear in the
-           future.
-
-   Here follows a small example of a personal fields file. This would extract
-   a specific email header and use it as a searchable field, with data
-   displayable inside result lists. (Side note: as the email filter does no
-   decoding on the values, only plain ASCII headers can be indexed, and only
-   the first occurrence will be used for headers that occur several times.)
-
- [prefixes]
- # Index mailmytag contents (with the given prefix)
- mailmytag = XMTAG
-
- [stored]
- # Store mailmytag inside the document data record (so that it can be
- # displayed - as %(mailmytag) - in result lists).
- mailmytag =
-
- [mail]
- # Extract the X-My-Tag mail header, and use it internally with the
- # mailmytag field name
- x-my-tag = mailmytag
-
-5.4.3. The mimemap file
-
-   mimemap specifies the file name extension to mime type mappings.
-
-   For file names without an extension, or with an unknown one, the system's
-   file -i command will be executed to determine the mime type (this can be
-   switched off inside the main configuration file).
-
-   The mappings can be specified on a per-subtree basis, which may be useful
-   in some cases. Example: gaim logs have a .txt extension but should be
-   handled specially, which is possible because they are usually all located
-   in one place.
-
-   mimemap also has a recoll_noindex variable which is a list of suffixes.
-   Matching files will be skipped (which avoids unnecessary decompressions or
-   file executions). This is partially redundant with skippedNames in the
-   main configuration file, with a few differences: it will not affect
-   directories, it cannot be made dependent on the file-system location (it
-   is a configuration-wide parameter), and the file names will still be
-   indexed (not even the file names are indexed for patterns in
-   skippedNames). recoll_noindex is used mostly for things known to be
-   unindexable by a given Recoll version. Having it there avoids cluttering
-   the more user-oriented and locally customized skippedNames.
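-
-   A sketch of what such a mimemap could look like. The directory, the
-   text/x-gaim-log type name and the suffix list are assumptions chosen for
-   illustration, and the per-subtree section syntax is assumed to be the
-   same as in the main configuration file:
-
- # Files which we do not want to open at all (names are still indexed)
- recoll_noindex = .tar.gz .tgz
-
- # gaim logs are plain .txt files, but get their own type in this subtree
- [~/.gaim/logs]
- .txt = text/x-gaim-log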
-
-5.4.4. The mimeconf file
-
-   mimeconf specifies how the different mime types are handled for indexing,
-   and which icons are displayed in the recoll result lists.
-
-   Changing the parameters in the [index] section is probably not a good idea
-   except if you are a Recoll developer.
-
-   The [icons] section allows you to change the icons which are displayed by
-   recoll in the result lists (the values are the basenames of the PNG images
-   inside the iconsdir directory, which is specified in recoll.conf).
-
-5.4.5. The mimeview file
-
-   mimeview specifies which programs are started when you click on an Open
-   link in a result list. For example, HTML is normally displayed using
-   firefox, but you may prefer Konqueror, or your openoffice.org program
-   might be named ooffice instead of openoffice.
-
-   Changes to this file can be done by direct editing, or through the recoll
-   GUI preferences dialog.
-
-   If Use desktop preferences to choose document editor is checked in the
-   Recoll GUI preferences, all mimeview entries will be ignored except the
-   one labelled application/x-all (which is set to use xdg-open by default).
-
-   In this case, the xallexcepts top level variable defines a list of mime
-   type exceptions which will be processed according to the local entries
-   instead of being passed to the desktop. This is so that specific Recoll
-   options such as a page number or a search string can be passed to
-   applications that support them, such as the evince viewer.
-
-   As for the other configuration files, the normal usage is to have a
-   mimeview inside your own configuration directory, with just the
-   non-default entries, which will override those from the central
-   configuration file.
-
-   All viewer definition entries must be placed under a [view] section.
-
-   The keys in the file are normally mime types. You can add an application
-   tag to specialize the choice for an area of the filesystem (using a
-   localfields specification in mimeconf). The syntax for the key is
-   mimetype|tag
-
-   The nouncompforviewmts entry (placed at the top level, outside of the
-   [view] section) holds a list of mime types that should not be
-   uncompressed before starting the viewer (if they are found compressed,
-   e.g. mydoc.doc.gz).
-
-   The right side of each assignment holds a command to be executed for
-   opening the file. The following substitutions are performed:
-
-     * %D. Document date
-
-     * %f. File name. This may be the name of a temporary file if it was
-       necessary to create one (e.g. to extract a subdocument from a
-       container).
-
-     * %F. Original file name. Same as %f except if a temporary file is used.
-
-     * %i. Internal path, for subdocuments of containers. The format depends
-       on the container type. If this appears in the command line, Recoll
-       will not create a temporary file to extract the subdocument, expecting
-       the called application (possibly a script) to be able to handle it.
-
-     * %M. Mime type
-
-     * %p. Page index. Only significant for a subset of document types,
-       currently only PDF, PostScript and DVI files. Can be used to start the
-       editor at the right page for a match or snippet.
-
-     * %s. Search term. The value will only be set for documents with indexed
-       page numbers (e.g. PDF). The value will be one of the matched search
-       terms. It would allow pre-setting the value in the "Find" entry inside
-       Evince for example, for easy highlighting of the term.
-
-     * %U, %u. URL.
-
-   In addition to the predefined values above, all strings like %(fieldname)
-   will be replaced by the value of the field named fieldname for the
-   document. This could be used in combination with field customisation to
-   help with opening the document.
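-
-   As an illustration of these substitutions, a personal mimeview might
-   contain something like the following. The evince options shown
-   (--page-index and --find) are assumptions about the viewer's command
-   line, so check them against your installed version:
-
- # Keep using the local entry for PDF even with desktop preferences on
- xallexcepts = application/pdf
-
- [view]
- # Open at the matching page, with the search term preset
- application/pdf = evince --page-index=%p --find=%s %f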
-
-5.4.6. Examples of configuration adjustments
-
-  5.4.6.1. Adding an external viewer for a non-indexed type
-
-   Imagine that you have some kind of file which does not have indexable
-   content, but for which you would like to have a functional Open link in
-   the result list (when found by file name). The file names end in .blob and
-   can be displayed by application blobviewer.
-
-   You need two entries in the configuration files for this to work:
-
-     * In $RECOLL_CONFDIR/mimemap (typically ~/.recoll/mimemap), add the
-       following line:
-
- .blob = application/x-blobapp
-
-       Note that the mime type is made up here, and you could call it
-       diesel/oil just the same.
-     * In $RECOLL_CONFDIR/mimeview under the [view] section, add:
-
- application/x-blobapp = blobviewer %f
-
-       We are supposing that blobviewer wants a file name parameter here; you
-       would use %u if it liked URLs better.
-
-   If you just wanted to change the application used by Recoll to display a
-   mime type which it already knows, you would just need to edit mimeview.
-   The entries you add in your personal file override those in the central
-   configuration, which you do not need to alter. mimeview can also be
-   modified from the GUI.
-
-  5.4.6.2. Adding indexing support for a new file type
-
-   Let us now imagine that the above .blob files actually contain indexable
-   text and that you know how to extract it with a command line program.
-   Getting Recoll to index the files is easy. You need to perform the above
-   alteration, and also to add data to the mimeconf file (typically in
-   ~/.recoll/mimeconf):
-
-     * Under the [index] section, add the following line (more about the
-       rclblob indexing script later):
-
- application/x-blobapp = exec rclblob
-
-     * Under the [icons] section, you should choose an icon to be displayed
-       for the files inside the result lists. Icons are normally 64x64 pixel
-       PNG files which live in /usr/[local/]share/recoll/images.
-
-     * Under the [categories] section, you should add the mime type where it
-       makes sense (you can also create a category). Categories may be used
-       for filtering in advanced search.
-
-   The rclblob filter should be an executable program or script which exists
-   inside /usr/[local/]share/recoll/filters. It will be given a file name as
-   argument and should output the text or HTML contents on the standard
-   output.
-
-   The filter programming section describes in more detail how to write a
-   filter.
-