a/src/INSTALL b/src/INSTALL
...
...
14
14
15
Chapter 5. Installation and configuration
15
Chapter 5. Installation and configuration
16
16
17
5.1. Installing a binary copy
17
5.1. Installing a binary copy
18
18
19
   There are three types of binary Recoll installations:
19
   Recoll binary copies are always distributed as regular packages for your
20
   system. They can be obtained either through the system's normal software
21
   distribution framework (e.g. Debian/Ubuntu apt, FreeBSD ports, etc.), or
22
   from some type of "backports" repository providing versions newer than the
23
   standard ones, or found on the Recoll web site in some cases.
20
24
21
     o Through your system normal software distribution framework (ie,
25
   There used to exist another form of binary install, as pre-compiled source
22
       Debian/Ubuntu apt, FreeBSD ports, etc.).
26
   trees, but these are just less convenient than the packages and don't
27
   exist any more.
23
28
24
     o From a package downloaded from the Recoll web site.
29
   The package management tools will usually automatically deal with hard
30
   dependencies for packages obtained from a proper package repository. You
31
   will have to deal with them by hand for downloaded packages (for example,
32
   when dpkg complains about missing dependencies).
25
33
26
     o From a prebuilt tree downloaded from the Recoll web site.
27
28
   In all cases, the strict software dependancies (ie on Xapian or iconv)
29
   will be automatically satisfied, you should not have to worry about them.
30
31
   You will only have to check or install supporting applications for the
34
   In all cases, you will have to check or install supporting applications
32
   file types that you want to index beyond those that are natively processed
35
   for the file types that you want to index beyond those that are natively
33
   by Recoll (text, HTML, email files, and a few others).
36
   processed by Recoll (text, HTML, email files, and a few others).
34
37
35
   You may also want to look at the configuration section (but this
38
   You may also want to look at the configuration section (but this
36
   may not be necessary for a quick test with default parameters). Most
39
   may not be necessary for a quick test with default parameters). Most
37
   parameters can be set more conveniently from the GUI.
40
   parameters can be set more conveniently from the GUI.
38
39
  5.1.1. Installing through a package system
40
41
   If you use a BSD-type port system or a prebuilt package (DEB, RPM,
42
   manually or through the system software configuration utility), just
43
   follow the usual procedure for your system.
44
45
  5.1.2. Installing a prebuilt Recoll
46
47
   The unpackaged binary versions on the Recoll web site are just compressed
48
   tar files of a build tree, where only the useful parts were kept
49
   (executables and sample configuration).
50
51
   The executable binary files are built with a static link to libxapian and
52
   libiconv, to make installation easier (no dependencies).
53
54
   After extracting the tar file, you can proceed with installation as if you
55
   had built the package from source (that is, just type make install). The
56
   binary trees are built for installation to /usr/local.
57
41
58
     ----------------------------------------------------------------------
42
     ----------------------------------------------------------------------
59
43
60
   Prev                                                                  Next 
44
   Prev                                                                  Next 
61
   4.3. API                           Home           5.2. Supporting packages 
45
   4.3. API                           Home           5.2. Supporting packages 
...
...
280
     o Of course, the usual autoconf configure options, like --prefix, apply.
264
     o Of course, the usual autoconf configure options, like --prefix, apply.
281
265
282
   Normal procedure:
266
   Normal procedure:
283
267
284
         cd recoll-xxx
268
         cd recoll-xxx
285
         configure
269
         ./configure
286
         make
270
         make
287
         (practices usual hardship-repelling invocations)
271
         (practices usual hardship-repelling invocations)
288
      
272
      
289
273
290
   There is little auto-configuration. The configure script will mainly link
274
   There is little auto-configuration. The configure script will mainly link
...
...
430
       handle multiple encodings in a single file. In this relatively
414
       handle multiple encodings in a single file. In this relatively
431
       unlikely case, you can edit the configuration file as two separate
415
       unlikely case, you can edit the configuration file as two separate
432
       text files with appropriate encodings, and concatenate them to create
416
       text files with appropriate encodings, and concatenate them to create
433
       the complete configuration.
417
       the complete configuration.
434
418
419
  5.4.1. Environment variables
420
421
   RECOLL_CONFDIR
422
423
           Defines the main configuration directory.
424
425
   RECOLL_TMPDIR, TMPDIR
426
427
           Locations for temporary files, in this order of priority. The
428
           default if none of these is set is to use /tmp. Big temporary
429
           files may be created during indexing, mostly for decompressing,
430
           and also for processing, e.g. email attachments.
431
432
   RECOLL_CONFTOP, RECOLL_CONFMID
433
434
           Allow adding configuration directories with priorities below and
435
           above the user directory (see the Configuration overview section
436
           above for details).
437
438
   RECOLL_EXTRA_DBS, RECOLL_ACTIVE_EXTRA_DBS
439
440
           Help for setting up external indexes. See this paragraph for
441
           explanations.
442
443
   RECOLL_DATADIR
444
445
           Defines a replacement for the default location of Recoll data files
446
           (normally found in, e.g., /usr/share/recoll).
447
448
   RECOLL_FILTERSDIR
449
450
           Defines a replacement for the default location of Recoll filters
451
           (normally found in, e.g., /usr/share/recoll/filters).
452
453
   ASPELL_PROG
454
455
           aspell program to use for creating the spelling dictionary. The
456
           result has to be compatible with the libaspell which Recoll is
457
           using.
458
459
   VARNAME
460
461
           Blabla
462
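   The lookup order described above for temporary file locations can be
   sketched as follows. This is only an illustration of the documented
   priority (RECOLL_TMPDIR, then TMPDIR, then /tmp), not Recoll's actual
   code; the function name recoll_tmpdir is hypothetical, and treating an
   empty value as unset is an assumption.

```python
import os


def recoll_tmpdir(env=None):
    """Pick a temporary directory using the documented priority order.

    Hypothetical helper: RECOLL_TMPDIR wins over TMPDIR, and /tmp is the
    fallback when neither variable is set (or when it is set but empty,
    which is an assumption made for this sketch).
    """
    env = os.environ if env is None else env
    for name in ("RECOLL_TMPDIR", "TMPDIR"):
        value = env.get(name)
        if value:
            return value
    return "/tmp"


if __name__ == "__main__":
    print(recoll_tmpdir())
```

   Passing a plain dictionary instead of the real environment makes the
   priority order easy to check without changing any variable.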
435
  5.4.1. The main configuration file, recoll.conf
463
  5.4.2. The main configuration file, recoll.conf
436
464
437
   recoll.conf is the main configuration file. It defines things like what to
465
   recoll.conf is the main configuration file. It defines things like what to
438
   index (top directories and things to ignore), and the default character
466
   index (top directories and things to ignore), and the default character
439
   set to use for document types which do not specify it internally.
467
   set to use for document types which do not specify it internally.
440
468
...
...
445
473
446
   Most of the following parameters can be changed from the Index
474
   Most of the following parameters can be changed from the Index
447
   Configuration menu in the recoll interface. Some can only be set by
475
   Configuration menu in the recoll interface. Some can only be set by
448
   editing the configuration file.
476
   editing the configuration file.
449
477
450
    5.4.1.1. Parameters affecting what documents we index:
478
    5.4.2.1. Parameters affecting what documents we index:
451
479
452
   topdirs
480
   topdirs
453
481
454
           Specifies the list of directories or files to index (recursively
482
           Specifies the list of directories or files to index (recursively
455
           for directories). You can use symbolic links as elements of this
483
           for directories). You can use symbolic links as elements of this
...
...
479
           hidden directories, and you probably want this indexed. One
507
           hidden directories, and you probably want this indexed. One
480
           possible solution is to have .* in skippedNames, and add things
508
           possible solution is to have .* in skippedNames, and add things
481
           like ~/.thunderbird or ~/.evolution in topdirs.
509
           like ~/.thunderbird or ~/.evolution in topdirs.
482
510
483
           Not even the file names are indexed for patterns in this list. See
511
           Not even the file names are indexed for patterns in this list. See
484
           the recoll_noindex variable in mimemap for an alternative approach
512
           the noContentSuffixes variable for an alternative approach which
485
           which indexes the file names.
513
           indexes the file names.
514
515
   noContentSuffixes
516
517
           This is a list of file name endings (not wildcard expressions, nor
518
           dot-delimited suffixes). Only the names of matching files will be
519
           indexed (no attempt at MIME type identification, no decompression,
520
           no content indexing). This can be redefined for subdirectories,
521
           and edited from the GUI. The default value is:
522
523
 noContentSuffixes = .md5 .map \
524
        .o .lib .dll .a .sys .exe .com \
525
        .mpp .mpt .vsd \
526
        .img .img.gz .img.bz2 .img.xz .image .image.gz .image.bz2 .image.xz \
527
        .dat .bak .rdf .log.gz .log .db .msf .pid \
528
        ,v ~ #
486
529
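   To illustrate what "file name endings" means for noContentSuffixes, here
   is a small sketch of plain end-of-name matching against the default list
   shown above. It is not Recoll's implementation: the function name
   is_name_only is hypothetical, and the case-sensitive comparison is an
   assumption.

```python
# Default noContentSuffixes value from the documentation, as a tuple.
DEFAULT_NO_CONTENT_SUFFIXES = (
    ".md5", ".map",
    ".o", ".lib", ".dll", ".a", ".sys", ".exe", ".com",
    ".mpp", ".mpt", ".vsd",
    ".img", ".img.gz", ".img.bz2", ".img.xz",
    ".image", ".image.gz", ".image.bz2", ".image.xz",
    ".dat", ".bak", ".rdf", ".log.gz", ".log", ".db", ".msf", ".pid",
    ",v", "~", "#",
)


def is_name_only(filename, suffixes=DEFAULT_NO_CONTENT_SUFFIXES):
    """Return True if only the file name would be indexed.

    The entries are plain endings, not wildcards and not necessarily
    dot-delimited, so a simple endswith() test is enough. Case-sensitive
    matching is an assumption of this sketch.
    """
    return filename.endswith(tuple(suffixes))


if __name__ == "__main__":
    for name in ("main.c,v", "notes.txt~", "disk.img.gz", "report.pdf"):
        print(name, is_name_only(name))
```

   Entries such as ,v and ~ show why the list is described as endings
   rather than extensions: they match without any leading dot.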
487
   skippedPaths and daemSkippedPaths
530
   skippedPaths and daemSkippedPaths
488
531
489
           A space-separated list of patterns for paths of files or
532
           A space-separated list of patterns for paths of files or
490
           directories that should be skipped. There is no default in the
533
           directories that should be skipped. There is no default in the
...
...
600
643
601
           The path to the web indexing queue. This is hard-coded in the
644
           The path to the web indexing queue. This is hard-coded in the
602
           Firefox plugin as ~/.recollweb/ToIndex so there should be no need
645
           Firefox plugin as ~/.recollweb/ToIndex so there should be no need
603
           to change it.
646
           to change it.
604
647
605
    5.4.1.2. Parameters affecting how we generate terms:
648
    5.4.2.2. Parameters affecting how we generate terms:
606
649
607
   Changing some of these parameters will imply a full reindex. Also, when
650
   Changing some of these parameters will imply a full reindex. Also, when
608
   using multiple indexes, it may not make sense to search indexes that don't
651
   using multiple indexes, it may not make sense to search indexes that don't
609
   share the values for these parameters, because they usually affect both
652
   share the values for these parameters, because they usually affect both
610
   search and index operations.
653
   search and index operations.
...
...
775
 field2 = value for field2
818
 field2 = value for field2
776
                
819
                
777
820
778
           field1 and field2 will be set inside the document metadata.
821
           field1 and field2 will be set inside the document metadata.
779
822
780
    5.4.1.3. Parameters affecting where and how we store things:
823
    5.4.2.3. Parameters affecting where and how we store things:
781
824
782
   dbdir
825
   dbdir
783
826
784
           The name of the Xapian data directory. It will be created if
827
           The name of the Xapian data directory. It will be created if
785
           needed when the index is initialized. If this is not an absolute
828
           needed when the index is initialized. If this is not an absolute
...
...
834
           usage also depends on average document size. The default value is
877
           usage also depends on average document size. The default value is
835
           10, and it is probably a bit low. If your system usually has free
878
           10, and it is probably a bit low. If your system usually has free
836
           memory, you can try higher values between 20 and 80. In my
879
           memory, you can try higher values between 20 and 80. In my
837
           experience, values beyond 100 are always counterproductive.
880
           experience, values beyond 100 are always counterproductive.
838
881
839
    5.4.1.4. Parameters affecting multithread processing
882
    5.4.2.4. Parameters affecting multithread processing
840
883
841
   The Recoll indexing process recollindex can use multiple threads to speed
884
   The Recoll indexing process recollindex can use multiple threads to speed
842
   up indexing on multiprocessor systems. The work done to index files is
885
   up indexing on multiprocessor systems. The work done to index files is
843
   divided in several stages and some of the stages can be executed by
886
   divided in several stages and some of the stages can be executed by
844
   multiple threads. The stages are:
887
   multiple threads. The stages are:
...
...
897
   The following example would disable multithreading. Indexing will be
940
   The following example would disable multithreading. Indexing will be
898
   performed by a single thread.
941
   performed by a single thread.
899
942
900
 thrQSizes = -1 -1 -1
943
 thrQSizes = -1 -1 -1
901
944
902
    5.4.1.5. Miscellaneous parameters:
945
    5.4.2.5. Miscellaneous parameters:
903
946
904
   autodiacsens
947
   autodiacsens
905
948
906
           If the index is not stripped, decide if we automatically trigger
949
           If the index is not stripped, decide if we automatically trigger
907
           diacritics sensitivity if the search term has accented characters
950
           diacritics sensitivity if the search term has accented characters
...
...
926
   logfilename, daemlogfilename
969
   logfilename, daemlogfilename
927
970
928
           Where the messages should go. 'stderr' can be used as a special
971
           Where the messages should go. 'stderr' can be used as a special
929
           value, and is the default. The daem version is specific to the
972
           value, and is the default. The daem version is specific to the
930
           indexing monitor daemon.
973
           indexing monitor daemon.
974
975
   checkneedretryindexscript
976
977
           This defines the name for a command executed by recollindex when
978
           starting indexing. If the exit status of the command is 0,
979
           recollindex retries indexing all files which previously could not
980
           be indexed because of data extraction errors. The default value is
981
           a script which checks if any of the common bin directories have
982
           changed (indicating that a helper program may have been
983
           installed).
931
984
932
   mondelaypatterns
985
   mondelaypatterns
933
986
934
           This allows specifying wildcard path patterns (processed with
987
           This allows specifying wildcard path patterns (processed with
935
           fnmatch(3) with 0 flag), to match files which change too often and
988
           fnmatch(3) with 0 flag), to match files which change too often and
...
...
1017
           This allows defining location-related quirks for the mailbox
1070
           This allows defining location-related quirks for the mailbox
1018
           handler. Currently only the tbird flag is defined, and it should
1071
           handler. Currently only the tbird flag is defined, and it should
1019
           be set for directories which hold Thunderbird data, as their
1072
           be set for directories which hold Thunderbird data, as their
1020
           folder format is weird.
1073
           folder format is weird.
1021
1074
1022
  5.4.2. The fields file
1075
  5.4.3. The fields file
1023
1076
1024
   This file contains information about dynamic fields handling in Recoll.
1077
   This file contains information about dynamic fields handling in Recoll.
1025
   Some very basic fields have hard-wired behaviour, and, mostly, you should
1078
   Some very basic fields have hard-wired behaviour, and, mostly, you should
1026
   not change the original data inside the fields file. But you can create
1079
   not change the original data inside the fields file. But you can create
1027
   custom fields fitting your data and handle them just like they were native
1080
   custom fields fitting your data and handle them just like they were native
...
...
1088
 [mail]
1141
 [mail]
1089
 # Extract the X-My-Tag mail header, and use it internally with the
1142
 # Extract the X-My-Tag mail header, and use it internally with the
1090
 # mailmytag field name
1143
 # mailmytag field name
1091
 x-my-tag = mailmytag
1144
 x-my-tag = mailmytag
1092
1145
1093
    5.4.2.1. Extended attributes in the fields file
1146
    5.4.3.1. Extended attributes in the fields file
1094
1147
1095
   Recoll versions 1.19 and later process user extended file attributes as
1148
   Recoll versions 1.19 and later process user extended file attributes as
1096
   document fields by default.
1149
   document fields by default.
1097
1150
1098
   Attributes are processed as fields of the same name, after removing the
1151
   Attributes are processed as fields of the same name, after removing the
...
...
1100
1153
1101
   The [xattrtofields] section of the fields file allows specifying
1154
   The [xattrtofields] section of the fields file allows specifying
1102
   translations from extended attribute names to Recoll field names. An
1155
   translations from extended attribute names to Recoll field names. An
1103
   empty translation disables use of the corresponding attribute data.
1156
   empty translation disables use of the corresponding attribute data.
1104
1157
1105
  5.4.3. The mimemap file
1158
  5.4.4. The mimemap file
1106
1159
1107
   mimemap specifies the file name extension to MIME type mappings.
1160
   mimemap specifies the file name extension to MIME type mappings.
1108
1161
1109
   For file names without an extension, or with an unknown one, the system's
1162
   For file names without an extension, or with an unknown one, the system's
1110
   file -i command will be executed to determine the MIME type (this can be
1163
   file -i command will be executed to determine the MIME type (this can be
...
...
1113
   The mappings can be specified on a per-subtree basis, which may be useful
1166
   The mappings can be specified on a per-subtree basis, which may be useful
1114
   in some cases. Example: gaim logs have a .txt extension but should be
1167
   in some cases. Example: gaim logs have a .txt extension but should be
1115
   handled specially, which is possible because they are usually all located
1168
   handled specially, which is possible because they are usually all located
1116
   in one place.
1169
   in one place.
1117
1170
1118
   mimemap also has a recoll_noindex variable which is a list of suffixes.
1171
   The recoll_noindex mimemap variable has been moved to recoll.conf and
1119
   Matching files will be skipped (which avoids unnecessary decompressions or
1172
   renamed to noContentSuffixes, while keeping the same function, as of
1120
   file executions). This is partially redundant with skippedNames in the
1173
   Recoll version 1.21. For older Recoll versions, see the documentation for
1121
   main configuration file, with a few differences: it will not affect
1174
   noContentSuffixes but use recoll_noindex in mimemap.
1122
   directories, it cannot be made dependant on the file-system location (it
1123
   is a configuration-wide parameter), and the file names will still be
1124
   indexed (not even the file names are indexed for patterns in skippedNames.
1125
   recoll_noindex is used mostly for things known to be unindexable by a
1126
   given Recoll version. Having it there avoids cluttering the more
1127
   user-oriented and locally customized skippedNames.
1128
1175
1129
  5.4.4. The mimeconf file
1176
  5.4.5. The mimeconf file
1130
1177
1131
   mimeconf specifies how the different MIME types are handled for indexing,
1178
   mimeconf specifies how the different MIME types are handled for indexing,
1132
   and which icons are displayed in the recoll result lists.
1179
   and which icons are displayed in the recoll result lists.
1133
1180
1134
   Changing the parameters in the [index] section is probably not a good idea
1181
   Changing the parameters in the [index] section is probably not a good idea
...
...
1136
1183
1137
   The [icons] section allows you to change the icons which are displayed by
1184
   The [icons] section allows you to change the icons which are displayed by
1138
   recoll in the result lists (the values are the basenames of the png images
1185
   recoll in the result lists (the values are the basenames of the png images
1139
   inside the iconsdir directory specified in recoll.conf).
1186
   inside the iconsdir directory specified in recoll.conf).
1140
1187
1141
  5.4.5. The mimeview file
1188
  5.4.6. The mimeview file
1142
1189
1143
   mimeview specifies which programs are started when you click on an Open
1190
   mimeview specifies which programs are started when you click on an Open
1144
   link in a result list. E.g.: HTML is normally displayed using firefox, but
1191
   link in a result list. E.g.: HTML is normally displayed using firefox, but
1145
   you may prefer Konqueror, your openoffice.org program might be named
1192
   you may prefer Konqueror, your openoffice.org program might be named
1146
   oofice instead of openoffice etc.
1193
   oofice instead of openoffice etc.
...
...
1205
   In addition to the predefined values above, all strings like %(fieldname)
1252
   In addition to the predefined values above, all strings like %(fieldname)
1206
   will be replaced by the value of the field named fieldname for the
1253
   will be replaced by the value of the field named fieldname for the
1207
   document. This could be used in combination with field customisation to
1254
   document. This could be used in combination with field customisation to
1208
   help with opening the document.
1255
   help with opening the document.
1209
1256
1210
  5.4.6. The ptrans file
1257
  5.4.7. The ptrans file
1211
1258
1212
   ptrans specifies query-time path translations. These can be useful in
1259
   ptrans specifies query-time path translations. These can be useful in
1213
   multiple cases.
1260
   multiple cases.
1214
1261
1215
   The file has a section for any index which needs translations, either the
1262
   The file has a section for any index which needs translations, either the
...
...
1224
           [/path/to/additional/xapiandb]
1271
           [/path/to/additional/xapiandb]
1225
           /server/volume1/docdir = /net/server/volume1/docdir
1272
           /server/volume1/docdir = /net/server/volume1/docdir
1226
           /server/volume2/docdir = /net/server/volume2/docdir
1273
           /server/volume2/docdir = /net/server/volume2/docdir
1227
        
1274
        
1228
1275
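   The effect of the translation entries in the example above can be
   sketched as a prefix rewrite. This is an illustration only, not Recoll's
   code: the function name translate_path is hypothetical, and it assumes
   that the left-hand side is the path prefix stored in the index, that the
   right-hand side is the prefix to use at query time, and that matching is
   done on whole path components.

```python
# Entries copied from the example section above (left = path prefix in the
# index, right = replacement prefix; this direction is an assumption).
PTRANS = {
    "/server/volume1/docdir": "/net/server/volume1/docdir",
    "/server/volume2/docdir": "/net/server/volume2/docdir",
}


def translate_path(path, table=PTRANS):
    """Rewrite the leading part of path if it matches a table entry.

    A prefix only matches on a path component boundary, so
    /server/volume1/docdirX is left untouched.
    """
    for old, new in table.items():
        old = old.rstrip("/")
        if path == old or path.startswith(old + "/"):
            return new.rstrip("/") + path[len(old):]
    return path


if __name__ == "__main__":
    print(translate_path("/server/volume1/docdir/reports/q1.pdf"))
```

   Paths which match no entry are returned unchanged, which corresponds to
   an index that needs no translation.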
1229
  5.4.7. Examples of configuration adjustments
1276
  5.4.8. Examples of configuration adjustments
1230
1277
1231
    5.4.7.1. Adding an external viewer for a non-indexed type
1278
    5.4.8.1. Adding an external viewer for a non-indexed type
1232
1279
1233
   Imagine that you have some kind of file which does not have indexable
1280
   Imagine that you have some kind of file which does not have indexable
1234
   content, but for which you would like to have a functional Open link in
1281
   content, but for which you would like to have a functional Open link in
1235
   the result list (when found by file name). The file names end in .blob and
1282
   the result list (when found by file name). The file names end in .blob and
1236
   can be displayed by application blobviewer.
1283
   can be displayed by application blobviewer.
...
...
1256
   MIME type which it already knows, you would just need to edit mimeview.
1303
   MIME type which it already knows, you would just need to edit mimeview.
1257
   The entries you add in your personal file override those in the central
1304
   The entries you add in your personal file override those in the central
1258
   configuration, which you do not need to alter. mimeview can also be
1305
   configuration, which you do not need to alter. mimeview can also be
1259
   modified from the GUI.
1306
   modified from the GUI.
1260
1307
1261
    5.4.7.2. Adding indexing support for a new file type
1308
    5.4.8.2. Adding indexing support for a new file type
1262
1309
1263
   Let us now imagine that the above .blob files actually contain indexable
1310
   Let us now imagine that the above .blob files actually contain indexable
1264
   text and that you know how to extract it with a command line program.
1311
   text and that you know how to extract it with a command line program.
1265
   Getting Recoll to index the files is easy. You need to perform the above
1312
   Getting Recoll to index the files is easy. You need to perform the above
1266
   alteration, and also to add data to the mimeconf file (typically in
1313
   alteration, and also to add data to the mimeconf file (typically in