a/src/INSTALL b/src/INSTALL
...
...
13
13
14
                            Chapter 7. Installation
14
                            Chapter 7. Installation
15
15
16
   Table of Contents
16
   Table of Contents
17
17
18
   7.1. Installing a prebuilt copy
18
   7.1. Installing a binary copy
19
19
20
   7.2. Supporting packages
20
   7.2. Supporting packages
21
21
22
   7.3. Building from source
22
   7.3. Building from source
23
23
24
   7.4. Configuration overview
24
   7.4. Configuration overview
25
25
26
   7.5. The KDE Kicker Recoll applet
26
   7.5. The KDE Kicker Recoll applet
27
27
28
                        7.1. Installing a prebuilt copy
28
                         7.1. Installing a binary copy
29
29
30
   Recoll binary packages from the Recoll web site are always linked
30
   There are three types of binary Recoll installations:
31
   statically to the Xapian libraries, and have no other dependencies. You
31
32
     * Through your system's normal software distribution framework (ie,
33
       Debian/Ubuntu apt, FreeBSD ports, etc.).
34
35
     * From a package downloaded from the Recoll web site.
36
37
     * From a prebuilt tree downloaded from the Recoll web site.
38
39
   In all cases, the strict software dependencies (ie on Xapian or iconv)
40
   will be automatically satisfied, so you should not have to worry about them.
41
32
   will only have to check or install supporting applications for the file
42
   You will only have to check or install supporting applications for the
33
   types that you want to index beyond text, HTML and mail files, and maybe
43
   file types that you want to index beyond those that are natively processed
34
   have a look at the configuration section (but this may not be necessary
44
   by Recoll (text, HTML, mail files, and a few others).
45
46
   You should also maybe have a look at the configuration section (but this
35
   for a quick test with default parameters).
47
   may not be necessary for a quick test with default parameters). Most
48
   parameters can be more conveniently set from the GUI interface.
36
49
37
7.1.1. Installing through a package system
50
7.1.1. Installing through a package system
38
51
39
   If you use a BSD-type port system or a prebuilt package (RPM or other),
52
   If you use a BSD-type port system or a prebuilt package (DEB, RPM,
53
   manually or through the system software configuration utility), just
40
   just follow the usual procedure for your system.
54
   follow the usual procedure for your system.
41
55
42
7.1.2. Installing a prebuilt Recoll
56
7.1.2. Installing a prebuilt Recoll
43
57
44
   The unpackaged binary versions on the Recoll web site are just compressed
58
   The unpackaged binary versions on the Recoll web site are just compressed
45
   tar files of a build tree, where only the useful parts were kept
59
   tar files of a build tree, where only the useful parts were kept
...
...
68
82
69
                            7.2. Supporting packages
83
                            7.2. Supporting packages
70
84
71
   Recoll uses external applications to index some file types. You need to
85
   Recoll uses external applications to index some file types. You need to
72
   install them for the file types that you wish to have indexed (these are
86
   install them for the file types that you wish to have indexed (these are
73
   run-time dependencies. None is needed for building Recoll).
87
   run-time optional dependencies. None is needed for building or running
88
   Recoll except for indexing their specific file type).
74
89
75
   After an indexing pass, the commands that were found missing can be
90
   After an indexing pass, the commands that were found missing can be
76
   displayed from the recoll File menu. The list is stored in the missing
91
   displayed from the recoll File menu. The list is stored in the missing
77
   text file inside the configuration directory.
92
   text file inside the configuration directory.
78
93
...
...
100
115
101
     * dvi: dvips
116
     * dvi: dvips
102
117
103
     * djvu: DjVuLibre
118
     * djvu: DjVuLibre
104
119
105
     * MP3: Recoll will use the id3info command from the id3lib package to
120
     * mp3: Recoll will use the id3info command from the id3lib package to
106
       extract tag information. Without it, only the file names will be
121
       extract tag information. Without it, only the file names will be
107
       indexed.
122
       indexed.
108
123
124
     * flac files need metaflac.
125
126
     * ogg files need ogginfo.
127
109
     * Pictures: Recoll uses the Exiftool Perl package to extract tag
128
     * Pictures: Recoll uses the Exiftool Perl package to extract tag
110
       information. Most image file formats are supported.
129
       information. Most image file formats are supported. Note that there
130
       may not be much interest in indexing the technical tags (image size,
131
       aperture, etc.). This is only of interest if you store personal tags
132
       or textual descriptions inside the image files.
111
133
134
     * chm: files in Microsoft help format need Python and the pychm module
135
       (which needs chmlib).
136
137
     * ics: iCalendar files need Python and the icalendar module.
138
139
     * zip: Zip archives need Python (and the standard zipfile module).
140
112
   Text, HTML, mail folders Openoffice and Scribus files are processed
141
   Text, HTML, mail folders, Openoffice and Scribus files are processed
113
   internally. Lyx is used to index Lyx files. Many filters need sed and awk.
142
   internally. Lyx is used to index Lyx files. Many filters need sed and awk.
114
143
115
   --------------------------------------------------------------------------
144
   --------------------------------------------------------------------------
116
145
117
   Prev                               Home                               Next 
146
   Prev                               Home                               Next 
...
...
129
                           7.3. Building from source
158
                           7.3. Building from source
130
159
131
7.3.1. Prerequisites
160
7.3.1. Prerequisites
132
161
133
   At the very least, you will need to download and install the xapian core
162
   At the very least, you will need to download and install the xapian core
134
   package (Recoll 1.9 normally uses version 1.0.2, but any 0.9 or 1.0.x
163
   package and the qt run-time and development packages. Check the Recoll
135
   version will work too), and the qt run-time and development packages
164
   download page for up to date version information.
136
   (Recoll development currently uses version 3.3.5, but any 3.3 version is
137
   probably OK).
138
165
139
   You will most probably be able to find a binary package for qt for your
166
   You will most probably be able to find a binary package for qt for your
140
   system. You may have to compile Xapian but this is not difficult (if you
167
   system. You may have to compile Xapian but this is not difficult (if you
141
   are using FreeBSD, there is a port).
168
   are using FreeBSD, there is a port).
142
169
...
...
144
   not be critical). On Linux systems, the iconv interface is part of libc
171
   not be critical). On Linux systems, the iconv interface is part of libc
145
   and you should not need to do anything special.
172
   and you should not need to do anything special.
146
173
147
7.3.2. Building
174
7.3.2. Building
148
175
149
   Recoll has been built on Linux (redhat7.3, mandriva 2005/6, Fedora Core
176
   Recoll has been built on Linux, FreeBSD, macosx, and Solaris, most
150
   3/4/5/6), FreeBSD 5/6, macosx, and Solaris 8. If you build on another
177
   versions after 2005 should be ok, maybe some older ones too (Solaris 8 is
151
   system, and need to modify things, I would very much welcome patches.
178
   ok). If you build on another system, and need to modify things, I would
179
   very much welcome patches.
152
180
153
   Depending on the qt configuration on your system, you may have to set the
181
   Depending on the qt configuration on your system, you may have to set the
154
   QTDIR and QMAKESPECS variables in your environment:
182
   QTDIR and QMAKESPECS variables in your environment:
155
183
156
     * QTDIR should point to the directory above the one that holds the qt
184
     * QTDIR should point to the directory above the one that holds the qt
...
...
159
187
160
     * QMAKESPECS should be set to the name of one of the qt mkspecs
188
     * QMAKESPECS should be set to the name of one of the qt mkspecs
161
       sub-directories (ie: linux-g++).
189
       sub-directories (ie: linux-g++).
162
190
163
   On many Linux systems, QTDIR is set by the login scripts, and QMAKESPECS
191
   On many Linux systems, QTDIR is set by the login scripts, and QMAKESPECS
164
   is not needed because there is a default link in mkspecs/.
192
   is not needed because there is a default link in mkspecs/. Neither should
193
   be needed with Qt 4.
165
194
166
   Configure options: --without-aspell will disable the code for phonetic
195
   Configure options:
167
   matching of search terms. --with-fam or --with-inotify will enable the
196
197
     * --without-aspell will disable the code for phonetic matching of search
198
       terms.
199
200
     * --with-fam or --with-inotify will enable the code for real time
168
   code for real time indexing. Inotify support is enabled by default on
201
       indexing. Inotify support is enabled by default on recent Linux
169
   recent Linux systems.
202
       systems.
203
204
     * --enable-xattr will enable code to fetch data from file extended
205
       attributes. This is only useful if some application stores data in
206
       there, and also needs some simple configuration (see comments in the
207
       fields configuration file).
208
209
     * --with-file-command Specify the version of the 'file' command to use
210
       (ie: --with-file-command=/usr/local/bin/file). Can be useful to enable
211
       the gnu version on systems where the native one is bad.
212
213
     * --without-gui Disable the Qt interface, and auxiliary uses of X11, and
214
       compile the command line version.
170
215
171
   Normal procedure:
216
   Normal procedure:
172
217
173
         cd recoll-xxx
218
         cd recoll-xxx
174
         configure
219
         configure
175
         make
220
         make
176
         (practices usual hardship-repelling invocations)
221
         (practices usual hardship-repelling invocations)
177
     
222
     
178
223
179
   There little auto-configuration. The configure script will mainly link one
224
   There is little auto-configuration. The configure script will mainly link
180
   of the system-specific files in the mk directory to mk/sysconf. If your
225
   one of the system-specific files in the mk directory to mk/sysconf. If
181
   system is not known yet, it will tell you as much, and you may want to
226
   your system is not known yet, it will tell you as much, and you may want
182
   manually copy and modify one of the existing files (the new file name
227
   to manually copy and modify one of the existing files (the new file name
183
   should be the output of uname -s).
228
   should be the output of uname -s).
184
229
185
7.3.3. Installation
230
7.3.3. Installation
186
231
187
   Either type make install or execute recollinstall prefix, in the root of
232
   Either type make install or execute recollinstall prefix, in the root of
...
...
289
   The default configuration will index your home directory. If this is not
334
   The default configuration will index your home directory. If this is not
290
   appropriate, start recoll to create a blank configuration, click Cancel,
335
   appropriate, start recoll to create a blank configuration, click Cancel,
291
   and edit the configuration file before restarting the command. This will
336
   and edit the configuration file before restarting the command. This will
292
   start the initial indexing, which may take some time.
337
   start the initial indexing, which may take some time.
293
338
294
   Parameters:
339
   Parameters affecting what we index:
295
340
296
   topdirs
341
   topdirs
297
342
298
           Specifies the list of directories or files to index (recursively
343
           Specifies the list of directories or files to index (recursively
299
           for directories). The indexer will not follow symbolic links
344
           for directories). The indexer will not follow symbolic links
300
           inside the indexed trees by default (see the followLinks options
345
           inside the indexed trees by default (see the followLinks options
301
           though).
346
           though).
302
347
303
   dbdir
304
305
           The name of the Xapian data directory. It will be created if
306
           needed when the index is initialized. If this is not an absolute
307
           path, it will be interpreted relative to the configuration
308
           directory. The value can have embedded spaces but starting or
309
           trailing spaces will be trimmed. You cannot use quotes here.
310
311
   skippedNames
348
   skippedNames
312
349
313
           A space-separated list of patterns for names of files or
350
           A space-separated list of patterns for names of files or
314
           directories that should be completely ignored. The list defined in
351
           directories that should be completely ignored. The list defined in
315
           the default file is:
352
           the default file is:
316
353
317
 skippedNames = #* bin CVS  Cache cache* caughtspam  tmp .thumbnails .svn \
354
 skippedNames = #* bin CVS  Cache cache* caughtspam  tmp .thumbnails .svn \
318
          *~ recollrc
355
            *~ .beagle .git .hg .bzr loop.ps .xsession-errors \
356
            .recoll* xapiandb recollrc recoll.conf
319
357
320
           The list can be redefined for sub-directories, but is only
358
           The list can be redefined at any sub-directory in the indexed
321
           actually changed for the top level ones in topdirs.
359
           area.
322
360
323
           The top-level directories are not affected by this list (that is,
361
           The top-level directories are not affected by this list (that is,
324
           a directory in topdirs might match and would still be indexed).
362
           a directory in topdirs might match and would still be indexed).
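           The pattern matching described for skippedNames can be sketched
           as follows. This is only an illustration, assuming shell-style
           (fnmatch) matching against the base name of each file or
           directory; the is_skipped helper is hypothetical and is not part
           of Recoll. The pattern list is a subset of the default value
           quoted above.

```python
from fnmatch import fnmatchcase

# Subset of the default skippedNames value quoted in the text.
SKIPPED_NAMES = (
    "#* bin CVS Cache cache* caughtspam tmp .thumbnails .svn *~ recollrc"
).split()


def is_skipped(name, patterns=SKIPPED_NAMES):
    """Return True if a base name matches any of the patterns.

    Assumption: case-sensitive, shell-style matching on the base name only.
    """
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ("cache2", "notes.txt~", "#autosave#", "report.txt"):
        print(candidate, is_skipped(candidate))
```

           With these patterns, cache2, notes.txt~ and #autosave# would be
           ignored, while report.txt would be considered for indexing.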
325
363
326
           The list in the default configuration does not exclude hidden
364
           The list in the default configuration does not exclude hidden
...
...
359
           avoid multiple indexing of linked files. No effort is made to
397
           avoid multiple indexing of linked files. No effort is made to
360
           avoid duplication when this option is set to true. This option can
398
           avoid duplication when this option is set to true. This option can
361
           be set individually for each of the topdirs members by using
399
           be set individually for each of the topdirs members by using
362
           sections. It can not be changed below the topdirs level.
400
           sections. It can not be changed below the topdirs level.
363
401
402
   indexedmimetypes
403
404
           Recoll normally indexes any file which it knows how to read. This
405
           list lets you restrict the indexed mime types to what you specify.
406
           If the variable is unspecified or the list empty (the default),
407
           all supported types are processed.
408
409
   compressedfilemaxkbs
410
411
           Size limit for compressed (.gz or .bz2) files. These need to be
412
           decompressed in a temporary directory for identification, which
413
           can be very wasteful if 'uninteresting' big compressed files are
414
           present. Negative means no limit, 0 means no processing of any
415
           compressed file. Defaults to -1.
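           The three cases for compressedfilemaxkbs (negative, zero,
           positive) can be summarized by the small sketch below. The
           function name is hypothetical, and two details are assumptions
           rather than documented behaviour: that one kilobyte means 1024
           bytes and that a file exactly at the limit is still processed.

```python
def should_process_compressed(size_bytes, compressedfilemaxkbs=-1):
    """Decide whether a compressed file of the given size gets decompressed.

    Negative limit: no limit. Zero: never process compressed files.
    Positive: process only files up to that many kilobytes (assumed 1024 B).
    """
    if compressedfilemaxkbs < 0:
        return True
    if compressedfilemaxkbs == 0:
        return False
    return size_bytes <= compressedfilemaxkbs * 1024
```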
416
417
   textfilemaxmbs
418
419
           Maximum size for text files. Very big text files are often
420
           uninteresting logs. Set to -1 to disable (default 20MB).
421
422
   textfilepagekbs
423
424
           If set to other than -1, text files will be indexed as multiple
425
           documents of the given page size. This may be useful if you do
426
           want to index very big text files as it will both reduce memory
427
           usage at index time and help with loading data to the preview
428
           window. A size of a few megabytes would seem reasonable (default:
429
           1MB).
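           As a rough illustration of what indexing a text file "as multiple
           documents of the given page size" could mean, the sketch below
           cuts a byte string into fixed-size pages. It is a simplification
           under stated assumptions: the split_pages name is hypothetical,
           any non-positive value is treated as "no paging", and the cut is
           made at a fixed byte offset rather than at a line boundary, which
           the real indexer may well handle differently.

```python
def split_pages(data, textfilepagekbs):
    """Split bytes into pages of textfilepagekbs kilobytes (assumed 1024 B).

    A non-positive page size disables paging and returns a single page.
    """
    if textfilepagekbs <= 0:
        return [data]
    page_size = textfilepagekbs * 1024
    pages = [data[i:i + page_size] for i in range(0, len(data), page_size)]
    # An empty input still yields one (empty) document.
    return pages or [data]
```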
430
431
   indexallfilenames
432
433
           Recoll indexes file names in a special section of the database to
434
           allow specific file names searches using wild cards. This
435
           parameter decides if file name indexing is performed only for
436
           files with mime types that would qualify them for full text
437
           indexing, or for all files inside the selected subtrees,
438
           independently of mime type.
439
440
   usesystemfilecommand
441
442
           Decide if we use the file -i system command as a final step for
443
           determining the mime type for a file (the main procedure uses
444
           suffix associations as defined in the mimemap file). This can be
445
           useful for files with suffix-less names, but it will also cause
446
           the indexing of many bogus "text" files.
447
448
   processbeaglequeue
449
450
           If this is set, process the directory where Beagle Web browser
451
           plugins copy visited pages for indexing. Of course, Beagle MUST
452
           NOT be running, else things will behave strangely.
453
454
   beaglequeuedir
455
456
           The path to the Beagle indexing queue. This is hard-coded in the
457
           Beagle plugin as ~/.beagle/ToIndex so there should be no need to
458
           change it.
459
460
   Parameters affecting where and how we store things:
461
462
   dbdir
463
464
           The name of the Xapian data directory. It will be created if
465
           needed when the index is initialized. If this is not an absolute
466
           path, it will be interpreted relative to the configuration
467
           directory. The value can have embedded spaces but starting or
468
           trailing spaces will be trimmed. You cannot use quotes here.
469
470
   maxfsoccuppc
471
472
           Maximum file system occupation before we stop indexing. The value
473
           is a percentage, corresponding to what the "Capacity" df output
474
           column shows. The default value is 0, meaning no checking.
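           For reference, the "Capacity" figure printed by df can be
           approximated as used / (used + available), rounded up. The sketch
           below computes it with os.statvfs (Unix only). The function names
           are hypothetical, and the ">=" comparison against maxfsoccuppc is
           an assumption about where exactly indexing would stop.

```python
import math
import os


def occupation_pc(used_blocks, avail_blocks):
    """Percentage in use, rounded up, as in df's Capacity column."""
    total = used_blocks + avail_blocks
    return math.ceil(100 * used_blocks / total) if total else 0


def fs_occupation_pc(path):
    """Occupation percentage of the file system holding path (Unix only)."""
    st = os.statvfs(path)
    return occupation_pc(st.f_blocks - st.f_bfree, st.f_bavail)


def should_stop_indexing(current_pc, maxfsoccuppc=0):
    """A value of 0 means no checking; otherwise stop at the threshold."""
    return maxfsoccuppc > 0 and current_pc >= maxfsoccuppc
```

           For example, should_stop_indexing(fs_occupation_pc("/"), 90)
           would report whether the root file system is at least 90% full.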
475
476
   mboxcachedir
477
478
           The directory where mbox message offsets cache files are held.
479
           This is normally $RECOLL_CONFDIR/mboxcache, but it may be useful
480
           to share a directory between different configurations.
481
482
   mboxcacheminmbs
483
484
           The minimum mbox file size over which we cache the offsets. There
485
           is really no sense in caching offsets for small files. The default
486
           is 5 MB.
487
488
   webcachedir
489
490
           This is only used by the Beagle web browser plugin indexing code,
491
           and defines where the cache for visited pages will live. Default:
492
           $RECOLL_CONFDIR/webcache
493
494
   webcachemaxmbs
495
496
           This is only used by the Beagle web browser plugin indexing code,
497
           and defines the maximum size for the web page cache. Default: 40
498
           MB.
499
500
   idxflushmb
501
502
           Threshold (megabytes of new text data) where we flush from memory
503
           to disk index. Setting this can help control memory usage. A value
504
           of 0 means no explicit flushing, letting Xapian use its own
505
           default, which is flushing every 10000 documents (memory usage
506
           depends on average document size). The default value is 10.
507
508
   Miscellaneous:
509
364
   loglevel,daemloglevel
510
   loglevel,daemloglevel
365
511
366
           Verbosity level for recoll and recollindex. A value of 4 lists
512
           Verbosity level for recoll and recollindex. A value of 4 lists
367
           quite a lot of debug/information messages. 2 only lists errors.
513
           quite a lot of debug/information messages. 2 only lists errors.
368
           The daemloglevel variant is specific to the indexing monitor daemon.
514
           The daemloglevel variant is specific to the indexing monitor daemon.
...
...
388
           character set definition (ie: plain text files). This can be
534
           character set definition (ie: plain text files). This can be
389
           redefined for any sub-directory. If it is not set at all, the
535
           redefined for any sub-directory. If it is not set at all, the
390
           character set used is the one defined by the nls environment
536
           character set used is the one defined by the nls environment
391
           (LC_ALL, LC_CTYPE, LANG), or iso8859-1 if nothing is set.
537
           (LC_ALL, LC_CTYPE, LANG), or iso8859-1 if nothing is set.
392
538
393
   maxfsoccuppc
539
   filtermaxseconds
394
540
395
           Maximum file system occupation before we stop indexing. The value
541
           Maximum filter execution time, after which it is aborted. Some
396
           is a percentage, corresponding to what the "Capacity" df output
542
           postscript programs just loop...
397
           column shows. The default value is 0, meaning no checking.
398
543
399
   idxflushmb
544
   maildefcharset
400
545
401
           Threshold (megabytes of new text data) where we flush from memory
546
           This can be used to define the default character set specifically
402
           to disk index. Setting this can help control memory usage. A value
547
           for mail messages which don't specify it. This is mainly useful
403
           of 0 means no explicit flushing, letting Xapian use its own
548
           for readpst (libpst) dumps, which are utf-8 but do not say so.
404
           default, which is flushing every 10000 documents (memory usage
549
405
           depends on average document size). The default value is 10.
550
   localfields
551
552
           This allows setting fields for all documents under a given
553
           directory. Typical usage would be to set an "rclaptg" field, to be
554
           used in mimeview to select a specific viewer. Ie:
555
           localfields=rclaptg=gnus;other=val, then select a specific viewer
556
           with mimetype|tag=... in mimeview.
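           The value format shown in the example (name=value pairs separated
           by semicolons) can be read as in the sketch below. The
           parse_localfields helper is hypothetical and only illustrates the
           syntax; it makes no claim about how Recoll itself parses the
           value or handles quoting.

```python
def parse_localfields(value):
    """Split 'name=value;name=value' into a dict, skipping empty items."""
    fields = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, field_value = item.partition("=")
        fields[name.strip()] = field_value.strip()
    return fields


if __name__ == "__main__":
    print(parse_localfields("rclaptg=gnus;other=val"))
```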
406
557
407
   filtersdir
558
   filtersdir
408
559
409
           A directory to search for the external filter scripts used to
560
           A directory to search for the external filter scripts used to
410
           index some types of files. The value should not be changed, except
561
           index some types of files. The value should not be changed, except
...
...
413
564
414
   iconsdir
565
   iconsdir
415
566
416
           The name of the directory where recoll result list icons are
567
           The name of the directory where recoll result list icons are
417
           stored. You can change this if you want different images.
568
           stored. You can change this if you want different images.
418
419
   guesscharset
420
421
           Decide if we try to guess the character set of files if no
422
           internal value is available (ie: for plain text files). This does
423
           not work well in general, and should probably not be used.
424
425
   usesystemfilecommand
426
427
           Decide if we use the file -i system command as a final step for
428
           determining the mime type for a file (the main procedure uses
429
           suffix associations as defined in the mimemap file). This can be
430
           useful for files with suffix-less names, but it will also cause
431
           the indexing of many bogus "text" files.
432
433
   indexedmimetypes
434
435
           Recoll normally indexes any file which it knows how to read. This
436
           list lets you restrict the indexed mime types to what you specify.
437
           If the variable is unspecified or the list empty (the default),
438
           all supported types are processed.
439
440
   compressedfilemaxkbs
441
442
           Size limit for compressed (.gz or .bz2) files. These need to be
443
           decompressed in a temporary directory for identification, which
444
           can be very wasteful if 'uninteresting' big compressed files are
445
           present. Negative means no limit, 0 means no processing of any
446
           compressed file. Defaults to -1.
447
448
   indexallfilenames
449
450
           Recoll indexes file names in a special section of the database to
451
           allow specific file names searches using wild cards. This
452
           parameter decides if file name indexing is performed only for
453
           files with mime types that would qualify them for full text
454
           indexing, or for all files inside the selected subtrees,
455
           independently of mime type.
456
569
457
   idxabsmlen
570
   idxabsmlen
458
571
459
           Recoll stores an abstract for each indexed file inside the
572
           Recoll stores an abstract for each indexed file inside the
460
           database. The text can come from an actual 'abstract' section in
573
           database. The text can come from an actual 'abstract' section in
...
...
494
           This lets you adjust the size of n-grams used for indexing CJK
607
           This lets you adjust the size of n-grams used for indexing CJK
495
           text. The default value of 2 is probably appropriate in most
608
           text. The default value of 2 is probably appropriate in most
496
           cases. A value of 3 would allow more precision and efficiency on
609
           cases. A value of 3 would allow more precision and efficiency on
497
           longer words, but the index will be approximately twice as large.
610
           longer words, but the index will be approximately twice as large.
498
611
612
   guesscharset
613
614
           Decide if we try to guess the character set of files if no
615
           internal value is available (ie: for plain text files). This does
616
           not work well in general, and should probably not be used.
617
499
7.4.2. The mimemap file
618
7.4.2. The mimemap file
500
619
501
   mimemap specifies the file name extension to mime type mappings.
620
   mimemap specifies the file name extension to mime type mappings.
502
621
503
   For file names without an extension, or with an unknown one, the system's
622
   For file names without an extension, or with an unknown one, the system's
...
...
547
   non-default entries, which will override those from the central
666
   non-default entries, which will override those from the central
548
   configuration file.
667
   configuration file.
549
668
550
   Please note that these entries must be placed under a [view] section.
669
   Please note that these entries must be placed under a [view] section.
551
670
671
   The keys in the file are normally mime types. You can add an application
672
   tag to specialize the choice for an area of the filesystem (using a
673
   localfields specification in mimeconf). The syntax for the key is
674
   mimetype|tag
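   A minimal sketch of how such a key could be resolved is given below. It
   assumes that an entry for mimetype|tag takes precedence and that the plain
   mime type entry is used as a fallback; the fallback, the pick_viewer name
   and the viewer commands in the sample table are all hypothetical.

```python
def pick_viewer(view, mimetype, tag=None):
    """Return the viewer for 'mimetype|tag' if present, else for mimetype."""
    if tag:
        tagged = view.get(f"{mimetype}|{tag}")
        if tagged is not None:
            return tagged
    return view.get(mimetype)


# Hypothetical [view] entries, for illustration only.
VIEW = {
    "message/rfc822": "some-mail-viewer %f",
    "message/rfc822|gnus": "some-gnus-wrapper %f",
}

if __name__ == "__main__":
    print(pick_viewer(VIEW, "message/rfc822", "gnus"))
    print(pick_viewer(VIEW, "message/rfc822"))
```

   Here the tag would come from an rclaptg value set through localfields for
   the directory that holds the document.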
675
552
   If Use desktop preferences to choose document editor is checked in the
676
   If Use desktop preferences to choose document editor is checked in the
553
   user preferences, all mimeview entries will be ignored except the one
677
   user preferences, all mimeview entries will be ignored except the one
554
   labelled application/x-all (which is set to use xdg-open by default).
678
   labelled application/x-all (which is set to use xdg-open by default).
555
679
556
7.4.5. Examples of configuration adjustments
680
7.4.5. Examples of configuration adjustments