a/src/README b/src/README
...
...
25
25
26
                1.2. Full text search
26
                1.2. Full text search
27
27
28
                1.3. Recoll overview
28
                1.3. Recoll overview
29
29
30
   2. Indexation
30
   2. Indexing
31
31
32
                2.1. Introduction
32
                2.1. Introduction
33
33
34
                2.2. Index storage
35
36
                             2.2.1. Security aspects
37
34
                2.2. The indexation configuration
38
                2.3. The indexing configuration
35
39
36
                2.3. Starting indexation
40
                2.4. Starting indexing
37
41
38
                2.4. Using cron to automate indexation
42
                2.5. Using cron to automate indexing
39
43
40
   3. Search
44
   3. Search
41
45
42
                3.1. Simple search
46
                3.1. Simple search
43
47
44
                3.2. Complex/advanced search
48
                3.2. Complex/advanced search
45
49
50
                3.3. Multiple databases
51
46
                3.3. Document history
52
                3.4. Document history
47
53
48
                3.4. Result list sorting
54
                3.5. Result list sorting
49
55
56
                3.6. Additional result list functionality
57
50
                3.5. Search tips, shortcuts
58
                3.7. Search tips, shortcuts
51
59
52
                3.6. Customising the search interface
60
                3.8. Customising the search interface
53
61
54
   4. Installation
62
   4. Installation
55
63
56
                4.1. Building from source
64
                4.1. Building from source
57
65
...
...
134
1.3. Recoll overview
142
1.3. Recoll overview
135
143
136
   Recoll uses the Xapian information retrieval library as its storage and
144
   Recoll uses the Xapian information retrieval library as its storage and
137
   retrieval engine. Xapian is a very mature package using a sophisticated
145
   retrieval engine. Xapian is a very mature package using a sophisticated
138
   probabilistic ranking model. Recoll provides the interface to get data
146
   probabilistic ranking model. Recoll provides the interface to get data
139
   into (indexation) and out (searching) of the system.
147
   into (indexing) and out (searching) of the system.
140
148
141
   In practice, Xapian works by remembering where terms appear in your
149
   In practice, Xapian works by remembering where terms appear in your
142
   document files. The acquisition process is called indexation.
150
   document files. The acquisition process is called indexing.
143
151
144
   The resulting database can be big (roughly the size of the original
152
   The resulting index can be big (roughly the size of the original document
145
   document set), but it is not a document archive. Recoll can only display
153
   set), but it is not a document archive. Recoll can only display documents
146
   documents that still exist at the place from which they were indexed.
154
   that still exist at the place from which they were indexed. (Actually,
147
   (Actually, there is a way to reconstruct a document from the information
155
   there is a way to reconstruct a document from the information in the
148
   in the database, but the result is not nice, as all formatting,
156
   index, but the result is not nice, as all formatting, punctuation and
149
   punctuation and capitalisation are lost).
157
   capitalisation are lost).
150
158
151
   Recoll stores all internal data in Unicode UTF-8 format, and it can index
159
   Recoll stores all internal data in Unicode UTF-8 format, and it can index
152
   files with different character sets, encodings, and languages into the
160
   files with different character sets, encodings, and languages into the
153
   same database. It has input filters for many document types.
161
   same index. It has input filters for many document types.
154
162
155
   Stemming depends on the document language. Recoll stores the unstemmed
163
   Stemming depends on the document language. Recoll stores the unstemmed
156
   versions of terms and uses auxiliary databases for term expansion. It can
164
   versions of terms and uses auxiliary databases for term expansion. It can
157
   switch stemming languages, or add a language, without reindexing. Storing
165
   switch stemming languages, or add a language, without reindexing. Storing
158
   documents in different languages in the same database is possible, and
166
   documents in different languages in the same index is possible, and useful
159
   useful in practice, but does introduce possibilities of confusion. Recoll
167
   in practice, but does introduce possibilities of confusion. Recoll
160
   currently makes no attempt at automatic language recognition.
168
   currently makes no attempt at automatic language recognition.
161
169
162
   Recoll has many parameters which define exactly what to index, and how to
170
   Recoll has many parameters which define exactly what to index, and how to
163
   classify and decode the source documents. These are kept in a
171
   classify and decode the source documents. These are kept in a
164
   configuration file. A default configuration is copied into a standard
172
   configuration file. A default configuration is copied into a standard
...
...
168
   by default in the .recoll subdirectory of your home directory. The default
176
   by default in the .recoll subdirectory of your home directory. The default
169
   configuration will index your home directory with default parameters and
177
   configuration will index your home directory with default parameters and
170
   should be sufficient for giving Recoll a try, but you may want to adjust
178
   should be sufficient for giving Recoll a try, but you may want to adjust
171
   it later.
179
   it later.
172
180
173
   Indexation is started automatically the first time you execute the recoll
181
   Indexing is started automatically the first time you execute the recoll
174
   search graphical user interface, or by executing the recollindex command.
182
   search graphical user interface, or by executing the recollindex command.
175
183
176
   Searches are performed inside the recoll program, which has many options
184
   Searches are performed inside the recoll program, which has many options
177
   to help you find what you are looking for.
185
   to help you find what you are looking for.
178
186
179
     ----------------------------------------------------------------------
187
     ----------------------------------------------------------------------
180
188
181
                             Chapter 2. Indexation
189
                              Chapter 2. Indexing
182
190
183
2.1. Introduction
191
2.1. Introduction
184
192
185
   Indexation is the process by which the set of documents is analyzed and
193
   Indexing is the process by which the set of documents is analyzed and the
186
   the data entered into the database. Recoll indexation is normally
194
   data entered into the database. Recoll indexing is normally incremental:
187
   incremental: documents will only be processed if they have been modified.
195
   documents will only be processed if they have been modified. On the first
188
   On the first execution, of course, all documents will need processing. A
196
   execution, of course, all documents will need processing. A full index
189
   full index build can be forced later on by specifying an option to the
197
   build can be forced later on by specifying an option to the indexing
190
   indexation command (recollindex -z).
198
   command (recollindex -z).
191
199
192
   Recoll indexation takes place at discrete times. There is currently no
200
   Recoll indexing takes place at discrete times. There is currently no
193
   interface to real time file modification monitors. The typical usage is to
201
   interface to real time file modification monitors. The typical usage is to
194
   have a nightly indexation run programmed into your cron file.
202
   have a nightly indexing run programmed into your cron file.
195
203
196
   +------------------------------------------------------------------------+
204
   +------------------------------------------------------------------------+
197
   | Side note: there is nothing in Recoll and Xapian that would prevent    |
205
   | Side note: there is nothing in Recoll and Xapian that would prevent    |
198
   | interfacing with a real time file modification monitor, but this would |
206
   | interfacing with a real time file modification monitor, but this would |
199
   | tend to consume significant system resources for dubious gain, because |
207
   | tend to consume significant system resources for dubious gain, because |
...
...
206
   for document types recognition and processing are set in configuration
214
   for document types recognition and processing are set in configuration
207
   files. Most file types, like HTML or word processing files, only hold one
215
   files. Most file types, like HTML or word processing files, only hold one
208
   document. Some file types, like mail folder files, can hold many
216
   document. Some file types, like mail folder files, can hold many
209
   individually indexed documents.
217
   individually indexed documents.
210
218
211
   Recoll indexation processes plain text, HTML, openoffice and e-mail files
219
   Recoll indexing processes plain text, HTML, openoffice and e-mail files
212
   internally. Other types (e.g. postscript, pdf, ms-word, rtf) need external
220
   internally. Other types (e.g. postscript, pdf, ms-word, rtf) need external
213
   applications for preprocessing. The list is in the installation section.
221
   applications for preprocessing. The list is in the installation section.
214
222
215
   Without further configuration, Recoll will index all appropriate files
223
   Without further configuration, Recoll will index all appropriate files
216
   from your home directory, with a reasonable set of defaults.
224
   from your home directory, with a reasonable set of defaults.
217
225
218
     ----------------------------------------------------------------------
226
     ----------------------------------------------------------------------
219
227
228
2.2. Index storage
229
230
   The default location for the index data is the $HOME/.recoll/xapiandb/
231
   directory. This can be changed by setting the RECOLL_CONFDIR environment
232
   variable, or by specifying the dbdir parameter in the configuration file
233
   (see the configuration section).
234
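   As a rough illustration of the default location rule above, the sketch
   below derives the index directory from the environment. The function name
   recoll_default_dbdir is hypothetical, and the sketch deliberately ignores
   the dbdir configuration parameter, which would take effect if it is set.

```shell
# Minimal sketch (not part of Recoll): print the default index directory.
# Assumption: no dbdir parameter is set in the configuration file.
recoll_default_dbdir() {
    # Use RECOLL_CONFDIR when it is set and non-empty, else $HOME/.recoll
    confdir=${RECOLL_CONFDIR:-$HOME/.recoll}
    printf '%s\n' "$confdir/xapiandb"
}
```

   For example, du -sh "$(recoll_default_dbdir)" would then report the size
   of the index under these assumptions.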
235
   The size of the index is determined by the size of the set of documents,
236
   but the ratio can vary a lot. For a typical mixed set of documents, the
237
   index size will often be close to the data set size. In specific cases (a
238
   set of compressed mbox files for example), the index can become much
239
   bigger than the documents. It may also be much smaller if the documents
240
   contain a lot of images or other non-indexed data (an extreme example
241
   being a set of mp3 files where only the tags would be indexed).
242
243
   Of course, images, sound and video do not increase the index size, which
244
   means that it will be quite typical nowadays (2006), that even a big index
245
   will be negligible against the total amount of data on the computer.
246
247
   The index data directory only contains data that will be rebuilt by an
248
   index run, so that it can be destroyed safely.
249
250
     ----------------------------------------------------------------------
251
252
  2.2.1. Security aspects
253
254
   The Recoll index does not hold copies of the indexed documents. But it
255
   does hold enough data to allow for an almost complete reconstruction. If
256
   confidential data is indexed, access to the database directory should be
257
   restricted.
258
259
   As of version 1.4, Recoll will create the configuration directory with a
260
   mode of 0700 (access by owner only). As the index directory is by default
261
   a subdirectory of the configuration directory, this should result in
262
   appropriate protection.
263
264
   If you use another setup, you should think of the kind of protection you
265
   need for your index, and set the directory access modes appropriately.
266
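   If the index lives outside the configuration directory, the access mode
   can be set by hand. The following is only a sketch: the function name
   protect_index_dir is hypothetical, and the directory you pass must be the
   one your own setup actually uses.

```shell
# Minimal sketch (not part of Recoll): create an index directory if needed
# and restrict it to its owner (mode 0700).
protect_index_dir() {
    dir=$1
    # Parent directories created by mkdir -p keep the default mode;
    # only the final directory is restricted here.
    mkdir -p "$dir" && chmod 0700 "$dir"
}
```

   Running ls -ld on the directory afterwards should show drwx------ as the
   permission string.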
267
     ----------------------------------------------------------------------
268
220
2.2. The indexation configuration
269
2.3. The indexing configuration
221
270
222
   Values set in the system-wide configuration file (named like
271
   Values set in the system-wide configuration file (named like
223
   /usr/[local/]share/recoll/examples/recoll.conf) can be overridden by those
272
   /usr/[local/]share/recoll/examples/recoll.conf) can be overridden by those
224
   set in the personal one, named $HOME/.recoll/recoll.conf by default or
273
   set in the personal one, named $HOME/.recoll/recoll.conf by default or
225
   $RECOLL_CONFDIR/recoll.conf if RECOLL_CONFDIR is set.
274
   $RECOLL_CONFDIR/recoll.conf if RECOLL_CONFDIR is set.
226
275
227
   The most accurate documentation for editing the file is given by comments
276
   The most accurate documentation for editing the file is given by comments
228
   inside the central one. If you want to adjust the configuration before
277
   inside the central one. If you want to adjust the configuration before
229
   indexation, just click Cancel when the program asks if it should start
278
   indexing, just click Cancel when the program asks if it should start
230
   initial indexation. This will have created a .recoll directory containing
279
   initial indexing. This will have created a .recoll directory containing
231
   empty configuration files.
280
   empty configuration files.
232
281
233
   The configuration is also documented inside the installation chapter of
282
   The configuration is also documented inside the installation chapter of
234
   this document, or in the recoll.conf(5) man page.
283
   this document, or in the recoll.conf(5) man page.
235
284
236
     ----------------------------------------------------------------------
285
     ----------------------------------------------------------------------
237
286
238
2.3. Starting indexation
287
2.4. Starting indexing
239
288
240
   Indexation is performed either by the recollindex program, or by the
289
   Indexing is performed either by the recollindex program, or by the
241
   indexation thread inside the recoll program (use the File menu).
290
   indexing thread inside the recoll program (use the File menu).
242
291
243
   If the recoll program finds no database when it starts, it will
292
   If the recoll program finds no index when it starts, it will automatically
244
   automatically start indexation (except if cancelled).
293
   start indexing (except if cancelled).
245
294
246
   It is best to avoid interrupting the indexation process, as this may
295
   It is best to avoid interrupting the indexing process, as this may
247
   sometimes leave the database in a bad state. This is not a serious
296
   sometimes leave the database in a bad state. This is not a serious
248
   problem, as you then just need to clear everything and restart the
297
   problem, as you then just need to clear everything and restart the
249
   indexation: the database files are normally stored in the
298
   indexing: the index files are normally stored in the
250
   $HOME/.recoll/xapiandb directory, which you can just delete if needed.
299
   $HOME/.recoll/xapiandb directory, which you can just delete if needed.
251
   Alternatively, you can start recollindex -z, which will reset the database
300
   Alternatively, you can start recollindex -z, which will reset the database
252
   before indexation.
301
   before indexing.
253
302
254
     ----------------------------------------------------------------------
303
     ----------------------------------------------------------------------
255
304
256
2.4. Using cron to automate indexation
305
2.5. Using cron to automate indexing
257
306
258
   The most common way to set up indexation is to have a cron task execute it
307
   The most common way to set up indexing is to have a cron task execute it
259
   every night. For example the following crontab entry would do it every day
308
   every night. For example the following crontab entry would do it every day
260
   at 3:30AM (supposing recollindex is in your PATH):
309
   at 3:30AM (supposing recollindex is in your PATH):
261
310
262
 30 3 * * * recollindex > /tmp/recolltrace 2>&1
311
 30 3 * * * recollindex > /tmp/recolltrace 2>&1
263
312
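   The crontab entry above assumes that recollindex is found in the PATH that
   cron provides, which is often shorter than an interactive one. A possible
   workaround is a small wrapper that records a clear message when the
   indexer is missing. This is a sketch only: the function name
   run_nightly_index and the script path in the comment are hypothetical.

```shell
# Minimal sketch (not part of Recoll): run the indexer and keep a trace.
# $1: indexer command (defaults to recollindex)
# $2: trace file (defaults to /tmp/recolltrace, as in the crontab entry)
run_nightly_index() {
    indexer=${1:-recollindex}
    logfile=${2:-/tmp/recolltrace}
    if ! command -v "$indexer" >/dev/null 2>&1; then
        echo "$indexer not found in PATH" > "$logfile"
        return 127
    fi
    # Same redirection as the crontab line: stdout and stderr to the trace.
    "$indexer" > "$logfile" 2>&1
}

# Hypothetical crontab entry calling a script that wraps this function:
# 30 3 * * * /path/to/recoll-nightly.sh
```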
...
...
333
   Click on the Show query details link at the top of the result page to see
382
   Click on the Show query details link at the top of the result page to see
334
   the query expansion.
383
   the query expansion.
335
384
336
     ----------------------------------------------------------------------
385
     ----------------------------------------------------------------------
337
386
387
3.3. Multiple databases
388
389
   Your Recoll configuration always defines a main index. This is what gets
390
   updated, for example, when you execute recollindex.
391
392
   You can use the search configuration tool to define additional databases
393
   to be searched. These databases can be made active or inactive at any
394
   moment.
395
396
   The typical use of this feature is for a system administrator to set up a
397
   central index, that you may choose to search, or not, in addition to your
398
   personal data. Of course, there are other possibilities.
399
400
   The main index (defined by your personal configuration) is always active.
401
402
   The list of searchable databases may also be defined by the
403
   RECOLL_EXTRA_DBS environment variable. This should hold a colon-separated
404
   list of index directories, e.g.:
405
406
 export RECOLL_EXTRA_DBS=/some/place/xapiandb:/some/other/db
407
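   When the list is assembled from several candidate locations, it may be
   convenient to keep only the directories that exist. The helper below is a
   sketch with a hypothetical name (build_extra_dbs). It only joins paths
   with colons and does not check that a directory really holds an index.

```shell
# Minimal sketch (not part of Recoll): join existing directories with ':'.
build_extra_dbs() {
    list=
    for dir in "$@"; do
        # Skip anything that is not an existing directory.
        [ -d "$dir" ] || continue
        # Prepend a colon separator only when the list is already non-empty.
        list=${list:+$list:}$dir
    done
    printf '%s\n' "$list"
}

# Possible use, with the example paths from the text above:
# RECOLL_EXTRA_DBS=$(build_extra_dbs /some/place/xapiandb /some/other/db)
# export RECOLL_EXTRA_DBS
```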
408
     ----------------------------------------------------------------------
409
338
3.3. Document history
410
3.4. Document history
339
411
340
   Documents that you actually view (with the internal preview or an external
412
   Documents that you actually view (with the internal preview or an external
341
   tool) are entered into the document history, which is remembered. You can
413
   tool) are entered into the document history, which is remembered. You can
342
   display the history list by using the Tools/Doc History menu entry.
414
   display the history list by using the Tools/Doc History menu entry.
343
415
344
     ----------------------------------------------------------------------
416
     ----------------------------------------------------------------------
345
417
346
3.4. Result list sorting
418
3.5. Result list sorting
347
419
348
   The documents in a result list are normally sorted in order of relevance.
420
   The documents in a result list are normally sorted in order of relevance.
349
   It is possible to specify different sort parameters by using the Sort
421
   It is possible to specify different sort parameters by using the Sort
350
   parameters dialog (located in the Tools menu).
422
   parameters dialog (located in the Tools menu).
351
423
...
...
357
   the program exits. An activated sort is indicated in the result list
429
   the program exits. An activated sort is indicated in the result list
358
   header.
430
   header.
359
431
360
     ----------------------------------------------------------------------
432
     ----------------------------------------------------------------------
361
433
434
3.6. Additional result list functionality
435
436
   Apart from the preview and edit links, you can display a popup menu by
437
   right-clicking over a paragraph in the result list. This menu has the
438
   following entries:
439
440
     * Preview
441
442
     * Edit
443
444
     * Copy File Name
445
446
     * Copy Url
447
448
     * More like this
449
450
   The Preview and Edit entries do the same thing as the corresponding links.
451
   The next two entries copy either the file path or the URL to the
452
   clipboard, for pasting into another application.
453
454
   The More like this entry will select a number of relevant terms from the
455
   current document and enter them into the simple search field. You can then
456
   start a simple search, with a good chance of finding documents related to
457
   the current result.
458
459
     ----------------------------------------------------------------------
460
362
3.5. Search tips, shortcuts
461
3.7. Search tips, shortcuts
363
462
364
   Disabling stem expansion. Entering a capitalized word in any search field
463
   Disabling stem expansion. Entering a capitalized word in any search field
365
   will prevent stem expansion (no search for gardening if you enter Garden
464
   will prevent stem expansion (no search for gardening if you enter Garden
366
   instead of garden). This is the only case where character case should make
465
   instead of garden). This is the only case where character case should make
367
   a difference for a Recoll search.
466
   a difference for a Recoll search.
...
...
369
   Phrases. A phrase can be looked for by enclosing it in double quotes.
468
   Phrases. A phrase can be looked for by enclosing it in double quotes.
370
   Example: "user manual" will look only for occurrences of user immediately
469
   Example: "user manual" will look only for occurrences of user immediately
371
   followed by manual. You can use the This exact phrase field of the
470
   followed by manual. You can use the This exact phrase field of the
372
   advanced search dialog to the same effect.
471
   advanced search dialog to the same effect.
373
472
473
   Term completion. Typing ^TAB (Control+Tab) in the simple search entry
474
   field while entering a word will either complete the current word if its
475
   beginning matches a unique term in the index, or open a window to propose
476
   a list of completions.
477
478
   Picking up new terms for search from displayed documents. Double-clicking
479
   on a word in the result list or in a preview window will copy it to the
480
   simple search entry field.
481
482
   Finding related documents. Selecting the More like this entry in the
483
   result list paragraph right-click menu will select a set of "interesting"
484
   terms from the current result, and insert them into the simple search
485
   entry field. You can then possibly edit the list and start a search to
486
   find documents which may be related to the current result.
487
374
   Query explanation. You can get an exact description of what the query
488
   Query explanation. You can get an exact description of what the query
375
   looked for, including stem expansion, and boolean operators used, by
489
   looked for, including stem expansion, and boolean operators used, by
376
   clicking on the result list header.
490
   clicking on the result list header.
377
491
378
   File names. All file name elements (the broken up file path) are entered
492
   File names. File names are added as terms during indexing, and you can
379
   as terms during indexation, and you can specify them as ordinary terms in
493
   specify them as ordinary terms in normal search fields (Recoll used to
380
   normal search fields. Alternatively, you can use specific file name search
494
   index all directories in the file path as terms. This has been abandoned
495
   as it did not seem really useful). Alternatively, you can use specific
381
   which will only look for file names and can use wildcard expansion.
496
   file name search which will only look for file names and can use wildcard
497
   expansion.
382
498
383
   Quitting. Entering ^Q almost anywhere will close the application.
499
   Quitting. Entering ^Q almost anywhere will close the application.
384
500
385
   Closing previews. Entering ^W in a preview tab will close it (and, for the
501
   Closing previews. Entering ^W in a preview tab will close it (and, for the
386
   last tab, close the preview window).
502
   last tab, close the preview window).
387
503
388
     ----------------------------------------------------------------------
504
     ----------------------------------------------------------------------
389
505
390
3.6. Customising the search interface
506
3.8. Customising the search interface
391
507
392
   It is possible to customise some aspects of the search interface by using
508
   It is possible to customise some aspects of the search interface by using
393
   the Query configuration entry in the Preferences menu.
509
   the Query configuration entry in the Preferences menu.
394
510
395
   There are two tabs in the dialog, dealing with the interface itself, and
511
   There are two tabs in the dialog, dealing with the interface itself, and
...
...
402
     * Result list font: There is quite a lot of information shown in the
518
     * Result list font: There is quite a lot of information shown in the
403
       result list, and you may want to customise the font and/or font size.
519
       result list, and you may want to customise the font and/or font size.
404
       The rest of the fonts used by Recoll are determined by your generic QT
520
       The rest of the fonts used by Recoll are determined by your generic QT
405
       config (try the qtconfig command).
521
       config (try the qtconfig command).
406
522
407
     * Html help browser: this will let you chose your the preferred browser
523
     * Html help browser: this will let you choose your preferred browser
408
       which will be started from the Help menu to read the user manual. You
524
       which will be started from the Help menu to read the user manual. You
409
       can enter a simple name if the command is in your PATH, or browse for
525
       can enter a simple name if the command is in your PATH, or browse for
410
       a full pathname.
526
       a full pathname.
411
527
412
     * Show document type icons in result list: icons in the result list can
528
     * Show document type icons in result list: icons in the result list can
413
       be turned off. They take quite a lot of space and convey relatively
529
       be turned off. They take quite a lot of space and convey relatively
414
       little useful information.
530
       little useful information.
531
532
     * Auto-start simple search on whitespace entry: if this is checked, a
533
       search will be executed each time you enter a space in the simple
534
       search input field. This lets you look at the result list as you enter
535
       new terms. This is off by default, you may like it or not...
415
536
416
   Search parameters:
537
   Search parameters:
417
538
418
     * Stemming language: stemming obviously depends on the document's
539
     * Stemming language: stemming obviously depends on the document's
419
       language. This listbox will let you choose among the stemming databases
540
       language. This listbox will let you choose among the stemming databases
420
       which were built during indexing (this is set in the main
541
       which were built during indexing (this is set in the main
421
       configuration file), or later added with recollindex -s (See the
542
       configuration file), or later added with recollindex -s (See the
422
       recollindex manual). Stemming languages which are dynamically added
543
       recollindex manual). Stemming languages which are dynamically added
423
       will be deleted at the next indexation pass unless they are also added
544
       will be deleted at the next indexing pass unless they are also added
424
       in the configuration file.
545
       in the configuration file.
425
546
426
     * Dynamically build abstracts: this decides if Recoll tries to build
547
     * Dynamically build abstracts: this decides if Recoll tries to build
427
       document abstracts when displaying the result list. Abstracts are
548
       document abstracts when displaying the result list. Abstracts are
428
       constructed by taking context from the document information, around
549
       constructed by taking context from the document information, around
...
...
431
552
432
     * Replace abstracts from documents: this decides if we should synthesize
553
     * Replace abstracts from documents: this decides if we should synthesize
433
       and display an abstract in place of an explicit abstract found within
554
       and display an abstract in place of an explicit abstract found within
434
       the document itself.
555
       the document itself.
435
556
557
   Extra databases:
558
559
   This panel will let you browse for additional databases that you may want
560
   to search. Extra databases are designated by their database directory (e.g.
561
   /home/someothergui/.recoll/xapiandb, /usr/local/recollglobal/xapiandb).
562
563
   Once entered, the databases will appear in the All extra databases list,
564
   and you can choose which ones you want to use at any moment by transferring
565
   them to/from the Active extra databases list.
566
567
   Your main database (the one the current configuration indexes to) is
568
   always implicitly active. If this is not desirable, you can set up your
569
   configuration so that it indexes, for example, an empty directory.
570
436
     ----------------------------------------------------------------------
571
     ----------------------------------------------------------------------
437
572
438
                            Chapter 4. Installation
573
                            Chapter 4. Installation
439
574
440
4.1. Building from source
575
4.1. Building from source
441
576
442
  4.1.1. Prerequisites
577
  4.1.1. Prerequisites
443
578
444
   At the very least, you will need to download and install the xapian core
579
   At the very least, you will need to download and install the xapian core
445
   package (Recoll currently uses version 0.9.2), and the qt runtime and
580
   package (Recoll development currently uses version 0.9.5), and the qt
446
   development packages (Recoll development currently uses version 3.3.5, but
581
   runtime and development packages (Recoll development currently uses
447
   any 3.3 version is probably ok).
582
   version 3.3.5, but any 3.3 version is probably ok).
448
583
449
   You will most probably be able to find a binary package for qt for your
584
   You will most probably be able to find a binary package for qt for your
450
   system. You may have to compile Xapian but this is not difficult (if you
585
   system. You may have to compile Xapian but this is not difficult (if you
451
   are using FreeBSD, there is a port).
586
   are using FreeBSD, there is a port).
452
587
...
...
561
696
562
   There are two sets of configuration files. The system-wide files are kept
697
   There are two sets of configuration files. The system-wide files are kept
563
   in a directory named like /usr/[local/]share/recoll/examples, they define
698
   in a directory named like /usr/[local/]share/recoll/examples, they define
564
   default values for the system. A parallel set of files exists in the
699
   default values for the system. A parallel set of files exists in the
565
   .recoll directory in your home (this can be changed with the
700
   .recoll directory in your home (this can be changed with the
566
   RECOLL_CONFDIR environment variable. The database is also kept in .recoll
701
   RECOLL_CONFDIR environment variable).
567
   by default, (this can be changed by a configuration parameter).
568
702
    If the .recoll directory does not exist when recoll or recollindex are
    started, it will be created with a set of empty configuration files.
    recoll will give you a chance to edit the configuration file before
-   starting indexation. recollindex will proceed immediately.
+   starting indexing. recollindex will proceed immediately.

    Most of the parameters specific to the recoll GUI are set through the
    Preferences menu and stored in the standard QT place ($HOME/.qt/recollrc).
    You probably do not want to edit this by hand.

...
      * Parameter assignment (name = value).

      * Section definition ([somedirname]).

    Section lines allow redefining some parameters for a directory subtree.
-   Some of the parameters used for indexation are looked up hierarchically
-   from the more to the less specific. Not all parameters can be meaningfully
+   Some of the parameters used for indexing are looked up hierarchically from
+   the more to the less specific. Not all parameters can be meaningfully
    redefined, this is specified for each in the next section.

    The tilde character (~) is expanded in file names to the name of the
    user's home directory.

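The file format and the "more to less specific" lookup described above can be sketched in Python as follows. This is a minimal illustration, not Recoll's parser: the function names parse_conf and lookup, the handling of '#' comment lines, the sample values, and the idea that defaultcharset may be redefined per section are all assumptions made for the example.

```python
import os


def parse_conf(text):
    """Parse 'name = value' lines and '[dir]' sections into nested dicts."""
    conf = {"": {}}  # "" holds the top-level (global) parameters
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # '#' comments are an assumption
            continue
        if line.startswith("[") and line.endswith("]"):
            # Section definition: expand a leading ~ and normalise the path.
            section = os.path.normpath(os.path.expanduser(line[1:-1].strip()))
            conf.setdefault(section, {})
        elif "=" in line:
            name, value = line.split("=", 1)
            conf[section][name.strip()] = value.strip()
    return conf


def lookup(conf, name, path):
    """Return the value of name for path, most specific section first."""
    current = os.path.normpath(path)
    while True:
        if name in conf.get(current, {}):
            return conf[current][name]
        parent = os.path.dirname(current)
        if parent == current:  # reached the top, nothing more specific left
            break
        current = parent
    return conf[""].get(name)  # fall back to the global value


# Hypothetical sample: a global value and an override for one subtree.
sample = """
defaultcharset = iso-8859-1
[/home/me/mail]
defaultcharset = utf-8
"""
conf = parse_conf(sample)
```

With this sample, a file below /home/me/mail picks up the section value, while any other path falls back to the global one.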
...
    recoll.conf is the main configuration file. It defines things like what to
    index (top directories and things to ignore), and the default character
    set to use for document types which do not specify it internally.

    The default configuration will index your home directory. If this is not
-   appropriate, use recoll to copy the sample configuration, click Cancel,
+   appropriate, start recoll to create a blank configuration, click Cancel,
    and edit the configuration file before restarting the command. This will
-   start the initial indexation, which may take some time.
+   start the initial indexing, which may take some time.

    Parameters:

    topdirs

            Specifies the list of directories or files to index (recursively
            for directories). The indexer will not follow symbolic links
            inside the indexed trees. If an entry in the topdirs list is a
-           symbolic link, indexation will not start and will generate an
-           error.
+           symbolic link, indexing will not start and will generate an error.

    skippedNames

            A space-separated list of patterns for names of files or
            directories that should be completely ignored. The list defined in
...
            Verbosity level for recoll and recollindex. A value of 4 lists
            quite a lot of debug/information messages. 2 only lists errors.

    logfilename

-           Where should the messages go. 'stderr' can be used as a special
-           value.
+           Where the messages should go. 'stderr' can be used as a special
+           value, and is the default.

    filtersdir

            A directory to search for the external filter scripts used to
            index some types of files. The value should not be changed, except
...
    indexstemminglanguages

            A list of languages for which the stem expansion databases will be
            built. See recollindex(1) for possible values. You can add a stem
            expansion database for a different language by using recollindex
-           -s, but it will be deleted during the next indexation. Only
+           -s, but it will be deleted during the next indexing. Only
            languages listed in the configuration file are permanent.

    iconsdir

            The name of the directory where recoll result list icons are
            stored. You can change this if you want different images.

    dbdir

-           The name of the Xapian database directory. It will be created if
-           needed when the database is initialized.
+           The name of the Xapian data directory. It will be created if
+           needed when the index is initialized.

    defaultcharset

            The name of the character set used for files that do not contain a
            character set definition (ie: plain text files). This can be
...

            Decide if we use the file -i system command as a final step for
            determining the mime type for a file (the main procedure uses
            suffix associations as defined in the mimemap file). This can be
            useful for files with suffixless names, but it will also cause the
-           indexation of many bogus "text" files.
+           indexing of many bogus "text" files.

    indexallfilenames

            Recoll indexes file names in a special section of the database to
            allow specific file names searches using wild cards. This
            parameter decides if file name indexing is performed only for
            files with mime types that would qualify them for full text
-           indexation, or for all files inside the selected subtrees,
+           indexing, or for all files inside the selected subtrees,
            independent of mime type.

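The topdirs entry earlier in this parameter list states that indexing will not start if one of the listed entries is a symbolic link. A small Python sketch of a pre-flight check for that condition is shown below. It is only an illustration: the function name symlinked_topdirs is hypothetical, and splitting the value on whitespace is an assumption about how the list is written.

```python
import os


def symlinked_topdirs(topdirs):
    """Return the entries of a topdirs value that are symbolic links."""
    offenders = []
    for entry in topdirs.split():  # assumption: whitespace-separated entries
        path = os.path.expanduser(entry)
        # Drop trailing slashes so the entry itself is tested, not its target.
        path = path.rstrip(os.sep) or os.sep
        if os.path.islink(path):
            offenders.append(entry)
    return offenders
```

An empty result means none of the listed entries is a symbolic link. Links found deeper inside the indexed trees are a separate matter, since the text says the indexer simply does not follow them.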
      ----------------------------------------------------------------------

   4.4.2. The mimemap file
...
    mimemap specifies the file name extension to mime type mappings.

    For file names without an extension, or with an unknown one, the system's
    file -i command will be executed to determine the mime type (this can be
    switched off inside the main configuration file).
-
-   mimemap also has a list of extensions which should be ignored totally (to
-   avoid losing time by executing file for things that certainly should not
-   be indexed).

    The mappings can be specified on a per-subtree basis, which may be useful
    in some cases. Example: gaim logs have a .txt extension but should be
    handled specially, which is possible because they are usually all located
    in one place.
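
The two-step identification described above (suffix mapping first, then the file -i command) might look roughly like this in Python. This is a sketch under stated assumptions, not Recoll's implementation: the function name mime_type, the dictionary form of the mapping and the use_file_command flag are all hypothetical, and the parsing assumes file -i prints "name: type; charset=..." as it does on common Linux systems.

```python
import os
import subprocess


def mime_type(path, mimemap, use_file_command=True):
    """Return a mime type for path: suffix mapping first, then `file -i`."""
    suffix = os.path.splitext(path)[1]
    if suffix in mimemap:
        return mimemap[suffix]
    if not use_file_command:
        return None
    # Fallback for missing or unknown suffixes: ask the system's file command.
    try:
        out = subprocess.run(["file", "-i", path], capture_output=True,
                             text=True, check=True).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    # Assumed output shape: "name: text/plain; charset=us-ascii"
    return out.rsplit(": ", 1)[-1].split(";", 1)[0].strip() or None
```

A per-subtree variant could choose a different mapping dictionary depending on the directory that contains the file before calling the function.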
...

      ----------------------------------------------------------------------

   4.4.3. The mimeconf file

-   mimeconf specifies how the different mime types are handled for
-   indexation, and for display.
+   mimeconf specifies how the different mime types are handled for indexing,
+   and for display.

-   Changing the indexation parameters is probably not a good idea except if
-   you are a Recoll developer.
+   Changing the indexing parameters is probably not a good idea except if you
+   are a Recoll developer.

    You may want to adjust the external viewers defined in (ie: html is either
    previewed internally or displayed using firefox, but you may prefer
    mozilla, your openoffice.org program might be named oofice instead of
    openoffice ...). Look for the [view] section.