a/src/README b/src/README
...
...
43
43
44
   3. Search
44
   3. Search
45
45
46
                3.1. Simple search
46
                3.1. Simple search
47
47
48
                3.2. The result list
49
50
                             3.2.1. The result list right-click menu
51
52
                3.3. The preview window
53
48
                3.2. Complex/advanced search
54
                3.4. Complex/advanced search
49
55
50
                3.3. Multiple databases
56
                3.5. Multiple databases
51
57
52
                3.4. Document history
58
                3.6. Document history
53
59
54
                3.5. Result list sorting
60
                3.7. Sorting search results
55
61
56
                3.6. Additional result list functionality
57
58
                3.7. Search tips, shortcuts
62
                3.8. Search tips, shortcuts
59
63
60
                3.8. Customising the search interface
64
                3.9. Customising the search interface
61
65
62
   4. Installation
66
   4. Installation
63
67
68
                4.1. Installing a prebuilt copy
69
70
                             4.1.1. Installing through a package system
71
72
                             4.1.2. Installing a prebuilt Recoll
73
64
                4.1. Building from source
74
                4.2. Building from source
65
75
66
                             4.1.1. Prerequisites
76
                             4.2.1. Prerequisites
67
77
68
                             4.1.2. Building
78
                             4.2.2. Building
69
79
70
                             4.1.3. Installation
80
                             4.2.3. Installation
71
72
                4.2. Installing a prebuilt copy
73
74
                             4.2.1. Installing through a package system
75
76
                             4.2.2. Installing a prebuilt Recoll
77
81
78
                4.3. Packages needed for external file types
82
                4.3. Packages needed for external file types
79
83
80
                4.4. Configuration overview
84
                4.4. Configuration overview
81
85
...
...
91
95
92
1.1. Giving it a try
96
1.1. Giving it a try
93
97
94
   If you do not like reading manuals (who does?) and would like to give
98
   If you do not like reading manuals (who does?) and would like to give
95
   Recoll a try, just perform installation and start the recoll user
99
   Recoll a try, just perform installation and start the recoll user
96
   interface, which will index your home directory and let you search it
100
   interface, which will index your home directory by default, allowing you
97
   right after.
101
   to search immediately after indexing completes.
98
102
99
   Do not do this if your home has a huge number of documents and you do not
103
   Do not do this if your home directory contains a huge number of documents
100
   want to wait or are very short on disk space. In this case, you may want
104
   and you do not want to wait or are very short on disk space. In this case,
101
   to edit the configuration file first to restrict the indexed area.
105
   you may want to edit the configuration file first to restrict the indexed
106
   area.
102
107
103
   Also be aware that you will need to install the appropriate supporting
108
   Also be aware that you may need to install the appropriate supporting
104
   applications for document types that need them (for example antiword for
109
   applications for document types that need them (for example antiword for
105
   ms-word files).
110
   ms-word files).
106
111
107
     ----------------------------------------------------------------------
112
     ----------------------------------------------------------------------
108
113
...
...
115
   return a list of matching documents, ordered so that the most relevant
120
   return a list of matching documents, ordered so that the most relevant
116
   documents will appear first.
121
   documents will appear first.
117
122
118
   You do not need to remember in what file or email message you stored a
123
   You do not need to remember in what file or email message you stored a
119
   given piece of information. You just ask for related terms, and the tool
124
   given piece of information. You just ask for related terms, and the tool
120
   will return a list of documents where those terms are prominent.
125
   will return a list of documents where those terms are prominent, in a
126
   similar way to internet search engines.
121
127
122
   This mode of operation has been made very familiar by internet search
128
   Recoll tries to determine which documents are most relevant to the search
123
   engines.
129
   terms you provide. Computer algorithms for determining relevance can be
124
130
   very complex, and in general are inferior to the power of the human mind
125
   The notion of relevance is a difficult one, as only you, the user,
131
   to rapidly determine relevance. The quality of relevance guessing by the
126
   actually know which documents are relevant to your search, and the
132
   search tool is probably the most important element for a search
127
   application can only try a guess. The quality of this guess is probably
133
   application.
128
   the most important element for a search application.
129
134
130
   In many cases, you are looking for all the forms of a word, not for a
135
   In many cases, you are looking for all the forms of a word, not for a
131
   specific form or spelling. These different forms may include plurals,
136
   specific form or spelling. These different forms may include plurals,
132
   different tenses for a verb, or terms derived from the same root or stem
137
   different tenses for a verb, or terms derived from the same root or stem
133
   (example: floor, floors, floored, floorings...). Recoll will by default
138
   (example: floor, floors, floored, floorings...). Recoll will by default
134
   expand queries to all such related terms (words that reduce to the same
139
   expand queries to all such related terms (words that reduce to the same
135
   stem). This expansion can be disabled at search time.
140
   stem). This expansion can be disabled at search time.
136
141
137
   Stemming, by itself, does not provide for misspellings or phonetic
142
   Stemming, by itself, does not accommodate misspellings or phonetic
138
   searches. Recoll currently does not support these.
143
   searches. Recoll currently does not support these features.
139
144
140
     ----------------------------------------------------------------------
145
     ----------------------------------------------------------------------
141
146
142
1.3. Recoll overview
147
1.3. Recoll overview
143
148
...
...
200
   Recoll indexing takes place at discrete times. There is currently no
205
   Recoll indexing takes place at discrete times. There is currently no
201
   interface to real time file modification monitors. The typical usage is to
206
   interface to real time file modification monitors. The typical usage is to
202
   have a nightly indexing run programmed into your cron file.
207
   have a nightly indexing run programmed into your cron file.
203
208
204
   +------------------------------------------------------------------------+
209
   +------------------------------------------------------------------------+
205
   | Side note: there is nothing in Recoll and Xapian that would prevent    |
210
   | There is nothing in Recoll and Xapian that would prevent interfacing   |
206
   | interfacing with a real time file modification monitor, but this would |
211
   | with a real time file modification monitor, but this would tend to     |
207
   | tend to consume significant system resources for dubious gain, because |
212
   | consume significant system resources for dubious gain, because you     |
208
   | you rarely need a full text search to find documents you just          |
213
   | rarely need a full text search to find documents you just modified.    |
209
   | modified. recollindex -i can be used to add individual files to the    |
214
   | recollindex -i can be used to add individual files to the index if you |
210
   | index if you want to play with this, see the manual page.              |
215
   | want to play with this, see the manual page.                           |
211
   +------------------------------------------------------------------------+
216
   +------------------------------------------------------------------------+
212
217
213
   Recoll knows about quite a few different document types. The parameters
218
   Recoll knows about quite a few different document types. The parameters
214
   for document types recognition and processing are set in configuration
219
   for document types recognition and processing are set in configuration
215
   files. Most file types, like HTML or word processing files, only hold one
220
   files. Most file types, like HTML or word processing files, only hold one
...
...
220
   internally. Other types (ie: postscript, pdf, ms-word, rtf) need external
225
   internally. Other types (ie: postscript, pdf, ms-word, rtf) need external
221
   applications for preprocessing. The list is in the installation section.
226
   applications for preprocessing. The list is in the installation section.
222
227
223
   Without further configuration, Recoll will index all appropriate files
228
   Without further configuration, Recoll will index all appropriate files
224
   from your home directory, with a reasonable set of defaults.
229
   from your home directory, with a reasonable set of defaults.
230
231
   In some cases, it may be interesting to index different areas of the file
232
   system to separate databases. You can do this by using multiple
233
   configuration directories, each indexing a file system area to a specific
234
   database. You would use the RECOLL_CONFDIR environment variable or the -c
235
   confdir option to recollindex to indicate which configuration to process.
236
   The recoll search program can use any selection of the existing databases
237
   for each search; this is configurable inside the user interface.
225
238
226
     ----------------------------------------------------------------------
239
     ----------------------------------------------------------------------
227
240
228
2.2. Index storage
241
2.2. Index storage
229
242
...
...
242
255
243
   Of course, images, sound and video do not increase the index size, which
256
   Of course, images, sound and video do not increase the index size, which
244
   means that it will be quite typical nowadays (2006), that even a big index
257
   means that it will be quite typical nowadays (2006), that even a big index
245
   will be negligible against the total amount of data on the computer.
258
   will be negligible against the total amount of data on the computer.
246
259
247
   The index data directory only contains data that will be rebuilt by an
260
   The index data directory (xapiandb) only contains data that will be
248
   index run, so that it can be destroyed safely.
261
   rebuilt by an index run, and it can always be destroyed safely.
249
262
250
     ----------------------------------------------------------------------
263
     ----------------------------------------------------------------------
251
264
252
  2.2.1. Security aspects
265
  2.2.1. Security aspects
253
266
...
...
255
   does hold enough data to allow for an almost complete reconstruction. If
268
   does hold enough data to allow for an almost complete reconstruction. If
256
   confidential data is indexed, access to the database directory should be
269
   confidential data is indexed, access to the database directory should be
257
   restricted.
270
   restricted.
258
271
259
   As of version 1.4, Recoll will create the configuration directory with a
272
   As of version 1.4, Recoll will create the configuration directory with a
260
   mode of 0700 (access by owner only). As the index directory is by default
273
   mode of 0700 (access by owner only). As the index data directory is by
261
   a subdirectory of the configuration directory, this should result in
274
   default a subdirectory of the configuration directory, this should result
262
   appropriate protection.
275
   in appropriate protection.
263
276
264
   If you use another setup, you should think of the kind of protection you
277
   If you use another setup, you should think of the kind of protection you
265
   need for your index, and set the directory access modes appropriately.
278
   need for your index, and set the directory and file access modes
279
   appropriately.
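   As a rough sketch of the last point (not taken from the Recoll
   documentation), the following shell function removes all group and other
   permissions from an index directory and its contents. The function name
   protect_index_dir and the example path are hypothetical.

```shell
# Hypothetical helper: restrict an index directory to its owner.
protect_index_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Drop every group/other permission bit, recursively.
    chmod -R go-rwx "$dir"
}

# Example (the path is an assumption, adjust it to your setup):
# protect_index_dir /some/place/xapiandb
```

   Removing bits with go-rwx, rather than setting a fixed mode, leaves the
   owner's own permissions on each file untouched.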
266
280
267
     ----------------------------------------------------------------------
281
     ----------------------------------------------------------------------
268
282
269
2.3. The indexing configuration
283
2.3. The indexing configuration
270
284
...
...
280
   empty configuration files.
294
   empty configuration files.
281
295
282
   The configuration is also documented inside the installation chapter of
296
   The configuration is also documented inside the installation chapter of
283
   this document, or in the recoll.conf(5) man page.
297
   this document, or in the recoll.conf(5) man page.
284
298
299
   The applications needed to index file types other than text, html or email
300
   (ie: pdf, postscript, ms-word...) are described in the external packages
301
   section.
302
285
     ----------------------------------------------------------------------
303
     ----------------------------------------------------------------------
286
304
287
2.4. Starting indexing
305
2.4. Starting indexing
288
306
289
   Indexing is performed either by the recollindex program, or by the
307
   Indexing is performed either by the recollindex program, or by the
290
   indexing thread inside the recoll program (use the File menu).
308
   indexing thread inside the recoll program (use the File menu). Both
309
   programs will use the RECOLL_CONFDIR variable or accept a -c confdir
310
   option to specify the configuration directory to be used.
291
311
292
   If the recoll program finds no index when it starts, it will automatically
312
   If the recoll program finds no index when it starts, it will automatically
293
   start indexing (except if cancelled).
313
   start indexing (except if cancelled).
294
314
295
   It is best to avoid interrupting the indexing process, as this may
315
   It is best to avoid interrupting the indexing process, as this may
296
   sometimes leave the database in a bad state. This is not a serious
316
   sometimes leave the index in a bad state. This is not a serious problem,
297
   problem, as you then just need to clear everything and restart the
317
   as you then just need to clear everything and restart the indexing: the
298
   indexing: the index files are normally stored in the
318
   index files are normally stored in the $HOME/.recoll/xapiandb directory,
299
   $HOME/.recoll/xapiandb directory, which you can just delete if needed.
319
   which you can just delete if needed. Alternatively, you can start
300
   Alternatively, you can start recollindex -z, which will reset the database
320
   recollindex with option -z, which will reset the database before indexing.
301
   before indexing.
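   The two recovery options above could be wrapped as follows. This is only
   a sketch: the helper name recoll_index_dir is hypothetical, and it assumes
   the default layout described in this document, where the index lives in
   the xapiandb subdirectory of the configuration directory ($HOME/.recoll
   unless RECOLL_CONFDIR is set).

```shell
# Hypothetical helper: print the index directory for the active configuration,
# assuming the default layout (xapiandb inside the configuration directory).
recoll_index_dir() {
    echo "${RECOLL_CONFDIR:-$HOME/.recoll}/xapiandb"
}

echo "index directory: $(recoll_index_dir)"

# Option 1: delete the index files by hand, then run the indexer again.
#   rm -rf "$(recoll_index_dir)"
#   recollindex
# Option 2: let recollindex reset the index before indexing.
#   recollindex -z
```

   The destructive commands are left commented out on purpose; check the
   printed path before removing anything.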
302
321
303
     ----------------------------------------------------------------------
322
     ----------------------------------------------------------------------
304
323
305
2.5. Using cron to automate indexing
324
2.5. Using cron to automate indexing
306
325
...
...
337
   with any of the search terms (the ones with more terms will get better
356
   with any of the search terms (the ones with more terms will get better
338
   scores). All terms will ensure that only documents with all the terms will
357
   scores). All terms will ensure that only documents with all the terms will
339
   be returned. File name will specifically look for file names, and allows
358
   be returned. File name will specifically look for file names, and allows
340
   using wildcards (*, ? , []).
359
   using wildcards (*, ? , []).
341
360
361
   You can search for exact phrases (adjacent words in a given order) by
362
   enclosing the input inside double quotes. Ex: "virtual reality".
363
364
   Character case has no influence on search, except that you can disable
365
   stem expansion for any term by capitalizing it. Ie: a search for floor
366
   will also normally look for flooring, floored, etc., but a search for
367
   Floor will only look for floor, in any character case (stemming can also
368
   be disabled globally in the preferences).
369
342
   Recoll remembers the last few searches that you performed. You can use the
370
   Recoll remembers the last few searches that you performed. You can use the
343
   simple search text entry widget (a combobox) to recall them (click on the
371
   simple search text entry widget (a combobox) to recall them (click on the
344
   thing at the right of the text field). Please note, however, that only the
372
   thing at the right of the text field). Please note, however, that only the
345
   search texts are remembered, not the mode (all/any/filename).
373
   search texts are remembered, not the mode (all/any/filename).
346
374
375
   Hitting ^Tab (Ctrl + Tab) while entering a word in the simple search entry
376
   will open a window with possible completions for the word. The completions
377
   are extracted from the database.
378
379
   Double-clicking on a word in the result list or a preview window will
380
   insert it into the simple search entry field.
381
347
   You can use the Tools / Advanced search dialog for more complex searches.
382
   You can use the Tools / Advanced search dialog for more complex searches.
348
383
384
     ----------------------------------------------------------------------
385
386
3.2. The result list
387
349
   After starting a search, a list of results will instantly be displayed in
388
   After starting a search, a list of results will instantly be displayed in
350
   the main list window. Clicking on the Preview link for an entry will open
389
   the main list window.
351
   an internal preview window for the document. Clicking the Edit link will
352
   attempt to start an external viewer (have a look at the mimeconf
353
   configuration file to see how these are configured).
354
390
355
   By default, the document list is presented in order of relevance (how well
391
   By default, the document list is presented in order of relevance (how well
356
   the system estimates that the document matches the query). You can specify
392
   the system estimates that the document matches the query). You can specify
357
   a different ordering by using the Tools / Sort parameters dialog.
393
   a different ordering by using the Tools / Sort parameters dialog.
394
395
   Clicking on the Preview link for an entry will open an internal preview
396
   window for the document. Clicking the Edit link will attempt to start an
397
   external viewer (have a look at the mimeconf configuration file to see how
398
   these are configured).
358
399
359
   The Preview and Edit links may not be present for all entries,
400
   The Preview and Edit links may not be present for all entries,
360
   meaning that Recoll has no configured way to preview a given file type
401
   meaning that Recoll has no configured way to preview a given file type
361
   (which was indexed by name only), or no configured external viewer for the
402
   (which was indexed by name only), or no configured external viewer for the
362
   file type. This can sometimes be adjusted simply by tweaking the mimemap
403
   file type. This can sometimes be adjusted simply by tweaking the mimemap
...
...
364
405
365
   You can click on the Query details link at the top of the results page to
406
   You can click on the Query details link at the top of the results page to
366
   see the query actually performed, after stem expansion and other
407
   see the query actually performed, after stem expansion and other
367
   processing.
408
   processing.
368
409
369
     ----------------------------------------------------------------------
410
   Double-clicking on any word inside the result list or a preview window
411
   will insert it into the simple search text.
370
412
413
   The result list is divided into pages (the size of which you can change in
414
   the preferences). Use the arrow buttons in the toolbar or the links at the
415
   bottom of the page to browse the results.
416
417
     ----------------------------------------------------------------------
418
419
  3.2.1. The result list right-click menu
420
421
   Apart from the preview and edit links, you can display a popup menu by
422
   right-clicking over a paragraph in the result list. This menu has the
423
   following entries:
424
425
     * Preview
426
427
     * Edit
428
429
     * Copy File Name
430
431
     * Copy Url
432
433
     * Find similar
434
435
   The Preview and Edit entries do the same thing as the corresponding links.
436
   The two following entries will copy either the file path or a URL to the
437
   clipboard, for pasting into another application.
438
439
   The Find similar entry will select a number of relevant terms from the
440
   current document and enter them into the simple search field. You can then
441
   start a simple search, with a good chance of finding documents related to
442
   the current result.
443
444
     ----------------------------------------------------------------------
445
446
3.3. The preview window
447
448
   The preview window opens when you first click a Preview link inside the
449
   result list.
450
451
   Subsequent preview requests for a given search open new tabs in the
452
   existing window.
453
454
   Starting another search and requesting a preview will create a new preview
455
   window. The old one stays open until you close it.
456
457
   You can close a preview tab by typing ^W (Ctrl + W) in the window. Closing
458
   the last tab for a window will also close the window.
459
460
   Of course you can also close a preview window by using the window manager
461
   button at the top of the frame.
462
463
   You can display successive or previous documents from the result list
464
   inside a preview tab by typing Ctrl+Down or Ctrl+Up (Down and Up are the
465
   arrow keys).
466
467
   The preview tabs have an internal incremental search function. You
468
   initiate the search either by typing a / (slash) inside the text area or
469
   by clicking into the Search for: text field and entering the search
470
   string. You can then use the Next and Previous buttons to find the
471
   next/previous occurrence. You can also type F3 inside the text area to get
472
   to the next occurrence.
473
474
   If you have a search string entered and you use ^Up/^Down to browse the
475
   results, the search is initiated for each successive document. If the
476
   string is found, the cursor will be positioned at the first occurrence of
477
   the search string.
478
479
     ----------------------------------------------------------------------
480
371
3.2. Complex/advanced search
481
3.4. Complex/advanced search
372
482
373
   The advanced search dialog has fields that will allow a more refined
483
   The advanced search dialog has fields that will allow a more refined
374
   search, looking for documents with all given words, a given exact phrase,
484
   search, looking for documents with all given elements, a given exact
375
   none of the given words, or a given file name (with wildcard expansion).
485
   phrase, none of the given elements, or a given file name (with wildcard
376
   All relevant fields will be combined by an implicit AND clause.
486
   expansion). All relevant fields will be combined by an implicit AND
487
   clause. All fields except "Exact phrase" can accept a mix of single words
488
   and phrases enclosed in double quotes.
377
489
378
   It will let you search for documents of specific mime types (ie: only
490
   Advanced search will let you search for documents of specific mime types
379
   text/plain, or text/html or application/pdf etc...)
491
   (ie: only text/plain, or text/html or application/pdf etc...). The state
492
   of the file type selection can be saved as the default (the file type
493
   filter will not be activated at program startup, but the lists will be in
494
   the restored state).
380
495
381
   It will let you restrict the search results to a subtree of the indexed
496
   You can also restrict the search results to a subtree of the indexed area.
382
   area.
497
   If you need to do this often, consider setting up multiple indexes
498
   instead, as the performance will be much better.
383
499
384
   Click on the Start Search button in the advanced search dialog to start
500
   Click on the Start Search button in the advanced search dialog, or type
385
   the search. The button in the main window always performs a simple search.
501
   Enter in any text field to start the search. The button in the main window
502
   always performs a simple search.
386
503
387
   Click on the Show query details link at the top of the result page to see
504
   Click on the Show query details link at the top of the result page to see
388
   the query expansion.
505
   the query expansion.
389
506
390
     ----------------------------------------------------------------------
507
     ----------------------------------------------------------------------
391
508
392
3.3. Multiple databases
509
3.5. Multiple databases
393
510
394
   Your Recoll configuration always defines a main index. This is what gets
511
   Multiple Recoll databases or indexes can be created by using several
395
   updated, for example, when you execute recollindex.
512
   configuration directories which are usually set to index different areas
513
   of the file system. A specific index can be selected for updating or
514
   searching, using the RECOLL_CONFDIR environment variable or the -c option
515
   to recoll and recollindex.
396
516
397
   You can use the search configuration tool to define additional databases
517
   A recollindex program instance can only update one specific index.
398
   to be searched. These databases can be made active or inactive at any
399
   moment.
400
518
401
   The typical use of this feature is for a system administrator to set up a
519
   A recoll program instance is also associated with a specific index, which
402
   central index, that you may choose to search, or not, in addition to your
520
   is the one to be updated by its indexing thread, but it can use any number
403
   personal data. Of course, there are other possibilities.
521
   of Recoll indexes for searching. The external indexes can be selected
522
   through the external indexes tab in the preferences dialog.
404
523
405
   The main index (defined by your personal configuration) is always active.
524
   Index selection is performed in two phases. A set of all usable indexes
525
   must first be defined, and then the subset of indexes to be used for
526
   searching. Of course, these parameters are retained across program
527
   executions (they are kept separately for each Recoll configuration). The
528
   set of all indexes is usually quite stable, while the active ones might
529
   typically be adjusted quite frequently.
406
530
407
   The list of searchable databases may also be defined by the
531
   The main index (defined by RECOLL_CONFDIR) is always active. If this is
408
   RECOLL_EXTRA_DBS environment variable. This should hold a colon-separated
532
   undesirable, you can set up your base configuration to index an empty
409
   list of index directories, ie:
533
   directory.
534
535
   As building the set of all indexes can be a little tedious when done
536
   through the user interface, you can use the RECOLL_EXTRA_DBS environment
537
   variable to provide an initial set. This might typically be set up by a
538
   system administrator so that every user does not have to do it. The
539
   variable should define a colon-separated list of index directories, e.g.:
410
540
411
 export RECOLL_EXTRA_DBS=/some/place/xapiandb:/some/other/db
541
 export RECOLL_EXTRA_DBS=/some/place/xapiandb:/some/other/db
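   When the list of directories is long, the value can be assembled rather
   than typed by hand. A minimal sketch, assuming a POSIX shell;
   join_index_dirs is a hypothetical helper, and the paths are the
   placeholders from the example above.

```shell
# Hypothetical helper: join its arguments with ':' (the separator
# expected by RECOLL_EXTRA_DBS).
join_index_dirs() {
    # The subshell keeps the IFS change from leaking into the caller.
    ( IFS=:; printf '%s\n' "$*" )
}

RECOLL_EXTRA_DBS=$(join_index_dirs /some/place/xapiandb /some/other/db)
export RECOLL_EXTRA_DBS
echo "$RECOLL_EXTRA_DBS"
```

   A system administrator could place such lines in a shared shell profile
   so that every user starts with the same initial set.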
412
542
413
     ----------------------------------------------------------------------
543
   A typical usage scenario for the multiple index feature would be for a
544
   system administrator to set up a central index for shared data, that you
545
   may choose to search, or not, in addition to your personal data. Of
546
   course, there are other possibilities. There are many cases where you know
547
   the subset of files that you want to be searched for a given query, and
548
   where restricting the query will greatly improve the precision of the
549
   results. This can also be performed with the directory filter in advanced
550
   search, but multiple indexes will have much better performance and may be
551
   worth the trouble.
414
552
553
     ----------------------------------------------------------------------
554
415
3.4. Document history
555
3.6. Document history
416
556
417
   Documents that you actually view (with the internal preview or an external
557
   Documents that you actually view (with the internal preview or an external
418
   tool) are entered into the document history, which is remembered. You can
558
   tool) are entered into the document history, which is remembered. You can
419
   display the history list by using the Tools/Doc History menu entry.
559
   display the history list by using the Tools/Doc History menu entry.
420
560
421
     ----------------------------------------------------------------------
561
     ----------------------------------------------------------------------
422
562
423
3.5. Result list sorting
563
3.7. Sorting search results
424
564
425
   The documents in a result list are normally sorted in order of relevance.
565
   The documents in a result list are normally sorted in order of relevance.
426
   It is possible to specify different sort parameters by using the Sort
566
   It is possible to specify different sort parameters by using the Sort
427
   parameters dialog (located in the Tools menu).
567
   parameters dialog (located in the Tools menu).
428
568
...
...
434
   the program exits. An activated sort is indicated in the result list
574
   the program exits. An activated sort is indicated in the result list
435
   header.
575
   header.
436
576
437
     ----------------------------------------------------------------------
577
     ----------------------------------------------------------------------
438
578
439
3.6. Additional result list functionality
440
441
   Apart from the preview and edit links, you can display a popup menu by
442
   right-clicking over a paragraph in the result list. This menu has the
443
   following entries:
444
445
     * Preview
446
447
     * Edit
448
449
     * Copy File Name
450
451
     * Copy Url
452
453
     * More like this
454
455
   The Preview and Edit entries do the same thing as the corresponding links.
456
   The two following entries will copy either an url or the file path to the
457
   clipboard, for pasting into another application.
458
459
   The More like this entry will select a number of relevant term from the
460
   current document and enter them into the simple search field. You can then
461
   start a simple search, with a good chance of finding documents related to
462
   the current result.
463
464
     ----------------------------------------------------------------------
465
466
3.7. Search tips, shortcuts
579
3.8. Search tips, shortcuts
467
580
468
   Disabling stem expansion. Entering a capitalized word in any search field
581
   Disabling stem expansion. Entering a capitalized word in any search field
469
   will prevent stem expansion (no search for gardening if you enter Garden
582
   will prevent stem expansion (no search for gardening if you enter Garden
470
   instead of garden). This is the only case where character case should make
583
   instead of garden). This is the only case where character case should make
471
   a difference for a Recoll search.
584
   a difference for a Recoll search.
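
   The capitalization rule above can be pictured with a small sketch. The
   function name uses_stem_expansion is hypothetical and not part of Recoll;
   it only mirrors the rule as stated here, assuming that "capitalized" means
   that the first character is upper case.

```python
def uses_stem_expansion(term: str) -> bool:
    """Return True if the term would be stem-expanded under the rule above.

    Assumption (not taken from the Recoll source): a term counts as
    capitalized when its first character is upper case.
    """
    return not term[:1].isupper()


# "garden" may also match "gardening"; "Garden" is searched as typed.
for term in ("garden", "Garden"):
    print(term, uses_stem_expansion(term))
```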
472
585
473
   Phrases. A phrase can be looked for by enclosing it in double quotes.
586
   Phrases. A phrase can be looked for by enclosing it in double quotes.
474
   Example: "user manual" will look only for occurrences of user immediately
587
   Example: "user manual" will look only for occurrences of user immediately
475
   followed by manual. You can use the This exact phrase field of the
588
   followed by manual. You can use the This exact phrase field of the
476
   advanced search dialog to the same effect.
589
   advanced search dialog to the same effect. Phrases can be entered along
590
   with simple terms in all search entry fields (except This exact phrase).
477
591
592
   AutoPhrases. This option can be set in the preferences dialog. If it is
593
   set, a phrase will be automatically built and added to simple searches
594
   when looking for Any terms. This will not radically change the results,
595
   but will give a relevance boost to the results where the search terms
596
   appear as a phrase. Example: searching for virtual reality will still find all
597
   documents where either virtual or reality or both appear, but those which
598
   contain virtual reality should appear sooner in the list.
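
   As a rough illustration of the AutoPhrases behaviour described above, the
   sketch below represents an Any terms search as a plain dictionary. The
   function name and the "any" / "boost_phrase" keys are hypothetical and do
   not correspond to Recoll's internal query representation.

```python
def build_any_terms_query(user_input: str, autophrase: bool) -> dict:
    """Describe an "Any terms" search as a plain dictionary.

    Hypothetical structure: "any" lists the terms that are ORed together,
    and "boost_phrase" is the extra phrase that only raises relevance.
    """
    terms = user_input.split()
    query = {"any": terms, "boost_phrase": None}
    # A phrase only makes sense when there are at least two terms.
    if autophrase and len(terms) > 1:
        query["boost_phrase"] = " ".join(terms)
    return query


print(build_any_terms_query("virtual reality", autophrase=True))
```

   Documents matching only virtual or only reality are still found through
   the "any" part; the phrase is only there to push exact matches up the list.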
599
478
   Term completion. Typing ^TAB (Control+Tab) in the simple search entry
600
   Term completion. Typing ^TAB (Control + Tab) in the simple search entry
479
   field while entering a word will either complete the current word if its
601
   field while entering a word will either complete the current word if its
480
   beginning matches a unique term in the index, or open a window to propose
602
   beginning matches a unique term in the index, or open a window to propose
481
   a list of completions.
603
   a list of completions.
482
604
483
   Picking up new terms for search from displayed documents. Double-clicking
605
   Picking up new terms for search from displayed documents. Double-clicking
484
   on a word in the result list or in a preview window will copy it to the
606
   on a word in the result list or in a preview window will copy it to the
485
   simple search entry field.
607
   simple search entry field.
486
608
487
   Finding related documents. Selecting the More like this entry in the
609
   Finding related documents. Selecting the Find similar documents entry in
488
   result list paragraph right-click menu will select a set of "interesting"
610
   the result list paragraph right-click menu will select a set of
489
   terms from the current result, and insert them into the simple search
611
   "interesting" terms from the current result, and insert them into the
490
   entry field. You can then possibly edit the list and start a search to
612
   simple search entry field. You can then possibly edit the list and start a
491
   find documents which may be related to the current result.
613
   search to find documents which may be related to the current result.
492
614
493
   Query explanation. You can get an exact description of what the query
615
   Query explanation. You can get an exact description of what the query
494
   looked for, including stem expansion, and boolean operators used, by
616
   looked for, including stem expansion, and boolean operators used, by
495
   clicking on the result list header.
617
   clicking on the result list header.
496
618
497
   File names. File names are added as terms during indexing, and you can
619
   File names. File names are added as terms during indexing, and you can
498
   specify them as ordinary terms in normal search fields (Recoll used to
620
   specify them as ordinary terms in normal search fields (Recoll used to
499
   index all directories in the file path as terms. This has been abandoned
621
   index all directories in the file path as terms. This has been abandoned
500
   as it did not seem really useful). Alternatively, you can use specific
622
   as it did not seem really useful). Alternatively, you can use the specific
501
   file name search which will only look for file names and can use wildcard
623
   file name search which will only look for file names and can use wildcard
502
   expansion.
624
   expansion.
503
625
504
   Quitting. Entering ^Q almost anywhere will close the application.
626
   Quitting. Entering ^Q almost anywhere will close the application.
505
627
506
   Closing previews. Entering ^W in a preview tab will close it (and, for the
628
   Closing previews. Entering Esc will close the preview window and all its
507
   last tab, close the preview window).
629
   tabs. Entering ^W in a tab will close it (and, for the last tab, close the
630
   preview window).
508
631
509
     ----------------------------------------------------------------------
632
   List browsing in preview. Entering ^Down or ^Up (Ctrl + an arrow key) in a
633
   preview window will display the next or the previous document from the
634
   result list. Any secondary search currently active will be executed on the
635
   new document.
510
636
637
     ----------------------------------------------------------------------
638
511
3.8. Customising the search interface
639
3.9. Customising the search interface
512
640
513
   It is possible to customise some aspects of the search interface by using
641
   It is possible to customise some aspects of the search interface by using
514
   the Query configuration entry in the Preferences menu.
642
   the Query configuration entry in the Preferences menu.
515
643
516
   There are two tabs in the dialog, dealing with the interface itself, and
644
   There are two tabs in the dialog, dealing with the interface itself, and
...
...
557
685
558
     * Replace abstracts from documents: this decides if we should synthesize
686
     * Replace abstracts from documents: this decides if we should synthesize
559
       and display an abstract in place of an explicit abstract found within
687
       and display an abstract in place of an explicit abstract found within
560
       the document itself.
688
       the document itself.
561
689
562
   Extra databases:
690
     * Synthetic abstract size: adjust to taste...
563
691
564
   This panel will let you browse for additional databases that you may want
692
     * Synthetic abstract context words: how many words should be displayed
565
   to search. Extra databases are designated by their database directory (e.g.
693
       around each term occurrence.
566
   /home/someothergui/.recoll/xapiandb, /usr/local/recollglobal/xapiandb).
567
694
695
   External indexes: This panel will let you browse for additional indexes
696
   that you may want to search. External indexes are designated by their
697
   database directory (e.g. /home/someothergui/.recoll/xapiandb,
698
   /usr/local/recollglobal/xapiandb).
699
568
   Once entered, the databases will appear in the All extra databases list,
700
   Once entered, the indexes will appear in the All indexes list, and you can
569
   and you can choose which ones you want to use at any moment by transferring
701
   choose which ones you want to use at any moment by transferring them to/from
570
   them to/from the Active extra databases list.
702
   the Active indexes list.
571
703
572
   Your main database (the one the current configuration indexes to), is
704
   Your main database (the one the current configuration indexes to), is
573
   always implicitly active. If this is not desirable, you can set up your
705
   always implicitly active. If this is not desirable, you can set up your
574
   configuration so that it indexes, for example, an empty directory.
706
   configuration so that it indexes, for example, an empty directory.
575
707
576
     ----------------------------------------------------------------------
708
     ----------------------------------------------------------------------
577
709
578
                            Chapter 4. Installation
710
                            Chapter 4. Installation
579
711
712
4.1. Installing a prebuilt copy
713
714
   Recoll binary installations are always linked statically to the xapian
715
   libraries, and have no other dependencies. You will only have to check or
716
   install supporting applications for the file types that you want to index
717
   beyond text, html and mail files.
718
719
     ----------------------------------------------------------------------
720
721
  4.1.1. Installing through a package system
722
723
   If you use a BSD-type port system or a prebuilt package (RPM or other),
724
   just follow the usual procedure, and maybe have a look at the
725
   configuration section (but this may not be necessary for a quick test with
726
   default parameters).
727
728
     ----------------------------------------------------------------------
729
730
  4.1.2. Installing a prebuilt Recoll
731
732
   The unpackaged binary versions are just compressed tar files of a build
733
   tree, where only the useful parts were kept (executables and sample
734
   configuration).
735
736
   The executable binary files are built with a static link to libxapian and
737
   libiconv, to make installation easier (no dependencies). However, this
738
   also means that you cannot change the versions which are used.
739
740
   After extracting the tar file, you can proceed with installation as if you
741
   had built the package from source.
742
743
   The binary trees are built for installation to /usr/local.
744
745
     ----------------------------------------------------------------------
746
580
4.1. Building from source
747
4.2. Building from source
581
748
582
  4.1.1. Prerequisites
749
  4.2.1. Prerequisites
583
750
584
   At the very least, you will need to download and install the xapian core
751
   At the very least, you will need to download and install the xapian core
585
   package (Recoll development currently uses version 0.9.5), and the qt
752
   package (Recoll development currently uses version 0.9.5), and the qt
586
   runtime and development packages (Recoll development currently uses
753
   runtime and development packages (Recoll development currently uses
587
   version 3.3.5, but any 3.3 version is probably ok).
754
   version 3.3.5, but any 3.3 version is probably ok).
...
...
594
   not be critical). On Linux systems, the iconv interface is part of libc
761
   not be critical). On Linux systems, the iconv interface is part of libc
595
   and you should not need to do anything special.
762
   and you should not need to do anything special.
596
763
597
     ----------------------------------------------------------------------
764
     ----------------------------------------------------------------------
598
765
599
  4.1.2. Building
766
  4.2.2. Building
600
767
601
   Recoll has been built on Linux (redhat7.3, mandriva 2005, Fedora Core 3),
768
   Recoll has been built on Linux (redhat7.3, mandriva 2005, Fedora Core 3),
602
   FreeBSD and Solaris 8. If you build on another system, I would very much
769
   FreeBSD and Solaris 8. If you build on another system, I would very much
603
   welcome patches.
770
   welcome patches.
604
771
...
...
634
   manually copy and modify one of the existing files (the new file name
801
   manually copy and modify one of the existing files (the new file name
635
   should be the output of uname -s).
802
   should be the output of uname -s).
636
803
637
     ----------------------------------------------------------------------
804
     ----------------------------------------------------------------------
638
805
639
  4.1.3. Installation
806
  4.2.3. Installation
640
807
641
   Either type make install or execute recollinstall prefix, in the root of
808
   Either type make install or execute recollinstall prefix, in the root of
642
   the source tree. This will copy the commands to prefix/bin and the sample
809
   the source tree. This will copy the commands to prefix/bin and the sample
643
   configuration files, scripts and other shared data to prefix/share/recoll.
810
   configuration files, scripts and other shared data to prefix/share/recoll.
644
811
812
   If the installation prefix given to recollinstall is different from what
813
   was specified when executing configure, you will have to set the
814
   RECOLL_DATADIR environment variable to indicate where the shared data is
815
   to be found.
816
645
   You can then proceed to configuration.
817
   You can then proceed to configuration.
646
647
     ----------------------------------------------------------------------
648
649
4.2. Installing a prebuilt copy
650
651
  4.2.1. Installing through a package system
652
653
   If you are lucky enough to be using a port system or a prebuilt package
654
   (RPM or other), just follow the usual procedure, and have a look at the
655
   configuration section.
656
657
     ----------------------------------------------------------------------
658
659
  4.2.2. Installing a prebuilt Recoll
660
661
   The unpackaged binary versions are just compressed tar files of a build
662
   tree, where only the useful parts were kept (executables and sample
663
   configuration).
664
665
   The executable binary files are built with a static link to libxapian and
666
   libiconv, to make installation easier (no dependencies). However, this
667
   also means that you cannot change the versions which are used.
668
669
   After extracting the tar file, you can proceed with installation as if you
670
   had built the package from source.
671
818
672
     ----------------------------------------------------------------------
819
     ----------------------------------------------------------------------
673
820
674
4.3. Packages needed for external file types
821
4.3. Packages needed for external file types
675
822
...
...
681
828
682
     * Postscript: pstotext.
829
     * Postscript: pstotext.
683
830
684
     * MS Word: antiword.
831
     * MS Word: antiword.
685
832
833
     * MS Excel and PowerPoint: catdoc.
834
686
     * RTF: unrtf
835
     * RTF: unrtf
687
836
688
     * dvi: dvips
837
     * dvi: dvips
689
838
690
     * djvu: DjVuLibre
839
     * djvu: DjVuLibre
...
...
699
848
700
4.4. Configuration overview
849
4.4. Configuration overview
701
850
702
   There are two sets of configuration files. The system-wide files are kept
851
   There are two sets of configuration files. The system-wide files are kept
703
   in a directory named like /usr/[local/]share/recoll/examples; they define
852
   in a directory named like /usr/[local/]share/recoll/examples; they define
704
   default values for the system. A parallel set of files exists in the
853
   default values for the system. A parallel set of files exists by default
705
   .recoll directory in your home (this can be changed with the
854
   in the .recoll directory in your home. This directory can be changed with
706
   RECOLL_CONFDIR environment variable).
855
   the RECOLL_CONFDIR environment variable or the -c option parameter to
856
   recoll and recollindex.
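
   The choice of the personal configuration directory can be sketched as
   follows. The function name config_dir is hypothetical, and the precedence
   shown (the -c value first, then RECOLL_CONFDIR, then .recoll in the home
   directory) is an assumption: the text above names the three sources but
   does not state how they are ordered.

```python
import os
from typing import Mapping, Optional


def config_dir(cli_confdir: Optional[str] = None,
               environ: Mapping[str, str] = os.environ) -> str:
    """Pick the personal configuration directory.

    Assumed precedence (not confirmed by the text above):
    value of -c, then RECOLL_CONFDIR, then ~/.recoll.
    """
    if cli_confdir:
        return cli_confdir
    if environ.get("RECOLL_CONFDIR"):
        return environ["RECOLL_CONFDIR"]
    return os.path.join(os.path.expanduser("~"), ".recoll")


print(config_dir())
```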
707
857
708
   If the .recoll directory does not exist when recoll or recollindex are
858
   If the .recoll directory does not exist when recoll or recollindex are
709
   started, it will be created with a set of empty configuration files.
859
   started, it will be created with a set of empty configuration files.
710
   recoll will give you a chance to edit the configuration file before
860
   recoll will give you a chance to edit the configuration file before
711
   starting indexing. recollindex will proceed immediately.
861
   starting indexing. recollindex will proceed immediately.
...
...
768
918
769
           Specifies the list of directories or files to index (recursively
919
           Specifies the list of directories or files to index (recursively
770
           for directories). The indexer will not follow symbolic links
920
           for directories). The indexer will not follow symbolic links
771
           inside the indexed trees. If an entry in the topdirs list is a
921
           inside the indexed trees. If an entry in the topdirs list is a
772
           symbolic link, indexing will not start and will generate an error.
922
           symbolic link, indexing will not start and will generate an error.
923
924
   dbdir
925
926
           The name of the Xapian data directory. It will be created if
927
           needed when the index is initialized. If this is not an absolute
928
           path, it will be interpreted relative to the configuration
929
           directory.
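
           The relative-path rule for dbdir can be illustrated with a short
           sketch. The function name resolve_dbdir and the example paths are
           hypothetical; POSIX-style paths are assumed.

```python
import posixpath


def resolve_dbdir(dbdir: str, confdir: str) -> str:
    """Return the Xapian data directory for a given dbdir setting."""
    if posixpath.isabs(dbdir):
        # Absolute values are used unchanged.
        return dbdir
    # Relative values are interpreted relative to the configuration directory.
    return posixpath.join(confdir, dbdir)


print(resolve_dbdir("xapiandb", "/home/me/.recoll"))
print(resolve_dbdir("/usr/local/recollglobal/xapiandb", "/home/me/.recoll"))
```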
773
930
774
   skippedNames
931
   skippedNames
775
932
776
           A space-separated list of patterns for names of files or
933
           A space-separated list of patterns for names of files or
777
           directories that should be completely ignored. The list defined in
934
           directories that should be completely ignored. The list defined in
...
...
816
           built. See recollindex(1) for possible values. You can add a stem
973
           built. See recollindex(1) for possible values. You can add a stem
817
           expansion database for a different language by using recollindex
974
           expansion database for a different language by using recollindex
818
           -s, but it will be deleted during the next indexing. Only
975
           -s, but it will be deleted during the next indexing. Only
819
           languages listed in the configuration file are permanent.
976
           languages listed in the configuration file are permanent.
820
977
821
   iconsdir
822
823
           The name of the directory where recoll result list icons are
824
           stored. You can change this if you want different images.
825
826
   dbdir
827
828
           The name of the Xapian data directory. It will be created if
829
           needed when the index is initialized.
830
831
   defaultcharset
978
   defaultcharset
832
979
833
           The name of the character set used for files that do not contain a
980
           The name of the character set used for files that do not contain a
834
           character set definition (e.g. plain text files). This can be
981
           character set definition (e.g. plain text files). This can be
835
           redefined for any subdirectory. If it is not set at all, the
982
           redefined for any subdirectory. If it is not set at all, the
...
...
857
           parameter decides if file name indexing is performed only for
1004
           parameter decides if file name indexing is performed only for
858
           files with mime types that would qualify them for full text
1005
           files with mime types that would qualify them for full text
859
           indexing, or for all files inside the selected subtrees,
1006
           indexing, or for all files inside the selected subtrees,
860
           independent of mime type.
1007
           independent of mime type.
861
1008
1009
   idxabsmlen
1010
1011
           Recoll stores an abstract for each indexed file inside the
1012
           database. This is so that they can be displayed inside the result
1013
           lists without decoding the original file. This parameter defines
1014
           the size of the stored abstract (which can come from an actual
1015
           section or just be the beginning of the text). The default value
1016
           is 250.
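
           The effect of idxabsmlen can be sketched as below. The function
           name stored_abstract is hypothetical, and counting the size in
           characters is an assumption, since the text above does not name
           the unit.

```python
from typing import Optional

IDXABSMLEN_DEFAULT = 250  # default value stated above


def stored_abstract(text: str, abstract: Optional[str] = None,
                    maxlen: int = IDXABSMLEN_DEFAULT) -> str:
    """Return what would be kept in the index as the abstract.

    Uses the document's own abstract section when there is one, else the
    beginning of the text. Assumption: the size is counted in characters.
    """
    source = abstract if abstract else text
    return source[:maxlen]


print(len(stored_abstract("x" * 1000)))
```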
1017
1018
   iconsdir
1019
1020
           The name of the directory where recoll result list icons are
1021
           stored. You can change this if you want different images.
1022
862
     ----------------------------------------------------------------------
1023
     ----------------------------------------------------------------------
863
1024
864
  4.4.2. The mimemap file
1025
  4.4.2. The mimemap file
865
1026
866
   mimemap specifies the file name extension to mime type mappings.
1027
   mimemap specifies the file name extension to mime type mappings.