a/src/README b/src/README
...
...
59
59
60
                3.7. Sorting search results
60
                3.7. Sorting search results
61
61
62
                3.8. Search tips, shortcuts
62
                3.8. Search tips, shortcuts
63
63
64
                3.9. Customising the search interface
64
                3.9. Customizing the search interface
65
65
66
   4. Installation
66
   4. Installation
67
67
68
                4.1. Installing a prebuilt copy
68
                4.1. Installing a prebuilt copy
69
69
...
...
121
   documents will appear first.
121
   documents will appear first.
122
122
123
   You do not need to remember in what file or email message you stored a
123
   You do not need to remember in what file or email message you stored a
124
   given piece of information. You just ask for related terms, and the tool
124
   given piece of information. You just ask for related terms, and the tool
125
   will return a list of documents where those terms are prominent, in a
125
   will return a list of documents where those terms are prominent, in a
126
   similar way to internet search engines.
126
   similar way to Internet search engines.
127
127
128
   Recoll tries to determine which documents are most relevant to the search
128
   Recoll tries to determine which documents are most relevant to the search
129
   terms you provide. Computer algorithms for determining relevance can be
129
   terms you provide. Computer algorithms for determining relevance can be
130
   very complex, and in general are inferior to the power of the human mind
130
   very complex, and in general are inferior to the power of the human mind
131
   to rapidly determine relevance. The quality of relevance guessing by the
131
   to rapidly determine relevance. The quality of relevance guessing by the
...
...
133
   application.
133
   application.
134
134
135
   In many cases, you are looking for all the forms of a word, not for a
135
   In many cases, you are looking for all the forms of a word, not for a
136
   specific form or spelling. These different forms may include plurals,
136
   specific form or spelling. These different forms may include plurals,
137
   different tenses for a verb, or terms derived from the same root or stem
137
   different tenses for a verb, or terms derived from the same root or stem
138
   (exemple: floor, floors, floored, floorings...). Recoll will by default
138
   (example: floor, floors, floored, flooring...). Recoll will by default
139
   expand queries to all such related terms (words that reduce to the same
139
   expand queries to all such related terms (words that reduce to the same
140
   stem). This expansion can be disabled at search time.
140
   stem). This expansion can be disabled at search time.
141
141
142
   Stemming, by itself, does not accomodate for misspellings or phonetic
142
   Stemming, by itself, does not accommodate for misspellings or phonetic
143
   searches. Recoll currently does not support these features.
143
   searches. Recoll currently does not support these features.
144
144
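The stem expansion described above can be illustrated with a toy sketch. It is not Recoll's stemmer: the suffix list, `naive_stem` and `expand_query_term` are hypothetical names made up for this example, and a real stemmer is language-aware and much more careful. The sketch only shows how a query term can be expanded to every indexed term that reduces to the same stem.

```python
# Toy illustration only. The suffix list is an assumption (English-only)
# and is far cruder than a real stemming algorithm.
SUFFIXES = ("ings", "ing", "ed", "s")


def naive_stem(word):
    """Reduce a word to a crude stem by stripping one known suffix."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


def expand_query_term(term, index_terms):
    """Return every indexed term that shares a stem with the query term."""
    stem = naive_stem(term)
    return sorted(t for t in index_terms if naive_stem(t) == stem)


# Hypothetical index vocabulary.
index_terms = ["floor", "floors", "floored", "flooring", "flour"]
print(expand_query_term("floors", index_terms))
```

With this vocabulary, a query for floors is expanded to the four floor forms, while flour is left out because it reduces to a different stem.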
145
     ----------------------------------------------------------------------
145
     ----------------------------------------------------------------------
146
146
147
1.3. Recoll overview
147
1.3. Recoll overview
...
...
157
   The resulting index can be big (roughly the size of the original document
157
   The resulting index can be big (roughly the size of the original document
158
   set), but it is not a document archive. Recoll can only display documents
158
   set), but it is not a document archive. Recoll can only display documents
159
   that still exist at the place from which they were indexed. (Actually,
159
   that still exist at the place from which they were indexed. (Actually,
160
   there is a way to reconstruct a document from the information in the
160
   there is a way to reconstruct a document from the information in the
161
   index, but the result is not nice, as all formatting, punctuation and
161
   index, but the result is not nice, as all formatting, punctuation and
162
   capitalisation are lost).
162
   capitalization are lost).
163
163
164
   Recoll stores all internal data in Unicode UTF-8 format, and it can index
164
   Recoll stores all internal data in Unicode UTF-8 format, and it can index
165
   files with different character sets, encodings, and languages into the
165
   files with different character sets, encodings, and languages into the
166
   same index. It has input filters for many document types.
166
   same index. It has input filters for many document types.
167
167
168
   Stemming depends on the document language. Recoll stores the unstemmed
168
   Stemming depends on the document language. Recoll stores the unstemmed
169
   versions of terms and uses auxiliary databases for term expansion. It can
169
   versions of terms and uses auxiliary databases for term expansion. It can
170
   switch stemming languages, or add a language, without reindexing. Storing
170
   switch stemming languages, or add a language, without re-indexing. Storing
171
   documents in different languages in the same index is possible, and useful
171
   documents in different languages in the same index is possible, and useful
172
   in practice, but does introduce possibilities of confusion. Recoll
172
   in practice, but does introduce possibilities of confusion. Recoll
173
   currently makes no attempt at automatic language recognition.
173
   currently makes no attempt at automatic language recognition.
174
174
175
   Recoll has many parameters which define exactly what to index, and how to
175
   Recoll has many parameters which define exactly what to index, and how to
176
   classify and decode the source documents. These are kept in a
176
   classify and decode the source documents. These are kept in a
177
   configuration file. A default configuration is copied into a standard
177
   configuration file. A default configuration is copied into a standard
178
   location (usually something like /usr/[local/]share/recoll/examples)
178
   location (usually something like /usr/[local/]share/recoll/examples)
179
   during installation. The default parameters from this file may be
179
   during installation. The default parameters from this file may be
180
   overriden by values that you set inside your personal configuration, found
180
   overridden by values that you set inside your personal configuration,
181
   by default in the .recoll subdirectory of your home directory. The default
181
   found by default in the .recoll sub-directory of your home directory. The
182
   configuration will index your home directory with default parameters and
182
   default configuration will index your home directory with default
183
   should be sufficient for giving Recoll a try, but you may want to adjust
183
   parameters and should be sufficient for giving Recoll a try, but you may
184
   it later.
184
   want to adjust it later.
185
185
186
   Indexing is started automatically the first time you execute the recoll
186
   Indexing is started automatically the first time you execute the recoll
187
   search graphical user interface, or by executing the recollindex command.
187
   search graphical user interface, or by executing the recollindex command.
188
188
189
   Searches are performed inside the recoll program, which has many options
189
   Searches are performed inside the recoll program, which has many options
...
...
269
   confidential data is indexed, access to the database directory should be
269
   confidential data is indexed, access to the database directory should be
270
   restricted.
270
   restricted.
271
271
272
   As of version 1.4, Recoll will create the configuration directory with a
272
   As of version 1.4, Recoll will create the configuration directory with a
273
   mode of 0700 (access by owner only). As the index data directory is by
273
   mode of 0700 (access by owner only). As the index data directory is by
274
   default a subdirectory of the configuration directory, this should result
274
   default a sub-directory of the configuration directory, this should result
275
   in appropriate protection.
275
   in appropriate protection.
276
276
277
   If you use another setup, you should think of the kind of protection you
277
   If you use another setup, you should think of the kind of protection you
278
   need for your index, and set the directory and files access modes
278
   need for your index, and set the directory and files access modes
279
   appropriately.
279
   appropriately.
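For a non-default setup, one possible way to restrict access is sketched below, assuming a POSIX system. The function name `restrict_to_owner` is hypothetical and not part of Recoll. It applies the same owner-only policy as the 0700 default described above to a directory tree, and uses 0600 for regular files.

```python
import os
import stat


def restrict_to_owner(index_dir):
    """Make index_dir and everything under it accessible by the owner only."""
    for root, dirs, files in os.walk(index_dir):
        # Directories need the execute bit to be traversable: mode 0700.
        os.chmod(root, stat.S_IRWXU)
        for name in files:
            # Regular files: read and write for the owner only, mode 0600.
            os.chmod(os.path.join(root, name), stat.S_IRUSR | stat.S_IWUSR)


# Example call, with the path adjusted to wherever your index actually lives:
# restrict_to_owner(os.path.expanduser("~/.recoll"))
```

Whether owner-only access is the right policy depends on who needs to search the index. A shared index would need group permissions instead.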
...
...
281
     ----------------------------------------------------------------------
281
     ----------------------------------------------------------------------
282
282
283
2.3. The indexing configuration
283
2.3. The indexing configuration
284
284
285
   Values set in the system-wide configuration file (named like
285
   Values set in the system-wide configuration file (named like
286
   /usr/[local/]share/recoll/examples/recoll.conf) can be overriden by those
286
   /usr/[local/]share/recoll/examples/recoll.conf) can be overridden by those
287
   set in the personal one, named $HOME/.recoll/recoll.conf by default or
287
   set in the personal one, named $HOME/.recoll/recoll.conf by default or
288
   $RECOLL_CONFDIR/recoll.conf if RECOLL_CONFDIR is set.
288
   $RECOLL_CONFDIR/recoll.conf if RECOLL_CONFDIR is set.
289
289
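The lookup rule for the personal configuration file can be sketched as follows. This is only an illustration of the rule stated above, not Recoll's own code, and `personal_config_path` is a hypothetical helper name.

```python
import os


def personal_config_path(environ=None):
    """Return the personal recoll.conf path implied by the environment."""
    environ = os.environ if environ is None else environ
    # RECOLL_CONFDIR, when set, takes precedence over the default location.
    confdir = environ.get("RECOLL_CONFDIR")
    if not confdir:
        home = environ.get("HOME", os.path.expanduser("~"))
        confdir = os.path.join(home, ".recoll")
    return os.path.join(confdir, "recoll.conf")


print(personal_config_path())
```

Passing a dictionary instead of the real environment makes it easy to check both cases, with and without RECOLL_CONFDIR.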
290
   The most accurate documentation for editing the file is given by comments
290
   The most accurate documentation for editing the file is given by comments
291
   inside the central one. If you want to adjust the configuration before
291
   inside the central one. If you want to adjust the configuration before
...
...
294
   empty configuration files.
294
   empty configuration files.
295
295
296
   The configuration is also documented inside the installation chapter of
296
   The configuration is also documented inside the installation chapter of
297
   this document, or in the recoll.conf(5) man page.
297
   this document, or in the recoll.conf(5) man page.
298
298
299
   The applications needed to index file types other than text, html or email
299
   The applications needed to index file types other than text, HTML or email
300
   (ie: pdf, postscript, ms-word...) are described in the external packages
300
   (ie: pdf, postscript, ms-word...) are described in the external packages
301
   section
301
   section
302
302
303
     ----------------------------------------------------------------------
303
     ----------------------------------------------------------------------
304
304
...
...
308
   indexing thread inside the recoll program (use the File menu). Both
308
   indexing thread inside the recoll program (use the File menu). Both
309
   programs will use of the RECOLL_CONFDIR variable or accept a -c confdir
309
   programs will use of the RECOLL_CONFDIR variable or accept a -c confdir
310
   option to specify the configuration directory to be used.
310
   option to specify the configuration directory to be used.
311
311
312
   If the recoll program finds no index when it starts, it will automatically
312
   If the recoll program finds no index when it starts, it will automatically
313
   start indexing (except if cancelled).
313
   start indexing (except if canceled).
314
314
315
   It is best to avoid interrupting the indexing process, as this may
315
   It is best to avoid interrupting the indexing process, as this may
316
   sometimes leave the index in a bad state. This is not a serious problem,
316
   sometimes leave the index in a bad state. This is not a serious problem,
317
   as you then just need to clear everything and restart the indexing: the
317
   as you then just need to clear everything and restart the indexing: the
318
   index files are normally stored in the $HOME/.recoll/xapiandb directory,
318
   index files are normally stored in the $HOME/.recoll/xapiandb directory,
...
...
368
   be disabled globally in the preferences).
368
   be disabled globally in the preferences).
369
369
370
   Recoll remembers the last few searches that you performed. You can use the
370
   Recoll remembers the last few searches that you performed. You can use the
371
   simple search text entry widget (a combobox) to recall them (click on the
371
   simple search text entry widget (a combobox) to recall them (click on the
372
   thing at the right of the text field). Please note, however, that only the
372
   thing at the right of the text field). Please note, however, that only the
373
   search texts are remembered, not the mode (all/any/filename).
373
   search texts are remembered, not the mode (all/any/file name).
374
374
375
   Hitting ^Tab (Ctrl + Tab) while entering a word in the simple search entry
375
   Hitting ^Tab (Ctrl + Tab) while entering a word in the simple search entry
376
   will open a window with possible completions for the word. The completions
376
   will open a window with possible completions for the word. The completions
377
   are extracted from the database.
377
   are extracted from the database.
378
378
...
...
416
416
417
     ----------------------------------------------------------------------
417
     ----------------------------------------------------------------------
418
418
419
  3.2.1. The result list right-click menu
419
  3.2.1. The result list right-click menu
420
420
421
   Apart from the preview and edit links, you can display a popup menu by
421
   Apart from the preview and edit links, you can display a pop-up menu by
422
   right-clicking over a paragraph in the result list. This menu has the
422
   right-clicking over a paragraph in the result list. This menu has the
423
   following entries:
423
   following entries:
424
424
425
     * Preview
425
     * Preview
426
426
...
...
431
     * Copy Url
431
     * Copy Url
432
432
433
     * Find similar
433
     * Find similar
434
434
435
   The Preview and Edit entries do the same thing as the corresponding links.
435
   The Preview and Edit entries do the same thing as the corresponding links.
436
   The two following entries will copy either an url or the file path to the
436
   The two following entries will copy either an URL or the file path to the
437
   clipboard, for pasting into another application.
437
   clipboard, for pasting into another application.
438
438
439
   The Find similar entry will select a number of relevant term from the
439
   The Find similar entry will select a number of relevant term from the
440
   current document and enter them into the simple search field. You can then
440
   current document and enter them into the simple search field. You can then
441
   start a simple search, with a good chance of finding documents related to
441
   start a simple search, with a good chance of finding documents related to
...
...
466
466
467
   The preview tabs have an internal incremental search function. You
467
   The preview tabs have an internal incremental search function. You
468
   initiate the search either by typing a / (slash) inside the text area or
468
   initiate the search either by typing a / (slash) inside the text area or
469
   by clicking into the Search for: text field and entering the search
469
   by clicking into the Search for: text field and entering the search
470
   string. You can then use the Next and Previous buttons to find the
470
   string. You can then use the Next and Previous buttons to find the
471
   next/previous occurence. You can also type F3 inside the text area to get
471
   next/previous occurrence. You can also type F3 inside the text area to get
472
   to the next occurrence.
472
   to the next occurrence.
473
473
474
   If you have a search string entered and you use ^Up/^Down to browse the
474
   If you have a search string entered and you use ^Up/^Down to browse the
475
   results, the search is initiated for each successive document. If the
475
   results, the search is initiated for each successive document. If the
476
   string is found, the cursor will be positionned at the first occurrence of
476
   string is found, the cursor will be positioned at the first occurrence of
477
   the search string.
477
   the search string.
478
478
479
     ----------------------------------------------------------------------
479
     ----------------------------------------------------------------------
480
480
481
3.4. Complex/advanced search
481
3.4. Complex/advanced search
...
...
486
   expansion). All relevant fields will be combined by an implicit AND
486
   expansion). All relevant fields will be combined by an implicit AND
487
   clause. All fields except "Exact phrase" can accept a mix of single words
487
   clause. All fields except "Exact phrase" can accept a mix of single words
488
   and phrases enclosed in double quotes.
488
   and phrases enclosed in double quotes.
489
489
490
   Advanced search will let you search for documents of specific mime types
490
   Advanced search will let you search for documents of specific mime types
491
   (ie: only text/plain, or text/html or application/pdf etc...). The state
491
   (ie: only text/plain, or text/HTML or application/pdf etc...). The state
492
   of the file type selection can be saved as the default (the file type
492
   of the file type selection can be saved as the default (the file type
493
   filter will not be activated at program startup, but the lists will be in
493
   filter will not be activated at program start-up, but the lists will be in
494
   the restored state).
494
   the restored state).
495
495
496
   You can also restrict the search results to a subtree of the indexed area.
496
   You can also restrict the search results to a sub-tree of the indexed
497
   If you need to do this often, you may think of setting up multiple indexes
497
   area. If you need to do this often, you may think of setting up multiple
498
   instead, as the performance will be much better.
498
   indexes instead, as the performance will be much better.
499
499
500
   Click on the Start Search button in the advanced search dialog, or type
500
   Click on the Start Search button in the advanced search dialog, or type
501
   Enter in any text field to start the search. The button in the main window
501
   Enter in any text field to start the search. The button in the main window
502
   always performs a simple search.
502
   always performs a simple search.
503
503
...
...
568
568
569
   The tool sorts a specified number of the most relevant documents in the
569
   The tool sorts a specified number of the most relevant documents in the
570
   result list, according to specified criteria. The currently available
570
   result list, according to specified criteria. The currently available
571
   criteria are date and mime type.
571
   criteria are date and mime type.
572
572
573
   The sort parameters stay in effect until they are explicitely reset, or
573
   The sort parameters stay in effect until they are explicitly reset, or the
574
   the program exits. An activated sort is indicated in the result list
574
   program exits. An activated sort is indicated in the result list header.
575
   header.
576
575
577
     ----------------------------------------------------------------------
576
     ----------------------------------------------------------------------
578
577
579
3.8. Search tips, shortcuts
578
3.8. Search tips, shortcuts
579
580
   Term completion. Typing ^TAB (Control + Tab) in the simple search entry
581
   field while entering a word will either complete the current word if its
582
   beginning matches a unique term in the index, or open a window to propose
583
   a list of completions
584
585
   Picking up new terms from result or preview text. Double-clicking on a
586
   word in the result list or in a preview window will copy it to the simple
587
   search entry field.
580
588
581
   Disabling stem expansion. Entering a capitalized word in any search field
589
   Disabling stem expansion. Entering a capitalized word in any search field
582
   will prevent stem expansion (no search for gardening if you enter Garden
590
   will prevent stem expansion (no search for gardening if you enter Garden
583
   instead of garden). This is the only case where character case should make
591
   instead of garden). This is the only case where character case should make
584
   a difference for a Recoll search.
592
   a difference for a Recoll search. You can also disable stem expansion or
593
   change the stemming language in the preferences.
585
594
586
   Phrases. A phrase can be looked for by enclosing it in double quotes.
595
   Phrases. A phrase can be looked for by enclosing it in double quotes.
587
   Example: "user manual" will look only for occurrences of user immediately
596
   Example: "user manual" will look only for occurrences of user immediately
588
   followed by manual. You can use the This exact phrase field of the
597
   followed by manual. You can use the This exact phrase field of the
589
   advanced search dialog to the same effect. Phrases can be entered along
598
   advanced search dialog to the same effect. Phrases can be entered along
590
   simple terms in all search entry fields (except This exact phrase).
599
   simple terms in all simple or advanced search entry fields (except This
600
   exact phrase).
591
601
602
   Browsing the result list inside a preview window (1.5). Entering
603
   Shift-Down or Shift-Up (Shift + an arrow key) in a preview window will
604
   display the next or the previous document from the result list. Any
605
   secondary search currently active will be executed on the new document.
606
592
   AutoPhrases. This option can be set in the preferences dialog. If it is
607
   AutoPhrases (1.5). This option can be set in the preferences dialog. If it
593
   set, a phrase will be automatically built and added to simple searches
608
   is set, a phrase will be automatically built and added to simple searches
594
   when looking for Any terms. This will not change radically the results,
609
   when looking for Any terms. This will not change radically the results,
595
   but will give a relevance boost to the results where the search terms
610
   but will give a relevance boost to the results where the search terms
596
   appear as a phrase. Ie: searching for virtual reality will still find all
611
   appear as a phrase. Ie: searching for virtual reality will still find all
597
   documents where either virtual or reality or both appear, but those which
612
   documents where either virtual or reality or both appear, but those which
598
   contain virtual reality should appear sooner in the list.
613
   contain virtual reality should appear sooner in the list.
599
614
600
   Term completion. Typing ^TAB (Control + Tab) in the simple search entry
601
   field while entering a word will either complete the current word if its
602
   beginning matches a unique term in the index, or open a window to propose
603
   a list of completions
604
605
   Picking up new terms for search from displayed documents. Double-clicking
606
   on a word in the result list or in a preview window will copy it to the
607
   simple search entry field.
608
609
   Finding related documents. Selecting the Find similar documents entry in
615
   Finding related documents. Selecting the Find similar documents entry in
610
   the result list paragraph right-click menu will select a set of
616
   the result list paragraph right-click menu will select a set of
611
   "interesting" terms from the current result, and insert them into the
617
   "interesting" terms from the current result, and insert them into the
612
   simple search entry field. You can then possibly edit the list and start a
618
   simple search entry field. You can then possibly edit the list and start a
613
   search to find documents which may be apparented to the current result.
619
   search to find documents which may be apparented to the current result.
614
620
615
   Query explanation. You can get an exact description of what the query
616
   looked for, including stem expansion, and boolean operators used, by
617
   clicking on the result list header.
618
619
   File names. File names are added as terms during indexing, and you can
621
   File names. File names are added as terms during indexing, and you can
620
   specify them as ordinary terms in normal search fields (Recoll used to
622
   specify them as ordinary terms in normal search fields (Recoll used to
621
   index all directories in the file path as terms. This has been abandonned
623
   index all directories in the file path as terms. This has been abandoned
622
   as it did not seem really useful). Alternatively, you can use the specific
624
   as it did not seem really useful). Alternatively, you can use the specific
623
   file name search which will only look for file names and can use wildcard
625
   file name search which will only look for file names and can use wildcard
624
   expansion.
626
   expansion.
625
627
628
   Query explanation. You can get an exact description of what the query
629
   looked for, including stem expansion, and Boolean operators used, by
630
   clicking on the result list header.
631
632
   Closing previews. Entering ^W in a tab will close it (and, for the last
633
   tab, close the preview window). Entering Esc will close the preview window
634
   and all its tabs.
635
626
   Quitting. Entering ^Q almost anywhere will close the application.
636
   Quitting. Entering ^Q almost anywhere will close the application.
627
637
628
   Closing previews. Entering Esc will close the preview window and all its
629
   tabs. Entering ^W in a tab will close it (and, for the last tab, close the
630
   preview window).
631
632
   List browsing in preview. Entering Shift-Down or Shift-Up (Shift + an
633
   arrow key) in a preview window will display the next or the previous
634
   document from the result list. Any secondary search currently active will
635
   be executed on the new document.
636
637
     ----------------------------------------------------------------------
638
     ----------------------------------------------------------------------
638
639
639
3.9. Customising the search interface
640
3.9. Customizing the search interface
640
641
641
   It is possible to customise some aspects of the search interface by using
642
   It is possible to customize some aspects of the search interface by using
642
   Query configuration entry in the Preferences menu.
643
   Query configuration entry in the Preferences menu.
643
644
644
   There are two tabs in the dialog, dealing with the interface itself, and
645
   There are two tabs in the dialog, dealing with the interface itself, and
645
   with the parameters used for searching and returning results.
646
   with the parameters used for searching and returning results.
646
647
647
   User interface parameters:
648
   User interface parameters:
648
649
649
     * Number of results in a result page
650
     * Number of results in a result page
650
651
651
     * Result list font: There is quite a lot of information shown in the
652
     * Result list font: There is quite a lot of information shown in the
652
       result list, and you may want to customise the font and/or font size.
653
       result list, and you may want to customize the font and/or font size.
653
       The rest of the fonts used by Recoll are determined by your generic QT
654
       The rest of the fonts used by Recoll are determined by your generic QT
654
       config (try the qtconfig command.
655
       config (try the qtconfig command.
655
656
656
     * Html help browser: this will let you chose your preferred browser
657
     * HTML help browser: this will let you chose your preferred browser
657
       which will be started from the Help menu to read the user manual. You
658
       which will be started from the Help menu to read the user manual. You
658
       can enter a simple name if the command is in your PATH, or browse for
659
       can enter a simple name if the command is in your PATH, or browse for
659
       a full pathname.
660
       a full pathname.
660
661
661
     * Show document type icons in result list: icons in the result list can
662
     * Show document type icons in result list: icons in the result list can
662
       be turned off. They take quite a lot of space and convey relatively
663
       be turned off. They take quite a lot of space and convey relatively
663
       little useful information.
664
       little useful information.
664
665
665
     * Auto-start simple search on whitespace entry: if this is checked, a
666
     * Auto-start simple search on white space entry: if this is checked, a
666
       search will be executed each time you enter a space in the simple
667
       search will be executed each time you enter a space in the simple
667
       search input field. This lets you look at the result list as you enter
668
       search input field. This lets you look at the result list as you enter
668
       new terms. This is off by default, you may like it or not...
669
       new terms. This is off by default, you may like it or not...
669
670
670
   Search parameters:
671
   Search parameters:
...
...
681
       document abstracts when displaying the result list. Abstracts are
682
       document abstracts when displaying the result list. Abstracts are
682
       constructed by taking context from the document information, around
683
       constructed by taking context from the document information, around
683
       the search terms. This can slow down result list display significantly
684
       the search terms. This can slow down result list display significantly
684
       for big documents, and you may want to turn it off.
685
       for big documents, and you may want to turn it off.
685
686
686
     * Replace abstracts from documents: this decides if we should synthetize
687
     * Replace abstracts from documents: this decides if we should synthesize
687
       and display an abstract in place of an explicit abstract found within
688
       and display an abstract in place of an explicit abstract found within
688
       the document itself.
689
       the document itself.
689
690
690
     * Synthetic abstract size: adjust to taste...
691
     * Synthetic abstract size: adjust to taste...
691
692
...
...
696
   that you may want to search. External indexes are designated by their
697
   that you may want to search. External indexes are designated by their
697
   database directory (ie: /home/someothergui/.recoll/xapiandb,
698
   database directory (ie: /home/someothergui/.recoll/xapiandb,
698
   /usr/local/recollglobal/xapiandb).
699
   /usr/local/recollglobal/xapiandb).
699
700
700
   Once entered, the indexes will appear in the All indexes list, and you can
701
   Once entered, the indexes will appear in the All indexes list, and you can
701
   chose which ones you want to use at any moment by tranferring them to/from
702
   chose which ones you want to use at any moment by transferring them
702
   the Active indexes list.
703
   to/from the Active indexes list.
703
704
704
   Your main database (the one the current configuration indexes to), is
705
   Your main database (the one the current configuration indexes to), is
705
   always implicitely active. If this is not desirable, you can set up your
706
   always implicitly active. If this is not desirable, you can set up your
706
   configuration so that it indexes, for example, an empty directory.
707
   configuration so that it indexes, for example, an empty directory.
707
708
708
     ----------------------------------------------------------------------
709
     ----------------------------------------------------------------------
709
710
710
                            Chapter 4. Installation
711
                            Chapter 4. Installation
...
...
712
4.1. Installing a prebuilt copy
713
4.1. Installing a prebuilt copy
713
714
714
   Recoll binary installations are always linked statically to the xapian
715
   Recoll binary installations are always linked statically to the xapian
715
   libraries, and have no other dependencies. You will only have to check or
716
   libraries, and have no other dependencies. You will only have to check or
716
   install supporting applications for the file types that you want to index
717
   install supporting applications for the file types that you want to index
717
   beyond text, html and mail files.
718
   beyond text, HTML and mail files.
718
719
719
     ----------------------------------------------------------------------
720
     ----------------------------------------------------------------------
720
721
721
  4.1.1. Installing through a package system
722
  4.1.1. Installing through a package system
722
723
...
...
768
     * dvi: dvips
769
     * dvi: dvips
769
770
770
     * djvu: DjVuLibre
771
     * djvu: DjVuLibre
771
772
772
     * MP3: Recoll will use the id3info command from the id3lib package to
773
     * MP3: Recoll will use the id3info command from the id3lib package to
773
       extract tag information. Without it, only the filenames will be
774
       extract tag information. Without it, only the file names will be
774
       indexed.
775
       indexed.
775
776
776
   Text, Html, mail folders and Openoffice files are processed internally.
777
   Text, HTML, mail folders and Openoffice files are processed internally.
777
778
778
     ----------------------------------------------------------------------
779
     ----------------------------------------------------------------------
779
780
780
4.3. Building from source
781
4.3. Building from source
781
782
782
  4.3.1. Prerequisites
783
  4.3.1. Prerequisites
783
784
784
   At the very least, you will need to download and install the xapian core
785
   At the very least, you will need to download and install the xapian core
785
   package (Recoll development currently uses version 0.9.5), and the qt
786
   package (Recoll development currently uses version 0.9.5), and the qt
786
   runtime and development packages (Recoll development currently uses
787
   run-time and development packages (Recoll development currently uses
787
   version 3.3.5, but any 3.3 version is probably ok).
788
   version 3.3.5, but any 3.3 version is probably OK).
788
789
789
   You will most probably be able to find a binary package for qt for your
790
   You will most probably be able to find a binary package for qt for your
790
   system. You may have to compile Xapian but this is not difficult (if you
791
   system. You may have to compile Xapian but this is not difficult (if you
791
   are using FreeBSD, there is a port).
792
   are using FreeBSD, there is a port).
792
793
...
...
807
808
808
     * QTDIR should point to the directory above the one that holds the qt
809
     * QTDIR should point to the directory above the one that holds the qt
809
       include files (ie: qt.h).
810
       include files (ie: qt.h).
810
811
811
     * QMAKESPECS should be set to the name of one of the qt mkspecs
812
     * QMAKESPECS should be set to the name of one of the qt mkspecs
812
       subdirectories (ie: linux-g++).
813
       sub-directories (ie: linux-g++).
813
814
814
   On many Linux systems, QTDIR is set by the login scripts, and QMAKESPECS
815
   On many Linux systems, QTDIR is set by the login scripts, and QMAKESPECS
815
   is not needed because there is a default link in mkspecs/.
816
   is not needed because there is a default link in mkspecs/.
816
817
817
   The Recoll configure script does a better job of checking these variables
818
   The Recoll configure script does a better job of checking these variables
...
...
823
   Normal procedure:
824
   Normal procedure:
824
825
825
         cd recoll-xxx
826
         cd recoll-xxx
826
         configure
827
         configure
827
         make
828
         make
828
         (practises usual hardship-repelling invocations)
829
         (practices usual hardship-repelling invocations)
829
     
830
     
830
831
831
   There little autoconfiguration. The configure script will mainly link one
832
   There is little auto-configuration. The configure script will mainly link one
832
   of the system-specific files in the mk directory to mk/sysconf. If your
833
   of the system-specific files in the mk directory to mk/sysconf. If your
833
   system is not known yet, it will tell you as much, and you may want to
834
   system is not known yet, it will tell you as much, and you may want to
834
   manually copy and modify one of the existing files (the new file name
835
   manually copy and modify one of the existing files (the new file name
835
   should be the output of uname -s).
836
   should be the output of uname -s).
836
837
...
...
873
   edit them by hand for now (there is still some hope for a GUI
874
   edit them by hand for now (there is still some hope for a GUI
874
   configuration tool in the future). The most accurate documentation for the
875
   configuration tool in the future). The most accurate documentation for the
875
   configuration parameters is given by comments inside the default files,
876
   configuration parameters is given by comments inside the default files,
876
   and we will just give a general overview here.
877
   and we will just give a general overview here.
877
878
878
   All configuration files share the same format. For exemple, a short
879
   All configuration files share the same format. For example, a short
879
   extract of the main configuration file might look as follows:
880
   extract of the main configuration file might look as follows:
880
881
881
         # Space-separated list of directories to index.
882
         # Space-separated list of directories to index.
882
         topdirs =  ~/docs /usr/share/doc
883
         topdirs =  ~/docs /usr/share/doc
883
884
...
...
891
892
892
     * Parameter affectation (name = value).
893
     * Parameter assignment (name = value).
893
894
894
     * Section definition ([somedirname]).
895
     * Section definition ([somedirname]).
895
896
896
   Section lines allow redefining some parameters for a directory subtree.
897
   Section lines allow redefining some parameters for a directory sub-tree.
897
   Some of the parameters used for indexing are looked up hierarchically from
898
   Some of the parameters used for indexing are looked up hierarchically from
898
   the more to the less specific. Not all parameters can be meaningfully
899
   the more to the less specific. Not all parameters can be meaningfully
899
   redefined, this is specified for each in the next section.
900
   redefined; this is specified for each in the next section.
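
   The hierarchical lookup described above can be pictured with a short
   Python sketch. This is only an illustration, not Recoll's own code: the
   lookup function, the dictionary layout, the directory names and the
   euc-kr value are all hypothetical. It assumes that a section applies to
   every file below the directory it names, and that the deepest enclosing
   section wins over the global part of the file.

```python
import posixpath


def lookup(config, name, path):
    """Return the value of parameter `name` that applies to the file `path`.

    `config` maps a section directory ("" for the global part of the file)
    to a dict of parameters. Sections are tried from the deepest enclosing
    directory up to the root, then the global part is used.
    """
    directory = posixpath.dirname(path)
    while True:
        section = config.get(directory)
        if section is not None and name in section:
            return section[name]
        parent = posixpath.dirname(directory)
        if parent == directory:  # reached the root, nothing more specific
            break
        directory = parent
    return config.get("", {}).get(name)


# Hypothetical configuration: one global value, one redefinition.
config = {
    "": {"defaultcharset": "iso8859-1"},
    "/home/me/docs/korean": {"defaultcharset": "euc-kr"},
}

print(lookup(config, "defaultcharset", "/home/me/docs/korean/old/a.txt"))  # euc-kr
print(lookup(config, "defaultcharset", "/home/me/docs/b.txt"))  # iso8859-1
```

   A parameter that is not redefined in any enclosing section simply falls
   back to the value given before the first section line.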
900
901
901
   The tilde character (~) is expanded in file names to the name of the
902
   The tilde character (~) is expanded in file names to the name of the
...
...
939
           directories that should be completely ignored. The list defined in
940
           directories that should be completely ignored. The list defined in
940
           the default file is:
941
           the default file is:
941
942
942
 *~ #* bin CVS  Cache caughtspam  tmp
943
 *~ #* bin CVS  Cache caughtspam  tmp
943
944
944
           The list can be redefined for subdirectories, but is only actually
945
           The list can be redefined for sub-directories, but is only
945
           changed for the top level ones in topdirs.
946
           actually changed for the top level ones in topdirs.
946
947
947
           The top-level directories are not affected by this list (that is,
948
           The top-level directories are not affected by this list (that is,
948
           a directory in topdirs might match and would still be indexed).
949
           a directory in topdirs might match and would still be indexed).
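
           As a rough illustration of how the default list behaves, here
           is a Python sketch. It assumes that the entries are shell-style
           wildcard patterns matched against the bare file or directory
           name, which is what Python's fnmatch module implements; the
           is_skipped function and its is_topdir argument are hypothetical
           names, and Recoll's actual matching code may differ.

```python
from fnmatch import fnmatchcase

# Default list quoted above.
SKIPPED_NAMES = "*~ #* bin CVS Cache caughtspam tmp".split()


def is_skipped(name, is_topdir=False):
    """Tell whether a file or directory name matches the skipped list.

    Entries of topdirs are never skipped, even when their name matches.
    """
    if is_topdir:
        return False
    return any(fnmatchcase(name, pattern) for pattern in SKIPPED_NAMES)


print(is_skipped("notes.txt~"))           # True: matches *~
print(is_skipped("#draft#"))              # True: matches #*
print(is_skipped("binary"))               # False: only the exact name bin matches
print(is_skipped("tmp", is_topdir=True))  # False: top-level directories are exempt
```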
949
950
950
           The list in the default configuration does not exclude hidden
951
           The list in the default configuration does not exclude hidden
...
...
968
   filtersdir
969
   filtersdir
969
970
970
           A directory to search for the external filter scripts used to
971
           A directory to search for the external filter scripts used to
971
           index some types of files. The value should not be changed, except
972
           index some types of files. The value should not be changed, except
972
           if you want to modify one of the default scripts. The value can be
973
           if you want to modify one of the default scripts. The value can be
973
           redefined for any subdirectory.
974
           redefined for any sub-directory.
974
975
975
   indexstemminglanguages
976
   indexstemminglanguages
976
977
977
           A list of languages for which the stem expansion databases will be
978
           A list of languages for which the stem expansion databases will be
978
           built. See recollindex(1) for possible values. You can add a stem
979
           built. See recollindex(1) for possible values. You can add a stem
...
...
982
983
983
   defaultcharset
984
   defaultcharset
984
985
985
           The name of the character set used for files that do not contain a
986
           The name of the character set used for files that do not contain a
986
           character set definition (ie: plain text files). This can be
987
           character set definition (ie: plain text files). This can be
987
           redefined for any subdirectory. If it is not set at all, the
988
           redefined for any sub-directory. If it is not set at all, the
988
           character set used is the one defined by the nls environment
989
           character set used is the one defined by the nls environment
989
           (LC_ALL, LC_CTYPE, LANG), or iso8859-1 if nothing is set.
990
           (LC_ALL, LC_CTYPE, LANG), or iso8859-1 if nothing is set.
990
991
991
   guesscharset
992
   guesscharset
992
993
...
...
997
   usesystemfilecommand
998
   usesystemfilecommand
998
999
999
           Decide if we use the file -i system command as a final step for
1000
           Decide if we use the file -i system command as a final step for
1000
           determining the mime type for a file (the main procedure uses
1001
           determining the mime type for a file (the main procedure uses
1001
           suffix associations as defined in the mimemap file). This can be
1002
           suffix associations as defined in the mimemap file). This can be
1002
           useful for files with suffixless names, but it will also cause the
1003
           useful for files with suffix-less names, but it will also cause
1003
           indexing of many bogus "text" files.
1004
           the indexing of many bogus "text" files.
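
           The two-step procedure can be sketched in Python as follows.
           This is a simplified illustration under stated assumptions:
           SUFFIX_MAP is a hypothetical stand-in for the mimemap file, the
           function names are invented, and the parsing of the file -i
           output assumes the usual "name: type; charset=..." form.

```python
import os
import subprocess

# Hypothetical suffix associations, standing in for the mimemap file.
SUFFIX_MAP = {
    ".txt": "text/plain",
    ".html": "text/html",
    ".pdf": "application/pdf",
}


def file_command(path):
    """Ask the system file command for a mime type, or return None."""
    result = subprocess.run(["file", "-i", path], capture_output=True, text=True)
    output = result.stdout.strip()
    if result.returncode != 0 or ": " not in output:
        return None
    # Keep what follows the file name, and drop the "; charset=..." part.
    return output.rsplit(": ", 1)[1].split(";", 1)[0].strip()


def mime_type(path, use_system_file_command=False, probe=file_command):
    """Suffix lookup first; the file command is only a final step."""
    suffix = os.path.splitext(path)[1].lower()
    if suffix in SUFFIX_MAP:
        return SUFFIX_MAP[suffix]
    if use_system_file_command:
        return probe(path)
    return None


print(mime_type("notes.TXT"))  # text/plain, found through the suffix
print(mime_type("README"))     # None: no suffix, and the fallback is off
```

           With the fallback enabled, a suffix-less file such as README
           would be handed to the file command, which is how the bogus
           "text" files mentioned above end up in the index.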
1004
1005
1005
   indexallfilenames
1006
   indexallfilenames
1006
1007
1007
           Recoll indexes file names in a special section of the database to
1008
           Recoll indexes file names in a special section of the database to
1008
           allow specific file names searches using wild cards. This
1009
           allow specific file names searches using wild cards. This
1009
           parameter decides if file name indexing is performed only for
1010
           parameter decides if file name indexing is performed only for
1010
           files with mime types that would qualify them for full text
1011
           files with mime types that would qualify them for full text
1011
           indexing, or for all files inside the selected subtrees,
1012
           indexing, or for all files inside the selected subtrees,
1012
           independant of mime type.
1013
           independently of mime type.
1013
1014
1014
   idxabsmlen
1015
   idxabsmlen
1015
1016
1016
           Recoll stores an abstract for each indexed file inside the
1017
           Recoll stores an abstract for each indexed file inside the
1017
           database. This is so that they can be displayed inside the result
1018
           database. This is so that they can be displayed inside the result
...
...
1042
1043
1043
   mimemap also has a recoll_noindex variable which is a list of suffixes.
1044
   mimemap also has a recoll_noindex variable which is a list of suffixes.
1044
   Matching files will be skipped (avoids unnecessary decompressions or file
1045
   Matching files will be skipped (avoids unnecessary decompressions or file
1045
   executions). This is partially redundant with skippedNames in the main
1046
   executions). This is partially redundant with skippedNames in the main
1046
   configuration file, with two differences: it will not affect directories,
1047
   configuration file, with two differences: it will not affect directories,
1047
   and it can be changed for any subdirectory.
1048
   and it can be changed for any sub-directory.
1048
1049
1049
     ----------------------------------------------------------------------
1050
     ----------------------------------------------------------------------
1050
1051
1051
  4.4.3. The mimeconf file
1052
  4.4.3. The mimeconf file
1052
1053
1053
   mimeconf specifies how the different mime types are handled for indexing,
1054
   mimeconf specifies how the different mime types are handled for indexing,
1054
   and for display.
1055
   and for display.
1055
1056
1056
   Changing the indexing parameters is probably not a good idea except if you
1057
   Changing the indexing parameters is probably not a good idea except if you
1057
   are a Recoll developper.
1058
   are a Recoll developer.
1058
1059
1059
   You may want to adjust the external viewers defined in (ie: html is either
1060
   You may want to adjust the external viewers defined in (ie: HTML is either
1060
   previewed internally or displayed using firefox, but you may prefer
1061
   previewed internally or displayed using firefox, but you may prefer
1061
   mozilla, your openoffice.org program might be named oofice instead of
1062
   mozilla, your openoffice.org program might be named oofice instead of
1062
   openoffice ...). Look for the [view] section.
1063
   openoffice ...). Look for the [view] section.
1063
1064
1064
   You can also change the icons which are displayed by recoll in the result
1065
   You can also change the icons which are displayed by recoll in the result