
a/src/README b/src/README
...
...
43
43
44
                             2.4.2. Using cron to automate indexing
44
                             2.4.2. Using cron to automate indexing
45
45
46
                2.5. Real time indexing
46
                2.5. Real time indexing
47
47
48
   3. Search
48
   3. Searching
49
49
50
                3.1. Simple search
50
                3.1. Simple search
51
51
52
                3.2. The result list
52
                3.2. The result list
53
53
54
                             3.2.1. The result list right-click menu
54
                             3.2.1. The result list right-click menu
55
55
56
                3.3. The preview window
56
                3.3. The preview window
57
57
58
                3.4. The query language
59
58
                3.4. Complex/advanced search
60
                3.5. Complex/advanced search
59
61
60
                3.5. The term explorer tool
62
                3.6. The term explorer tool
61
63
64
                3.7. More about wildcards
65
62
                3.6. Multiple databases
66
                3.8. Multiple databases
63
67
64
                3.7. Document history
68
                3.9. Document history
65
69
66
                3.8. Sorting search results
70
                3.10. Sorting search results
67
71
68
                3.9. Search tips, shortcuts
72
                3.11. Search tips, shortcuts
69
73
70
                3.10. Customizing the search interface
74
                3.12. Customizing the search interface
71
75
72
   4. Installation
76
   4. Installation
73
77
74
                4.1. Installing a prebuilt copy
78
                4.1. Installing a prebuilt copy
75
79
...
...
94
                             4.4.2. The mimemap file
98
                             4.4.2. The mimemap file
95
99
96
                             4.4.3. The mimeconf file
100
                             4.4.3. The mimeconf file
97
101
98
                             4.4.4. The mimeview file
102
                             4.4.4. The mimeview file
103
104
                             4.4.5. Examples of configuration adjustments
99
105
100
     ----------------------------------------------------------------------
106
     ----------------------------------------------------------------------
101
107
102
                            Chapter 1. Introduction
108
                            Chapter 1. Introduction
103
109
...
...
207
213
208
   Indexing is the process by which the set of documents is analyzed and the
214
   Indexing is the process by which the set of documents is analyzed and the
209
   data entered into the database. Recoll indexing is normally incremental:
215
   data entered into the database. Recoll indexing is normally incremental:
210
   documents will only be processed if they have been modified. On the first
216
   documents will only be processed if they have been modified. On the first
211
   execution, of course, all documents will need processing. A full index
217
   execution, of course, all documents will need processing. A full index
212
   build can be forced later on by specifying an option to the indexing
218
   build can be forced later by specifying an option to the indexing command
213
   command (recollindex -z).
219
   (recollindex -z).
214
220
215
   Recoll indexing can be performed with two different methods:
221
   Recoll indexing can be performed with two different methods:
216
222
217
     * Periodic indexing: indexing takes place at discrete times, by
223
     * Periodic indexing: indexing takes place at discrete times, by
218
       executing the recollindex command. The typical usage is to have a
224
       executing the recollindex command. The typical usage is to have a
...
...
433
   email folders change. You probably do not want to enable it if your system
439
   email folders change. You probably do not want to enable it if your system
434
   is short on resources. Periodic indexing is adequate in most cases.
440
   is short on resources. Periodic indexing is adequate in most cases.
435
441
436
     ----------------------------------------------------------------------
442
     ----------------------------------------------------------------------
437
443
438
                               Chapter 3. Search
444
                              Chapter 3. Searching
439
445
440
   The recoll program provides the user interface for searching. It is based
446
   The recoll program provides the user interface for searching. It is based
441
   on the Qt library.
447
   on the Qt library.
442
448
443
     ----------------------------------------------------------------------
449
     ----------------------------------------------------------------------
...
...
450
456
451
    3. Enter search term(s) in the text field at the top of the window.
457
    3. Enter search term(s) in the text field at the top of the window.
452
458
453
    4. Click the Search button or hit the Enter key to start the search.
459
    4. Click the Search button or hit the Enter key to start the search.
454
460
455
   The initial default search mode is Any term. This will look for documents
461
   The initial default search mode is All terms. This will look for documents
456
   with any of the search terms (the ones with more terms will get better
462
   containing all of the search terms (the ones with more terms will get
457
   scores). All terms will ensure that only documents with all the terms will
463
   better scores). Any term will search for documents where at least one of
458
   be returned. File name will specifically look for file names, and allows
464
   the terms appear. File name will specifically look for file names.
459
   using wildcards (*, ? , []).
465
466
   The fourth entry (Query Language) is described in its own section.
467
468
   All search modes allow wildcards inside terms (*, ?, []). You may want to
469
   have a look at the section about wildcards for more information about
470
   this.
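
   As a rough illustration of the difference between the All terms and Any
   term modes described above, here is a minimal Python sketch. The
   matches helper and the two sample documents are hypothetical; the sketch
   ignores stemming, wildcards and scoring, and is not how Recoll evaluates
   a query internally.

```python
def matches(document_text, terms, mode):
    """Return True if the document satisfies the query in the given mode."""
    # Case is ignored, as in the simple search.
    words = set(document_text.lower().split())
    wanted = [t.lower() for t in terms]
    if mode == "all":
        return all(t in words for t in wanted)
    if mode == "any":
        return any(t in words for t in wanted)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical documents, for illustration only.
docs = {
    "a.txt": "virtual reality headset review",
    "b.txt": "reality television schedule",
}
terms = ["virtual", "reality"]

all_hits = [name for name, text in docs.items() if matches(text, terms, "all")]
any_hits = [name for name, text in docs.items() if matches(text, terms, "any")]
print(all_hits)
print(any_hits)
```

   Only a.txt contains both terms, so it is the only All terms hit, while
   Any term returns both documents.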
460
471
461
   You can search for exact phrases (adjacent words in a given order) by
472
   You can search for exact phrases (adjacent words in a given order) by
462
   enclosing the input inside double quotes. Ex: "virtual reality".
473
   enclosing the input inside double quotes. Ex: "virtual reality".
463
474
464
   Character case has no influence on search, except that you can disable
475
   Character case has no influence on search, except that you can disable
...
...
470
   Recoll remembers the last few searches that you performed. You can use the
481
   Recoll remembers the last few searches that you performed. You can use the
471
   simple search text entry widget (a combobox) to recall them (click on the
482
   simple search text entry widget (a combobox) to recall them (click on the
472
   thing at the right of the text field). Please note, however, that only the
483
   thing at the right of the text field). Please note, however, that only the
473
   search texts are remembered, not the mode (all/any/file name).
484
   search texts are remembered, not the mode (all/any/file name).
474
485
475
   Typing Esc Space) while entering a word in the simple search entry will
486
   Typing Esc Space while entering a word in the simple search entry will
476
   open a window with possible completions for the word. The completions are
487
   open a window with possible completions for the word. The completions are
477
   extracted from the database.
488
   extracted from the database.
478
489
479
   Double-clicking on a word in the result list or a preview window will
490
   Double-clicking on a word in the result list or a preview window will
480
   insert it into the simple search entry field.
491
   insert it into the simple search entry field.
492
493
   Note that, apart from wildcard characters (single ? characters are ok),
494
   you can cut and paste any text into an All terms or Any term search field,
495
   punctuation, newlines and all. Recoll will process it and produce a
496
   meaningful search. This is what most differentiates this mode from the
497
   Query Language mode, where you have to care about the syntax.
481
498
482
   You can use the Tools / Advanced search dialog for more complex searches.
499
   You can use the Tools / Advanced search dialog for more complex searches.
483
500
484
     ----------------------------------------------------------------------
501
     ----------------------------------------------------------------------
485
502
...
...
494
511
495
   Clicking on the Preview link for an entry will open an internal preview
512
   Clicking on the Preview link for an entry will open an internal preview
496
   window for the document. Further Preview clicks for the same search will
513
   window for the document. Further Preview clicks for the same search will
497
   open tabs in the existing preview window. You can use Shift+Click to force
514
   open tabs in the existing preview window. You can use Shift+Click to force
498
   the creation of another preview window, which may be useful to view the
515
   the creation of another preview window, which may be useful to view the
499
   documents side by side.
516
   documents side by side. (You can also browse successive results in a
517
   single preview window by typing Shift+ArrowUp/Down in the window).
500
518
501
   Clicking the Edit link will attempt to start an external viewer. The
519
   Clicking the Edit link will attempt to start an external viewer. The
502
   viewers can be configured through the user preferences dialog, or by
520
   viewers can be configured through the user preferences dialog, or by
503
   editing the mimeview configuration file.
521
   editing the mimeview configuration file.
504
522
...
...
541
     * Find similar
559
     * Find similar
542
560
543
     * Parent document
561
     * Parent document
544
562
545
   The Preview and Edit entries do the same thing as the corresponding links.
563
   The Preview and Edit entries do the same thing as the corresponding links.
546
   The two following entries will copy either an URL or the file path to the
564
547
   clipboard, for pasting into another application.
565
   The Copy File Name and Copy Url copy the relevant data to the clipboard,
566
   for later pasting.
548
567
549
   The Find similar entry will select a number of relevant terms from the
568
   The Find similar entry will select a number of relevant terms from the
550
   current document and enter them into the simple search field. You can then
569
   current document and enter them into the simple search field. You can then
551
   start a simple search, with a good chance of finding documents related to
570
   start a simple search, with a good chance of finding documents related to
552
   the current result.
571
   the current result.
553
554
   The Copy File Name and Copy Url copy the relevant data to the clipboard,
555
   for later pasting.
556
572
557
   The Parent document entry will appear for documents which are not actually
573
   The Parent document entry will appear for documents which are not actually
558
   files but are part of, or attached to, a higher level document. This entry
574
   files but are part of, or attached to, a higher level document. This entry
559
   is mainly useful for email attachments and permits viewing the message to
575
   is mainly useful for email attachments and permits viewing the message to
560
   which the document is attached. Note that the entry will also appear for
576
   which the document is attached. Note that the entry will also appear for
...
...
568
584
569
   The preview window opens when you first click a Preview link inside the
585
   The preview window opens when you first click a Preview link inside the
570
   result list.
586
   result list.
571
587
572
   Subsequent preview requests for a given search open new tabs in the
588
   Subsequent preview requests for a given search open new tabs in the
573
   existing window.
589
   existing window (except if you hold the Shift key while clicking which
590
   will open a new window for side by side viewing).
574
591
575
   Starting another search and requesting a preview will create a new preview
592
   Starting another search and requesting a preview will create a new preview
576
   window. The old one stays open until you close it.
593
   window. The old one stays open until you close it.
577
594
578
   You can close a preview tab by typing ^W (Ctrl + W) in the window. Closing
595
   You can close a preview tab by typing ^W (Ctrl + W) in the window. Closing
...
...
597
   string is found, the cursor will be positioned at the first occurrence of
614
   string is found, the cursor will be positioned at the first occurrence of
598
   the search string.
615
   the search string.
599
616
600
     ----------------------------------------------------------------------
617
     ----------------------------------------------------------------------
601
618
619
3.4. The query language
620
621
   The query language processor is activated on the simple search entry when
622
   the search mode selector is set to Query Language.
623
624
   Here follows a sample request that we are going to explain:
625
626
           mime:message/rfc822 author:"john doe" Beatles OR Lennon Live OR Unplugged -potatoes
627
     
628
629
   This would search for all email messages with John Doe appearing as a
630
   phrase in the From: header, and containing either beatles or lennon and
631
   either live or unplugged but not potatoes.
632
633
   The first element, mime:message/rfc822 is a special switch that restricts
634
   the results to be email messages. There could be several such switches,
635
   which would form a list of allowed types.
636
637
   The second element author:"john doe" is a phrase search limited to a
638
   specific field. Phrase searches are specified as usual by enclosing the
639
   words in double quotes. The field specification appears before the colon.
640
   Recoll currently manages the following fields:
641
642
     * title, subject or caption are synonyms which specify data to be
643
       searched for in the document title or subject.
644
645
     * author or from for searching the document's originators.
646
647
     * keyword for searching the document-specified keywords (few documents
648
       actually have any).
649
650
   The query language is currently the only way to use the Recoll field
651
   search capability.
652
653
   All elements in the search entry are normally combined with an implicit
654
   AND. It is possible to specify that elements be OR'ed instead, as in
655
   Beatles OR Lennon. The OR must be entered literally (capitals), and it has
656
   priority over the AND associations: word1 word2 OR word3 means word1 AND
657
   (word2 OR word3) not (word1 AND word2) OR word3. Do not enter explicit
658
   parentheses, as they are not supported for now.
659
660
   An entry preceded by a - specifies a term that should not appear.
661
662
   Words inside phrases and capitalized words are not stem-expanded.
663
   Wildcards may be used anywhere.
664
665
   You can use the show query link at the top of the result list to check the
666
   exact query which was finally executed by Xapian.
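
   The way OR takes priority over the implicit AND can be sketched in a few
   lines of Python. This is only an illustration of the grouping rule stated
   above, under the assumption that tokens are separated by white space and
   that phrases are double-quoted; group_or_terms is a hypothetical helper,
   not part of Recoll.

```python
import shlex


def group_or_terms(query):
    """Split a query into AND-ed groups; each group is a list of OR-ed terms."""
    groups = []
    join_next = False
    # shlex.split keeps a double-quoted phrase together as a single token.
    for token in shlex.split(query):
        if token == "OR":  # OR must be entered literally, in capitals
            join_next = True
            continue
        if join_next and groups:
            groups[-1].append(token)  # attach to the preceding group
        else:
            groups.append([token])  # start a new AND-ed group
        join_next = False
    return groups


print(group_or_terms("word1 word2 OR word3"))
```

   With this grouping, word1 word2 OR word3 yields two AND-ed groups, the
   second holding word2 and word3, which corresponds to word1 AND (word2 OR
   word3).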
667
668
     ----------------------------------------------------------------------
669
602
3.4. Complex/advanced search
670
3.5. Complex/advanced search
603
671
604
   The advanced search dialog has fields that will allow a more refined
672
   The advanced search dialog has a number of fields that will allow a more
605
   search. It has a number of entry fields, each of which is configurable for
673
   refined search. Each entry field is configurable for the following modes:
606
   the following modes:
607
674
608
     * All terms.
675
     * All terms.
609
676
610
     * Any term.
677
     * Any term.
611
678
...
...
617
684
618
     * Filename search with wildcards.
685
     * Filename search with wildcards.
619
686
620
   Additional entry fields can be created by clicking the Add clause button.
687
   Additional entry fields can be created by clicking the Add clause button.
621
688
622
   All relevant fields will be combined by an implicit AND or OR conjunction.
689
   You can choose that all relevant fields will be combined by either an AND
623
   All types of clauses except "phrase" and "near" can accept a mix of single
690
   or an OR conjunction. All types of clauses except "phrase" and "near" can
624
   words and phrases enclosed in double quotes. Stemming expansion will be
691
   accept a mix of single words and phrases enclosed in double quotes.
625
   performed for all terms not beginning with a capital letter, except for
692
   Stemming expansion will be performed for all terms not beginning with a
626
   "phrase" clauses.
693
   capital letter, except for terms inside "phrase" clauses. Wildcards will
694
   be processed everywhere.
627
695
628
   Advanced search will also let you search for documents of specific mime
696
   Advanced search will also let you search for documents of specific mime
629
   types (ie: only text/plain, or text/HTML or application/pdf etc...). The
697
   types (ie: only text/plain, or text/HTML or application/pdf etc...). The
630
   state of the file type selection can be saved as the default (the file
698
   state of the file type selection can be saved as the default (the file
631
   type filter will not be activated at program start-up, but the lists will
699
   type filter will not be activated at program start-up, but the lists will
...
...
642
   Click on the Show query details link at the top of the result page to see
710
   Click on the Show query details link at the top of the result page to see
643
   the query expansion.
711
   the query expansion.
644
712
645
     ----------------------------------------------------------------------
713
     ----------------------------------------------------------------------
646
714
647
3.5. The term explorer tool
715
3.6. The term explorer tool
648
716
649
   Recoll automatically manages the expansion of search terms to their
717
   Recoll automatically manages the expansion of search terms to their
650
   derivatives (ie: plural/singular, verb inflections). But there are other
718
   derivatives (ie: plural/singular, verb inflections). But there are other
651
   cases where the exact search term is not known. For example, you may not
719
   cases where the exact search term is not known. For example, you may not
652
   remember the exact spelling, or only know the beginning of the name.
720
   remember the exact spelling, or only know the beginning of the name.
...
...
656
   terms list. It has three modes of operations:
724
   terms list. It has three modes of operations:
657
725
658
   Wildcard
726
   Wildcard
659
727
660
           In this mode of operation, you can enter a search string with
728
           In this mode of operation, you can enter a search string with
661
           shell-like wildcards (*, ?). ie: xapi* .
729
           shell-like wildcards (*, ?, []). ie: xapi* would display all index
730
           terms beginning with xapi. (More about wildcards here).
662
731
663
   Regular expression
732
   Regular expression
664
733
665
           This mode will accept a regular expression as input. Example:
734
           This mode will accept a regular expression as input. Example:
666
           word[0-9]+ . The regular expression is anchored by enclosing in ^
735
           word[0-9]+. The expression is implicitly anchored at the
667
           and $ before execution.
736
           beginning. Ie: press will match pression but not expression. You
737
           can use .*press to match the latter, but be aware that this will
738
           cause a full index term list scan, which can be quite long.
668
739
669
   Stem expansion
740
   Stem expansion
670
741
671
           This mode will perform the usual stem expansion normally done as
742
           This mode will perform the usual stem expansion normally done as
672
           part of user input processing. As such it is probably mostly useful
743
           part of user input processing. As such it is probably mostly useful
...
...
693
   simple search entry field. You can also cut/paste between the result list
764
   simple search entry field. You can also cut/paste between the result list
694
   and any entry field (the end of lines will be taken care of).
765
   and any entry field (the end of lines will be taken care of).
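
   The anchoring behaviour of the regular expression mode can be reproduced
   with Python's re.match, which also anchors a pattern at the beginning of
   the string only. The sketch below is an analogy, not the term explorer's
   implementation; the term list and the matching_terms helper are made up
   for the example.

```python
import re


def matching_terms(pattern, terms):
    """Return the terms matched by pattern, anchored at the beginning only."""
    regex = re.compile(pattern)
    # re.match anchors at the start of the string, not at the end.
    return [t for t in terms if regex.match(t)]


# Hypothetical index terms, for illustration only.
terms = ["press", "pression", "expression", "word12"]

print(matching_terms("press", terms))
print(matching_terms(".*press", terms))
print(matching_terms("word[0-9]+", terms))
```

   The first call leaves out expression, and the second call, with a leading
   .*, includes it. In the sketch, as in the tool, the unanchored form has to
   test every term in the list.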
695
766
696
     ----------------------------------------------------------------------
767
     ----------------------------------------------------------------------
697
768
769
3.7. More about wildcards
770
771
   All words entered in Recoll search fields will be processed for wildcard
772
   expansion before the request is finally executed.
773
774
   The wildcard characters are:
775
776
     * * which matches 0 or more characters.
777
778
     * ? which matches a single character.
779
780
     * [] which allow defining sets of characters to be matched (ex: [abc]
781
       matches a single character which may be 'a' or 'b' or 'c', [0-9]
782
       matches any digit).
783
784
   You should be aware of a few things before using wildcards.
785
786
     * Using a wildcard character at the beginning of a word can make for a
787
       slow search because Recoll will have to scan the whole index term list
788
       to find the matches.
789
790
     * Using a * at the end of a word can produce more matches than you would
791
       think, and strange search results. You can use the term explorer tool
792
       to check what completions exist for a given term. You can also see
793
       exactly what search was performed by clicking on the link at the top
794
       of the result list. In general, for natural language terms, stem
795
       expansion will produce better results than an ending * (stem expansion
796
       is turned off when any wildcard character appears in the term).
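
   The three wildcard forms behave like shell globbing, which Python exposes
   in its fnmatch module. The following sketch expands a pattern against a
   small, invented term list; it only illustrates the matching rules and
   makes no claim about how Recoll walks its index.

```python
from fnmatch import fnmatchcase


def expand(pattern, index_terms):
    """Return the index terms matching a shell-like wildcard pattern."""
    return sorted(t for t in index_terms if fnmatchcase(t, pattern))


# Hypothetical index terms, for illustration only.
index_terms = ["xapian", "xapiandb", "xml", "file1", "file2", "fileA"]

print(expand("xapi*", index_terms))      # * matches 0 or more characters
print(expand("x?l", index_terms))        # ? matches a single character
print(expand("file[0-9]", index_terms))  # [] matches one character from a set
```

   A pattern such as *db gives no fixed prefix to narrow the candidates,
   which is the situation in which the whole term list has to be scanned.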
797
798
     ----------------------------------------------------------------------
799
698
3.6. Multiple databases
800
3.8. Multiple databases
699
801
700
   Multiple Recoll databases or indexes can be created by using several
802
   Multiple Recoll databases or indexes can be created by using several
701
   configuration directories which are usually set to index different areas
803
   configuration directories which are usually set to index different areas
702
   of the file system. A specific index can be selected for updating or
804
   of the file system. A specific index can be selected for updating or
703
   searching, using the RECOLL_CONFDIR environment variable or the -c option
805
   searching, using the RECOLL_CONFDIR environment variable or the -c option
...
...
729
831
730
 export RECOLL_EXTRA_DBS=/some/place/xapiandb:/some/other/db
832
 export RECOLL_EXTRA_DBS=/some/place/xapiandb:/some/other/db
731
833
732
   A typical usage scenario for the multiple index feature would be for a
834
   A typical usage scenario for the multiple index feature would be for a
733
   system administrator to set up a central index for shared data, that you
835
   system administrator to set up a central index for shared data, that you
734
   may choose to search, or not, in addition to your personal data. Of
836
   choose to search or not in addition to your personal data. Of course,
735
   course, there are other possibilities. There are many cases where you know
837
   there are other possibilities. There are many cases where you know the
736
   the subset of files that you want to be searched for a given query, and
838
   subset of files that should be searched, and where narrowing the search
737
   where restricting the query will much improve the precision of the
839
   can improve the results. You can achieve approximately the same effect
738
   results. This can also be performed with the directory filter in advanced
840
   with the directory filter in advanced search, but multiple indexes will
739
   search, but multiple indexes will have much better performance and may be
841
   have much better performance and may be worth the trouble.
740
   worth the trouble.
741
842
742
     ----------------------------------------------------------------------
843
     ----------------------------------------------------------------------
743
844
744
3.7. Document history
845
3.9. Document history
745
846
746
   Documents that you actually view (with the internal preview or an external
847
   Documents that you actually view (with the internal preview or an external
747
   tool) are entered into the document history, which is remembered. You can
848
   tool) are entered into the document history, which is remembered. You can
748
   display the history list by using the Tools/Doc History menu entry.
849
   display the history list by using the Tools/Doc History menu entry.
749
850
750
     ----------------------------------------------------------------------
851
     ----------------------------------------------------------------------
751
852
752
3.8. Sorting search results
853
3.10. Sorting search results
753
854
754
   The documents in a result list are normally sorted in order of relevance.
855
   The documents in a result list are normally sorted in order of relevance.
755
   It is possible to specify different sort parameters by using the Sort
856
   It is possible to specify different sort parameters by using the Sort
756
   parameters dialog (located in the Tools menu).
857
   parameters dialog (located in the Tools menu).
757
858
...
...
762
   The sort parameters stay in effect until they are explicitly reset, or the
863
   The sort parameters stay in effect until they are explicitly reset, or the
763
   program exits. An activated sort is indicated in the result list header.
864
   program exits. An activated sort is indicated in the result list header.
764
865
765
     ----------------------------------------------------------------------
866
     ----------------------------------------------------------------------
766
867
767
3.9. Search tips, shortcuts
868
3.11. Search tips, shortcuts
768
869
769
   Term completion. Typing Esc Space in the simple search entry field while
870
   Term completion. Typing Esc Space in the simple search entry field while
770
   entering a word will either complete the current word if its beginning
871
   entering a word will either complete the current word if its beginning
771
   matches a unique term in the index, or open a window to propose a list of
872
   matches a unique term in the index, or open a window to propose a list of
772
   completions.
873
   completions.
...
...
828
929
829
   Quitting. Entering ^Q almost anywhere will close the application.
930
   Quitting. Entering ^Q almost anywhere will close the application.
830
931
831
     ----------------------------------------------------------------------
932
     ----------------------------------------------------------------------
832
933
833
3.10. Customizing the search interface
934
3.12. Customizing the search interface
834
935
835
   It is possible to customize some aspects of the search interface by using
936
   It is possible to customize some aspects of the search interface by using
836
   Query configuration entry in the Preferences menu.
937
   Query configuration entry in the Preferences menu.
837
938
838
   There are two tabs in the dialog, dealing with the interface itself, and
939
   There are two tabs in the dialog, dealing with the interface itself, and
...
...
900
1001
901
     * Auto-start simple search on white space entry: if this is checked, a
1002
     * Auto-start simple search on white space entry: if this is checked, a
902
       search will be executed each time you enter a space in the simple
1003
       search will be executed each time you enter a space in the simple
903
       search input field. This lets you look at the result list as you enter
1004
       search input field. This lets you look at the result list as you enter
904
       new terms. This is off by default, you may like it or not...
1005
       new terms. This is off by default, you may like it or not...
1006
1007
     * Start with advanced search dialog open and Start with sort dialog
1008
       open: If you use these dialogs all the time, checking these entries
1009
       will get them to open when recoll starts.
1010
1011
     * Use desktop preferences to choose document editor: if this is checked,
1012
       the xdg-open utility will be used to open files when you click the
1013
       Edit link in the result list, instead of the application defined in
1014
       mimeview. xdg-open will in turn use your desktop preferences to choose
1015
       an appropriate application.
905
1016
906
   Search parameters:
1017
   Search parameters:
907
1018
908
     * Stemming language: stemming obviously depends on the document's
1019
     * Stemming language: stemming obviously depends on the document's
909
       language. This listbox will let you choose among the stemming databases
1020
       language. This listbox will let you choose among the stemming databases
...
...
931
   External indexes: This panel will let you browse for additional indexes
1042
   External indexes: This panel will let you browse for additional indexes
932
   that you may want to search. External indexes are designated by their
1043
   that you may want to search. External indexes are designated by their
933
   database directory (ie: /home/someothergui/.recoll/xapiandb,
1044
   database directory (ie: /home/someothergui/.recoll/xapiandb,
934
   /usr/local/recollglobal/xapiandb).
1045
   /usr/local/recollglobal/xapiandb).
935
1046
936
   Once entered, the indexes will appear in the All indexes list, and you can
1047
   Once entered, the indexes will appear in the External indexes list, and
937
   chose which ones you want to use at any moment by transferring them
1048
   you can choose which ones you want to use at any moment by checking or
938
   to/from the Active indexes list.
1049
   unchecking their entries.
939
1050
940
   Your main database (the one the current configuration indexes to), is
1051
   Your main database (the one the current configuration indexes to), is
941
   always implicitly active. If this is not desirable, you can set up your
1052
   always implicitly active. If this is not desirable, you can set up your
942
   configuration so that it indexes, for example, an empty directory.
1053
   configuration so that it indexes, for example, an empty directory.
943
1054
...
...
1010
1121
1011
     * MP3: Recoll will use the id3info command from the id3lib package to
1122
     * MP3: Recoll will use the id3info command from the id3lib package to
1012
       extract tag information. Without it, only the file names will be
1123
       extract tag information. Without it, only the file names will be
1013
       indexed.
1124
       indexed.
1014
1125
1015
   Text, HTML, mail folders and Openoffice files are processed internally.
1126
   Text, HTML, mail folders, Openoffice and Scribus files are processed
1127
   internally. Lyx is used to index Lyx files. Many filters need sed and awk.
1016
1128
1017
     ----------------------------------------------------------------------
1129
     ----------------------------------------------------------------------
1018
1130
1019
4.3. Building from source
1131
4.3. Building from source
1020
1132
...
...
1110
   recoll and recollindex.
1222
   recoll and recollindex.
1111
1223
1112
   If the .recoll directory does not exist when recoll or recollindex are
1224
   If the .recoll directory does not exist when recoll or recollindex are
1113
   started, it will be created with a set of empty configuration files.
1225
   started, it will be created with a set of empty configuration files.
1114
   recoll will give you a chance to edit the configuration file before
1226
   recoll will give you a chance to edit the configuration file before
1115
   starting indexing. recollindex will proceed immediately.
1227
   starting indexing. recollindex will proceed immediately. To avoid
1228
   mistakes, the automatic directory creation will only occur for the default
1229
   location, not if -c or RECOLL_CONFDIR is used (in those cases, you
1230
   will have to create the directory).
1116
1231
1117
   All configuration files share the same format. For example, a short
1232
   All configuration files share the same format. For example, a short
1118
   extract of the main configuration file might look as follows:
1233
   extract of the main configuration file might look as follows:
1119
1234
1120
         # Space-separated list of directories to index.
1235
         # Space-separated list of directories to index.
...
...
1140
   in the next section.
1255
   in the next section.
1141
1256
1142
   The tilde character (~) is expanded in file names to the name of the
1257
   The tilde character (~) is expanded in file names to the name of the
1143
   user's home directory.
1258
   user's home directory.
1144
1259
1145
   White space is used for separation inside lists. Elements with embedded
1260
   White space is used for separation inside lists. List elements with
1146
   spaces can be quoted using double-quotes.
1261
   embedded spaces can be quoted using double-quotes.
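
   The list and tilde rules above can be illustrated with a short Python
   sketch. It is only an approximation: parse_list_value is a hypothetical
   helper, and shlex stands in for Recoll's own parser (shlex also accepts
   single quotes and backslash escapes, which the text above does not
   promise).

```python
import os
import shlex


def parse_list_value(value, home=None):
    """Split a space-separated list value, honouring double quotes.

    Illustration only: shlex approximates the quoting rule described
    in the text; it is not Recoll's own parser.
    """
    home = home or os.path.expanduser("~")
    items = shlex.split(value)
    # Expand a leading tilde to the user's home directory.
    return [
        home + item[1:] if item == "~" or item.startswith("~/") else item
        for item in items
    ]
```

   For example, the value ~/docs "/srv/My Documents" yields two elements,
   the second one keeping its embedded space.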
1147
1262
1148
     ----------------------------------------------------------------------
1263
     ----------------------------------------------------------------------
1149
1264
1150
  4.4.1. Main configuration file
1265
  4.4.1. Main configuration file
1151
1266
...
...
1170
   dbdir
1285
   dbdir
1171
1286
1172
           The name of the Xapian data directory. It will be created if
1287
           The name of the Xapian data directory. It will be created if
1173
           needed when the index is initialized. If this is not an absolute
1288
           needed when the index is initialized. If this is not an absolute
1174
           path, it will be interpreted relative to the configuration
1289
           path, it will be interpreted relative to the configuration
1175
           directory.
1290
           directory. The value can have embedded spaces but leading or
1291
           trailing spaces will be trimmed. You cannot use quotes here.
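
           As a rough sketch of the dbdir rules described above (trimming,
           then resolution relative to the configuration directory), here
           is a hypothetical Python helper. The function name and the use
           of POSIX path rules are assumptions made for illustration, not
           Recoll code.

```python
import posixpath


def resolve_dbdir(value, confdir):
    """Return the index directory for a dbdir value (hypothetical helper)."""
    name = value.strip()  # leading and trailing spaces are trimmed
    if posixpath.isabs(name):
        return name  # absolute paths are used as-is
    # Relative values are interpreted relative to the configuration directory.
    return posixpath.join(confdir, name)
```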
1176
1292
1177
   skippedNames
1293
   skippedNames
1178
1294
1179
           A space-separated list of patterns for names of files or
1295
           A space-separated list of patterns for names of files or
1180
           directories that should be completely ignored. The list defined in
1296
           directories that should be completely ignored. The list defined in
1181
           the default file is:
1297
           the default file is:
1182
1298
1183
 *~ #* bin CVS  Cache caughtspam  tmp
1299
 skippedNames = #* bin CVS  Cache cache* caughtspam  tmp .thumbnails .svn \
1300
          *~ recollrc
1184
1301
1185
           The list can be redefined for sub-directories, but is only
1302
           The list can be redefined for sub-directories, but is only
1186
           actually changed for the top level ones in topdirs.
1303
           actually changed for the top level ones in topdirs.
1187
1304
1188
           The top-level directories are not affected by this list (that is,
1305
           The top-level directories are not affected by this list (that is,
...
...
1193
           index quite a few things that you do not want. On the other hand,
1310
           index quite a few things that you do not want. On the other hand,
1194
           mail user agents like thunderbird usually store messages in hidden
1311
           mail user agents like thunderbird usually store messages in hidden
1195
           directories, and you probably want this indexed. One possible
1312
           directories, and you probably want this indexed. One possible
1196
           solution is to have .* in skippedNames, and add things like
1313
           solution is to have .* in skippedNames, and add things like
1197
           ~/.thunderbird or ~/.evolution in topdirs.
1314
           ~/.thunderbird or ~/.evolution in topdirs.

   skippedPaths and daemSkippedPaths

           A space-separated list of patterns for paths of files or
           directories that should be skipped. There is no default in the
           sample configuration file, but the code always adds the
           configuration and database directories in there.

           skippedPaths is used both by batch and real time indexing.
           daemSkippedPaths can be used to specify things that should be
           indexed at startup, but not monitored.

           Example of use for skipping text files only in a specific
           directory:

 skippedPaths = ~/somedir/*.txt
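
           To make the effect of such a pattern concrete, here is a small
           Python sketch. It assumes shell-style wildcard matching as
           provided by Python's fnmatch module; is_skipped is a
           hypothetical helper and the exact matching rules used by the
           indexer may differ.

```python
import fnmatch


def is_skipped(path, patterns, home):
    """Return True if path matches one of the skippedPaths-style patterns.

    Assumption: patterns are shell-style wildcards. Note that Python's
    fnmatch lets "*" match "/" as well, so files in sub-directories of
    ~/somedir would also match "~/somedir/*.txt" in this sketch.
    """
    for pattern in patterns:
        # Expand a leading tilde to the home directory.
        if pattern == "~" or pattern.startswith("~/"):
            pattern = home + pattern[1:]
        if fnmatch.fnmatchcase(path, pattern):
            return True
    return False
```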
1198
1332
1199
   loglevel,daemloglevel
1333
   loglevel,daemloglevel
1200
1334
1201
           Verbosity level for recoll and recollindex. A value of 4 lists
1335
           Verbosity level for recoll and recollindex. A value of 4 lists
1202
           quite a lot of debug/information messages. 2 only lists errors.
1336
           quite a lot of debug/information messages. 2 only lists errors.
...
...
1325
   non-default entries, which will override those from the central
1459
   non-default entries, which will override those from the central
1326
   configuration file.
1460
   configuration file.
1327
1461
1328
   Please note that these entries must be placed under a [view] section.
1462
   Please note that these entries must be placed under a [view] section.
1329
1463
   If "Use desktop preferences to choose document editor" is checked in the
   user preferences, all mimeview entries will be ignored except the one
   labelled application/x-all (which is set to use xdg-open by default).
1467
1330
     ----------------------------------------------------------------------
1468
     ----------------------------------------------------------------------

  4.4.5. Examples of configuration adjustments

    4.4.5.1. Adding an external viewer for a non-indexed type

   Imagine that you have some kind of file which does not have indexable
   content, but for which you would like to have a functional Edit link in
   the result list (when found by file name). The file names end in .blob and
   can be displayed by the blobviewer application.

   You need two entries in the configuration files for this to work:

     * In $RECOLL_CONFDIR/mimemap (typically ~/.recoll/mimemap), add the
       following line:

              .blob = application/x-blobapp

       Note that the mime type is made up here, and you could call it
       diesel/oil just the same.

     * In $RECOLL_CONFDIR/mimeview under the [view] section:

                  application/x-blobapp = blobviewer %f

       We are supposing that blobviewer wants a file name parameter here; you
       would use %u if it liked URLs better.

   If you just wanted to change the application used by Recoll to display a
   mime type which it already knows, you would just need to edit mimeview.
   The entries you add in your personal file override those in the central
   configuration, which you do not need to alter.
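
   The difference between %f and %u in a viewer command can be sketched in
   Python as follows. This is an illustration only: build_viewer_command is
   a hypothetical helper, and the file:// URL form is an assumption about
   what a viewer that prefers URLs would expect, not a description of
   Recoll's internal code.

```python
import re
import shlex
from urllib.parse import quote


def build_viewer_command(template, path):
    """Expand %f (file name) and %u (URL) in a mimeview-style command."""
    substitutions = {
        "%f": path,
        "%u": "file://" + quote(path),  # assumed URL form for local files
    }
    # Split the command first, so that a file name with spaces
    # stays a single argument after substitution.
    return [
        re.sub(r"%[fu]", lambda m: substitutions[m.group(0)], word)
        for word in shlex.split(template)
    ]
```

   With the template blobviewer %f the second argument is the plain file
   name; with blobviewer %u it is the percent-encoded file:// URL.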

     ----------------------------------------------------------------------

    4.4.5.2. Adding indexing support for a new file type

   Let us now imagine that the above .blob files actually contain indexable
   text and that you know how to extract it with a command line program.
   Getting Recoll to index the files is easy. You need to make the above
   changes, and also add data to the mimeconf file (typically in
   ~/.recoll/mimeconf):

     * Under the [index] section, add the following line (more about the
       rclblob indexing script later):

                  application/x-blobapp = exec rclblob

     * Under the [icons] section, you should choose an icon to be displayed
       for the files inside the result lists. Icons are normally 64x64 pixel
       PNG files which live in /usr/[local/]share/recoll/images.

     * Under the [categories] section, you should add the mime type where it
       makes sense (you can also create a category). Categories may be used
       for filtering in advanced search.

   The rclblob filter should be an executable program or script which exists
   inside /usr/[local/]share/recoll/filters. It will be given a file name as
   argument and should output the text contents in HTML format on the
   standard output.

   The HTML could be very minimal, like the following example:

 <html><head>
 <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
 </head>
 <body>some text content</body></html>

   You should take care to escape some characters inside the text by
   transforming them into appropriate entities. "&" should be transformed
   into "&amp;", "<" should be transformed into "&lt;".

   The character set needs to be specified in the header. It does not need to
   be UTF-8 (Recoll will take care of translating it), but it must be
   accurate for good results.

   Recoll will also make use of other header fields if they are present:
   title, description, keywords.

   The easiest way to write a new filter is probably to start from an
   existing one.
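
   As a starting point, here is a minimal sketch of what an rclblob filter
   could look like in Python. It rests on assumptions: the .blob format is
   made up, so the sketch simply treats the file as UTF-8 text, and the
   function names blob_to_html and main are hypothetical. The existing
   filters shipped with Recoll remain the better reference.

```python
#!/usr/bin/env python3
import html
import sys


def blob_to_html(text, title=""):
    """Wrap extracted text in a minimal HTML document.

    html.escape turns "&" into "&amp;" and "<" into "&lt;" (and ">"
    into "&gt;"), which covers the escaping described in the text.
    """
    body = html.escape(text, quote=False)
    head_title = (
        "<title>%s</title>\n" % html.escape(title, quote=False) if title else ""
    )
    return (
        "<html><head>\n"
        '<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">\n'
        + head_title
        + "</head>\n"
        + "<body>" + body + "</body></html>\n"
    )


def main(argv):
    """Read the file named on the command line and print HTML on stdout."""
    if len(argv) != 2:
        sys.stderr.write("usage: rclblob <file>\n")
        return 1
    # Assumption: the (made-up) .blob format is plain UTF-8 text.
    with open(argv[1], encoding="utf-8", errors="replace") as f:
        text = f.read()
    # The output is encoded as UTF-8 to match the charset declared above.
    sys.stdout.buffer.write(blob_to_html(text).encode("utf-8"))
    return 0

# When installed as an executable script, finish with:
#     sys.exit(main(sys.argv))
```

   The declared character set and the actual output encoding are kept in
   one place here, so that they cannot drift apart.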

     ----------------------------------------------------------------------