|
a/src/README |
|
b/src/README |
|
... |
|
... |
10 |
|
10 |
|
11 |
Copyright (c) 2005 Jean-Francois Dockes
|
11 |
Copyright (c) 2005 Jean-Francois Dockes
|
12 |
|
12 |
|
13 |
This document introduces full text search notions and describes the
|
13 |
This document introduces full text search notions and describes the
|
14 |
installation and use of the Recoll application. It currently describes
|
14 |
installation and use of the Recoll application. It currently describes
|
15 |
Recoll 1.9.
|
15 |
Recoll 1.12.
|
16 |
|
|
|
17 |
[ Split HTML / Single HTML ]
|
|
|
18 |
|
16 |
|
19 |
----------------------------------------------------------------------
|
17 |
----------------------------------------------------------------------
|
20 |
|
18 |
|
21 |
Table of Contents
|
19 |
Table of Contents
|
22 |
|
20 |
|
|
... |
|
... |
48 |
|
46 |
|
49 |
2.4.2. Using cron to automate indexing
|
47 |
2.4.2. Using cron to automate indexing
|
50 |
|
48 |
|
51 |
2.5. Real time indexing
|
49 |
2.5. Real time indexing
|
52 |
|
50 |
|
53 |
3. Searching
|
51 |
3. Searching with the Qt graphical user interface
|
54 |
|
52 |
|
55 |
3.1. Simple search
|
53 |
3.1. Simple search
|
56 |
|
54 |
|
57 |
3.2. The result list
|
55 |
3.2. The result list
|
58 |
|
56 |
|
|
... |
|
... |
70 |
|
68 |
|
71 |
3.8. Multiple databases
|
69 |
3.8. Multiple databases
|
72 |
|
70 |
|
73 |
3.9. Document history
|
71 |
3.9. Document history
|
74 |
|
72 |
|
75 |
3.10. Sorting search results
|
73 |
3.10. Sorting search results and collapsing duplicates
|
76 |
|
74 |
|
77 |
3.11. Search tips, shortcuts
|
75 |
3.11. Search tips, shortcuts
|
78 |
|
76 |
|
79 |
3.11.1. Terms and search expansion
|
77 |
3.11.1. Terms and search expansion
|
80 |
|
78 |
|
|
... |
|
... |
82 |
|
80 |
|
83 |
3.11.3. Others
|
81 |
3.11.3. Others
|
84 |
|
82 |
|
85 |
3.12. Customizing the search interface
|
83 |
3.12. Customizing the search interface
|
86 |
|
84 |
|
|
|
85 |
4. Searching with the KDE KIO slave
|
|
|
86 |
|
|
|
87 |
4.1. What's this
|
|
|
88 |
|
|
|
89 |
4.2. Searchable documents
|
|
|
90 |
|
|
|
91 |
5. Searching on the command line
|
|
|
92 |
|
87 |
4. Programming interface
|
93 |
6. Programming interface
|
88 |
|
94 |
|
89 |
4.1. Writing a document filter
|
95 |
6.1. Writing a document filter
|
90 |
|
96 |
|
91 |
4.1.1. Filter HTML output
|
97 |
6.1.1. Filter HTML output
|
92 |
|
98 |
|
93 |
4.2. Field data processing configuration
|
99 |
6.2. Field data processing configuration
|
94 |
|
100 |
|
95 |
4.3. API
|
101 |
6.3. API
|
96 |
|
102 |
|
97 |
4.3.1. Interface elements
|
103 |
6.3.1. Interface elements
|
98 |
|
104 |
|
99 |
4.3.2. Python interface
|
105 |
6.3.2. Python interface
|
100 |
|
106 |
|
101 |
5. Installation
|
107 |
7. Installation
|
102 |
|
108 |
|
103 |
5.1. Installing a prebuilt copy
|
109 |
7.1. Installing a prebuilt copy
|
104 |
|
110 |
|
105 |
5.1.1. Installing through a package system
|
111 |
7.1.1. Installing through a package system
|
106 |
|
112 |
|
107 |
5.1.2. Installing a prebuilt Recoll
|
113 |
7.1.2. Installing a prebuilt Recoll
|
108 |
|
114 |
|
109 |
5.2. Supporting packages
|
115 |
7.2. Supporting packages
|
110 |
|
116 |
|
111 |
5.3. Building from source
|
117 |
7.3. Building from source
|
112 |
|
118 |
|
113 |
5.3.1. Prerequisites
|
119 |
7.3.1. Prerequisites
|
114 |
|
120 |
|
115 |
5.3.2. Building
|
121 |
7.3.2. Building
|
116 |
|
122 |
|
117 |
5.3.3. Installation
|
123 |
7.3.3. Installation
|
118 |
|
124 |
|
119 |
5.4. Configuration overview
|
125 |
7.4. Configuration overview
|
120 |
|
126 |
|
121 |
5.4.1. Main configuration file
|
127 |
7.4.1. Main configuration file
|
122 |
|
128 |
|
123 |
5.4.2. The mimemap file
|
129 |
7.4.2. The mimemap file
|
124 |
|
130 |
|
125 |
5.4.3. The mimeconf file
|
131 |
7.4.3. The mimeconf file
|
126 |
|
132 |
|
127 |
5.4.4. The mimeview file
|
133 |
7.4.4. The mimeview file
|
128 |
|
134 |
|
129 |
5.4.5. Examples of configuration adjustments
|
135 |
7.4.5. Examples of configuration adjustments
|
130 |
|
136 |
|
131 |
5.5. The KDE Kicker Recoll applet
|
137 |
7.5. The KDE Kicker Recoll applet
|
132 |
|
138 |
|
133 |
----------------------------------------------------------------------
|
139 |
----------------------------------------------------------------------
|
134 |
|
140 |
|
135 |
Chapter 1. Introduction
|
141 |
Chapter 1. Introduction
|
136 |
|
142 |
|
|
... |
|
... |
141 |
interface, which will index your home directory by default, allowing you
|
147 |
interface, which will index your home directory by default, allowing you
|
142 |
to search immediately after indexing completes.
|
148 |
to search immediately after indexing completes.
|
143 |
|
149 |
|
144 |
Do not do this if your home directory contains a huge number of documents
|
150 |
Do not do this if your home directory contains a huge number of documents
|
145 |
and you do not want to wait or are very short on disk space. In this case,
|
151 |
and you do not want to wait or are very short on disk space. In this case,
|
146 |
you may want to edit the configuration file first to restrict the indexed
|
152 |
you may first want to customize the configuration to restrict the indexed
|
147 |
area.
|
153 |
area.
|
148 |
|
154 |
|
149 |
Also be aware that you may need to install the appropriate supporting
|
155 |
Also be aware that you may need to install the appropriate supporting
|
150 |
applications for document types that need them (for example antiword for
|
156 |
applications for document types that need them (for example antiword for
|
151 |
ms-word files).
|
157 |
ms-word files).
|
|
... |
|
... |
214 |
documents in different languages in the same index is possible, and useful
|
220 |
documents in different languages in the same index is possible, and useful
|
215 |
in practice, but does introduce possibilities of confusion. Recoll
|
221 |
in practice, but does introduce possibilities of confusion. Recoll
|
216 |
currently makes no attempt at automatic language recognition.
|
222 |
currently makes no attempt at automatic language recognition.
|
217 |
|
223 |
|
218 |
Recoll has many parameters which define exactly what to index, and how to
|
224 |
Recoll has many parameters which define exactly what to index, and how to
|
219 |
classify and decode the source documents. These are kept in a
|
225 |
classify and decode the source documents. These are kept in configuration
|
220 |
configuration file. A default configuration is copied into a standard
|
226 |
files. A default configuration is copied into a standard location (usually
|
221 |
location (usually something like /usr/[local/]share/recoll/examples)
|
227 |
something like /usr/[local/]share/recoll/examples) during installation.
|
222 |
during installation. The default parameters from this file may be
|
228 |
The default parameters from this file may be overridden by values that you
|
223 |
overridden by values that you set inside your personal configuration,
|
229 |
set inside your personal configuration, found by default in the .recoll
|
224 |
found by default in the .recoll sub-directory of your home directory. The
|
230 |
sub-directory of your home directory. The default configuration will index
|
225 |
default configuration will index your home directory with default
|
231 |
your home directory with default parameters and should be sufficient for
|
226 |
parameters and should be sufficient for giving Recoll a try, but you may
|
232 |
giving Recoll a try, but you may want to adjust it later.
|
227 |
want to adjust it later.
|
|
|
228 |
|
233 |
|
229 |
Indexing is started automatically the first time you execute the recoll
|
234 |
Indexing is started automatically the first time you execute the recoll
|
230 |
search graphical user interface, or by executing the recollindex command.
|
235 |
search graphical user interface, or by executing the recollindex command.
|
231 |
|
236 |
|
232 |
Searches are performed inside the recoll program, which has many options
|
237 |
Searches are performed inside the recoll program, which has many options
|
|
... |
|
... |
417 |
|
422 |
|
418 |
----------------------------------------------------------------------
|
423 |
----------------------------------------------------------------------
|
419 |
|
424 |
|
420 |
2.3.1. The indexing configuration GUI
|
425 |
2.3.1. The indexing configuration GUI
|
421 |
|
426 |
|
422 |
As of Recoll 1.10, most parameters for a given indexing configuration can
|
427 |
Most parameters for a given indexing configuration can be set from a
|
423 |
be set from a recoll GUI running on this configuration (either as default,
|
428 |
recoll GUI running on this configuration (either as default, or by setting
|
424 |
or by setting RECOLL_CONFDIR or the -c option.)
|
429 |
RECOLL_CONFDIR or the -c option.)
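
As a rough illustration of the two ways of selecting a configuration mentioned above, the following Python sketch builds (without running anything) the command line and environment for starting recoll on a non-default configuration directory. The helper name and the example directory are made up for illustration; only RECOLL_CONFDIR and the -c option come from the text.

```python
import os


def recoll_invocation(confdir, use_env=False, base_env=None):
    """Return (argv, env) for starting recoll on a given configuration.

    Hypothetical helper: it only assembles the arguments, it does not
    start the program.
    """
    # Copy the environment so the caller's mapping is never modified.
    env = dict(os.environ if base_env is None else base_env)
    if use_env:
        # Variant 1: select the configuration through the environment.
        env["RECOLL_CONFDIR"] = confdir
        return ["recoll"], env
    # Variant 2: select the configuration with the -c option.
    return ["recoll", "-c", confdir], env


if __name__ == "__main__":
    argv, env = recoll_invocation("/home/me/.recoll-work")
    print(argv)
```

The returned pair could then be handed to subprocess.Popen(argv, env=env) if you actually want to start the GUI from a script.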
|
425 |
|
430 |
|
426 |
The interface is started from the Preferences menu. It has two main
|
431 |
The interface is started from the Preferences menu. It has two main
|
427 |
panels. The first panel allows setting global variables, like the list of
|
432 |
panels. The first panel allows setting global variables, like the list of
|
428 |
top directories or the list of skipped paths. The second panel allows
|
433 |
top directories or the list of skipped paths. The second panel allows
|
429 |
setting variables that can be redefined for subdirectories. This second
|
434 |
setting variables that can be redefined for subdirectories. This second
|
|
... |
|
... |
531 |
it if your system is short on resources. Periodic indexing is adequate in
|
536 |
it if your system is short on resources. Periodic indexing is adequate in
|
532 |
most cases.
|
537 |
most cases.
|
533 |
|
538 |
|
534 |
----------------------------------------------------------------------
|
539 |
----------------------------------------------------------------------
|
535 |
|
540 |
|
536 |
Chapter 3. Searching
|
541 |
Chapter 3. Searching with the Qt graphical user interface
|
537 |
|
542 |
|
538 |
The recoll program provides the user interface for searching. It is based
|
543 |
The recoll program provides the main user interface for searching. It is
|
539 |
on the Qt library.
|
544 |
based on the Qt library.
|
540 |
|
545 |
|
541 |
recoll has two search modes:
|
546 |
recoll has two search modes:
|
542 |
|
547 |
|
543 |
* Simple search (the default, on the main screen) has a single entry
|
548 |
* Simple search (the default, on the main screen) has a single entry
|
544 |
field where you can enter multiple words.
|
549 |
field where you can enter multiple words.
|
|
... |
|
... |
552 |
contain embedded punctuation or other non-textual characters. For example,
|
557 |
contain embedded punctuation or other non-textual characters. For example,
|
553 |
Recoll can handle things like e-mail addresses, or arbitrary cut and paste
|
558 |
Recoll can handle things like e-mail addresses, or arbitrary cut and paste
|
554 |
from another text window, punctuation and all.
|
559 |
from another text window, punctuation and all.
|
555 |
|
560 |
|
556 |
The main case where you should enter text differently from how it is
|
561 |
The main case where you should enter text differently from how it is
|
557 |
printed is for east-oriental languages written with Chinese characters.
|
562 |
printed is for East Asian languages (Chinese, Japanese, Korean). Words
|
558 |
Words composed of single or multiple characters should be entered
|
563 |
composed of single or multiple characters should be entered separated by
|
559 |
separated by white space in this case (they would typically be printed
|
564 |
white space in this case (they would typically be printed without white
|
560 |
without white space).
|
565 |
space).
|
561 |
|
566 |
|
562 |
----------------------------------------------------------------------
|
567 |
----------------------------------------------------------------------
|
563 |
|
568 |
|
564 |
3.1. Simple search
|
569 |
3.1. Simple search
|
565 |
|
570 |
|
566 |
1. Start the recoll program.
|
571 |
1. Start the recoll program.
|
567 |
|
572 |
|
568 |
2. Possibly choose a search mode: Any term or All terms or File name.
|
573 |
2. Possibly choose a search mode: Any term, All terms, File name or Query
|
|
|
574 |
language.
|
569 |
|
575 |
|
570 |
3. Enter search term(s) in the text field at the top of the window.
|
576 |
3. Enter search term(s) in the text field at the top of the window.
|
571 |
|
577 |
|
572 |
4. Click the Search button or hit the Enter key to start the search.
|
578 |
4. Click the Search button or hit the Enter key to start the search.
|
573 |
|
579 |
|
|
... |
|
... |
577 |
the terms appear.
|
583 |
the terms appear.
|
578 |
|
584 |
|
579 |
File name will specifically look for file names. The entry will be split
|
585 |
File name will specifically look for file names. The entry will be split
|
580 |
at white space characters, and each pattern will be separately expanded.
|
586 |
at white space characters, and each pattern will be separately expanded.
|
581 |
If you want to search for a pattern including white space, you need to use
|
587 |
If you want to search for a pattern including white space, you need to use
|
582 |
double quotes.
|
588 |
double quotes. The point of having a separate file name search is that
|
|
|
589 |
wild card expansion can be performed more efficiently on a relatively
|
|
|
590 |
small subset of the index.
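
The splitting rule described above can be sketched in Python as follows. This is only a model of the described behaviour, not Recoll's own parser: shlex and fnmatch from the standard library stand in for the real splitting and wildcard code, and the function name and sample file names are invented for the example. Lower-casing both sides is an assumption based on the statement that character case has no influence on search.

```python
import shlex
from fnmatch import fnmatchcase


def expand_patterns(entry, names):
    """Split a file name search entry and expand each pattern separately.

    The entry is split at white space, except inside double quotes.
    Returns a dict mapping each pattern to the file names it matches.
    """
    expanded = {}
    for pattern in shlex.split(entry):
        expanded[pattern] = [
            name for name in names
            if fnmatchcase(name.lower(), pattern.lower())
        ]
    return expanded


if __name__ == "__main__":
    files = ["notes.txt", "My Report 2008.odt", "report.pdf"]
    for pattern, hits in expand_patterns('*.txt "my report*"', files).items():
        print(pattern, "->", hits)
```

Without the double quotes, my report* would be treated as two separate patterns, my and report*.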
|
583 |
|
591 |
|
584 |
The fourth entry (Query Language) is described in its own section.
|
592 |
The fourth entry (Query Language) is described in its own section.
|
585 |
|
593 |
|
586 |
All search modes allow wildcards inside terms (*, ?, []). You may want to
|
594 |
All search modes allow wildcards inside terms (*, ?, []). You may want to
|
587 |
have a look at the section about wildcards for more information about
|
595 |
have a look at the section about wildcards for more information about
|
|
... |
|
... |
591 |
enclosing the input inside double quotes. Ex: "virtual reality".
|
599 |
enclosing the input inside double quotes. Ex: "virtual reality".
|
592 |
|
600 |
|
593 |
Character case has no influence on search, except that you can disable
|
601 |
Character case has no influence on search, except that you can disable
|
594 |
stem expansion for any term by capitalizing it. Ie: a search for floor
|
602 |
stem expansion for any term by capitalizing it. Ie: a search for floor
|
595 |
will also normally look for flooring, floored, etc., but a search for
|
603 |
will also normally look for flooring, floored, etc., but a search for
|
596 |
Floor will only look for floor, in any character case (stemming can also
|
604 |
Floor will only look for floor, in any character case. Stemming can also
|
597 |
be disabled globally in the preferences).
|
605 |
be disabled globally in the preferences.
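
The capitalization rule can be summed up by a very small Python sketch. It only models the rule as stated here (a leading capital letter turns off stem expansion for that term, and case is otherwise ignored); the function name is hypothetical and the real expansion is done by Recoll itself.

```python
def plan_term(term):
    """Return (term_to_search, expand_stems) for one user-entered term."""
    # A leading capital letter disables stem expansion for this term.
    expand_stems = not term[:1].isupper()
    # Character case is otherwise ignored, so search the lower-cased form.
    return term.lower(), expand_stems


if __name__ == "__main__":
    for word in ("floor", "Floor"):
        print(word, plan_term(word))
```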
|
598 |
|
606 |
|
599 |
Recoll remembers the last few searches that you performed. You can use the
|
607 |
Recoll remembers the last few searches that you performed. You can use the
|
600 |
simple search text entry widget (a combobox) to recall them (click on the
|
608 |
simple search text entry widget (a combobox) to recall them (click on the
|
601 |
thing at the right of the text field). Please note, however, that only the
|
609 |
thing at the right of the text field). Please note, however, that only the
|
602 |
search texts are remembered, not the mode (all/any/file name).
|
610 |
search texts are remembered, not the mode (all/any/file name).
|
|
... |
|
... |
632 |
open tabs in the existing preview window. You can use Shift+Click to force
|
640 |
open tabs in the existing preview window. You can use Shift+Click to force
|
633 |
the creation of another preview window, which may be useful to view the
|
641 |
the creation of another preview window, which may be useful to view the
|
634 |
documents side by side. (You can also browse successive results in a
|
642 |
documents side by side. (You can also browse successive results in a
|
635 |
single preview window by typing Shift+ArrowUp/Down in the window).
|
643 |
single preview window by typing Shift+ArrowUp/Down in the window).
|
636 |
|
644 |
|
637 |
Clicking the Edit link will attempt to start an external viewer. The
|
645 |
Clicking the Edit link will attempt to start an external editor. The
|
638 |
viewers can be configured through the user preferences dialog, or by
|
646 |
editors can be configured through the user preferences dialog, or by
|
639 |
editing the mimeview configuration file.
|
647 |
editing the mimeview configuration file.
|
640 |
|
648 |
|
641 |
The Preview and Edit links may not be present for all entries,
|
649 |
The Preview and Edit links may not be present for all entries,
|
642 |
meaning that Recoll has no configured way to preview a given file type
|
650 |
meaning that Recoll has no configured way to preview a given file type
|
643 |
(which was indexed by name only), or no configured external viewer for the
|
651 |
(which was indexed by name only), or no configured external editor for the
|
644 |
file type. This can sometimes be adjusted simply by tweaking the mimemap
|
652 |
file type. This can sometimes be adjusted simply by tweaking the mimemap
|
645 |
and mimeview configuration files (the latter can be modified with the user
|
653 |
and mimeview configuration files (the latter can be modified with the user
|
646 |
preferences dialog).
|
654 |
preferences dialog).
|
647 |
|
655 |
|
|
|
656 |
The format of the result list entries is entirely configurable by using
|
|
|
657 |
the preference dialog to edit an HTML fragment.
|
|
|
658 |
|
648 |
You can click on the Query details link at the top of the results page to
|
659 |
You can click on the Query details link at the top of the results page to
|
649 |
see the query actually performed, after stem expansion and other
|
660 |
see the query actually performed, after stem expansion and other
|
650 |
processing.
|
661 |
processing.
|
651 |
|
662 |
|
652 |
Double-clicking on any word inside the result list or a preview window
|
663 |
Double-clicking on any word inside the result list or a preview window
|
|
... |
|
... |
670 |
|
681 |
|
671 |
* Copy File Name
|
682 |
* Copy File Name
|
672 |
|
683 |
|
673 |
* Copy Url
|
684 |
* Copy Url
|
674 |
|
685 |
|
675 |
* Find similar
|
686 |
* Save to File
|
676 |
|
687 |
|
677 |
* Find similar
|
688 |
* Find similar
|
678 |
|
689 |
|
679 |
* Parent document
|
690 |
* Parent document
|
680 |
|
691 |
|
681 |
The Preview and Edit entries do the same thing as the corresponding links.
|
692 |
The Preview and Edit entries do the same thing as the corresponding links.
|
682 |
|
693 |
|
683 |
The Copy File Name and Copy Url entries copy the relevant data to the clipboard,
|
694 |
The Copy File Name and Copy Url entries copy the relevant data to the clipboard,
|
684 |
for later pasting.
|
695 |
for later pasting.
|
|
|
696 |
|
|
|
697 |
Save to File allows saving the contents of a result document to a chosen
|
|
|
698 |
file. This entry will only appear if the document does not correspond to
|
|
|
699 |
an existing file, but is a subdocument inside such a file (ie: an email
|
|
|
700 |
attachment). It is especially useful to extract attachments with no
|
|
|
701 |
associated editor.
|
685 |
|
702 |
|
686 |
The Find similar entry will select a number of relevant terms from the
|
703 |
The Find similar entry will select a number of relevant terms from the
|
687 |
current document and enter them into the simple search field. You can then
|
704 |
current document and enter them into the simple search field. You can then
|
688 |
start a simple search, with a good chance of finding documents related to
|
705 |
start a simple search, with a good chance of finding documents related to
|
689 |
the current result.
|
706 |
the current result.
|
|
... |
|
... |
730 |
If you have a search string entered and you use ^Up/^Down to browse the
|
747 |
If you have a search string entered and you use ^Up/^Down to browse the
|
731 |
results, the search is initiated for each successive document. If the
|
748 |
results, the search is initiated for each successive document. If the
|
732 |
string is found, the cursor will be positioned at the first occurrence of
|
749 |
string is found, the cursor will be positioned at the first occurrence of
|
733 |
the search string.
|
750 |
the search string.
|
734 |
|
751 |
|
|
|
752 |
A right-click menu in the text area allows switching between displaying
|
|
|
753 |
the main text or the contents of fields associated with the document (ie:
|
|
|
754 |
author, abstract, etc.). This is especially useful in cases where the term
|
|
|
755 |
match did not occur in the main text but in one of the fields.
|
|
|
756 |
|
735 |
----------------------------------------------------------------------
|
757 |
----------------------------------------------------------------------
|
736 |
|
758 |
|
737 |
3.4. The query language
|
759 |
3.4. The query language
|
738 |
|
760 |
|
739 |
The query language processor is activated on the simple search entry when
|
761 |
The query language processor is activated on the simple search entry when
|
|
... |
|
... |
831 |
|
853 |
|
832 |
----------------------------------------------------------------------
|
854 |
----------------------------------------------------------------------
|
833 |
|
855 |
|
834 |
3.5. Complex/advanced search
|
856 |
3.5. Complex/advanced search
|
835 |
|
857 |
|
836 |
The advanced search dialog has a number of fields that will allow a more
|
858 |
The advanced search dialog helps you build more complex queries. It can be
|
|
|
859 |
opened through the Tools menu or through the main toolbar.
|
|
|
860 |
|
|
|
861 |
The dialog has three parts:
|
|
|
862 |
|
|
|
863 |
* The top part allows constructing a query by combining multiple clauses
|
837 |
refined search. Each entry field is configurable for the following modes:
|
864 |
of different types. Each entry field is configurable for the following
|
|
|
865 |
modes:
|
838 |
|
866 |
|
839 |
* All terms.
|
867 |
* All terms.
|
840 |
|
868 |
|
841 |
* Any term.
|
869 |
* Any term.
|
842 |
|
870 |
|
843 |
* None of the terms.
|
871 |
* None of the terms.
|
844 |
|
872 |
|
845 |
* Phrase (exact terms in order within an adjustable window).
|
873 |
* Phrase (exact terms in order within an adjustable window).
|
846 |
|
874 |
|
847 |
* Proximity (terms in any order within an adjustable window).
|
875 |
* Proximity (terms in any order within an adjustable window).
|
848 |
|
876 |
|
849 |
* Filename search with wildcards.
|
877 |
* Filename search.
|
850 |
|
878 |
|
851 |
Additional entry fields can be created by clicking the Add clause button.
|
879 |
Additional entry fields can be created by clicking the Add clause
|
|
|
880 |
button.
|
852 |
|
881 |
|
853 |
You can choose that all relevant fields will be combined by either an AND
|
882 |
When searching, the non-empty clauses will be combined either with an
|
854 |
or an OR conjunction. All types of clauses except "phrase" and "near" can
|
883 |
AND or an OR conjunction, depending on the choice made on the left
|
855 |
accept a mix of single words and phrases enclosed in double quotes.
|
884 |
(All clauses or Any clause).
|
856 |
Stemming expansion will be performed for all terms not beginning with a
|
|
|
857 |
capital letter, except for terms inside "phrase" clauses. Wildcards will
|
|
|
858 |
be processed everywhere.
|
|
|
859 |
|
885 |
|
860 |
Advanced search will also let you search for documents of specific mime
|
886 |
Entries of all types except "Phrase" and "Near" accept a mix of single
|
861 |
types (ie: only text/plain, or text/HTML or application/pdf etc...). The
|
887 |
words and phrases enclosed in double quotes. Stemming and wildcard
|
|
|
888 |
expansion will be performed as for simple search.
|
|
|
889 |
|
|
|
890 |
* The next part allows filtering the results by their mime types.
|
|
|
891 |
|
862 |
state of the file type selection can be saved as the default (the file
|
892 |
The state of the file type selection can be saved as the default (the
|
863 |
type filter will not be activated at program start-up, but the lists will
|
893 |
file type filter will not be activated at program start-up, but the
|
864 |
be in the restored state).
|
894 |
lists will be in the restored state).
|
865 |
|
895 |
|
866 |
You can also restrict the search results to a sub-tree of the indexed
|
896 |
* The bottom part allows restricting the search results to a sub-tree of
|
867 |
area. If you need to do this often, you may think of setting up multiple
|
897 |
the indexed area. If you need to do this often, you may consider
|
868 |
indexes instead, as the performance will be much better.
|
898 |
setting up multiple indexes instead, as the performance will be much
|
|
|
899 |
better.
|
|
|
900 |
|
|
|
901 |
Phrases and Proximity searches. These two clauses work in similar ways,
|
|
|
902 |
with the difference that proximity searches do not impose an order on the
|
|
|
903 |
words. In both cases, an adjustable number (slack) of non-matched words
|
|
|
904 |
may be accepted between the searched ones (use the counter on the left to
|
|
|
905 |
adjust this count). For phrases, the default count is zero (exact match).
|
|
|
906 |
For proximity it is ten (meaning that two search terms would be matched
|
|
|
907 |
if found within a window of twelve words). Examples: a phrase search for
|
|
|
908 |
quick fox with a slack of 0 will match quick fox but not quick brown fox.
|
|
|
909 |
With a slack of 1 it will match the latter, but not fox quick. A proximity
|
|
|
910 |
search for quick fox with the default slack will match the latter, and
|
|
|
911 |
also a fox is a cunning and quick animal.
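
The slack rule above can be modelled with a few lines of Python. This is a simplified sketch written for this explanation, not the matching code used by Recoll or its index library: it assumes that the allowed window is the number of search terms plus the slack (which agrees with the "two terms, slack ten, window of twelve" figure above), and the function name is made up.

```python
from itertools import product


def matches(text, query, slack, ordered):
    """Model a phrase (ordered=True) or proximity (ordered=False) clause."""
    words = text.lower().split()
    terms = query.lower().split()
    if not terms:
        return False
    # Assumption: window size = number of terms + slack.
    window = len(terms) + slack
    # Candidate positions in the text for each search term.
    positions = [[i for i, w in enumerate(words) if w == t] for t in terms]
    for combo in product(*positions):
        if len(set(combo)) != len(combo):
            continue  # two terms cannot share the same word
        if ordered and list(combo) != sorted(combo):
            continue  # a phrase requires the terms in query order
        if max(combo) - min(combo) + 1 <= window:
            return True
    return False


if __name__ == "__main__":
    print(matches("quick brown fox", "quick fox", slack=0, ordered=True))
    print(matches("quick brown fox", "quick fox", slack=1, ordered=True))
    print(matches("fox quick", "quick fox", slack=10, ordered=False))
```

The brute-force search over positions is only acceptable for short texts; it is meant to make the rule easy to check by hand.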
|
869 |
|
912 |
|
870 |
Click on the Start Search button in the advanced search dialog, or type
|
913 |
Click on the Start Search button in the advanced search dialog, or type
|
871 |
Enter in any text field to start the search. The button in the main window
|
914 |
Enter in any text field to start the search. The button in the main window
|
872 |
always performs a simple search.
|
915 |
always performs a simple search.
|
873 |
|
916 |
|
|
... |
|
... |
1018 |
You can erase the document history by using the Erase document history
|
1061 |
You can erase the document history by using the Erase document history
|
1019 |
entry in the File menu.
|
1062 |
entry in the File menu.
|
1020 |
|
1063 |
|
1021 |
----------------------------------------------------------------------
|
1064 |
----------------------------------------------------------------------
|
1022 |
|
1065 |
|
1023 |
3.10. Sorting search results
|
1066 |
3.10. Sorting search results and collapsing duplicates
|
1024 |
|
1067 |
|
1025 |
The documents in a result list are normally sorted in order of relevance.
|
1068 |
The documents in a result list are normally sorted in order of relevance.
|
1026 |
It is possible to specify different sort parameters by using the Sort
|
1069 |
It is possible to specify different sort parameters by using the Sort
|
1027 |
parameters dialog (located in the Tools menu).
|
1070 |
parameters dialog (located in the Tools menu).
|
1028 |
|
1071 |
|
|
... |
|
... |
1035 |
|
1078 |
|
1036 |
Sort parameters are remembered between program invocations, but result
|
1079 |
Sort parameters are remembered between program invocations, but result
|
1037 |
sorting is normally always inactive when the program starts. It is
|
1080 |
sorting is normally always inactive when the program starts. It is
|
1038 |
possible to keep the sorting activation state between program invocations
|
1081 |
possible to keep the sorting activation state between program invocations
|
1039 |
by checking the Remember sort activation state option in the preferences.
|
1082 |
by checking the Remember sort activation state option in the preferences.
|
|
|
1083 |
|
|
|
1084 |
It is also possible to hide duplicate entries inside the result list
|
|
|
1085 |
(documents with the exact same contents as the displayed one). The test of
|
|
|
1086 |
identity is based on an MD5 hash of the document container, not only of
|
|
|
1087 |
the text contents (so that, for example, a text document with an image added will
|
|
|
1088 |
not be a duplicate of the text only). Duplicate hiding is controlled by
|
|
|
1089 |
an entry in the Query configuration dialog, and is off by default.
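
To make the identity test more concrete, here is a minimal Python sketch of hiding duplicates by hashing the whole document container with MD5. It is an illustration only: the function names and sample data are invented, and it says nothing about how or when Recoll itself computes or stores the hash.

```python
import hashlib


def container_digest(data):
    """MD5 hash of the complete document container (raw bytes)."""
    return hashlib.md5(data).hexdigest()


def hide_duplicates(results):
    """Keep only the first result for each distinct container hash.

    `results` is a list of (name, container_bytes) pairs in result order.
    """
    seen = set()
    kept = []
    for name, data in results:
        digest = container_digest(data)
        if digest in seen:
            continue  # same bytes as an earlier result: hide it
        seen.add(digest)
        kept.append(name)
    return kept


if __name__ == "__main__":
    results = [
        ("report.txt", b"some text"),
        ("backup/report.txt", b"some text"),
        ("report-with-image.doc", b"some text" + b"\x89PNG..."),
    ]
    print(hide_duplicates(results))
```

Because the hash covers the container and not just the extracted text, the third entry is kept even though its text is the same.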
|
1040 |
|
1090 |
|
1041 |
----------------------------------------------------------------------
|
1091 |
----------------------------------------------------------------------
|
1042 |
|
1092 |
|
1043 |
3.11. Search tips, shortcuts
|
1093 |
3.11. Search tips, shortcuts
|
1044 |
|
1094 |
|
|
... |
|
... |
1079 |
|
1129 |
|
1080 |
3.11.2. Working with phrases and proximity
|
1130 |
3.11.2. Working with phrases and proximity
|
1081 |
|
1131 |
|
1082 |
Phrases and Proximity searches. A phrase can be looked for by enclosing it
|
1132 |
Phrases and Proximity searches. A phrase can be looked for by enclosing it
|
1083 |
in double quotes. Example: "user manual" will look only for occurrences of
|
1133 |
in double quotes. Example: "user manual" will look only for occurrences of
|
1084 |
user immediately followed by manual. You can use the This exact phrase
|
1134 |
user immediately followed by manual. You can use the This phrase field of
|
1085 |
field of the advanced search dialog to the same effect. Phrases can be
|
1135 |
the advanced search dialog to the same effect. Phrases can be entered
|
1086 |
entered along simple terms in all simple or advanced search entry fields
|
1136 |
along simple terms in all simple or advanced search entry fields (except
|
1087 |
(except This exact phrase).
|
1137 |
This phrase).
|
1088 |
|
1138 |
|
1089 |
AutoPhrases. This option can be set in the preferences dialog. If it is
|
1139 |
AutoPhrases. This option can be set in the preferences dialog. If it is
|
1090 |
set, a phrase will be automatically built and added to simple searches
|
1140 |
set, a phrase will be automatically built and added to simple searches
|
1091 |
when looking for Any terms. This will not radically change the results,
|
1141 |
when looking for Any terms. This will not radically change the results,
|
1092 |
but will give a relevance boost to the results where the search terms
|
1142 |
but will give a relevance boost to the results where the search terms
|
|
... |
|
... |
1134 |
|
1184 |
|
1135 |
User interface parameters:
|
1185 |
User interface parameters:
|
1136 |
|
1186 |
|
1137 |
* Number of results in a result page:
|
1187 |
* Number of results in a result page:
|
1138 |
|
1188 |
|
|
|
1189 |
* Hide duplicate results: decides if result list entries are shown for
|
|
|
1190 |
identical documents found in different places.
|
|
|
1191 |
|
1139 |
* Highlight color for query terms: Terms from the user query are
|
1192 |
* Highlight color for query terms: Terms from the user query are
|
1140 |
highlighted in the result list samples and the preview window. The
|
1193 |
highlighted in the result list samples and the preview window. The
|
1141 |
color can be chosen here. Any Qt color string should work (ie red,
|
1194 |
color can be chosen here. Any Qt color string should work (ie red,
|
1142 |
#ff0000). The default is blue.
|
1195 |
#ff0000). The default is blue.
|
1143 |
|
1196 |
|
|
... |
|
... |
1265 |
alternative indexer may also need to implement a way of purging the index
|
1318 |
alternative indexer may also need to implement a way of purging the index
|
1266 |
from stale data.
|
1319 |
from stale data.
|
1267 |
|
1320 |
|
1268 |
----------------------------------------------------------------------
|
1321 |
----------------------------------------------------------------------
|
1269 |
|
1322 |
|
|
|
1323 |
Chapter 4. Searching with the KDE KIO slave
|
|
|
1324 |
|
|
|
1325 |
4.1. What's this
|
|
|
1326 |
|
|
|
1327 |
The Recoll KIO slave allows performing a Recoll search by entering an
|
|
|
1328 |
appropriate URL in a KDE open dialog, or with an HTML-based interface
|
|
|
1329 |
displayed in Konqueror.
|
|
|
1330 |
|
|
|
1331 |
The HTML-based interface is similar to the Qt-based interface, but
|
|
|
1332 |
slightly less powerful for now. Its advantage is that you can perform your
|
|
|
1333 |
search while staying fully within the KDE framework: drag and drop from
|
|
|
1334 |
the result list works normally and you have your normal choice of
|
|
|
1335 |
applications for opening files.
|
|
|
1336 |
|
|
|
1337 |
The alternative interface uses a directory view of search results. Due to
|
|
|
1338 |
limitations in the current KIO slave interface, it is currently not
|
|
|
1339 |
obviously useful (to me).
|
|
|
1340 |
|
|
|
1341 |
The interface is described in more detail inside a help file which you can
|
|
|
1342 |
access by entering recoll:/ inside the konqueror URL line (this works only
|
|
|
1343 |
if the recoll KIO slave has been previously installed).
|
|
|
1344 |
|
|
|
1345 |
The instructions for building this module are located in the source tree.
|
|
|
1346 |
See: kde/kio/recoll/00README.txt
|
|
|
1347 |
|
|
|
1348 |
----------------------------------------------------------------------
|
|
|
1349 |
|
|
|
1350 |
4.2. Searchable documents
|
|
|
1351 |
|
|
|
1352 |
As a sample application, the Recoll KIO slave could allow preparing a set
|
|
|
1353 |
of HTML documents (for example a manual) so that they become their own
|
|
|
1354 |
search interface inside konqueror.
|
|
|
1355 |
|
|
|
1356 |
This can be done by either explicitly inserting <a href="recoll:/...">
|
|
|
1357 |
links around some document areas, or automatically by adding a very small
|
|
|
1358 |
javascript program to the documents, like the following example, which
|
|
|
1359 |
would initiate a search by double-clicking any term:
|
|
|
1360 |
|
|
|
1361 |
<script language="JavaScript">
|
|
|
1362 |
function recollsearch() {
|
|
|
1363 |
var t = document.getSelection();
|
|
|
1364 |
window.location.href = 'recoll://search/query?qtp=a&p=0&q=' +
|
|
|
1365 |
encodeURIComponent(t);
|
|
|
1366 |
}
|
|
|
1367 |
</script>
|
|
|
1368 |
....
|
|
|
1369 |
<body ondblclick="recollsearch()">
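The same search URL can also be built outside a browser, for example to paste into a KDE open dialog. The following Python sketch mirrors the JavaScript above. The qtp, p and q parameters are copied as-is from that example, and the helper name recoll_search_url is hypothetical.

```python
from urllib.parse import quote


def recoll_search_url(term: str) -> str:
    """Build a recoll:/ search URL for one term, as the JavaScript sample does."""
    # safe="!*'()" keeps the punctuation that encodeURIComponent also leaves
    # unescaped; everything else outside letters, digits and "-_.~" is
    # percent-encoded as UTF-8.
    return "recoll://search/query?qtp=a&p=0&q=" + quote(term, safe="!*'()")


print(recoll_search_url("chasse marée"))
```

Spaces and non-ASCII characters are percent-encoded, so the term survives being embedded in the query part of the URL.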
|
|
|
1370 |
|
|
|
1371 |
----------------------------------------------------------------------
|
|
|
1372 |
|
|
|
1373 |
Chapter 5. Searching on the command line
|
|
|
1374 |
|
|
|
1375 |
There are several ways to obtain search results as a text stream, without
|
|
|
1376 |
a graphical interface:
|
|
|
1377 |
|
|
|
1378 |
* By passing option -t to the recoll program.
|
|
|
1379 |
|
|
|
1380 |
* By using the recollq program.
|
|
|
1381 |
|
|
|
1382 |
* By writing a custom Python program, using the Recoll Python API.
|
|
|
1383 |
|
|
|
1384 |
The first two methods work in the same way and accept/need the same
|
|
|
1385 |
arguments (except for the additional -t to recoll). The query to be
|
|
|
1386 |
executed is specified as command line arguments.
|
|
|
1387 |
|
|
|
1388 |
recollq is not built by default. You can use the Makefile in the query
|
|
|
1389 |
directory to build it. This is a very simple program, and it will often be
|
|
|
1390 |
useful to tailor its output format to your needs.
|
|
|
1391 |
|
|
|
1392 |
recollq has a man page (not installed by default, look in the doc/man
|
|
|
1393 |
directory). The Usage string is as follows:
|
|
|
1394 |
|
|
|
1395 |
recollq [-o|-a|-f] <query string>
|
|
|
1396 |
Runs a recoll query and displays result lines.
|
|
|
1397 |
Default: will interpret the argument(s) as a query language string
|
|
|
1398 |
-o Emulate the gui simple search in ANY TERM mode
|
|
|
1399 |
-a Emulate the gui simple search in ALL TERMS mode
|
|
|
1400 |
-f Emulate the gui simple search in filename mode
|
|
|
1401 |
Common options:
|
|
|
1402 |
-c <configdir> : specify config directory, overriding $RECOLL_CONFDIR
|
|
|
1403 |
-d also dump file contents
|
|
|
1404 |
-n <cnt> limit the maximum number of results (0->no limit, default 2000)
|
|
|
1405 |
-b : basic. Just output urls, no mime types or titles
|
|
|
1406 |
-m : dump the whole document meta[] array
|
|
|
1407 |
-S fld : sort by field name
|
|
|
1408 |
-D : sort descending
|
|
|
1409 |
|
|
|
1410 |
Sample execution:
|
|
|
1411 |
|
|
|
1412 |
recollq 'ilur -nautique mime:text/html'
|
|
|
1413 |
Recoll query: ((((ilur:(wqf=11) OR ilurs) AND_NOT (nautique:(wqf=11)
|
|
|
1414 |
OR nautiques OR nautiqu OR nautiquement)) FILTER Ttext/html))
|
|
|
1415 |
4 results
|
|
|
1416 |
text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/comptes.html] [comptes.html] 18593 bytes
|
|
|
1417 |
text/html [file:///Users/uncrypted-dockes/projets/nautique/webnautique/articles/ilur1/index.html] [Constructio...
|
|
|
1418 |
text/html [file:///Users/uncrypted-dockes/projets/pagepers/index.html] [psxtcl/writemime/recoll]...
|
|
|
1419 |
text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/recu-chasse-maree....
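Because the results come out as a text stream, they are easy to post-process. The sketch below parses one result line in Python. The line layout (mime type, bracketed URL, bracketed title, size in bytes) is inferred from the sample output above and is an assumption, not a documented format; it may differ if you tailor the recollq output.

```python
import re

# Layout inferred from the sample output: "<mime> [<url>] [<title>] <size> bytes".
# This is an assumption based on that sample, not a documented format.
RESULT_LINE = re.compile(
    r"^(?P<mime>\S+)\s+\[(?P<url>[^\]]*)\]\s+\[(?P<title>.*)\]\s+(?P<size>\d+)\s+bytes\s*$"
)


def parse_result_line(line):
    """Return a dict for a result line, or None for any other line."""
    m = RESULT_LINE.match(line)
    if m is None:
        return None  # e.g. the "Recoll query:" or "4 results" header lines
    return {
        "mime": m["mime"],
        "url": m["url"],
        "title": m["title"],
        "size": int(m["size"]),
    }


sample = ("text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/comptes.html] "
          "[comptes.html] 18593 bytes")
print(parse_result_line(sample))
```

Header lines and lines cut short by the terminal do not match the pattern and yield None, so a caller can simply skip them.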
|
|
|
1420 |
|
|
|
1421 |
----------------------------------------------------------------------
|
|
|
1422 |
|
1270 |
Chapter 4. Programming interface
|
1423 |
Chapter 6. Programming interface
|
1271 |
|
1424 |
|
1272 |
Recoll has an Application Programming Interface, usable both for indexing
|
1425 |
Recoll has an Application Programming Interface, usable both for indexing
|
1273 |
and searching, currently accessible from the Python language.
|
1426 |
and searching, currently accessible from the Python language.
|
1274 |
|
1427 |
|
1275 |
Another less radical way to extend the application is to write filters for
|
1428 |
Another less radical way to extend the application is to write filters for
|
|
... |
|
... |
1278 |
The processing of metadata attributes for documents (fields) is highly
|
1431 |
The processing of metadata attributes for documents (fields) is highly
|
1279 |
configurable.
|
1432 |
configurable.
|
1280 |
|
1433 |
|
1281 |
----------------------------------------------------------------------
|
1434 |
----------------------------------------------------------------------
|
1282 |
|
1435 |
|
1283 |
4.1. Writing a document filter
|
1436 |
6.1. Writing a document filter
|
1284 |
|
1437 |
|
1285 |
Recoll filters are executable programs which translate from a specific
|
1438 |
Recoll filters are executable programs which translate from a specific
|
1286 |
format (e.g. openoffice, acrobat, etc.) to the Recoll indexing input
|
1439 |
format (e.g. openoffice, acrobat, etc.) to the Recoll indexing input
|
1287 |
format, which may be text/plain or text/html.
|
1440 |
format, which may be text/plain or text/html.
|
1288 |
|
1441 |
|
|
... |
|
... |
1332 |
cannot specify the character set and other metadata, so they are limited
|
1485 |
cannot specify the character set and other metadata, so they are limited
|
1333 |
to cases where these elements are not needed.
|
1486 |
to cases where these elements are not needed.
|
1334 |
|
1487 |
|
1335 |
----------------------------------------------------------------------
|
1488 |
----------------------------------------------------------------------
|
1336 |
|
1489 |
|
1337 |
4.1.1. Filter HTML output
|
1490 |
6.1.1. Filter HTML output
|
1338 |
|
1491 |
|
1339 |
The output HTML could be very minimal like the following example:
|
1492 |
The output HTML could be very minimal like the following example:
|
1340 |
|
1493 |
|
1341 |
<html><head>
|
1494 |
<html><head>
|
1342 |
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
|
1495 |
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
|
|
... |
|
... |
1365 |
See the following section for details about configuring how field data is
|
1518 |
See the following section for details about configuring how field data is
|
1366 |
processed by the indexer.
|
1519 |
processed by the indexer.
|
1367 |
|
1520 |
|
1368 |
----------------------------------------------------------------------
|
1521 |
----------------------------------------------------------------------
|
1369 |
|
1522 |
|
1370 |
4.2. Field data processing configuration
|
1523 |
6.2. Field data processing configuration
|
1371 |
|
1524 |
|
1372 |
Fields are named pieces of information in or about documents, like title,
|
1525 |
Fields are named pieces of information in or about documents, like title,
|
1373 |
author, abstract.
|
1526 |
author, abstract.
|
1374 |
|
1527 |
|
1375 |
The field values for documents can appear in several ways during indexing:
|
1528 |
The field values for documents can appear in several ways during indexing:
|
|
... |
|
... |
1400 |
A field becomes stored by appearing in the [stored] section of the fields
|
1553 |
A field becomes stored by appearing in the [stored] section of the fields
|
1401 |
file.
|
1554 |
file.
|
1402 |
|
1555 |
|
1403 |
----------------------------------------------------------------------
|
1556 |
----------------------------------------------------------------------
|
1404 |
|
1557 |
|
1405 |
4.3. API
|
1558 |
6.3. API
|
1406 |
|
1559 |
|
1407 |
4.3.1. Interface elements
|
1560 |
6.3.1. Interface elements
|
1408 |
|
1561 |
|
1409 |
A few elements in the interface are specific and need an explanation.
|
1562 |
A few elements in the interface are specific and need an explanation.
|
1410 |
|
1563 |
|
1411 |
udi
|
1564 |
udi
|
1412 |
|
1565 |
|
|
... |
|
... |
1443 |
during indexing. The main indexer documents would also probably be a
|
1596 |
during indexing. The main indexer documents would also probably be a
|
1444 |
problem for the external indexer purge operation.
|
1597 |
problem for the external indexer purge operation.
|
1445 |
|
1598 |
|
1446 |
----------------------------------------------------------------------
|
1599 |
----------------------------------------------------------------------
|
1447 |
|
1600 |
|
1448 |
4.3.2. Python interface
|
1601 |
6.3.2. Python interface
|
1449 |
|
1602 |
|
1450 |
4.3.2.1. Introduction
|
1603 |
6.3.2.1. Introduction
|
1451 |
|
1604 |
|
1452 |
Recoll versions after 1.11 define a Python programming interface, both for
|
1605 |
Recoll versions after 1.11 define a Python programming interface, both for
|
1453 |
searching and indexing.
|
1606 |
searching and indexing.
|
1454 |
|
1607 |
|
1455 |
The python interface is not built by default and can be found in the
|
1608 |
The python interface is not built by default and can be found in the
|
|
... |
|
... |
1461 |
python setup.py install
|
1614 |
python setup.py install
|
1462 |
|
1615 |
|
1463 |
|
1616 |
|
1464 |
----------------------------------------------------------------------
|
1617 |
----------------------------------------------------------------------
|
1465 |
|
1618 |
|
1466 |
4.3.2.2. Interface manual
|
1619 |
6.3.2.2. Interface manual
|
1467 |
|
1620 |
|
1468 |
NAME
|
1621 |
NAME
|
1469 |
recoll - This is an interface to the Recoll full text indexer.
|
1622 |
recoll - This is an interface to the Recoll full text indexer.
|
1470 |
|
1623 |
|
1471 |
FILE
|
1624 |
FILE
|
|
... |
|
... |
1651 |
|
1804 |
|
1652 |
|
1805 |
|
1653 |
|
1806 |
|
1654 |
----------------------------------------------------------------------
|
1807 |
----------------------------------------------------------------------
|
1655 |
|
1808 |
|
1656 |
4.3.2.3. Example code
|
1809 |
6.3.2.3. Example code
|
1657 |
|
1810 |
|
1658 |
The following sample would query the index with a user language string.
|
1811 |
The following sample would query the index with a user language string.
|
1659 |
See the python/samples directory inside the Recoll source for other
|
1812 |
See the python/samples directory inside the Recoll source for other
|
1660 |
examples.
|
1813 |
examples.
|
1661 |
|
1814 |
|
|
... |
|
... |
1682 |
|
1835 |
|
1683 |
|
1836 |
|
1684 |
|
1837 |
|
1685 |
----------------------------------------------------------------------
|
1838 |
----------------------------------------------------------------------
|
1686 |
|
1839 |
|
1687 |
Chapter 5. Installation
|
1840 |
Chapter 7. Installation
|
1688 |
|
1841 |
|
1689 |
5.1. Installing a prebuilt copy
|
1842 |
7.1. Installing a prebuilt copy
|
1690 |
|
1843 |
|
1691 |
Recoll binary packages from the Recoll web site are always linked
|
1844 |
Recoll binary packages from the Recoll web site are always linked
|
1692 |
statically to the Xapian libraries, and have no other dependencies. You
|
1845 |
statically to the Xapian libraries, and have no other dependencies. You
|
1693 |
will only have to check or install supporting applications for the file
|
1846 |
will only have to check or install supporting applications for the file
|
1694 |
types that you want to index beyond text, HTML and mail files, and maybe
|
1847 |
types that you want to index beyond text, HTML and mail files, and maybe
|
1695 |
have a look at the configuration section (but this may not be necessary
|
1848 |
have a look at the configuration section (but this may not be necessary
|
1696 |
for a quick test with default parameters).
|
1849 |
for a quick test with default parameters).
|
1697 |
|
1850 |
|
1698 |
----------------------------------------------------------------------
|
1851 |
----------------------------------------------------------------------
|
1699 |
|
1852 |
|
1700 |
5.1.1. Installing through a package system
|
1853 |
7.1.1. Installing through a package system
|
1701 |
|
1854 |
|
1702 |
If you use a BSD-type port system or a prebuilt package (RPM or other),
|
1855 |
If you use a BSD-type port system or a prebuilt package (RPM or other),
|
1703 |
just follow the usual procedure for your system.
|
1856 |
just follow the usual procedure for your system.
|
1704 |
|
1857 |
|
1705 |
----------------------------------------------------------------------
|
1858 |
----------------------------------------------------------------------
|
1706 |
|
1859 |
|
1707 |
5.1.2. Installing a prebuilt Recoll
|
1860 |
7.1.2. Installing a prebuilt Recoll
|
1708 |
|
1861 |
|
1709 |
The unpackaged binary versions on the Recoll web site are just compressed
|
1862 |
The unpackaged binary versions on the Recoll web site are just compressed
|
1710 |
tar files of a build tree, where only the useful parts were kept
|
1863 |
tar files of a build tree, where only the useful parts were kept
|
1711 |
(executables and sample configuration).
|
1864 |
(executables and sample configuration).
|
1712 |
|
1865 |
|
|
... |
|
... |
1717 |
had built the package from source (that is, just type make install). The
|
1870 |
had built the package from source (that is, just type make install). The
|
1718 |
binary trees are built for installation to /usr/local.
|
1871 |
binary trees are built for installation to /usr/local.
|
1719 |
|
1872 |
|
1720 |
----------------------------------------------------------------------
|
1873 |
----------------------------------------------------------------------
|
1721 |
|
1874 |
|
1722 |
5.2. Supporting packages
|
1875 |
7.2. Supporting packages
|
1723 |
|
1876 |
|
1724 |
Recoll uses external applications to index some file types. You need to
|
1877 |
Recoll uses external applications to index some file types. You need to
|
1725 |
install them for the file types that you wish to have indexed (these are
|
1878 |
install them for the file types that you wish to have indexed (these are
|
1726 |
run-time dependencies. None is needed for building Recoll).
|
1879 |
run-time dependencies. None is needed for building Recoll).
|
1727 |
|
1880 |
|
|
... |
|
... |
1765 |
Text, HTML, mail folders, Openoffice and Scribus files are processed
|
1918 |
Text, HTML, mail folders, Openoffice and Scribus files are processed
|
1766 |
internally. Lyx is used to index Lyx files. Many filters need sed and awk.
|
1919 |
internally. Lyx is used to index Lyx files. Many filters need sed and awk.
|
1767 |
|
1920 |
|
1768 |
----------------------------------------------------------------------
|
1921 |
----------------------------------------------------------------------
|
1769 |
|
1922 |
|
1770 |
5.3. Building from source
|
1923 |
7.3. Building from source
|
1771 |
|
1924 |
|
1772 |
5.3.1. Prerequisites
|
1925 |
7.3.1. Prerequisites
|
1773 |
|
1926 |
|
1774 |
At the very least, you will need to download and install the xapian core
|
1927 |
At the very least, you will need to download and install the xapian core
|
1775 |
package (Recoll 1.9 normally uses version 1.0.2, but any 0.9 or 1.0.x
|
1928 |
package (Recoll 1.9 normally uses version 1.0.2, but any 0.9 or 1.0.x
|
1776 |
version will work too), and the qt run-time and development packages
|
1929 |
version will work too), and the qt run-time and development packages
|
1777 |
(Recoll development currently uses version 3.3.5, but any 3.3 version is
|
1930 |
(Recoll development currently uses version 3.3.5, but any 3.3 version is
|
|
... |
|
... |
1785 |
not be critical). On Linux systems, the iconv interface is part of libc
|
1938 |
not be critical). On Linux systems, the iconv interface is part of libc
|
1786 |
and you should not need to do anything special.
|
1939 |
and you should not need to do anything special.
|
1787 |
|
1940 |
|
1788 |
----------------------------------------------------------------------
|
1941 |
----------------------------------------------------------------------
|
1789 |
|
1942 |
|
1790 |
5.3.2. Building
|
1943 |
7.3.2. Building
|
1791 |
|
1944 |
|
1792 |
Recoll has been built on Linux (redhat7.3, mandriva 2005/6, Fedora Core
|
1945 |
Recoll has been built on Linux (redhat7.3, mandriva 2005/6, Fedora Core
|
1793 |
3/4/5/6), FreeBSD 5/6, macosx, and Solaris 8. If you build on another
|
1946 |
3/4/5/6), FreeBSD 5/6, macosx, and Solaris 8. If you build on another
|
1794 |
system, and need to modify things, I would very much welcome patches.
|
1947 |
system, and need to modify things, I would very much welcome patches.
|
1795 |
|
1948 |
|
|
... |
|
... |
1825 |
manually copy and modify one of the existing files (the new file name
|
1978 |
manually copy and modify one of the existing files (the new file name
|
1826 |
should be the output of uname -s).
|
1979 |
should be the output of uname -s).
|
1827 |
|
1980 |
|
1828 |
----------------------------------------------------------------------
|
1981 |
----------------------------------------------------------------------
|
1829 |
|
1982 |
|
1830 |
5.3.3. Installation
|
1983 |
7.3.3. Installation
|
1831 |
|
1984 |
|
1832 |
Either type make install or execute recollinstall prefix, in the root of
|
1985 |
Either type make install or execute recollinstall prefix, in the root of
|
1833 |
the source tree. This will copy the commands to prefix/bin and the sample
|
1986 |
the source tree. This will copy the commands to prefix/bin and the sample
|
1834 |
configuration files, scripts and other shared data to prefix/share/recoll.
|
1987 |
configuration files, scripts and other shared data to prefix/share/recoll.
|
1835 |
|
1988 |
|
|
... |
|
... |
1840 |
|
1993 |
|
1841 |
You can then proceed to configuration.
|
1994 |
You can then proceed to configuration.
|
1842 |
|
1995 |
|
1843 |
----------------------------------------------------------------------
|
1996 |
----------------------------------------------------------------------
|
1844 |
|
1997 |
|
1845 |
5.4. Configuration overview
|
1998 |
7.4. Configuration overview
|
1846 |
|
1999 |
|
1847 |
Most of the parameters specific to the recoll GUI are set through the
|
2000 |
Most of the parameters specific to the recoll GUI are set through the
|
1848 |
Preferences menu and stored in the standard QT place ($HOME/.qt/recollrc).
|
2001 |
Preferences menu and stored in the standard QT place ($HOME/.qt/recollrc).
|
1849 |
You probably do not want to edit this by hand.
|
2002 |
You probably do not want to edit this by hand.
|
1850 |
|
2003 |
|
1851 |
For other options, Recoll uses text configuration files. You will have to
|
2004 |
Recoll indexing options are set inside text configuration files located in
|
1852 |
edit them by hand for now (there is still some hope for a GUI
|
2005 |
a configuration directory. There can be several such directories, each of
|
1853 |
configuration tool in the future). The most accurate documentation for the
|
2006 |
which defines the parameters for one index.
|
1854 |
configuration parameters is given by comments inside the default files,
|
|
|
1855 |
and we will just give a general overview here.
|
|
|
1856 |
|
2007 |
|
1857 |
There are two sets of configuration files. The system-wide files are kept
|
2008 |
The configuration files can be edited by hand or through the Indexing
|
1858 |
in a directory named like /usr/[local/]share/recoll/examples, they define
|
2009 |
configuration dialog (Preferences menu). The GUI tool will try to respect
|
1859 |
default values for the system. A parallel set of files exists by default
|
2010 |
your formatting and comments as much as possible, so it is quite possible
|
1860 |
in the .recoll directory in your home. This directory can be changed with
|
2011 |
to use both ways.
|
|
|
2012 |
|
|
|
2013 |
The most accurate documentation for the configuration parameters is given
|
|
|
2014 |
by comments inside the default files, and we will just give a general
|
|
|
2015 |
overview here.
|
|
|
2016 |
|
|
|
2017 |
For each index, there are two sets of configuration files. System-wide
|
|
|
2018 |
configuration files are kept in a directory named like
|
|
|
2019 |
/usr/[local/]share/recoll/examples, and define default values, shared by
|
|
|
2020 |
all indexes. For each index, a parallel set of files defines the
|
|
|
2021 |
customized parameters.
|
|
|
2022 |
|
|
|
2023 |
The default location of the configuration is the .recoll directory in your
|
|
|
2024 |
home. Most people will only use this directory.
|
|
|
2025 |
|
|
|
2026 |
This location can be changed, or others can be added with the
|
1861 |
the RECOLL_CONFDIR environment variable or the -c option parameter to
|
2027 |
RECOLL_CONFDIR environment variable or the -c option parameter to recoll
|
1862 |
recoll and recollindex.
|
2028 |
and recollindex.
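The lookup order can be summarized with a small Python sketch. It assumes that -c takes precedence over RECOLL_CONFDIR (as the recollq usage string states) and that the .recoll directory in your home is the fallback. The function name resolve_confdir is hypothetical and not part of Recoll.

```python
import os


def resolve_confdir(cli_confdir=None, environ=None):
    """Pick the configuration directory: -c value, then $RECOLL_CONFDIR, then ~/.recoll."""
    environ = os.environ if environ is None else environ
    if cli_confdir:
        # Value given with the -c option: overrides the environment.
        return cli_confdir
    env_dir = environ.get("RECOLL_CONFDIR")
    if env_dir:
        return env_dir
    # Default location used by most people.
    return os.path.join(os.path.expanduser("~"), ".recoll")


print(resolve_confdir())
```

Passing the environment as a parameter keeps the function easy to check without touching the real process environment.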
|
1863 |
|
2029 |
|
1864 |
If the .recoll directory does not exist when recoll or recollindex are
|
2030 |
If the .recoll directory does not exist when recoll or recollindex are
|
1865 |
started, it will be created with a set of empty configuration files.
|
2031 |
started, it will be created with a set of empty configuration files.
|
1866 |
recoll will give you a chance to edit the configuration file before
|
2032 |
recoll will give you a chance to edit the configuration file before
|
1867 |
starting indexing. recollindex will proceed immediately. To avoid
|
2033 |
starting indexing. recollindex will proceed immediately. To avoid
|
|
... |
|
... |
1900 |
White space is used for separation inside lists. List elements with
|
2066 |
White space is used for separation inside lists. List elements with
|
1901 |
embedded spaces can be quoted using double-quotes.
|
2067 |
embedded spaces can be quoted using double-quotes.
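As an illustration of this quoting rule, the Python standard shlex module splits a value in a similar way. This is only an approximation: shlex follows POSIX shell rules (single quotes, backslash escapes) which the Recoll parser may not share, and the paths below are made up.

```python
import shlex


def split_list(value: str) -> list:
    """Split a white space separated list, honouring double-quoted elements."""
    return shlex.split(value)


# Hypothetical list value with one element containing an embedded space.
print(split_list('/home/me/docs "/home/me/my documents" /srv/shared'))
```

The quoted element comes back as a single item, with the quotes removed.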
|
1902 |
|
2068 |
|
1903 |
----------------------------------------------------------------------
|
2069 |
----------------------------------------------------------------------
|
1904 |
|
2070 |
|
1905 |
5.4.1. Main configuration file
|
2071 |
7.4.1. Main configuration file
|
1906 |
|
2072 |
|
1907 |
recoll.conf is the main configuration file. It defines things like what to
|
2073 |
recoll.conf is the main configuration file. It defines things like what to
|
1908 |
index (top directories and things to ignore), and the default character
|
2074 |
index (top directories and things to ignore), and the default character
|
1909 |
set to use for document types which do not specify it internally.
|
2075 |
set to use for document types which do not specify it internally.
|
1910 |
|
2076 |
|
|
... |
|
... |
2056 |
|
2222 |
|
2057 |
Recoll normally indexes any file which it knows how to read. This
|
2223 |
Recoll normally indexes any file which it knows how to read. This
|
2058 |
list lets you restrict the indexed mime types to what you specify.
|
2224 |
list lets you restrict the indexed mime types to what you specify.
|
2059 |
If the variable is unspecified or the list empty (the default),
|
2225 |
If the variable is unspecified or the list empty (the default),
|
2060 |
all supported types are processed.
|
2226 |
all supported types are processed.
|
|
|
2227 |
|
|
|
2228 |
compressedfilemaxkbs
|
|
|
2229 |
|
|
|
2230 |
Size limit for compressed (.gz or .bz2) files. These need to be
|
|
|
2231 |
decompressed in a temporary directory for identification, which
|
|
|
2232 |
can be very wasteful if 'uninteresting' big compressed files are
|
|
|
2233 |
present. Negative means no limit, 0 means no processing of any
|
|
|
2234 |
compressed file. Defaults to -1.
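The three cases for this parameter can be written out as a small Python sketch. The function name is hypothetical, and treating a file whose size is exactly equal to the limit as acceptable is an assumption; the text above does not say which way the comparison goes.

```python
def should_process_compressed(size_kb: int, compressedfilemaxkbs: int = -1) -> bool:
    """Decide whether a compressed file of size_kb kilobytes gets decompressed."""
    if compressedfilemaxkbs < 0:
        return True   # negative: no limit (the default is -1)
    if compressedfilemaxkbs == 0:
        return False  # zero: no compressed file is processed at all
    # Positive limit; the inclusive comparison is an assumption.
    return size_kb <= compressedfilemaxkbs


print(should_process_compressed(50000))        # default: no limit
print(should_process_compressed(50000, 1000))  # larger than the limit
```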
|
2061 |
|
2235 |
|
2062 |
indexallfilenames
|
2236 |
indexallfilenames
|
2063 |
|
2237 |
|
2064 |
Recoll indexes file names in a special section of the database to
|
2238 |
Recoll indexes file names in a special section of the database to
|
2065 |
allow specific file names searches using wild cards. This
|
2239 |
allow specific file names searches using wild cards. This
|
|
... |
|
... |
2110 |
cases. A value of 3 would allow more precision and efficiency on
|
2284 |
cases. A value of 3 would allow more precision and efficiency on
|
2111 |
longer words, but the index will be approximately twice as large.
|
2285 |
longer words, but the index will be approximately twice as large.
|
2112 |
|
2286 |
|
2113 |
----------------------------------------------------------------------
|
2287 |
----------------------------------------------------------------------
|
2114 |
|
2288 |
|
2115 |
5.4.2. The mimemap file
|
2289 |
7.4.2. The mimemap file
|
2116 |
|
2290 |
|
2117 |
mimemap specifies the file name extension to mime type mappings.
|
2291 |
mimemap specifies the file name extension to mime type mappings.
|
2118 |
|
2292 |
|
2119 |
For file names without an extension, or with an unknown one, the system's
|
2293 |
For file names without an extension, or with an unknown one, the system's
|
2120 |
file -i command will be executed to determine the mime type (this can be
|
2294 |
file -i command will be executed to determine the mime type (this can be
|
|
... |
|
... |
2136 |
given Recoll version. Having it there avoids cluttering the more
|
2310 |
given Recoll version. Having it there avoids cluttering the more
|
2137 |
user-oriented and locally customized skippedNames.
|
2311 |
user-oriented and locally customized skippedNames.
|
2138 |
|
2312 |
|
2139 |
----------------------------------------------------------------------
|
2313 |
----------------------------------------------------------------------
|
2140 |
|
2314 |
|
2141 |
5.4.3. The mimeconf file
|
2315 |
7.4.3. The mimeconf file
|
2142 |
|
2316 |
|
2143 |
mimeconf specifies how the different mime types are handled for indexing,
|
2317 |
mimeconf specifies how the different mime types are handled for indexing,
|
2144 |
and which icons are displayed in the recoll result lists.
|
2318 |
and which icons are displayed in the recoll result lists.
|
2145 |
|
2319 |
|
2146 |
Changing the parameters in the [index] section is probably not a good idea
|
2320 |
Changing the parameters in the [index] section is probably not a good idea
|
|
... |
|
... |
2150 |
recoll in the result lists (the values are the basenames of the png images
|
2324 |
recoll in the result lists (the values are the basenames of the png images
|
2151 |
inside the iconsdir directory (specified in recoll.conf)).
|
2325 |
inside the iconsdir directory (specified in recoll.conf)).
|
2152 |
|
2326 |
|
2153 |
----------------------------------------------------------------------
|
2327 |
----------------------------------------------------------------------
|
2154 |
|
2328 |
|
2155 |
5.4.4. The mimeview file
|
2329 |
7.4.4. The mimeview file
|
2156 |
|
2330 |
|
2157 |
mimeview specifies which programs are started when you click on an Edit
|
2331 |
mimeview specifies which programs are started when you click on an Edit
|
2158 |
link in a result list. For example, HTML is normally displayed using firefox, but
|
2332 |
link in a result list. For example, HTML is normally displayed using firefox, but
|
2159 |
you may prefer Konqueror, your openoffice.org program might be named
|
2333 |
you may prefer Konqueror, your openoffice.org program might be named
|
2160 |
oofice instead of openoffice etc.
|
2334 |
oofice instead of openoffice etc.
|
|
... |
|
... |
2173 |
user preferences, all mimeview entries will be ignored except the one
|
2347 |
user preferences, all mimeview entries will be ignored except the one
|
2174 |
labelled application/x-all (which is set to use xdg-open by default).
|
2348 |
labelled application/x-all (which is set to use xdg-open by default).
|
2175 |
|
2349 |
|
2176 |
----------------------------------------------------------------------
|
2350 |
----------------------------------------------------------------------
|
2177 |
|
2351 |
|
2178 |
5.4.5. Examples of configuration adjustments
|
2352 |
7.4.5. Examples of configuration adjustments
|
2179 |
|
2353 |
|
2180 |
5.4.5.1. Adding an external viewer for a non-indexed type
|
2354 |
7.4.5.1. Adding an external viewer for a non-indexed type
|
2181 |
|
2355 |
|
2182 |
Imagine that you have some kind of file which does not have indexable
|
2356 |
Imagine that you have some kind of file which does not have indexable
|
2183 |
content, but for which you would like to have a functional Edit link in
|
2357 |
content, but for which you would like to have a functional Edit link in
|
2184 |
the result list (when found by file name). The file names end in .blob and
|
2358 |
the result list (when found by file name). The file names end in .blob and
|
2185 |
can be displayed by application blobviewer.
|
2359 |
can be displayed by application blobviewer.
|
|
... |
|
... |
2208 |
The entries you add in your personal file override those in the central
|
2382 |
The entries you add in your personal file override those in the central
|
2209 |
configuration, which you do not need to alter.
|
2383 |
configuration, which you do not need to alter.
|
2210 |
|
2384 |
|
2211 |
----------------------------------------------------------------------
|
2385 |
----------------------------------------------------------------------
|
2212 |
|
2386 |
|
2213 |
5.4.5.2. Adding indexing support for a new file type
|
2387 |
7.4.5.2. Adding indexing support for a new file type
|
2214 |
|
2388 |
|
2215 |
Let us now imagine that the above .blob files actually contain indexable
|
2389 |
Let us now imagine that the above .blob files actually contain indexable
|
2216 |
text and that you know how to extract it with a command line program.
|
2390 |
text and that you know how to extract it with a command line program.
|
2217 |
Getting Recoll to index the files is easy. You need to perform the above
|
2391 |
Getting Recoll to index the files is easy. You need to perform the above
|
2218 |
alteration, and also to add data to the mimeconf file (typically in
|
2392 |
alteration, and also to add data to the mimeconf file (typically in
|
|
... |
|
... |
2239 |
The filter programming section describes in more detail how to write a
|
2413 |
The filter programming section describes in more detail how to write a
|
2240 |
filter.
|
2414 |
filter.
|
2241 |
|
2415 |
|
2242 |
----------------------------------------------------------------------
|
2416 |
----------------------------------------------------------------------
|
2243 |
|
2417 |
|
2244 |
5.5. The KDE Kicker Recoll applet
|
2418 |
7.5. The KDE Kicker Recoll applet
|
2245 |
|
2419 |
|
2246 |
The Recoll source tree contains the source code to the recoll_applet, a
|
2420 |
The Recoll source tree contains the source code to the recoll_applet, a
|
2247 |
small application derived from the find_applet. This can be used to add a
|
2421 |
small application derived from the find_applet. This can be used to add a
|
2248 |
small Recoll launcher to the KDE panel.
|
2422 |
small Recoll launcher to the KDE panel.
|
2249 |
|
2423 |
|