|
a/src/README |
|
b/src/README |
|
... |
|
... |
140 |
document files. The acquisition process is called indexation.
|
140 |
document files. The acquisition process is called indexation.
|
141 |
|
141 |
|
142 |
The resulting database can be big (roughly the size of the original
|
142 |
The resulting database can be big (roughly the size of the original
|
143 |
document set), but it is not a document archive. Recoll can only display
|
143 |
document set), but it is not a document archive. Recoll can only display
|
144 |
documents that still exist at the place from which they were indexed.
|
144 |
documents that still exist at the place from which they were indexed.
|
|
|
145 |
(Actually, there is a way to reconstruct a document from the information
|
|
|
146 |
in the database, but the result is not nice, as all formatting,
|
|
|
147 |
punctuation and capitalisation are lost).
|
145 |
|
148 |
|
146 |
Recoll stores all internal data in Unicode UTF-8 format, and it can index
|
149 |
Recoll stores all internal data in Unicode UTF-8 format, and it can index
|
147 |
files with different character sets, encodings, and languages into the
|
150 |
files with different character sets, encodings, and languages into the
|
148 |
same database. It has input filters for many document types.
|
151 |
same database. It has input filters for many document types.
|
149 |
|
152 |
|
|
... |
|
... |
183 |
|
186 |
|
184 |
Recoll indexation takes place at discrete times. There is currently no
|
187 |
Recoll indexation takes place at discrete times. There is currently no
|
185 |
interface to real time file modification monitors. The typical usage is to
|
188 |
interface to real time file modification monitors. The typical usage is to
|
186 |
have a nightly indexation run programmed into your cron file.
|
189 |
have a nightly indexation run programmed into your cron file.
|
187 |
|
190 |
|
|
|
191 |
+------------------------------------------------------------------------+
|
|
|
192 |
| Side note: there is nothing in Recoll and Xapian that would prevent |
|
|
|
193 |
| interfacing with a real time file modification monitor, but this would |
|
|
|
194 |
| tend to consume significant system resources for dubious gain, because |
|
|
|
195 |
| you rarely need a full text search to find documents you just |
|
|
|
196 |
| modified. recollindex -i can be used to add individual files to the |
|
|
|
197 |
| index if you want to play with this; see the manual page. |
|
|
|
198 |
+------------------------------------------------------------------------+
|
|
|
199 |
|
188 |
Recoll knows about quite a few different document types. The parameters
|
200 |
Recoll knows about quite a few different document types. The parameters
|
189 |
for document type recognition and processing are set in configuration
|
201 |
for document type recognition and processing are set in configuration
|
190 |
files. Most file types, like HTML or word processing files, only hold one
|
202 |
files. Most file types, like HTML or word processing files, only hold one
|
191 |
document. Some file types, like mail folder files, can hold many
|
203 |
document. Some file types, like mail folder files, can hold many
|
192 |
individually indexed documents.
|
204 |
individually indexed documents.
|
|
... |
|
... |
256 |
|
268 |
|
257 |
----------------------------------------------------------------------
|
269 |
----------------------------------------------------------------------
|
258 |
|
270 |
|
259 |
3.1. Simple search
|
271 |
3.1. Simple search
|
260 |
|
272 |
|
261 |
Start the recoll program, then enter search term(s) in the text field at
|
273 |
1. Start the recoll program.
|
262 |
the top left of the window. Clicking the Search button or hitting the
|
274 |
|
263 |
Enter key will start a search. By default, this will look for documents
|
275 |
2. Enter search term(s) in the text field at the top of the window.
|
264 |
with any of the terms (the ones with more terms will get better scores).
|
276 |
|
265 |
You can check the All terms checkbox to ensure that only documents with
|
277 |
3. Click the Search button or hit the Enter key to start the search.
|
|
|
278 |
|
|
|
279 |
By default, this will look for documents with any of the search terms (the
|
|
|
280 |
ones with more terms will get better scores). You can check the All terms
|
|
|
281 |
checkbox to ensure that only documents with all the terms will be
|
266 |
all the terms will be returned. Use the Tools / Advanced search dialog for
|
282 |
returned. Use the Tools / Advanced search dialog for more complex
|
267 |
more complex searches.
|
283 |
searches.
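
The difference between the default any-term behaviour and the All terms
checkbox can be illustrated with a minimal Python sketch. This is not
Recoll's or Xapian's matching or ranking code: the search function, the
toy documents and the "number of matched terms" score are all assumptions
made for illustration only.

```python
def search(docs, terms, all_terms=False):
    """Return (name, score) pairs, best score first.

    docs maps a document name to its text. The score is simply the number
    of distinct query terms found in the document, a toy stand-in for a
    real relevance score.
    """
    wanted = {t.lower() for t in terms}
    results = []
    for name, text in docs.items():
        words = set(text.lower().split())
        matched = wanted & words
        if not matched:
            continue  # no query term at all: never returned
        if all_terms and matched != wanted:
            continue  # "All terms" checked: every term must be present
        results.append((name, len(matched)))
    # Documents matching more terms come first; ties are broken by name.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


# Hypothetical documents, used only to exercise the sketch.
docs = {
    "a.txt": "recoll indexes text documents",
    "b.txt": "xapian is the search engine behind recoll",
    "c.txt": "unrelated notes",
}
any_hits = search(docs, ["recoll", "xapian"])
all_hits = search(docs, ["recoll", "xapian"], all_terms=True)
```

With the default setting both a.txt and b.txt are returned, b.txt first
because it contains both terms. With all_terms=True only b.txt remains.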
|
268 |
|
284 |
|
269 |
After starting a search, a list of results will instantly be displayed in
|
285 |
After starting a search, a list of results will instantly be displayed in
|
270 |
the main list window. Clicking on an entry will open an internal preview
|
286 |
the main list window. Clicking on an entry will open an internal preview
|
271 |
window for the document. Double-clicking will attempt to start an external
|
287 |
window for the document. Double-clicking will attempt to start an external
|
272 |
viewer (have a look at the ~/.recoll/mimeconf file to see how these are
|
288 |
viewer (have a look at the ~/.recoll/mimeconf file to see how these are
|
|
... |
|
... |
274 |
|
290 |
|
275 |
By default, the document list is presented in order of relevance (how well
|
291 |
By default, the document list is presented in order of relevance (how well
|
276 |
the system estimates that the document matches the query). You can specify
|
292 |
the system estimates that the document matches the query). You can specify
|
277 |
a different ordering by using the Tools / Sort parameters dialog.
|
293 |
a different ordering by using the Tools / Sort parameters dialog.
|
278 |
|
294 |
|
|
|
295 |
You can click on the first paragraph (Query results or No results found)
|
|
|
296 |
in the result list to get an exact display of the query actually
|
|
|
297 |
performed, after stem expansion and other processing.
|
|
|
298 |
|
279 |
----------------------------------------------------------------------
|
299 |
----------------------------------------------------------------------
|
280 |
|
300 |
|
281 |
3.2. Complex/advanced search
|
301 |
3.2. Complex/advanced search
|
282 |
|
302 |
|
283 |
The advanced search dialog has fields that will allow a more refined
|
303 |
The advanced search dialog has fields that will allow a more refined
|
284 |
search, looking for documents with all given words, a given exact phrase,
|
304 |
search, looking for documents with all given words, a given exact phrase,
|
285 |
or none of the given words (all fields may be combined by an implicit AND
|
305 |
or none of the given words (all relevant fields will be combined by an
|
286 |
clause).
|
306 |
implicit AND clause).
|
287 |
|
307 |
|
288 |
It will let you search for documents of specific mime types (i.e. only
|
308 |
It will let you search for documents of specific mime types (i.e. only
|
289 |
text/plain, or text/html or application/pdf etc.).
|
309 |
text/plain, or text/html or application/pdf etc.).
|
290 |
|
310 |
|
291 |
It will let you restrict the search results to a subtree of the indexed
|
311 |
It will let you restrict the search results to a subtree of the indexed
|
292 |
area.
|
312 |
area.
|
293 |
|
313 |
|
294 |
Click on the Start Search button in the advanced search dialog to start
|
314 |
Click on the Start Search button in the advanced search dialog to start
|
295 |
the search. The button in the main window always performs a simple search.
|
315 |
the search. The button in the main window always performs a simple search.
|
|
|
316 |
|
|
|
317 |
Click on the result list header paragraph to see the query expansion.
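
The way the advanced search fields combine can be sketched in Python as
follows. This is only an illustration of the implicit AND described above,
not the query that Recoll actually builds: the AdvancedQuery class, its
field names and the matches function are hypothetical, and the phrase test
is a naive substring check.

```python
from dataclasses import dataclass, field


@dataclass
class AdvancedQuery:
    # All field names are hypothetical, chosen to mirror the dialog.
    all_words: list = field(default_factory=list)
    phrase: str = ""
    none_words: list = field(default_factory=list)
    mime_types: list = field(default_factory=list)
    subtree: str = ""


def matches(query, path, mime_type, text):
    """Return True if the document satisfies every filled-in field."""
    lowered = text.lower()
    words = set(lowered.split())
    checks = []
    if query.all_words:
        checks.append(all(w.lower() in words for w in query.all_words))
    if query.phrase:
        # Naive substring test standing in for a real phrase search.
        checks.append(query.phrase.lower() in lowered)
    if query.none_words:
        checks.append(not any(w.lower() in words for w in query.none_words))
    if query.mime_types:
        checks.append(mime_type in query.mime_types)
    if query.subtree:
        checks.append(path.startswith(query.subtree.rstrip("/") + "/"))
    # Implicit AND: empty fields are ignored, all others must hold.
    return all(checks)


# Hypothetical query and documents, used only to exercise the sketch.
q = AdvancedQuery(
    all_words=["recoll"],
    phrase="full text",
    none_words=["draft"],
    mime_types=["text/plain"],
    subtree="/home/me/docs",
)
body = "Recoll does full text search"
hit = matches(q, "/home/me/docs/notes.txt", "text/plain", body)
wrong_type = matches(q, "/home/me/docs/notes.html", "text/html", body)
wrong_tree = matches(q, "/tmp/notes.txt", "text/plain", body)
```

Only the first document passes: the second has the wrong mime type and the
third lies outside the chosen subtree.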
|
296 |
|
318 |
|
297 |
----------------------------------------------------------------------
|
319 |
----------------------------------------------------------------------
|
298 |
|
320 |
|
299 |
3.3. Document history
|
321 |
3.3. Document history
|
300 |
|
322 |
|
|
... |
|
... |
345 |
3.6. Customising the search interface
|
367 |
3.6. Customising the search interface
|
346 |
|
368 |
|
347 |
It is possible to customise some aspects of the search interface by using
|
369 |
It is possible to customise some aspects of the search interface by using
|
348 |
the Query configuration entry in the Preferences menu.
|
370 |
the Query configuration entry in the Preferences menu.
|
349 |
|
371 |
|
350 |
There are two tabs in the dialog, to modify the appearance of the user
|
372 |
There are two tabs in the dialog, dealing with the interface itself, and
|
351 |
interface (result list appearance), or the parameters used for searching
|
373 |
with the parameters used for searching and returning results.
|
352 |
(language used for stem expansion).
|
|
|
353 |
|
374 |
|
354 |
The stemming language can be chosen among those that were specified in the
|
375 |
User interface parameters:
|
|
|
376 |
|
|
|
377 |
* Number of results in a result page
|
|
|
378 |
|
|
|
379 |
* Result list font: There is quite a lot of information shown in the
|
|
|
380 |
result list, and you may want to customise the font and/or font size.
|
|
|
381 |
The rest of the fonts used by Recoll are determined by your generic Qt
|
|
|
382 |
config (try the qtconfig command).
|
|
|
383 |
|
|
|
384 |
* Html help browser: this will let you choose your preferred browser
|
|
|
385 |
which will be started from the Help menu to read the user manual. You
|
|
|
386 |
can enter a simple name if the command is in your PATH, or browse for
|
|
|
387 |
a full pathname.
|
|
|
388 |
|
|
|
389 |
* Show document type icons in result list: icons in the result list can
|
|
|
390 |
be turned off. They take quite a lot of space and convey relatively
|
|
|
391 |
little useful information.
|
|
|
392 |
|
|
|
393 |
Search parameters:
|
|
|
394 |
|
|
|
395 |
* Stemming language: stemming obviously depends on the document's
|
|
|
396 |
language. This listbox will let you choose among the stemming databases
|
|
|
397 |
which were built during indexing (this is set in the main
|
355 |
configuration file, or later added with recollindex -s (See the
|
398 |
configuration file), or later added with recollindex -s (See the
|
356 |
recollindex manual). Stemming languages which are dynamically added will
|
399 |
recollindex manual). Stemming languages which are dynamically added
|
357 |
be deleted at the next indexation pass unless they are also added in the
|
400 |
will be deleted at the next indexation pass unless they are also added
|
358 |
configuration file.
|
401 |
in the configuration file.
|
|
|
402 |
|
|
|
403 |
* Dynamically build abstracts: this decides if Recoll tries to build
|
|
|
404 |
document abstracts when displaying the result list. Abstracts are
|
|
|
405 |
constructed by taking context from the document information, around
|
|
|
406 |
the search terms. This can slow down result list display significantly
|
|
|
407 |
for big documents, and you may want to turn it off.
|
|
|
408 |
|
|
|
409 |
* Replace abstracts from documents: this decides if we should synthesize
|
|
|
410 |
and display an abstract in place of an explicit abstract found within
|
|
|
411 |
the document itself.
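
The idea behind dynamically built abstracts (taking context from around the
search terms) can be sketched in Python as follows. This is not the code
Recoll uses: the build_abstract function and its context and max_snippets
parameters are assumptions for illustration. Because the whole text has to
be scanned, the sketch also hints at why this can be slow for big
documents.

```python
def build_abstract(text, terms, context=3, max_snippets=2):
    """Join short word windows centred on query-term occurrences."""
    words = text.split()
    wanted = {t.lower() for t in terms}
    snippets = []
    last_end = 0
    for i, word in enumerate(words):
        # Ignore surrounding punctuation when comparing with the terms.
        if word.lower().strip(".,;:!?()") not in wanted:
            continue
        if i < last_end:
            continue  # already covered by the previous window
        start = max(0, i - context)
        end = min(len(words), i + context + 1)
        snippets.append(" ".join(words[start:end]))
        last_end = end
        if len(snippets) == max_snippets:
            break
    return " ... ".join(snippets)


# Sample text taken from this document, used only to exercise the sketch.
sample = (
    "Recoll stores all internal data in Unicode UTF-8 format, "
    "and it can index files with different character sets"
)
abstract = build_abstract(sample, ["unicode"], context=2)
```

Here the abstract is the five-word window around "Unicode". A document
containing none of the terms yields an empty string.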
|
359 |
|
412 |
|
360 |
----------------------------------------------------------------------
|
413 |
----------------------------------------------------------------------
|
361 |
|
414 |
|
362 |
Chapter 4. Installation
|
415 |
Chapter 4. Installation
|
363 |
|
416 |
|
|
... |
|
... |
365 |
|
418 |
|
366 |
4.1.1. Prerequisites
|
419 |
4.1.1. Prerequisites
|
367 |
|
420 |
|
368 |
At the very least, you will need to download and install the xapian core
|
421 |
At the very least, you will need to download and install the xapian core
|
369 |
package (Recoll currently uses version 0.9.2), and the qt runtime and
|
422 |
package (Recoll currently uses version 0.9.2), and the qt runtime and
|
370 |
development packages (Recoll currently uses version 3.3.3).
|
423 |
development packages (Recoll development currently uses version 3.3.5, but
|
|
|
424 |
any 3.3 version is probably ok).
|
371 |
|
425 |
|
372 |
You will most probably be able to find a binary package for qt for your
|
426 |
You will most probably be able to find a binary package for qt for your
|
373 |
system. You may have to compile Xapian, but this is not difficult (if you
|
427 |
system. You may have to compile Xapian but this is not difficult (if you
|
374 |
are using FreeBSD, there is a port).
|
428 |
are using FreeBSD, there is a port).
|
375 |
|
429 |
|
376 |
You may also need libiconv. Recoll currently uses version 1.9 (this should
|
430 |
You may also need libiconv. Recoll currently uses version 1.9 (this should
|
377 |
not be critical). On Linux systems, the iconv interface is part of libc
|
431 |
not be critical). On Linux systems, the iconv interface is part of libc
|
378 |
and you should not need to do anything special.
|
432 |
and you should not need to do anything special.
|