
a/src/INSTALL b/src/INSTALL
...
...
96
96
97
     * MP3: Recoll will use the id3info command from the id3lib package to
97
     * MP3: Recoll will use the id3info command from the id3lib package to
98
       extract tag information. Without it, only the file names will be
98
       extract tag information. Without it, only the file names will be
99
       indexed.
99
       indexed.
100
100
101
   Text, HTML, mail folders and Openoffice files are processed internally.
101
   Text, HTML, mail folders, Openoffice and Scribus files are processed
102
   internally. Lyx is used to index Lyx files. Many filters need sed and awk.
102
103
103
   --------------------------------------------------------------------------
104
   --------------------------------------------------------------------------
104
105
105
   Prev                               Home                               Next 
106
   Prev                               Home                               Next 
106
   Installation                        Up                Building from source 
107
   Installation                        Up                Building from source 
...
...
215
   recoll and recollindex.
216
   recoll and recollindex.
216
217
217
   If the .recoll directory does not exist when recoll or recollindex are
218
   If the .recoll directory does not exist when recoll or recollindex are
218
   started, it will be created with a set of empty configuration files.
219
   started, it will be created with a set of empty configuration files.
219
   recoll will give you a chance to edit the configuration file before
220
   recoll will give you a chance to edit the configuration file before
220
   starting indexing. recollindex will proceed immediately.
221
   starting indexing. recollindex will proceed immediately. To avoid
222
   mistakes, the automatic directory creation will only occur for the default
223
   location, not if -c or RECOLL_CONFDIR were used (in the latter cases, you
224
   will have to create the directory).
221
225
222
   All configuration files share the same format. For example, a short
226
   All configuration files share the same format. For example, a short
223
   extract of the main configuration file might look as follows:
227
   extract of the main configuration file might look as follows:
224
228
225
         # Space-separated list of directories to index.
229
         # Space-separated list of directories to index.
...
...
245
   in the next section.
249
   in the next section.
246
250
247
   The tilde character (~) is expanded in file names to the name of the
251
   The tilde character (~) is expanded in file names to the name of the
248
   user's home directory.
252
   user's home directory.
249
253
250
   White space is used for separation inside lists. Elements with embedded
254
   White space is used for separation inside lists. List elements with
251
   spaces can be quoted using double-quotes.
255
   embedded spaces can be quoted using double-quotes.
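
   The list and tilde rules above can be illustrated with a small Python
   sketch. This is not Recoll's own parser: the function name parse_list and
   the home parameter are made up for the example, and the exact treatment
   of edge cases (empty quotes, quotes inside a word) is an assumption.

```python
import re

def parse_list(value, home="/home/me"):
    """Split a white-space separated list value into its elements.

    Illustrative only: elements with embedded spaces may be wrapped in
    double quotes, and a leading "~" is replaced by the home directory.
    """
    elements = []
    # Each match is either a double-quoted run or a run of non-space characters.
    for quoted, bare in re.findall(r'"([^"]*)"|(\S+)', value):
        item = quoted if quoted else bare
        # Expand "~" or "~/..." to the (hypothetical) home directory.
        if item == "~" or item.startswith("~/"):
            item = home + item[1:]
        elements.append(item)
    return elements
```

   For instance, the value ~/docs "/tmp/my files" /srv would be split into
   three elements, the second one keeping its embedded space.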
252
256
253
4.4.1. Main configuration file
257
4.4.1. Main configuration file
254
258
255
   recoll.conf is the main configuration file. It defines things like what to
259
   recoll.conf is the main configuration file. It defines things like what to
256
   index (top directories and things to ignore), and the default character
260
   index (top directories and things to ignore), and the default character
...
...
273
   dbdir
277
   dbdir
274
278
275
           The name of the Xapian data directory. It will be created if
279
           The name of the Xapian data directory. It will be created if
276
           needed when the index is initialized. If this is not an absolute
280
           needed when the index is initialized. If this is not an absolute
277
           path, it will be interpreted relative to the configuration
281
           path, it will be interpreted relative to the configuration
278
           directory.
282
           directory. The value can have embedded spaces, but leading or
283
           trailing spaces will be trimmed. You cannot use quotes here.
279
284
280
   skippedNames
285
   skippedNames
281
286
282
           A space-separated list of patterns for names of files or
287
           A space-separated list of patterns for names of files or
283
           directories that should be completely ignored. The list defined in
288
           directories that should be completely ignored. The list defined in
284
           the default file is:
289
           the default file is:
285
290
286
 *~ #* bin CVS  Cache caughtspam  tmp
291
 skippedNames = #* bin CVS  Cache cache* caughtspam  tmp .thumbnails .svn \
292
          *~ recollrc
287
293
288
           The list can be redefined for sub-directories, but is only
294
           The list can be redefined for sub-directories, but is only
289
           actually changed for the top level ones in topdirs.
295
           actually changed for the top level ones in topdirs.
290
296
291
           The top-level directories are not affected by this list (that is,
297
           The top-level directories are not affected by this list (that is,
...
...
296
           index quite a few things that you do not want. On the other hand,
302
           index quite a few things that you do not want. On the other hand,
297
           mail user agents like thunderbird usually store messages in hidden
303
           mail user agents like thunderbird usually store messages in hidden
298
           directories, and you probably want this indexed. One possible
304
           directories, and you probably want this indexed. One possible
299
           solution is to have .* in skippedNames, and add things like
305
           solution is to have .* in skippedNames, and add things like
300
           ~/.thunderbird or ~/.evolution in topdirs.
306
           ~/.thunderbird or ~/.evolution in topdirs.
307
308
   skippedPaths and daemSkippedPaths
309
310
           A space-separated list of patterns for paths of files or
311
           directories that should be skipped. There is no default in the
312
           sample configuration file, but the code always adds the
313
           configuration and database directories in there.
314
315
           skippedPaths is used both by batch and real time indexing.
316
           daemSkippedPaths can be used to specify things that should be
317
           indexed at startup, but not monitored.
318
319
           Example of use for skipping text files only in a specific
320
           directory:
321
322
 skippedPaths = ~/somedir/*.txt
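
           To see which paths a pattern like the one above would select, here
           is a rough Python sketch. It assumes shell-style wildcard matching
           and tilde expansion; the function is_skipped and its home argument
           are hypothetical, and Recoll's own matching may differ in details.

```python
import fnmatch

def is_skipped(path, patterns, home):
    """Return True if path matches one of the skippedPaths-style patterns.

    Assumption: patterns are shell-style wildcards, with a leading "~"
    standing for the home directory.
    """
    for pattern in patterns:
        if pattern == "~" or pattern.startswith("~/"):
            pattern = home + pattern[1:]
        # Case-sensitive wildcard match on the whole path.
        if fnmatch.fnmatchcase(path, pattern):
            return True
    return False
```

           With home set to /home/me, /home/me/somedir/notes.txt is skipped,
           while /home/me/somedir/notes.pdf and /home/me/other/notes.txt are
           not. In this sketch "*" also matches "/" characters, so text files
           in sub-directories of somedir are skipped too; whether Recoll
           behaves the same way is not stated here.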
323
             
301
324
302
   loglevel,daemloglevel
325
   loglevel,daemloglevel
303
326
304
           Verbosity level for recoll and recollindex. A value of 4 lists
327
           Verbosity level for recoll and recollindex. A value of 4 lists
305
           quite a lot of debug/information messages. 2 only lists errors.
328
           quite a lot of debug/information messages. 2 only lists errors.
...
...
422
   non-default entries, which will override those from the central
445
   non-default entries, which will override those from the central
423
   configuration file.
446
   configuration file.
424
447
425
   Please note that these entries must be placed under a [view] section.
448
   Please note that these entries must be placed under a [view] section.
426
449
450
   If "Use desktop preferences to choose document editor" is checked in the
451
   user preferences, all mimeview entries will be ignored except the one
452
   labelled application/x-all (which is set to use xdg-open by default).
453
454
4.4.5. Examples of configuration adjustments
455
456
  4.4.5.1. Adding an external viewer for a non-indexed type
457
458
   Imagine that you have some kind of file which does not have indexable
459
   content, but for which you would like to have a functional Edit link in
460
   the result list (when found by file name). The file names end in .blob and
461
   can be displayed by application blobviewer.
462
463
   You need two entries in the configuration files for this to work:
464
465
     * In $RECOLL_CONFDIR/mimemap (typically ~/.recoll/mimemap), add the
466
       following line:
467
468
              application/x-blobapp = .blob
469
          
470
471
       Note that the mime type is made up here, and you could call it
472
       diesel/oil just the same.
473
474
     * In $RECOLL_CONFDIR/mimeview under the [view] section:
475
476
                  application/x-blobapp = blobviewer %f
477
             
478
479
       We are supposing here that blobviewer wants a file name parameter; you
480
       would use %u if it preferred URLs.
481
482
   If you just wanted to change the application used by Recoll to display a
483
   mime type which it already knows, you would just need to edit mimeview.
484
   The entries you add in your personal file override those in the central
485
   configuration, which you do not need to alter.
486
487
  4.4.5.2. Adding indexing support for a new file type
488
489
   Let us now imagine that the above .blob files actually contain indexable
490
   text and that you know how to extract it with a command line program.
491
   Getting Recoll to index the files is easy. You need to perform the above
492
   alteration, and also to add data to the mimeconf file (typically in
493
   ~/.recoll/mimeconf):
494
495
     * Under the [index] section, add the following line (more about the
496
       rclblob indexing script later):
497
498
                  application/x-blobapp = exec rclblob
499
             
500
501
     * Under the [icons] section, you should choose an icon to be displayed
502
       for the files inside the result lists. Icons are normally 64x64 pixel
503
       PNG files which live in /usr/[local/]share/recoll/images.
504
505
     * Under the [categories] section, you should add the mime type where it
506
       makes sense (you can also create a category). Categories may be used
507
       for filtering in advanced search.
508
509
   The rclblob filter should be an executable program or script which exists
510
   inside /usr/[local/]share/recoll/filters. It will be given a file name as
511
   argument and should output the text contents in HTML format on the
512
   standard output.
513
514
   The HTML could be very minimal, as in the following example:
515
516
 <html><head>
517
 <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
518
 </head>
519
 <body>some text content</body></html>
520
         
521
522
   You should take care to escape some characters inside the text by
523
   transforming them into appropriate entities. "&" should be transformed
524
   into "&amp;", "<" should be transformed into "&lt;".
525
526
   The character set needs to be specified in the header. It does not need to
527
   be UTF-8 (Recoll will take care of translating it), but it must be
528
   accurate for good results.
529
530
   Recoll will also make use of other header fields if they are present:
531
   title, description, keywords.
532
533
   The easiest way to write a new filter is probably to start from an
534
   existing one.
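
   As an illustration of the requirements above, a filter such as rclblob
   could be sketched in Python as follows. This is only a minimal sketch: the
   extract_text step is hypothetical (it simply reads the file as UTF-8 text,
   where a real filter would call the actual extraction program), and the
   helper names are made up for the example.

```python
#!/usr/bin/env python3
import html
import sys

def extract_text(path):
    # Hypothetical extraction step: a real filter would run the program
    # that knows how to pull text out of a .blob file.
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.read()

def to_html(text, title=""):
    """Wrap extracted text in a minimal HTML document.

    html.escape turns "&" into "&amp;" and "<" into "&lt;" (and ">" into
    "&gt;"), and the header declares the character set of the output.
    """
    return (
        "<html><head>\n"
        '<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">\n'
        "<title>" + html.escape(title, quote=False) + "</title>\n"
        "</head>\n"
        "<body>" + html.escape(text, quote=False) + "</body></html>\n"
    )

if __name__ == "__main__" and len(sys.argv) == 2:
    # Invoked with a single file name argument: write the HTML to the
    # standard output as UTF-8 bytes, matching the declared charset.
    document = to_html(extract_text(sys.argv[1]), title=sys.argv[1])
    sys.stdout.buffer.write(document.encode("utf-8"))
```

   The output is encoded explicitly so that the bytes written always agree
   with the character set announced in the header, whatever the locale.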
535
427
   --------------------------------------------------------------------------
536
   --------------------------------------------------------------------------
428
537
429
   Prev                               Home                                    
538
   Prev                               Home                                    
430
   Building from source                Up                                     
539
   Building from source                Up