a/src/README b/src/README
...
...
6
6
7
  Jean-Francois Dockes
7
  Jean-Francois Dockes
8
8
9
   <jfd@recoll.org>
9
   <jfd@recoll.org>
10
10
11
   Copyright (c) 2005-2011 Jean-Francois Dockes
11
   Copyright (c) 2005-2012 Jean-Francois Dockes
12
12
13
   This document introduces full text search notions and describes the
13
   This document introduces full text search notions and describes the
14
   installation and use of the Recoll application. It currently describes
14
   installation and use of the Recoll application. It currently describes
15
   Recoll 1.16.
15
   Recoll 1.17.
16
16
17
   [ Split HTML / Single HTML ]
17
   [ Split HTML / Single HTML ]
18
18
19
     ----------------------------------------------------------------------
19
     ----------------------------------------------------------------------
20
20
...
...
108
108
109
   4. Programming interface
109
   4. Programming interface
110
110
111
                4.1. Writing a document filter
111
                4.1. Writing a document filter
112
112
113
                             4.1.1. Simple filters
114
115
                             4.1.2. Telling Recoll about the filter
116
113
                             4.1.1. Filter HTML output
117
                             4.1.3. Filter HTML output
114
118
115
                4.2. Field data processing
119
                4.2. Field data processing
116
120
117
                4.3. API
121
                4.3. API
118
122
...
...
244
   something like /usr/[local/]share/recoll/examples) during installation.
248
   something like /usr/[local/]share/recoll/examples) during installation.
245
   The default parameters from this file may be overridden by values that you
249
   The default parameters from this file may be overridden by values that you
246
   set inside your personal configuration, found by default in the .recoll
250
   set inside your personal configuration, found by default in the .recoll
247
   sub-directory of your home directory. The default configuration will index
251
   sub-directory of your home directory. The default configuration will index
248
   your home directory with default parameters and should be sufficient for
252
   your home directory with default parameters and should be sufficient for
249
   giving Recoll a try, but you may want to adjust it later.
253
   giving Recoll a try, but you may want to adjust it later, which can be
254
   done either by editing the text files or by using configuration menus in
255
   the recoll GUI.
250
256
251
   Indexing is started automatically the first time you execute the recoll
257
   Indexing is started automatically the first time you execute the recoll
252
   search graphical user interface, or by executing the recollindex command.
258
   search graphical user interface, or by executing the recollindex command.
253
259
254
   Searches are usually performed inside the recoll graphical user interface
260
   Searches are usually performed inside the recoll graphical user interface
...
...
264
2.1. Introduction
270
2.1. Introduction
265
271
266
   Indexing is the process by which the set of documents is analyzed and the
272
   Indexing is the process by which the set of documents is analyzed and the
267
   data entered into the database. Recoll indexing is normally incremental:
273
   data entered into the database. Recoll indexing is normally incremental:
268
   documents will only be processed if they have been modified. On the first
274
   documents will only be processed if they have been modified. On the first
269
   execution, of course, all documents will need processing. A full index
275
   execution, all documents will need processing. A full index build can be
270
   build can be forced later by specifying an option to the indexing command
276
   forced later by specifying an option to the indexing command (recollindex
271
   (recollindex -z).
277
   -z).
272
278
273
   Recoll indexing can be performed with two different methods:
279
   Recoll indexing can be performed with two different methods:
274
280
275
     * Periodic indexing: indexing takes place at discrete times, by
281
     * Periodic indexing: indexing takes place at discrete times, by
276
       executing the recollindex command. The typical usage is to have a
282
       executing the recollindex command. The typical usage is to have a
...
...
284
   The choice between the two methods is mostly a matter of preference, and
290
   The choice between the two methods is mostly a matter of preference, and
285
   they can be combined by setting up multiple indexes (ie: use periodic
291
   they can be combined by setting up multiple indexes (ie: use periodic
286
   indexing on a big documentation directory, and real time indexing on a
292
   indexing on a big documentation directory, and real time indexing on a
287
   small home directory). Monitoring a big file system tree can consume
293
   small home directory). Monitoring a big file system tree can consume
288
   significant system resources.
294
   significant system resources.
289
290
   
291
295
292
   Recoll knows about quite a few different document types. The parameters
296
   Recoll knows about quite a few different document types. The parameters
293
   for document types recognition and processing are set in configuration
297
   for document types recognition and processing are set in configuration
294
   files.
298
   files.
295
299
...
...
299
   compound ones. Such hierarchies can go quite deep, and Recoll has no
303
   compound ones. Such hierarchies can go quite deep, and Recoll has no
300
   problem processing, for example, an ms-word document which would be an
304
   problem processing, for example, an ms-word document which would be an
301
   attachment to an email message part of a folder file archived inside a zip
305
   attachment to an email message part of a folder file archived inside a zip
302
   file...
306
   file...
303
307
304
   Recoll indexing processes plain text, HTML, openoffice and e-mail files
308
   Recoll indexing processes plain text, HTML, openoffice and e-mail files,
305
   internally (a few more actually).
309
   and a few others internally.
306
310
307
   Other file types (ie: postscript, pdf, ms-word, rtf ...) need external
311
   Other file types (ie: postscript, pdf, ms-word, rtf ...) need external
308
   applications for preprocessing. The list is in the installation section.
312
   applications for preprocessing. The list is in the installation section.
309
   After every indexing operation, Recoll updates a list of commands that
313
   After every indexing operation, Recoll updates a list of commands that
310
   would be needed for indexing existing file types. This list can be
314
   would be needed for indexing existing file types. This list can be
...
...
341
       different areas of the file system to different indexes. For example,
345
       different areas of the file system to different indexes. For example,
342
       if you were to issue the following commands:
346
       if you were to issue the following commands:
343
347
344
 export RECOLL_CONFDIR=~/.indexes-email
348
 export RECOLL_CONFDIR=~/.indexes-email
345
 recoll
349
 recoll
346
         
350
          
347
351
348
       Then Recoll would use configuration files stored in ~/.indexes-email/
352
       Then Recoll would use configuration files stored in ~/.indexes-email/
349
       and, (unless specified otherwise in recoll.conf) would look for the
353
       and, (unless specified otherwise in recoll.conf) would look for the
350
       index in ~/.indexes-email/xapiandb/.
354
       index in ~/.indexes-email/xapiandb/.
351
355
...
...
378
382
379
     ----------------------------------------------------------------------
383
     ----------------------------------------------------------------------
380
384
381
  2.2.1. Xapian index formats
385
  2.2.1. Xapian index formats
382
386
383
   If your first installation of Recoll was 1.9.0 or more recent, you can
387
   Xapian versions usually support several formats for index storage. A given
384
   skip this section.
388
   major Xapian version will have a current format, used to create new
389
   indexes, and will also support the format from the previous major version.
385
390
386
   Xapian has had two possible index formats for quite some time. The "old"
387
   one named Quartz, and the new one named Flint. Xapian 0.9 used Quartz by
388
   default, but could use Flint if a specific environment variable
389
   (XAPIAN_PREFER_FLINT) was set. Xapian 1.0 still supports Quartz but will
390
   use Flint by default for new index creations.
391
392
   The number of disk accesses performed during indexing has been much
393
   optimized in the new Flint engine and you may see indexing times improved
394
   by 50% in some cases (compared to Quartz), typically for big indexes where
395
   disk accesses dominate the indexing time. There is also a more modest
396
   improvement of index size.
397
398
   Xapian will not convert automatically an existing index from the Quartz to
391
   Xapian will not convert automatically an existing index from the older
399
   the Flint format. If you have an older index and want to take advantage of
392
   format to the newer one. If you want to upgrade to the new format, or if a
400
   the new format (which can be done without setting the environment variable
393
   very old index needs to be converted because its format is not supported
401
   as of Recoll 1.8.2 and Xapian 1.0.0), you will have to explicitly delete
394
   any more, you will have to explicitly delete the old index, then run a
402
   the old index, then run a normal indexing process.
395
   normal indexing process.
403
396
404
   Unfortunately, using the -z option to recollindex is not sufficient to
397
   Unfortunately, using the -z option to recollindex is not sufficient to
405
   change the format, you have to delete all files inside the index directory
398
   change the format, you will have to delete all files inside the index
406
   (typically ~/.recoll/xapiandb) before starting indexing.
399
   directory (typically ~/.recoll/xapiandb) before starting the indexing.
407
400
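   The reset described above can be sketched in shell as follows. This is
   only an illustration: the helper name reset_recoll_index is hypothetical
   (it is not part of Recoll), the default path is just the typical location
   named in the text, and the -mindepth and -delete options assume a GNU or
   BSD find.

```shell
# Hypothetical helper: empty an index directory so that the next indexing
# run recreates the index in the current Xapian format.
reset_recoll_index() {
    # Default to the typical index location; pass another path as $1 if needed.
    dbdir="${1:-$HOME/.recoll/xapiandb}"
    if [ ! -d "$dbdir" ]; then
        echo "no index directory at $dbdir" >&2
        return 1
    fi
    # Delete everything inside the directory, but keep the directory itself.
    find "$dbdir" -mindepth 1 -delete
}

# Usage (commented out because it destroys the existing index):
# reset_recoll_index && recollindex
```

   Once the directory is empty, a normal recollindex run rebuilds the index
   from scratch.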
408
     ----------------------------------------------------------------------
401
     ----------------------------------------------------------------------
409
402
410
  2.2.2. Security aspects
403
  2.2.2. Security aspects
411
404
412
   The Recoll index does not hold copies of the indexed documents. But it
405
   The Recoll index does not hold copies of the indexed documents. But it
413
   does hold enough data to allow for an almost complete reconstruction. If
406
   does hold enough data to allow for an almost complete reconstruction. If
414
   confidential data is indexed, access to the database directory should be
407
   confidential data is indexed, access to the database directory should be
415
   restricted.
408
   restricted.
416
409
417
   As of version 1.4, Recoll will create the configuration directory with a
410
   Recoll (since version 1.4) will create the configuration directory with a
418
   mode of 0700 (access by owner only). As the index data directory is by
411
   mode of 0700 (access by owner only). As the index data directory is by
419
   default a sub-directory of the configuration directory, this should result
412
   default a sub-directory of the configuration directory, this should result
420
   in appropriate protection.
413
   in appropriate protection.
421
414
422
   If you use another setup, you should think of the kind of protection you
415
   If you use another setup, you should think of the kind of protection you
...
...
505
2.5. Periodic indexing
498
2.5. Periodic indexing
506
499
507
  2.5.1. Running indexing
500
  2.5.1. Running indexing
508
501
509
   Indexing is performed either by the recollindex program, or by the
502
   Indexing is performed either by the recollindex program, or by the
510
   indexing thread inside the recoll program (use the File menu). Both
503
   indexing thread inside the recoll program (start it from the File menu).
511
   programs will use the RECOLL_CONFDIR variable or accept a -c confdir
504
   Both programs will use the RECOLL_CONFDIR variable or accept a -c confdir
512
   option to specify a non-default configuration directory.
505
   option to specify a non-default configuration directory.
513
506
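   As a small sketch of the -c option mentioned above, the following shell
   function runs one indexing pass against a given configuration directory.
   The function name index_with_config is hypothetical, and the directory
   passed to it is whatever configuration directory you have set up.

```shell
# Hypothetical wrapper: run one indexing pass for a non-default
# configuration directory.
index_with_config() {
    confdir="$1"
    if [ -z "$confdir" ]; then
        echo "usage: index_with_config CONFDIR" >&2
        return 2
    fi
    # -c selects the configuration directory, as RECOLL_CONFDIR would.
    recollindex -c "$confdir"
}

# Usage:
# index_with_config "$HOME/.indexes-email"
```

   Setting RECOLL_CONFDIR in the environment before calling recollindex
   should have the same effect; the option is simply easier to use in
   scripts that handle several indexes.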
514
   Reasons to use either the indexing thread or the recollindex command:
507
   There are reasons to use either the indexing thread or the recollindex
508
   command, but it is also a matter of personal preferences:
515
509
516
     * Starting the indexing thread is more convenient, being just one click
510
     * Starting the indexing thread is more convenient, being just one click
517
       away.
511
       away.
518
512
519
     * The recollindex command has more options, especially the one to reset
513
     * The recollindex command has more options, especially the one to reset
...
...
521
515
522
     * The recollindex command will not take down your GUI if it crashes (a
516
     * The recollindex command will not take down your GUI if it crashes (a
523
       rare occurrence, but who knows...)
517
       rare occurrence, but who knows...)
524
518
525
     * The recollindex command uses setpriority/nice to lower its priority
519
     * The recollindex command uses setpriority/nice to lower its priority
526
       while indexing (it will also use ionice when this becomes more widely
520
       while indexing. When available (and for Recoll version 1.16.2 and
521
       newer), it also uses the ionice command to lower its IO priority. The
527
       available), the thread can't do it, else it would also slow down the
522
       thread can't do it, else it would also slow down the user/search
528
       user/search interface.
523
       interface.
529
530
   I'll let the reader decide where my heart belongs...
531
524
532
   If the recoll program finds no index when it starts, it will automatically
525
   If the recoll program finds no index when it starts, it will automatically
533
   start indexing (except if canceled).
526
   start indexing (except if canceled).
534
527
535
   The recollindex indexing process can be interrupted by sending an
528
   The recollindex indexing process can be interrupted by sending an
...
...
594
   index.
587
   index.
595
588
596
   The real time indexing support can be customised during package
589
   The real time indexing support can be customised during package
597
   configuration with the --with[out]-fam or --with[out]-inotify options. The
590
   configuration with the --with[out]-fam or --with[out]-inotify options. The
598
   default is currently to include inotify monitoring on systems that support
591
   default is currently to include inotify monitoring on systems that support
599
   it.
592
   it, and, as of recoll 1.17, gamin support on FreeBSD.
600
593
601
   The rclmon.sh script can be used to easily start and stop the daemon. It
594
   The rclmon.sh script can be used to easily start and stop the daemon. It
602
   can be found in the examples directory (typically
595
   can be found in the examples directory (typically
603
   /usr/local/[share/]recoll/examples).
596
   /usr/local/[share/]recoll/examples).
604
597
...
...
608
601
609
 recollconf=$HOME/.recoll-home
602
 recollconf=$HOME/.recoll-home
610
 recolldata=/usr/local/share/recoll
603
 recolldata=/usr/local/share/recoll
611
 RECOLL_CONFDIR=$recollconf $recolldata/examples/rclmon.sh start
604
 RECOLL_CONFDIR=$recollconf $recolldata/examples/rclmon.sh start
612
605
613
 fvwm 

606
 fvwm
614
607
615
   The indexing daemon gets started, then the window manager, for which the
608
   The indexing daemon gets started, then the window manager, for which the
616
   session waits.
609
   session waits.
617
610
618
   By default the indexing daemon will monitor the state of the X11 session,
611
   By default the indexing daemon will monitor the state of the X11 session,
...
...
622
   Under KDE, you can place a small script to start recollindex -m under
615
   Under KDE, you can place a small script to start recollindex -m under
623
   $HOME/.kde/Autostart. This will be executed when the session begins.
616
   $HOME/.kde/Autostart. This will be executed when the session begins.
624
617
625
   There is a similar mechanism under Gnome (find the session control tool in
618
   There is a similar mechanism under Gnome (find the session control tool in
626
   the menus and use the "Startup programs" tab).
619
   the menus and use the "Startup programs" tab).
620
621
   If you use the daemon completely out of an X11 session, you need to add
622
   option -x to disable X11 session monitoring (else the daemon will not
623
   start).
627
624
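   A possible way to set up the KDE autostart script described above is
   sketched below. The function name install_recoll_autostart and the script
   file name recoll-monitor.sh are assumptions made for this example; only
   the recollindex -m command and the $HOME/.kde/Autostart location come from
   the text.

```shell
# Hypothetical installer: write a small autostart script that launches the
# real time indexing daemon when the session begins.
install_recoll_autostart() {
    # Default to the KDE autostart directory; pass another directory as $1.
    dir="${1:-$HOME/.kde/Autostart}"
    script="$dir/recoll-monitor.sh"   # file name chosen for this example
    mkdir -p "$dir" || return 1
    cat > "$script" <<'EOF'
#!/bin/sh
# Start the Recoll real time indexing daemon for this session.
exec recollindex -m
EOF
    chmod +x "$script"
}

# Usage:
# install_recoll_autostart
```

   For a daemon started completely outside of an X11 session, the generated
   command would also need the -x option.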
628
   By default, the messages from the indexing daemon will be discarded. You
625
   By default, the messages from the indexing daemon will be discarded. You
629
   may want to change this by setting the daemlogfilename and daemloglevel
626
   may want to change this by setting the daemlogfilename and daemloglevel
630
   configuration parameters. Also the log file will only be truncated when
627
   configuration parameters. Also the log file will only be truncated when
631
   the daemon starts. If the daemon runs permanently, the log file may grow
628
   the daemon starts. If the daemon runs permanently, the log file may grow
...
...
880
   can be resized, and their order can be changed (by dragging). All the
877
   can be resized, and their order can be changed (by dragging). All the
881
   changes are recorded when you quit recoll
878
   changes are recorded when you quit recoll
882
879
883
   Hovering over a table row will update the detail area at the bottom of the
880
   Hovering over a table row will update the detail area at the bottom of the
884
   window with the corresponding values. You can click the row to freeze the
881
   window with the corresponding values. You can click the row to freeze the
885
   display. The bottom area is equivalent to a classical result list
882
   display. The bottom area is equivalent to a result list paragraph, with
886
   paragraph, with links for starting a preview or a native application, and
883
   links for starting a preview or a native application, and an equivalent
887
   an equivalent right-click menu. Typing Esc (the Escape key) will unfreeze
884
   right-click menu. Typing Esc (the Escape key) will unfreeze the display.
888
   the display.
889
885
890
     ----------------------------------------------------------------------
886
     ----------------------------------------------------------------------
891
887
892
  3.1.4. The preview window
888
  3.1.4. The preview window
893
889
...
...
1115
     ----------------------------------------------------------------------
1111
     ----------------------------------------------------------------------
1116
1112
1117
  3.1.9. Sorting search results and collapsing duplicates
1113
  3.1.9. Sorting search results and collapsing duplicates
1118
1114
1119
   The documents in a result list are normally sorted in order of relevance.
1115
   The documents in a result list are normally sorted in order of relevance.
1120
   It is possible to specify different sort parameters by using the Sort
1116
   It is possible to specify a different sort order, either by using the
1121
   parameters dialog (located in the Tools menu).
1117
   vertical arrows in the GUI toolbox to sort by date, or switching to the
1122
1118
   result table display and clicking on any header. The sort order chosen
1123
   The tool sorts a specified number of the most relevant documents in the
1119
   inside the result table remains active if you switch back to the result
1124
   result list, according to specified criteria. The currently available
1120
   list, until you click one of the vertical arrows, until both are unchecked
1125
   criteria are date and mime type.
1121
   (you are back to sort by relevance).
1126
1127
   The sort parameters stay in effect until they are explicitly reset, or the
1128
   program exits. An activated sort is indicated in the result list header.
1129
1122
1130
   Sort parameters are remembered between program invocations, but result
1123
   Sort parameters are remembered between program invocations, but result
1131
   sorting is normally always inactive when the program starts. It is
1124
   sorting is normally always inactive when the program starts. It is
1132
   possible to keep the sorting activation state between program invocations
1125
   possible to keep the sorting activation state between program invocations
1133
   by checking the Remember sort activation state option in the preferences.
1126
   by checking the Remember sort activation state option in the preferences.
...
...
1197
   but will give a relevance boost to the results where the search terms
1190
   but will give a relevance boost to the results where the search terms
1198
   appear as a phrase. Ie: searching for virtual reality will still find all
1191
   appear as a phrase. Ie: searching for virtual reality will still find all
1199
   documents where either virtual or reality or both appear, but those which
1192
   documents where either virtual or reality or both appear, but those which
1200
   contain virtual reality should appear sooner in the list.
1193
   contain virtual reality should appear sooner in the list.
1201
1194
1195
   Phrase searches can strongly slow down a query if most of the terms in the
1196
   phrase are common. This is why the autophrase option is off by default for
1197
   Recoll versions before 1.17. As of version 1.17, autophrase is on by
1198
   default, but very common terms will be removed from the constructed
1199
   phrase. The removal threshold can be adjusted from the search preferences.
1200
1201
   Phrases and abbreviations. As of Recoll version 1.17, dotted abbreviations
1202
   like I.B.M. are also automatically indexed as a word without the dots:
1203
   IBM. Searching for the word inside a phrase (ie: "the IBM company") will
1204
   only match the dotted abbreviation if you increase the phrase slack (using
1205
   the advanced search panel control, or the o query language modifier).
1206
   Literal occurrences of the word will be matched normally.
1207
1202
     ----------------------------------------------------------------------
1208
     ----------------------------------------------------------------------
1203
1209
1204
    3.1.10.3. Others
1210
    3.1.10.3. Others
1205
1211
1206
   Using fields. You can use the query language and field specifications to
1212
   Using fields. You can use the query language and field specifications to
...
...
1245
   the parameters used for searching and returning results, and what indexes
1251
   the parameters used for searching and returning results, and what indexes
1246
   are searched.
1252
   are searched.
1247
1253
1248
   User interface parameters:
1254
   User interface parameters:
1249
1255
1250
     * Number of results in a result page:
1251
1252
     * Hide duplicate results: decides if result list entries are shown for
1253
       identical documents found in different places.
1254
1255
     * Highlight color for query terms: Terms from the user query are
1256
     * Highlight color for query terms: Terms from the user query are
1256
       highlighted in the result list samples and the preview window. The
1257
       highlighted in the result list samples and the preview window. The
1257
       color can be chosen here. Any Qt color string should work (ie red,
1258
       color can be chosen here. Any Qt color string should work (ie red,
1258
       #ff0000). The default is blue.
1259
       #ff0000). The default is blue.
1259
1260
1260
     * Result list font: There is quite a lot of information shown in the
1261
     * Style sheet: The name of a Qt style sheet text file which is applied
1261
       result list, and you may want to customize the font and/or font size.
1262
       to the whole Recoll application on startup. The default value is
1262
       The rest of the fonts used by Recoll are determined by your generic Qt
1263
       empty, but there is a skeleton style sheet (recoll.qss) inside the
1263
       config (try the qtconfig command).
1264
       /usr/share/recoll/examples directory. Using a style sheet, you can
1264
1265
       change most Recoll graphical parameters: colors, fonts, etc. See the
1265
     * Result paragraph format string: allows you to change the presentation
1266
       sample file for a few simple examples.
1266
       of each result list entry. This is described in its own section.
1267
1268
     * Abstract snippet separator: for synthetic abstracts built from index
1269
       data, which are usually made of several snippets from different parts
1270
       of the document, this defines the snippet separator, an ellipsis by
1271
       default.
1272
1267
1273
     * Maximum text size highlighted for preview Inserting highlights on
1268
     * Maximum text size highlighted for preview Inserting highlights on
1274
       search term inside the text before inserting it in the preview window
1269
       search term inside the text before inserting it in the preview window
1275
       involves quite a lot of processing, and can be disabled over the given
1270
       involves quite a lot of processing, and can be disabled over the given
1276
       text size to speed up loading.
1271
       text size to speed up loading.
1272
1273
     * Prefer HTML to plain text for preview if set, Recoll will display HTML
1274
       as such inside the preview window. If this causes problems with the Qt
1275
       HTML display, you can uncheck it to display the plain text version
1276
       instead.
1277
1278
     * Use <PRE> tags instead of <BR> to display plain text as HTML in
1279
       preview: when displaying plain text inside the preview window, Recoll
1280
       tries to preserve some of the original text line breaks and
1281
       indentation. It can either use PRE HTML tags, which will well preserve
1282
       the indentation but will force horizontal scrolling for long lines, or
1283
       use BR tags to break at the original line breaks, which will let the
1284
       editor introduce other line breaks according to the window width, but
1285
       will lose some of the original indentation.
1277
1286
1278
     * Use desktop preferences to choose document editor: if this is checked,
1287
     * Use desktop preferences to choose document editor: if this is checked,
1279
       the xdg-open utility will be used to open files when you click the
1288
       the xdg-open utility will be used to open files when you click the
1280
       Open link in the result list, instead of the application defined in
1289
       Open link in the result list, instead of the application defined in
1281
       mimeview. xdg-open will in turn use your desktop preferences to choose
1290
       mimeview. xdg-open will in turn use your desktop preferences to choose
...
...
1299
1308
1300
     * Remember sort activation state if set, Recoll will remember the sort
1309
     * Remember sort activation state if set, Recoll will remember the sort
1301
       tool state between invocations. It normally starts with sorting
1310
       tool state between invocations. It normally starts with sorting
1302
       disabled.
1311
       disabled.
1303
1312
1304
     * Prefer HTML to plain text for preview if set, Recoll will display HTML
1313
   Result list parameters:
1305
       as such inside the preview window. If this causes problems with the Qt
1314
1306
       HTML display, you can uncheck it to display the plain text version
1315
     * Number of results in a result page
1307
       instead.
1316
1317
     * Result list font: There is quite a lot of information shown in the
1318
       result list, and you may want to customize the font and/or font size.
1319
       The rest of the fonts used by Recoll are determined by your generic Qt
1320
       config (try the qtconfig command).
1321
1322
     * Edit result list paragraph format string: allows you to change the
1323
       presentation of each result list entry. See the result list
1324
       customisation section.
1325
1326
     * Edit result page html header insert: allows you to define text
1327
       inserted at the end of the result page html header. More detail in the
1328
       result list customisation section.
1329
1330
     * Date format: allows specifying the format used for displaying dates
1331
       inside the result list. This should be specified as an strftime()
1332
       string (man strftime).
1333
1334
     * Abstract snippet separator: for synthetic abstracts built from index
1335
       data, which are usually made of several snippets from different parts
1336
       of the document, this defines the snippet separator, an ellipsis by
1337
       default.
1308
1338
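   The Date format parameter in the list above takes an strftime() string.
   One way to try a format before entering it in the preferences is the
   date command, which accepts the same conversion specifiers. The function
   name preview_date_format is hypothetical and only wraps date for
   convenience.

```shell
# Hypothetical helper: print the current time rendered with the given
# strftime() format string.
preview_date_format() {
    date "+$1"
}

# Usage:
# preview_date_format '%Y-%m-%d %H:%M'
```

   The output shows how a date would look with that format; the exact set
   of supported specifiers depends on the platform's strftime().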
1309
   Search parameters:
1339
   Search parameters:
1340
1341
     * Hide duplicate results: decides if result list entries are shown for
1342
       identical documents found in different places.
1310
1343
1311
     * Stemming language: stemming obviously depends on the document's
1344
     * Stemming language: stemming obviously depends on the document's
1312
       language. This listbox will let you choose among the stemming databases
1345
       language. This listbox will let you choose among the stemming databases
1313
       which were built during indexing (this is set in the main
1346
       which were built during indexing (this is set in the main
1314
       configuration file), or later added with recollindex -s (See the
1347
       configuration file), or later added with recollindex -s (See the
1315
       recollindex manual). Stemming languages which are dynamically added
1348
       recollindex manual). Stemming languages which are dynamically added
1316
       will be deleted at the next indexing pass unless they are also added
1349
       will be deleted at the next indexing pass unless they are also added
1317
       in the configuration file.
1350
       in the configuration file.
1318
1351
1319
     * Dynamically add phrase to simple searches: a phrase will be
1352
     * Automatically add phrase to simple searches: a phrase will be
1320
       automatically built and added to simple searches when looking for Any
1353
       automatically built and added to simple searches when looking for Any
1321
       terms. This will give a relevance boost to the results where the
1354
       terms. This will give a relevance boost to the results where the
1322
       search terms appear as a phrase (consecutive and in order).
1355
       search terms appear as a phrase (consecutive and in order).
1356
1357
     * Autophrase term frequency threshold percentage: very frequent terms
1358
       should not be included in automatic phrase searches for performance
1359
       reasons. The parameter defines the cutoff percentage (percentage of
1360
       the documents where the term appears).
1323
1361
1324
     * Replace abstracts from documents: this decides if we should synthesize
1362
     * Replace abstracts from documents: this decides if we should synthesize
1325
       and display an abstract in place of an explicit abstract found within
1363
       and display an abstract in place of an explicit abstract found within
1326
       the document itself.
1364
       the document itself.
1327
1365
...
...
1356
   alternative indexer may also need to implement a way of purging the index
1394
   alternative indexer may also need to implement a way of purging the index
1357
   from stale data,
1395
   from stale data,
1358
1396
1359
     ----------------------------------------------------------------------
1397
     ----------------------------------------------------------------------
1360
1398
1361
    3.1.11.1. The result list paragraph format
1399
    3.1.11.1. The result list format
1362
1400
1363
   The presentation of each result inside the result list can be customized
1401
   The result list presentation can be exhaustively customized by adjusting
1364
   by setting the result list paragraph format inside the User Interface tab
1402
   two elements:
1365
   of the Query configuration.
1366
1403
1404
     * The paragraph format
1405
1406
     * HTML code inside the header section
1407
1408
   These can be edited from the Result list tab of the Query configuration.
1409
1410
   Newer versions of Recoll (from 1.17) use a WebKit HTML object by default
1411
   (this may be disabled at build time), and total customisation is possible
1412
   with full support for CSS and Javascript. Conversely, there are limits to
1413
   what you can do with the older Qt QTextBrowser, but still, it is possible
1414
   to decide what data each result will contain, and how it will be
1415
   displayed.
1416
1417
   No more detail will be given about the header part (only useful with the
1418
   WebKit build): if there are restrictions on what you can do, they are
1419
   beyond this author's HTML/CSS/Javascript abilities...
1420
1421
     ----------------------------------------------------------------------
1422
1423
      3.1.11.1.1. The paragraph format
1424
1367
   This is a Qt HTML string where the following printf-like % substitutions
1425
   This is an arbitrary HTML string where the following printf-like %
1368
   will be performed:
1426
   substitutions will be performed:
1369
1427
1370
     * %A. Abstract
1428
     * %A. Abstract
1371
1429
1372
     * %D. Date
1430
     * %D. Date
1373
1431
1374
     * %I. Icon image name
1432
     * %I. Icon image name. This is normally determined from the mime type.
1433
       The associations are defined inside the mimeconf configuration file.
1434
       If a thumbnail for the file is found at the standard Freedesktop
1435
       location, this will be displayed instead.
1375
1436
1376
     * %K. Keywords (if any)
1437
     * %K. Keywords (if any)
1377
1438
1378
     * %L. Preview and Edit links
1439
     * %L. Precooked Preview and Edit links
1379
1440
1380
     * %M. Mime type
1441
     * %M. Mime type
1381
1442
1382
     * %N. result Number
1443
     * %N. result Number inside the result page
1383
1444
1384
     * %R. Relevance percentage
1445
     * %R. Relevance percentage
1385
1446
1386
     * %S. Size information
1447
     * %S. Size information
1387
1448
1388
     * %T. Title
1449
     * %T. Title
1389
1450
1390
     * %U. Url
1451
     * %U. Url
1391
1452
1392
   The format of the Preview and Edit links is <a href="P%N"> and <a
1453
   The format of the Preview and Edit links is <a href="P%N"> and <a
1393
   href="E%N"> where docnum (%N expands to the document number inside the
1454
   href="E%N"> where docnum (%N) expands to the document number inside the
1394
   result list).
1455
   result page).
1395
1456
1396
   In addition to the predefined values above, all strings like %(fieldname)
1457
   In addition to the predefined values above, all strings like %(fieldname)
1397
   will be replaced by the value of the field named fieldname for this
1458
   will be replaced by the value of the field named fieldname for this
1398
   document. Only stored fields can be accessed in this way; the value of
1459
   document. Only stored fields can be accessed in this way; the value of
1399
   indexed but not stored fields is not known at this point in the search
1460
   indexed but not stored fields is not known at this point in the search
...
...
1408
   The default value for the paragraph format string is:
1469
   The default value for the paragraph format string is:
1409
1470
1410
 <img src="%I" align="left">%R %S %L &nbsp;&nbsp;<b>%T</b><br>
1471
 <img src="%I" align="left">%R %S %L &nbsp;&nbsp;<b>%T</b><br>
1411
 %M&nbsp;%D&nbsp;&nbsp;&nbsp;<i>%U</i>&nbsp;%i<br>
1472
 %M&nbsp;%D&nbsp;&nbsp;&nbsp;<i>%U</i>&nbsp;%i<br>
1412
 %A %K
1473
 %A %K
1413
       
1474
        
1414
1475
1415
   You may, for example, try the following for a more web-like experience:
1476
   You may, for example, try the following for a more web-like experience:
1416
1477
1417
 <u><b><a href="P%N">%T</a></b></u><br>
1478
 <u><b><a href="P%N">%T</a></b></u><br>
1418
 %A<font color=#008000>%U - %S</font> - %L
1479
 %A<font color=#008000>%U - %S</font> - %L
1419
       
1480
        
1420
1481
1421
   Or the clean looking:
1482
   Or the clean looking:
1422
1483
1423
 <img src="%I" align="left">%L <font color="#900000">%R</font>
1484
 <img src="%I" align="left">%L <font color="#900000">%R</font>
1424
   <b>%T</b><br>%S 
1485
   <b>%T</b><br>%S
1425
 <font color="#808080"><i>%U</i></font>
1486
 <font color="#808080"><i>%U</i></font>
1426
 <table bgcolor="#e0e0e0">
1487
 <table bgcolor="#e0e0e0">
1427
 <tr><td><div>%A</div></td></tr>
1488
 <tr><td><div>%A</div></td></tr>
1428
 </table>%K
1489
 </table>%K
1429
       
1490
        
1430
1491
1431
   Note that the P%N link in the above paragraph makes the title a preview
1492
   Note that the P%N link in the above paragraph makes the title a preview
1432
   link.
1493
   link.
1494
1495
   These samples, and some others, are on the web site, with pictures to show
1496
   how they look.
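
   The substitution rules for the paragraph format can be illustrated with a
   small Python sketch. This is not Recoll's implementation: the
   expand_format function and the sample values are made up for the
   illustration, and expanding unknown markers to nothing is an assumption.
   Only the %X and %(fieldname) syntax comes from the description above.

```python
import re

def expand_format(fmt, values, fields):
    """Expand %X and %(fieldname) markers in a paragraph format string.

    values maps single letters (for example 'T' or 'U') to strings,
    fields maps stored field names to strings. Unknown markers expand
    to an empty string (an assumption of this sketch).
    """
    def repl(match):
        name, letter = match.group(1), match.group(2)
        if name is not None:
            return fields.get(name, "")
        return values.get(letter, "")
    # Either "%(" name ")" or "%" followed by one character.
    return re.sub(r"%(?:\(([^)]*)\)|(.))", repl, fmt)

# Hypothetical values for one result.
values = {"N": "3", "T": "Pride and Prejudice", "U": "file:///home/me/pp.txt"}
fields = {"author": "Jane Austen"}
fmt = '<a href="P%N">%T</a> %(author) <i>%U</i>'
print(expand_format(fmt, values, fields))
```

   With these sample values, the title becomes a preview link (P3) and the
   author field value is inserted between the title and the URL.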
1433
1497
1434
   It is also possible to define the value of the snippet separator inside
1498
   It is also possible to define the value of the snippet separator inside
1435
   the abstract section.
1499
   the abstract section.
1436
1500
1437
     ----------------------------------------------------------------------
1501
     ----------------------------------------------------------------------
...
...
1482
         window.location.href = 'recoll://search/query?qtp=a&p=0&q=' +
1546
         window.location.href = 'recoll://search/query?qtp=a&p=0&q=' +
1483
             encodeURIComponent(t);
1547
             encodeURIComponent(t);
1484
     }
1548
     }
1485
 </script>
1549
 </script>
1486
  ....
1550
  ....
1487
 <body ondblclick="recollsearch()">

1551
 <body ondblclick="recollsearch()">
1488
1552
1489
     ----------------------------------------------------------------------
1553
     ----------------------------------------------------------------------
1490
1554
1491
3.3. Searching on the command line
1555
3.3. Searching on the command line
1492
1556
...
...
1544
   The query language processor is activated in the GUI simple search entry
1608
   The query language processor is activated in the GUI simple search entry
1545
   when the search mode selector is set to Query Language. It can also be
1609
   when the search mode selector is set to Query Language. It can also be
1546
   used with the KIO slave or the command line search. It broadly has the
1610
   used with the KIO slave or the command line search. It broadly has the
1547
   same capabilities as the complex search interface in the GUI.
1611
   same capabilities as the complex search interface in the GUI.
1548
1612
1549
   The language is roughly based on the Xesam user search language
1613
   The language is roughly based on the (seemingly defunct) Xesam user search
1550
   specification.
1614
   language specification.
1551
1615
1552
   If the results of a query language search puzzle you and you doubt what
1616
   If the results of a query language search puzzle you and you doubt what
1553
   has been actually searched for, you can use the GUI show query link at the
1617
   has been actually searched for, you can use the GUI show query link at the
1554
   top of the result list to check the exact query which was finally executed
1618
   top of the result list to check the exact query which was finally executed
1555
   by Xapian.
1619
   by Xapian.
1556
1620
1557
   Here follows a sample request that we are going to explain:
1621
   Here follows a sample request that we are going to explain:
1558
1622
1559
           author:"john doe" Beatles OR Lennon Live OR Unplugged -potatoes
1623
           author:"john doe" Beatles OR Lennon Live OR Unplugged -potatoes
1560
     
1624
      
1561
1625
1562
   This would search for all documents with John Doe appearing as a phrase in
1626
   This would search for all documents with John Doe appearing as a phrase in
1563
   the author field (exactly what this is would depend on the document type,
1627
   the author field (exactly what this is would depend on the document type,
1564
   ie: the From: header, for an email message), and containing either beatles
1628
   ie: the From: header, for an email message), and containing either beatles
1565
   or lennon and either live or unplugged but not potatoes (in any part of
1629
   or lennon and either live or unplugged but not potatoes (in any part of
...
...
1583
1647
1584
   As usual, words inside quotes define a phrase (the order of words is
1648
   As usual, words inside quotes define a phrase (the order of words is
1585
   significant), so that title:"prejudice pride" is not the same as
1649
   significant), so that title:"prejudice pride" is not the same as
1586
   title:prejudice title:pride, and is unlikely to find a result.
1650
   title:prejudice title:pride, and is unlikely to find a result.
1587
1651
1588
   Most Xesam phrase modifiers are unsupported, except for l (small ell) to
1652
   Modifiers can be set on a phrase clause, for example to specify a
1589
   disable stemming, and p to turn a phrase into a NEAR (unordered proximity)
1653
   proximity search (unordered). See the modifier section.
1590
   search. Example: "prejudice pride"p
1591
1654
1592
   Recoll currently manages the following default fields:
1655
   Recoll currently manages the following default fields:
1593
1656
1594
     * title, subject or caption are synonyms which specify data to be
1657
     * title, subject or caption are synonyms which specify data to be
1595
       searched for in the document title or subject.
1658
       searched for in the document title or subject.
...
...
1607
1670
1608
   The field syntax also supports a few field-like, but special, criteria:
1671
   The field syntax also supports a few field-like, but special, criteria:
1609
1672
1610
     * dir for filtering the results on file location (Ex:
1673
     * dir for filtering the results on file location (Ex:
1611
       dir:/home/me/somedir). -dir also works to find results out of the
1674
       dir:/home/me/somedir). -dir also works to find results out of the
1612
       specified directory, only after release 1.15.8.
1675
       specified directory, only after release 1.15.8. A tilde inside the
1676
       value will be expanded to the home directory. dir is not a regular
1677
       field and only one value makes sense in a query (you can't use
1678
       dir:dir1 OR dir:dir2). Relative paths make sense, for example,
1679
       dir:share/doc would match either /usr/share/doc or
1680
       /usr/local/share/doc
1681
1682
     * size for filtering the results on file size. Example: size<10000. You
1683
       can use <, > or = as operators. You can specify a range like the
1684
       following: size>100 size<1000. The usual k/K, m/M, g/G, t/T can be
1685
       used as (decimal) multipliers. Ex: size>1k to search for files bigger
1686
       than 1000 bytes.
1613
1687
1614
     * date for searching or filtering on dates. The syntax for the argument
1688
     * date for searching or filtering on dates. The syntax for the argument
1615
       is based on the ISO8601 standard for dates and time intervals. Only
1689
       is based on the ISO8601 standard for dates and time intervals. Only
1616
       dates are supported, no times. The general syntax is 2 elements
1690
       dates are supported, no times. The general syntax is 2 elements
1617
       separated by a / character. Each element can be a date or a period of
1691
       separated by a / character. Each element can be a date or a period of
...
...
1826
       documents per file (ie: for zip or chm files). They communicate with
1900
       documents per file (ie: for zip or chm files). They communicate with
1827
       the indexer through a simple protocol, but are nevertheless a bit more
1901
       the indexer through a simple protocol, but are nevertheless a bit more
1828
       complicated than the older kind. Most of these new filters are written
1902
       complicated than the older kind. Most of these new filters are written
1829
       in Python, using a common module to handle the protocol.
1903
       in Python, using a common module to handle the protocol.
1830
1904
1831
   The following will just describe the simple filters, if you are programmer
1905
   The following will just describe the simple filters. If you can program
1832
   enough to write one of the other kind, it shouldn't be too difficult to
1906
   and want to write one of the other kind, it shouldn't be too difficult to
1833
   make sense of one of the existing modules (ie: rclzip).
1907
   make sense of one of the existing modules. For example, look at rclzip
1908
   which uses Zip file paths as internal identifiers (ipath), and rclinfo,
1909
   which uses an integer index.
1910
1911
     ----------------------------------------------------------------------
1912
1913
  4.1.1. Simple filters
1834
1914
1835
   Recoll simple filters are usually shell-scripts, but this is in no way
1915
   Recoll simple filters are usually shell-scripts, but this is in no way
1836
   necessary. These programs are extremely simple and most of the difficulty
1916
   necessary. Extracting the text from the native format is the difficult
1837
   lies in extracting the text from the native format, not outputting what is
1917
   part. Outputting the format expected by Recoll is trivial. Happily enough,
1838
   expected by Recoll. Happily enough, most document formats already have
1918
   most document formats have translators or text extractors which can be
1839
   translators or text extractors which handle the difficult part and can be
1840
   called from the filter. In some case the output of the translating program
1919
   called from the filter. In some cases the output of the translating
1841
   is appropriate, and no intermediate shell-script is needed.
1920
   program is completely appropriate, and no intermediate shell-script is
1921
   needed.
1842
1922
1843
   Filters are called with a single argument which is the source file name.
1923
   Filters are called with a single argument which is the source file name.
1844
   They should output the result to stdout.
1924
   They should output the result to stdout.
1845
1925
1926
   When writing a filter, you should decide if it will output plain text or
1927
   html. Plain text is simpler, but you will not be able to add metadata or
1928
   vary the output character encoding (this will be defined in a
1929
   configuration file). Additionally, some formatting may be easier to preserve
1930
   when previewing html. Actually the deciding factor is metadata: Recoll has
1931
   a way to extract metadata from the html header and use it for field
1932
   searches.
1933
1846
   The RECOLL_FILTER_FORPREVIEW environment variable (values yes, no) tells
1934
   The RECOLL_FILTER_FORPREVIEW environment variable (values yes, no) tells
1847
   the filter if the operation is for indexing or previewing. Some filters
1935
   the filter if the operation is for indexing or previewing. Some filters
1848
   use this to output a slightly different format. This is not essential.
1936
   use this to output a slightly different format, for example stripping
1937
   uninteresting repeated keywords (ie: Subject: for email) when indexing.
1938
   This is not essential.
1939
1940
   You should look at one of the simple filters, for example rclps, for a
1941
   starting point.
1942
1943
   Don't forget to make your filter executable before testing!
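
   As an illustration of the calling convention described in this section, here
   is a minimal sketch of a simple filter written in Python instead of
   shell. It is not one of the filters shipped with Recoll. The to_html and
   run names are hypothetical, the input is assumed to be UTF-8 text, and
   the use of <pre> when previewing is only one possible way of using
   RECOLL_FILTER_FORPREVIEW.

```python
#!/usr/bin/env python3
import html
import os
import sys

def to_html(text, for_preview=False):
    """Wrap plain text in a minimal HTML document.

    "&", "<" and ">" are turned into entities. Keeping the line layout
    with <pre> only when previewing is a choice made for this sketch.
    """
    body = html.escape(text, quote=False)
    if for_preview:
        body = "<pre>" + body + "</pre>"
    return (
        "<html><head>\n"
        '<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">\n'
        "</head>\n"
        "<body>" + body + "</body></html>\n"
    )

def run(path):
    """Read the source file (assumed UTF-8 text) and write HTML to stdout."""
    with open(path, encoding="utf-8", errors="replace") as f:
        text = f.read()
    for_preview = os.environ.get("RECOLL_FILTER_FORPREVIEW") == "yes"
    sys.stdout.buffer.write(to_html(text, for_preview).encode("utf-8"))
    return 0

# The filter is called with a single argument: the source file name.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(run(sys.argv[1]))
```

   A real filter would replace the plain read in run with a call to a
   translator or text extractor for the native format.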
1944
1945
     ----------------------------------------------------------------------
1946
1947
  4.1.2. Telling Recoll about the filter
1948
1949
   There are two elements that link a file to the filter which should process
1950
   it: the association of file to mime type and the association of a mime
1951
   type with a filter.
1952
1953
   The association of files to mime types is mostly based on name suffixes.
1954
   The types are defined inside the mimemap file. Example:
1955
1956
1957
 .doc = application/msword
1958
1959
   If no suffix association is found for the file name, Recoll will try to
1960
   execute the file -i command to determine a mime type.
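
   The two-step lookup can be sketched as follows in Python. This only
   illustrates the logic and is not Recoll code: the MIMEMAP table and the
   function names are hypothetical, and the parsing assumes the usual
   "name: type; charset=..." output of file -i.

```python
import os
import subprocess

# Hypothetical excerpt of a mimemap-like table: suffix -> MIME type.
MIMEMAP = {".doc": "application/msword", ".txt": "text/plain"}

def mime_from_file_command(path):
    """Ask the system 'file -i' command; return None on failure."""
    try:
        out = subprocess.run(["file", "-i", path], capture_output=True,
                             text=True, check=True).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    # Keep what follows the last ':' and drop the '; charset=...' part.
    value = out.rsplit(":", 1)[-1].strip()
    return value.split(";", 1)[0].strip() or None

def guess_mime(path, mimemap=MIMEMAP, fallback=mime_from_file_command):
    """Try the suffix table first, then fall back to the file command."""
    suffix = os.path.splitext(path)[1]
    if suffix in mimemap:
        return mimemap[suffix]
    return fallback(path)

print(guess_mime("report.doc"))
```

   The fallback is passed as a parameter so that the suffix lookup can be
   tried out without a file command being installed.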
1849
1961
1850
   The association of file types to filters is performed in the mimeconf
1962
   The association of file types to filters is performed in the mimeconf
1851
   file. A sample:
1963
   file. A sample will probably be of better help than a long explanation:
1852
1964
1965
1853
 
 [index]
1966
 [index]
1854
 application/msword = exec antiword -t -i 1 -m UTF-8;\
1967
 application/msword = exec antiword -t -i 1 -m UTF-8;\
1855
      mimetype = text/plain ; charset=utf-8
1968
      mimetype = text/plain ; charset=utf-8
1856
1969
1857
 application/ogg = exec rclogg
1970
 application/ogg = exec rclogg
1858
1971
...
...
1874
       and not output by unrtf in the HTML header section.
1987
       and not output by unrtf in the HTML header section.
1875
1988
1876
     * application/x-chm is processed by a persistent filter. This is
1989
     * application/x-chm is processed by a persistent filter. This is
1877
       determined by the execm keyword.
1990
       determined by the execm keyword.
1878
1991
1879
   The easiest way to write a new filter is probably to start from an
1880
   existing one.
1881
1882
   Filters which output text/plain text are generally simpler, but they
1883
   cannot specify the character set and other metadata, so they are limited
1884
   to cases where these elements are not needed.
1885
1886
     ----------------------------------------------------------------------
1992
     ----------------------------------------------------------------------
1887
1993
1888
  4.1.1. Filter HTML output
1994
  4.1.3. Filter HTML output
1889
1995
1890
   The output HTML could be very minimal like the following example:
1996
   The output HTML could be very minimal like the following example:
1891
1997
1892
 <html><head>
1998
 <html><head>
1893
 <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
1999
 <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
1894
 </head>
2000
 </head>
1895
 <body>some text content</body></html>
2001
 <body>some text content</body></html>
1896
         
2002
          
1897
2003
1898
   You should take care to escape some characters inside the text by
2004
   You should take care to escape some characters inside the text by
1899
   transforming them into appropriate entities. "&" should be transformed
2005
   transforming them into appropriate entities. "&" should be transformed
1900
   into "&amp;", "<" should be transformed into "&lt;". This is not always
2006
   into "&amp;", "<" should be transformed into "&lt;". This is not always
1901
   properly done by translating programs which output HTML, and of course
2007
   properly done by translating programs which output HTML, and of course
...
...
2208
           confdir specifies a Recoll configuration directory
2314
           confdir specifies a Recoll configuration directory
2209
           (the default is built like for any Recoll program).
2315
           (the default is built like for any Recoll program).
2210
           extra_dbs is a list of external databases (xapian directories)
2316
           extra_dbs is a list of external databases (xapian directories)
2211
           writable decides if we can index new data through this connection
2317
           writable decides if we can index new data through this connection
2212
2318
2213
   
2214
2215
     ----------------------------------------------------------------------
2319
     ----------------------------------------------------------------------
2216
2320
2217
    4.3.2.3. Example code
2321
    4.3.2.3. Example code
2218
2322
2219
   The following sample would query the index with a user language string.
2323
   The following sample would query the index with a user language string.
...
...
2239
         print k, ":", getattr(doc, k).encode('utf-8')
2343
         print k, ":", getattr(doc, k).encode('utf-8')
2240
     abs = db.makeDocAbstract(doc, query).encode('utf-8')
2344
     abs = db.makeDocAbstract(doc, query).encode('utf-8')
2241
     print abs
2345
     print abs
2242
     print
2346
     print
2243
2347
2244
 
2348
2245
2349
2246
     ----------------------------------------------------------------------
2350
     ----------------------------------------------------------------------
2247
2351
2248
                   Chapter 5. Installation and configuration
2352
                   Chapter 5. Installation and configuration
2249
2353
...
...
2470
2574
2471
     * --with-file-command Specify the version of the 'file' command to use
2575
     * --with-file-command Specify the version of the 'file' command to use
2472
       (ie: --with-file-command=/usr/local/bin/file). Can be useful to enable
2576
       (ie: --with-file-command=/usr/local/bin/file). Can be useful to enable
2473
       the gnu version on systems where the native one is bad.
2577
       the gnu version on systems where the native one is bad.
2474
2578
2475
     * --without-gui Disable the Qt interface, and auxiliary uses of X11, and
2579
     * --disable-qtgui Disable the Qt interface. Will allow building the
2476
       compile the command line version.
2580
       indexer and the command line search program in absence of a Qt
2581
       environment.
2582
2583
     * --disable-x11mon Disable X11 connection monitoring inside recollindex.
2584
       Together with --disable-qtgui, this allows building recoll without Qt
2585
       and X11.
2477
2586
2478
     * Of course the usual autoconf configure options, like --prefix, apply.
2587
     * Of course the usual autoconf configure options, like --prefix, apply.
2479
2588
2480
   Normal procedure:
2589
   Normal procedure:
2481
2590
2482
         cd recoll-xxx
2591
         cd recoll-xxx
2483
         configure
2592
         configure
2484
         make
2593
         make
2485
         (practices usual hardship-repelling invocations)
2594
         (practices usual hardship-repelling invocations)
2486
     
2595
      
2487
2596
2488
   There is little auto-configuration. The configure script will mainly link
2597
   There is little auto-configuration. The configure script will mainly link
2489
   one of the system-specific files in the mk directory to mk/sysconf. If
2598
   one of the system-specific files in the mk directory to mk/sysconf. If
2490
   your system is not known yet, it will tell you as much, and you may want
2599
   your system is not known yet, it will tell you as much, and you may want
2491
   to manually copy and modify one of the existing files (the new file name
2600
   to manually copy and modify one of the existing files (the new file name
...
...
2511
     ----------------------------------------------------------------------
2620
     ----------------------------------------------------------------------
2512
2621
2513
5.4. Configuration overview
2622
5.4. Configuration overview
2514
2623
2515
   Most of the parameters specific to the recoll GUI are set through the
2624
   Most of the parameters specific to the recoll GUI are set through the
2516
   Preferences menu and stored in the standard Qt place ($HOME/.qt/recollrc).
2625
   Preferences menu and stored in the standard Qt place
2517
   You probably do not want to edit this by hand.
2626
   ($HOME/.config/Recoll.org/recoll.conf). You probably do not want to edit
2627
   this by hand.
2518
2628
2519
   Recoll indexing options are set inside text configuration files located in
2629
   Recoll indexing options are set inside text configuration files located in
2520
   a configuration directory. There can be several such directories, each of
2630
   a configuration directory. There can be several such directories, each of
2521
   which defines the parameters for one index.
2631
   which defines the parameters for one index.
2522
2632
...
...
2556
         # Space-separated list of directories to index.
2666
         # Space-separated list of directories to index.
2557
         topdirs =  ~/docs /usr/share/doc
2667
         topdirs =  ~/docs /usr/share/doc
2558
2668
2559
         [~/somedirectory-with-utf8-txt-files]
2669
         [~/somedirectory-with-utf8-txt-files]
2560
         defaultcharset = utf-8
2670
         defaultcharset = utf-8
2561
       
2671
        
2562
2672
2563
   There are three kinds of lines:
2673
   There are three kinds of lines:
2564
2674
2565
     * Comment (starts with #) or empty.
2675
     * Comment (starts with #) or empty.
2566
2676
...
...
2615
           A space-separated list of patterns for names of files or
2725
           A space-separated list of patterns for names of files or
2616
           directories that should be completely ignored. The list defined in
2726
           directories that should be completely ignored. The list defined in
2617
           the default file is:
2727
           the default file is:
2618
2728
2619
 skippedNames = #* bin CVS  Cache cache* caughtspam  tmp .thumbnails .svn \
2729
 skippedNames = #* bin CVS  Cache cache* caughtspam  tmp .thumbnails .svn \
2620
            *~ .beagle .git .hg .bzr loop.ps .xsession-errors \
2730
                *~ .beagle .git .hg .bzr loop.ps .xsession-errors \
2621
            .recoll* xapiandb recollrc recoll.conf
2731
                .recoll* xapiandb recollrc recoll.conf
2622
2732
2623
           The list can be redefined at any sub-directory in the indexed
2733
           The list can be redefined at any sub-directory in the indexed
2624
           area.
2734
           area.
2625
2735
2626
           The top-level directories are not affected by this list (that is,
2736
           The top-level directories are not affected by this list (that is,
...
...
2650
           indexed at startup, but not monitored.
2760
           indexed at startup, but not monitored.
2651
2761
2652
           Example of use for skipping text files only in a specific
2762
           Example of use for skipping text files only in a specific
2653
           directory:
2763
           directory:
2654
2764
2655
 skippedPaths = ~/somedir/*.txt
2765
 skippedPaths = ~/somedir/*.txt
2656
             
2766
              
2767
2768
   skippedPathsFnmPathname
2769
2770
           The values in the *skippedPaths variables are matched by default
2771
           with fnmatch(3), with the FNM_PATHNAME and FNM_LEADING_DIR flags.
2772
           This means that '/' characters must be matched explicitly. You
2773
           can set skippedPathsFnmPathname to 0 to disable the use of
2774
           FNM_PATHNAME (meaning that /*/dir3 will match /dir1/dir2/dir3).
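
           The effect of FNM_PATHNAME can be approximated in Python for
           experimenting with patterns. This is only a sketch: Python's
           fnmatch module is not fnmatch(3) and has no such flags, so the
           component-by-component comparison below, and both function
           names, are assumptions made for the illustration.

```python
import fnmatch

def match_with_pathname(pattern, path):
    """Approximate fnmatch(3) with FNM_PATHNAME and FNM_LEADING_DIR.

    Each '/'-separated component is matched separately, so a wildcard
    never matches a '/'. A pattern which matches a leading part of the
    path (followed by '/') also counts as a match.
    """
    pat_parts = pattern.split("/")
    path_parts = path.split("/")
    if len(path_parts) < len(pat_parts):
        return False
    return all(fnmatch.fnmatchcase(part, pat)
               for pat, part in zip(pat_parts, path_parts))

def match_without_pathname(pattern, path):
    """Plain wildcard matching: '*' may also match '/' characters.

    Leading directory matching is left out here for simplicity.
    """
    return fnmatch.fnmatchcase(path, pattern)

print(match_with_pathname("/*/dir3", "/dir1/dir2/dir3"))     # False
print(match_without_pathname("/*/dir3", "/dir1/dir2/dir3"))  # True
```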
2657
2775
2658
   followLinks
2776
   followLinks
2659
2777
2660
           Specifies if the indexer should follow symbolic links while
2778
           Specifies if the indexer should follow symbolic links while
2661
           walking the file tree. The default is to ignore symbolic links to
2779
           walking the file tree. The default is to ignore symbolic links to
...
...
2799
           needed when the index is initialized. If this is not an absolute
2917
           needed when the index is initialized. If this is not an absolute
2800
           path, it will be interpreted relative to the configuration
2918
           path, it will be interpreted relative to the configuration
2801
           directory. The value can have embedded spaces but starting or
2919
           directory. The value can have embedded spaces but starting or
2802
           trailing spaces will be trimmed. You cannot use quotes here.
2920
           trailing spaces will be trimmed. You cannot use quotes here.
2803
2921
2922
   idxstatusfile
2923
2924
           The name of the scratch file where the indexer process updates its
2925
           status. Default: idxstatus.txt inside the configuration directory.
2926
2804
   maxfsoccuppc
2927
   maxfsoccuppc
2805
2928
2806
           Maximum file system occupation before we stop indexing. The value
2929
           Maximum file system occupation before we stop indexing. The value
2807
           is a percentage, corresponding to what the "Capacity" df output
2930
           is a percentage, corresponding to what the "Capacity" df output
2808
           column shows. The default value is 0, meaning no checking.
2931
           column shows. The default value is 0, meaning no checking.
...
...
2864
           space-separated list, each entry being a pattern and a time in
2987
           space-separated list, each entry being a pattern and a time in
2865
           seconds, separated by a colon. You can use double quotes if a path
2988
           seconds, separated by a colon. You can use double quotes if a path
2866
           entry contains white space. Example:
2989
           entry contains white space. Example:
2867
2990
2868
 mondelaypatterns = *.log:20 "this one has spaces*:10"
2991
 mondelaypatterns = *.log:20 "this one has spaces*:10"
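
           As an illustration of this syntax, the following Python sketch
           splits such a value into (pattern, seconds) pairs. It is not
           Recoll code: the function name is hypothetical, and the use of
           shlex for the double quotes is an assumption which may differ
           from Recoll's own parsing in corner cases.

```python
import shlex

def parse_mondelaypatterns(value):
    """Split a mondelaypatterns value into (pattern, seconds) pairs."""
    pairs = []
    for entry in shlex.split(value):
        # The delay follows the last colon of each entry.
        pattern, _, seconds = entry.rpartition(":")
        pairs.append((pattern, int(seconds)))
    return pairs

print(parse_mondelaypatterns('*.log:20 "this one has spaces*:10"'))
# [('*.log', 20), ('this one has spaces*', 10)]
```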
2869
             
2992
              
2870
2993
2871
   monixinterval
2994
   monixinterval
2872
2995
2873
           Minimum interval (seconds) for processing the indexing queue. The
2996
           Minimum interval (seconds) for processing the indexing queue. The
2874
           real time monitor does not process each event when it comes in,
2997
           real time monitor does not process each event when it comes in,
...
...
3105
3228
3106
 .blob = application/x-blobapp
3229
 .blob = application/x-blobapp
3107
3230
3108
       Note that the mime type is made up here, and you could call it
3231
       Note that the mime type is made up here, and you could call it
3109
       diesel/oil just the same.
3232
       diesel/oil just the same.
3110
3111
     * In $RECOLL_CONFDIR/mimeview under the [view] section, add:
3233
     * In $RECOLL_CONFDIR/mimeview under the [view] section, add:
3112
3234
3113
 application/x-blobapp = blobviewer %f
3235
 application/x-blobapp = blobviewer %f
3114
3236
3115
       We are supposing that blobviewer wants a file name parameter here, you
3237
       We are supposing that blobviewer wants a file name parameter here, you