a/src/README b/src/README
...
...
161
                            Chapter 1. Introduction
161
                            Chapter 1. Introduction
162
162
163
1.1. Giving it a try
163
1.1. Giving it a try
164
164
165
   If you do not like reading manuals (who does?) and would like to give
165
   If you do not like reading manuals (who does?) and would like to give
166
   Recoll a try, just perform installation and start the recoll user
166
   Recoll a try, just install the application and start the recoll graphical
167
   interface, which will index your home directory by default, allowing you
167
   user interface (GUI), which will ask to index your home directory by
168
   to search immediately after indexing completes.
168
   default, allowing you to search immediately after indexing completes.
169
169
170
   Do not do this if your home directory contains a huge number of documents
170
   Do not do this if your home directory contains a huge number of documents
171
   and you do not want to wait or are very short on disk space. In this case,
171
   and you do not want to wait or are very short on disk space. In this case,
172
   you may first want to customize the configuration to restrict the indexed
172
   you may first want to customize the configuration to restrict the indexed
173
   area.
173
   area.
...
...
265
   default configuration will index your home directory with default
265
   default configuration will index your home directory with default
266
   parameters and should be sufficient for giving Recoll a try, but you may
266
   parameters and should be sufficient for giving Recoll a try, but you may
267
   want to adjust it later, which can be done either by editing the text
267
   want to adjust it later, which can be done either by editing the text
268
   files or by using configuration menus in the recoll GUI
268
   files or by using configuration menus in the recoll GUI
269
269
270
   Indexing is started automatically the first time you execute the recoll
270
   The indexing process is started automatically the first time you execute
271
   search graphical user interface, or by executing the recollindex command.
271
   the recoll GUI. Indexing can also be performed by executing the
272
   recollindex command.
272
273
273
   Searches are usually performed inside the recoll graphical user interface
274
   Searches are usually performed inside the recoll GUI, which has many
274
   (GUI) program, which has many options to help you find what you are
275
   options to help you find what you are looking for. However, there are
275
   looking for. However, there are other ways to perform Recoll searches:
276
   other ways to perform Recoll searches: mostly a command line interface, a
276
   mostly a command line tool, a Python programming interface, and a KDE KIO
277
   Python programming interface, a KDE KIO slave module, and a Ubuntu Unity
277
   slave module.
278
   Lens module.
278
279
279
     ----------------------------------------------------------------------
280
     ----------------------------------------------------------------------
280
281
281
                              Chapter 2. Indexing
282
                              Chapter 2. Indexing
282
283
...
...
309
   Recoll knows about quite a few different document types. The parameters
310
   Recoll knows about quite a few different document types. The parameters
310
   for document types recognition and processing are set in configuration
311
   for document types recognition and processing are set in configuration
311
   files.
312
   files.
312
313
313
   Most file types, like HTML or word processing files, only hold one
314
   Most file types, like HTML or word processing files, only hold one
314
   document. Some file types, like mail folder files or zip archives, can
315
   document. Some file types, like email folders or zip archives, can hold
315
   hold many individually indexed documents, which may in turn be themselves
316
   many individually indexed documents, which may in turn be themselves
316
   compound ones. Such hierarchies can go quite deep, and Recoll has no
317
   compound ones. Such hierarchies can go quite deep, and Recoll can process,
317
   problem processing, for example, an ms-word document which would be an
318
   for example, an ms-word document stored as an attachment to an email
318
   attachment to an email message part of a folder file archived inside a zip
319
   message inside an email folder archived in a zip file...
319
   file...
320
320
321
   Recoll indexing processes plain text, HTML, openoffice and e-mail files,
321
   Recoll indexing processes plain text, HTML, OpenDocument
322
   and a few others internally.
322
   (Open/LibreOffice), email formats, and a few others internally.
323
323
324
   Other file types (ie: postscript, pdf, ms-word, rtf ...) need external
324
   Other file types (ie: postscript, pdf, ms-word, rtf ...) need external
325
   applications for preprocessing. The list is in the installation section.
325
   applications for preprocessing. The list is in the installation section.
326
   After every indexing operation, Recoll updates a list of commands that
326
   After every indexing operation, Recoll updates a list of commands that
327
   would be needed for indexing existing file types. This list can be
327
   would be needed for indexing existing file types. This list can be
328
   displayed from the recoll File menu. It is stored in the missing text file
328
   displayed by selecting the menu option File->Show Missing Helpers in the
329
   inside the configuration directory.
329
   recoll GUI. It is stored in the missing text file inside the configuration
330
   directory.
330
331
331
   Without further configuration, Recoll will index all appropriate files
332
   Without further configuration, Recoll will index all appropriate files
332
   from your home directory, with a reasonable set of defaults.
333
   from your home directory, with a reasonable set of defaults.
333
334
334
   In some cases, it may be interesting to index different areas of the file
335
   In some cases, it may be interesting to index different areas of the file
...
...
385
   the documents. It may also be much smaller if the documents contain a lot
386
   the documents. It may also be much smaller if the documents contain a lot
386
   of images or other non-indexed data (an extreme example being a set of mp3
387
   of images or other non-indexed data (an extreme example being a set of mp3
387
   files where only the tags would be indexed).
388
   files where only the tags would be indexed).
388
389
389
   Of course, images, sound and video do not increase the index size, which
390
   Of course, images, sound and video do not increase the index size, which
390
   means that it will be quite typical nowadays (2006), that even a big index
391
   means that nowadays (2012), typically, even a big index will be negligible
391
   will be negligible against the total amount of data on the computer.
392
   against the total amount of data on the computer.
392
393
393
   The index data directory (xapiandb) only contains data that can be
394
   The index data directory (xapiandb) only contains data that can be
394
   completely rebuilt by an index run (as long as the original documents
395
   completely rebuilt by an index run (as long as the original documents
395
   exist), and it can always be destroyed safely.
396
   exist), and it can always be destroyed safely.
396
397
...
...
466
467
467
   Most parameters for a given indexing configuration can be set from a
468
   Most parameters for a given indexing configuration can be set from a
468
   recoll GUI running on this configuration (either as default, or by setting
469
   recoll GUI running on this configuration (either as default, or by setting
469
   RECOLL_CONFDIR or the -c option.)
470
   RECOLL_CONFDIR or the -c option.)
470
471
471
   The interface is started from the Preferences menu. It has two main
472
   The interface is started from the Preferences->Indexing Configuration menu
473
   entry. It is divided in three tabs, Global parameters, Local parameters,
474
   and Beagle web history, which is explained in the next section.
475
472
   panels. The first panel allows setting global variables, like the list of
476
   The first tab allows setting global variables, like the lists of top
473
   top directories or the list of skipped paths. The second panel allows
477
   directories, skipped paths, or stemming languages.
474
   setting variables that can be redefined for subdirectories. This second
478
475
   panel has an initially empty list of customisation directories, to which
479
   The second tab allows setting variables that can be redefined for
476
   you can add. The variables are then set for the currently selected
480
   subdirectories. This second tab has an initially empty list of
477
   directory (or at the top level if the empty line is selected).
481
   customisation directories, to which you can add. The variables are then
482
   set for the currently selected directory (or at the top level if the empty
483
   line is selected).
478
484
479
   The meaning for most entries in the interface is self-evident and
485
   The meaning for most entries in the interface is self-evident and
480
   documented by a ToolTip popup on the text label. For more detail, you will
486
   documented by a ToolTip popup on the text label. For more detail, you will
481
   need to refer to the configuration section of this guide.
487
   need to refer to the configuration section of this guide.
482
488
...
...
527
533
528
   If the recoll program finds no index when it starts, it will automatically
534
   If the recoll program finds no index when it starts, it will automatically
529
   start indexing (except if canceled).
535
   start indexing (except if canceled).
530
536
531
   The recollindex indexing process can be interrupted by sending an
537
   The recollindex indexing process can be interrupted by sending an
532
   interrupt (^C, SIGINT) or terminate (SIGTERM) signal. Some time may elapse
538
   interrupt (Ctrl-C, SIGINT) or terminate (SIGTERM) signal. Some time may
533
   before the process exits, because it needs to properly flush and close the
539
   elapse before the process exits, because it needs to properly flush and
534
   index. The indexing thread can be equivalently stopped from the menu.
540
   close the index. This can also be done from the recoll GUI File->Stop
541
   Indexing menu entry.
535
542
536
   After such an interruption, the index will be somewhat inconsistent
543
   After such an interruption, the index will be somewhat inconsistent
537
   because some operations which are normally performed at the end of the
544
   because some operations which are normally performed at the end of the
538
   indexing pass will have been skipped (for exemple, the stemming and
545
   indexing pass will have been skipped (for example, the stemming and
539
   spelling databases will be nonexistent or out of date). You just need to
546
   spelling databases will be nonexistent or out of date). You just need to
540
   restart indexing at a later time to restore consistency. The indexing will
547
   restart indexing at a later time to restore consistency. The indexing will
541
   restart at the interruption point (the full file tree will be traversed,
548
   restart at the interruption point (the full file tree will be traversed,
542
   but files that were indexed up to the interruption and are still up to
549
   but files that were indexed up to the interruption and are still up to
543
   date will not need to be reindexed).
550
   date will not need to be reindexed).
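
The interruption and restart behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Recoll code: the Indexer class, its attributes, and the use of a modification time to decide whether a file is "still up to date" are all hypothetical. The signal handler only sets a flag, the loop checks it between files, and the flush step always runs before returning.

```python
import signal


class Indexer:
    """Toy indexer (hypothetical, not Recoll's implementation)."""

    def __init__(self):
        self.stop_requested = False
        self.index = {}        # path -> modification time seen at indexing
        self.flushed = False

    def request_stop(self, signum=None, frame=None):
        # A signal handler should do as little as possible: set a flag.
        self.stop_requested = True

    def run(self, files):
        """Index `files` (mapping path -> mtime); return the paths processed."""
        processed = []
        for path, mtime in files.items():
            if self.stop_requested:
                break                      # interrupted: stop between files
            if self.index.get(path) == mtime:
                continue                   # already indexed and up to date
            self.index[path] = mtime
            processed.append(path)
        # Flush and close whether the pass completed or was interrupted.
        self.flushed = True
        return processed


def install_handlers(indexer):
    # Route both SIGINT (Ctrl-C) and SIGTERM to the same stop request.
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, indexer.request_stop)
```

A second call to run() with the same files walks the whole mapping again but only processes entries whose modification time changed, which mirrors the "full tree traversed, up to date files skipped" behaviour.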
...
...
675
       toolbox bar icon) has multiple entry fields, which you may use to
682
       toolbox bar icon) has multiple entry fields, which you may use to
676
       build a logical condition, with additional filtering on file type and
683
       build a logical condition, with additional filtering on file type and
677
       location in the file system.
684
       location in the file system.
678
685
679
   In most cases, you can enter the terms as you think them, even if they
686
   In most cases, you can enter the terms as you think them, even if they
680
   contain embedded punctuation or other non-textual characters. For exemple,
687
   contain embedded punctuation or other non-textual characters. For example,
681
   Recoll can handle things like e-mail addresses, or arbitrary cut and paste
688
   Recoll can handle things like email addresses, or arbitrary cut and paste
682
   from another text window, punctuation and all.
689
   from another text window, punctuation and all.
683
690
684
   The main case where you should enter text differently from how it is
691
   The main case where you should enter text differently from how it is
685
   printed is for east-asian languages (Chinese, Japanese, Korean). Words
692
   printed is for east-asian languages (Chinese, Japanese, Korean). Words
686
   composed of single or multiple characters should be entered separated by
693
   composed of single or multiple characters should be entered separated by
...
...
861
   This entry is mainly useful for email attachments and permits viewing the
868
   This entry is mainly useful for email attachments and permits viewing the
862
   message to which the document is attached. Note that the entry will also
869
   message to which the document is attached. Note that the entry will also
863
   appear for an email which is part of an mbox folder file, but that you
870
   appear for an email which is part of an mbox folder file, but that you
864
   can't actually visualize the folder (there will be an error dialog if you
871
   can't actually visualize the folder (there will be an error dialog if you
865
   try). Recoll is unfortunately not yet smart enough to disable the entry in
872
   try). Recoll is unfortunately not yet smart enough to disable the entry in
866
   this case. In other cases, the Open option makes sense, for exemple to
873
   this case. In other cases, the Open option makes sense, for example to
867
   start a chm viewer on the parent document for a help page.
874
   start a chm viewer on the parent document for a help page.
868
875
869
     ----------------------------------------------------------------------
876
     ----------------------------------------------------------------------
870
877
871
  3.1.3. The result table
878
  3.1.3. The result table
...
...
905
   will open a new window for side by side viewing).
912
   will open a new window for side by side viewing).
906
913
907
   Starting another search and requesting a preview will create a new preview
914
   Starting another search and requesting a preview will create a new preview
908
   window. The old one stays open until you close it.
915
   window. The old one stays open until you close it.
909
916
910
   You can close a preview tab by typing ^W (Ctrl + W) in the window. Closing
917
   You can close a preview tab by typing Ctrl-W (Ctrl + W) in the window.
911
   the last tab for a window will also close the window.
918
   Closing the last tab for a window will also close the window.
912
919
913
   Of course you can also close a preview window by using the window manager
920
   Of course you can also close a preview window by using the window manager
914
   button in the top of the frame.
921
   button in the top of the frame.
915
922
916
   You can display successive or previous documents from the result list
923
   You can display successive or previous documents from the result list
...
...
922
   area or by clicking into the Search for: text field and entering the
929
   area or by clicking into the Search for: text field and entering the
923
   search string. You can then use the Next and Previous buttons to find the
930
   search string. You can then use the Next and Previous buttons to find the
924
   next/previous occurrence. You can also type F3 inside the text area to get
931
   next/previous occurrence. You can also type F3 inside the text area to get
925
   to the next occurrence.
932
   to the next occurrence.
926
933
927
   If you have a search string entered and you use ^Up/^Down to browse the
934
   If you have a search string entered and you use Ctrl-Up/Ctrl-Down to
928
   results, the search is initiated for each successive document. If the
935
   browse the results, the search is initiated for each successive document.
929
   string is found, the cursor will be positioned at the first occurrence of
936
   If the string is found, the cursor will be positioned at the first
930
   the search string.
937
   occurrence of the search string.
931
938
932
   A right-click menu in the text area allows switching between displaying
939
   A right-click menu in the text area allows switching between displaying
933
   the main text or the contents of fields associated to the document (ie:
940
   the main text or the contents of fields associated to the document (ie:
934
   author, abstract, etc.). This is especially useful in cases where the term
941
   author, abstract, etc.). This is especially useful in cases where the term
935
   match did not occur in the main text but in one of the fields.
942
   match did not occur in the main text but in one of the fields.
936
943
937
   You can print the current preview window contents by typing ^P (Ctrl + P)
944
   You can print the current preview window contents by typing Ctrl-P (Ctrl +
938
   in the window text.
945
   P) in the window text.
939
946
940
     ----------------------------------------------------------------------
947
     ----------------------------------------------------------------------
941
948
942
  3.1.5. Complex/advanced search
949
  3.1.5. Complex/advanced search
943
950
...
...
1279
1286
1280
   Forced opening of a preview window. You can use Shift+Click on a result
1287
   Forced opening of a preview window. You can use Shift+Click on a result
1281
   list Preview link to force the creation of a preview window instead of a
1288
   list Preview link to force the creation of a preview window instead of a
1282
   new tab in the existing one.
1289
   new tab in the existing one.
1283
1290
1284
   Closing previews. Entering ^W in a tab will close it (and, for the last
1291
   Closing previews. Entering Ctrl-W in a tab will close it (and, for the
1285
   tab, close the preview window). Entering Esc will close the preview window
1292
   last tab, close the preview window). Entering Esc will close the preview
1286
   and all its tabs.
1293
   window and all its tabs.
1287
1294
1288
   Printing previews. Entering ^P in a preview window will print the
1295
   Printing previews. Entering Ctrl-P in a preview window will print the
1289
   currently displayed text.
1296
   currently displayed text.
1290
1297
1291
   Quitting. Entering ^Q almost anywhere will close the application.
1298
   Quitting. Entering Ctrl-Q almost anywhere will close the application.
1292
1299
1293
     ----------------------------------------------------------------------
1300
     ----------------------------------------------------------------------
1294
1301
1295
  3.1.11. Customizing the search interface
1302
  3.1.11. Customizing the search interface
1296
1303
...
...
1310
1317
1311
     * Style sheet: The name of a Qt style sheet text file which is applied
1318
     * Style sheet: The name of a Qt style sheet text file which is applied
1312
       to the whole Recoll application on startup. The default value is
1319
       to the whole Recoll application on startup. The default value is
1313
       empty, but there is a skeleton style sheet (recoll.qss) inside the
1320
       empty, but there is a skeleton style sheet (recoll.qss) inside the
1314
       /usr/share/recoll/examples directory. Using a style sheet, you can
1321
       /usr/share/recoll/examples directory. Using a style sheet, you can
1315
       change most Recoll graphical parameters: colors, fonts, etc. See the
1322
       change most recoll graphical parameters: colors, fonts, etc. See the
1316
       sample file for a few simple examples.
1323
       sample file for a few simple examples.
1317
1324
1318
     * Maximum text size highlighted for preview Inserting highlights on
1325
     * Maximum text size highlighted for preview Inserting highlights on
1319
       search term inside the text before inserting it in the preview window
1326
       search term inside the text before inserting it in the preview window
1320
       involves quite a lot of processing, and can be disabled over the given
1327
       involves quite a lot of processing, and can be disabled over the given
...
...
1465
   displayed.
1472
   displayed.
1466
1473
1467
   No more detail will be given about the header part (only useful with the
1474
   No more detail will be given about the header part (only useful with the
1468
   WebKit build), if there are restrictions to what you can do, they are
1475
   WebKit build), if there are restrictions to what you can do, they are
1469
   beyond this author's HTML/CSS/Javascript abilities... There are a few
1476
   beyond this author's HTML/CSS/Javascript abilities... There are a few
1470
   exemples on the page about customising the result list on the Recoll web
1477
   examples on the page about customising the result list on the Recoll web
1471
   site.
1478
   site.
1472
1479
1473
     ----------------------------------------------------------------------
1480
     ----------------------------------------------------------------------
1474
1481
1475
      3.1.11.1.1. The paragraph format
1482
      3.1.11.1.1. The paragraph format
...
...
1700
   ie: the From: header, for an email message), and containing either beatles
1707
   ie: the From: header, for an email message), and containing either beatles
1701
   or lennon and either live or unplugged but not potatoes (in any part of
1708
   or lennon and either live or unplugged but not potatoes (in any part of
1702
   the document).
1709
   the document).
1703
1710
1704
   An element is composed of an optional field specification, and a value,
1711
   An element is composed of an optional field specification, and a value,
1705
   separated by a colon. Exemple: Beatles, author:balzac, dc:title:grandet
1712
   separated by a colon. Example: Beatles, author:balzac, dc:title:grandet
1706
1713
1707
   The colon, if present, means "contains". Xesam defines other relations,
1714
   The colon, if present, means "contains". Xesam defines other relations,
1708
   which are not supported for now.
1715
   which are not supported for now.
1709
1716
1710
   All elements in the search entry are normally combined with an implicit
1717
   All elements in the search entry are normally combined with an implicit
...
...
1719
1726
1720
   As usual, words inside quotes define a phrase (the order of words is
1727
   As usual, words inside quotes define a phrase (the order of words is
1721
   significant), so that title:"prejudice pride" is not the same as
1728
   significant), so that title:"prejudice pride" is not the same as
1722
   title:prejudice title:pride, and is unlikely to find a result.
1729
   title:prejudice title:pride, and is unlikely to find a result.
1723
1730
1724
   Modifiers can be set on a phrase clause, for exemple to specify a
1731
   Modifiers can be set on a phrase clause, for example to specify a
1725
   proximity search (unordered). See the modifier section.
1732
   proximity search (unordered). See the modifier section.
1726
1733
1727
   Recoll currently manages the following default fields:
1734
   Recoll currently manages the following default fields:
1728
1735
1729
     * title, subject or caption are synonyms which specify data to be
1736
     * title, subject or caption are synonyms which specify data to be
...
...
1749
       field and only one value makes sense in a query (you can't use
1756
       field and only one value makes sense in a query (you can't use
1750
       dir:dir1 OR dir:dir2). Relative paths make sense, for example,
1757
       dir:dir1 OR dir:dir2). Relative paths make sense, for example,
1751
       dir:share/doc would match either /usr/share/doc or
1758
       dir:share/doc would match either /usr/share/doc or
1752
       /usr/local/share/doc
1759
       /usr/local/share/doc
1753
1760
1754
     * size for filtering the results on file size. Exemple: size<10000. You
1761
     * size for filtering the results on file size. Example: size<10000. You
1755
       can use <, > or = as operators. You can specify a range like the
1762
       can use <, > or = as operators. You can specify a range like the
1756
       following: size>100 size<1000. The usual k/K, m/M, g/G, t/T can be
1763
       following: size>100 size<1000. The usual k/K, m/M, g/G, t/T can be
1757
       used as (decimal) multipliers. Ex: size>1k to search for files bigger
1764
       used as (decimal) multipliers. Ex: size>1k to search for files bigger
1758
       than 1000 bytes.
1765
       than 1000 bytes.
1759
1766
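
As a rough illustration of the size clause semantics described above, here is a small Python sketch. It is not Recoll's parser: the function names and the regular expression are assumptions made for the example, and it assumes that several clauses in one query must all hold (as in the range example). The suffixes are treated as decimal multipliers, so 1k is 1000.

```python
import operator
import re

# Decimal multipliers, as stated in the text (k = 1000, not 1024).
_MULTIPLIERS = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9, "t": 10**12}
_OPERATORS = {"<": operator.lt, ">": operator.gt, "=": operator.eq}
_CLAUSE = re.compile(r"size([<>=])(\d+)([kmgt]?)$", re.IGNORECASE)


def parse_size_clause(clause):
    """Turn 'size>1k' into (comparison function, limit in bytes)."""
    match = _CLAUSE.match(clause)
    if match is None:
        raise ValueError("not a size clause: %r" % clause)
    op, digits, suffix = match.groups()
    return _OPERATORS[op], int(digits) * _MULTIPLIERS[suffix.lower()]


def size_matches(size, query):
    """True if `size` (in bytes) satisfies every size clause in `query`."""
    clauses = [parse_size_clause(c) for c in query.split()]
    return all(compare(size, limit) for compare, limit in clauses)
```

For example, size_matches(500, "size>100 size<1000") is true, while a file of exactly 1000 bytes does not satisfy size>1k.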
...
...
1764
       time. Periods are specified as PnYnMnD. The n numbers are the
1771
       time. Periods are specified as PnYnMnD. The n numbers are the
1765
       respective numbers of years, months or days, any of which may be
1772
       respective numbers of years, months or days, any of which may be
1766
       missing. Dates are specified as YYYY-MM-DD. The days and months parts
1773
       missing. Dates are specified as YYYY-MM-DD. The days and months parts
1767
       may be missing. If the / is present but an element is missing, the
1774
       may be missing. If the / is present but an element is missing, the
1768
       missing element is interpreted as the lowest or highest date in the
1775
       missing element is interpreted as the lowest or highest date in the
1769
       index. Exemples:
1776
       index. Examples:
1770
1777
1771
          * 2001-03-01/2002-05-01 the basic syntax for an interval of dates.
1778
          * 2001-03-01/2002-05-01 the basic syntax for an interval of dates.
1772
1779
1773
          * 2001-03-01/P1Y2M the same specified with a period.
1780
          * 2001-03-01/P1Y2M the same specified with a period.
1774
1781
...
...
2007
   the filter if the operation is for indexing or previewing. Some filters
2014
   the filter if the operation is for indexing or previewing. Some filters
2008
   use this to output a slightly different format, for example stripping
2015
   use this to output a slightly different format, for example stripping
2009
   uninteresting repeated keywords (ie: Subject: for email) when indexing.
2016
   uninteresting repeated keywords (ie: Subject: for email) when indexing.
2010
   This is not essential.
2017
   This is not essential.
2011
2018
2012
   You should look to one of the simple filters, for exemple rclps for a
2019
   You should look to one of the simple filters, for example rclps for a
2013
   starting point.
2020
   starting point.
2014
2021
2015
   Don't forget to make your filter executable before testing !
2022
   Don't forget to make your filter executable before testing !
2016
2023
2017
     ----------------------------------------------------------------------
2024
     ----------------------------------------------------------------------
...
...
2435
   In all cases, the strict software dependencies (ie on Xapian or iconv)
2442
   In all cases, the strict software dependencies (ie on Xapian or iconv)
2436
   will be automatically satisfied, you should not have to worry about them.
2443
   will be automatically satisfied, you should not have to worry about them.
2437
2444
2438
   You will only have to check or install supporting applications for the
2445
   You will only have to check or install supporting applications for the
2439
   file types that you want to index beyond those that are natively processed
2446
   file types that you want to index beyond those that are natively processed
2440
   by Recoll (text, HTML, mail files, and a few others).
2447
   by Recoll (text, HTML, email files, and a few others).
2441
2448
2442
   You should also maybe have a look at the configuration section (but this
2449
   You should also maybe have a look at the configuration section (but this
2443
   may not be necessary for a quick test with default parameters). Most
2450
   may not be necessary for a quick test with default parameters). Most
2444
   parameters can be more conveniently set from the GUI interface.
2451
   parameters can be more conveniently set from the GUI interface.
2445
2452
...
...
2557
2564
2558
     * Midi karaoke files need Python and the Midi module
2565
     * Midi karaoke files need Python and the Midi module
2559
2566
2560
     * Konqueror webarchive format with Python (uses the Tarfile module).
2567
     * Konqueror webarchive format with Python (uses the Tarfile module).
2561
2568
2562
     * mimehtml web archive format (support based on the mail filter, which
2569
     * mimehtml web archive format (support based on the email filter, which
2563
       introduces some mild weirdness, but still usable).
2570
       introduces some mild weirdness, but still usable).
2564
2571
2565
   Text, HTML, mail folders, and Scribus files are processed internally. Lyx
2572
   Text, HTML, email folders, and Scribus files are processed internally. Lyx
2566
   is used to index Lyx files. Many filters need iconv and the standard sed
2573
   is used to index Lyx files. Many filters need iconv and the standard sed
2567
   and awk.
2574
   and awk.
2568
2575
2569
     ----------------------------------------------------------------------
2576
     ----------------------------------------------------------------------
2570
2577
...
...
2764
   expanded to the name of the user's home directory, as a shell would do.
2771
   expanded to the name of the user's home directory, as a shell would do.
2765
2772
2766
   White space is used for separation inside lists. List elements with
2773
   White space is used for separation inside lists. List elements with
2767
   embedded spaces can be quoted using double-quotes.
2774
   embedded spaces can be quoted using double-quotes.
2768
2775
2776
   Encoding issues. Most of the configuration parameters are plain ASCII. Two
2777
   particular sets of values may cause encoding issues:
2778
2779
     * File path parameters may contain non-ascii characters and should use
2780
       the exact same byte values as found in the file system directory.
2781
       Usually, this means that the configuration file should use the system
2782
       default locale encoding.
2783
2784
     * The unac_except_trans parameter should be encoded in UTF-8. If your
2785
       system locale is not UTF-8, and you need to also specify non-ascii
2786
       file paths, this poses a difficulty because common text editors cannot
2787
       handle multiple encodings in a single file. In this relatively
2788
       unlikely case, you can edit the configuration file as two separate
2789
       text files with appropriate encodings, and concatenate them to create
2790
       the complete configuration.
2791
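
The "two separate text files, concatenated" workaround described above could be scripted along these lines. This is a hedged sketch: the function name, the ISO-8859-1 locale, the sample path and the unac_except_trans value are all placeholders chosen for illustration, not values taken from a real configuration.

```python
def build_config(path_part, path_encoding, unac_part):
    """Return the bytes of a configuration made of two parts.

    `path_part` holds the file path parameters and is encoded with the
    file system's encoding; `unac_part` holds unac_except_trans and is
    always encoded as UTF-8.
    """
    return path_part.encode(path_encoding) + unac_part.encode("utf-8")


# Placeholder content: a path with a non-ASCII character, and an
# illustrative unac_except_trans value.
paths = "topdirs = ~/Documents/caf\u00e9\n"
unac = "unac_except_trans = \u00e5\u00e5\n"
config_bytes = build_config(paths, "iso8859-1", unac)
```

The result would then be written with a file opened in binary mode ("wb"), so that no further encoding conversion is applied to either part.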
2769
     ----------------------------------------------------------------------
2792
     ----------------------------------------------------------------------
2770
2793
2771
  5.4.1. Main configuration file
2794
  5.4.1. Main configuration file
2772
2795
2773
   recoll.conf is the main configuration file. It defines things like what to
2796
   recoll.conf is the main configuration file. It defines things like what to
...
...
2811
           a directory in topdirs might match and would still be indexed).
2834
           a directory in topdirs might match and would still be indexed).
2812
2835
2813
           The list in the default configuration does not exclude hidden
2836
           The list in the default configuration does not exclude hidden
2814
           directories (names beginning with a dot), which means that it may
2837
           directories (names beginning with a dot), which means that it may
2815
           index quite a few things that you do not want. On the other hand,
2838
           index quite a few things that you do not want. On the other hand,
2816
           mail user agents like thunderbird usually store messages in hidden
2839
           email user agents like thunderbird usually store messages in
2817
           directories, and you probably want this indexed. One possible
2840
           hidden directories, and you probably want this indexed. One
2818
           solution is to have .* in skippedNames, and add things like
2841
           possible solution is to have .* in skippedNames, and add things
2819
           ~/.thunderbird or ~/.evolution in topdirs.
2842
           like ~/.thunderbird or ~/.evolution in topdirs.
2820
2843
2821
           Not even the file names are indexed for patterns in this list. See
2844
           Not even the file names are indexed for patterns in this list. See
2822
           the recoll_noindex variable in mimemap for an alternative approach
2845
           the recoll_noindex variable in mimemap for an alternative approach
2823
           which indexes the file names.
2846
           which indexes the file names.
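
   The interplay between skippedNames and topdirs described above can be
   illustrated with a small sketch. It assumes shell-style wildcard
   matching on each file or directory name met below a topdirs entry, and
   it is not the indexer's actual code: is_skipped and would_index are
   hypothetical helpers, and /home/me stands in for a home directory.

```python
import fnmatch
from pathlib import PurePosixPath


def is_skipped(name, skipped_names):
    """True if a single file or directory name matches one of the patterns."""
    return any(fnmatch.fnmatchcase(name, pat) for pat in skipped_names)


def would_index(path, topdirs, skipped_names):
    """True if path is reachable from some topdirs entry without crossing
    a skipped name. The topdirs entry itself is never tested."""
    p = PurePosixPath(path)
    for top in topdirs:
        try:
            rel = p.relative_to(PurePosixPath(top))
        except ValueError:
            continue  # path is not below this topdirs entry
        if not any(is_skipped(part, skipped_names) for part in rel.parts):
            return True
    return False


topdirs = ["/home/me", "/home/me/.thunderbird"]
skipped = [".*"]
print(would_index("/home/me/.cache/x", topdirs, skipped))
print(would_index("/home/me/.thunderbird/profile/Inbox", topdirs, skipped))
```

   In this sketch the .cache tree is excluded by the .* pattern, while the
   mail folder is still reached through the explicit .thunderbird entry.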
2824
2847
...
...
2963
           character set definition (i.e. plain text files). This can be
2986
           character set definition (i.e. plain text files). This can be
2964
           redefined for any sub-directory. If it is not set at all, the
2987
           redefined for any sub-directory. If it is not set at all, the
2965
           character set used is the one defined by the nls environment
2988
           character set used is the one defined by the nls environment
2966
           (LC_ALL, LC_CTYPE, LANG), or iso8859-1 if nothing is set.
2989
           (LC_ALL, LC_CTYPE, LANG), or iso8859-1 if nothing is set.
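
   The fallback order above can be pictured with the following sketch. It
   is an illustration only: default_charset is a hypothetical helper, and
   it assumes that the first variable which is set decides, and that a
   locale value without a codeset part leads to the iso8859-1 default.

```python
def default_charset(env):
    """Pick a character set name from the NLS variables, else iso8859-1."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var, "")
        if value:
            # Locale names look like language_TERRITORY.codeset@modifier.
            _, dot, rest = value.partition(".")
            if dot:
                return rest.split("@", 1)[0]
            break  # assumption: no codeset in the deciding variable
    return "iso8859-1"


print(default_charset({"LANG": "sv_SE.UTF-8"}))
print(default_charset({}))
```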
2967
2990
2991
   unac_except_trans
2992
2993
           This is a list of characters, encoded in UTF-8, which should be
2994
           handled specially when converting text to unaccented lowercase.
2995
           For example, in Swedish, the letter a with diaeresis has full
2996
           alphabet citizenship and should not be turned into an a. Each
2997
           element in the space-separated list has the special character as
2998
           first element and the translation following. The handling of both
2999
           the lowercase and uppercase versions of a character should be
3000
           specified, as membership in the list will turn off both standard
3001
           accent and case processing. Example for Swedish:
3002
3003
 unac_except_trans = åå Åå ää Ää öö Öö
3004
            
3005
3006
           Note that the translation is not limited to a single character,
3007
           you could very well have something like üue in the list.
3008
3009
           This parameter can't be defined for subdirectories. It is global,
3010
           because there is no way to do otherwise when querying. If you have
3011
           document sets which would need different values, you will have to
3012
           index and query them separately.
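
   The effect of such a list can be sketched as follows. This is not the
   indexer's implementation: parse_exceptions and unaccent_lower are
   hypothetical helpers, and accent stripping is approximated with
   Unicode decomposition from the Python standard library.

```python
import unicodedata


def parse_exceptions(value):
    """Map each special character (first in an element) to its translation."""
    value = unicodedata.normalize("NFC", value)
    return {elem[0]: elem[1:] for elem in value.split()}


def unaccent_lower(text, exceptions):
    """Unaccent and lowercase text, except for characters in exceptions."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in exceptions:
            # Listed characters bypass both accent and case processing.
            out.append(exceptions[ch])
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
            out.append(stripped.lower())
    return "".join(out)


exceptions = parse_exceptions("åå Åå ää Ää öö Öö")
print(unaccent_lower("Älg på Café", exceptions))  # prints "älg på cafe"
```

   Because listed characters skip case processing, the uppercase forms
   must appear in the list with their lowercase translation, as in the
   Swedish example.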
3013
2968
   maildefcharset
3014
   maildefcharset
2969
3015
2970
           This can be used to define the default character set specifically
3016
           This can be used to define the default character set specifically
2971
           for mail messages which don't specify it. This is mainly useful
3017
           for email messages which don't specify it. This is mainly useful
2972
           for readpst (libpst) dumps, which are utf-8 but do not say so.
3018
           for readpst (libpst) dumps, which are utf-8 but do not say so.
2973
3019
2974
   localfields
3020
   localfields
2975
3021
2976
           This allows setting fields for all documents under a given
3022
           This allows setting fields for all documents under a given
...
...
3158
           used inside the [prefixes] and [stored] sections
3204
           used inside the [prefixes] and [stored] sections
3159
3205
3160
   filter-specific sections
3206
   filter-specific sections
3161
3207
3162
           Some filters may need specific configuration for handling fields.
3208
           Some filters may need specific configuration for handling fields.
3163
           Only the mail message filter currently has such a section (named
3209
           Only the email message filter currently has such a section (named
3164
           [mail]). It allows indexing arbitrary mail headers in addition to
3210
           [mail]). It allows indexing arbitrary email headers in addition to
3165
           the ones indexed by default. Other such sections may appear in the
3211
           the ones indexed by default. Other such sections may appear in the
3166
           future.
3212
           future.
3167
3213
3168
   Here follows a small example of a personal fields file. This would extract
3214
   Here follows a small example of a personal fields file. This would extract
3169
   a specific mail header and use it as a searchable field, with data
3215
   a specific email header and use it as a searchable field, with data
3170
   displayable inside result lists. (Side note: as the mail filter does no
3216
   displayable inside result lists. (Side note: as the email filter does no
3171
   decoding on the values, only plain ascii headers can be indexed, and only
3217
   decoding on the values, only plain ascii headers can be indexed, and only
3172
   the first occurrence will be used for headers that occur several times).
3218
   the first occurrence will be used for headers that occur several times).
3173
3219
3174
 [prefixes]
3220
 [prefixes]
3175
 # Index mailmytag contents (with the given prefix)
3221
 # Index mailmytag contents (with the given prefix)