Release notes for Recoll 1.16.x

Caveats

Installing over an older version: 1.16 is mostly compatible with 1.15 indexes, except for a few differences in the handling of unusual terms containing punctuation characters. Perform a full index pass if installing over an older version. The simplest way to do this is to quit all recoll programs and delete the index directory (rm -rf ~/.recoll/xapiandb), then start recoll or recollindex. recollindex -z will do the same in most cases.

Also, using the anchored search feature requires a full reindex.

1.16.2: this is a bug fix release (see the fixed bugs document), with a few limited changes:

The indexer now puts itself in the ionice "idle" class by default (can be changed in the config).
The verbosity level of some messages was adjusted so that a simple sequence of indexed files can now be seen while indexing with the verbosity at level 3 (info).
New command line options for the recollq program add a fully parseable base64-encoded output mode, with full control over the list of fields printed for each result, for use by external programs.
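
As an illustration of how an external program might consume such output, here is a minimal Python sketch. It assumes (this is not stated in these notes) that each result is printed on one line as one base64 token per requested field, separated by single spaces and in the order requested. The function name and the field names are hypothetical; check the recollq documentation for the actual options and layout.

```python
import base64


def decode_result_line(line, field_names):
    """Decode one result line into a {field_name: text} dict.

    Assumed layout: one base64 token per requested field, separated
    by single spaces, in the order the fields were requested.
    """
    tokens = line.rstrip("\n").split(" ")
    # Tolerate a trailing separator after the last value.
    if len(tokens) == len(field_names) + 1 and tokens[-1] == "":
        tokens.pop()
    if len(tokens) != len(field_names):
        raise ValueError(
            "expected %d fields, got %d" % (len(field_names), len(tokens))
        )
    return {
        name: base64.b64decode(token).decode("utf-8", errors="replace")
        for name, token in zip(field_names, tokens)
    }
```

Base64 encoding means the values cannot contain spaces or newlines, which is what makes a plain split on spaces safe under the assumed layout.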

The 1.16.0 GUI crashes quite easily; please upgrade to 1.16.1 or later.

Changes

Recoll 1.16 is an incremental improvement release over 1.15; no major function was introduced or modified.

Images are displayed in preview. You can get at the fields and complete extracted text using the popup menu.
The preview window popup menu has a "save to file" entry to write a subdocument (e.g. a mail attachment) to a file.
The GUI advanced search panel allows specifying a field for each entry (e.g. author, recipient).
It is now possible to anchor searches to the beginning or end of the text or of a field, by using the ^ and $ characters at the beginning or the end of a term or phrase. A maximum distance can be specified as a phrase slack, either in the advanced search panel or as a query language modifier, e.g. "^beginterm"o10 searches for beginterm within 10 terms of the beginning of the text. This feature was suggested to me (thanks Gökhan) for searching for a name at the beginning of a text (in the author list, as opposed to anywhere in the text). This is useful, for example, in the very common case where no author metadata was created. More details about this feature can be found in the user manual.
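
For programs that generate queries, the anchored syntax can be produced with a small helper. The following Python sketch is only a string builder based on the example above; the function name and parameters are hypothetical, and applying the o<N> slack modifier to end-anchored phrases is an assumption extrapolated from the start-anchored example.

```python
def anchored(text, where="start", slack=None):
    """Build an anchored query-language phrase.

    where: "start" prefixes the phrase with ^, "end" suffixes it with $.
    slack: optional maximum distance, appended as the o<N> modifier.
    """
    if '"' in text:
        raise ValueError("text must not contain double quotes")
    if where == "start":
        body = "^" + text
    elif where == "end":
        body = text + "$"
    else:
        raise ValueError("where must be 'start' or 'end'")
    clause = '"' + body + '"'
    if slack is not None:
        if slack < 0:
            raise ValueError("slack must not be negative")
        clause += "o" + str(int(slack))
    return clause
```

With this helper, anchored("beginterm", slack=10) yields the example given above.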
It is possible to configure the result list snippet separator, given as an HTML fragment. This is an ellipsis by default (…).
We can now perform negative directory filtering (-dir:/some/dir), to return all results except those from the specified directory (recursive). Other attempts at negative searches that are still impossible (e.g. -mime:) now cause explicit error messages instead of misleading results. The inverted directory filtering is accessible from the query language and by checking a checkbox in the advanced search panel.
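
The recursive meaning of the negative directory filter can be illustrated outside of Recoll. The Python sketch below is not Recoll code: it applies the same rule to a plain list of paths, dropping every path located anywhere under the excluded directory. The function name is hypothetical.

```python
from pathlib import PurePosixPath


def exclude_dir(paths, excluded):
    """Return the paths that are NOT inside `excluded` (recursively)."""
    excluded = PurePosixPath(excluded)

    def is_under(path):
        path = PurePosixPath(path)
        # True for the directory itself and for anything below it.
        return path == excluded or excluded in path.parents

    return [p for p in paths if not is_under(p)]
```

Note that the comparison is made on path components, so /some/dirx is not treated as being inside /some/dir.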
Result table:
- The detail area now has a popup menu similar to the one in the result list (open parent, save to disk etc.).
- The result table header popup menu has an entry to save the table as a CSV file.
- Estimated result counts are displayed in the status line.
- Row height is now set according to the default font size, and row height and vertical text position in cells are better adjusted.
It is now possible to set an increased weight for indexing some fields. The title field gets a boost by default. See the fields default file for details.
The query language allows setting weights on terms, e.g. "important"2.5.
Improved preservation of indentation for text files displayed in the preview window.
Show hidden (dot) files in the indexing configuration GUI dialogs.
Added filters for .war (Konqueror web archive), .mhtm (other web archive format) and rar archives.
Improved handling of native CJK punctuation signs.
Updated the list of native apps in the default mimeview (e.g. xv->gwenview, rox->dolphin).
Added -f option to recollindex to ignore skippedPaths/Names when used with -i. Allows the use of a purely external file selection mechanism.
The performance of email indexing has been slightly improved (less CPU usage).
Real time indexer: several configuration parameters allow adjusting the timing of indexing actions:
- monauxinterval: the interval between rebuilds of the auxiliary databases (stemdb, aspell).
- monixinterval: the waiting period during which indexing events are accumulated prior to actual indexing (saves work on duplicate events).
- mondelaypatterns: a list of file patterns for which indexing should be delayed longer (fast-changing files such as logs, which should be reindexed much less often than they change).
See the default configuration file for more detail.
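
To show why accumulating events before indexing saves work, here is a minimal Python sketch of the idea behind monixinterval. It is not the real time indexer's implementation: the function name, the event representation as (timestamp, path) pairs and the fixed-window policy are all assumptions made for illustration.

```python
def coalesce(events, interval):
    """Group (timestamp, path) events into indexing batches.

    A batch opens at the first pending event and closes `interval`
    seconds later. Each path appears once per batch, however many
    events it produced inside the window.
    """
    batches = []
    pending = []
    window_end = None
    for timestamp, path in sorted(events):
        if window_end is not None and timestamp >= window_end:
            # The waiting period is over: hand the batch to the indexer.
            batches.append(pending)
            pending = []
            window_end = None
        if window_end is None:
            window_end = timestamp + interval
        if path not in pending:
            pending.append(path)
    if pending:
        batches.append(pending)
    return batches
```

In this sketch, a file modified several times within one waiting period is indexed once instead of once per modification.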
Fixed bugs:
- UTF-8 paths inside ZIP archives were mishandled. The fix also solves a problem with colons inside archive member paths.
- Fixed GUI result list doc parent operations (open/preview) which were broken in 1.15.
- Fixed a case where indexing could hang or crash after an error occurred while indexing an archive member (which should have affected only the relevant document).
- Real time indexer: uncontrolled concurrent access to the global configuration could cause a startup crash (mostly with big file trees, because of timing issues).
- Fixed sorting by document and file size in the result table.
- Email messages with an attachment that caused an indexing error were not indexed at all.
- Text files bigger than 2 GB could not be indexed.
- Fixed the handling of compressed man pages.
- Memory usage could grow almost unbounded while deleting documents, because idxflushmb was not used for document deletions.