--- a/website/BUGS.txt
+++ b/website/BUGS.txt
@@ -1,234 +1,339 @@
-Known bugs in current and older versions:
-
-Bugs that are listed in an older version section are supposedly fixed in
-later versions. Bugs listed in the topmost section may also exist in older
-versions.
-
-Latest (recoll 1.10.6 + xapian 1.0.x):
-
-- When Recoll is built with qt 4.4.0, the icons in the result list are all
- displayed at the top of the page and garbled. This appears to be a qt
- bug, fixed in 4.4.1. Use either qt 4.3.x or 4.4.1
-
-- If the locale is not utf-8, non-ascii command line arguments to recoll
- and recollq are not converted to utf-8, which may prevent, for example,
- the kde applet from working. The workaround is to apply the following
- one-line fix to qtgui/main.cpp, recompile and install recoll:
-386c386
-< sSearch->setSearchString(QString::fromUtf8(qstring.c_str()));
----
-> sSearch->setSearchString(QString::fromLocal8Bit(qstring.c_str()));
-
-
-- If the user-chosen result list entry format results in several paragraphs
- (in the qt textedit sense), right clicks will only work inside the first
- one for each entry.
-
-- When a mime type has an external viewer defined, but the actual file is
- compressed (ie: xxx.txt.gz), recoll will try to start the external viewer
- on the compressed file, which will not work in most cases.
-
-- NEAR crashes: 1.6 has added NEAR searches. Unlike what recoll did
- with PHRASES, stemming expansion is performed on terms inside NEAR
- clauses (except if prevented by a capitalized entry of course). There is
- a bug in Xapian (all versions as far as I know), where NEAR does not support
- multiple OR subclauses, as would result from a multiple expansion. This
- manifests itself by a 'not implemented' Xapian exception. Workarounds:
-
- - Prevent expansion of NEAR terms (possibly except one) by
- capitalizing them.
-
- - Or apply the following patch to xapian, inside the
- "api/" directory:
- http://www.recoll.org/xapian/xapNearDistrib-1.0.patch
- or fetch the already patched source:
- http://www.recoll.org/xapian/xapian-core-1.0.7-recollNEARpatch.tar.gz
-
- then recompile, and install.
-
- I hope that an equivalent fix will make it into xapian at some point (the
- current fix is not completely correct but still handles most useful cases).
-
-- If you are seeing a delay of a few seconds before the result list
- displays for the first query of a recoll instance, try changing the
- result list font in the query preferences. This is not a recoll problem,
- I don't know the exact cause (I've seen it happen with "Sans Serif" and
- go away with Helvetica or Arial).
-
-- Under some versions of KDE (ie: Fedora FC5 KDE 3.5.4-0.5.fc5), there is a
- problem with the window stacking order. Opening the "browse" file
- selection dialog from the advanced search dialog will stack the latter
- under the main window, possibly making it invisible. This is quite
- probably a Kwin bug, possibly related to
- http://bugs.kde.org/show_bug.cgi?id=79183 or a correction thereof.
-
-- Under Solaris, it is necessary to perform initial indexing with the
- recollindex program (the recoll index thread doesn't work for creating
- the database). Don't know the reason. Only idea I have is problem with
- exception handling (recoll catches an exception while trying the
- yet inexistant db).
-
-1.10.1 + xapian 1.0.x
-- A relatively simple error case can cause the indexer to stop processing
- an mbox file (forgetting all subsequent messages). More specifically,
- this happens when encountering more than than a few dozen errors while
- handling attachments. This is relatively common: for exemple if an
- external helper application is missing and multiple attachments of the
- affected type are found (ie: multiple images and no
- exiftool). Workaround: install the helper application.
-- The decoding of base-64 data in emails fails in a relatively uncommon
- but sometimes encountered case.
-- In a preview window, when walking the search term hits with the
- Previous/Next buttons, 'Previous' actually acts as 'Next' (it does work
- normally for the local search).
-- Problems in detecting message separators inside Thunderbird mailboxes
- (quite probably mainly for messages imported from outlook?). Can lead to
- unindexed messages, and even apparently indexer crashes in some cases.
-- File names indexed as terms can sometimes overflow the maximum term
- size, halting the indexing.
-- For Phrase/Near searches, only the first term group is highlighted in
- preview.
-
-
-1.10.0
-
-- If a filter fails while trying to extract the data from a file, the file
- will not be indexed at all (not even the file name). The file
- name should be indexed in this case. This happens in particular in the
- very common case where the helper application is not installed (ie:
- missing Exiftool -> no *.jpg names in the index).
-
-- If several query language "ext:" qualifiers are specified, they will be
- joined by an AND instead of OR, resulting in no results. Using an
- explicit OR doesn't work (actually OR + field names is generally
- broken). In some cases, you can use a "type:" qualifier as a workaround.
-
-
-1.9.x
-- Problems have been reported indexing big mailstores (several hundreds of
- thousands of messages): resulting in a very big database and even
- crashes.
-
-
-1.8.2
-- Under ubuntu (at least, maybe debian too), the default awk interpreter
- (mawk) is ancient, and the recoll pdf input filter does not
- work (removes all space characters). This can be solved by installing the
- gawk package.
- $ apt-get install gawk
- $ update-alternatives --set awk /usr/bin/gawk
-
-- There are sometimes problems with document deletions: the index can
- get in a state where deleted or moved documents are not purged from the
- index (the log file says that the doc are deleted, but they aren't
- actually). When this happens, the only solution currently is to reindex
- from scratch (recollindex -z). This is due to a xapian bug, which is
- fixed in xapian 1.0.2, or you can apply the following patch to xapian
- 1.0.1 to fix it:
- http://www.lesbonscomptes.com/recoll/xapian/xapian-delete-document.patch
-
-- The dates shown for email attachments in a result list are the email
- folder modification date. This should be inherited from the parent
- message instead.
-
-- There are a few problems in the qt4 version of recoll:
- - Some accelerators (esc-spc, ctl-arrow) do not work, neither do
- copy/paste between the result list and preview windows and x11
- applications.
- - The qt4 q3textedit::find() method is extremely slow, so that
- positionning to first search term in Recoll preview has been disabled,
- and the application will sometimes appear to be looping when using the
- find feature in the preview window (it's not looping, it's searching...)
-
-1.8.1
-- This is not really a bug but .beagle really should be included in
- "skippedNames", or you end up indexing the beagle text cache, which is
- not really desirable.
-- Doc bug: the manual states that the query language supports a "mime:"
- switch to filter mime types. There is currently no such thing.
-
-***************************************************************************
-1.7.5
-- Debian and Ubuntu: the rclsoff Openoffice filter doesn't work,
- because of an incorrect shell syntax (understood by bash but not sh). To
- fix, you edit /usr[/local]/share/recoll/filters/rclsoff and can change
- the line:
-trap cleanup EXIT SIGHUP SIGQUIT SIGINT SIGTERM
- into:
-trap cleanup EXIT HUP QUIT INT TERM
- or download the updated filter from the filters page:
- http://www.recoll.org/filters/filters.html
-
-1.7.3
-- Processing will stop on first error while indexing an mbox file. This
- could happen just because an attachment could not be decoded, and can
- cause non-indexing of many messages. The most probable cause of error is
- a missing filter (ie for ms-word files), so the temporary workaround
- would be to install the missing filters. This bug is specific to 1.7 and
- 1.6 users need not worry. A correction will be issued very soon.
-- Messages of type multipart/signed are not indexed.
-
-1.6.2
- - Relatively unfrequent issue with message boundary detection in mbox
- files, could cause miscellaneous problems.
- - Executing an external viewer for a file with single-quotes in the name
- would not work.
-
-1.5.10
-- If a defaultcharset was set in the configuration file for a subdirectory,
- it would stay in effect for all subsequent files/directories (except if
- explicitely overridden), potentially causing many transcoding errors.
-
-1.5.[1-7]
-- Dates in result list come from the file's ctimes, which may be confusing
-- Some rare MIME messages with null boundaries can crash the indexer.
-
-1.5.0
-- Under some conditions, recoll startup and exit could be very slow: the
- simple search history list had serious problems with non-ascii strings,
- whose size sometimes doubled at each program startup/stop.
-
-1.3.3
-
-- Several of the external filters did not handle path names with embedded
- spaces (rcluncomp rclsoff rclps rclmedia rcldjvu). This is fixed in 1.4.
-
-- If your QT installation is built with the QT_NO_STL flag, Recoll will not
- compile. I have a patch for this (will be fixed in the next release),
- contact me if you get the problem. Typical error message:
- main.cpp:160: error: no match for 'operator+=' in 'msg += reason'
-
-- The 'None of these words' field in the complex search does not work if
- there are no other filled fields (it transforms into an ordinary
- search). Workaround: enter very common term(s) in the 'any of these
- words' field.
-
-- Indexing cannot currently be conveniently and cleanly stopped when it's
- started. You can kill the process, and keyboard interrupt might work, but
- this may leave the database in a bad state. This is fixed in the upcoming
- release, there is no current workaround.
-
-1.2.2
-- The preview window is supposed to scroll after loading the document so
- that the first search term is visible. This does not work in many cases.
-- The result list title is not shown for sorted lists
-
-Notes on older versions:
-- Trouble compiling on some linux systems (Gentoo and Slackware?). There
- existed a quite common issue where the Recoll link will fail trying to
- use a libstdc++.la file. This was due to a problem with the xapian-config
- program. A workaround has been included in the configure script for
- recoll 1.2.2, and the problem should not occur any more.
-
-- Case-insensitive search should now work in most cases (used to not work
- except for accented ascii).
-
-- All directories and files with names beginning with a dot were ignored
- by the skippedNames directive in the default recoll.conf file from
- older versions (no indexation of mozilla or thunderbird email !). An
- upgrade will not fix this (it will not modify an existing
- configuration). You need to edit recoll.conf by hand and remove the .*
- from skippedNames.
-
-
-
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html>
+ <head>
+ <title>Recoll known bugs</title>
+
+ <meta name="generator" content="HTML Tidy, see www.w3.org">
+ <meta name="Author" content="Jean-Francois Dockes">
+ <meta name="Description" content=
+ "recoll is a simple full-text search system for unix and linux
+ based on the powerful and mature xapian engine">
+ <meta name="Keywords" content=
+ "full text search, desktop search, unix, linux">
+ <meta http-equiv="Content-language" content="en">
+ <meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
+ <meta name="robots" content="All,Index,Follow">
+
+ <link type="text/css" rel="stylesheet" href="styles/style.css">
+ </head>
+
+ <body>
+
+ <div class="rightlinks">
+ <ul>
+ <li><a href="index.html">Home</a></li>
+ <li><a href="download.html">Downloads</a></li>
+ <li><a href="doc.html">Documentation</a></li>
+ </ul>
+ </div>
+
+ <div class="content">
+
+ <h1>Known bugs in current and older versions</h1>
+
+ <p><i>Bugs that are listed in an older version section are
+ supposedly fixed in later versions. Bugs listed in the
+ topmost section may also exist in older versions.</i></p>
+
+ <h2>Latest (recoll 1.11.2 + xapian 1.0.x)</h2>
+ <ul>
+
+ <li>Performing a full index with release 1.11 over an index
+ created with a much older recoll release may sometimes end
+ with an error saying "backend doesn't implement metadata".
+ If this happens, you need to delete the index directory
+ (typically <em>~/.recoll/xapiandb/</em>) and restart
+ indexing. For big indexes, removing the directory beforehand
+ may be a smart move to avoid losing time.</li>
+
+ <li> When Recoll is built with qt 4.4.0, the icons in the
+ result list are all displayed at the top of the page and
+ garbled. This appears to be a qt bug, fixed in 4.4.1. Use
+ either qt 4.3.x or 4.4.1.
+
+ <li> If the user-chosen result list entry format results in
+ several paragraphs (in the qt textedit sense), right clicks
+ will only work inside the first one for each entry.
+
+ <li>The "Copy file name" and "Copy URL" entries of the
+ right-click menus only copy the data to the X11 primary
+ selection (use middle-button click to paste). This is
+ probably a mistake: the data should be copied to the
+ clipboard too (permitting the use of the "Paste" edit menu
+ entry or Ctrl+V in the target).
+
+ <li> When a mime type has an external viewer defined, but the
+ actual file is compressed (e.g. xxx.txt.gz), recoll will try
+ to start the external viewer on the compressed file, which
+ will not work in most cases.
+
+ <li> NEAR crashes: 1.6 has added NEAR searches. Unlike what
+ recoll did with PHRASES, stemming expansion is performed on
+ terms inside NEAR clauses (except if prevented by a
+ capitalized entry of course). There is a bug in Xapian (all
+ versions as far as I know), where NEAR does not support
+ multiple OR subclauses, as would result from a multiple
+ expansion. This manifests itself by a 'not implemented'
+ Xapian exception. Workarounds:
+ <ul>
+ <li>Prevent expansion of NEAR terms (possibly except one) by
+ capitalizing them.
+
+ <li>Or apply the following patch to xapian, inside the
+ "api/" directory:
+ http://www.recoll.org/xapian/xapNearDistrib-1.0.patch
+ or fetch the already patched source:
+ http://www.recoll.org/xapian/xapian-core-1.0.7-recollNEARpatch.tar.gz
+ then recompile, and install.
+ </li>
+ </ul>
+
+ I hope that an equivalent fix will make it into xapian at
+ some point (the current fix is not completely correct but
+ still handles most useful cases).</li>
+
+ <li> If you are seeing a delay of a few seconds before the
+ result list displays for the first query of a recoll
+ instance, try changing the result list font in the query
+ preferences. This is not a recoll problem; I don't know the
+ exact cause (I've seen it happen with "Sans Serif" and go
+ away with Helvetica or Arial).
+
+ <li> Under some versions of KDE (e.g. Fedora FC5 KDE
+ 3.5.4-0.5.fc5), there is a problem with the window stacking
+ order. Opening the "browse" file selection dialog from the
+ advanced search dialog will stack the latter under the main
+ window, possibly making it invisible. This is quite probably
+ a Kwin bug, possibly related to
+ http://bugs.kde.org/show_bug.cgi?id=79183 or a correction
+ thereof.
+
+ <li> Under Solaris, it is necessary to perform initial indexing with the
+ recollindex program (the recoll index thread doesn't work for creating
+ the database). The reason is unknown. My only idea is a problem with
+ exception handling (recoll catches an exception while trying to open
+ the not yet existing db).</li>
+ </ul>
+
+ <h2>1.11.1</h2>
+ <ul>
+ <li>Unicode space characters like
+ <em>0x3000, Ideographic space</em>
+ were not detected inside user entries like the main
+ interface search entry. Badly parsed searches would retrieve no
+ results, when the same search entered with ascii space characters
+ would have succeeded.</li>
+ <li>Spaces were inserted inside CJK strings when building
+ abstracts for the result list.</li>
+ </ul>
+
+ <h2>1.10.6</h2>
+ <ul>
+ <li> If the locale is not utf-8, non-ascii command line
+ arguments to recoll and recollq are not converted to utf-8,
+ which may prevent, for example, the kde applet from
+ working. The workaround is to apply the following one-line
+ fix to qtgui/main.cpp, recompile and install recoll:
+ <pre>
+ 386c386
+ &lt; sSearch-&gt;setSearchString(QString::fromUtf8(qstring.c_str()));
+ ---
+ &gt; sSearch-&gt;setSearchString(QString::fromLocal8Bit(qstring.c_str()));
+ </pre>
+ </li>
+ </ul>
+
+ <h2>1.10.1</h2>
+
+ <ul>
+ <li> A relatively simple error case can cause the indexer to
+ stop processing an mbox file (forgetting all subsequent
+ messages). More specifically, this happens when encountering
+ more than a few dozen errors while handling
+ attachments. This is relatively common: for example if an
+ external helper application is missing and multiple
+ attachments of the affected type are found (e.g. multiple
+ images and no exiftool). Workaround: install the helper
+ application.
+ <li> The decoding of base-64 data in emails fails in a relatively uncommon
+ but sometimes encountered case.
+ <li> In a preview window, when walking the search term hits with the
+ Previous/Next buttons, 'Previous' actually acts as 'Next' (it does work
+ normally for the local search).
+ <li> Problems in detecting message separators inside Thunderbird mailboxes
+ (quite probably mainly for messages imported from outlook?). Can lead to
+ unindexed messages, and even apparently indexer crashes in some cases.
+ <li> File names indexed as terms can sometimes overflow the maximum term
+ size, halting the indexing.
+ <li> For Phrase/Near searches, only the first term group is highlighted in
+ preview.
+ </ul>
+
+ <h2>1.10.0</h2>
+ <ul>
+
+ <li> If a filter fails while trying to extract the data from a file, the file
+ will not be indexed at all (not even the file name). The file
+ name should be indexed in this case. This happens in particular in the
+ very common case where the helper application is not installed (e.g.
+ missing Exiftool -&gt; no *.jpg names in the index).
+
+ <li> If several query language "ext:" qualifiers are specified, they will be
+ joined by an AND instead of OR, resulting in no results. Using an
+ explicit OR doesn't work (actually OR + field names is generally
+ broken). In some cases, you can use a "type:" qualifier as a workaround.
+
+
+ </ul>
+ <h2>1.9.x</h2>
+ <ul>
+ <li> Problems have been reported when indexing big mailstores (several
+ hundred thousand messages), resulting in a very big database and even
+ crashes.
+
+ </ul>
+ <h2>1.8.2</h2>
+ <ul>
+ <li> Under ubuntu (at least, maybe debian too), the default awk interpreter
+ (mawk) is ancient, and the recoll pdf input filter does not
+ work (removes all space characters). This can be solved by installing the
+ gawk package and selecting it as the default awk:
+ <pre>$ apt-get install gawk
+ $ update-alternatives --set awk /usr/bin/gawk</pre>
+
+ <li> There are sometimes problems with document deletions: the index can
+ get in a state where deleted or moved documents are not purged from the
+ index (the log file says that the docs are deleted, but they actually
+ aren't). When this happens, the only solution currently is to reindex
+ from scratch (recollindex -z). This is due to a xapian bug, which is
+ fixed in xapian 1.0.2, or you can apply the following patch to xapian
+ 1.0.1 to fix it:
+ http://www.lesbonscomptes.com/recoll/xapian/xapian-delete-document.patch
+
+ <li> The dates shown for email attachments in a result list are the email
+ folder modification date. This should be inherited from the parent
+ message instead.
+
+ <li> There are a few problems in the qt4 version of recoll:<ul>
+ <li> Some accelerators (esc-spc, ctl-arrow) do not work, neither do
+ copy/paste between the result list and preview windows and x11
+ applications.
+ <li> The qt4 q3textedit::find() method is extremely slow, so that
+ positioning to the first search term in Recoll preview has been disabled,
+ and the application will sometimes appear to be looping when using the
+ find feature in the preview window (it's not looping, it's searching...)
+ </ul>
+ </ul>
+ <h2>1.8.1</h2>
+ <ul>
+ <li> This is not really a bug but .beagle really should be included in
+ "skippedNames", or you end up indexing the beagle text cache, which is
+ not really desirable.
+ <li> Doc bug: the manual states that the query language supports a "mime:"
+ switch to filter mime types. There is currently no such thing.
+
+
+ </ul>
+ <h2>1.7.5</h2>
+ <ul>
+ <li> Debian and Ubuntu: the rclsoff Openoffice filter doesn't work,
+ because of an incorrect shell syntax (understood by bash but not sh). To
+ fix, you can edit /usr[/local]/share/recoll/filters/rclsoff and change
+ the line:
+ <pre>trap cleanup EXIT SIGHUP SIGQUIT SIGINT SIGTERM</pre>
+ into:
+ <pre>trap cleanup EXIT HUP QUIT INT TERM</pre>
+ or download the updated filter from the filters page:
+ http://www.recoll.org/filters/filters.html
+
+ </ul>
+ <h2>1.7.3</h2>
+ <ul>
+ <li> Processing will stop on first error while indexing an mbox file. This
+ could happen just because an attachment could not be decoded, and can
+ cause non-indexing of many messages. The most probable cause of error is
+ a missing filter (e.g. for ms-word files), so the temporary workaround
+ would be to install the missing filters. This bug is specific to 1.7;
+ 1.6 users need not worry. A correction will be issued very soon.
+ <li> Messages of type multipart/signed are not indexed.
+
+ </ul>
+ <h2>1.6.2</h2>
+ <ul>
+ <li> Relatively infrequent issue with message boundary detection in mbox
+ files, could cause miscellaneous problems.
+ <li> Executing an external viewer for a file with single-quotes in the name
+ would not work.
+
+ </ul>
+ <h2>1.5.10</h2>
+ <ul>
+ <li> If a defaultcharset was set in the configuration file for a subdirectory,
+ it would stay in effect for all subsequent files/directories (except if
+ explicitly overridden), potentially causing many transcoding errors.
+
+ </ul>
+ <h2>1.5.[1-7]</h2>
+ <ul>
+ <li> Dates in result list come from the file's ctimes, which may be confusing.
+ <li> Some rare MIME messages with null boundaries can crash the indexer.
+
+ </ul>
+ <h2>1.5.0</h2>
+ <ul>
+ <li> Under some conditions, recoll startup and exit could be very slow: the
+ simple search history list had serious problems with non-ascii strings,
+ whose size sometimes doubled at each program startup/stop.
+
+ </ul>
+ <h2>1.3.3</h2>
+ <ul>
+
+ <li> Several of the external filters did not handle path names with embedded
+ spaces (rcluncomp rclsoff rclps rclmedia rcldjvu). This is fixed in 1.4.
+
+ <li> If your QT installation is built with the QT_NO_STL flag, Recoll will not
+ compile. I have a patch for this (will be fixed in the next release),
+ contact me if you get the problem. Typical error message:
+ <pre>main.cpp:160: error: no match for 'operator+=' in 'msg += reason'</pre>
+
+ <li> The 'None of these words' field in the complex search does not work if
+ there are no other filled fields (it transforms into an ordinary
+ search). Workaround: enter very common term(s) in the 'any of these
+ words' field.
+
+ <li> Indexing cannot currently be conveniently and cleanly
+ stopped once it has started. You can kill the process, and
+ keyboard interrupt might work, but this may leave the
+ database in a bad state. This is fixed in the upcoming
+ release; there is no current workaround.
+ </ul>
+
+ <h2>1.2.2</h2>
+ <ul>
+ <li> The preview window is supposed to scroll after loading the document so
+ that the first search term is visible. This does not work in many cases.
+ <li> The result list title is not shown for sorted lists.
+ </ul>
+ <h2>Notes on older versions</h2>
+ <ul><li> Trouble compiling on some linux systems (Gentoo and Slackware?). There
+ existed a quite common issue where the Recoll link would fail trying to
+ use a libstdc++.la file. This was due to a problem with the xapian-config
+ program. A workaround has been included in the configure script for
+ recoll 1.2.2, and the problem should not occur any more.
+
+ <li> Case-insensitive search should now work in most cases
+ (used to not work except for accented ascii).
+
+ <li> All directories and files with names beginning with a dot were ignored
+ by the skippedNames directive in the default recoll.conf file from
+ older versions (so mozilla or thunderbird email was not indexed). An
+ upgrade will not fix this (it will not modify an existing
+ configuration). You need to edit recoll.conf by hand and remove the .*
+ from skippedNames.</li>
+
+ </ul>
+
+ </div>
+ </body>
+</html>