BUGS.html    393 lines (337 with data), 16.2 kB

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Recoll known bugs</title>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta name="Author" content="Jean-Francois Dockes">
<meta name="Description" content=
"recoll is a simple full-text search system for unix and linux
based on the powerful and mature xapian engine">
<meta name="Keywords" content=
"full text search, desktop search, unix, linux">
<meta http-equiv="Content-language" content="en">
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta name="robots" content="All,Index,Follow">
<link type="text/css" rel="stylesheet" href="styles/style.css">
</head>
<body>
<div class="rightlinks">
<ul>
<li><a href="index.html">Home</a></li>
<li><a href="download.html">Downloads</a></li>
<li><a href="doc.html">Documentation</a></li>
</ul>
</div>
<div class="content">
<h1>Known bugs in current and older versions</h1>
<p><i>Bugs that are listed in an older version section are
supposedly fixed in later versions. Bugs listed in the
topmost section may also exist in older versions.</i></p>
<h2><a name="b_latest">Latest (recoll 1.12.0 + xapian 1.0.10)</a></h2>
<ul>
<li>To compile the Python interface for recoll 1.12, you need
to edit setup.py and replace "rcldb/pathhash.cpp" with
"utils/fileudi.cpp".</li>
<li>Performing a full index with release 1.11 or later over an
index created with a much older recoll release may
sometimes end with an error saying "backend doesn't
implement metadata". If this happens, you need to delete
the index directory (typically <em>~/.recoll/xapiandb/</em>)
and restart indexing. For big indexes, removing the
directory beforehand may be preferable to avoid losing
time.</li>
<li> When Recoll is built with qt 4.4.0, the icons in the
result list are all displayed at the top of the page and
garbled. This appears to be a qt bug, fixed in 4.4.1. Use
either qt 4.3.x or 4.4.1.</li>
<li> If the user-chosen result list entry format results in
several paragraphs (in the qt textedit sense), right clicks
will only work inside the first one for each entry.</li>
<li> When a mime type has an external viewer defined, but the
actual file is compressed (e.g. xxx.txt.gz), recoll will try
to start the external viewer on the compressed file, which
will not work in most cases.</li>
<li>NEAR expansion errors: recoll performs stemming expansion
inside NEAR clauses (except if prevented by a capitalized
entry). Because of a Xapian bug (at least up to 1.0.10),
NEAR does not support multiple OR subclauses. This manifests
itself by a 'not implemented' Xapian exception or an
explicit error message. Workarounds:
<ul>
<li>Prevent expansion of NEAR terms (possibly except one) by
capitalizing them.</li>
<li>Or apply the following patch to xapian, inside the
"api/" directory:<br>
0.x versions:
<a href="xapian/xapNearDistrib-0.x.patch">
xapian/xapNearDistrib-0.x.patch</a>
<br>
1.0.[0-9]:
<a href="xapian/xapNearDistrib-1.0.0_9.patch">
xapian/xapNearDistrib-1.0.0_9.patch</a>
<br>
1.0.10:
<a href="xapian/xapNearDistrib-1.0.10.patch">
xapian/xapNearDistrib-1.0.10.patch</a>
<br>
or fetch the already patched source from
<a href="xapian/">xapian/</a>
then recompile, and install.
</li>
</ul>
I hope that an equivalent fix will make it into xapian at
some point (the current fix is not completely correct but
still handles most useful cases).</li>
<li>The recoll program sometimes seems to segfault when
exiting after its first execution.</li>
<li> If you are seeing a delay of a few seconds before the
result list displays for the first query of a recoll
instance, try changing the result list font in the query
preferences. This is not a recoll problem, and I don't know the
exact cause (I've seen it happen with "Sans Serif" and go
away with Helvetica or Arial).</li>
<li> Under some versions of KDE (e.g. Fedora FC5 KDE
3.5.4-0.5.fc5), there is a problem with the window stacking
order. Opening the "browse" file selection dialog from the
advanced search dialog will stack the latter under the main
window, possibly making it invisible. This is quite probably
a Kwin bug, possibly related to
http://bugs.kde.org/show_bug.cgi?id=79183 or a correction
thereof.</li>
<li> Under Solaris, it is necessary to perform initial indexing with the
recollindex program (the recoll index thread doesn't work for creating
the database). The reason is unknown. The only idea I have is a problem with
exception handling (recoll catches an exception while trying to open the
not yet existing db).</li>
<li>The default filter for files in microsoft word format
(application/msword, .doc), antiword, has trouble with some
relatively rare files with a very small text, resulting in the
following error message:
<blockquote>
I'm afraid the text stream of this file is too small to
handle.
</blockquote>
Only small files produced by Microsoft Word on a Mac, or by
OpenOffice will trigger this message. As a workaround, install
wvWare and modify mimeconf to use the rcldoc filter, which
will use wvWare if it is available. This will result in
slower indexing for doc files.</li>
</ul>
<h2><a name="b_1_11_4">1.11.4</a></h2>
<ul>
<li>Possibly harmful bug in strerror_r usage (GNU case).</li>
<li>Incorrect handling of "accents" inside Japanese katakana
text.</li>
<li>Using the "Erase history" command on an empty history
would cause recoll to crash.</li>
</ul>
<h2><a name="b_1_11_1">1.11.1</a></h2>
<ul>
<li>Unicode space characters like
<em>0x3000,&nbsp;Ideographic&nbsp;space</em>
were not detected inside user entries like the main
interface search entry. Badly parsed searches would retrieve no
results, when the same search entered with ascii space characters
would have succeeded.</li>
<li>Spaces were inserted inside CJK strings when building
abstracts for the result list.</li>
<li>Accent removal should not be performed for Japanese.</li>
<li>When using the query language, an OR part with more than
two terms will swallow preceding AND terms, one for each
additional OR. Ex: (champagne ext:odt OR ext:sxw OR ext:lyx)
will be interpreted as
"champagne OR ext:odt OR ext:sxw OR ext:lyx"
instead of the correct
"champagne AND (ext:odt OR ext:sxw OR ext:lyx)"
Workaround until the fix is issued: add non-existing terms
before the OR part and check the resulting query:
"champagne bogusxyztv ext:odt OR ext:sxw OR ext:lyx"
</li>
<li>The "Copy file name" and "Copy URL" entries of the
right-click menus only copy the data to the X11 primary
selection (use middle-button click to paste). This is
probably a mistake, the data should be copied to the
clipboard too (permitting the use of the "Paste" edit menu
entry or Ctrl+V in the target).</li>
<li>Possibly harmful bug in strerror_r usage (GNU case).</li>
</ul>
<h2>1.10.6</h2>
<ul>
<li> If the locale is not utf-8, non-ascii command line
arguments to recoll and recollq are not converted to utf-8,
which may prevent, for example, the kde applet from
working. The workaround is to apply the following one-line
fix to qtgui/main.cpp, recompile and install recoll:
<pre>
386c386
&lt; sSearch->setSearchString(QString::fromUtf8(qstring.c_str()));
---
&gt; sSearch->setSearchString(QString::fromLocal8Bit(qstring.c_str()));
</pre>
</li>
</ul>
<h2>1.10.1</h2>
<ul>
<li> A relatively simple error case can cause the indexer to
stop processing an mbox file (forgetting all subsequent
messages). More specifically, this happens when encountering
more than a few dozen errors while handling
attachments. This is relatively common: for example if an
external helper application is missing and multiple
attachments of the affected type are found (e.g. multiple
images and no exiftool). Workaround: install the helper
application.
<li> The decoding of base-64 data in emails fails in a relatively uncommon
but sometimes encountered case.
<li> In a preview window, when walking the search term hits with the
Previous/Next buttons, 'Previous' actually acts as 'Next' (it does work
normally for the local search).
<li> Problems in detecting message separators inside Thunderbird mailboxes
(quite probably mainly for messages imported from outlook?). Can lead to
unindexed messages, and even apparently indexer crashes in some cases.
<li> File names indexed as terms can sometimes overflow the maximum term
size, halting the indexing.
<li> For Phrase/Near searches, only the first term group is highlighted in
preview.
</ul>
<h2>1.10.0</h2>
<ul>
<li> If a filter fails while trying to extract the data from a file, the file
will not be indexed at all (not even the file name). The file
name should be indexed in this case. This happens in particular in the
very common case where the helper application is not installed (e.g.
missing Exiftool -> no *.jpg names in the index).
<li> If several query language "ext:" qualifiers are specified, they will be
joined by an AND instead of OR, resulting in no results. Using an
explicit OR doesn't work (actually OR + field names is generally
broken). In some cases, you can use a "type:" qualifier as a workaround.
</ul>
<h2>1.9.x</h2>
<ul>
<li> Problems have been reported indexing big mailstores (several hundreds of
thousands of messages): resulting in a very big database and even
crashes.
</ul>
<h2>1.8.2</h2>
<ul>
<li> Under ubuntu (at least, maybe debian too), the default awk interpreter
(mawk) is ancient, and the recoll pdf input filter does not
work (removes all space characters). This can be solved by installing the
gawk package.
<pre>
$ apt-get install gawk
$ update-alternatives --set awk /usr/bin/gawk
</pre>
<li> There are sometimes problems with document deletions: the index can
get in a state where deleted or moved documents are not purged from the
index (the log file says that the documents are deleted, but they actually
aren't). When this happens, the only solution currently is to reindex
from scratch (recollindex -z). This is due to a xapian bug, which is
fixed in xapian 1.0.2, or you can apply the following patch to xapian
1.0.1 to fix it:
http://www.lesbonscomptes.com/recoll/xapian/xapian-delete-document.patch
<li> The dates shown for email attachments in a result list are the email
folder modification date. This should be inherited from the parent
message instead.
<li> There are a few problems in the qt4 version of recoll:
<ul>
<li> Some accelerators (esc-spc, ctl-arrow) do not work, neither do
copy/paste between the result list and preview windows and x11
applications.</li>
<li> The qt4 q3textedit::find() method is extremely slow, so that
positioning to the first search term in Recoll preview has been disabled,
and the application will sometimes appear to be looping when using the
find feature in the preview window (it's not looping, it's searching...)</li>
</ul>
</li>
</ul>
<h2>1.8.1</h2>
<ul>
<li> This is not really a bug but .beagle really should be included in
"skippedNames", or you end up indexing the beagle text cache, which is
not really desirable.
<li> Doc bug: the manual states that the query language supports a "mime:"
switch to filter mime types. There is currently no such thing.
</ul>
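<p>As a rough sketch of the <em>.beagle</em> workaround above: the entry
can be appended to the <em>skippedNames</em> list in recoll.conf. The other
patterns shown here are placeholders for illustration, not the shipped
defaults; keep whatever your own file already lists.</p>
<pre>
# recoll.conf fragment (illustrative; the other patterns are placeholders)
skippedNames = CVS .svn *~ .beagle
</pre>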
<h2>1.7.5</h2>
<ul>
<li> Debian and Ubuntu: the rclsoff Openoffice filter doesn't work,
because of an incorrect shell syntax (understood by bash but not sh). To
fix, edit /usr[/local]/share/recoll/filters/rclsoff and change
the line:
<pre>
trap cleanup EXIT SIGHUP SIGQUIT SIGINT SIGTERM
</pre>
into:
<pre>
trap cleanup EXIT HUP QUIT INT TERM
</pre>
or download the updated filter from the filters page:
http://www.recoll.org/filters/filters.html
</ul>
<h2>1.7.3</h2>
<ul>
<li> Processing will stop on first error while indexing an mbox file. This
could happen just because an attachment could not be decoded, and can
cause non-indexing of many messages. The most probable cause of error is
a missing filter (e.g. for ms-word files), so the temporary workaround
would be to install the missing filters. This bug is specific to 1.7, and
1.6 users need not worry. A correction will be issued very soon.
<li> Messages of type multipart/signed are not indexed.
</ul>
<h2>1.6.2</h2>
<ul>
<li> Relatively infrequent issue with message boundary detection in mbox
files, which could cause miscellaneous problems.
<li> Executing an external viewer for a file with single-quotes in the name
would not work.
</ul>
<h2>1.5.10</h2>
<ul>
<li> If a defaultcharset was set in the configuration file for a subdirectory,
it would stay in effect for all subsequent files/directories (except if
explicitly overridden), potentially causing many transcoding errors.
</ul>
<h2>1.5.[1-7]</h2>
<ul>
<li> Dates in result list come from the file's ctime, which may be confusing.
<li> Some rare MIME messages with null boundaries can crash the indexer.
</ul>
<h2>1.5.0</h2>
<ul>
<li> Under some conditions, recoll startup and exit could be very slow: the
simple search history list had serious problems with non-ascii strings,
whose size sometimes doubled at each program startup/stop.
</ul>
<h2>1.3.3</h2>
<ul>
<li> Several of the external filters did not handle path names with embedded
spaces (rcluncomp rclsoff rclps rclmedia rcldjvu). This is fixed in 1.4.
<li> If your QT installation is built with the QT_NO_STL flag, Recoll will not
compile. I have a patch for this (will be fixed in the next release),
contact me if you get the problem. Typical error message:
<pre>
main.cpp:160: error: no match for 'operator+=' in 'msg += reason'
</pre>
<li> The 'None of these words' field in the complex search does not work if
there are no other filled fields (it transforms into an ordinary
search). Workaround: enter very common term(s) in the 'any of these
words' field.
<li> Indexing cannot currently be conveniently and cleanly
stopped once it has started. You can kill the process, and
keyboard interrupt might work, but this may leave the
database in a bad state. This is fixed in the upcoming
release; there is no current workaround.
</ul>
<h2>1.2.2</h2>
<ul>
<li> The preview window is supposed to scroll after loading the document so
that the first search term is visible. This does not work in many cases.
<li> The result list title is not shown for sorted lists.
</ul>
<h2>Notes on older versions</h2>
<ul>
<li> Trouble compiling on some linux systems (Gentoo and Slackware?). There
was a fairly common issue where linking Recoll would fail while trying to
use a libstdc++.la file. This was due to a problem with the xapian-config
program. A workaround has been included in the configure script for
recoll 1.2.2, and the problem should not occur any more.
<li> Case-insensitive search should now work in most cases
(used to not work except for accented ascii).
<li> All directories and files with names beginning with a dot were ignored
by the skippedNames directive in the default recoll.conf file from
older versions (no indexing of mozilla or thunderbird email!). An
upgrade will not fix this (it will not modify an existing
configuration). You need to edit recoll.conf by hand and remove the .*
from skippedNames.</li>
</ul>
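<p>The manual edit described in the last item might look roughly like
the following. This is only a sketch: apart from <em>.*</em>, the patterns
shown are placeholders and not the actual defaults of any release, so leave
the rest of your own list untouched.</p>
<pre>
# recoll.conf fragment (illustrative; patterns other than .* are placeholders)
# Before: the catch-all dot pattern hides every dot file and directory
# skippedNames = .* CVS *~
# After: the .* entry is removed, the other entries stay as they were
skippedNames = CVS *~
</pre>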
</div>
</body>
</html>