== Using the log file to investigate indexing issues

All *Recoll* processes print trace messages. By default these go to the standard error output, and you may never see them (for example, when the *recoll* GUI is started from the desktop interface).

A number of indexing issues may need investigation, such as:

- A file can't be found by searching even though it appears that it should have been indexed (this could happen because the file is not selected at all, or because a filter program crashes).
- The indexing process gets stuck and never finishes.
- The indexing process ends with an error.
- The indexing process seems to be using too much system capacity.

The right way to approach these problems is to use the *recollindex* command line tool (instead of the *recoll* GUI), and to set up the trace log to provide information about what indexing is actually doing.

Trace log parameters can be set either from the GUI _Preferences->Indexing Configuration->Global Parameters_ panel, or by editing the configuration file '~/.recoll/recoll.conf'. You should set the following parameters:

----
loglevel = 6
logfilename = stderr
thrQSizes = -1 -1 -1
----

We use _stderr_ instead of an actual file in order to capture direct filter messages (such as a *python* stack trace) along with normal *recollindex* messages. The last line sets *recollindex* for single-threaded operation, which makes the log much more readable.

You should then check that no *recoll* or *recollindex* process is currently running, and kill any you find. Then, if the issue concerns an identified file, try indexing only that file:

----
recollindex -i myunfindablefile.xxx > /tmp/myindexlog 2>&1
----

If this is a general issue with indexing (process not finishing properly), just start it:

----
recollindex > /tmp/myindexlog 2>&1
----

Usually, a look at the trace will show what is wrong (e.g. a configuration issue or a missing filter) and let you solve the problem.
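
As a starting point for reading the trace, the following sketch pulls out the lines that most often explain a problem. The function name +scan_index_log+ and the two search patterns are assumptions made for illustration and are not part of *Recoll*; adjust the patterns to what your log actually contains.

----
#!/bin/sh
# Hypothetical helper (not part of Recoll): list the log lines that are
# most likely to explain an indexing problem.
# The patterns 'skipping' and 'error' are assumptions; adapt them as needed.
scan_index_log() {
    log="${1:-/tmp/myindexlog}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # -n prefixes each match with its line number in the log,
    # -i ignores case, -E enables the 'a|b' alternation.
    grep -n -i -E 'skipping|error' "$log"
}
----

For example, _scan_index_log /tmp/myindexlog_ prints each matching line, prefixed with its line number in the log.
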
In case of indexer misbehaviour (e.g. using too much memory), you should run _tail -f_ on the log to see what is going on.

If this is not enough, please link:https://opensourceprojects.eu/p/recoll1/tickets/new/[open a tracker issue] and attach or link to the log data, or just email me (jfd at recoll.org).

*recollindex* and *recollindex -i* usually use the same criteria to decide whether to include a file (but see the _Path gotcha_ note below). They may occasionally behave differently, so it can be useful to run a full *recollindex* even for a specific file, but this will produce a big log file.

When you are done, it is better to reset the verbosity to a reasonable level (e.g. +2+: just errors, +3+: information, listing indexed files).

=== Note: the path gotcha

*recollindex -i* will only index files under the directories defined by the +topdirs+ configuration variable (your home directory by default). Unfortunately, the test is done on the file path text, ignoring possible symbolic links. If you give a simple file name as a parameter to *recollindex -i* and there are symbolic links inside the +topdirs+ entries, the comparison may fail.

For example, if your home directory is '/home/me/' and '/home/' is a link to '/usr/home/', *recollindex -i somefilename* will actually try to index '/usr/home/me/somefilename', and fail (because '/usr/home/me/' is not a subdirectory of '/home/me/'). This will show up in the log as a message like the following:

----
:4:../index/fsindexer.cpp:149:FsIndexer::indexFiles: skipping [/usr/home/me/somefile] (ntd)
----

If this happens, give a full path consistent with what is found in the configuration file (e.g.: _recollindex -i /home/me/somefile_).

=== File system occupation

One of the possible reasons for failed indexing is a +maxfsoccup+ parameter set too low. This is the file system occupation level (not the free space) at which indexing will stop. It is set from the GUI indexing configuration or by editing 'recoll.conf'.
A value of 0 means no checking, but a very low non-zero value will simply prevent indexing.
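
To check whether +maxfsoccup+ could be the cause, it can help to look at the current occupation of the file system. The sketch below is a hypothetical helper (the name +fs_occupation+ is not part of *Recoll*). It assumes that the parameter is a percentage comparable to the _Capacity_ column printed by *df*, and that the relevant file system is the one holding '~/.recoll'; pass another directory if your index lives elsewhere.

----
#!/bin/sh
# Hypothetical helper (not part of Recoll): print the occupation, in
# percent, of the file system that holds a given directory.
# Assumption: the index is stored under ~/.recoll unless a directory is given.
fs_occupation() {
    dir="${1:-$HOME/.recoll}"
    # df -P uses the portable POSIX layout: one header line, then one line
    # per file system whose 5th column is the capacity in use, e.g. "42%".
    df -P "$dir" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
----

For example, _fs_occupation_ prints a number such as 42 for a file system that is 42% full. If that number is above a non-zero +maxfsoccup+, indexing would be expected to stop.
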