--- a
+++ b/website/faqsandhowtos/WhyIsMyFileNotIndexed.txt
@@ -0,0 +1,99 @@
+== Using the log file to investigate indexing issues
+
+All *Recoll* processes print trace messages. By default these go to the
+standard error output, and you may never see them (for example, when the
+*recoll* GUI is started from the desktop interface).
+
+There are a number of potential issues with indexing that may need
+investigation, such as: 
+
+- A file can't be found by searching even though it appears that it should
+  have been indexed (this could happen because the file is not selected at
+  all or because a filter program crashes).
+- The indexing process gets stuck and never finishes.
+- The indexing process ends with an error.
+- The indexing process seems to be using too much system capacity.
+
+The right way to approach these problems is to use the *recollindex*
+command line tool (instead of the *recoll* GUI), and to set up the
+trace log to provide information about what indexing is actually doing. 
+
+Trace log parameters can be set either from the GUI _Preferences->Indexing
+Configuration->Global Parameters_ panel, or by editing the configuration
+file '~/.recoll/recoll.conf'. You should set the following parameters: 
+
+----
+loglevel = 6
+logfilename = stderr
+thrQSizes = -1 -1 -1
+----
+
+We use _stderr_ instead of an actual file in order to capture direct filter
+messages (such as a *python* stack trace) along with normal
+*recollindex* messages. 
+
+The last line sets recollindex for single-threaded operation, which will
+make the log much more readable. 
+
+You should then check that no *recoll* or *recollindex* process is
+currently running, and kill any you find. 
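+
+One possible way to perform this check from a terminal is sketched below. It
+assumes that the *pgrep* and *pkill* utilities are installed, which is
+common on Linux and BSD systems but not guaranteed:
+
+----
+# List running processes whose name contains "recoll"
+# (this matches both recoll and recollindex)
+pgrep -l recoll
+# Stop them if needed; -x requires an exact process name match
+pkill -x recollindex
+pkill -x recoll
+----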
+
+Then, if the issue concerns a specific file, try indexing only that file:
+
+----
+recollindex -i myunfindablefile.xxx > /tmp/myindexlog 2>&1
+----
+
+If this is a general issue with indexing (process not finishing properly),
+just run a full indexing pass:
+
+----
+recollindex > /tmp/myindexlog 2>&1
+----
+
+Usually, a look at the trace will show what is wrong (e.g. a configuration
+issue or a missing filter) and allow you to solve the problem.
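+
+With +loglevel+ set to 6 the log can be large, so it may help to filter it
+first. The sketch below assumes that every *recollindex* message starts with
+its level between colons, as in the +:4:+ sample line shown in the path
+gotcha note, and that levels below 4 are the more severe ones. Direct filter
+output (such as a *python* stack trace) has no such prefix and will not be
+matched:
+
+----
+# Show only messages logged at levels 1 to 3, with their line numbers
+grep -n -E '^:[1-3]:' /tmp/myindexlog
+----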
+
+In case of indexer misbehaviour (e.g. using too much memory), you should run
+_tail -f_ on the log to see what is going on.
+
+If this is not enough, please
+link:http://bitbucket.org/medoc/recoll/issues/new[open a tracker issue] and
+attach or link to the log data, or just email me (jfd at recoll.org). 
+
+*recollindex* and *recollindex -i* usually apply the same criteria to decide
+whether to include a file (but see the _Path gotcha_ note below). They may
+occasionally behave differently, so it can be useful to run a full
+*recollindex* even when investigating a specific file, but this will
+produce a big log file.
+
+When you are done, it is best to reset the verbosity to a reasonable
+level (e.g.: +2+ : just errors, +4+ : basic traces).
+
+=== Note: the path gotcha
+
+*recollindex -i* will only index files under the directories defined by the
++topdirs+ configuration variable (your home directory by
+default). Unfortunately, the test is done on the file path text, ignoring
+possible symbolic links. If you give a simple file name as a parameter to
+*recollindex -i* and there are symbolic links inside the +topdirs+
+entries, the comparison may fail. For example, if your home directory is
+'/home/me/' and '/home/' is a link to '/usr/home/', *recollindex -i
+somefile* will actually try to index '/usr/home/me/somefile', and
+fail (because '/usr/home/me/' is not a subdirectory of '/home/me/'). This
+will show up in the log as a message like the following:
+
+----
+:4:../index/fsindexer.cpp:149:FsIndexer::indexFiles: skipping [/usr/home/me/somefile] (ntd)
+----
+
+If this happens, give a full path consistent with what is found in the
+configuration file (e.g.: _recollindex -i /home/me/somefile_). 
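+
+To check whether the current directory is reached through a symbolic link,
+the logical and physical paths can be compared. This is only a sketch based
+on the standard shell *pwd* options; the paths in the comments are the ones
+from the example above:
+
+----
+# Logical path: as typed, symbolic links kept (e.g. /home/me)
+pwd -L
+# Physical path: symbolic links resolved (e.g. /usr/home/me)
+pwd -P
+----
+
+If the two outputs differ, a simple file name is likely to be expanded to a
+path that does not match the +topdirs+ entries.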
+
+=== File system occupation
+
+One of the possible reasons for failed indexing is a +maxfsoccup+
+parameter set too low. This is the file system occupation level (used
+space, not free space) at which indexing stops. It is set from the GUI
+indexing configuration or by editing 'recoll.conf'. A value of 0 disables
+the check, but a very low non-zero value will prevent indexing altogether.
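+
+The current occupation can be compared with the configured value using
+*df*. The sketch below rests on two assumptions that should be checked
+against the Recoll manual: that +maxfsoccup+ is a percentage comparable to
+the capacity column printed by *df*, and that the index is stored under the
+default '~/.recoll' directory:
+
+----
+# The "Capacity" column is the used percentage of the file system
+# holding the index directory
+df -P ~/.recoll
+----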