git clone https://@opensourceprojects.eu/git/p/recoll1/code recoll1-code



File            Date        Author  Commit
bincimapmime    2005-04-06  dockes  [1293f0] re-port to linux
common          2005-09-22  dockes  [e1c3db] adjust start/end of word when trimming
filters         2005-02-09  dockes  [152d47] added support for openoffice and word + optimiz...
index           2005-04-07  dockes  [11bb23] replaced /usr/bin/file exec with internal code
internfile      2005-04-06  dockes  [1293f0] re-port to linux
lib             2005-04-07  dockes  [11bb23] replaced /usr/bin/file exec with internal code
mk              2005-04-08  dockes  [9ccd15] works on solaris8
qtgui           2005-10-10  dockes  [6cfe82] ckpt
query           2005-02-08  dockes  [4c54a8] fixes in textsplit
rcldb           2005-04-06  dockes  [1293f0] re-port to linux
sampleconf      2005-04-07  dockes  [11bb23] replaced /usr/bin/file exec with internal code
unac            2004-12-17  dockes  [ab473f] unac 1.7.0
utils           2005-04-08  dockes  [9ccd15] works on solaris8
COPYING         2005-02-04  dockes  [74434a] uncompression+linux port
INSTALL         2005-04-06  dockes  [1293f0] re-port to linux
Makefile        2005-04-08  dockes  [9ccd15] works on solaris8
README          2005-05-17  dockes  [4bb52a] *** empty log message ***
VERSION         2005-04-06  dockes  [1293f0] re-port to linux
excludefile     2005-02-08  dockes  [86572d] *** empty log message ***
makesrcdist.sh  2005-04-06  dockes  [1293f0] re-port to linux

Read Me


Recoll

  Introduction

   This is Recoll, a personal full text indexing system.

   Recoll is free software, copyrighted and released under the GPL; see
   COPYING inside the distribution. A lot of the code is imported from other
   packages; see the Credits.

   Recoll is still in its infancy, but it is based on a very strong backend
   (Xapian), and it can actually be useful right now, which is why I am
   releasing it so early. You might be interested in using Recoll instead of
   Xapian's Omega to index your home directory if, for example, you do not
   want to run a web server, or your data is not iso-8859-1. Be aware that
   Recoll's query features are currently much weaker than Omega's.

   See INSTALL inside the distribution for compiling and installing. This is
   very much done by hand for now; I hope it will get better in the near
   future.

  Features:

     * Indexes text, html, pdf (with xpdf's pdftotext), postscript (with
       ghostscript's pstotext), msword (with antiword), openoffice files,
       maildir and mailbox mail folders (mozilla and thunderbird mail ok).
       Also handles compressed versions of all of these.
     * Support for multiple charsets. Internal processing and storage uses
       Unicode UTF-8.
     * Stemming performed at query time (can switch stemming language after
       indexing)
     * Easy installation. No database daemon, web server or exotic language
       necessary. The idea is that EVERYBODY should index their files because
       it makes life easier.
     * An ugly GUI, qt-based, written with qt Designer.
     * An indexer which runs either as a thread inside the GUI or as an
       external, cron'able program.
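
   To illustrate why stemming at query time lets the stemming language be
   switched after indexing, here is a minimal sketch. It assumes the index
   stores unstemmed terms and that a query word is expanded to every index
   term sharing its stem. The names toy_stem, build_stem_map and expand are
   hypothetical, and the suffix stripper is only a stand-in for a real
   stemmer; none of this is taken from the Recoll sources.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_stem_map(index_terms, stem):
    """Group the unstemmed index terms by stem.

    The index itself is untouched, so this map can be rebuilt with a
    different stemmer (another language) without reindexing any document.
    """
    groups = defaultdict(set)
    for term in index_terms:
        groups[stem(term)].add(term)
    return groups


def expand(query_word, stem_map, stem):
    """Return every index term that shares the query word's stem."""
    return sorted(stem_map.get(stem(query_word), {query_word}))


# Assumed sample of terms as they would be stored in the index (unstemmed).
index_terms = {"index", "indexes", "indexed", "indexing", "filter", "filters"}
stem_map = build_stem_map(index_terms, toy_stem)
print(expand("indexing", stem_map, toy_stem))
```

   Swapping toy_stem for a stemmer of another language only requires
   rebuilding the stem map, which is much cheaper than indexing again.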

   Recoll has been compiled and tested on FreeBSD, Linux and Solaris
   (FreeBSD 5.3, Red Hat 7.3 and Solaris 8; other not too distant releases
   should be ok too).

  Things lacking, coming in the not too distant future:

     * A more sophisticated query interface: the current one has no boolean
       capabilities.
     * A better GUI. So many things are badly done or missing that I won't
       try to list them here.
     * An interactive configuration tool. You need to edit files by hand for
       now.
     * Packages, rpm or other. It's all tar files currently.
     * A build system, autoconf et al.
     * Documentation and help.
     * A few more filters for less common file types.

  Using recoll

     * Use File->Index to build or rebuild the database (what to index is
       defined in the configuration file, see the install doc).
     * Enter search terms in the upper left text field. There is no query
       language right now: the search only understands probabilistic terms
       (just words...) and phrases enclosed in double quotes. Click Search
       or press Enter.
     * A result list should appear in the left pane. You can use the
       Next/Prev buttons to page through it.
     * Clicking on an entry in the list will display a preview in the right
       pane. This can take some time for big postscript or pdf files, as
       the file is converted on the fly for the preview.
     * Double-clicking on an entry should launch an external viewer, as
       specified in the mimeconf file (see INSTALL). This doesn't work for
       compressed files for now.
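
   As an illustration of the query format described in the list above (plain
   words plus double-quoted phrases), here is a minimal sketch of how such
   input could be split. The function name split_query and the handling of
   unbalanced quotes are assumptions made for this example, not a
   description of Recoll's actual parser.

```python
import re

# Either a double-quoted phrase or a run of non-blank characters.
_TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def split_query(text):
    """Split user input into (words, phrases).

    words   -- list of single search terms
    phrases -- list of phrases, each one a list of words
    """
    words, phrases = [], []
    for phrase, word in _TOKEN.findall(text):
        if phrase.strip():
            phrases.append(phrase.split())
        elif word:
            # An unbalanced quote is simply dropped (assumed behaviour).
            cleaned = word.strip('"')
            if cleaned:
                words.append(cleaned)
    return words, phrases


print(split_query('full text "personal indexing system" xapian'))
```

   The words would then be passed on as probabilistic terms and each phrase
   as an exact word sequence.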

   I very much welcome suggestions or (gasp) code.

   I hope that this can be useful to somebody; it already is for me.

  Downloads

   Current version: 0.7 (tar.gz)

   Older: 0.6

  Installation

    Prerequisites

   At the very least, you will need to download and install the xapian core
   package (I am currently using xapian version 0.8.5), and the qt runtime
   and development packages (I am currently using qt 3.3.3).

   You will most probably be able to find a binary package for qt for your
   system. You may have to compile xapian, but this is not difficult.

   You also need libiconv. I am currently using version 1.9. On Linux
   systems the iconv interface is part of libc, so you shouldn't need to do
   anything there.

    Compiling, installing, using

   See the INSTALL file.

  Credits

   Recoll is mainly glue code, and most of the intelligent work uses code
   from external projects.

   Recoll borrows (steals?) heavily from the following projects. I tried to
   include the relevant copyright attributions with the code. Any omission is
   unintentional and will be fixed as soon as I am notified.

     * Xapian: The database module (core) is used unmodified, and quite a lot
       of code has been borrowed from Omega, the web-based search application
       (i.e. the html parser, plus miscellaneous bits and ideas).
     * Estraier: Miscellaneous pieces of code and ideas, especially for
       charset handling, and code from external filters.
     * Unac: for accent removal. This is a relatively small package and not
       that easy to find, so it has been integrated almost unmodified in the
       Recoll package.
     * Iconv, for character set conversion.
     * Binc IMAP, for the mail MIME parsing code.
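
   To give an idea of what accent removal means for indexing, here is a
   minimal sketch using Unicode decomposition from the Python standard
   library. It is not Unac's implementation, only an approximation of the
   effect; the function name strip_accents is made up for this example.

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks so accented letters match their base letters."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Jean-François a indexé ses fichiers"))
```

   Applying the same transformation to both indexed text and search terms
   lets a search for "indexe" also find "indexé".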

    jean-francois.dockes@wanadoo.fr