Recoll

This is Recoll, a personal full text indexing system.

Recoll is free software, copyrighted and released under the GPL; see COPYING inside the distribution. A lot of the code is imported from other packages, see the Credits.

Recoll is still in its infancy, but it is based on a very strong backend (Xapian), and I find it quite useful right now.

You might be interested in using Recoll to index your home directory instead of using Xapian's Omega if, for example, you do not want to run a web server, or your data is not ISO-8859-1. But the query features are much less sophisticated for now.

See INSTALL inside the distribution for compiling and installing. This is very much done by hand for now; I hope things will get better in the near future.

Features:

* Supports the following document types: text, HTML, PDF (with Xpdf's pdftotext), PostScript (with ghostscript's pstotext), MS Word (with antiword), OpenOffice files, maildir and mailbox mail folders (Mozilla and Thunderbird mail ok). Compressed versions of these are handled too.
* Relatively powerful query facilities, with boolean searches, phrases, and filtering on file types and directory tree.
* Support for multiple charsets. Internal processing and storage use Unicode UTF-8.
* Stemming performed at query time (you can switch the stemming language after indexing).
* Easy installation. No database daemon, web server or exotic language necessary. The idea is that EVERYBODY should index their files because it makes life easier.
* An ugly GUI, Qt-based, built with Qt Designer.
* An indexer which runs either as a thread inside the GUI or as an external, cron'able program.

Recoll has been compiled and tested on FreeBSD, Linux and Solaris (FreeBSD 5.3, Red Hat 7.3 and Solaris 8, but other reasonably close releases should be ok too).

Things lacking, coming in the not too distant future:

* A better GUI. So many things are badly done or missing that I won't try to list them here.
* An interactive configuration tool. You need to edit files by hand for now.
* Packages, rpm or other. It's all tar files currently.
* A build system, autoconf et al.
* Documentation and help.
* A few more filters for less common file types.

Using recoll

* Use File->Index to build or rebuild the database (what to index is defined in the configuration file, see the install doc).
* Enter search terms in the upper left text field. There is no query language right now: the search only understands probabilistic terms (just words...) and phrases enclosed in double quotes. Click Search or press Return.
* A result list should appear in the left pane. You can use the Next/Prev buttons to paginate.
* Clicking on an entry in the list will display a preview in the right pane. This can take some time for big PostScript or PDF files, as the file is converted on the fly for the preview.
* Double-clicking on an entry should launch an external viewer, as specified in the mimeconf file (see INSTALL). This doesn't work for compressed files for now.

I very much welcome suggestions or (gasp) code.

I hope that this can be useful to somebody; it already is for me.

Downloads

Current version: 1.0 (tar.gz)
Older: 0.7

Installation

Prerequisites

At the very least, you will need to download and install the Xapian core package (I am currently using Xapian version 0.8.5), and the Qt runtime and development packages (I am currently using Qt 3.3.3). You will most probably be able to find a binary package of Qt for your system. You may have to compile Xapian, but this is not difficult.

You also need libiconv. I am currently using version 1.9. The iconv interface is part of libc on Linux systems, so you shouldn't need to do anything there.

External file types: Recoll uses external applications to index some file types. You need to install them for the file types that you wish to have indexed:

* MS Word documents: antiword.
* PDF files: pdftotext, which is part of the Xpdf package.
* PostScript files: pstotext.
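
To make the idea of external converters a bit more concrete, here is a minimal sketch in Python. It is not Recoll's own code and does not reflect how Recoll is configured: the table `CONVERTERS` and the functions `converter_command` and `missing_converters` are hypothetical names made up for this illustration. It only assumes that each of the three programs above writes plain text to standard output when called as shown (pdftotext needs `-` as the output name for that).

```python
import shutil

# Hypothetical table: MIME type -> command that writes plain text to stdout.
# "{path}" stands for the file to convert. Illustration only, not Recoll's
# actual configuration format.
CONVERTERS = {
    "application/msword": ["antiword", "{path}"],
    "application/pdf": ["pdftotext", "{path}", "-"],
    "application/postscript": ["pstotext", "{path}"],
}


def converter_command(mime_type, path):
    """Return the argument list to convert `path`, or None if unsupported."""
    template = CONVERTERS.get(mime_type)
    if template is None:
        return None
    return [path if arg == "{path}" else arg for arg in template]


def missing_converters():
    """Return the sorted names of converter programs not found on PATH."""
    programs = {template[0] for template in CONVERTERS.values()}
    return sorted(name for name in programs if shutil.which(name) is None)
```

Calling `missing_converters()` before indexing would tell you which of the three programs still need to be installed, and the list returned by `converter_command` could be handed to `subprocess.run` to capture the extracted text.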

Compiling, installing, using

See the INSTALL file.

Credits

Recoll is mainly glue code, and most of the intelligent parts use code from external projects. Recoll borrows (steals?) heavily from the following projects. I tried to include the relevant copyright attributions with the code. Any omission is unintentional and will be fixed as soon as I am notified.

* Xapian: the database module (core) is used unmodified, and quite a lot of code has been borrowed from Omega, the web-based search application (i.e. the HTML parser, plus miscellaneous bits and ideas).
* Estraier: miscellaneous pieces of code and ideas, especially for charset handling, and code from external filters.
* Unac: for accent removal. This is a relatively small package that is not easy to find; it has been integrated almost unmodified into the Recoll package.
* Iconv: for character set conversion.
* Binc IMAP: for MIME parsing code.

jean-francois.dockes@wanadoo.fr