= Recoll WebUI Apache installation from scratch

The https://github.com/koniu/recoll-webui[Recoll WebUI] offers an alternative, web-based interface for querying a Recoll index. It is useful for extending the use of a shared index to multiple workstations, without the need for a local Recoll installation and shared data storage.

The Recoll WebUI is based on the http://bottlepy.org/docs/dev/index.html[Bottle Python framework], which has a built-in web server, and the simplest deployment approach is to run it standalone. However, the built-in server handles only one request at a time, which is problematic in multi-user situations, especially because some requests, like extracting a result list into a CSV file, can take a significant amount of time.

The Bottle framework can work with several multi-threading Python HTTP server libraries but, given the limitations of the Recoll Python module and of the Python interpreter itself, this does not yield optimal performance and, in particular, cannot make efficient use of the now ubiquitous multiprocessors.

In multi-user situations, you can get better performance and ease of use from the Recoll WebUI by running it under Apache rather than as a standalone process. With this approach, a few requests per second can easily be handled even in the presence of long-running ones.

Neither Recoll nor the WebUI is optimized for high multi-user load, and it would be very unwise to use them as the search interface to a busy web site.

The instructions for using the WebUI under Apache given in the repository README are a bit terse and are missing a few details, especially ones which impact performance. What follows is a synopsis of two WebUI installations on initially Apache-less Ubuntu (14.04) and DragonFly BSD systems. The first should extend easily to other Debian-based systems, the second at least to FreeBSD. rpm-based systems are left as an exercise to the reader, at least for now...

CAUTION: THE CONFIGURATIONS DESCRIBED HAVE NO ACCESS CONTROL. ANYONE WITH ACCESS TO THE NETWORK WHERE THE SERVER IS LOCATED CAN RETRIEVE ANY DOCUMENT.

== On a Debian/Ubuntu system

=== Install recoll

    sudo apt-get install recoll python-recoll

Configure the indexing and check that the normal search works (I spent quite a lot of time trying to understand why the WebUI did not work, when in fact it was the normal recoll configuration which was broken and the regular search did not work either).

Take care to be logged in as the user you want to run the web search as while you do this.

=== Install the WebUI

Clone the github repository, or extract the master tar archive, and move it to '/var/www/recoll-webui-master/'. Take care that it is read/execute accessible by your user.

=== Install Apache and mod-wsgi

    sudo apt-get install apache2 libapache2-mod-wsgi

I then got the following message:

    AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message

To clear it, I added a ServerName directive to the Apache configuration; you may not need it. Edit '/etc/apache2/sites-available/000-default.conf' and add the following at the top (globally). Things work without this fix anyway; it only suppresses the message. You probably need to adjust the address or use a real host name:

    ServerName 192.168.4.6

Edit '/etc/apache2/mods-enabled/wsgi.conf' and add the following at the end of the "IfModule" section. Change the user ('dockes' in the example), taking care that it is the one who owns the index ('.recoll' is in that user's home directory).


    WSGIDaemonProcess recoll user=dockes group=dockes \
        threads=1 processes=5 display-name=%{GROUP} \
        python-path=/var/www/recoll-webui-master
    WSGIScriptAlias /recoll /var/www/recoll-webui-master/webui-wsgi.py
    <Directory /var/www/recoll-webui-master>
        WSGIProcessGroup recoll
        Order allow,deny
        allow from all
    </Directory>

NOTE: the Recoll WebUI application is mostly single-threaded, so it is of little use (and may actually be counter-productive in some cases) to specify multiple threads on the WSGIDaemonProcess line. Specify multiple processes instead to put multiple CPUs to work on simultaneous requests.

Then run the following to restart Apache:

    sudo apachectl restart

The Recoll WebUI should now be accessible at 'http://my.server.com/recoll/'.

NOTE: You need a '/' at the end of the URL used to access the search (use 'http://my.server.com/recoll/', not 'http://my.server.com/recoll'). Otherwise, files other than the script itself are not found (the page looks weird and the search does not work).

CAUTION: THERE IS NO ACCESS CONTROL. ANYONE WITH ACCESS TO THE NETWORK WHERE THE SERVER IS LOCATED CAN RETRIEVE ANY DOCUMENT.

== Variant for BSD/ports

=== Packages

As root:

    pkg install recoll

Do what you need to do to configure the indexing and check that the normal search works. Take care to be logged in as the user you want to run the web search as while you do this.

    pkg install apache24

Add apache24_enable="YES" in /etc/rc.conf

    pkg install ap24-mod_wsgi4
    pkg install git

=== Clone the webui repository

    cd /usr/local/www/apache24/
    git clone https://github.com/koniu/recoll-webui.git recoll-webui-master

IMPORTANT: most input handler helper applications (e.g. 'pdftotext') are installed in '/usr/local/bin', which is not in the PATH as seen by Apache (at least on DragonFly). The simplest way to fix this is to modify the launcher module for the webui app so that it fixes the PATH.


Edit 'recoll-webui-master/webui-wsgi.py' and add the following line after the 'import os' line:

    os.environ['PATH'] = os.environ['PATH'] + ':' + '/usr/local/bin'

=== Configure apache

Edit '/usr/local/etc/apache24/modules.d/270_mod_wsgi.conf'. Uncomment the LoadModule line, and add the directives to alias /recoll/ to the webui script.

Change the user ('dockes' in the example), taking care that it is the one who owns the index ('.recoll' is in that user's home directory).

Contents of the file:

    ## $FreeBSD$
    ## vim: set filetype=apache:
    ##
    ## module file for mod_wsgi
    ##
    ## PROVIDE: mod_wsgi
    ## REQUIRE:
    LoadModule wsgi_module        libexec/apache24/mod_wsgi.so
    WSGIDaemonProcess recoll user=dockes group=dockes \
        threads=1 processes=5 display-name=%{GROUP} \
        python-path=/usr/local/www/apache24/recoll-webui-master/
    WSGIScriptAlias /recoll /usr/local/www/apache24/recoll-webui-master/webui-wsgi.py
    <Directory /usr/local/www/apache24/recoll-webui-master>
        WSGIProcessGroup recoll
        Require all granted
    </Directory>

=== Restart apache

As root:

    apachectl restart
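
Once Apache has been restarted, it can be handy to check from a script that the WebUI actually answers. The following is only a minimal sketch, assuming Python 3 is available on the machine running the check. The function names `webui_url` and `check_webui` are illustrative and not part of the WebUI, and 'http://my.server.com/recoll' is the same placeholder address used in the Debian/Ubuntu section. The sketch adds the trailing '/' that the WebUI needs before fetching the front page.

```python
import urllib.request


def webui_url(base):
    """Return the base URL with exactly one trailing '/'.

    The WebUI only works when it is accessed with the trailing slash.
    """
    return base.rstrip('/') + '/'


def check_webui(base, timeout=10):
    """Fetch the WebUI front page and return (ok, detail).

    'base' is the address where Apache exposes the WebUI (an assumption
    about your setup, e.g. 'http://my.server.com/recoll').
    """
    url = webui_url(base)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200, 'HTTP %d from %s' % (resp.status, url)
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False, '%s: %s' % (url, exc)
```

Calling `check_webui('http://my.server.com/recoll')` returns a pair whose first element tells whether the page was served. A failure here only says that the front page could not be fetched; it does not replace checking that the regular recoll search works for the user named on the WSGIDaemonProcess line.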