Switch to side-by-side view

--- a
+++ b/website/pages/recoll-webui-install-wsgi.txt
@@ -0,0 +1,200 @@
+= Recoll WebUI Apache installation from scratch 
+
+The https://github.com/koniu/recoll-webui[Recoll WebUI] offers an
+alternative, WEB-based, interface for querying a Recoll index.
+
+It can be quite useful to extend the use of a shared index to multiple
+workstations, without the need for a local Recoll installation and shared
+data storage.
+
+The Recoll WebUI is based on the
+http://bottlepy.org/docs/dev/index.html[Bottle Python framework], which has
+a built-in WEB server, and the simplest deployment approach is to run it
+standalone. However the built-in server is restricted to handling one
+request at a time, which is problematic in multi-user situations,
+especially because some requests, like extracting a result list into a CSV
+file, can take a significant amount of time.
+
+The Bottle framework can work with several multi-threading Python HTTP
+server libraries, but, given the limitations of the Recoll Python module
+and the Python interpreter itself, this will not yield optimal performance,
+and, especially can't efficiently leverage the now ubiquitous
+multiprocessors.
+
+In multi-user situations, you can get better performance and ease of use
+from the Recoll WebUI by running it under Apache rather than as a
+standalone process. With this approach, a few requests per second can
+easily be handled even in the presence of long-running ones.
+
+Neither Recoll nor the WebUI are optimized for high multi-user load, and it
+would be very unwise to use them as the search interface to a busy WEB
+site.
+
+The instructions about using the WebUI under Apache as given in the
+repository README are a bit terse, and are missing a few details,
+especially ones which impact performance.
+
+Here follows the synopsis of two WebUI installations on initially
+Apache-less Ubuntu (14.04) and DragonFly BSD systems. The first should
+extend easily to other Debian-based systems, the second at least to
+FreeBSD. rpm-based systems are left as an exercise to the reader, at least
+for now...
+
+
+CAUTION: THE CONFIGURATIONS DESCRIBED HAVE NO ACCESS CONTROL. ANYONE WITH
+ACCESS TO THE NETWORK WHERE THE SERVER IS LOCATED CAN RETRIEVE ANY
+DOCUMENT.
+
+== On a Debian/Ubuntu system
+
+=== Install recoll 
+
+    sudo apt-get install recoll python-recoll
+
+Configure the indexing and check that the normal search works (I spent
+quite a lot of time trying to understand why the WebUI did not work, when
+in fact it was the normal recoll configuration which was broken and the
+regular search did not work either).
+
+Take care to be logged in as the user you want to run the web search as
+while you do this.
+
+
+=== Install the WebUI
+
+Clone the github repository, or extract the master tar installation, and
+move it to '/var/www/recoll-webui-master/'. Take care that it is read/execute
+accessible by your user.
+
+=== Install Apache and mod-wsgi
+
+
+    sudo apt-get install apache2 libapache2-mod-wsgi
+
+I then got the following message:
+
+    AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message
+
+To clear it, I added a ServerName directive to the apache config, maybe you
+won't need it. Edit '/etc/apache2/sites-available/000-default.conf' and add
+the following at the top (globally). Things work without this fix anyway,
+this is just to suppress the error message. You probably need to adjust the
+address or use a real host name:
+
+    ServerName 192.168.4.6
+
+
+Edit '/etc/apache2/mods-enabled/wsgi.conf', add the following at the end of
+the "IfModule" section.
+
+Change the user ('dockes' in the example) taking care that he is the one who
+owns the index ('.recoll' is in his home directory).
+
+    WSGIDaemonProcess recoll user=dockes group=dockes \
+        threads=1 processes=5 display-name=%{GROUP} \
+        python-path=/var/www/recoll-webui-master
+    WSGIScriptAlias /recoll /var/www/recoll-webui-master/webui-wsgi.py
+    <Directory /var/www/recoll-webui-master>
+            WSGIProcessGroup recoll
+            Order allow,deny
+            allow from all
+    </Directory>
+
+NOTE: the Recoll WebUI application is mostly single-threaded, so it is of
+little use (and may actually be counter-productive in some cases) to
+specify multiple threads on the WSGIDaemonProcess line. Specify multiple
+processes instead to put multiple CPUs to work on simultaneous requests.
+
+
+Then run the following to restart apache:
+
+    sudo apachectl restart
+
+The Recoll WebUI should now be accessible. on 'http://my.server.com/recoll/'
+
+NOTE: Take care that you need a '/' at the end of the URL used to access
+the search (use: 'http://my.server.com/recoll/', not
+'http://my.server.com/recoll'), else files other than the script itself are
+not found (the page looks weird and the search does not work).
+
+CAUTION: THERE IS NO ACCESS CONTROL. ANYONE WITH ACCESS TO THE NETWORK
+WHERE THE SERVER IS LOCATED CAN RETRIEVE ANY DOCUMENT.
+
+== Variant for BSD/ports
+
+=== Packages
+
+As root:
+
+    pkg install recoll
+
+
+Do what you need to do to configure the indexing and check that the normal
+search works.
+
+Take care to be logged in as the user you want to run the web search as
+while you do this.
+
+    pkg install apache24
+
+Add apache24_enable="YES" in /etc/rc.conf
+
+    pkg install ap24-mod_wsgi4
+    pkg install git
+
+=== Clone the webui repository
+
+    cd /usr/local/www/apache24/
+    git clone https://github.com/koniu/recoll-webui.git recoll-webui-master
+
+Important: most input handler helper applications (e.g. 'pdftotext') are
+installed in '/usr/local/bin' which is not in the PATH as seen by Apache
+(at least on DragonFly). The simplest way to fix this is to modify the
+launcher module for the webui app so that it fixes the PATH.
+
+Edit 'recoll-webui-master/webui-wsgi.py' and add the following line after
+the 'import os' line:
+
+    os.environ['PATH'] = os.environ['PATH'] + ':' + '/usr/local/bin'
+
+
+
+=== Configure apache
+
+Edit /usr/local/etc/apache24/modules.d/270_mod_wsgi.conf
+
+Uncomment the LoadModule line, and add the directives to alias /recoll/ to
+the webui script.
+
+Change the user (dockes in the example) taking care that he is the one who
+owns the index (.recoll is in his home directory).
+
+Contents of the file:
+
+    ## $FreeBSD$
+    ## vim: set filetype=apache:
+    ##
+    ## module file for mod_wsgi
+    ##
+    ## PROVIDE: mod_wsgi
+    ## REQUIRE:
+    
+    LoadModule wsgi_module        libexec/apache24/mod_wsgi.so
+    
+    WSGIDaemonProcess recoll user=dockes group=dockes \
+        threads=1 processes=5 display-name=%{GROUP} \
+        python-path=/usr/local/www/apache24/recoll-webui-master/
+    WSGIScriptAlias /recoll /usr/local/www/apache24/recoll-webui-master/webui-wsgi.py
+    
+    <Directory /usr/local/www/apache24/recoll-webui-master>
+            WSGIProcessGroup recoll
+            Require all granted
+    </Directory>
+
+=== Restart apache
+
+As root:
+
+    apachectl restart
+
+