= Recoll WebUI Apache installation from scratch

The https://github.com/koniu/recoll-webui[Recoll WebUI] offers an
alternative, web-based interface for querying a Recoll index.

It can be quite useful for extending the use of a shared index to multiple
workstations, without the need for a local Recoll installation and shared
data storage.

The Recoll WebUI is based on the
http://bottlepy.org/docs/dev/index.html[Bottle Python framework], which has
a built-in web server, and the simplest deployment approach is to run it
standalone. However, the built-in server is restricted to handling one
request at a time, which is problematic in multi-user situations,
especially because some requests, like extracting a result list into a CSV
file, can take a significant amount of time.

The Bottle framework can work with several multi-threading Python HTTP
server libraries but, given the limitations of the Recoll Python module
and of the Python interpreter itself, this will not yield optimal
performance and, in particular, cannot make efficient use of the now
ubiquitous multiprocessor machines.

In multi-user situations, you can get better performance and ease of use
from the Recoll WebUI by running it under Apache rather than as a
standalone process. With this approach, a few requests per second can
easily be handled even in the presence of long-running ones.

Neither Recoll nor the WebUI is optimized for high multi-user load, and it
would be very unwise to use them as the search interface to a busy web
site.

The instructions for using the WebUI under Apache given in the
repository README are a bit terse and are missing a few details,
especially ones which impact performance.

What follows is a synopsis of two WebUI installations on initially
Apache-less Ubuntu (14.04) and DragonFly BSD systems. The first should
extend easily to other Debian-based systems, the second at least to
FreeBSD. RPM-based systems are left as an exercise for the reader, at least
for now.

CAUTION: THE CONFIGURATIONS DESCRIBED HAVE NO ACCESS CONTROL. ANYONE WITH
ACCESS TO THE NETWORK WHERE THE SERVER IS LOCATED CAN RETRIEVE ANY
DOCUMENT.

== On a Debian/Ubuntu system

=== Install recoll

    sudo apt-get install recoll python-recoll

Configure the indexing and check that the normal search works. (I spent
quite a lot of time trying to understand why the WebUI did not work, when
in fact it was the normal recoll configuration which was broken and the
regular search did not work either.)

Take care to be logged in as the user you want to run the web search as
while you do this.

=== Install the WebUI

Clone the GitHub repository, or extract the master tar archive, and
move it to '/var/www/recoll-webui-master/'. Make sure that it is
readable and executable by your user.
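
The access requirement above can be checked with a few lines of Python. This is only a sketch, not part of the WebUI: the function name `missing_access` is made up for the example, and `os.access` tests the permissions of the user running the script, so it should be run as the user who will own the web search.

```python
import os


def missing_access(root):
    """Return the paths under root that the current user cannot use:
    directories need read and traverse permission, files need read."""
    if not os.access(root, os.R_OK | os.X_OK):
        # Nothing below an inaccessible (or missing) root can be checked.
        return [root]
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK | os.X_OK):
                problems.append(path)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK):
                problems.append(path)
    return problems
```

Calling `missing_access('/var/www/recoll-webui-master')` should return an empty list when the tree is usable. Any path it returns needs its permissions adjusted.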

=== Install Apache and mod-wsgi

    sudo apt-get install apache2 libapache2-mod-wsgi

I then got the following message:

    AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message

To clear it, I added a ServerName directive to the Apache configuration.
You may not need it: things work without this fix anyway, it only
suppresses the message. Edit
'/etc/apache2/sites-available/000-default.conf' and add the following at
the top (globally). You will probably need to adjust the address or use a
real host name:

    ServerName 192.168.4.6

Edit '/etc/apache2/mods-enabled/wsgi.conf' and add the following at the end of
the "IfModule" section.

Replace the user ('dockes' in the example) with the one who owns the
index (the '.recoll' directory is in that user's home directory).

    WSGIDaemonProcess recoll user=dockes group=dockes \
        threads=1 processes=5 display-name=%{GROUP} \
        python-path=/var/www/recoll-webui-master
    WSGIScriptAlias /recoll /var/www/recoll-webui-master/webui-wsgi.py
    <Directory /var/www/recoll-webui-master>
        WSGIProcessGroup recoll
        Order allow,deny
        allow from all
    </Directory>

NOTE: the Recoll WebUI application is mostly single-threaded, so it is of
little use (and may actually be counter-productive in some cases) to
specify multiple threads on the WSGIDaemonProcess line. Specify multiple
processes instead to put multiple CPUs to work on simultaneous requests.
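
Since the user named on the WSGIDaemonProcess line must be the one who owns the index, it may be worth checking the ownership before restarting Apache. The following is a minimal sketch under the assumption stated above, that the index lives in '.recoll' in that user's home directory. The names `owner_name` and `index_owned_by` are invented for the example.

```python
import os
import pwd


def owner_name(path):
    """Return the login name of the user who owns path."""
    return pwd.getpwuid(os.stat(path).st_uid).pw_name


def index_owned_by(user, confdir=None):
    """Tell whether the Recoll configuration directory belongs to user.

    confdir defaults to '.recoll' in the user's home directory, which is
    an assumption taken from the text, not something Recoll enforces.
    """
    if confdir is None:
        confdir = os.path.join(pwd.getpwnam(user).pw_dir, ".recoll")
    return os.path.isdir(confdir) and owner_name(confdir) == user
```

With the example configuration, `index_owned_by('dockes')` should return True. If it returns False, either the directory is missing or it belongs to another user.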

Then run the following to restart Apache:

    sudo apachectl restart

The Recoll WebUI should now be accessible at 'http://my.server.com/recoll/'.

NOTE: You need a '/' at the end of the URL used to access
the search (use 'http://my.server.com/recoll/', not
'http://my.server.com/recoll'). Otherwise files other than the script itself
are not found (the page looks weird and the search does not work).
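
The likely reason for the trailing '/' requirement is the way browsers resolve relative links: without the final '/', the last path component is treated as a file name and replaced. The sketch below shows this with the standard `urllib.parse.urljoin`. The link 'static/style.css' is a hypothetical example, not necessarily a path the WebUI uses.

```python
from urllib.parse import urljoin

# Hypothetical relative link, as it might appear in the search page.
RELATIVE_LINK = "static/style.css"


def resolve(page_url, link=RELATIVE_LINK):
    """Resolve a relative link against the URL of the page holding it."""
    return urljoin(page_url, link)


print(resolve("http://my.server.com/recoll/"))
# http://my.server.com/recoll/static/style.css
print(resolve("http://my.server.com/recoll"))
# http://my.server.com/static/style.css
```

In the second case the request goes to a location outside the '/recoll' alias, so Apache does not pass it to the WebUI.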

CAUTION: THERE IS NO ACCESS CONTROL. ANYONE WITH ACCESS TO THE NETWORK
WHERE THE SERVER IS LOCATED CAN RETRIEVE ANY DOCUMENT.

== Variant for BSD/ports

=== Packages

As root:

    pkg install recoll

Configure the indexing and check that the normal search works.

Take care to be logged in as the user you want to run the web search as
while you do this.

    pkg install apache24

Add apache24_enable="YES" to '/etc/rc.conf'.

    pkg install ap24-mod_wsgi4
    pkg install git

=== Clone the webui repository

    cd /usr/local/www/apache24/
    git clone https://github.com/koniu/recoll-webui.git recoll-webui-master

IMPORTANT: most input handler helper applications (e.g. 'pdftotext') are
installed in '/usr/local/bin', which is not in the PATH as seen by Apache
(at least on DragonFly). The simplest way to fix this is to modify the
launcher module for the webui app so that it fixes the PATH.

Edit 'recoll-webui-master/webui-wsgi.py' and add the following line after
the 'import os' line:

    os.environ['PATH'] = os.environ['PATH'] + ':' + '/usr/local/bin'
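
The one-line fix is enough in practice. A slightly more defensive variant is sketched below, in case the PATH variable is not set at all or the directory is already listed. The helper name `add_to_path` is made up for the example and is not part of the WebUI.

```python
import os


def add_to_path(directory, environ=os.environ):
    """Append directory to PATH unless it is already listed.

    Works when PATH is unset or empty, and can be called repeatedly
    without growing the variable.
    """
    entries = [e for e in environ.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        entries.append(directory)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]
```

In 'webui-wsgi.py', the call would be `add_to_path('/usr/local/bin')`, placed after the 'import os' line like the original statement.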

=== Configure apache

Edit '/usr/local/etc/apache24/modules.d/270_mod_wsgi.conf'.

Uncomment the LoadModule line, and add the directives to alias /recoll/ to
the webui script.

Replace the user ('dockes' in the example) with the one who owns the
index (the '.recoll' directory is in that user's home directory).

Contents of the file:

    ## $FreeBSD$
    ## vim: set filetype=apache:
    ##
    ## module file for mod_wsgi
    ##
    ## PROVIDE: mod_wsgi
    ## REQUIRE:

    LoadModule wsgi_module libexec/apache24/mod_wsgi.so

    WSGIDaemonProcess recoll user=dockes group=dockes \
        threads=1 processes=5 display-name=%{GROUP} \
        python-path=/usr/local/www/apache24/recoll-webui-master/
    WSGIScriptAlias /recoll /usr/local/www/apache24/recoll-webui-master/webui-wsgi.py

    <Directory /usr/local/www/apache24/recoll-webui-master>
        WSGIProcessGroup recoll
        Require all granted
    </Directory>

=== Restart apache

As root:

    apachectl restart