= Recoll WebUI Apache installation from scratch

The https://github.com/koniu/recoll-webui[Recoll WebUI] offers an
alternative, web-based interface for querying a Recoll index.

It is a convenient way to extend the use of a shared index to multiple
workstations, without the need for a local Recoll installation and shared
data storage.

The Recoll WebUI is based on the
http://bottlepy.org/docs/dev/index.html[Bottle Python framework], which has
a built-in web server, so the simplest deployment approach is to run it
standalone. However, the built-in server handles only one request at a
time, which is problematic in multi-user situations, especially because
some requests, like extracting a result list into a CSV file, can take a
significant amount of time.

The Bottle framework can work with several multi-threaded Python HTTP
server libraries but, given the limitations of the Recoll Python module
and of the Python interpreter itself, this will not yield optimal
performance and, in particular, cannot make efficient use of the now
ubiquitous multiprocessor machines.

In multi-user situations, you can get better performance and ease of use
from the Recoll WebUI by running it under Apache rather than as a
standalone process. With this approach, a few requests per second can
easily be handled even in the presence of long-running ones.

Neither Recoll nor the WebUI is optimized for high multi-user load, and it
would be very unwise to use them as the search interface to a busy web
site.

The instructions for using the WebUI under Apache given in the repository
README are a bit terse and are missing a few details, especially ones which
impact performance.

What follows is a synopsis of two WebUI installations, on initially
Apache-less Ubuntu (14.04) and DragonFly BSD systems. The first should
extend easily to other Debian-based systems, the second at least to
FreeBSD. rpm-based systems are left as an exercise for the reader, at least
for now...

CAUTION: THE CONFIGURATIONS DESCRIBED HAVE NO ACCESS CONTROL. ANYONE WITH
ACCESS TO THE NETWORK WHERE THE SERVER IS LOCATED CAN RETRIEVE ANY
DOCUMENT.

== On a Debian/Ubuntu system

=== Install Recoll

    sudo apt-get install recoll python-recoll

Configure the indexing and check that the normal search works (I spent
quite a lot of time trying to understand why the WebUI did not work, when
in fact it was the normal Recoll configuration which was broken and the
regular search did not work either).

Take care to be logged in as the user you want to run the web search as
while you do this.
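A few lines of Python can confirm that the account you are logged in as can actually reach the index. This is only a sketch: it assumes the default layout, with the configuration in '~/.recoll' and the index in its 'xapiandb' subdirectory, which will not hold if you set RECOLL_CONFDIR or changed 'dbdir'. The function name 'check_index' is made up for this illustration.

```python
import os


def check_index(confdir=None):
    """Return a list of problems found with the Recoll index location.

    Assumption: default layout, i.e. the configuration lives in ~/.recoll
    and the index in its 'xapiandb' subdirectory.
    """
    if confdir is None:
        confdir = os.path.join(os.path.expanduser("~"), ".recoll")
    dbdir = os.path.join(confdir, "xapiandb")
    problems = []
    for path in (confdir, dbdir):
        if not os.path.isdir(path):
            problems.append("missing directory: " + path)
        elif not os.access(path, os.R_OK | os.X_OK):
            # Both read and search permission are needed to open the index.
            problems.append("not readable by this user: " + path)
    return problems
```

Run as the user you want the web search to run as, 'check_index()' returning an empty list means that both directories exist and are readable. It does not replace running a real query, it only rules out the most basic ownership mistakes.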

=== Install the WebUI

Clone the GitHub repository, or extract the master tar archive, and
move it to '/var/www/recoll-webui-master/'. Take care that it is read/execute
accessible by your user.

=== Install Apache and mod-wsgi

    sudo apt-get install apache2 libapache2-mod-wsgi

I then got the following message:

    AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message

To clear it, I added a ServerName directive to the Apache configuration;
you may not need it. Things work without this fix anyway, it only
suppresses the message. Edit
'/etc/apache2/sites-available/000-default.conf' and add the following at
the top (globally). You probably need to adjust the address or use a real
host name:

    ServerName 192.168.4.6

Edit '/etc/apache2/mods-enabled/wsgi.conf' and add the following at the end
of the "IfModule" section.

Change the user ('dockes' in the example), taking care that it is the
account which owns the index ('.recoll' is in its home directory).

    WSGIDaemonProcess recoll user=dockes group=dockes \
        threads=1 processes=5 display-name=%{GROUP} \
        python-path=/var/www/recoll-webui-master
    WSGIScriptAlias /recoll /var/www/recoll-webui-master/webui-wsgi.py
    <Directory /var/www/recoll-webui-master>
            WSGIProcessGroup recoll
            Order allow,deny
            allow from all
    </Directory>

NOTE: The Recoll WebUI application is mostly single-threaded, so it is of
little use (and may actually be counter-productive in some cases) to
specify multiple threads on the WSGIDaemonProcess line. Specify multiple
processes instead to put multiple CPUs to work on simultaneous requests.

Then run the following to restart Apache:

    sudo apachectl restart

The Recoll WebUI should now be accessible at 'http://my.server.com/recoll/'.

NOTE: You need a '/' at the end of the URL used to access the search
(use 'http://my.server.com/recoll/', not 'http://my.server.com/recoll'),
else files other than the script itself are not found (the page looks
weird and the search does not work).
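If nothing comes up at all, it can help to find out whether the problem lies with mod_wsgi or with the WebUI itself. The following is a minimal stand-in WSGI script, given as a sketch only: it is not part of the WebUI, and temporarily pointing the WSGIScriptAlias line at it is my suggestion, not something from the README. It reports the user id and PATH seen by the daemon process, and how the request URL was split.

```python
import os


def application(environ, start_response):
    """Tiny WSGI app; mod_wsgi looks for a callable named 'application'."""
    lines = [
        "mod_wsgi is working",
        "uid: %d" % os.getuid(),
        "PATH: %s" % os.environ.get("PATH", ""),
        "SCRIPT_NAME: %s" % environ.get("SCRIPT_NAME", ""),
        "PATH_INFO: %s" % environ.get("PATH_INFO", ""),
    ]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If this page displays, Apache and mod_wsgi are set up correctly and the uid shown should be that of the user named on the WSGIDaemonProcess line. Requesting the URL with and without the final '/' should also show the difference in PATH_INFO which the trailing slash makes.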

CAUTION: THERE IS NO ACCESS CONTROL. ANYONE WITH ACCESS TO THE NETWORK
WHERE THE SERVER IS LOCATED CAN RETRIEVE ANY DOCUMENT.

== Variant for BSD/ports

=== Packages

As root:

    pkg install recoll

Do what you need to do to configure the indexing and check that the normal
search works.

Take care to be logged in as the user you want to run the web search as
while you do this.

Then, as root again:

    pkg install apache24

Add 'apache24_enable="YES"' to '/etc/rc.conf', then install mod_wsgi and
git:

    pkg install ap24-mod_wsgi4
    pkg install git

=== Clone the webui repository

    cd /usr/local/www/apache24/
    git clone https://github.com/koniu/recoll-webui.git recoll-webui-master

IMPORTANT: Most input handler helper applications (e.g. 'pdftotext') are
installed in '/usr/local/bin', which is not in the PATH as seen by Apache
(at least on DragonFly). The simplest way to fix this is to modify the
launcher module for the webui app so that it fixes the PATH.

Edit 'recoll-webui-master/webui-wsgi.py' and add the following line after
the 'import os' line:

    os.environ['PATH'] = os.environ['PATH'] + ':' + '/usr/local/bin'
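The one-line fix fails with a KeyError if PATH is not set at all, and it adds the directory even when it is already listed. A slightly more defensive variant is sketched below. The helper name 'add_to_path' is made up for this example and is not part of the WebUI; note that it also drops empty PATH entries, which is a choice of mine.

```python
import os


def add_to_path(directory, environ=os.environ):
    """Append 'directory' to PATH unless it is already listed."""
    current = environ.get("PATH", "")
    # Split the existing value, ignoring empty entries.
    entries = [entry for entry in current.split(os.pathsep) if entry]
    if directory not in entries:
        entries.append(directory)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]


# Make the helper applications in /usr/local/bin visible to the WebUI.
add_to_path("/usr/local/bin")
```

Like the original line, this has to run before the WebUI code starts any helper application, so it belongs near the top of the launcher script.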

=== Configure Apache

Edit '/usr/local/etc/apache24/modules.d/270_mod_wsgi.conf'.

Uncomment the LoadModule line, and add the directives to alias '/recoll/' to
the webui script.

Change the user ('dockes' in the example), taking care that it is the
account which owns the index ('.recoll' is in its home directory).

Contents of the file:

    ## $FreeBSD$
    ## vim: set filetype=apache:
    ##
    ## module file for mod_wsgi
    ##
    ## PROVIDE: mod_wsgi
    ## REQUIRE:
    
    LoadModule wsgi_module        libexec/apache24/mod_wsgi.so
    
    WSGIDaemonProcess recoll user=dockes group=dockes \
        threads=1 processes=5 display-name=%{GROUP} \
        python-path=/usr/local/www/apache24/recoll-webui-master/
    WSGIScriptAlias /recoll /usr/local/www/apache24/recoll-webui-master/webui-wsgi.py
    
    <Directory /usr/local/www/apache24/recoll-webui-master>
            WSGIProcessGroup recoll
            Require all granted
    </Directory>

=== Restart Apache

As root:

    apachectl restart
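Once Apache has been restarted, on either system, a small script can show whether several requests really are served side by side by the 'processes=5' daemon processes. The following is a rough sketch written for Python 3, not a benchmark: the function names are made up, and what counts as a good result depends on your index and on the requests you send.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url):
    """Fetch one URL and return (HTTP status, elapsed seconds)."""
    start = time.time()
    response = urlopen(url, timeout=60)
    try:
        response.read()
        status = response.getcode()
    finally:
        response.close()
    return status, time.time() - start


def fetch_parallel(url, count=5, fetch_one=fetch):
    """Issue 'count' simultaneous requests and return their results."""
    with ThreadPoolExecutor(max_workers=count) as pool:
        # One worker thread per request, so that all are sent together.
        return list(pool.map(fetch_one, [url] * count))
```

Call 'fetch_parallel' with the URL of your own server, for example 'http://my.server.com/recoll/' (keep the final '/'), and print the returned pairs. If the elapsed times are all close to that of a single request, the requests ran in parallel; if they grow roughly linearly, they were probably handled one after the other.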