|
a/website/idxthreads/forkingRecoll.txt |
|
b/website/idxthreads/forkingRecoll.txt |
|
... |
|
... |
5 |
|
5 |
|
6 |
== Abstract
|
6 |
== Abstract
|
7 |
|
7 |
|
8 |
== Introduction
|
8 |
== Introduction
|
9 |
|
9 |
|
10 |
Recoll is a big process which executes many others, mostly for extracting
|
10 |
The Recoll indexer, *recollindex*, is a big process which executes many
|
11 |
text from documents. Some of the executed processes are quite short-lived,
|
11 |
others, mostly for extracting text from documents. Some of the executed
|
12 |
and the time used by the process execution machinery can actually dominate
|
12 |
processes are quite short-lived, and the time used by the process execution
|
13 |
the time used to translate data. This document explores possible approaches
|
13 |
machinery can actually dominate the time used to translate data. This
|
14 |
to improving performance without adding excessive complexity or damaging
|
14 |
document explores possible approaches to improving performance without
|
15 |
reliability.
|
15 |
adding excessive complexity or damaging reliability.
|
16 |
|
16 |
|
17 |
Studying fork/exec performance is not exactly a new venture, and there are
|
17 |
Studying fork/exec performance is not exactly a new venture, and there are
|
18 |
many texts which address the subject. While researching, though, I found
|
18 |
many texts which address the subject. While researching, though, I found
|
19 |
out that not so many were accurate and that a lot of questions were left as
|
19 |
out that not so many were accurate and that a lot of questions were left as
|
20 |
an exercise to the reader.
|
20 |
an exercise to the reader.
|
|
... |
|
... |
30 |
|
30 |
|
31 |
+exec()+ then replaces part of the newly executing process with an address
|
31 |
+exec()+ then replaces part of the newly executing process with an address
|
32 |
space initialized from an executable file, inheriting some of the resources
|
32 |
space initialized from an executable file, inheriting some of the resources
|
33 |
under various conditions.
|
33 |
under various conditions.
|
34 |
|
34 |
|
35 |
As processes became bigger the copy-before-discard operation wasted
|
35 |
This was all fine with the small processes of the first Unix systems, but
|
36 |
significant resources, and was optimized using two methods (at very
|
36 |
as time progressed, processes became bigger and the copy-before-discard
|
37 |
different points in time):
|
37 |
operation was found to waste significant resources. It was optimized using
|
|
|
38 |
two methods (at very different points in time):
|
38 |
|
39 |
|
39 |
- The first approach was to supplement +fork()+ with the +vfork()+ call, which
|
40 |
- The first approach was to supplement +fork()+ with the +vfork()+ call, which
|
40 |
is similar but does not duplicate the address space: the new process
|
41 |
is similar but does not duplicate the address space: the new process
|
41 |
thread executes in the old address space. The old thread is blocked
|
42 |
thread executes in the old address space. The old thread is blocked
|
42 |
until the new one calls +exec()+ and frees up access to the memory
|
43 |
until the new one calls +exec()+ and frees up access to the memory
|
|
... |
|
... |
174 |
a single thread, and +fork()+ if it ran multiple ones.
|
175 |
a single thread, and +fork()+ if it ran multiple ones.
|
175 |
|
176 |
|
176 |
After another careful look at the code, I could see few issues with
|
177 |
After another careful look at the code, I could see few issues with
|
177 |
using +vfork()+ in the multithreaded indexer, so this was committed.
|
178 |
using +vfork()+ in the multithreaded indexer, so this was committed.
|
178 |
|
179 |
|
179 |
The only change necessary was to get rid on an implementation of the
|
180 |
The only change necessary was to get rid of an implementation of the
|
180 |
lacking Linux +closefrom()+ call (used to close all open descriptors above a
|
181 |
lacking Linux +closefrom()+ call (used to close all open descriptors above a
|
181 |
given value). The previous Recoll implementation listed the +/proc/self/fd+
|
182 |
given value). The previous Recoll implementation listed the +/proc/self/fd+
|
182 |
directory to look for open descriptors but this was unsafe because of
|
183 |
directory to look for open descriptors but this was unsafe because of
|
183 |
possible memory allocations in +opendir()+ etc.
|
184 |
possible memory allocations in +opendir()+ etc.
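
To illustrate the constraint, here is a minimal sketch (not the actual Recoll
code) of a +vfork()+ child which closes inherited descriptors without going
anywhere near the memory allocator. The +run_command()+ name, the starting
descriptor value of 3 and the 65536 cap are assumptions made for the example.
It also relies on the usual Linux behaviour, where the child can call
+close()+ and +execv()+ before giving the address space back to the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (a full path) and return the
 * waitpid() status, or -1 if the child could not be created or reaped. */
static int run_command(char *const argv[])
{
    /* Compute the descriptor limit in the parent, before vfork():
     * the child should restrict itself to simple system calls. */
    long lim = sysconf(_SC_OPEN_MAX);
    /* Assumption: cap the scan so that a very large limit does not make
     * the loop slow. Descriptors above the cap would not be closed. */
    int maxfd = (lim > 0 && lim < 65536) ? (int)lim : 65536;
    int fd;
    int status;

    pid_t pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: runs in the parent's address space until execv().
         * No opendir(), no malloc(), no stdio in here. */
        for (fd = 3; fd < maxfd; fd++)
            (void)close(fd);      /* EBADF on unused slots is harmless */
        execv(argv[0], argv);
        _exit(127);               /* only reached if execv() failed */
    }

    /* Parent: resumes once the child has called execv() or _exit(). */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

The brute-force loop makes many useless +close()+ calls, but each one is a
cheap system call, and nothing in the child can take a lock or allocate
memory while the parent thread is suspended.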
|
184 |
|
185 |
|
|
... |
|
... |
198 |
No surprise here, given the implementation of +posix_spawn()+, it gets the
|
199 |
No surprise here, given the implementation of +posix_spawn()+, it gets the
|
199 |
same times as the +fork()+/+vfork()+ options.
|
200 |
same times as the +fork()+/+vfork()+ options.
|
200 |
|
201 |
|
201 |
The tests were performed on an Intel Core i5 750 (4 cores, 4 threads).
|
202 |
The tests were performed on an Intel Core i5 750 (4 cores, 4 threads).
|
202 |
|
203 |
|
203 |
The last line is just for the fun: *recollindex* 1.18 (single-threaded)
|
|
|
204 |
needed almost 6 times as long to process the same files...
|
|
|
205 |
|
|
|
206 |
It would be painful to play it safe and discard the 60% reduction in
|
204 |
It would be painful to play it safe and discard the 60% reduction in
|
207 |
execution time offered by using +vfork()+.
|
205 |
execution time offered by using +vfork()+, so this was adopted for Recoll
|
208 |
|
|
|
209 |
To this day, no problems were discovered, but, still crossing fingers...
|
206 |
1.21. To this day, no problems were discovered, but, still crossing
|
|
|
207 |
fingers...
|
|
|
208 |
|
|
|
209 |
The last line in the table is just for the fun: *recollindex* 1.18
|
|
|
210 |
(single-threaded) needed almost 6 times as long to process the same
|
|
|
211 |
files...
|
210 |
|
212 |
|
211 |
////
|
213 |
////
|
212 |
Objections to vfork:
|
214 |
Objections to vfork:
|
213 |
sigaction locks
|
215 |
sigaction locks
|
214 |
https://bugzilla.redhat.com/show_bug.cgi?id=193631
|
216 |
https://bugzilla.redhat.com/show_bug.cgi?id=193631
|