Create configuration option for enabling indexing only when the computer is idle (powered on but not in active use).
Discussion
-
Harri T.
2017-09-27
And because Recoll is a desktop search engine, a better definition of "idle" would be "logged in to an X session, but with no input device used for a certain amount of time".
-
medoc
2017-10-14
Sorry for the long delay in answering, this has been a complicated month.
While I agree with you in principle, I don't really know how to implement this. Do you know if there is an API somewhere for 'no input device used since xx'? Preferably desktop-agnostic and working with the new non-X11 stuff too.
Otherwise, the indexer is both niced and ioniced to the bottom, so the only way it is going to cause problems is by using too much memory. This is a rare occurrence on current desktops (it needs really pathological files). If you have an example of such a file, I'd be interested.
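To illustrate what "niced and ioniced to the bottom" means in practice, here is a minimal Python sketch of a process lowering its own CPU and I/O priority. It is not Recoll's own code: the helper names are hypothetical, and it assumes a Linux system with the util-linux ionice tool installed.

```python
import os

IOPRIO_CLASS_IDLE = 3  # the "idle" scheduling class accepted by `ionice -c`


def lower_cpu_priority(increment=19):
    """Add `increment` to this process's niceness and return the new value."""
    return os.nice(increment)


def ionice_idle_command(pid):
    """Build the `ionice` command line that moves `pid` to the idle I/O class."""
    return ["ionice", "-c", str(IOPRIO_CLASS_IDLE), "-p", str(pid)]
```

A caller could run `subprocess.run(ionice_idle_command(os.getpid()), check=True)` after `lower_cpu_priority()`. A process set up this way only gets CPU and disk time that nothing else wants, which is why memory is the remaining way for it to cause trouble.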
Also, you can set filtermaxmbytes to limit the amount of memory used by data extractors (the xls extractor, for example, does have a tendency to go wild on certain files, before failing anyway).
-
medoc
2017-11-29
Thought a bit more about this, and actually, I think that there are potentially many conditions under which people would want the indexer to be active or not. Consequently, I think that it's better to handle the issue externally (in a script).
Someone contributed a script to turn the indexer on/off depending on battery/AC power status: https://www.lesbonscomptes.com/recoll/faqsandhowtos/IndexOnAc.html
I think that this could relatively easily be adapted to start/stop the indexer depending on the screen saver status for example (the screen saver is supposedly good at detecting user activity). See for example https://unix.stackexchange.com/questions/197032/detect-if-screensaver-is-active#214074 for determining screen saver status.
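As a rough idea of what such an external script could look like, here is a minimal Python sketch that starts the indexer when the session has been idle for a while and stops it when the user comes back. Several things here are assumptions and not part of Recoll: the idle probe is the X11-only xprintidle tool (which prints the idle time in milliseconds), the 5-minute threshold and 30-second poll interval are arbitrary, the function names are hypothetical, and the indexer is assumed to exit cleanly on SIGTERM. A screen saver query could be substituted for the probe command.

```python
import subprocess
import time

IDLE_THRESHOLD_MS = 5 * 60 * 1000  # assumption: 5 minutes without input means "idle"


def read_idle_ms(cmd=("xprintidle",)):
    """Run the idle probe and return the milliseconds since the last user input."""
    result = subprocess.run(list(cmd), capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


def next_action(idle_ms, indexer_running, threshold_ms=IDLE_THRESHOLD_MS):
    """Return 'start', 'stop', or None when nothing needs to change."""
    idle = idle_ms >= threshold_ms
    if idle and not indexer_running:
        return "start"
    if not idle and indexer_running:
        return "stop"
    return None


def supervise(indexer_cmd=("recollindex",), poll_seconds=30):
    """Poll the idle probe forever, starting and stopping the indexer as needed."""
    proc = None
    while True:
        running = proc is not None and proc.poll() is None
        action = next_action(read_idle_ms(), running)
        if action == "start":
            proc = subprocess.Popen(list(indexer_cmd))
        elif action == "stop":
            proc.terminate()  # sends SIGTERM to the indexer
            proc.wait()
        time.sleep(poll_seconds)
```

Calling `supervise()` from a session autostart entry would be one way to use it. Note that once a batch indexing pass finishes, this loop launches another one at the next poll for as long as the machine stays idle, so a longer poll interval may be preferable.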
-
medoc
2017-11-29
- status: open --> closed
- milestone: -->
-
medoc
2017-11-29
- status: closed --> wont-fix
-
Anonymous
2018-01-01
Even though I have an 8-core Xeon CPU and an M.2 SSD, recollindex causes heavy load every now and then:

top - 11:03:49 up 1:07, 1 user, load average: 1,32, 1,32, 1,69
Tasks: 318 total, 2 running, 316 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0,7 us, 0,2 sy, 12,5 ni, 86,4 id, 0,0 wa, 0,0 hi, 0,2 si, 0,0 st
KiB Mem : 16247588 total, 880188 free, 4552976 used, 10814424 buff/cache
KiB Swap: 16410620 total, 16272636 free, 137984 used. 10970816 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
29332 harri     39  19 1537772 1,382g   7612 R 100,0  8,9   7:12.95 recollindex
  778 root      20   0  659528 119008 107168 S   3,3  0,7   2:50.81 Xorg
 2967 harri     20   0  578624  25856  17824 S   2,0  0,2   0:00.51 xfce4-terminal
 5015 harri     20   0 1191248 116148  65156 S   1,3  0,7   0:17.30 opera
 2977 harri     20   0  504488  25080  17092 S   0,7  0,2   0:00.34 xfce4-terminal
 2797 harri     20   0  342036  12136   8156 S   0,3  0,1   0:00.50 xfce4-session
 2819 harri     20   0  199768  21808  15288 S   0,3  0,1   0:15.58 xfwm4
 2823 harri     20   0  436172  26612  15288 S   0,3  0,2   0:03.39 xfce4-panel
 9577 root      20   0       0      0      0 S   0,3  0,0   0:00.04 kworker/2:2
...

The CPU fan gets louder and sometimes even the mouse becomes unusable. That's why I prefer indexing only when idle.

recoll -v
Recoll 1.23.6 + Xapian 1.4.3
Last edit: Anonymous 2018-01-01
-
medoc
2018-01-02
- status: wont-fix --> open
-
medoc
2018-01-02
It's normal that recollindex will use 100% cpu when working. Your machine does not seem overloaded though (86% idle, no io waits, low load average).
The most probable issue is that some process became or still is very big (the machine did use swap at some point despite having 16GB of memory). The freezes you observe probably happen while needed memory is swapped in.
The big process could be recollindex (not the case in the screenshot, but could happen), or one of the auxiliary document format handlers, or something else. Recollindex does appear to be unusually big. My best guess, but it's only a guess, is that a specific document is getting one of the handlers and/or recollindex to grow a lot.
I'd try to see if there are big processes around when the problem happens, and we can take it from there. If memory eviction is the culprit, indexing while idle would probably not fix the problem anyway, as the data will need to be swapped in when you need it.
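One way to look for big processes when the problem shows up is to list the largest resident sizes from /proc. The following is only a sketch, assuming Linux; the function names are hypothetical, and plain top sorted by memory gives the same information interactively.

```python
import os


def rss_kib_from_status(status_text):
    """Return the VmRSS value (in KiB) from /proc/<pid>/status text, or 0 if absent."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0  # kernel threads have no VmRSS line


def name_from_status(status_text):
    """Return the process name from /proc/<pid>/status text, or '?' if absent."""
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            parts = line.split(None, 1)
            return parts[1].strip() if len(parts) > 1 else "?"
    return "?"


def biggest_processes(limit=5, proc_root="/proc"):
    """Return up to `limit` tuples (rss_kib, pid, name), largest resident size first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                text = f.read()
        except OSError:
            continue  # the process exited or is not readable
        rows.append((rss_kib_from_status(text), int(entry), name_from_status(text)))
    rows.sort(reverse=True)
    return rows[:limit]
```

Running `for rss, pid, name in biggest_processes(): print(rss, pid, name)` during a freeze should show whether recollindex, one of the document handlers, or something unrelated is the large process.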
You can also limit the amount of cpu which recollindex can use by limiting the number of threads, but I would be surprised if this was the issue (though it will also indirectly curb memory usage). https://www.lesbonscomptes.com/recoll/usermanual/webhelp/docs/RCL.INDEXING.CONFIG.THREADS.html
Maybe try something like the following in recoll.conf:
thrQSizes = 2 2 2
thrTCounts = 4 2 1