None
open
nobody
None
2018-01-02
2017-09-26
Harri T.
No

Create a configuration option to enable indexing only when the computer is idle (powered on but not in active use).

Discussion

  • Harri T.
    2017-09-27

    And because Recoll is a desktop search engine, a better definition of "idle" would be "logged in to an X session, with no input device activity for a certain amount of time".

     
  • medoc
    2017-10-14

    Sorry for the long delay in answering, this has been a complicated month.

    While I agree with you in principle, I don't really know how to implement this. Do you know if there is an API somewhere for 'no input device activity since xx'? Preferably desktop-agnostic and working with the new non-X11 stuff too.

    Otherwise, the indexer is both niced and ioniced to the bottom, so the only way it is going to cause problems is by using too much memory. This is a rare occurrence on current desktops (it needs really pathological files). If you have an example of such a file, I'd be interested.

    Also, you can set filtermaxmbytes to limit the amount of memory used by data extractors (the xls extractor, for example, does have a tendency to go wild on certain files, before failing anyway).

    http://www.lesbonscomptes.com/recoll/usermanual/webhelp/docs/RCL.INSTALL.CONFIG.RECOLLCONF.PERFS.html

     
  • medoc
    2017-11-29

    • status: open --> closed
    • milestone: -->
     
  • medoc
    2017-11-29

    • status: closed --> wont-fix
     
  • Anonymous
    2018-01-01

    Even though I have an 8-core Xeon CPU and an M.2 SSD, recollindex causes heavy load every now and then:

    top - 11:03:49 up  1:07,  1 user,  load average: 1,32, 1,32, 1,69
    Tasks: 318 total,   2 running, 316 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  0,7 us,  0,2 sy, 12,5 ni, 86,4 id,  0,0 wa,  0,0 hi,  0,2 si,  0,0 st
    KiB Mem : 16247588 total,   880188 free,  4552976 used, 10814424 buff/cache
    KiB Swap: 16410620 total, 16272636 free,   137984 used. 10970816 avail Mem
    
      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                            
    29332 harri     39  19 1537772 1,382g   7612 R 100,0  8,9   7:12.95 recollindex                                                        
      778 root      20   0  659528 119008 107168 S   3,3  0,7   2:50.81 Xorg                                                               
     2967 harri     20   0  578624  25856  17824 S   2,0  0,2   0:00.51 xfce4-terminal                                                     
     5015 harri     20   0 1191248 116148  65156 S   1,3  0,7   0:17.30 opera                                                              
     2977 harri     20   0  504488  25080  17092 S   0,7  0,2   0:00.34 xfce4-terminal                                                     
     2797 harri     20   0  342036  12136   8156 S   0,3  0,1   0:00.50 xfce4-session                                                      
     2819 harri     20   0  199768  21808  15288 S   0,3  0,1   0:15.58 xfwm4                                                              
     2823 harri     20   0  436172  26612  15288 S   0,3  0,2   0:03.39 xfce4-panel                                                        
     9577 root      20   0       0      0      0 S   0,3  0,0   0:00.04 kworker/2:2          
    ...
    

    The CPU fan gets louder and sometimes even the mouse becomes unusable. That's why I prefer indexing only when idle.

    recoll -v
    Recoll 1.23.6 + Xapian 1.4.3
    
     
    Last edit: Anonymous 2018-01-01
  • medoc
    2018-01-02

    • status: wont-fix --> open
     
  • medoc
    2018-01-02

    It's normal for recollindex to use 100% CPU when working. Your machine does not seem overloaded, though (86% idle, no I/O waits, low load average).

    The most probable issue is that some process became or still is very big (the machine did use swap at some point despite having 16GB of memory). The freezes you observe probably happen while needed memory is swapped in.

    The big process could be recollindex (not the case in the screenshot, but could happen), or one of the auxiliary document format handlers, or something else. Recollindex does appear to be unusually big. My best guess, but it's only a guess, is that a specific document is getting one of the handlers and/or recollindex to grow a lot.

    I'd try to see if there are big processes around when the problem happens, and we can take it from there. If memory eviction is the culprit, indexing while idle would probably not fix the problem anyway, as the data will need to be swapped in when you need it.

    You can also limit the amount of CPU that recollindex can use by limiting the number of threads, but I would be surprised if this was the issue (though it will also indirectly curb memory usage). https://www.lesbonscomptes.com/recoll/usermanual/webhelp/docs/RCL.INDEXING.CONFIG.THREADS.html

    Maybe try something like the following in recoll.conf:

    thrQSizes = 2 2 2
    thrTCounts =  4 2 1
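
For the suggestion above to look for big processes when the freezes happen, here is a minimal Python sketch. It is not part of Recoll. It assumes a Linux system where /proc/<pid>/status exposes a VmRSS line, and the function names parse_rss_kib and biggest_processes are made up for illustration.

```python
import os
import re


def parse_rss_kib(status_text):
    """Return (name, rss_kib) from the text of /proc/<pid>/status, or None."""
    name = re.search(r"^Name:\s+(.+)$", status_text, re.MULTILINE)
    rss = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    if not name or not rss:
        return None  # kernel threads, for example, have no VmRSS line
    return name.group(1), int(rss.group(1))


def biggest_processes(limit=5, proc_root="/proc"):
    """Return up to `limit` tuples (rss_kib, pid, name), largest resident set first."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no /proc here (assumption: then there is nothing to report)
    rows = []
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directory names are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                parsed = parse_rss_kib(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
        if parsed:
            rows.append((parsed[1], int(entry), parsed[0]))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for rss_kib, pid, name in biggest_processes():
        print(f"{pid:>7} {rss_kib / 1024:>10.1f} MiB  {name}")
```

Running this from a terminal while the mouse is sluggish should show whether recollindex, one of the document handlers, or something unrelated is holding the most memory at that moment.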
    
     

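On the question of an API for "no input since xx": one possible starting point, sketched below in Python, is the separate xprintidle utility, which prints the X11 idle time in milliseconds. This is only an assumption-laden sketch: it is X11-only (so it does not meet the desktop-agnostic wish), it assumes xprintidle is installed, and the names read_idle_ms, should_index, and IDLE_THRESHOLD_S are hypothetical, not Recoll options.

```python
import shutil
import subprocess

IDLE_THRESHOLD_S = 300  # hypothetical: five minutes without input counts as idle


def read_idle_ms():
    """Return the X11 idle time in milliseconds via xprintidle, or None if unavailable."""
    if shutil.which("xprintidle") is None:
        return None  # utility not installed
    try:
        out = subprocess.run(
            ["xprintidle"], capture_output=True, text=True, timeout=5
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    text = out.stdout.strip()
    if out.returncode != 0 or not text.isdigit():
        return None  # for example, no X display is reachable
    return int(text)


def should_index(idle_ms, threshold_s=IDLE_THRESHOLD_S):
    """True only when the idle time is known and at least threshold_s seconds."""
    return idle_ms is not None and idle_ms >= threshold_s * 1000


if __name__ == "__main__":
    idle = read_idle_ms()
    print("idle ms:", idle, "-> index now:", should_index(idle))
```

A wrapper script could poll should_index(read_idle_ms()) and start the indexer only when it returns True. When the idle time cannot be determined, this sketch deliberately answers "do not index".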