Thanks for making recoll. It is really great!
I would like to be able to index removable media and have the index stored within. This issue is discussed here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=784540 and the problem seems to be the lack of support for relative paths for topdir. Has there been progress on this? This would really be a great feature to have.
Discussion
-
medoc
2017-11-29
I have not worked on using relative paths because recoll now has a feature which can replace them, the path translation facility:
https://www.lesbonscomptes.com/recoll/usermanual/webhelp/docs/RCL.SEARCH.PTRANS.html
This is less transparent than volume-relative paths would be, but the latter introduce other issues, and I thought that it was just not worth tackling the complexity.
It might be possible to further automate the path translation setup. I'd be interested in the details of your use case to see if something can be done.
-
Anonymous
2017-12-04
I want to have a self-contained index with my data so that I can share it with http://www.datalad.org/ or git-annex.
The following works, but is not ideal:
- I created a "local repository" in /tmp/recoll.
- I created an index inside this folder with 'recoll -c .recoll'
- I moved the folder to /tmp/recoll2 (imagine this as a cloned repository)
- Searching the index works, but all the paths are wrong. This can be fixed manually with ptrans.
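The manual fix in the last step amounts to rewriting a path prefix. Below is a minimal Python sketch of that idea only. It is not recoll's implementation and does not read or write recoll's ptrans file. The function name `translate_path` is hypothetical, and the two roots are the ones from the steps above.

```python
from pathlib import PurePosixPath


def translate_path(stored: str, old_root: str, new_root: str) -> str:
    """Rewrite an absolute path recorded under old_root so it points under new_root.

    Paths outside old_root are returned unchanged. The comparison is done on
    whole path components, so /tmp/recollother is not treated as being
    inside /tmp/recoll.
    """
    try:
        relative = PurePosixPath(stored).relative_to(PurePosixPath(old_root))
    except ValueError:
        # The stored path is not below old_root: nothing to translate.
        return stored
    return str(PurePosixPath(new_root) / relative)


# The dataset was indexed in /tmp/recoll and later moved to /tmp/recoll2.
print(translate_path("/tmp/recoll/docs/a.txt", "/tmp/recoll", "/tmp/recoll2"))
```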
-
medoc
2017-12-05
Do you need to update the index in the changed location?
-
medoc
2017-12-06
I have implemented something which may solve your use case. Can you take a look at the doc? https://www.lesbonscomptes.com/recoll/usermanual/webhelp/docs/RCL.MOVABLE.html
It's only in git code for now.
-
Anonymous
2017-12-10
Thanks for the fast work on implementing this. If I understood correctly, it should work to just run this script (https://gist.github.com/kskyten/8610c40b1b3fd36dff9a6548677578ec) inside a folder to make a portable index. Is it enough to just have the 'orgidxconfdir' variable, or does the config file also have to contain 'topdirs'?
Ideally, I would like it to be completely decentralized, i.e. not having to refer to the original location and being able to update the index. I guess this is not possible without support for relative paths.
A somewhat related question: is there an easy way to manipulate the config file from the command line? I'm envisioning a CLI for this use case similar to that of git, e.g. 'recollcli init' would do something like the script I posted above, 'recollcli add foo' would add the location foo to the config file, etc.
-
medoc
2017-12-10
The config file needs topdirs, otherwise, there is no way to know what to index.
I think that your script should do the job just fine. You don't need to create the empty files, but they won't hurt either.
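For illustration, here is a minimal Python sketch of what such an initialization could look like. It does not reproduce the linked gist. The function name `init_portable_config` and the `.recoll` directory name are assumptions taken from the example earlier in the thread, and setting `orgidxconfdir` to the configuration directory's own location is my reading of the movable-dataset doc, so check it against the manual.

```python
from pathlib import Path


def init_portable_config(dataset_root: str, confdir_name: str = ".recoll") -> Path:
    """Create <dataset_root>/<confdir_name>/recoll.conf for a movable dataset.

    Assumptions (not verified against recoll itself):
    - topdirs lists the dataset root as an absolute path,
    - orgidxconfdir records where the configuration directory lives now,
    - the paths contain no whitespace, so no quoting is attempted.
    """
    root = Path(dataset_root).resolve()
    confdir = root / confdir_name
    confdir.mkdir(parents=True, exist_ok=True)

    conf = confdir / "recoll.conf"
    conf.write_text(
        f"topdirs = {root}\n"
        f"orgidxconfdir = {confdir}\n"
    )
    return conf
```

The index itself would still have to be built afterwards by running the indexer against that configuration directory.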
It should be possible to add the capability to update the index for a moved dataset. I'll have a look at it. Curiously this thing with orgidxconfdir, which is largely equivalent to supporting relative paths, makes everything conceptually simpler. I'd have a hard time guessing what the use of relative paths could affect internally (maybe nothing in fact, but I'd have to look all over the place). Keeping absolute paths together with a way to fix them makes for a very localized modification on the doc extraction side. Hopefully it will be the same with the indexing side. It's on the todo.
There is currently no tool to edit recoll.conf. If you are not using sections with bracket headers, just appending data to the file will do the trick. With brackets, it becomes a bit more complicated, but if you control the format (no editing with the GUI) and keep everything on single lines (no line folding with backslashes), it becomes quite easy to edit it with grep or such (grep to get the old value, grep -v to eliminate it, then prepend the new one to the file). Or something like it :) It would be quite easy to write a command line editor based on conftree.[h/cpp], but it does not exist.
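The grep recipe can be sketched in a few lines of Python. This is only an illustration under the same restrictions (one `name = value` per line, no backslash folding). The helper name `set_conf_value` is hypothetical and not part of recoll, and the file contents in the usage lines are made up.

```python
def set_conf_value(text: str, key: str, value: str) -> str:
    """Return the configuration text with `key` set to `value`.

    Mirrors the "grep -v, then prepend" recipe: every existing assignment to
    `key` is dropped and the new assignment is put on the first line.
    Assumes one "name = value" per line and no backslash line folding.
    """
    kept = [
        line
        for line in text.splitlines()
        # Keep comments, section headers and assignments to other names.
        if not ("=" in line and line.split("=", 1)[0].strip() == key)
    ]
    return "\n".join([f"{key} = {value}"] + kept) + "\n"


# Made-up file contents, for illustration only.
old = "topdirs = /tmp/recoll\nskippedNames = *.o\n"
print(set_conf_value(old, "topdirs", "/tmp/recoll2"), end="")
```

Prepending rather than appending presumably matters when bracket sections are present: a line placed before the first header stays outside every section. Note that this sketch removes the key from all sections, which is only appropriate if the key is meant to be set once for the whole file.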