Fedora 29
Recoll 1.24.3
Xapian 1.4.9
Thanks for the great program.
When indexing, I noticed that a large amount of data is written to disk compared to the final size of the index. The concern is unnecessary write wear on a solid state disk (SSD).
Example 1
Creating a database of a few home folders using recollindex. Total bytes written is monitored using iostat and separately Gnome System Monitor (whose values agree with each other).
xapiandb folder size 3.7 GB
SSD disk write total 14.9 GB
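For anyone wanting to reproduce the measurement without iostat, here is a minimal sketch that reads the kernel's per-device write counter before and after a run. It assumes a Linux system, where the 7th field of /sys/block/<dev>/stat is the number of 512-byte sectors written. The device name "sda" in the usage note is only a placeholder, and the helper names are made up for this example.

```python
SECTOR_BYTES = 512  # the kernel reports this counter in 512-byte sectors


def sectors_written(stat_text: str) -> int:
    """Return the 'write sectors' counter (7th field) from a /sys/block/<dev>/stat line."""
    return int(stat_text.split()[6])


def read_bytes_written(device: str) -> int:
    """Read the total bytes written to a block device since boot (Linux only)."""
    with open(f"/sys/block/{device}/stat") as f:
        return sectors_written(f.read()) * SECTOR_BYTES


def gb_written(before: int, after: int) -> float:
    """Difference between two byte counters, in decimal gigabytes."""
    return (after - before) / 1e9
```

Usage would be along the lines of: call read_bytes_written("sda") before starting recollindex, call it again once indexing ends, and pass both values to gb_written. Anything else writing to the same device during the run is counted too.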
Example 2
Creating a database of a different folder using recollindex.
xapiandb folder size 10.1 GB
SSD disk write total 133.1 GB
Example 3
Creating a database of a lot of folders (from spinning HDDs) using recollindex. This indexing run was stopped before finishing after about 2 days.
xapiandb folder size 247 GB
SSD disk write total over 10 TB (approximate)
These 10 TB are roughly 1-2% of the SSD's rated lifetime bytes written, for a single recollindex run that didn't finish.
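As a quick check of the arithmetic behind that percentage, the sketch below works backwards from the figures quoted (10 TB written, 1-2% of lifetime) to the endurance rating they imply. The function name is made up for this example, and the resulting numbers are only what the quoted figures imply, not a specification of any particular drive.

```python
def implied_endurance_tb(written_tb: float, fraction_of_life: float) -> float:
    """Total lifetime writes (TB) implied if written_tb uses up fraction_of_life."""
    return written_tb / fraction_of_life


# 10 TB being 2% of the lifespan implies the lower rating, 1% the higher one.
low_tb = implied_endurance_tb(10, 0.02)
high_tb = implied_endurance_tb(10, 0.01)

# Number of comparable unfinished runs the drive could absorb at each rating.
runs_low = low_tb / 10
runs_high = high_tb / 10

print(f"implied endurance: {low_tb:.0f} to {high_tb:.0f} TB")
print(f"comparable runs before reaching it: {runs_low:.0f} to {runs_high:.0f}")
```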
Is this level of written bytes typical? And could anything be done to reduce it?
Writing the xapiandb folder to a spinning HDD could be done, but would be very slow. I noticed setting idxflushmb = 6000 lowered the xapiandb folder size and bytes written by about 20%.
Is the large amount of recollindex bytes written related to xapian using atomic commits? I noticed that xapian can use a ‘dangerous mode’ of updating the database in place.
Could recollindex optionally use this DB_DANGEROUS mode (off by default)? This could be useful for the first xapiandb creation, where no one will be querying the database, and if the power fails etc (unlikely) then the xapiandb folder is manually deleted and recollindex started again. Plus the final xapiandb folder will be smaller.
.........................
https://xapian.org/docs/apidoc/html/namespaceXapian.html#afff6e2208f3d724637eff3ca442190b6
DB_DANGEROUS
Update the database in-place.
Xapian's disk-based backends use block-based storage, with copy-on-write to allow the previous revision to be searched while a new revision forms.
This option means changed blocks get written back over the top of the old version. The benefits of this are that less I/O is required during indexing, and the result of indexing is more compact. The downsides are that you can't concurrently search while indexing, transactions can't be cancelled, and if indexing ends uncleanly (i.e. without commit() or WritableDatabase's destructor being called) then the database won't be usable.
Currently all the base files will be removed upon the first modification, and new base files will be written upon commit. This prevents new readers from opening the database while it unsafe to do so, but there's not currently a mechanism in Xapian to handle notifying existing readers.
.........................
Discussion
-
medoc
2019-01-09
Hi,
I just gave a try to DB_DANGEROUS (and DB_NO_SYNC too), but this does not seem to change the amount of disk writes a lot.
This kind of makes sense: DB_DANGEROUS changes the place where writes are performed (existing block rather than copy), but probably not much the amount of writes. DB_NO_SYNC works on small indexes, but once the index becomes much bigger than the buffer cache, it's not very efficient any more.
DB_DANGEROUS does result in a smaller index, so fewer writes overall, but the difference is not spectacular.
It seems that the ratio between amount of writes and index size rises with the index size.
Maybe it would make sense to create the index in several pieces and then merge them. I'm really not sure. The best place to ask about this writing problem would be the Xapian discuss mailing list.
https://lists.xapian.org/mailman/listinfo/xapian-discuss
I'm subscribed, so if there is a need for recoll information, I can supply it there as needed.
Also, in my experience, it's not necessarily a major performance issue to have the index on spinning disk. When indexing many small files it's actually more important that the source is on SSD. Did you actually give it a try?
Last, and mostly about your last try (I get approximately the same ratios as you for the smaller ones): sorry, but I have to ask: where is your swap partition?
-
Anonymous
2019-01-20
That’s good information that DB_DANGEROUS does not make a significant difference to total SSD disk writes.
As suggested, I created multiple indexes by running recollindex on separate content sections, and then just attached them by using them as external indexes in recoll gui (rather than merging the indexes). Seems to work well. The total amount of SSD disk writes when using recollindex on sections appeared to be less than a third of the SSD disk writes when using recollindex on all content in one run.
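For reference, the per-section approach can be sketched roughly as follows: one configuration directory per content section, each with its own topdirs setting, and one recollindex -c run per directory. The section names and paths here are hypothetical, the helper names are made up for this example, and paths containing spaces would need quoting in recoll.conf, which this sketch does not handle.

```python
from pathlib import Path


def section_config(topdirs):
    """Return recoll.conf text that indexes only the given directories."""
    return "topdirs = " + " ".join(str(d) for d in topdirs) + "\n"


def plan(sections, base):
    """Build (config dir, recoll.conf text, recollindex command) for each section."""
    jobs = []
    for name, dirs in sections.items():
        confdir = Path(base) / name
        command = ["recollindex", "-c", str(confdir)]
        jobs.append((confdir, section_config(dirs), command))
    return jobs


def write_configs(jobs):
    """Create each config directory and write its recoll.conf."""
    for confdir, text, _command in jobs:
        confdir.mkdir(parents=True, exist_ok=True)
        (confdir / "recoll.conf").write_text(text)
```

Each command can then be run in turn (for example with subprocess.run), and the resulting xapiandb folders, which by default live inside each configuration directory, can be added as external indexes in the GUI.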
It's not possible to have the source content on SSD, as there is too much to store there. The time to create the set of indexes is not a problem (~2-3 days). Creating the indexes on spinning disk did seem a lot slower in a quick test.
The swap partition is also on the SSD, but there was little use of the swap from casual observation. RECOLL_TMPDIR was also pointing to an empty folder on the SSD (due to default Fedora /tmp being limited in size causing problems with archive extractions), but this folder was not used much either. Turning off swap and pointing RECOLL_TMPDIR elsewhere made no difference to SSD total disk writes in a limited recollindex run.
For some reference data from recollindex runs….
Column 1 - xapiandb folder size (GB)
Column 2 - SSD disk write total (GB)
2.7 - 4.4
3.7 - 14.9
3.8 - 18.9
4.8 - 35.4
6.2 - 34.4
10.1 - 133.1
12.7 - 88.8
22.6 - 474
28.8 - 545.9
39.1 - 834.9
46.3 - 537
53.2 - 850
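To make the trend easier to see, here is a small sketch that computes the ratio of bytes written to final index size for each of the runs listed above. The data is copied from the list; the ratio is just written GB divided by index GB, and the names are made up for this example.

```python
# (xapiandb folder size in GB, SSD disk write total in GB), copied from the list above
RUNS = [
    (2.7, 4.4), (3.7, 14.9), (3.8, 18.9), (4.8, 35.4),
    (6.2, 34.4), (10.1, 133.1), (12.7, 88.8), (22.6, 474),
    (28.8, 545.9), (39.1, 834.9), (46.3, 537), (53.2, 850),
]


def write_ratio(index_gb: float, written_gb: float) -> float:
    """Bytes written per byte of final index."""
    return written_gb / index_gb


ratios = [(size, round(write_ratio(size, written), 1)) for size, written in RUNS]

for size, ratio in ratios:
    print(f"{size:6.1f} GB index -> {ratio:5.1f}x written")
```

The ratio is not strictly increasing from run to run, but it goes from under 2x for the smallest index to well above 10x for every index over 20 GB.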
Last edit: Anonymous 2019-01-20
-
medoc
2019-01-26
Thanks for the new data. I have written to the Xapian mailing list about this, no answer so far.
By the way, did you monitor the indexer memory usage with the flush threshold set at 6000? It must have been huge, maybe triggering some page-outs and offsetting the gain obtained from the increased buffering.
I am going to ping the mailing list, but unfortunately, the most likely workaround for this is to divide the index, as you did, add as much RAM as possible, and set the flush threshold so that no paging is triggered.