File | Date | Author | Commit |
---|---|---|---|
libs | 2014-01-07 | Carlos Coutinho | [ffd0f2] First Version on Git, synchronised with TIMBUS ... |
ontologies | 2014-04-23 | Johannes Binder | [45d98c] Use official base IRI for the toolKB_instance, ... |
src | 2014-05-23 | Johannes Binder | [cba377] Consider xpuids in addition to puids, ignore fm... |
.gitignore | 2014-05-23 | Johannes Binder | [cba377] Consider xpuids in addition to puids, ignore fm... |
README.md | 2014-05-23 | Johannes Binder | [cba377] Consider xpuids in addition to puids, ignore fm... |
license_header.txt | 2014-01-07 | Carlos Coutinho | [ffd0f2] First Version on Git, synchronised with TIMBUS ... |
pom.xml | 2014-05-12 | Johannes Binder | [5a9341] Ignore missing license for tmp files |
Read Me
kbgen
This tool populates a toolKB ontology [1] with tools and file formats that are extracted
from Freebase [2] and Pronom [3].
An ontology produced by the tool is available at [4].
Usage
Build and run with: java -jar target/kbgen-1.0-SNAPSHOT.jar
The tool uses ontologies/toolKB_instance_empty.owl to insert formats and tools that are extracted from Freebase and Pronom.
The resulting ontology is stored in toolKB_instance.owl.
If the tool runs out of memory, increase the heap limit with a VM option such as -Xmx2g, e.g. java -Xmx2g -jar target/kbgen-1.0-SNAPSHOT.jar
To handle versions of file formats that are not part of Freebase, you can provide CSV files that list additional formats
to be considered. There is one CSV file per type of tool-to-file-format mapping (read, write, read/write).
The files are looked up in the working directory under the following names:
additional_formats_{r|w|rw}.csv
Each row of the CSV files has the following form (a format name, its Pronom PUID, and any number of tool names):
([name], [puid], [tool]*).
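As an illustration of the row layout above, the following is a minimal sketch of how such files could be read. It is not taken from the kbgen sources: the class AdditionalFormatRow and its methods are hypothetical, and it assumes plain comma-separated fields without quoting, UTF-8 encoding, and that a missing file is simply skipped. The sample values in the usage note are made up.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical holder for one row of an additional_formats_*.csv file. */
final class AdditionalFormatRow {
    final String name;
    final String puid;
    final List<String> tools;

    AdditionalFormatRow(String name, String puid, List<String> tools) {
        this.name = name;
        this.puid = puid;
        this.tools = tools;
    }

    /** Parses "name,puid,tool1,tool2,..." (assumption: plain commas, no quoting). */
    static AdditionalFormatRow parse(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length < 2) {
            throw new IllegalArgumentException("Expected at least name and puid: " + line);
        }
        // Everything after the second field is treated as a tool name.
        List<String> tools = new ArrayList<>();
        for (int i = 2; i < fields.length; i++) {
            String tool = fields[i].trim();
            if (!tool.isEmpty()) {
                tools.add(tool);
            }
        }
        return new AdditionalFormatRow(
                fields[0].trim(), fields[1].trim(), Collections.unmodifiableList(tools));
    }

    /** Reads additional_formats_<suffix>.csv from dir; returns an empty list if the file is absent. */
    static List<AdditionalFormatRow> load(Path dir, String suffix) throws IOException {
        Path file = dir.resolve("additional_formats_" + suffix + ".csv");
        List<AdditionalFormatRow> rows = new ArrayList<>();
        if (!Files.isRegularFile(file)) {
            return rows;
        }
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            if (!line.trim().isEmpty()) {
                rows.add(parse(line));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // The files are looked up in the current working directory.
        Path workingDir = Paths.get("").toAbsolutePath();
        for (String suffix : new String[] {"r", "w", "rw"}) {
            for (AdditionalFormatRow row : load(workingDir, suffix)) {
                System.out.println(suffix + ": " + row.name + " (" + row.puid + ") -> " + row.tools);
            }
        }
    }
}
```

With this sketch, a made-up row such as "Example Format 2.0,fmt/999,ToolA,ToolB" in additional_formats_rw.csv would be read as a format named "Example Format 2.0" with PUID fmt/999 that both ToolA and ToolB can read and write.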
The Pronom importer does not retrieve newer formats (Pronom IDs higher than about 450). It may therefore be necessary to run kbgen once,
add the required missing formats to the cache_pronom_formats.json file, and then rerun kbgen.
Build
Use Maven to build the project, e.g. mvn package, which places the JAR under target/.
References
[1] http://timbus.teco.edu/ontologies/preservationIdentifier/toolKB.owl
[2] http://www.freebase.com/
[3] http://www.nationalarchives.gov.uk/PRONOM/
[4] http://timbus.teco.edu/ontologies/preservationIdentifier/toolKB_instance.owl