



Dspace AIP Extractor

DSpace is a repository web service developed by DuraSpace, an open source organization, together with several universities and institutions. It safely stores a wide range of documents and supports organizing them, backing them up easily, and administering them intuitively.
As of version 1.7, DSpace includes a feature called AIP (Archival Information Packages). It was created to facilitate data backup and consists of a format that stores all groups, collections and items (the documents themselves) from a DSpace instance.
The Dspace AIP Extractor uses this feature to perform an AIP backup and store it on the target machine. It then returns the system path where the generated AIP was stored.
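
This README does not show the exact command the extractor runs on the target machine. As a rough illustration, the sketch below builds the site-wide AIP export command documented for DSpace's `packager` launcher. The helper name, the example user and the example handle prefix are assumptions made for this sketch, not taken from the extractor's source.

```python
from pathlib import PurePosixPath


def build_aip_backup_command(dspace_dir, eperson, handle_prefix, output_zip):
    """Return the argv list for a site-wide AIP export (hypothetical helper)."""
    launcher = PurePosixPath(dspace_dir) / "bin" / "dspace"
    # DSpace uses <prefix>/0 as the handle of the whole site.
    site_handle = f"{handle_prefix}/0"
    return [
        str(launcher), "packager",
        "-d",             # disseminate (export) instead of ingest
        "-a",             # include all child communities, collections and items
        "-t", "AIP",      # package type
        "-e", eperson,    # DSpace user performing the export
        "-i", site_handle,
        output_zip,
    ]


# Example values are invented for illustration.
cmd = build_aip_backup_command(
    "/dspace", "admin@example.org", "123456789", "/dspace/backups/site-wide.zip"
)
print(" ".join(cmd))
```

Returning an argument list rather than a single string avoids quoting problems if the command is later handed to an SSH or subprocess call.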
Besides the AIP backup, this module also crawls the main DSpace installation folder looking for files that may have been customized, and returns the path of each file it finds.
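
The README does not describe how the extractor decides that a file is "possibly customized". The following minimal sketch therefore only illustrates the crawl itself: it walks a folder and maps every file name to its path, which resembles the shape of the `dependencies` object in the example output. The function name and the temporary demo folder are assumptions for illustration.

```python
import os
import tempfile


def collect_files(root):
    """Map each file name under root to its full path (hypothetical helper).

    If two folders contain a file with the same name, the later one wins.
    """
    found = {}
    for folder, _subdirs, files in os.walk(root):
        for name in sorted(files):
            found[name] = os.path.join(folder, name)
    return found


# Demo on a throwaway folder that stands in for a DSpace installation.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "bin"))
    for name in ("dspace", "stats-init"):
        open(os.path.join(demo_root, "bin", name), "w").close()
    demo = collect_files(demo_root)

print(sorted(demo))
```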

 

How to get the code

git clone https://opensourceprojects.eu/git/p/timbus/context-population/extractors/dspace-aip-extractor

 

Install Requirements

  1. Oracle Java JDK 1.7
  2. Apache Maven installed

Requirements for the extraction target

  1. Linux installed
  2. DSpace 1.7 or newer installed
  3. SSH server running, with a user that can authenticate

How to install

This project, like most others in Timbus, is built with Maven. To build the entire project, run the following command from the project root folder:

$> mvn clean package

This creates a target folder containing two .jar files: the cli module, which runs locally on the machine, and the bundle module, which is deployed into a Virgo container.
A tutorial on how to properly install Virgo and deploy Timbus artefacts into it can be found here

 

Collected Information

Besides the SSH authentication parameters, this extractor requires DSpace-specific information to perform an AIP backup: the DSpace installation folder, the user, and optionally the site handle prefix. If the prefix is not provided, the extractor reads it from the dspace.cfg file in the config folder of the DSpace installation (More information here).
The extractor then provides two pieces of information:
- The AIP backup file path
- The paths of all relevant files that may have been customized and should therefore be preserved.

Note: Depending on the DSpace configuration, root privileges may be required to use this extractor.
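
The fallback for the site handle prefix can be sketched as follows. This is a minimal illustration that assumes the prefix is stored under the `handle.prefix` key of dspace.cfg in `key = value` form. The function names and the sample configuration text are invented for this example and are not the extractor's actual code.

```python
def read_handle_prefix(cfg_text):
    """Return the value of handle.prefix from dspace.cfg text, or None."""
    for raw_line in cfg_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "handle.prefix":
            return value.strip()
    return None


def resolve_handle_prefix(provided, cfg_text):
    """Prefer the caller-supplied prefix, else fall back to dspace.cfg."""
    return provided if provided else read_handle_prefix(cfg_text)


# Invented excerpt standing in for [dspace]/config/dspace.cfg.
sample_cfg = """
# DSpace configuration (excerpt)
dspace.dir = /dspace
handle.prefix = 123456789
"""

print(resolve_handle_prefix(None, sample_cfg))     # → 123456789
print(resolve_handle_prefix("98765", sample_cfg))  # → 98765
```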

 

Example output:

{
    "extractor": "Dspace AIP Extractor",
    "format": {
        "id": "ea6f170e-5d74-40ae-bb02-53b07d84b9ac",
        "multiple": false
    },
    "uuid": "3d3f0a7b-73f9-11e4-96c6-9b505f3ad7f0",
    "result": {
        "machineId": "xri://+machine?+hostid=89c1f2c4/+hostname=repositorio-hospitaldebraga.pre.rcaap.pt",
        "fqdn": "10.10.96.42",
        "port": 22,
        "data": {
            "aipbackup": {
                "backupFolder": "/dspace/backups/1416847504979/",
                "fileName": "site-wide.zip"
            },
            "dependencies": {
                "buildpath.bat": "/dspace/bin/buildpath.bat",
                "dspace": "/dspace/bin/dspace",
                "dspace.bat": "/dspace/bin/dspace.bat",
                "dspace-info.pl": "/dspace/bin/dspace-info.pl",
                "dspace++init.sh": "/dspace/bin/dspace++init.sh",
                "dspace_migrate": "/dspace/bin/dspace_migrate",
                "log-reporter": "/dspace/bin/log-reporter",
                "make-handle-config": "/dspace/bin/make-handle-config",
                "openaire-refresh-list": "/dspace/bin/openaire-refresh-list",
                "requestitem-init": "/dspace/bin/requestitem-init",
                "start-handle-server": "/dspace/bin/start-handle-server",
                "stats-add-institution-ip": "/dspace/bin/stats-add-institution-ip",
                "stats-aggregate": "/dspace/bin/stats-aggregate",
                "stats-country": "/dspace/bin/stats-country",
                "stats-detect-spiders": "/dspace/bin/stats-detect-spiders",
                "stats-init": "/dspace/bin/stats-init",
                "stats-keepwatched": "/dspace/bin/stats-keepwatched",
                ....
            }
        }
    }
}
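
A consumer of this output would typically want the full path of the AIP archive and the list of collected files. The sketch below shows one way to read both from a document shaped like the example above. The helper names are hypothetical, and the sample is a shortened copy of the example output with only two `dependencies` entries kept.

```python
import json
import posixpath


def aip_backup_path(output):
    """Join backupFolder and fileName into the full archive path."""
    backup = output["result"]["data"]["aipbackup"]
    return posixpath.join(backup["backupFolder"], backup["fileName"])


def dependency_paths(output):
    """Return the collected file paths in sorted order."""
    return sorted(output["result"]["data"]["dependencies"].values())


# Shortened copy of the example output above.
sample = json.loads("""
{
    "extractor": "Dspace AIP Extractor",
    "result": {
        "port": 22,
        "data": {
            "aipbackup": {
                "backupFolder": "/dspace/backups/1416847504979/",
                "fileName": "site-wide.zip"
            },
            "dependencies": {
                "dspace": "/dspace/bin/dspace",
                "stats-init": "/dspace/bin/stats-init"
            }
        }
    }
}
""")

print(aip_backup_path(sample))   # → /dspace/backups/1416847504979/site-wide.zip
print(dependency_paths(sample))  # → ['/dspace/bin/dspace', '/dspace/bin/stats-init']
```

`posixpath` is used instead of `os.path` because the paths refer to the Linux target machine, regardless of where the output is processed.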

 

TIMBUS Use Cases

RCAAP digital preservation UC

This use case involves an open digital repository platform that brings together most of Portugal's most relevant scientific digital repositories.
It is a centralized platform that allows seamless searching across all the repositories.
Because each repository is a single DSpace instance, the Dspace AIP Extractor is used to back up each instance individually.

 

Author

Miguel Gama Nunes miguel.nunes@caixamagica.pt

 

License

Copyright (c) 2014, Caixa Magica Software Lda (CMS).
The work has been developed in the TIMBUS Project and the above-mentioned are Members of the TIMBUS Consortium.
TIMBUS is supported by the European Union under the 7th Framework Programme for research and technological development and demonstration activities (FP7/2007-2013) under grant agreement no. 269940.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at: http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law or agreed to in writing, shall any Contributor be liable for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work.
See the License for the specific language governing permissions and limitations under the License.