A system that can generate digital preservation packages in accordance with the OAIS framework for digital preservation. Archivematica can create SIPs and AIPs (submission information packages and archival information packages). Both are structured according to the BagIt specification (RFC 8493). Archivematica can also generate DIPs (dissemination information packages) for use in systems other than Archivematica.
Analogous to an ‘Archivematica’, it is the user-facing component of the system: a web interface. Its horizontal menu is a series of tabs that relate back to the OAIS stages of the preservation workflow. The two main screens, Transfer and Ingest, display the status of the various microservice jobs.
A synonym for a pipeline or an ‘Archivematica’. An organization might run multiple pipelines across different servers to enable it to process different material types in different ways.
A separate system which manages the different package types output by different Archivematica pipelines. A storage service can connect to multiple pipelines. A pipeline is usually connected to a single storage service.
Transfer types are analogous to material flows in other systems, such as the Rosetta digital preservation system. A transfer type allows different material, e.g. a dataset, to be handled differently by an Archivematica workflow.
Both Archivematica and the Storage Service have APIs that can be used to start transfers, get transfer status updates, and manage packages; as well as other assorted functions such as downloading individual files from AIPs.
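As a rough illustration of the kind of call described above, the following sketch builds (but does not send) a request for a transfer's status. The host, user name, and key are hypothetical, and the `/api/transfer/status/<uuid>/` path and `ApiKey` header format are assumptions that should be checked against the API documentation for the Archivematica version in use.

```python
from urllib.request import Request


def build_transfer_status_request(base_url, transfer_uuid, user, api_key):
    """Build a GET request for a transfer's status.

    The URL path and the Authorization header format are assumptions.
    """
    url = f"{base_url.rstrip('/')}/api/transfer/status/{transfer_uuid}/"
    headers = {"Authorization": f"ApiKey {user}:{api_key}"}
    return Request(url, headers=headers, method="GET")


request = build_transfer_status_request(
    "http://archivematica.example.org",      # hypothetical host
    "ebc9fc1c-6243-4461-842c-215eba47e379",  # transfer UUID
    "demo",                                  # hypothetical user name
    "secret",                                # hypothetical API key
)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a live server.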
A method of connecting to a file system or quasi-file-system using different protocols. Examples include the local file system, Amazon S3, and the Dataverse API. The storage space lets the user establish locations for transferring data, processing data, and storing archival packages.
A storage location is a specific area of a storage space designated for a particular purpose, such as ‘transfer source’, ‘AIP storage’, ‘DIP storage’, ‘processing’, or ‘replication’. Other purposes include holding failed or rejected transfers.
The process of taking material from a transfer source, running that material through various microservice jobs, and generating a SIP in Archivematica. Transfer is responsible for processes such as virus scanning and file format identification.
The process of turning a SIP into an AIP in Archivematica. The Ingest phase includes normalization, that is, creating derivative files either for access or in preservation-friendly formats. It is also responsible for generating an AIP METS file containing information about the structure and content of the AIP.
A transfer type in Archivematica. A standard transfer implements a basic preservation workflow and makes no additional assumptions about the structure of the content transferred or about how it should be arranged on disk.
Abbreviated to processingMCP.xml, the processing configuration affects every transfer type and provides granular control over the various microservices that may or may not run. For example, users can decide in the processing configuration whether or not to normalize files, or leave that decision open as an option in the ingest workflow.
Automated Processing Configuration
A description of a processing configuration file in which all decision points have been resolved. For example, a user can choose at run-time where to send an AIP (a media storage location or a generic storage location, say), or they can determine this up front with a customized processing configuration. A transfer or ingest with all decision points resolved by a processing configuration can be described as being ‘automated’.
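The idea of a fully resolved configuration can be sketched as follows. The element names (`preconfiguredChoice`, `appliesTo`, `goToChain`) are assumptions about the layout of processingMCP.xml, the UUIDs are invented placeholders, and `is_automated` is a hypothetical helper rather than anything Archivematica provides.

```python
import xml.etree.ElementTree as ET

# Invented sample; the UUIDs are placeholders, not real Archivematica identifiers.
SAMPLE = """<processingMCP>
  <preconfiguredChoices>
    <preconfiguredChoice>
      <appliesTo>11111111-1111-4111-8111-111111111111</appliesTo>
      <goToChain>22222222-2222-4222-8222-222222222222</goToChain>
    </preconfiguredChoice>
  </preconfiguredChoices>
</processingMCP>"""


def preconfigured_choices(xml_text):
    """Map each decision point ID to its pre-selected choice ID."""
    root = ET.fromstring(xml_text)
    return {
        choice.findtext("appliesTo"): choice.findtext("goToChain")
        for choice in root.iter("preconfiguredChoice")
    }


def is_automated(xml_text, decision_points):
    """Return True when every listed decision point has a pre-selected choice."""
    return set(decision_points) <= set(preconfigured_choices(xml_text))
```

Calling `is_automated(SAMPLE, [...])` with the decision points a workflow would otherwise prompt for indicates whether any of them would still need a user's answer at run-time.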
A transfer type in Archivematica where the content received is in BagIt format and is validated as such before being processed into an AIP. If additional information about the transfer is provided in bag-info.txt, it is mapped to the AIP METS.
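To show the kind of information bag-info.txt carries, here is a minimal sketch that reads its `Label: value` lines, including indented continuation lines. The sample content and the `parse_bag_info` helper are hypothetical, and the sketch does not reproduce how Archivematica maps these values into METS.

```python
def parse_bag_info(text):
    """Parse 'Label: value' lines into a dict of lists (labels may repeat)."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and entries:
            # Continuation line: fold it into the previous value.
            label, value = entries[-1]
            entries[-1] = (label, value + " " + line.strip())
        else:
            label, _, value = line.partition(":")
            entries.append((label.strip(), value.strip()))
    info = {}
    for label, value in entries:
        info.setdefault(label, []).append(value)
    return info


# Invented sample content for illustration.
SAMPLE = (
    "Source-Organization: Example Archive\n"
    "External-Description: Research data deposited\n"
    "  by the history department\n"
    "Bagging-Date: 2020-01-31\n"
)
info = parse_bag_info(SAMPLE)
```

Values are kept in lists because the same label may appear more than once in a bag-info.txt file.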
Identical to the unzipped bag transfer type, except that the bag is decompressed before being validated and processed further. Archivematica uses the Library of Congress bagit.py tool to work with bags.
A transfer type which enables users to connect to a Dataverse via the Dataverse API and download datasets to be processed into an AIP. Metadata associated with a Dataverse dataset is mapped to AIP METS. At present it is only possible to retrieve a dataset from Dataverse, not upload an Archivematica output to it.
A mechanism that allows a disk image type to be selected and processed by Archivematica. The transfer type will employ different tools to enable standard features such as file format identification of the disk image’s content. Additional disk-image-specific metadata can be added by the user at the point of transfer and will be preserved in the AIP METS.
Dublin Core Metadata
Items, directories, and the AIP itself can be described using the Dublin Core Metadata Element Set maintained by the Dublin Core Metadata Initiative (DCMI). All elements are repeatable and will appear in the descriptive metadata section of the AIP METS. Metadata can be provided in CSV or JSON as part of a transfer, or edited during the transfer or ingest workflows.
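A minimal sketch of preparing such metadata as CSV is shown below. The column names (`filename`, `dc.title`, `dc.subject`), the `objects/` paths, and the convention of repeating a column header for a repeated element are assumptions for illustration; the Archivematica documentation should be consulted for the exact layout expected.

```python
import csv
import io

# Assumed layout: one row per described item; a repeated header repeats the element.
header = ["filename", "dc.title", "dc.subject", "dc.subject"]
rows = [
    ["objects/report.pdf", "Annual report", "Finance", "Governance"],
    ["objects/images", "Site photographs", "Photography", ""],
]


def write_metadata_csv(header, rows, stream):
    """Write the header and rows to a text stream as CSV."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


buffer = io.StringIO()
write_metadata_csv(header, rows, buffer)
csv_text = buffer.getvalue()
```

Here the first row describes a single file and the second a directory, mirroring the statement that both items and directories can carry Dublin Core metadata.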
Archivematica uses UUIDs for everything! UUIDs identify individual microservice jobs, transfers, and ingests. The AIP is assigned a UUID, as are the AIP’s contents. A UUID might look as follows: “ebc9fc1c-6243-4461-842c-215eba47e379”.
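For readers who want to generate or check identifiers of this shape, a short sketch using Python's standard `uuid` module follows. The `is_valid_uuid` helper is hypothetical, and using `uuid4` (random UUIDs) to create new identifiers is an assumption; the example value above does happen to parse as a version 4 UUID.

```python
import uuid


def is_valid_uuid(text):
    """Return True if text is a UUID in canonical hyphenated form."""
    try:
        return str(uuid.UUID(text)) == text.lower()
    except ValueError:
        return False


example = "ebc9fc1c-6243-4461-842c-215eba47e379"
parsed = uuid.UUID(example)   # parse the example from the text above
new_id = str(uuid.uuid4())    # generate a fresh random UUID
```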
If a transfer has been normalized for access, then a DIP will be created by Archivematica. The DIP can optionally be stored, and can optionally be uploaded to another system such as AtoM (Access to Memory). Dublin Core metadata associated with the transfer, as well as the access derivatives, will be contained in the DIP.
A set of utility scripts that are used to perform tasks on archival data during pre-ingest, ingest, or post-ingest. The primary use case for automation-tools is to allow users to process content continually at set intervals, automatically and without the need for intervention.
Fixity is a utility application that lets users perform fixity checks across the entire storage service. Fixity is also an API endpoint which writes the status back to the storage service database. The results are returned by the endpoint and are also visible in the storage service’s packages tab.
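The underlying idea of a fixity check can be sketched in a few lines: recompute a file's checksum and compare it with the value recorded when the file was stored. This is a generic illustration using SHA-256, not the Fixity application's own implementation; the file name and helper functions are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute a file's SHA-256 checksum, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(path, expected_checksum):
    """Return True if the file still matches its recorded checksum."""
    return sha256_of(path) == expected_checksum


with tempfile.TemporaryDirectory() as workdir:
    stored_file = Path(workdir) / "aip.7z"        # hypothetical stored package
    stored_file.write_bytes(b"archival package bytes")
    recorded = sha256_of(stored_file)             # checksum recorded at storage time
    unchanged = check_fixity(stored_file, recorded)
    stored_file.write_bytes(b"corrupted bytes")   # simulate corruption
    changed = check_fixity(stored_file, recorded)
```

Reading in chunks keeps memory use flat for large packages, which matters when checks run across an entire store.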
The transfer tab shows the success statuses of all the microservice jobs that are run in the process of creating a SIP.
A SIP can be sent to a backlog and retrieved later for arrangement in the appraisal tab. The backlog tab lists all of the SIPs that can be accessed.
The appraisal tab can be used to arrange SIP contents and to combine multiple SIPs into one AIP. The appraisal tab can also display information extracted by the bulk_extractor tool, as well as provide previews of a small number of file types.
The ingest tab shows the success statuses of all the microservice jobs that are run in the process of creating an AIP.
Preservation planning is synonymous with the Format Policy Registry (FPR). Specific commands to be run on specific file format types are controlled from here. The FPR maintains commands for identification, characterization, normalization, transcription, validation, and verification. Format types must have a PRONOM entry and be identifiable by the Siegfried or Fido identification tools.
Archival Storage Tab
AIPs can be retrieved and downloaded from the archival storage tab. AIPs can be searched for based on the content of their dmdSecs in the METS metadata (if an AIP has been created with Dublin Core (DC) metadata). Pointer files can also be retrieved and viewed from this tab.
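To make the role of the dmdSec concrete, the sketch below searches the descriptive metadata sections of a METS document for a term. The sample METS is invented and heavily simplified (real Archivematica METS files are structured differently in detail), and this is not how the dashboard's search is implemented; `dmdsecs_matching` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}

# Invented, simplified sample for illustration only.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <mets:dmdSec ID="dmdSec_1">
    <mets:mdWrap MDTYPE="DC">
      <mets:xmlData>
        <dc:title>Annual report</dc:title>
        <dc:creator>Example Archive</dc:creator>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""


def dmdsecs_matching(mets_text, term):
    """Return the IDs of dmdSec elements whose text contains the term."""
    root = ET.fromstring(mets_text)
    matches = []
    for dmdsec in root.findall("mets:dmdSec", NS):
        text = " ".join(element.text or "" for element in dmdsec.iter())
        if term.lower() in text.lower():
            matches.append(dmdsec.get("ID"))
    return matches
```

The match is a case-insensitive substring test over all element text in each section, which is enough to show why an AIP without Dublin Core metadata has nothing to match against.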
Pointer files are METS files that describe an archival package’s container. The metadata in a pointer file is designed to make the archival package as accessible as possible in the future. For example, if a package is compressed or encrypted, the most important thing to know is that this is the case. The next most important information is a description (pointers) of how to decompress or decrypt it. This information, as well as other instructions, may be found in the pointer file.
The administration tab enables users to configure parts of an Archivematica dashboard, for example to create new users, connect the pipeline to a storage service, or establish the parameters of a processing configuration.