Tools Flashcards Preview


Flashcards in Tools Deck (44)
What is...


PRONOM is a registry of file formats that is maintained by The National Archives, UK. 

PRONOM supplies new file format information to a tool called DROID, which uses it to identify files in collections and assign each a unique identifier.

What is...


DROID is a tool that can be used to automatically identify a file's format using 'file format signatures' that it downloads from PRONOM. 

For each file format it identifies, DROID reports a unique identifier called a PUID (PRONOM Unique Identifier).
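
The idea of signature-based identification can be sketched in a few lines of Python. This is not DROID's implementation and it does not read a PRONOM signature file. The `SIGNATURES` table, the labels, and the function names are assumptions made for illustration, and the sketch returns plain labels rather than PUIDs.

```python
# Illustrative signature table: leading "magic bytes" mapped to a format label.
# A real signature file is far richer and yields PUIDs rather than these labels.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
]


def identify(data: bytes) -> str:
    """Return the label of the first signature matching the start of data."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first few bytes of a file and identify them."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

Calling `identify(b"%PDF-1.4")` matches the PDF entry, while bytes that match no entry fall through to "unknown".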

What is...


Nanite is a programming library that wraps DROID in a way that makes it possible for software developers to incorporate file format identification in their programs. 

What is...


JHOVE is a tool that checks whether a file conforms to the specification of its format. For example, it can check whether the date formats used in certain file types are standardized.

JHOVE can do this for approximately 12 formats, but the software makes it easy for more to be programmed. 

What is...


Siegfried is a DROID-like command-line tool that also uses PRONOM information to identify file formats.

Siegfried is comparable to DROID, but adds other mechanisms to identify file formats and alternative ways for users to interact with it.

What is...


File is a Linux-based tool for identifying file formats. Unlike DROID and Siegfried, it does not return unique identifiers for what it finds.

File uses a different mechanism and a different corpus of information to identify the format of a digital object.

What is...


An online resource of community-contributed ‘recipes’ (commands) for processing audiovisual files with ffmpeg, the open source audiovisual transcoding and characterization tool.

What is...


  • A free and open source tool for working with audio and video.
  • ffmpeg can characterize multimedia and even output visual analyses.
  • ffmpeg can transcode multimedia into other file formats and perform many other manipulations.
  • Developed and maintained by the ffmpeg team.
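
A minimal sketch of how characterization and transcoding might be scripted from Python. It assumes ffmpeg and its companion tool ffprobe are installed and on the PATH. The function names are hypothetical, and the transcode call leaves codec selection to ffmpeg's defaults for the target extension.

```python
import subprocess


def probe_command(path: str) -> list[str]:
    """Build an ffprobe call that reports container and stream metadata as JSON."""
    return ["ffprobe", "-v", "error", "-show_format", "-show_streams",
            "-of", "json", path]


def transcode_command(source: str, target: str) -> list[str]:
    """Build an ffmpeg call; the output format is inferred from the target name."""
    return ["ffmpeg", "-i", source, target]


def run(command: list[str]) -> str:
    """Run a command and return its standard output (needs ffmpeg installed)."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `run(probe_command("tape01.mov"))` would return a JSON description of a hypothetical file `tape01.mov`, and `run(transcode_command("tape01.mov", "tape01.mkv"))` would write a Matroska copy.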

What is...


A utility for transferring data across file systems while maintaining key file system properties such as last-modified date and user permissions.
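
What "maintaining key file system properties" means can be illustrated with Python's standard library. This is a sketch of the property only, not the utility this card describes. The function names are hypothetical, and `shutil.copy2` is used because it copies timestamps and permission bits along with the file contents.

```python
import os
import shutil


def copy_preserving_metadata(source: str, target: str) -> None:
    """Copy file contents, then copy permission bits and timestamps as well."""
    shutil.copy2(source, target)


def same_key_properties(first: str, second: str) -> bool:
    """Compare last-modified time (to the second) and mode bits of two files."""
    a, b = os.stat(first), os.stat(second)
    return int(a.st_mtime) == int(b.st_mtime) and a.st_mode == b.st_mode
```

After `copy_preserving_metadata(src, dst)`, a check such as `same_key_properties(src, dst)` gives a quick confirmation that the transfer did not reset the last-modified date.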

What is...

Not strictly for digital preservation, but useful nonetheless: it annotates Linux commands for users and enables those annotations to be shared.

What is...

veraPDF

A free and open source tool for the validation of PDF/A files. veraPDF provides some support for other PDF variants.

What is...


A tool written in Python that characterizes JPEG2000 (JP2) files. Important in digitization workflows, where JP2 is increasingly used in place of TIFF for its savings in storage space.

What is...


A tool by Martin Hoppenheit to reduce the number of signatures in the DROID signature file, e.g. for the purpose of quicker identification in image format only digitization workflows.

What is...


A large-scale, OAIS (Open Archival Information System) compliant system that implements large parts of the digital preservation workflow, from ingest to delivery. Rosetta is maintained by the company Ex Libris.

What is...


RODA is an open-source digital repository, developed in Portugal and designed for preservation. The repository supports all the main functional components of the OAIS model.

What is...


Open source digital preservation system maintained by Artefactual. Younger than Preservica and Rosetta, Archivematica has a growing user-base, and a different support model to the two mentioned.

What is...


Originally called Safety Deposit Box, Preservica is an OAIS-compliant digital preservation system maintained by Preservica in Abingdon, Oxfordshire, UK.

What is...

Safety Deposit Box

The first four implementations of the Preservica digital preservation system went under the name Safety Deposit Box. Organisations such as The National Archives, UK, and the Swiss Federal Archives were some of the first to adopt this system.

What is...

Apache Tika

A tool maintained by the Apache Software Foundation capable of extracting metadata and content from a range of file formats including PDF, Microsoft Office, Rich Text Format, and XML.

What is...

A registry of digital forensics tools and training courses developed in 2016 that will prove useful for finding tools for dissecting and interpreting digital files for preservation and access.

What is the...

Just Solve the File Format Problem

A wiki-style registry of file formats that can be edited by all users. It differs from PRONOM in that anyone can add information, so it is a good idea to submit something to this wiki first, or in concert with PRONOM, for the benefit of the community.

Just Solve It is an initiative of the Internet Archive.

What is a...

Write Blocker

Forensics hardware that blocks the ability to write to a storage device, thus protecting data and its evidentiary value. Write blocking tools are available from companies such as Tableau and WiebeTech.

What is a...


A USB controller and write blocker for legacy floppy disk drives. It allows us to use 3.5-inch and 5.25-inch disk drives on modern computer hardware.

What is the...

SuperCard Pro

A USB controller and write blocker for legacy floppy disk drives, specifically 3.5-inch and 5.25-inch disk drives. One of a handful of alternatives to KryoFlux.

What is important about...


A useful way for those in digital preservation to connect with the community. An active forum with lots of branches out to other resources.

What is...


  • A portal, search engine, and API that connects metadata about content at Australian GLAM institutions.
  • Trove makes this information findable.
  • Trove is a collaboration between the National Library of Australia and Australia's State and Territory libraries.

What is...


  • A technical registry that describes tools useful for long term digital preservation.
  • Acts primarily as a finding and evaluation tool to help practitioners find the tools they need to preserve digital data.
  • COPTR collates this knowledge in one place instead of organisations competing against each other with their own registries.

What is...


  • 'Twitter' archiving (twarc) is a command line tool and Python library for archiving Twitter JSON data.
  • Each tweet is represented as a JSON object that is exactly what was returned from the Twitter API.
  • In addition to letting you collect tweets, twarc can also help you collect data on users and trends.
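
A small sketch of working with collected tweets, assuming they are stored as one JSON object per line. The two sample records are made up, and the field names used (`id_str`, `user.screen_name`, `full_text`) are assumptions about the shape of the data rather than a guarantee of what any given API version returns. The sketch uses only the standard library and does not call twarc itself.

```python
import json

# Two made-up records standing in for collected tweets, one JSON object per line.
SAMPLE = (
    '{"id_str": "1", "user": {"screen_name": "archivist"}, "full_text": "hello"}\n'
    '{"id_str": "2", "user": {"screen_name": "curator"}, "full_text": "world"}\n'
)


def read_tweets(text: str) -> list[dict]:
    """Parse line-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def authors(tweets: list[dict]) -> list[str]:
    """Return the distinct screen names found in the tweets, sorted."""
    return sorted({tweet["user"]["screen_name"] for tweet in tweets})
```

Because each line is kept as a complete JSON object, nothing returned by the API is lost, and later analysis can pick out whichever fields it needs.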

What is a...


  • Also known as ISO 28500:2009.
  • A standardised file format for storing the result of a web crawl – the output of a web archiving effort.
  • WARC files may aggregate many WARC records.
  • WARC can encode any other file format – as you’d expect of any potential digital object on the web.
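
The record structure can be sketched as follows, assuming the WARC 1.0 layout of a version line, named header fields, a blank line, the content block, and two trailing CRLF pairs. The function names are hypothetical and the sketch writes only `resource` records. Production work would normally rely on an established WARC library.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri: str, payload: bytes,
                      content_type: str = "text/plain") -> bytes:
    """Serialize one 'resource' record: headers, blank line, block, CRLF CRLF."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {now}",
        f"WARC-Target-URI: {target_uri}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    return head + payload + b"\r\n\r\n"


def build_warc_file(records: list[bytes]) -> bytes:
    """A WARC file is simply the concatenation of its records."""
    return b"".join(records)
```

The payload is treated as opaque bytes, which is why a record can carry an image, a PDF, or an HTML page equally well.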

What is the...

Wayback Machine

A search engine and API for the archived web, hosted by the Internet Archive, based in San Francisco.
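
As a rough sketch of the API side, the snippet below builds a query for the Wayback Machine availability endpoint and reads the closest snapshot out of a response. The helper names are hypothetical, and the response shape (an `archived_snapshots` object containing a `closest` entry) is an assumption that should be checked against the Internet Archive's current documentation. No network request is made here.

```python
from typing import Optional
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page: str, timestamp: Optional[str] = None) -> str:
    """Build the query URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": page}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response: dict) -> Optional[str]:
    """Return the URL of the closest available snapshot, or None if absent."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

The URL from `availability_url("example.com", "20060101")` can be fetched with any HTTP client, and the decoded JSON passed to `closest_snapshot`.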