Recognizing a file's format based on markers inside its binary stream. It is recognised as the first stage of digital preservation: 'knowing what you've got'.
File Format Extension
File extensions are the characters following the last full-stop in a filename, e.g. '.txt', '.exe', '.xls'. A clue to the file's format but not a guarantee.
File Format Signature
Binary markers inside a file that indicate its format and version. Finding a signature means reading the file's content, so it is a more reliable way of determining what a file is than relying on its extension.
✨Magic✨ number is often used as a synonym for file format signature.
The etymology for the term dates back to the seventh version of the Unix operating system (1979).
The use of magic numbers grew as the need for them did. The phrase 'file format signature' seems to have come about as the field of digital preservation matured.
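The difference between an extension and a signature can be sketched in a few lines of Python. This is a minimal illustration rather than a real identification tool: the `SIGNATURES` table, `identify_bytes`, and `identify_file` are names made up for this example, and the table holds only a handful of well-known leading byte sequences, where a real signature registry holds many more along with offsets and version details.

```python
from pathlib import Path
from typing import Optional

# Illustrative table only: leading bytes -> format label.
SIGNATURES = {
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"PK\x03\x04": "ZIP container",
    b"GIF87a": "GIF (87a)",
    b"GIF89a": "GIF (89a)",
}


def identify_bytes(data: bytes) -> Optional[str]:
    """Return a format label if the data starts with a known signature."""
    for magic, label in SIGNATURES.items():
        if data.startswith(magic):
            return label
    return None


def identify_file(path: str) -> dict:
    """Report the extension (a clue) next to the signature match (stronger evidence)."""
    p = Path(path)
    longest = max(len(magic) for magic in SIGNATURES)
    with p.open("rb") as fh:
        head = fh.read(longest)  # only the first few bytes are needed here
    return {"extension": p.suffix, "signature": identify_bytes(head)}
```

A file named 'report.txt' whose first bytes are the PNG signature would be reported with the extension '.txt' and the signature 'PNG', which is exactly the kind of mismatch an extension alone cannot reveal.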
The process of crawling a website in its current form and duplicating it and its resources (images, sound files, etc.) offline or simply elsewhere, that is, in a web-archive.
- Memento is, first, a project to support web archiving and the finding of web archives in a standardised way.
- Memento is, second, a synonym for the snapshot of a website: an archived website may be called a Memento.
- Memento is an initiative from Los Alamos National Laboratory and Old Dominion University, in the US.
Automation of the web-archiving process. A tool crawls a website by looking at all of the links stemming from it and then visiting those one-by-one, potentially doing the same at the next site - figuratively, crawling.
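The crawling behaviour described above can be sketched as a breadth-first traversal. This is a simplified illustration, not how any particular archiving tool works: `LinkExtractor`, `crawl`, and the three-page `SITE` dictionary are hypothetical, and the pages are held in memory rather than fetched over HTTP so that the example runs without a network connection.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl. `fetch` is any callable returning HTML for a URL, or None."""
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be retrieved; skip it
            continue
        order.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


# Hypothetical site held in memory instead of fetched over HTTP.
SITE = {
    "http://example.org/": '<a href="/about">About</a> <a href="/news">News</a>',
    "http://example.org/about": '<a href="/">Home</a>',
    "http://example.org/news": '<a href="/about">About</a> <a href="/missing">Gone</a>',
}

visited = crawl("http://example.org/", SITE.get)
```

The `seen` set is what stops the crawler from revisiting pages that link back to each other, and `max_pages` is a crude stand-in for the scoping rules a real crawl would need.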
Properties of an object, digital or otherwise, that prove it to be fixed - its state hasn't changed. The last-modified date of a file is a potential measure of fixity. A file's checksum value is a more robust measure of fixity.
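As a small illustration of the checksum approach, the sketch below computes a SHA-256 digest with Python's standard `hashlib` module. The choice of SHA-256 and the function names `sha256_of_bytes` and `sha256_of_file` are assumptions made for this example; the definition above does not prescribe an algorithm.

```python
import hashlib


def sha256_of_bytes(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as hexadecimal text."""
    return hashlib.sha256(data).hexdigest()


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If a single byte of the content changes, the digest changes too, whereas a last-modified date can be altered without touching the content or left unchanged when the content is altered.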
From risk management, the formal statement of a risk is as follows:
"Because of x there is a risk that y, which will result in z."
The statement enables us to think about risk in terms of its impact and therefore steers us away from thinking of risk as fear.
Impacts should be measurable, and real.
The process of technology becoming unreadable or unusable. Digital preservation requires monitoring of potential obsolescence.
Hiding data from plain-view or use such that it is obfuscated, e.g. encryption, redaction, and password protection.
Data about another item, for example, the number of pages in a book, and the number of words. Metadata about a digital object can be anything that describes that file or something in the file, for example, a digital image's resolution (number of pixels along the x and y axes).
The extraction of metadata from a digital object, often using tools that can read the file and export the information in a machine-readable form such as XML or JSON.
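A minimal sketch of metadata extraction, assuming the object is a PNG image: the width, height, bit depth, and colour type are read from the IHDR chunk that follows the PNG signature, and the result is exported as JSON. The function names `extract_png_metadata` and `to_json` are made up for this illustration, and a real extraction tool would read far more than these four fields.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def extract_png_metadata(data: bytes) -> dict:
    """Read basic image properties from the IHDR chunk of a PNG byte stream."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type, then chunk data.
    if len(data) < 26 or not data.startswith(PNG_SIGNATURE) or data[12:16] != b"IHDR":
        raise ValueError("not a PNG stream with a leading IHDR chunk")
    width, height, bit_depth, colour_type = struct.unpack(">IIBB", data[16:26])
    return {
        "format": "PNG",
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "colour_type": colour_type,
    }


def to_json(metadata: dict) -> str:
    """Export the extracted metadata in a machine-readable form."""
    return json.dumps(metadata, indent=2, sort_keys=True)
```

Passing the first bytes of a PNG file to `extract_png_metadata` and the result to `to_json` yields a small JSON document that another system can store or index.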
Characterization is the process whereby metadata crucial to the preservation of a digital object is recorded.
This information may describe the object itself or part of its technical environment.
Properties of individual records or groups of records that may be prioritised for preservation, and used as a measure of a successful 'preservation action', e.g. if the number of pages in a record is considered to be important, it is a significant property we need to monitor and measure.
The process of selecting metadata about a digital object and encoding it into an alternative schema, e.g. for archival description, or preservation.
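The mapping step can be sketched as a simple crosswalk from one set of field names to another. Everything here is illustrative: the `CROSSWALK` table, the source field names, and the Dublin Core style target labels are assumptions chosen for the example, not a mapping taken from any particular schema documentation.

```python
# Hypothetical crosswalk: extracted field name -> field name in the target schema.
CROSSWALK = {
    "file_name": "dc:title",
    "last_modified": "dc:date",
    "mime_type": "dc:format",
}


def map_metadata(source: dict, crosswalk: dict) -> dict:
    """Select the fields named in the crosswalk and re-key them for the target schema."""
    return {
        target: source[field]
        for field, target in crosswalk.items()
        if field in source  # fields absent from the source are simply skipped
    }


extracted = {
    "file_name": "minutes-1979.pdf",
    "last_modified": "1979-01-01",
    "mime_type": "application/pdf",
    "producer": "unknown",  # not in the crosswalk, so it is not carried over
}

mapped = map_metadata(extracted, CROSSWALK)
```

Fields that are not named in the crosswalk are dropped, which mirrors the 'selecting' part of the definition: mapping is a choice about what to carry across as well as how to rename it.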
The use of digital techniques to support the scholarly study of the humanities (Literature, Archaeology, Architecture etc.).
Digital Archive (DIN (German Institute for Standardization) Definition)
An organisation (consisting of people and technical systems) which has assumed responsibility for the long-term preservation and long-term availability of digital data and its provision for a specified designated community.
Digital Preservation (Library of Congress definition)
Digital preservation is the active management of digital content over time to ensure ongoing access.
Digital Preservation (AV Preserve definition)
Digital preservation is a function of digital curation, in which digital content is prepared and actively managed for long-term access.
Digital Preservation System
A system, or set of systems and tools, that enable digital preservation.
A system may be composed of components for ingest, storage, preservation management, and access, as well as other functions.
Industry examples include RODA, Archivematica, Preservica, and Rosetta.
Archive Management System
- An archive management system commonly wraps functions that are not part of a digital preservation system.
- An archive management system enables the description of digital records and the provision of context relevant to one or more archival models.
- A connection must exist between the archive management system and a digital preservation system to allow digital records to be retrieved by their catalogue references.
- Content Management Systems and Enterprise Content Management Systems are contemporary means of maintaining digital records in an organisation.
- Systems manage storage and retrieval of records across the organisation for all users.
- Systems will implement retention and disposal schedules, as well as wrapping records in suitable record keeping management and discovery metadata.
- A repository certified as trusted following an audit using the measures defined in ISO standard 16363:2012.
- A trusted repository will conform to measures surrounding Organizational Infrastructure, Digital Object Management, and Infrastructure and Security Risk Management.
- Bodies providing audit and assessment must also conform to ISO standard 16919:2014.
A file format and its informational content that follows a set of rules defined in its specification is considered to be valid.
A file format that conforms to a structure defined by its specification is considered to be well-formed.
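The distinction between well-formed and valid can be illustrated with XML, using Python's standard `xml.etree.ElementTree` parser. This is a sketch under stated assumptions: the rule set in `is_valid` (a `record` root element containing `title` and `date` children) is a made-up stand-in for a real format specification.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Structure check: does the text parse as XML at all?"""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def is_valid(text: str, required=("title", "date")) -> bool:
    """Rule check against a hypothetical specification.

    The root element must be <record> and must contain each required child.
    """
    if not is_well_formed(text):
        return False  # a file that is not well-formed cannot be valid
    root = ET.fromstring(text)
    return root.tag == "record" and all(
        root.find(name) is not None for name in required
    )
```

A document such as `<record><title>A</title></record>` is well-formed because its tags nest correctly, but it is not valid under the example rules because the `date` element is missing.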
An item, or grouping of items, that constitutes a record in a digital repository, e.g. a book is an intellectual entity made up of many pages. Different mechanisms of displaying or looking after this book may be called representations.
A standard is commonly a set of recommendations and principles that may or may not require absolute compliance.
From the International Organization for Standardization's (ISO) perspective, standards provide specifications for products, services, and systems that help ensure quality, safety, and efficiency.
In the digital preservation community, standards help to create a lingua franca through which practitioners can communicate.
The loss of data through degradation of a carrier medium, e.g. the loss of magnetisation on a floppy disk leading to the file allocation table (file directory) becoming unreadable.
The continuous monitoring of digital objects for changes in their bitstream. The monitoring of a file on a file system for bit rot.
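One way to picture this monitoring is a comparison between checksums recorded at an earlier point and checksums computed now. The sketch below is illustrative only: `build_manifest` and `check_fixity` are hypothetical names, SHA-256 is an assumed algorithm, and the objects are held as in-memory byte strings where a real system would read them from storage on a schedule.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest of a byte string as hexadecimal text."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(objects: dict) -> dict:
    """Record a checksum for each named object at the time of the first check."""
    return {name: checksum(data) for name, data in objects.items()}


def check_fixity(manifest: dict, current: dict) -> dict:
    """Compare recorded checksums with freshly computed ones.

    Returns a status per object: 'ok', 'changed', or 'missing'.
    """
    report = {}
    for name, recorded in manifest.items():
        if name not in current:
            report[name] = "missing"
        elif checksum(current[name]) == recorded:
            report[name] = "ok"
        else:
            report[name] = "changed"
    return report
```

Running `check_fixity` at regular intervals against the stored manifest turns a one-off checksum into continuous monitoring: any object reported as 'changed' or 'missing' is a candidate for repair from another copy.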