Open Archival Information System (OAIS)

OAIS model. Image courtesy of Wikimedia Commons

Born out of the need to preserve space research data, OAIS is the cornerstone of digital preservation. It is a high-level conceptual model for a system that accepts, preserves, and provides access to digital content, and it supplies the terminology used to discuss digital preservation. OAIS began as a recommended practice for participating space agencies and has since been adopted by the ISO as ISO 14721. It is a reference model rather than a prescriptive implementation standard: it offers a guideline for structuring digital preservation without dictating how a system must be built. It is designed to be applicable to any archive or organization that creates materials requiring long-term preservation.

This reference model:

  • provides a framework for the understanding and increased awareness of archival concepts needed for Long Term digital information preservation and access;
  • provides the concepts needed by non-archival organizations to be effective participants in the preservation process;
  • provides a framework, including terminology and concepts, for describing and comparing architectures and operations of existing and future Archives;
  • provides a framework for describing and comparing different Long Term Preservation strategies and techniques;
  • provides a basis for comparing the data models of digital information preserved by Archives and for discussing how data models and the underlying information may change over time;
  • provides a framework that may be expanded by other efforts to cover Long Term Preservation of information that is NOT in digital form (e.g., physical media and physical samples);
  • expands consensus on the elements and processes for Long Term digital information preservation and access, and promotes a larger market which vendors can support;
  • guides the identification and production of OAIS-related standards.

The OAIS environment is a result of the interaction of four entities:

  • Producers supply the information that the archive preserves;
  • Consumers use the preserved information; the Designated Community is the group of consumers that the archive expects to be able to understand it;
  • Management establishes the broad policy objectives of the archive;
  • Archive administers the day-to-day functions.

In the context of the OAIS, information can exist in two forms: as a physical object (e.g., a paper document, a soil sample) or as a digital object (e.g., a PDF file, a TIFF file). Either form may be referred to as the data object. Members of the Designated Community for an archive should be able to interpret and understand the information contained in a data object, either because of their established knowledge base or with the assistance of supplementary “representation information” that is included with the data object.

An information package includes the following information objects:

  • Content Information, which includes the data object and its representation information;
  • Preservation Description Information, which contains the information necessary to preserve the associated Content Information;
  • Packaging Information, which holds the components of the information package together; and
  • Descriptive Information, which includes metadata about the object, allowing the object to be located at a later time using the archive’s search or retrieval functions.
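The four components listed above can be pictured as fields of a single record. The following is a minimal sketch, not an OAIS-defined schema: the class names, field names, and sample values (including the checksum as one kind of preservation information, and the zip/manifest packaging) are illustrative assumptions.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ContentInformation:
    data_object: bytes        # the digital object being preserved
    representation_info: str  # what a reader needs in order to interpret it


@dataclass
class InformationPackage:
    content: ContentInformation
    preservation_description: dict  # information needed to preserve the content
    packaging: dict                 # how the components are held together
    descriptive: dict               # metadata used for search and retrieval


data = b"%PDF-1.7 example bytes"
package = InformationPackage(
    content=ContentInformation(data, "PDF 1.7 document, English text"),
    # A checksum is one illustrative (assumed) kind of preservation information.
    preservation_description={"sha256": hashlib.sha256(data).hexdigest()},
    # Hypothetical packaging details, chosen only for the example.
    packaging={"container": "zip", "manifest": "manifest.json"},
    descriptive={"title": "Annual report", "keywords": ["report", "finance"]},
)

print(package.descriptive["title"])
```

Keeping the representation information next to the data object mirrors the idea that the two together form the Content Information.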

The OAIS model identifies three types of information packages:

  • Submission Information Package (SIP), which is sent from the information producer to the archive;
  • Archival Information Package (AIP), which is the information package actually stored by the archive; and
  • Dissemination Information Package (DIP), which is the information package transferred from the archive in response to a request by a consumer.
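As a rough illustration of the three package types and the direction each one travels, the sketch below maps every type to a sender and a receiver. The `PackageType` enum, the `ROUTES` table, and the `describe` helper are hypothetical names introduced for this example only.

```python
from enum import Enum


class PackageType(Enum):
    SIP = "Submission Information Package"
    AIP = "Archival Information Package"
    DIP = "Dissemination Information Package"


# (sender, receiver) for each package type; the AIP stays inside the archive.
ROUTES = {
    PackageType.SIP: ("producer", "archive"),
    PackageType.AIP: ("archive", "archive"),
    PackageType.DIP: ("archive", "consumer"),
}


def describe(package_type: PackageType) -> str:
    """Return a one-line summary of where a package type travels."""
    sender, receiver = ROUTES[package_type]
    if sender == receiver:
        return f"{package_type.name}: held by the {sender}"
    return f"{package_type.name}: {sender} -> {receiver}"


for package_type in PackageType:
    print(describe(package_type))
```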

There are five OAIS functional entities that manage the flow of information from information producers to the archive, and from the archive to consumers. Together, these functions identify the key processes common to most systems dedicated to preserving digital information. The five functional entities are:

  • Ingest function is responsible for receiving information from producers and preparing it for storage and management within the archive. More specifically, the Ingest entity accepts information from producers in the form of SIPs, performs quality assurance checks on the SIPs, generates an AIP from one or more SIPs, and extracts Descriptive Information from the AIPs (metadata for search and retrieval, thumbnail images for browsing, etc.). Finally, the Ingest function transfers the newly created AIPs to Archival Storage and the associated Descriptive Information to Data Management.
  • Archival Storage function handles the storage, maintenance and retrieval of the AIPs held by the archive. These responsibilities include receiving new AIPs from the Ingest function and assigning them to permanent storage according to various criteria (media requirements, expected utilization rates, etc.), migrating AIPs to new media as required, error checking, implementing disaster recovery strategies, and providing copies of requested AIPs to the Access function.
  • Data Management function coordinates the Descriptive Information pertaining to the archive’s AIPs, in addition to system information used in support of the archive’s operation. In particular, the Data Management function maintains and administers the database containing this information; executes query requests received from the Access function and generates result sets to be returned to the requestor; creates reports in support of the Ingest, Access or Administration functions; and performs updates on the Data Management database, including the addition of new Descriptive Information received from Ingest or new system data received from Administration.
  • Administration function manages the day-to-day operation of the archive. This includes negotiating submission agreements with information producers and performing system engineering, access control and customer services. The Administration function also performs regular audits of SIPs to assess their compliance with the submission agreement, and develops the archive’s policies and standards (e.g., data format standards, documentation requirements, and storage, migration and security policies). This function also serves as an interface between the archive and two components of the OAIS environment: management and the Designated Community (Figure 1).
  • Access function helps consumers to identify and obtain descriptions of relevant information in the archive, and delivers information from the archive to consumers. This function involves the provision of a single user interface to the archive’s holdings for both search and retrieval purposes; generating a DIP in response to a user request by obtaining copies of the appropriate AIP(s) from Archival Storage; obtaining relevant Descriptive Information from Data Management in response to a query; and finally, delivering the DIP or query result set to consumers.
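The hand-offs between Ingest, Archival Storage, Data Management and Access described above can be traced in a toy in-memory model. This is a sketch under stated assumptions, not a real repository design: `MiniArchive`, its method names, the dictionary-based packages, and the SHA-256 fixity check are all hypothetical choices made for illustration, and the Administration function is left out.

```python
import hashlib


class MiniArchive:
    """Toy model: two dictionaries stand in for two functional entities."""

    def __init__(self):
        self.archival_storage = {}  # AIP id -> stored AIP
        self.data_management = {}   # AIP id -> Descriptive Information

    def ingest(self, sip: dict) -> str:
        """Accept a SIP, check it, build an AIP, and route the pieces."""
        if not sip.get("data"):
            raise ValueError("SIP failed quality assurance: no data object")
        digest = hashlib.sha256(sip["data"]).hexdigest()
        aip_id = digest[:12]  # assumed identifier scheme for the example
        aip = {"id": aip_id, "data": sip["data"], "sha256": digest}
        self.archival_storage[aip_id] = aip
        self.data_management[aip_id] = {"title": sip.get("title", "")}
        return aip_id

    def query(self, term: str) -> list:
        """Data Management: return ids whose title contains the term."""
        return [
            aip_id
            for aip_id, info in self.data_management.items()
            if term.lower() in info["title"].lower()
        ]

    def access(self, aip_id: str) -> dict:
        """Access: fetch the AIP, verify it, and return a DIP."""
        aip = self.archival_storage[aip_id]
        if hashlib.sha256(aip["data"]).hexdigest() != aip["sha256"]:
            raise RuntimeError("stored AIP failed its error check")
        return {"data": aip["data"], "description": self.data_management[aip_id]}


archive = MiniArchive()
aip_id = archive.ingest({"title": "Soil survey 1998", "data": b"sample,ph\nA1,6.5\n"})
hits = archive.query("soil")
dip = archive.access(hits[0])
print(dip["description"]["title"])
```

A consumer in this sketch never touches storage directly: it searches through `query` and receives a DIP from `access`, which matches the single point of entry the Access function is meant to provide.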