Additional Storage

While storage grids do exist in the United States (see the NSF program on Grid storage), commercial cloud storage is another option for LTP. This solution is just beginning to gain traction in the US academic community because it is a potential cost saver, a powerful motivator while the country works its way through a deep recession. Cloud storage provides the opportunity to outsource the storage function to large commercial vendors, such as Amazon and Google, that run their own storage grids. For this storage option, trust is a significant issue. Commercial vendors are subject to the natural business cycle, and no firm is completely immune to failure or takeover. How to access or recover data when a business fails is of serious concern to the academic community.

Secure access to data is another problem identified with commercial cloud storage. In response to these concerns, the Mellon Foundation recently sponsored a planning grant with two aims. The first was to understand how the academic community could take advantage of cloud storage without being at the mercy of the business cycle. The second was to explore technically how commercial cloud storage could be overlaid with a service interface that would protect data from unauthorized access and automatically replicate data when a firm went out of business. Details about this initiative are available from the DuraSpace website.
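The idea of a service interface layered over several commercial stores can be illustrated with a minimal Python sketch. It is not the DuraSpace design: the classes InMemoryProvider and ReplicatingStore are hypothetical names invented for this example, the "vendors" are in-memory stand-ins rather than real cloud APIs, and access control is left out entirely. The sketch only shows the replication aspect: every object is written to all providers, and a read falls back to another copy, verified by checksum, when one provider disappears.

```python
import hashlib


class InMemoryProvider:
    """Stand-in for one commercial storage vendor (hypothetical, no real API)."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self._blobs = {}

    def put(self, key, data):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        self._blobs[key] = data

    def get(self, key):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return self._blobs[key]


class ReplicatingStore:
    """Writes each object to every provider; reads the first intact copy."""

    def __init__(self, providers):
        self.providers = providers
        self._checksums = {}

    def put(self, key, data):
        # Record a checksum so that later reads can be verified.
        self._checksums[key] = hashlib.sha256(data).hexdigest()
        for provider in self.providers:
            provider.put(key, data)

    def get(self, key):
        expected = self._checksums[key]
        for provider in self.providers:
            try:
                data = provider.get(key)
            except (ConnectionError, KeyError):
                continue  # provider failed or lost the object; try the next
            if hashlib.sha256(data).hexdigest() == expected:
                return data
        raise LookupError(f"no intact copy of {key!r} found")


vendor_a = InMemoryProvider("vendor-a")
vendor_b = InMemoryProvider("vendor-b")
store = ReplicatingStore([vendor_a, vendor_b])
store.put("fieldnotes/1998.txt", b"interview transcript")

vendor_a.available = False  # simulate one vendor going out of business
recovered = store.get("fieldnotes/1998.txt")
print(recovered)
```

A production service would also need authentication, repair of lost copies, and a durable record of checksums held outside any single vendor; those are deliberately omitted here.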

Metadata Standards for Long-Term Storage

PREMIS (PREservation Metadata: Implementation Strategies) is the de facto standard for the digital library community. It specifies the metadata entities recommended to ensure the long-term preservation (discovery, access, rendering and understandability) of digital data encapsulated in a vast array of file formats. The group did not have an in-depth understanding of the PREMIS standard, which made it difficult to evaluate realistically whether PREMIS could be successfully applied to preserve anthropological data. However, in the absence of any other recognized standard, leveraging and extending PREMIS for the anthropology community was strategically the right course of action. A policy question that some standards committee needs to resolve is which elements of this very elaborate standard, and how many of them, the anthropological community needs to meet its preservation purposes. It is not practical or affordable to capture data for all of the sub-elements in the PREMIS standard.
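To make the "which elements" question concrete, the following Python sketch records a small subset of PREMIS Object semantic units (objectIdentifier, fixity, size and format) for a single file. The choice of subset is purely an assumption made for illustration, not a recommendation of the group or of the PREMIS editors, and the helper names describe_object and verify_fixity are hypothetical. The record is a plain dictionary rather than schema-valid PREMIS XML.

```python
import hashlib
import json


def describe_object(identifier, data, format_name):
    """Build a minimal record using PREMIS semantic unit names.

    The subset captured here is an illustrative assumption; a standards
    committee would decide which units are actually required.
    """
    return {
        "objectIdentifier": {
            "objectIdentifierType": "local",
            "objectIdentifierValue": identifier,
        },
        "objectCharacteristics": {
            "fixity": {
                "messageDigestAlgorithm": "SHA-256",
                "messageDigest": hashlib.sha256(data).hexdigest(),
            },
            "size": len(data),  # size in bytes
            "format": {"formatDesignation": {"formatName": format_name}},
        },
    }


def verify_fixity(record, data):
    """Return True if the data still matches the recorded digest."""
    fixity = record["objectCharacteristics"]["fixity"]
    return hashlib.sha256(data).hexdigest() == fixity["messageDigest"]


content = b"site survey, trench 4"
record = describe_object("anthro-0001", content, "text/plain")
print(json.dumps(record, indent=2))
print(verify_fixity(record, content))
```

Even this small subset supports a useful preservation check: rerunning verify_fixity after a storage migration shows whether the bytes have changed.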

Existing Repository Software

Repository software used in the cultural heritage community to ingest, store or preserve, and provide access to digital content is mostly open source. Repository software offerings that have gained significant traction in the digital library domain are (1) Fedora, (2) DSpace, (3) Greenstone, (4) EPrints, (5) Plone and (6) CONTENTdm from OCLC. It is important to note that the Fedora and DSpace communities have recently combined to form a consolidated community called DuraSpace. All of these applications have out-of-the-box client interfaces to their underlying data stores that simplify the ingest, storage and search of, and access to, data. In addition, these repository systems have Application Programming Interfaces (APIs) that can be used to build customized web applications or web services for any of the aforementioned functions. Protocols such as OAI-PMH, OAI-ORE and SWORD, to name a few, have also been developed by the digital library community to make these systems interoperate so that data can be exchanged between systems.
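As a small illustration of how such protocols let systems exchange data, the Python sketch below builds an OAI-PMH ListIdentifiers request and parses a response. The verb, the metadataPrefix and set parameters, the oai_dc prefix and the XML namespace come from the OAI-PMH 2.0 specification. The base URL, the set name and the sample response are invented for this example, and the helper function names are hypothetical. No network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_identifiers_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL for an OAI-PMH ListIdentifiers request."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_identifiers(xml_text):
    """Return (identifiers, resumption_token) from a ListIdentifiers response."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    return ids, token or None  # an empty or missing token means no more pages


# Invented sample response, shaped like an OAI-PMH 2.0 reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:repository.example.org:anthro-0001</identifier>
      <datestamp>2010-01-15</datestamp>
    </header>
    <header>
      <identifier>oai:repository.example.org:anthro-0002</identifier>
      <datestamp>2010-02-03</datestamp>
    </header>
    <resumptionToken>page-2</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>"""

url = list_identifiers_url("https://repository.example.org/oai", set_spec="anthropology")
identifiers, token = parse_identifiers(SAMPLE_RESPONSE)
print(url)
print(identifiers, token)
```

A harvester would fetch the URL, parse the reply as above, and repeat the request with the returned resumptionToken until none is returned.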

Planning Models

The PLANETS project has published a preservation data model and created a tool, PLATO, for preservation planning. The model can provide two distinct views of stored data. One, from the end-user perspective, facilitates search and discovery of preserved data. The other, from a preservation perspective, enables preservation treatments (media or format migrations) at the file set level without affecting the end-user view or understanding of the data. Risk of data loss is inherent in any preservation treatment, and the planning tool PLATO was designed to attenuate that risk. “The planning tool PLATO is a decision support tool that implements a solid preservation planning process and integrates services for content characterization, preservation action and automatic object comparison in a service-oriented architecture to provide maximum support for preservation planning endeavors.”[1] Again, in the absence of other available standards, the group maintained that it was strategic for the anthropological community to leverage this standard for its own purposes.
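The separation between the two views can be sketched in a few lines of Python. This is not the PLANETS data model itself: the class names PreservedObject and FileSet, their fields and the migrate method are hypothetical, chosen only to show how a format migration can add a new file set while the end-user description stays unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class FileSet:
    """One concrete rendition of an object: its files in a given format."""
    format_name: str
    files: list


@dataclass
class PreservedObject:
    identifier: str
    title: str
    file_sets: list = field(default_factory=list)

    def end_user_view(self):
        # What search and discovery see: a stable identifier and description.
        return {"identifier": self.identifier, "title": self.title}

    def preservation_view(self):
        # What preservation staff see: every file set, oldest first.
        return [(fs.format_name, list(fs.files)) for fs in self.file_sets]

    def migrate(self, new_format, convert):
        # Add a migrated file set and keep the earlier one, so that a
        # faulty migration does not destroy the source files.
        current = self.file_sets[-1]
        migrated = FileSet(new_format, [convert(name) for name in current.files])
        self.file_sets.append(migrated)


obj = PreservedObject(
    "anthro-0001",
    "Field recordings, 1998",
    [FileSet("WAV", ["tape1.wav", "tape2.wav"])],
)
before = obj.end_user_view()
obj.migrate("FLAC", lambda name: name.rsplit(".", 1)[0] + ".flac")
after = obj.end_user_view()

print(before == after)
print(obj.preservation_view())
```

Here the conversion function only renames files. In practice it would invoke a real migration tool, and the result would be compared against the source before the new file set is accepted.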

  1. From "Welcome to Plato, the Planets Preservation Planning Tool."
