Data and Metadata

Two terms are essential for understanding this report. The first is data; the second is metadata.

Data

There are many types of anthropological data. The table below lists some of the most important types:

Type        Examples
Images      Photographs, maps of excavation sites, biomedical images (e.g., radiographs)
Texts       Field notes, annotations, excavation plans, manuscripts
Audio       Recordings of songs, conversations, oral histories
Video       Recordings of cultural events, conversations, archaeological excavations
Databases   Database of measurements, lexical items, locations
3-D scans   Scan of fossil or artifact

Table 1. Common types of data in anthropology

Any of these types of data may be stored in digital form. In the broadest sense, digital “data” are thus simply electronically coded forms of information. For anthropological purposes, a more pragmatic definition is that data are measurements, observations, or descriptions created or collected by a researcher. Subfields vary widely in the types of things described (referents). For example, in cultural anthropology, the units or referents might be observed events, informant interviews, households, or communities. In archaeology, the units might be settlements, quadrants of a grid, or artifacts. In linguistics, the units might be lists of vocabulary items, recorded and transcribed texts, or grammatical patterns. In physical anthropology, data units might be measurements, character states, scans, or images (or even the fossil from which those were taken), genetic sequences or bases, behavioral observations, sonograms, phenological observations, or radiometric dates.

Throughout this report, for purposes of exposition, we assume that anthropological data are collected by anthropologists, given that the anthropological community is our intended audience. However, many of the points made here concern the digital preservation of, and access to, anthropological data generally, whether collected by professional anthropologists or obtained from some other source.

But storing data is not sufficient unless their context is also preserved. Attention to metadata is therefore essential to digital preservation and access (DPA).

Metadata

Metadata consist of descriptive documentation essential to informing the processes of data creation, collection, management, and preservation. Metadata provide information about the original referent, the collection processes, and the rules of collection, as well as descriptions of data management processes and provisions for access to and use of the data (such as licensing to specify permitted uses). Metadata provide key contextual information to facilitate understanding and are intended to assist research within known and predictable scientific domains. As research questions in anthropology evolve, metadata may also enable discovery and use of archived data in as yet unanticipated fields of research. Thus, careful effort should be made to make the descriptive content of metadata intelligible to scientists outside the narrow specialty in which the data were produced. Because new technology allows for reuse and expansion of archived data, as well as the creation of new persistent tagging, metadata creation is an ongoing process, not a single event. Metadata may usefully grow over time by accretion, asynchronously, through the efforts of properly qualified contributors. We anticipate that new data will be linked to older archived data through a continuous process that updates existing metadata and creates new metadata to inform evolving and expanding datasets.
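As a purely illustrative aid, the kinds of information listed above can be pictured as a simple record that grows by accretion. The sketch below is a minimal example in Python, not a proposed standard: the class names (MetadataRecord, Annotation), every field name, and the sample values are hypothetical and chosen only to mirror the categories named in this section.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Annotation:
    """One later addition to a record's metadata (hypothetical structure)."""
    contributor: str
    added_on: date
    note: str


@dataclass
class MetadataRecord:
    """Illustrative metadata for one archived data item; all names are assumptions."""
    referent: str             # what the data describe
    collection_process: str   # how the data were created or collected
    collection_rules: str     # rules or protocol followed during collection
    management_notes: str     # how the data are managed and preserved
    license: str              # provisions for access and permitted uses
    annotations: List[Annotation] = field(default_factory=list)
    linked_records: List[str] = field(default_factory=list)

    def annotate(self, contributor: str, note: str, added_on: date) -> None:
        """Append a new annotation; earlier entries are left untouched."""
        self.annotations.append(Annotation(contributor, added_on, note))

    def link(self, identifier: str) -> None:
        """Record a link to another dataset, ignoring duplicate identifiers."""
        if identifier not in self.linked_records:
            self.linked_records.append(identifier)


# Hypothetical usage: every value below is invented for illustration.
record = MetadataRecord(
    referent="oral history interview",
    collection_process="audio recording made during fieldwork",
    collection_rules="interview protocol agreed with the community",
    management_notes="original file retained; access copy derived from it",
    license="research use only",
)
record.annotate("later contributor", "Added transcription conventions.", date(2020, 1, 1))
record.link("dataset-002")
```

Keeping annotations in an append-only list is one way to reflect the idea that metadata creation is an ongoing process: contributions arrive at different times from different people, and none overwrites what came before.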


One Comment

  1. Joel Halpern says:

    I think this is an excellent way of proceeding and I do note that this situation is very parallel to the way in which the UMass Archives is proceeding with my data which is part of a large collection. Information on the ways in which my archival materials are organized is available in the extensive Finders Guide at the UMass Library Archives Web site.
