Metadata for Describing Digital Objects: Vocabularies

A guide for Appalachian State University faculty, staff and students who are creating metadata for a digital project

"A controlled vocabulary is a predetermined list of terms on a certain topic or of a certain type. These lists typically identify one preferred word or phrase for a given concept, and sometimes provide mappings from other terms for the concept to the preferred one."

Understanding Metadata: What is it? What is it for? (Riley, 2017, p. 17)

While a metadata schema such as Dublin Core provides the overall structure of your metadata, controlled vocabularies, thesauri and other encoding schemes supply the standardized, predetermined words, phrases or numerals that keep the metadata consistent. Below are examples of how schemas and vocabularies work together to describe digital objects.

Example 1: Digitized photographs

If your digital project consists of digitized photographs of the Fourth of July parade held in downtown Boone, N.C., on July 4, 1979, and you are using the Dublin Core schema, the metadata for one of the photographs might look something like this:

Title: Fourth of July Parade on King Street

Subject: Fourth of July celebrations

Date: 1979-07-04

Spatial Coverage: Boone (N.C.)

Type: Image

Format: color photographs

Extent: 3 MB

In this example, the words before each colon are the Dublin Core elements that provide a structure for the metadata. With the exception of the "Title" and "Extent" elements, the words, phrases or numerals appearing after the colon are derived from a controlled vocabulary, thesaurus or encoding scheme, as illustrated below:

Subject: Fourth of July celebrations (Library of Congress Subject Heading or LCSH)

Date: 1979-07-04 (Extended Date Time Format or EDTF)

Spatial Coverage: Boone (N.C.) (Library of Congress Name Authority File or LCNAF)

Type: Image (DCMI Type Vocabulary)

Format: color photographs (Getty Art & Architecture Thesaurus)
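As a rough illustration of how a schema and its vocabularies can be checked together, the sketch below stores the Example 1 record as a Python dictionary and tests two of its values. It is a minimal sketch, not part of any Dublin Core tooling: the function names `is_full_date` and `check_record` are hypothetical, and the date check covers only complete YYYY-MM-DD dates, which is a small subset of what EDTF allows.

```python
import re
from datetime import date

# The twelve terms of the DCMI Type Vocabulary.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}

# The Example 1 record, keyed by element name.
record = {
    "Title": "Fourth of July Parade on King Street",
    "Subject": "Fourth of July celebrations",
    "Date": "1979-07-04",
    "Spatial Coverage": "Boone (N.C.)",
    "Type": "Image",
    "Format": "color photographs",
    "Extent": "3 MB",
}


def is_full_date(value):
    """Return True if value is a real calendar date written as YYYY-MM-DD.

    Simplification: EDTF permits other forms (for example, a year alone),
    which this hypothetical helper does not handle.
    """
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def check_record(rec):
    """Return a list of problems found in a record (hypothetical helper)."""
    problems = []
    if rec.get("Type") not in DCMI_TYPES:
        problems.append(f"Type {rec.get('Type')!r} is not a DCMI Type term")
    if not is_full_date(rec.get("Date", "")):
        problems.append(f"Date {rec.get('Date')!r} is not a YYYY-MM-DD date")
    return problems


print(check_record(record))
```

An empty list means both checked values conform. Checking Subject, Spatial Coverage or Format in the same way would require access to the much larger LCSH, LCNAF and Getty vocabularies, which this sketch does not attempt.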


Example 2: Digitized cassette tapes/audio files

For digitized, two-sided audio cassette tapes that are under copyright and feature a live performance by bluegrass artist Francis Gillum, recorded on May 20, 1987, your metadata may look like this:

Creator: Gillum, Francis

Subject: Bluegrass music

Date: 1987-05-20

Type: Sound

Format: mp3

Extent: 00:31:51 (Side A), 00:31:04 (Side B)

Rights: In Copyright – Educational Use Permitted

The metadata in Example 1 and Example 2 draw on the same metadata schema (Dublin Core), controlled vocabularies, thesauri and encoding schemes. Consistent, standardized words and phrases make it easier for users to search your content. They also make it easier for you to share your content with other institutions or systems that use the same metadata schema and vocabularies.
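To see why consistent terms help searching, consider the minimal sketch below. It maps a few variant subject terms to one preferred term before comparing them, so a search on any variant finds every matching record. The variant terms, the second and third record titles, and the function names `normalize` and `search` are all hypothetical and chosen for illustration; they are not actual LCSH cross-references.

```python
# Hypothetical mapping from variant terms (lowercased) to one preferred term.
PREFERRED = {
    "4th of july": "Fourth of July celebrations",
    "july 4th celebrations": "Fourth of July celebrations",
    "independence day celebrations": "Fourth of July celebrations",
}

# Records as they might look when subjects were entered without a vocabulary.
# Only the first title comes from Example 1; the others are made up.
records = [
    {"Title": "Fourth of July Parade on King Street",
     "Subject": "Fourth of July celebrations"},
    {"Title": "Parade float (hypothetical)", "Subject": "4th of July"},
    {"Title": "Fireworks (hypothetical)", "Subject": "July 4th celebrations"},
    {"Title": "Banjo player (hypothetical)", "Subject": "Bluegrass music"},
]


def normalize(term):
    """Return the preferred term for a variant, or the trimmed term itself."""
    cleaned = term.strip()
    return PREFERRED.get(cleaned.lower(), cleaned)


def search(recs, subject):
    """Return titles of records whose normalized subject matches the query."""
    wanted = normalize(subject)
    return [r["Title"] for r in recs if normalize(r["Subject"]) == wanted]


# Without normalization, an exact-match search finds only one record.
exact = [r["Title"] for r in records if r["Subject"] == "4th of July"]
print(len(exact))

# With normalization, all three parade-related records are found.
print(len(search(records, "4th of July")))
```

If every record used the preferred term from the start, as in Examples 1 and 2, the mapping step would be unnecessary and a plain exact-match search would already find all related items.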

Controlled Vocabularies and Tags (video, 17:55)

During the first five minutes of this video, Dr. Pomerantz describes the use of controlled vocabularies and tags (uncontrolled vocabularies).


Ashlea Green
Subjects: Metadata