Research Data Services

What is Metadata?

Data documentation is also commonly referred to as "metadata," or "data about data." It includes descriptive information about a particular data set, object, or resource, including (but not limited to) how it is formatted, and when and by whom it was collected.

Why is Metadata Important?

Adding metadata completely and carefully (at the very beginning of your research project, as well as throughout its lifecycle!) helps ensure that the data is accessible to any user and makes it easier for other researchers to cite your information.

Metadata Examples and Guidelines


A wide range of metadata standards exists for researchers to choose from to document their project.

  • An example of a metadata standard is the DDI (Data Documentation Initiative), a standard designed for numeric data
  • Additional examples of metadata standards can be found at the Digital Curation Centre web site


In the boxes below are some of the general guidelines related to what should be documented regardless of the discipline or project type.

At the very least, this metadata should be stored with the project data in a readme.txt file. If the information is already included in an article or presentation, you can reference that item so that the information can be accessed there.

When recording this information, think about how you would search for similar projects, and be sure to include that information so that other researchers in the field can easily access the materials.

Essential Fields

Name of dataset or research project that produced it. (Include both if applicable.)
Names and addresses of the group that created the data.
Unique identifier or number that is used to identify the data. This could be an internal project number or code to reference the data.
A brief synopsis of the project or data that another researcher can review quickly to see the relevance of the project to what they are seeking.
All the dates associated with the project. The most important is probably the release date of the data, but you'll eventually want to include:
  • start and end date of the project
  • time period covered by the data or project
  • maintenance cycle of the data
  • update schedule of the data
  • any other important dates that will help document the process and aid in preservation
Any known intellectual property rights held for the data or project.
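
The essential fields above can be collected into a readme.txt programmatically. The following is a minimal Python sketch, not a prescribed format: the field labels (Title, Creator, Identifier, Description, Dates, Rights), the function name build_readme, and the sample values are all assumptions made for illustration, since the guideline describes the fields without naming them.

```python
# Hypothetical labels for the six essential fields described above.
ESSENTIAL_FIELDS = ["Title", "Creator", "Identifier", "Description", "Dates", "Rights"]


def build_readme(metadata: dict[str, str]) -> str:
    """Return readme text, raising ValueError if an essential field is missing or blank."""
    missing = [name for name in ESSENTIAL_FIELDS if not metadata.get(name, "").strip()]
    if missing:
        raise ValueError("Missing essential fields: " + ", ".join(missing))
    # One "Label: value" line per field, in a fixed order.
    lines = [f"{name}: {metadata[name].strip()}" for name in ESSENTIAL_FIELDS]
    return "\n".join(lines) + "\n"


# Illustrative values only; replace them with your project's details.
example = {
    "Title": "Example Survey Dataset",
    "Creator": "Example Research Group, 123 Example Street",
    "Identifier": "PROJ-0001",
    "Description": "Placeholder synopsis of the project.",
    "Dates": "Released 2024-01-15; project ran 2022-01-01 to 2023-12-31",
    "Rights": "Placeholder rights statement",
}
readme_text = build_readme(example)
```

The resulting text could then be saved next to the data, for example with pathlib.Path("readme.txt").write_text(readme_text, encoding="utf-8"). Failing loudly on a blank field is one possible design choice; it keeps an incomplete readme from being written unnoticed.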

Recommended Fields

Names and addresses of additional individuals that contributed to the project.
Keywords, phrases, or subject headings that will describe the subject or content of the data. (In adding these, think of how you would search for the materials.)
Organizations or agencies that funded the research or project.
Access Information
The location of the data and how the researcher can access the materials. (Confidentiality can be addressed here as well.)
The language(s) of the content.
If the data relates to a physical location, the spatial coverage should be documented.
The process of how the data was generated, including:
  • the equipment and software used, including the version
  • the experimental protocol
  • data validation and quality assurance of the data
  • any other relevant information
Data Processing
Documenting the alterations made to the data will aid in preservation of the data and record who made changes and for what reasons at specific times.
Citations for the sources that were used during the project. (Include where the other data or material was stored and how it was accessed when appropriate.)
List of File Names
List all of the data files associated with the project and include the file extensions. (e.g.,
File Formats
Format(s) of the data and any software that is required to read the data including the version. (e.g., TIFF, FITS,
File Structure
Organization of the data file(s) (and the layout of the variables when applicable).
Variable List
List of variables in the data files, when applicable.
Code Lists
Explanation of codes or abbreviations used in the file names, variables of the data, or the project overall that will help the user understand the project. (e.g., "999" indicates a missing value in the data)
Record a date/time stamp for each file, and use a separate identifier for each version.
A checksum or similar value used to test whether your file has changed over time. (This will aid in the long-term preservation of the data and help make it secure by tracking alterations.)
Related Materials
Links or location of materials that are related to the project. (e.g., articles, presentations, papers)
The recommended way to cite the data or the information needed.
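
Several of the file-level items above (the list of file names, file formats, a date/time stamp for each file, and a value for testing whether a file has changed) can be gathered automatically. The following is a minimal Python sketch under stated assumptions: the guideline does not name a hashing algorithm, so SHA-256 is an illustrative choice, and the function names sha256_of and inventory and the output keys are hypothetical.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large data files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(data_dir: Path) -> list[dict[str, str]]:
    """List every file under data_dir with its extension, modification time, and hash."""
    rows = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        # The file system's modification time stands in for the date/time stamp (an assumption).
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        rows.append({
            "name": path.relative_to(data_dir).as_posix(),
            "format": path.suffix.lstrip(".").lower() or "(none)",
            "modified_utc": modified.isoformat(timespec="seconds"),
            "sha256": sha256_of(path),
        })
    return rows
```

Calling inventory(Path("data")) returns one dictionary per file, which can be written into the readme or a separate manifest. Re-running it later and comparing the sha256 values is one way to detect whether a file has been altered.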