The Climate and Forecast (CF) metadata conventions are a set of conventions for describing Earth sciences data, intended to promote the processing and sharing of data files. The metadata defined by the CF conventions are generally included in the same file as the data, making the file "self-describing". The conventions provide a definitive description of what the data values in each netCDF variable represent, and of the spatial and temporal properties of the data, including information about grids, such as grid cell bounds and cell averaging methods. This enables users of files from different sources to decide which variables are comparable, and provides a basis for building software applications with powerful data extraction, grid remapping, data analysis, and data visualization capabilities.
History and evolution
The CF conventions were introduced in 2003, after several years of development by a collaboration that included staff from U.S. and European climate and weather laboratories.[1] The conventions contained generalizations and extensions to the earlier Cooperative Ocean/Atmosphere Research Data Service (COARDS) conventions[2] and the Gregory/Drach/Tett (GDT) conventions.[3] As the scope of the CF conventions grew along with their user base, the CF community adopted an open governance model.[4] In December 2008 the trio of standards netCDF, CF, and OPeNDAP was adopted by the U.S. Integrated Ocean Observing System (IOOS) as a recommended standard (number 08-012) for the representation and transport of gridded data. The CF conventions are being considered by the NASA Standards Process Group (SPG) and others as more broadly applicable standards.[5][6]
Applications and user base
The CF conventions have been adopted by a wide variety of national and international programs and activities in the Earth sciences.[7] For example, they were required for the climate model output data collected for the Coupled Model Intercomparison Project (CMIP), whose results are widely used in the Intergovernmental Panel on Climate Change assessment reports.[8]
They are promoted as an important element of scientific community coordination by the World Climate Research Programme.[9][10] They are also used as a technical foundation for a number of software packages and data systems, including the Climate Model Output Rewriter (CMOR), which is post-processing software for climate model data, and the Earth System Grid, which distributes climate and other data.[11][12][13] The CF conventions have also been used to describe the physical fields transferred between individual Earth system model software components, such as atmosphere and ocean components, as the model runs.[14]
Supported data types
CF is intended for use with state estimation and forecasting data, in the atmosphere, ocean, and other physical domains. It was designed primarily to address gridded data types such as numerical weather prediction model outputs and climatology data in which data binning is used to impose a regular structure.[13][15] However, the CF conventions are also applicable to many classes of observational data and have been adopted by a number of groups for such applications.
Supported data formats
CF originated as a standard for data written in netCDF, but its structure is general and it has been adapted for use with other data formats. For example, using the CF conventions with Hierarchical Data Format data has been explored.[16]
Design principles
Several principles guide the development of CF conventions:
Data should be self-describing, without external tables needed for interpretation.
Conventions should be developed only as needed, rather than anticipating possible needs.
Conventions should not be onerous to use for either data-writers or data-readers.
Metadata should be readable by humans as well as interpretable by programs.
Redundancy should be avoided to prevent inconsistencies when writing data.
Specific CF metadata descriptors use attribute values to represent:
Data provenance: title, institution, contact, source (e.g. model), history (audit trail of operations), references, comment
Description of associated activity: project, experiment
Description of coordinates: coordinates, bounds, grid_mapping (with formula_terms); time specified with reference_time ("time since T0") and calendar attributes.
Meaning of grid cells: cell_methods, cell_measures, and climatological statistics.
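The attribute groups listed above can be illustrated with a minimal sketch. It assumes a plain Python dictionary as a stand-in for a netCDF file (no netCDF library is used), and the titles, institution, variable names, and the helper decode_time are hypothetical, chosen only for illustration. The time variable uses the "units since reference time" form of the units attribute together with a calendar attribute; the decoder handles only the standard calendar and a few unit names, so it is far from a complete implementation.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory stand-in for a CF-style file (plain dicts only).
dataset = {
    "global_attributes": {
        "Conventions": "CF-1.8",
        "title": "Illustrative surface temperature field",
        "institution": "Example Institute (hypothetical)",
        "source": "example model v0 (hypothetical)",
        "history": "2020-01-01: created for illustration",
    },
    "variables": {
        "time": {
            "attributes": {
                "standard_name": "time",
                "units": "days since 2000-01-01 00:00:00",
                "calendar": "standard",
            },
            "data": [0.0, 0.5, 31.0],
        },
        "tas": {
            "attributes": {
                "standard_name": "air_temperature",
                "units": "K",
                "cell_methods": "time: mean",  # values are time averages
            },
        },
    },
}

# Only a few fixed-length units are supported in this sketch.
_SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def decode_time(values, units):
    """Convert numeric offsets to datetimes for a 'UNIT since REFERENCE' string.

    Assumes the standard calendar and a reference written as
    'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM:SS'.
    """
    unit, _, reference = units.partition(" since ")
    reference = reference.strip()
    fmt = "%Y-%m-%d %H:%M:%S" if " " in reference else "%Y-%m-%d"
    origin = datetime.strptime(reference, fmt)
    scale = _SECONDS_PER_UNIT[unit.strip()]
    return [origin + timedelta(seconds=v * scale) for v in values]


time_var = dataset["variables"]["time"]
decoded = decode_time(time_var["data"], time_var["attributes"]["units"])
for stamp in decoded:
    print(stamp.isoformat())
```

Because the reference time and calendar travel with the data, a reader needs no external table to interpret the time axis, which is the self-describing property the design principles call for.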
A central element of the CF conventions is the CF Standard Name Table. The table uniquely associates a standard name with each geophysical parameter in a data set, and each name provides a precise description of the physical quantity being represented. Note that the standard name is the string value of a variable's standard_name attribute, not the name of the variable itself. The CF Standard Name Table identifies over 1,000 physical quantities, each with a precise description and associated canonical units. Guidelines for the construction of CF standard names are documented on the conventions web site.
As an example of the information provided by CF standard names, the entry for sea-level atmospheric pressure includes:
standard name: air_pressure_at_sea_level
description: sea_level means mean sea level, which is close to the geoid in sea areas. Air pressure at sea level is the quantity often abbreviated as MSLP or PMSL.
canonical units: Pa
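As a rough sketch of how a program might use such an entry, the following compares a variable's attributes against a small excerpt of the table. The two-entry dictionary STANDARD_NAME_EXCERPT and the function check_variable are hypothetical and are not part of any CF tool. The sketch also makes a simplifying assumption: it compares the units string literally with the canonical units, so units that are merely convertible to the canonical ones (such as hPa for a pressure) are flagged for review rather than accepted.

```python
# Hypothetical two-entry excerpt; the real table holds many more names.
STANDARD_NAME_EXCERPT = {
    "air_pressure_at_sea_level": "Pa",
    "air_temperature": "K",
}


def check_variable(var_name, attributes, table=STANDARD_NAME_EXCERPT):
    """Return a list of human-readable problems found for one variable."""
    problems = []
    standard_name = attributes.get("standard_name")
    if standard_name is None:
        problems.append(f"{var_name}: no standard_name attribute")
    elif standard_name not in table:
        problems.append(f"{var_name}: unknown standard name {standard_name!r}")
    elif attributes.get("units") != table[standard_name]:
        # Literal comparison only; convertible units would need a units library.
        problems.append(
            f"{var_name}: units {attributes.get('units')!r} differ from "
            f"canonical units {table[standard_name]!r}"
        )
    return problems


# The variable is named "psl", but its meaning comes from standard_name.
ok = check_variable(
    "psl", {"standard_name": "air_pressure_at_sea_level", "units": "Pa"}
)
flagged = check_variable(
    "psl", {"standard_name": "air_pressure_at_sea_level", "units": "hPa"}
)
print(ok)
print(flagged)
```

The variable name psl plays no part in the check: two files can name the variable differently and still be recognized as holding the same quantity through the standard_name attribute.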
Software
CF-Python is a data analysis package built on a complete implementation (CFDM) of the CF conventions. The authors of CFDM and CF-Python state that they aim to support all aspects of the CF conventions.
OriginPro version 2021b supports the netCDF CF conventions.[17] Averaging can be performed during import, which allows large datasets to be handled in GUI software.
The xarray Python library parses and decodes data stored according to CF Conventions.
The Iris Python library "draws heavily from the NetCDF CF Metadata Conventions as a source for its data model".[18]